Difference between revisions of "Data Monitoring Procedures"


Revision as of 12:39, 3 February 2016

Master List of File / Database / Webpage Locations

Run Conditions

  • Online Run-by-run condition files (B-field, current, etc.): /work/halld/online_monitoring/conditions/
  • Offline monitoring run conditions (software versions, jana config): /group/halld/data_monitoring/run_conditions/
  • Run Info vers. 1
  • Run Info vers. 2
  • RCDB

Monitoring Output Files

  • In the paths below, 201Y-MM stands for the run period (for example, 2015-03) and verVV stands for the launch version (for example, ver15).
  • Online monitoring histograms: /work/halld/online_monitoring/root/
  • Offline monitoring histogram ROOT files (merged): /work/halld/data_monitoring/RunPeriod-201Y-MM/verVV/rootfiles
  • Individual files for each job (ROOT, REST, log, etc.): /volatile/halld/offline_monitoring/RunPeriod-201Y-MM/verVV/
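
As an illustration of how the placeholders in the paths above expand, here is a minimal Python sketch. The function name `launch_dirs` and the two-digit zero-padding of the version number are assumptions made for this example; only the base directories and the `RunPeriod-201Y-MM/verVV` layout come from the list above.

```python
from pathlib import PurePosixPath

# Base directories copied from the list above.
MERGED_ROOT_BASE = PurePosixPath("/work/halld/data_monitoring")
PER_JOB_BASE = PurePosixPath("/volatile/halld/offline_monitoring")


def launch_dirs(run_period: str, version: int) -> dict:
    """Return the output directories for one monitoring launch.

    run_period is a string such as "2015-03"; version is the launch
    number (15 -> "ver15"). Zero-padding to two digits is an assumption
    based on the "verVV" placeholder.
    """
    period_dir = f"RunPeriod-{run_period}"
    version_dir = f"ver{version:02d}"
    return {
        # Merged histogram ROOT files, one per run.
        "merged_root": str(MERGED_ROOT_BASE / period_dir / version_dir / "rootfiles"),
        # Individual per-job files (ROOT, REST, log, etc.).
        "per_job": str(PER_JOB_BASE / period_dir / version_dir),
    }


if __name__ == "__main__":
    for name, path in launch_dirs("2015-03", 15).items():
        print(f"{name}: {path}")
```

Building the paths from one helper keeps the run period and launch version consistent between the merged and per-job locations.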

Monitoring Database

  • Accessing monitoring database (on ifarm): mysql -u datmon -h hallddb.jlab.org data_monitoring

Monitoring Webpages

SciComp Job Links

Main

Documentation

Job Tracking

Procedures: Overview

Offline Monitoring and Reconstruction: During Experimental Running

During experimental running, the following offline monitoring procedures should be performed, each with a different gxprojN account, so that they don't interfere with each other:

  • Monitor the first 20 files of each newly recorded run as soon as it is written to tape.
  • Every two weeks, do a monitoring launch over the first 20 files of all runs currently available on tape.
  • As soon as a new group of data (e.g. ~100 runs) has an initial, reasonably good calibration, do a preliminary full-reconstruction launch over all files in that group.
    • We can add user analysis plugins to this launch, including those with ROOT TTree output, provided that they work and don't take much memory.

Note that monitoring is limited to the first 20 files of each run because data is recorded to tape faster than the monitoring jobs can process it. Also, during the experimental run, each run will be fully reconstructed only once, because keeping up with the incoming data is already difficult.
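
The "first 20 files of each run" rule can be sketched as a small selection function. This is only an illustration: the input format (pairs of run number and file number) and the reading of "first" as "lowest file numbers" are assumptions, and the actual launch scripts may select files differently.

```python
from collections import defaultdict

MAX_FILES_PER_RUN = 20  # limit stated in the text above


def select_monitoring_files(files, limit=MAX_FILES_PER_RUN):
    """Pick the files to monitor for each run.

    files: iterable of (run_number, file_number) pairs (hypothetical
    input format). Returns a dict mapping each run number to its
    `limit` lowest file numbers, in ascending order.
    """
    by_run = defaultdict(list)
    for run, file_no in files:
        by_run[run].append(file_no)
    return {run: sorted(nums)[:limit] for run, nums in sorted(by_run.items())}


if __name__ == "__main__":
    # Run 1 has 30 files, run 2 has only 5.
    catalog = [(1, i) for i in range(30)] + [(2, i) for i in range(5)]
    for run, selected in select_monitoring_files(catalog).items():
        print(run, len(selected))
```

Runs with fewer than 20 files are simply monitored in full.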

Offline Monitoring and Reconstruction: After Experimental Running

After experimental running, the following offline monitoring procedures should be performed, each with a different gxprojN account, so that they don't interfere with each other:

  • Every two weeks, do a monitoring launch over the first 20 files of all runs currently available on tape.
  • As soon as a new group of data (e.g. ~100 runs) has an initial, reasonably good calibration, do a preliminary full-reconstruction launch over all files in that group.
  • Every three months, if there have been significant improvements to the reconstruction / calibrations, do a new full-reconstruction launch over all of the data.
    • We can add user analysis plugins to this launch, including those with ROOT TTree output, provided that they work and don't take much memory.

Note that monitoring is limited to the first 20 files of each run because of the large total volume of data.

Saving to Tape (Write-thru Cache): Monitoring Launches

  • REST files: All files.
  • ROOT files: One merged file per run.
  • Job stdout/stderr: None.
  • Browser PNGs: One tarball per launch.

Saving to Tape (Write-thru Cache): Full Reconstruction Launches

  • REST files: All files.
  • ROOT files: All files, and one merged file per run.
  • Job stdout/stderr: One tarball per run.
  • Browser PNGs: One tarball per launch.
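
The two save-to-tape lists above can be written as one lookup table, which makes the differences between launch types easy to check. This is a hedged sketch: the dictionary keys and the helper `differing_outputs` are names invented for this example, not part of any existing GlueX script.

```python
# What each launch type saves to tape, transcribed from the lists above.
# Key names are illustrative only.
SAVE_TO_TAPE = {
    "monitoring": {
        "rest": "all files",
        "root": "one merged file per run",
        "job_logs": "none",
        "browser_pngs": "one tarball per launch",
    },
    "full_reconstruction": {
        "rest": "all files",
        "root": "all files, and one merged file per run",
        "job_logs": "one tarball per run",
        "browser_pngs": "one tarball per launch",
    },
}


def differing_outputs(policy=SAVE_TO_TAPE):
    """Return the output types whose policy differs between launch types."""
    mon = policy["monitoring"]
    rec = policy["full_reconstruction"]
    return sorted(key for key in mon if mon[key] != rec[key])


if __name__ == "__main__":
    print(differing_outputs())
```

Only the ROOT files and the job stdout/stderr logs are handled differently; REST files and browser PNGs follow the same rule in both launch types.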

Procedures: Details