Offline Monitoring Post Processing

From GlueXWiki
Revision as of 18:49, 31 October 2016 by Sdobbs (Talk | contribs)



To visualize the monitoring data, we save images of selected histograms and store time series of selected quantities in a database; both are then displayed on the monitoring web pages. The results from the different raw data files in a run are also combined into a single ROOT file per run, and other bookkeeping tasks are performed. This section describes how to generate the monitoring images and database information.

The post-processing scripts generally perform the following steps for each run:

  1. Summarize monitoring information from each EVIO file and store this information in a database
  2. Merge the monitoring ROOT files into a single file for the run
  3. Generate summary monitoring information for the run and store it in a database
  4. Generate summary monitoring plots and store these in a web-accessible location
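As an illustration of step 2, the following is a minimal sketch of how per-file ROOT outputs could be grouped by run and handed to ROOT's hadd utility. It is not the actual post-processing script. The file-name pattern hd_root_<run>_<file>.root, the merged file name, and the function names are assumptions made for this example. The command is only built, not executed.

```python
import re
from collections import defaultdict

# Assumed (hypothetical) naming scheme: hd_root_<run>_<file>.root
FILE_PATTERN = re.compile(r"hd_root_(\d+)_(\d+)\.root$")


def group_files_by_run(filenames):
    """Map run number -> sorted list of per-file monitoring ROOT files."""
    runs = defaultdict(list)
    for name in filenames:
        match = FILE_PATTERN.search(name)
        if match:  # silently skip anything that is not a per-file ROOT output
            runs[int(match.group(1))].append(name)
    return {run: sorted(files) for run, files in runs.items()}


def build_merge_command(run, files, output_dir="."):
    """Build (but do not run) an hadd command that merges one run's files."""
    # Hypothetical name for the merged per-run file
    target = "%s/hd_root_%06d.root" % (output_dir, run)
    # "hadd -f target source1 source2 ..." overwrites target if it exists
    return ["hadd", "-f", target] + list(files)


if __name__ == "__main__":
    example = [
        "hd_root_003180_001.root",
        "hd_root_003180_000.root",
        "hd_root_003185_000.root",
    ]
    for run, files in sorted(group_files_by_run(example).items()):
        print(" ".join(build_merge_command(run, files)))
```

The returned list could be passed to subprocess.call() once the paths have been checked.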

The scripts used to generate this summary data are primarily run from /home/gxprojN/monitoring/process, i.e., under the same account from which the monitoring launch was performed. If you want a new copy of the scripts, e.g., for a new monitoring run, you should check the scripts out from SVN:

svn co

Note that these scripts depend on standard GlueX environment definitions to load the python modules needed to access MySQL databases and to process ROOT files.

Online Monitoring

When a DAQ run ends, the online monitoring system pushes two pieces of data to the Lustre file system: ROOT files containing histograms from the online monitoring system, and a text file containing some run condition information.

This data is processed by a cron job under the "gluex" account, which runs the following script:


This script runs the program which generates summary information. This Python script automatically checks for new ROOT files and processes any that it finds. It contains several configuration variables that must be set correctly, such as the locations of the input and output directories. The run meta-data processing is deprecated in favor of information from the RCDB.

IMPORTANT - When a new run period is started, a new data version must be created, and the scripts updated to reflect the new run period. You may want to update the run number range to scan as well.

Offline Monitoring


This section gives instructions for post-processing different types of monitoring data.

Incoming Data

Offline Monitoring Launch

Reconstruction Launch

Analysis Launch

Simulation Launch

This part of the system is out-of-date and will be updated after the next simulation launch.


After the data has been run over, the results should be processed so that summary data is entered into the monitoring database and plots are made for the monitoring web pages. Currently, this processing is controlled by a cron job that runs the following script:


The default behavior of this script is as follows. The script checks for new ROOT files and only runs over those it has not processed yet. Since one monitoring ROOT file is produced for each EVIO file, whenever a new file is produced, the plots for the corresponding run are recreated, and all the ROOT and REST files for that run are combined into single files. Information is stored in the database both on a per-file basis and for the whole run.
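The "only process what is new" behavior can be sketched as follows. This is an illustration under stated assumptions, not the real script: it assumes the list of processed files is kept in a plain text record file, whereas the actual scripts may track this differently (for example, in the monitoring database). All function names are hypothetical.

```python
import os


def load_processed(record_path):
    """Read the set of already-processed file names (empty if no record yet)."""
    if not os.path.exists(record_path):
        return set()
    with open(record_path) as record:
        return set(line.strip() for line in record if line.strip())


def find_new_files(input_dir, processed):
    """Return the ROOT files in input_dir that are not in the processed set."""
    found = [f for f in os.listdir(input_dir) if f.endswith(".root")]
    return sorted(f for f in found if f not in processed)


def mark_processed(record_path, filenames):
    """Append newly processed file names to the record file."""
    with open(record_path, "a") as record:
        for name in filenames:
            record.write(name + "\n")
```

A cron-driven pass would call load_processed(), process whatever find_new_files() returns, and then call mark_processed() so that the next pass skips those files.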

This procedure has many options, and many of these steps can be toggled on and off. Look at the output of " -h" for more information.

Plots for the monitoring web page can be made from single histograms or multiple histograms using RootSpy macros. If you want to change the list of plots made, you must modify one of the following files:

  • histograms_to_monitor - specify either the name of the histogram or its full ROOT path
  • macros_to_monitor - specify the full path to the RootSpy macro .C file
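A minimal sketch of how such a list file might be read is shown below. The file format is not documented here, so this assumes one entry per line, with blank lines and "#" comments ignored. It also assumes that an entry containing "/" is a full ROOT path and that anything else is a bare histogram name. The function names are hypothetical.

```python
def read_monitor_list(lines):
    """Return the non-empty entries, ignoring blank lines and '#' comments."""
    entries = []
    for line in lines:
        entry = line.split("#", 1)[0].strip()  # assumed comment syntax
        if entry:
            entries.append(entry)
    return entries


def split_histogram_entries(entries):
    """Separate bare histogram names from full ROOT paths (those with '/')."""
    names = [e for e in entries if "/" not in e]
    paths = [e for e in entries if "/" in e]
    return names, paths
```

For example, read_monitor_list(open("histograms_to_monitor")) would return the list of entries to plot, which split_histogram_entries() then divides into names to search for and paths to look up directly.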

Note that the most time-consuming parts of this process are merging the ROOT and REST files.

Step-by-Step Instructions For Processing a New Offline Monitoring Run

The monitoring launches are currently run out of the gxproj1 and gxproj5 accounts. After an offline monitoring launch has been successfully started on the batch farm, follow the steps below to set up the post-processing for these runs.

  1. The post-processing scripts are stored in $HOME/monitoring/process and are automatically run by cron.
  2. Run "svn update" to bring in any changes. Be sure that the lists of histograms and macros to plot are current.
  3. Add a new data version (as described under "Data Versions" below)
  4. Edit check_monitoring_data.csh to point to the current revisions/directories
    • ARGS
    • Note that the environment depends on a standard script - $HOME/setup_jlab.csh or $HOME/env_monitoring_launch
  5. Update files in the web directory, so that the results are displayed on the web pages: /group/halld/www/halldweb/html/data_monitoring/textdata

Check the log files in $HOME/monitoring/process/log for more information on how each run went. If there are problems, modify check_monitoring_data.csh to vary the verbosity of the output.

Data Versions

To document the conditions under which the monitoring data is created, we save several pieces of information for the sake of reproducibility and further analysis. The format is intended to be comprehensive enough to document not just monitoring data but also versions of raw and reconstructed data, so that this database table can be used for the event database as well.

We store one record per pass through one run period, with the following structure:

Field               Description
data_type           The level of data we are processing. For the purposes of monitoring, "rawdata" is the online monitoring and "recon" is the offline monitoring
run_period          The run period of the data
revision            An integer specifying which pass through the run period this data corresponds to
software_version    The name of the XML file that specifies the different software versions used
jana_config         The name of the text file that specifies which JANA options were passed to the reconstruction program
ccdb_context        The value of JANA_CALIB_CONTEXT, which specifies the version of calibration constants that were used
production_time     The date on which monitoring/reconstruction began
dataVersionString   A convenient string for identifying this version of the data

An example file used as input to ./ is:

data_type           = recon
run_period          = RunPeriod-2014-10
revision            = 1
software_version    = soft_comm_2014_11_06.xml
jana_config         = jana_rawdata_comm_2014_11_06.conf
ccdb_context        = calibtime=2014-11-10
production_time     = 2014-11-10
dataVersionString   = recon_RunPeriod-2014-10_20141110_ver01
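A file in this "key = value" layout can be read with a few lines of Python. The sketch below is an illustration, not the actual registration script. The function names are hypothetical, and the rule used to rebuild dataVersionString is only inferred from the single example above, so it may not hold for other versions.

```python
def parse_data_version(text):
    """Parse 'key = value' lines into a dict, splitting on the first '=' only."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        # maxsplit=1 keeps values such as "calibtime=2014-11-10" intact
        key, value = line.split("=", 1)
        fields[key.strip()] = value.strip()
    return fields


def build_version_string(fields):
    """Rebuild dataVersionString; the pattern is inferred from one example."""
    date = fields["production_time"].replace("-", "")
    return "%s_%s_%s_ver%02d" % (
        fields["data_type"],
        fields["run_period"],
        date,
        int(fields["revision"]),
    )


if __name__ == "__main__":
    example = """
    data_type        = recon
    run_period       = RunPeriod-2014-10
    revision         = 1
    ccdb_context     = calibtime=2014-11-10
    production_time  = 2014-11-10
    """
    fields = parse_data_version(example)
    print(fields["ccdb_context"])
    print(build_version_string(fields))
```

Comparing the rebuilt string with the dataVersionString given in the file is a cheap consistency check before the record is added to the database.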