MCWrapper
Generating simulated data that correctly matches the "real" data you are analyzing requires coordinating several different pieces of software with different inputs, which can be very complex to do correctly! MCWrapper provides a framework for simulating events in GlueX that takes care of much of this complexity.
Generating Simulations
The MCWrapper software can be run on a local batch farm or on the Open Science Grid (OSG) using this website. There is a PDF manual, but it is not fully up to date.
For a recent tutorial on how to use MCWrapper check out this presentation.
- Generally you should start by submitting MC requests via the website.
- Usually you will be generating simulations to study the efficiency for some reaction that you are measuring in data that has already been taken. You want to use the latest versions of the simulation software, but the same reconstruction software that was used to process the data, so that the simulated data matches the real data. We keep up-to-date suggestions for which software sets to use for different data sets; follow these instructions. We highly recommend making use of the pre-sets.
- Many data sets were reconstructed on a CentOS7 operating system. To ensure reproducibility and to build the reconstruction software that matches a specific beam time, we use a CentOS7 container. When running on the OSG, the jobs spawn in an Alma9 container and run the relevant bits of code in the CentOS7 container when needed (e.g. the reconstruction step).
- The efficiency depends on the beam intensity, and therefore is run-dependent. You want to generate events spread out over the range of runs from a given data set.
- Here are some common run ranges / RCDB requests: (GlueX Phase-I, Phase-II)
- 2017-01: runs 30274 - 31057, RCDB query: "@is_production and @status_approved"
- 2018-01: runs 40856 - 42559, RCDB query: "@is_2018production and @status_approved"
- 2018-08: runs 50685 - 51768, RCDB query: "@is_2018production and @status_approved and beam_on_current>49"
- 2019-11: runs 71350 - 73266, RCDB query: "@is_dirc_production and @status_approved"
- You also have to pass a configuration file for the event generator you are using. Some example configuration files can be found here, and another set of examples covering some common cases is here. Note that there are some dummy values for things like beam parameters that get filled in by the system.
- After generating events, you can simulate additional particle decays using EvtGen. (more detailed documentation at the link)
- To do this, select "decay_evtgen" in the "Post-processing" pull-down menu
- There is room to specify a few different configuration files, but the only required one is the "decay_evtgen Config". This is the "user decay" file that specifies the details of the requested particle decays, as shown here.
- Always use Geant4 - the Geant3 option is kept for development purposes.
- You probably want to have some PART trees to analyze. Click the "Add Reaction" button to specify the same information that you supplied when requesting your analysis launch.
- The analysis ROOT trees are made in a separate step from the data reconstruction, though they use the same software package (halld_recon). Under the "analysis version set" tag, select the version listed for your analysis launch under the "Analysis Launches" tab in the Private Wiki.
- Remember to only save what you need. If you want to save *.hddm files for specific studies, tick the boxes at the bottom. Be considerate in how many events you request for these studies, as these files can get very large. You might want to submit two projects: one with few events where you save the *.hddm files, and one with more events where you just save the trees (and maybe REST files).
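For the "user decay" file mentioned above, the standard EvtGen decay-file syntax applies: a branching fraction, the daughter particles, and a decay model per line. A minimal illustrative sketch (the particle names and the PHSP phase-space model are standard EvtGen; adapt the channel and model to your analysis):

```text
# Illustrative user decay file: force omega -> pi+ pi- pi0 (phase space).
# Format per line: branching fraction, daughters, decay model.
Decay omega
1.0   pi+ pi- pi0   PHSP;
Enddecay
End
```

See the EvtGen documentation linked above for the full list of available decay models.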
On the command line
MCWrapper can also be used on the command line. The tutorial mentioned above contains some pointers on how to get started.
In general, you will have to provide an MC.config file specifying all job parameters, and supply the run number (or range) and the number of events you want to generate.
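As a rough sketch of what such a file looks like (the variable names follow the example configs shipped with MCwrapper — consult those for the full, authoritative list; the values below are placeholders):

```text
# Hedged sketch of an MC.config; see the example configs in the
# MCwrapper distribution for all supported variables.
GENERATOR=genr8            # which event generator to run
GENERATOR_CONFIG=/path/to/my_generator.cfg
GEANTVER=4                 # always use Geant4 (see above)
NCORES=4
BATCH_SYSTEM=swif2cont     # JLab farm jobs in the standard container
```

A job is then launched by passing the config file together with a run number and an event count, e.g. `gluex_MC.py MC.config 30274 10000` (check the tutorial above for the exact invocation and further options).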
- MCwrapper is increasingly optimised for running on the OSG.
- The version of MCwrapper used on the OSG is stored in /scigroup/mcwrapper/gluex_MCwrapper/. This is not necessarily the same as the latest release or the current master as it might contain some hotfixes needed to optimise OSG running.
- We recommend running MCWrapper inside the standardised Alma9 container used on the OSG. Information on how to open a shell in a container can be found here (recommendation: use gxshell).
- You can send jobs off to the batch farm. Use the option BATCH_SYSTEM=swif2cont to launch the job in our standard container on the farm.
Managing Simulations
- When you submit a request through the form, you will get an email with information about your request.
- Information about currently running jobs can be found on this page. Completed jobs can be found on this page.
- MC requests are listed in a table, one request per line, with various summary information available
- A normal (left) click on one of these table rows fills the tables below with information on the individual jobs.
- A right-click on one of these rows lets you perform various actions on the workflow, including getting a copy of the MCWrapper configuration file for the request, copying the parameters of the request into a new simulation request, cancelling the request, and declaring it done.
- There are usually a few jobs that won't finish for some reason. If your project is sitting at >98% complete for several days, just declare it done.
- All of the files are saved on the JLab farm under /cache/halld/gluex_simulations/REQUESTED_MC
Expert
As of June 2025, MCWrapper is maintained by Thomas Britton, Peter Hurck and Drew Smith.
- The offsite production on the OSG is done by the user mcwrap. To get access to this account, get in touch with the maintainers.
- The two main machines needed for MCWrapper OSG production are the submit host, scosg2201, and the data transfer node, dtn2303.
- The software used for offsite production is stored in /scigroup/mcwrapper/gluex_MCwrapper. Sometimes it might be necessary to develop or hotfix in-situ in that directory. Be mindful when you do this!
Random trigger
Random trigger files are stored in /work/osgpool/halld/random_triggers/. An rsync cronjob running on dtn2303 under the user mcwrap keeps them up to date and copies new random trigger files over.
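Schematically, that cronjob is just a periodic rsync. An illustrative crontab entry (the actual schedule, source host, and rsync options on dtn2303 may differ; the source path here is made up):

```text
# Illustrative crontab entry (not the production one): hourly rsync of
# new random-trigger files into the osgpool area; SOURCE is hypothetical.
0 * * * * rsync -a --ignore-existing SOURCE:/path/to/random_triggers/ /work/osgpool/halld/random_triggers/
```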
Testing
After jobs are submitted, they are tested automatically on scosg2201.
Submission
Jobs are submitted from scosg2201 via different cronjobs running under user mcwrap.
Merging
Jobs send their data back to /work/osgpool/halld/REQUESTEDMC_OUTPUT/. On dtn2303, a cronjob runs MCBundle_wrapper.py when a project is done. The merging takes place in /export/halld/mcwrap/mergetemp/ and the results are stored in /work/osgpool/halld/REQUESTED_MC/.
Saving and Cleaning
A cronjob running jmigrate moves the merged files from /work/osgpool/halld/REQUESTED_MC/ to /cache/halld/gluex_simulations/REQUESTED_MC/. A scrubber removes the files from /work/osgpool/halld/REQUESTEDMC_OUTPUT/, and a cronjob cleans empty directories from /work/osgpool/halld/REQUESTED_MC/ and /export/halld/mcwrap/mergetemp/ after the data has been moved to cache. Be aware that, in general, directories in /export/halld/mcwrap/mergetemp/ WILL NOT be empty after the merge completes: they will contain hidden files used for checkpointing. Their total size is small, but once in a while these should be cleared out.
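Clearing out those hidden checkpoint files can be done with a couple of find commands. A minimal sketch (this is not the production scrubber, and the function name is made up; double-check the path before deleting anything):

```shell
#!/bin/sh
# Hypothetical helper (not the production scrubber): delete the hidden
# checkpoint files left in the merge area, then prune the project
# directories that become empty as a result.
clean_mergetemp() {
    # $1: merge area, e.g. /export/halld/mcwrap/mergetemp/
    find "$1" -mindepth 2 -type f -name '.*' -delete
    find "$1" -mindepth 1 -type d -empty -delete
}

# Example: clean_mergetemp /export/halld/mcwrap/mergetemp/
```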