Running jobs on the grid
This page documents recent work on getting jobs to run on the Open Science Grid, listing the steps taken and the issues encountered. A separate HowTo covers getting started with running jobs on the grid; look there for instructions and examples of the techniques used here.
One of the goals of using the Open Science Grid (OSG) is to have a large set of generic background MC that can be used for various analyses. Ideally, this MC set can be regenerated easily whenever a major update occurs in the Hall D simulation and reconstruction code. The most efficient way to store this data right now is to keep only the reconstructed hddm files, which can then be analyzed on the grid with the Hall D code or with your own analysis code. The resulting ROOT files can be downloaded to a local machine.
The following steps are general; the example used here is the analysis of gamma p -> pi+ pi- pi+ n.
We begin by building the Hall D source code and any analysis code in a designated area on the grid. In this particular instance, that area is /nfs/direct/apps/Gluex/pi-pi-pi-n. Note that bggen and any necessary plugins (danahddm) must be built in addition to the Hall D source code. This can be done using submission and executable files like those found in the HowTo.
Once the source code is built, we use an executable called run_sim.sh, which takes two arguments (the random number seed and the number of events to generate), to run bggen, hdgeant and mcsmear. The resulting hdgeant_smeared.hddm file is sent to the grid storage space. This executable then chains into the next one, run_ana.sh, which runs the hd_ana program with the danahddm plugin. The analysis itself is empty, but the reconstructed events are saved to a new file called dana_events.hddm, which is also sent to storage. That script in turn chains into a third executable, run_3pi.sh, which runs my specific analysis program and outputs a ROOT file. All unneeded output is deleted.
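The order of the three stages can be sketched with a small driver script. This is only an illustration, not the scripts actually used: the script names and the two arguments to run_sim.sh come from the description above, while the driver itself, the DRY_RUN switch, the run_stage and run_chain helpers, and the default values are assumptions. Note also that here one driver calls the three scripts in turn, whereas in the setup described above each script invokes the next one itself.

```shell
#!/bin/sh
# Hypothetical driver for the run_sim.sh -> run_ana.sh -> run_3pi.sh chain.
# Assumed, not from the real setup: DRY_RUN, run_stage, run_chain, defaults.

set -eu   # stop at the first failing stage or unset variable

SEED="${1:-1}"            # random number seed (default is an assumption)
NEVENTS="${2:-100}"       # number of events to generate (default is an assumption)
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands; 0 = really run them

# Run one stage, or just print it when DRY_RUN is 1.
run_stage() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# The three stages, in the order described in the text.
run_chain() {
    # bggen, hdgeant, mcsmear; produces hdgeant_smeared.hddm
    run_stage ./run_sim.sh "$SEED" "$NEVENTS"
    # hd_ana with the danahddm plugin; produces dana_events.hddm
    run_stage ./run_ana.sh
    # reaction-specific analysis; produces the ROOT file
    run_stage ./run_3pi.sh
}

run_chain
```

With DRY_RUN left at its default the script only prints the three commands, so it can be tried on a machine where the Hall D software is not installed. Setting DRY_RUN=0 would execute the scripts, assuming they exist in the current directory.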