GlueX Software Meeting, April 14, 2020
Tuesday, April 14, 2020
3:30 pm EDT
BlueJeans: 968 592 007
- MCwrapper: Schoolly D release (Thomas)
- Review of Minutes from the Last Software Meeting (all)
- Random number seeds and the DL1MCTrigger_factory (Mark D.)
- Missing hypothesis timing (Justin)
- Python 2 vs. Python 3 (Mark I.)
- More halld_recon splitting?
- Review of recent issues and pull requests
- Review of recent discussion on the GlueX Software Help List (all)
- Action Item Review (all)
Present: Alex Austregesilo, Edmundo Barriga, Thomas Britton, Mark Dalton, Sean Dobbs, Sergey Furlatov, Hovanes Egiyan, Mark Ito (chair), Igal Jaegle, Naomi Jarvis, David Lawrence, Hao Li, Keigo Mizutani, Justin Stevens, Simon Taylor, Beni Zihlmann
MCwrapper: Schoolly D release
- Includes some changes to deal with the CCDB saga. No recent complaints about server overloading from MCwrapper.
- Added option to use Slurm at JLab directly. Sub-option to use Slurm with a container.
- Sean's fixes to skipping events seem to have worked.
- N.B., run 41106 has a corrupt random trigger file.
Review of Minutes from the Last Software Meeting
We went over the minutes from March 31.
- Mark I. got a prescription for how to generate a sample database for CCDB 2.0 from Dmitry Romanov.
- Thomas reported that he wrote a wiki page on Instructions for Generating Simulations as requested.
Random number seeds and the DL1MCTrigger_factory
Mark Dalton reported an issue with warnings he gets when running certain generators.
WARNING: Random seeds passed to DRandom2::SetSeeds have forbidden values:
  seed = 1 (must be at least 2)
  seed1 = 1064317597 (must be at least 8)
  seed1 = 48 (must be at least 16)
See comments in source for TRandom2::SetSeed(int)
The seeds will all be adjusted to be in range.
The problem has been traced to libraries/TRIGGER/DL1MCTrigger_factory.cc. The code is from Sascha Somov, in 2017. Richard prescribed a simple solution that does not really use the random number generator in the way it is supposed to be used, but will suppress warnings. There is also a "proper" solution, but that was a bit hard to glean from the email exchange Mark showed.
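The range check behind the warning can be illustrated with a minimal sketch. It assumes the three minimum values quoted in the message (2, 8, and 16) and assumes that "adjusted to be in range" means raising a too-small seed to its minimum; the actual adjustment in DRandom2 may differ. The function name and the tuple layout are hypothetical, and this is neither Richard's simple solution nor the "proper" one.

```python
# Minimum allowed value for each of the three seeds, as quoted in the warning.
MIN_SEEDS = (2, 8, 16)


def adjust_seeds(seeds):
    """Return (adjusted_seeds, changed) for a 3-tuple of seeds.

    Assumption: an out-of-range seed is simply raised to its minimum.
    """
    if len(seeds) != len(MIN_SEEDS):
        raise ValueError("expected exactly three seeds")
    adjusted = tuple(max(s, lo) for s, lo in zip(seeds, MIN_SEEDS))
    return adjusted, adjusted != tuple(seeds)


# The values from the warning: only the first seed is below its minimum.
new_seeds, changed = adjust_seeds((1, 1064317597, 48))
print(new_seeds, changed)
```

With the values from the warning, only the first seed would be changed, which suggests that a single out-of-range seed is enough to trigger the message.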
There is the question of whether there should be any random number generation in the reconstruction libraries at all. Our philosophy is to put any detector noise in mcsmear so that reconstruction, even of Monte Carlo data, is reproducible.
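One common way to keep a step reproducible while still drawing random numbers is to derive the seed from quantities that identify the event, so that a second pass over the same event sees the same sequence. The sketch below illustrates that general idea only. It is not what DL1MCTrigger_factory or mcsmear does, and the function name, the hashing scheme, the tag string, and the run and event numbers are all assumptions made for illustration.

```python
import hashlib
import random


def event_rng(run_number, event_number, tag="trigger"):
    """Return a random.Random seeded deterministically from the event identity."""
    key = "{}:{}:{}".format(tag, run_number, event_number).encode("ascii")
    # Use the first 8 bytes of a SHA-256 digest as a 64-bit integer seed.
    seed = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return random.Random(seed)


# Two independent passes over the same (hypothetical) event draw the same numbers.
rng_first_pass = event_rng(1, 7)
rng_second_pass = event_rng(1, 7)
first = [rng_first_pass.random() for _ in range(3)]
second = [rng_second_pass.random() for _ in range(3)]
print(first == second)
```

The design choice here is that no generator state is carried from one event to the next, so the result does not depend on the order in which events are processed.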
In the end we did not come to a conclusion on a path forward. The choices are (a) leave it alone and ignore the warnings, (b) implement the simple solution and suppress the warnings, (c) fix it properly in DL1MCTrigger_factory.cc, or (d) move the random noise generation to mcsmear. We may have to discuss this further.
Missing hypothesis timing
Justin made a change (Pull Request #346) that improved the resolution of the time-of-arrival for kaons at the DIRC by referencing it to a fast detector like the Time-of-Flight. This had the side effect of also changing the timing for protons, causing a drop in proton yield that Alex noticed in the reconstruction tests. Justin submitted a fix (Pull Request #353) that leaves the protons alone while still improving the kaon timing.
[The secretary refers the reader to Justin, Simon, and Alex for a more complete description of the problem and solution.]
Python 2 vs. Python 3
Mark I. described recent work in getting a successful build of our software stack on Fedora 31. This distribution has Python 3 as the default and therefore demonstrates explicitly some of the problems we will have in making the transition from Python 2. His approach, however, was to make minimal changes so that our build system explicitly chooses to run Python 2 rather than going along with the default. Fedora comes with the packages to allow this choice in all cases of interest without having to build custom versions of Python or of Python-based code. He found a solution that builds everything from scratch and whose explicit Python 2 choices are also supported on RHEL7. More testing is needed, but he will submit a pull request in the next week or so. See his slides for details.
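The idea of choosing an interpreter explicitly, rather than relying on whatever "python" resolves to, can be sketched as follows. This is only an illustration and not the change described in the slides: the candidate names "python2" and "python2.7" and the helper name are assumptions, and the helper itself is written for Python 3 (it relies on shutil.which).

```python
import shutil


def pick_interpreter(candidates=("python2", "python2.7"), which=shutil.which):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    found = pick_interpreter()
    print(found if found else "no Python 2 interpreter found on PATH")
```

The lookup function is passed in as a parameter so that the selection logic can be exercised without depending on what is installed on a given machine.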
One piece that did not respond to this treatment is the set of Python modules built by halld_recon. These facilitate Python handling of HDDM files. They build under Python 3, but forcing a Python 2 build on Fedora is more complicated than for other components. (Note that on a Python-2-default system, they are building and will continue to build without modification.) The way that the Python module building is incorporated into the halld_recon SCons build system is complicated. This motivated him to look at splitting out the non-reconstruction-related pieces of the HDDM system from halld_recon, where the system would be simpler to work on. In addition, HDDM could then be versioned independently from changes in our reconstruction algorithms. HDDM is currently one of the major components inside of halld_recon that causes halld_recon to be a prerequisite for halld_sim and hdgeant4. An independent HDDM build would be a big first step in breaking the dependency, one that has been causing us a lot of fuss and bother over the years. Again, see his slides for details.
hd_root not filling the skims
Just before the meeting started Hao posted a question to the Software Help List on a crash he was seeing where the run number assigned during a simulation run depended on the node he was running on at CMU. We did not come to any conclusions about the source of his problem.
[Added in press: after the meeting and over the next three days (48 posts) there was a lot of discussion on this topic. In summary, we will be changing the way that the HDDM reader recognizes the flavor of HDDM it is dealing with.]