Difference between revisions of "GlueX Software Meeting, April 14, 2020"

From GlueXWiki

Revision as of 14:39, 17 April 2020

GlueX Software Meeting
Tuesday, April 14, 2020
3:30 pm EDT
BlueJeans: 968 592 007

Agenda

  1. Announcements
    1. MCwrapper: Schoolly D release (Thomas)
  2. Review of Minutes from the Last Software Meeting (all)
  3. Random number seeds and the DL1MCTrigger_factory (Mark D.)
  4. Missing hypothesis timing (Justin)
  5. Python 2 vs. Python 3 (Mark I.)
    • More halld_recon splitting?
  6. Review of recent issues and pull requests:
    1. halld_recon
    2. halld_sim
    3. CCDB
    4. RCDB
  7. Review of recent discussion on the GlueX Software Help List (all)
  8. Action Item Review (all)

Minutes

Present: Alex Austregesilo, Edmundo Barriga, Thomas Britton, Mark Dalton, Sean Dobbs, Sergey Furlatov, Hovanes Egiyan, Mark Ito (chair), Igal Jaegle, Naomi Jarvis, David Lawrence, Hao Li, Keigo Mizutani, Justin Stevens, Simon Taylor, Beni Zihlmann

Announcements

MCwrapper: Schoolly D release.

  • Includes some changes to deal with the CCDB saga. No recent complaints about server overloading from MCwrapper.
  • Added option to use Slurm at JLab directly. Sub-option to use Slurm with a container.
  • Sean's fixes for the event-skipping problem seem to have worked.
  • N.B., run 41106 has a corrupt random trigger file.

Review of Minutes from the Last Software Meeting

We went over the minutes from March 31.

  • Mark I. got a prescription for how to generate a sample database for CCDB 2.0 from Dmitry Romanov.
  • Thomas reported that he wrote a wiki page on Instructions for Generating Simulations as requested.

Random number seeds and the DL1MCTrigger_factory

Mark Dalton reported an issue with warnings he gets when running certain generators.

WARNING: Random seeds passed to DRandom2::SetSeeds have
forbidden values:
  seed = 1  (must be at least 2)
  seed1 = 1064317597  (must be at least 8)
  seed1 = 48  (must be at least 16)
See comments in source for TRandom2::SetSeed(int)
The seeds will all be adjusted to be in range.

The problem has been traced to libraries/TRIGGER/DL1MCTrigger_factory.cc. The code is from Sascha Somov, written in 2017. Richard prescribed a simple solution that does not really use the random number generator in the way it is supposed to be used, but will suppress the warnings. There is also a "proper" solution, but it was a bit hard to glean from the email exchange Mark showed.
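The range constraints quoted in the warning can be illustrated with a minimal sketch. It is not the DRandom2 code: the function names are hypothetical, and the adjustment rule (raise each seed to its minimum) is an assumption, since these minutes do not say how the seeds are actually adjusted. Only the three minimum values come from the warning text.

```python
# Minimum allowed values, taken from the warning text quoted above.
MIN_SEEDS = (2, 8, 16)


def check_seeds(seeds):
    """Return (index, value, minimum) for every seed below its minimum."""
    return [
        (i, s, m)
        for i, (s, m) in enumerate(zip(seeds, MIN_SEEDS))
        if s < m
    ]


def adjust_seeds(seeds):
    """Hypothetical adjustment: raise each out-of-range seed to its minimum.

    This is an assumption for illustration; the real adjustment is not
    described in the minutes.
    """
    return tuple(max(s, m) for s, m in zip(seeds, MIN_SEEDS))


seeds = (1, 1064317597, 48)  # the three values printed in the warning
print(check_seeds(seeds))
print(adjust_seeds(seeds))
```

With the values from the warning, only the first seed falls below its minimum; the other two are already in range.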

There is the question of whether there should be any random number generation in the reconstruction libraries at all. Our philosophy is to put any detector noise in mcsmear so that reconstruction, even if Monte Carlo data, is reproducible.
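The reproducibility argument can be sketched with a toy example, assuming a simple split between a seeded smearing step and a deterministic reconstruction step. The functions `smear` and `reconstruct` are hypothetical stand-ins and do not correspond to the actual mcsmear or halld_recon code.

```python
import random


def smear(true_hits, seed, sigma=0.1):
    """Simulation-side step (the role mcsmear plays): add seeded Gaussian noise."""
    rng = random.Random(seed)  # private generator, fully determined by the seed
    return [t + rng.gauss(0.0, sigma) for t in true_hits]


def reconstruct(hits):
    """Reconstruction-side step: uses no random numbers at all."""
    return sum(hits) / len(hits)


true_hits = [1.0, 2.0, 3.0]
smeared = smear(true_hits, seed=12345)

# Reconstructing the same smeared input twice gives the same answer,
# because all randomness was confined to the smearing step.
print(reconstruct(smeared) == reconstruct(smeared))
```

If the noise were instead generated inside `reconstruct`, two passes over the same file could disagree unless the seed were also controlled there.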

In the end we did not come to a conclusion on a path forward. The choices are (a) leave it alone and ignore the warnings, (b) implement the simple solution and suppress the warnings, (c) fix it properly in DL1MCTrigger_factory.cc, or (d) move the random noise generation to mcsmear. We may have to discuss this further.

Missing hypothesis timing (Justin)

Justin made a change (Pull Request #346) that improved the resolution of the time-of-arrival for kaons at the DIRC by referencing it to a fast detector like the Time-of-Flight. This had the side effect of also changing the timing for protons, causing a drop in proton yield that Alex noticed in the reconstruction tests. Justin submitted a fix (Pull Request #353, https://github.com/JeffersonLab/halld_recon/pull/353) that left the protons alone while still improving the kaon timing.

[The secretary refers the reader to Justin, Simon, and Alex for a more complete description of the problem and solution.]

Python 2 vs. Python 3

Mark I. described recent work on getting a successful build of our software stack on Fedora 31. This distribution has Python 3 as the default and therefore demonstrates explicitly some of the problems we will have in making the transition from Python 2. His approach, however, was to make minimal changes so that our build system explicitly chooses to run Python 2 rather than going along with the default. Fedora comes with the packages to allow this choice in all cases of interest without having to build custom versions of Python or of Python-based code. He found a solution that builds everything from scratch and whose explicit Python 2 choices are also supported on RHEL7. More testing is needed, but he will submit a pull request in the next week or so. See his slides for details.
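The idea of choosing an interpreter explicitly rather than relying on the system default can be sketched as follows. This is only an illustration and not the change Mark described: the function name `find_interpreter` and the candidate executable names are assumptions, and the actual build system may select Python 2 in a different way.

```python
import shutil


def find_interpreter(candidates=("python2", "python2.7"), which=shutil.which):
    """Return the path of the first candidate found on PATH, or None.

    The candidate names are assumptions; a distribution may package
    Python 2 under a different executable name.
    """
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None


# Prints the path of an explicit Python 2 interpreter if one is installed,
# otherwise None. A bare "python" is never consulted.
print(find_interpreter())
```

Passing the lookup function as a parameter keeps the sketch testable on machines where Python 2 is not installed.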

One piece that did not respond to this treatment is the set of Python modules built by halld_recon to facilitate Python handling of HDDM files. They build under Python 3, but forcing a Python 2 build on Fedora is more complicated than for other components. (Note that on a Python-2-default system they build now and will continue to build without modification.) This motivated him to look at splitting out the non-reconstruction-related pieces of the HDDM system from halld_recon, so that they could be used by all packages that need them. HDDM is currently one of the major components inside halld_recon that make halld_recon a prerequisite for halld_sim and hdgeant4. An independent HDDM build would be a big first step in breaking that dependency, which has caused us a lot of fuss and bother over the years. Again, see his slides for details.

hd_root not filling the skims

Just before the meeting started, Hao posted a question to the Software Help List about a crash he was seeing, where the run number assigned during a simulation run depended on the node he was running on at CMU. We did not come to any conclusions about the source of his problem.

[Added in press: after the meeting and over the next three days (48 posts) there was a lot of discussion on this topic. In summary, we will be changing the way that the HDDM reader recognizes the flavor of HDDM it is dealing with.]