GlueX Software Meeting, October 25, 2021

From GlueXWiki

GlueX Software Meeting
Monday, October 25, 2021
9:30 am EDT
BlueJeans: 968 592 007

Agenda

  1. Announcements
    1. Power off of scosg16
    2. Compute Farm Updates
  2. Review of Minutes from the Last Software Meeting (all)
  3. Review of Minutes from the Last HDGeant4 Meeting (all)
  4. FAQ of the Fortnight: What are "directory tags?"
  5. Package Structure (Mark)
  6. Review of recent issues and pull requests:
    1. halld_recon
    2. halld_sim
    3. CCDB
    4. RCDB
    5. MCwrapper
    6. gluex_root_analysis
  7. Review of recent discussion on the GlueX Software Help List (all)
  8. Meeting time change? (all)
  9. Action Item Review (all)

Minutes

Present: Alex Austregesilo, Thomas Britton, Sean Dobbs, Mark Ito (chair), Igal Jaegle, Richard Jones, David Lawrence, Simon Taylor, Beni Zihlmann

There is a recording of this meeting. Log into the BlueJeans site first to gain access (use your JLab credentials).

Announcements

  1. Power off of scosg16. scosg16.jlab.org is going away. It will be replaced by scosg20.jlab.org as our OSG submit node.
    • scosg20 is two-factored. Use your JLab two-factor login method (i.e., the same one you use to log in to hallgw.jlab.org, e.g., MobilePASS) to log in.
    • scosg20 is not a GLOBUS endpoint, as scosg16 was. Use a Data Transfer Node (DTN) instead. Ask the computer center for access if you do not have it already.
  2. Compute Farm Updates. Nodes on the JLab farm will be transitioning from CentOS 7.7 to 7.9 over the coming months. As announced last week, CentOS 7.9 nodes will be assigned the BMS_OSNAME corresponding to 7.7 for the time being.
  3. CentOS Stream 8. Richard alerted us to the news that CERN and Fermilab are heading in the direction of CentOS Stream 8 as the next generation scientific computing platform. From slides shown at a CERN IT Technical-Users Meeting: "Going forward, we propose to target CentOS Stream 8 as the standard distribution for experiments."
    • David expressed the concern that with CentOS Stream streaming ahead, there might be OS-dependent gotchas not captured by our versioning scheme. Richard proposed containers as a way around that. He also mentioned that the CERN/Fermilab folks have operational experience with Stream without major difficulties, which is evidence that such gotchas are rare.
    • David endorsed the idea of running in containers on the JLab farm. He pointed out that we already run containerized on all of the off-site platforms we use. Richard remarked that an added benefit of standardizing on containers is that non-JLab folks (e.g., Richard) could reproduce the OS exactly when troubleshooting reported problems, and not have to log into JLab to get the right OS.

Review of Minutes from the Last Software Meeting

We went over the minutes from the meeting on October 11th.

Doxygen and Documentation

Beni had been exploring the features of Doxygen by adding comments to the TOF and start counter code. He thinks there is a lot of promise in using Doxygen, but needs to learn the system a bit more before rendering judgment. He will give us a summary of what he has learned once he is farther along.

Richard brought up the use of Doxygen at an EIC simulation meeting with Makoto Asai in attendance. The feeling of the participants was that a PDF note was the best choice for the primary documentation vehicle. Doxygen was useful, but mainly to a small core of developers. This despite Geant4 having extensive Doxygen-oriented comments.

Having a summary of purpose and methods at the top of each source file, written in Doxygen syntax, seems to be a consensus best practice.

DL1MCTrigger Crashes With mysql rcdb/ccdb

We had more discussion of halld_recon Issue #81.

  • Alex had posted an example that produces the error reliably. Interestingly, it only occurs for SQLite CCDB files on a raw data file. The original post reported the bug in simulation, and only when using a MySQL server for CCDB. Alex's example only has trouble when running multi-threaded.
  • There may be two different issues here.
  • Richard speculated that there is some sort of race condition when running multi-threaded.
  • Simon tried to reproduce the error under Valgrind, without success (i.e., it worked).
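
For readers unfamiliar with the kind of problem Richard speculated about, the sketch below shows the general pattern: several threads asking for the same lazily loaded resource, with a lock making sure it is loaded only once. This is a generic Python illustration, not code from halld_recon, CCDB, or RCDB. The class SharedCalibrationCache, the function run_workers, and the loader are hypothetical names introduced here, and nothing in the minutes establishes that Issue #81 has this cause.

```python
import threading


class SharedCalibrationCache:
    """Hypothetical stand-in for a resource shared by worker threads."""

    def __init__(self, loader):
        self._loader = loader          # function that fetches a value for a key
        self._lock = threading.Lock()  # serializes the check-then-load step
        self._values = {}
        self.load_count = 0            # how many times the loader actually ran

    def get(self, key):
        # Without the lock, two threads could both see the key as missing
        # and both run the loader; that check-then-act gap is the race.
        with self._lock:
            if key not in self._values:
                self.load_count += 1
                self._values[key] = self._loader(key)
            return self._values[key]


def run_workers(cache, key, n_threads=8):
    """Start n_threads threads that all request the same key."""
    results = []
    results_lock = threading.Lock()

    def worker():
        value = cache.get(key)
        with results_lock:
            results.append(value)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

With the lock in place, every worker sees the same value and the loader runs once. Bugs of this type tend to disappear under tools that change thread timing, which would be consistent with the Valgrind run succeeding, though that is not proof of a race.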

Review of Minutes from the Last HDGeant4 Meeting

We went over the minutes from the meeting on October 18. Igal will enforce an upper limit on the single-block FCAL energy in mcsmear to avoid values in excess of 8 GeV.
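
As a rough illustration of the kind of limit described, the sketch below caps per-block energies at 8 GeV. It is written in Python for brevity and is not the mcsmear implementation. The function names are hypothetical, and the minutes do not say whether over-limit values will be clamped to the limit (as assumed here) or handled some other way.

```python
# Upper limit quoted in the minutes; everything else here is an assumption.
FCAL_BLOCK_ENERGY_LIMIT_GEV = 8.0


def cap_block_energy(energy_gev, limit_gev=FCAL_BLOCK_ENERGY_LIMIT_GEV):
    """Return the block energy, clamped so it never exceeds the limit."""
    return min(energy_gev, limit_gev)


def cap_block_energies(energies_gev, limit_gev=FCAL_BLOCK_ENERGY_LIMIT_GEV):
    """Apply the cap to each single-block energy in a list."""
    return [cap_block_energy(e, limit_gev) for e in energies_gev]
```

For example, cap_block_energies([0.5, 8.0, 11.3]) leaves the first two values unchanged and reduces the third to 8.0.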

FAQ of the Fortnight: What are "directory tags?"

Mark led us through the FAQ.

Package Structure

Mark led us through a series of diagrams illustrating the dependency structure of our software packages along with a proposal for a new structure. Find his explanation starting at 55:54 in the recording.

He proposes a structure such that the simulation code (halld_sim and hdgeant4) does not depend on the "reconstruction" code (halld_recon). This should be possible since simulation is performed before reconstruction, everywhere and always. The current structure has halld_sim and hdgeant4 dependent on halld_recon, not because of the details of reconstruction, but due to other support functions that happen to live in the halld_recon repository, in particular I/O routines and magnetic-field handling.

To obtain independence of simulation from reconstruction, he proposed a new package, hd_interface, that captures all code from halld_recon upon which simulation depends. The weakness in the scheme is that a naive capture of source files from halld_recon into hd_interface, while allowing a build of halld_sim and hdgeant4 independent of halld_recon, likely gives an interface library whose changes still impact reconstruction. The impact is clearly less than that which comes with dependence on the whole of halld_recon, but it still diminishes the utility of the new interface library. The extent of the diminution needs review going forward.
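
The dependency argument can be checked mechanically. The sketch below encodes the current and proposed structures as small graphs and tests whether a package depends, directly or transitively, on another. Only the edges described in these minutes are included. The edge from halld_recon to hd_interface is an assumption inferred from the remark that the interface library impacts reconstruction, and the function and variable names are hypothetical.

```python
def depends_on(graph, package, target):
    """Return True if package depends on target, directly or transitively."""
    seen = set()
    stack = [package]
    while stack:
        current = stack.pop()
        for dependency in graph.get(current, ()):
            if dependency == target:
                return True
            if dependency not in seen:
                seen.add(dependency)
                stack.append(dependency)
    return False


# Current structure: simulation packages depend on halld_recon.
CURRENT_DEPS = {
    "halld_recon": [],
    "halld_sim": ["halld_recon"],
    "hdgeant4": ["halld_recon"],
}

# Proposed structure: shared code moves to hd_interface.
# The halld_recon -> hd_interface edge is an assumption (see lead-in).
PROPOSED_DEPS = {
    "hd_interface": [],
    "halld_recon": ["hd_interface"],
    "halld_sim": ["hd_interface"],
    "hdgeant4": ["hd_interface"],
}
```

Under these assumptions, depends_on(PROPOSED_DEPS, "halld_sim", "halld_recon") is False, while halld_recon still depends on hd_interface, which is the residual coupling noted above.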

Action Item Review

  1. Create a Wiki page of Doxygen links. (Mark)
  2. Add Alex to the gluex_admin team on GitHub. (Mark)
  3. Add non-privileged pull-request-re-test procedure to the FAQ.
  4. Figure out why Beni's Doxygen-related pull request does not return a result from the automatic test.