GlueX Offline Meeting, April 6, 2018

From GlueXWiki

GlueX Offline Software Meeting
Friday, April 6, 2018
10:00 am EDT
JLab: CEBAF Center A110
BlueJeans: 968 592 007

Agenda

  1. Announcements
    1. New tag of sim-recon for simulation: recon-2017_01-ver02-sim_ver01
    2. Meeting on Containers, 11:30 today, A110
    3. Launches (Alex A.)
    4. Software Review, Summer 2018
  2. Review of minutes from the March 9 meeting (all)
  3. Report from the GlueX Containers meeting on March 30 (Mark)
  4. HDvis Update (Thomas)
  5. Review of recent pull requests (all)
  6. Review of recent discussion on the GlueX Software Help List (all)
  7. Action Item Review (all)

Communication Information

Remote Connection

Slides

Talks can be deposited in the directory /group/halld/www/halldweb/html/talks/2018 on the JLab CUE. This directory is accessible from the web at https://halldweb.jlab.org/talks/2018/ .

Minutes

Present:

  • JLab: Alex Austregesilo, Amber Boehnlein, Thomas Britton, Mark Ito (chair), David Lawrence, Simon Taylor, Beni Zihlmann

There is a recording of this meeting on the BlueJeans site. Use your JLab credentials to access it.

Announcements

  1. New tag of sim-recon for simulation: recon-2017_01-ver02-sim_ver01. This tag is meant for use by simulations.
  2. Meeting on Containers, 11:30 today, A110. Note that future containers meetings will be in A110 as well.
  3. Launches. Alex A. is starting a new analysis launch on 2016 data for Lubomir Pentchev.
  4. Software Review, Summer 2018. Mark reported that there are plans for another software review, perhaps early in the Summer.
  5. SQLiteCpp and SQLite libraries. Mark reported that there are still some residual problems with these libraries in the builds. He is working on a solution, and a new version with fixes should be coming out soon. One item holding up the release is checking whether the recent fix to the memory leak in tracking does the job.

Review of minutes from the March 9 meeting

We went over the minutes without much comment.

Report from the GlueX Containers meeting on March 30

Mark reviewed his notes from the meeting. Two major topics from that meeting:

  1. Mark reviewed the talk he gave at the OSG All-Hands meeting on March ???. He emphasized the new user-friendly access to the OSG for GlueX collaborators and raised the idea of doing raw data reconstruction on the OSG.
  2. Richard reported on work he has done using XROOTD to stream data over the network as input to jobs running on the OSG. This solves the problem of getting our data to a remote grid node when the identity and location of the node are unknown until the job starts. It is the solution that ATLAS and CMS have been using for years and relies on mature technology. We hope to set up a system on the Submit Host in the near future.

For all of the details, see the notes at the link given above.

Progress on Running Simulation on the OSG

Thomas described the latest go-around submitting jobs to the OSG using MCwrapper.

  • There was no wait for jobs to start up. They were running minutes after they were submitted.
  • We observed use of a large fraction (30-40%) of the 1 Gb interface on the Submit Node at multi-job-start-up time.
  • Amber mentioned that the interim report to the Office of Nuclear Physics did not mention the ability to plan large simulation campaigns beyond peak-demand relief. Neither did it mention the ability to reconstruct raw data.
  • Amber reported that NERSC is committed to devoting resources to GlueX reconstruction as part of their strategic agenda. They need to have specific deliverables, so we need to identify appropriate data sets and code versions for running there. We hope to do this over the summer.
  • Alex mentioned bookkeeping as an issue when running on the OSG. Indeed, when the data challenges were run, the OSG component was not tracked with a database; only the JLab-resident jobs were, using a database developed by Mark. On the batch farm, Alex is tracking jobs with the SWIF database (although we do not have SQL-query-level access to it). Mark thought that bookkeeping is where most of the GlueX development manpower will be spent in getting these large-scale campaigns under control. Functions include basic job info, resource use, job status, evaluation of job success, re-submission on failure, and permanent storage and cataloging of output files. Amber mentioned a Fermilab project that we might want to look at. She remarked that the need is common to all groups at the lab and beyond, and that a common solution would avoid duplication of effort.
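
As a rough illustration of the bookkeeping functions listed above (job info, resource use, status, success evaluation, re-submission on failure, and cataloging of output files), here is a minimal sketch using Python's built-in sqlite3 module. The table layout, column names, function names, and the success criterion (zero exit code plus a registered output file) are all assumptions made for illustration; they do not describe the SWIF database, the data-challenge database, or any existing GlueX tool.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    job_id      INTEGER PRIMARY KEY,
    site        TEXT NOT NULL,
    status      TEXT NOT NULL DEFAULT 'submitted',
    exit_code   INTEGER,
    cpu_seconds REAL,
    attempts    INTEGER NOT NULL DEFAULT 1,
    output_file TEXT
)
"""


def open_db(path=":memory:"):
    """Open (or create) the bookkeeping database."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def submit(conn, site):
    """Record a newly submitted job and return its id."""
    cur = conn.execute("INSERT INTO jobs (site) VALUES (?)", (site,))
    conn.commit()
    return cur.lastrowid


def finish(conn, job_id, exit_code, cpu_seconds, output_file=None):
    """Record resource use and evaluate success.

    Assumed criterion: exit code 0 and an output file was registered.
    """
    status = "succeeded" if exit_code == 0 and output_file else "failed"
    conn.execute(
        "UPDATE jobs SET status = ?, exit_code = ?, cpu_seconds = ?, "
        "output_file = ? WHERE job_id = ?",
        (status, exit_code, cpu_seconds, output_file, job_id),
    )
    conn.commit()
    return status


def resubmit_failed(conn, max_attempts=3):
    """Mark failed jobs for re-submission; return how many were reset."""
    cur = conn.execute(
        "UPDATE jobs SET status = 'submitted', attempts = attempts + 1 "
        "WHERE status = 'failed' AND attempts < ?",
        (max_attempts,),
    )
    conn.commit()
    return cur.rowcount
```

A real system would also need to catalog where output files are permanently stored and to interface with the actual submission mechanism; those parts are omitted here.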

HDvis Update

Thomas has added new features to HDvis:

  1. Scrubber bar: used to click and drag time-in-event.
  2. Camera snap-to's: Fixed views of the detector, such as "barrel top" or "TOF".
  3. Context menus for choosing mass hypothesis: right-click and choose the particle type for a charged track. The track is then redrawn with the chosen hypothesis.

He is working on a method for implementing a "next event" function that will work for multiple users running independent browsers on the same node. See the demo displaying a J/ψ event at ??? in the recording.

Monitoring Dashboard

Sean sent email proposing a monitoring dashboard (with colors) that would display all relevant monitoring results at a glance with links for digging deeper when necessary.

  • Alex would like to see things like ρ yield.
  • We might want to add Monte Carlo data to the recon_test.
  • There is a simulation test that runs twice a week. Not everyone is on the email list for the results.
  • David is looking at InfluxDB as a way to look at selected quantities as a function of time or run number.
  • We will meet with Sean to flesh out issues when he is here next week.

Action Items

  1. Put MC into recon_test.
  2. Schedule a monitoring dashboard meeting.
  3. Put together a new release that fixes the SQLite issue, the MC treatment of the CDC, and the memory leak.