Difference between revisions of "GlueX Offline Meeting, September 2, 2015"

From GlueXWiki
(Agenda)
m (Text replacement - "http://argus.phys.uregina.ca/cgi-bin/public" to "https://halldweb.jlab.org/doc-public")
 
(4 intermediate revisions by 2 users not shown)
Line 12: Line 12:
 
## Work disk expansion: 14 to 15 TB.
 
## Work disk expansion: 14 to 15 TB.
 
## Mailman.jlab.org blocked from offsite.
 
## Mailman.jlab.org blocked from offsite.
## [http://argus.phys.uregina.ca/cgi-bin/public/DocDB/ShowDocument?docid=2793 New note on GlueX builds] has been released.
+
## [https://halldweb.jlab.org/doc-public/DocDB/ShowDocument?docid=2793 New note on GlueX builds] has been released.
## Offline Monitoring (Kei)
+
## [https://halldweb1.jlab.org/wiki/images/1/10/2015-09-02-offline_monitoring.pdf Offline Monitoring] (Kei)
 
# Review of [[GlueX Offline Meeting, August 19, 2015#Minutes|minutes from August 19]] (all)
 
# Review of [[GlueX Offline Meeting, August 19, 2015#Minutes|minutes from August 19]] (all)
 
# [[GlueX-Collaboration-Oct-2015|Collaboration Meeting]] October 8-10, 2015 at Jefferson Lab
 
# [[GlueX-Collaboration-Oct-2015|Collaboration Meeting]] October 8-10, 2015 at Jefferson Lab
Line 22: Line 22:
 
#* You might want to start a discussion about how to set conditions for simulations for a possible fall run (or generic next running conditions).  We decided to require a separate CCDB variation name, but I don't think there was a decision on run numbers?
 
#* You might want to start a discussion about how to set conditions for simulations for a possible fall run (or generic next running conditions).  We decided to require a separate CCDB variation name, but I don't think there was a decision on run numbers?
 
# Geant4 Update (Richard, David)
 
# Geant4 Update (Richard, David)
# Some problems with f250 algorithms?
+
# [https://mailman.jlab.org/pipermail/halld-offline/2015-August/002138.html Some problems with f250 algorithms?]
 
# Action Item Review
 
# Action Item Review
  
Line 35: Line 35:
  
 
Talks can be deposited in the directory <code>/group/halld/www/halldweb/html/talks/2015</code> on the JLab CUE. This directory is accessible from the web at https://halldweb.jlab.org/talks/2015/ .
 
Talks can be deposited in the directory <code>/group/halld/www/halldweb/html/talks/2015</code> on the JLab CUE. This directory is accessible from the web at https://halldweb.jlab.org/talks/2015/ .
 +
==Minutes==
 +
 +
Present:
 +
* '''CMU''': Curtis Meyer, Mike Staib
 +
* '''FIU''': Mahmoud Kamel
 +
* '''FSU''': Brad Cannon, Aristeidis Tsaris
 +
* '''JLab''': Alex Barnes, Amber Boehnlein, Mark Dalton, Mark Ito (chair), Paul Mattione, Eric Pooser, Nathan Sparks, Justin Stevens, Simon Taylor, Beni Zihlmann
 +
* '''UConn''': Richard Jones, James McIntyre
 +
 +
There is a [https://bluejeans.com/s/8xII/ recording of this meeting] on the BlueJeans site.
 +
 +
===Announcements===
 +
 +
<ol>
 +
<li> The '''location of the nightly builds''' at JLab has changed.
 +
  <ul>
 +
  <li> Old location example:
 +
    <ul>
 +
    <li><code>/u/scratch/gluex/nightly/2015-09-02/sim-recon</code>
 +
    </ul>
 +
  <li> New location example:
 +
    <ul>
 +
    <li><code>/u/scratch/gluex/nightly/2015-09-02/Linux_RHEL6-x86_64-gcc4.4.7/sim-recon</code>
 +
    </ul>
 +
  </ul>
 +
See the [[Nightly Builds of GlueX Software|nightly builds wiki page]] for more on the system.
 +
<li> '''Sim-recon 1.5.1 has been released'''. The [https://github.com/JeffersonLab/sim-recon/releases/tag/1.5.1 release notes] are posted at GitHub.
 +
<li> We are now '''deleting CCDB SQLite databases older than one year'''. Formerly we were never deleting the monthly files. See the [[SQLite-form_of_the_CCDB_database#Archive_of_SQLite_Files|wiki page]] describing the system for more information.
 +
<li> After filling up last week, our '''work disk was expanded''' from 14 to 15 TB. Further expansion will require movement of the partition to a new file server.
 +
<li> '''Mailman.jlab.org has been blocked from off-site''' for the past week or two due to cyber attacks. This restricts access to the mail archives and list administration pages, but does not affect the operation of the email lists. Access is not restricted from inside the JLab firewall.
 +
<li> Mark drew our attention to his '''[https://halldweb.jlab.org/doc-public/DocDB/ShowDocument?docid=2793 new note on GlueX builds]'''.
 +
<li> '''Kei gave a report on the last launch'''. See [https://halldweb.jlab.org/wiki/images/1/10/2015-09-02-offline_monitoring.pdf his slides] for the details.
 +
  <ul>
 +
  <li> There were a lot of problems this time.
 +
  <li> Kei continues to give feedback to Chris Larrieu on SWIF developments.
 +
  <li> The memory footprint of the jobs nearly doubled.
 +
  <li> The next launch is scheduled for September 12.
 +
  </ul>
 +
</ol>
 +
 +
===Review of minutes from August 19===
 +
 +
We went over [[GlueX Offline Meeting, August 19, 2015#Minutes|the minutes]].
 +
 +
====Hall D Package Manager====
 +
 +
Nathan made changes to support the use of a version.xml to create templates, as per feedback from the meeting.
 +
 +
====Flash ADC Pulse Processing====
 +
 +
David convened a meeting of the interested parties. We will ask for a report at a future meeting.
 +
 +
====Who is responsible for deleting Git branches?====
 +
 +
Our policy of having the branch creator delete branches after they have been merged is still in place, though compliance has been spotty.
 +
 +
====Clang on the CUE====
 +
 +
Nathan reported that Clang is already installed on the ifarm, but was compiled against GCC 4.8.3, whereas the default compiler there is GCC 4.4.7. That means that all packages might have to be compiled with the newer version. The system for supporting that for GlueX software has not been worked out.
 +
 +
===Geant4 Update===
 +
 +
* Richard has now configured the geometry to have the same magnetic field in all layers. This makes it insensitive to the algorithm Geant4 uses for picking the field-setting geometry layer. He is deferring an evaluation of the CPU cost of keeping the field on in the calorimeter volumes to a later stage of development work.
 +
* He looked at how the current implementation performs with multi-threading turned on in Geant4. Initially the program was crashing. He traced the problem to a bug in the way geometry information is shared by threads. David had previously seen this and a fix was attempted by the Geant4 team. Unfortunately, the fix leads to other serious problems. Richard has put together his own patch in user code so he can keep going. He has also communicated his ideas for a more fundamental fix to the Geant4 folks.
 +
* With multi-threading working properly, he sees perfect scaling of performance on an 8-core machine, up to 6 threads (as high as he tried).
 +
* The next step is to put together an end-to-end scheme for hit generation for a single detector system. That way others who might want to contribute will have a complete example to follow. He is planning on doing this with the CDC.
 +
 +
===Collaboration Meeting Agenda===
 +
 +
We discussed the agenda for the upcoming [[GlueX-Collaboration-Oct-2015|Collaboration Meeting]], October 8-10, 2015 at Jefferson Lab. Possible talks and speakers are:
 +
 +
* Overview talk, including Data Challenge 3 report (Mark)
 +
* Geant4 Development Update (Richard)
 +
* Version Management System (Mark)
 +
* Offline Monitoring and SWIF (Kei)
 +
* Calibration Overview (Sean Dobbs)
 +
 +
===Data Challenge 3 update===
 +
 +
Mark has run through the fake raw data set twice now. He showed [[Data Challenge 3|plots]] of the number of jobs in various stages of batch farm progress as a function of time. The fluctuations are large and depend mainly on what other users of the farm are doing.
 +
 +
He is currently working on converting the job submission process to use SWIF to see if that helps throughput.
 +
 +
===Commissioning Simulations===
 +
 +
Mark transmitted some thoughts from Sean, sent in an email.
 +
 +
* '''[[Spring 2015 Commissioning Simulations]]'''. The plan is to generate a few thousand files with the latest tagged sim-recon version only for run 9306.
 +
* '''Fall 2015 Commissioning Simulations'''. Sean thought we should start a discussion about how to set conditions for simulations for a possible Fall run. The Calibration Group has already decided to require a separate CCDB variation name.
 +
** Justin thought that we should have production underway by the time of the collaboration meeting so people can estimate yields for the run.
 +
** Given the uncertainty in the conditions we might have, we concluded that we will probably just have to pick some parameters and simulate those, say a 9 GeV coherent edge and a 1300 A magnet current.
 +
 +
===Some problems with f250 algorithms?===
 +
 +
Mike led us through [https://mailman.jlab.org/pipermail/halld-offline/2015-August/002138.html his recent email] describing problems related to FADC hits in the BCAL with negative pulse integrals. The problem occurs for Mode 7 data and can be understood by looking at Mode 8 data where the read-out threshold is greater than half the peak amplitude. The result is early times for pulses late in the window.
 +
 +
We noted that there are two approaches one could take for Mode 8 data: emulate the Mode 7 algorithm, bugs and all, or modify the algorithm to give rational results in the problematic cases. There is a group meeting on the emulation (see Review of Minutes above), and we thought that they would very likely be working on formulating an approach to the problem.
 +
 +
===Action Items===
 +
 +
* Convert DC3 to use SWIF (Mark)
 +
* Re-do Spring simulation with new release (Mark)
 +
* Implement a complete hit generation scheme for the CDC in Geant4 (Richard)

Latest revision as of 15:44, 24 February 2017

GlueX Offline Software Meeting
Wednesday, September 2, 2015
1:30 pm EDT
JLab: CEBAF Center F326/327

Agenda

  1. Announcements
    1. Location of nightly builds has changed.
    2. sim-recon 1.5.1 has been released.
    3. Now deleting CCDB SQLite databases older than one year.
    4. Work disk expansion: 14 to 15 TB.
    5. Mailman.jlab.org blocked from offsite.
    6. New note on GlueX builds has been released.
    7. Offline Monitoring (Kei)
  2. Review of minutes from August 19 (all)
  3. Collaboration Meeting October 8-10, 2015 at Jefferson Lab
  4. Data Challenge 3 update (Mark)
  5. Spring 2015 Commissioning Simulations (Mark for Sean)
    • The simulation plan to generate a few thousand files with the latest tagged sim-recon version only for run 9306.
  6. Fall 2015 Commissioning Simulations (all)
    • You might want to start a discussion about how to set conditions for simulations for a possible fall run (or generic next running conditions). We decided to require a separate CCDB variation name, but I don't think there was a decision on run numbers?
  7. Geant4 Update (Richard, David)
  8. Some problems with f250 algorithms?
  9. Action Item Review

Communication Information

Remote Connection

Slides

Talks can be deposited in the directory /group/halld/www/halldweb/html/talks/2015 on the JLab CUE. This directory is accessible from the web at https://halldweb.jlab.org/talks/2015/ .

Minutes

Present:

  • CMU: Curtis Meyer, Mike Staib
  • FIU: Mahmoud Kamel
  • FSU: Brad Cannon, Aristeidis Tsaris
  • JLab: Alex Barnes, Amber Boehnlein, Mark Dalton, Mark Ito (chair), Paul Mattione, Eric Pooser, Nathan Sparks, Justin Stevens, Simon Taylor, Beni Zihlmann
  • UConn: Richard Jones, James McIntyre

There is a recording of this meeting on the BlueJeans site.

Announcements

  1. The location of the nightly builds at JLab has changed.
    • Old location example:
      • /u/scratch/gluex/nightly/2015-09-02/sim-recon
    • New location example:
      • /u/scratch/gluex/nightly/2015-09-02/Linux_RHEL6-x86_64-gcc4.4.7/sim-recon

    See the nightly builds wiki page for more on the system.

  2. Sim-recon 1.5.1 has been released. The release notes are posted at GitHub.
  3. We are now deleting CCDB SQLite databases older than one year. Formerly we were never deleting the monthly files. See the wiki page describing the system for more information.
  4. After filling up last week, our work disk was expanded from 14 to 15 TB. Further expansion will require movement of the partition to a new file server.
  5. Mailman.jlab.org has been blocked from off-site for the past week or two due to cyber attacks. This restricts access to the mail archives and list administration pages, but does not affect the operation of the email lists. Access is not restricted from inside the JLab firewall.
  6. Mark drew our attention to his new note on GlueX builds.
  7. Kei gave a report on the last launch. See his slides for the details.
    • There were a lot of problems this time.
    • Kei continues to give feedback to Chris Larrieu on SWIF developments.
    • The memory footprint of the jobs nearly doubled.
    • The next launch is scheduled for September 12.
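
The change in nightly build location described in item 1 above can be sketched in a few lines of Python. This is only an illustration built from the two example paths quoted in these minutes; the function name `nightly_path` and its parameters are hypothetical and are not part of any GlueX tool.

```python
from datetime import date
from pathlib import PurePosixPath

# Root directory taken from the example paths in the minutes.
NIGHTLY_ROOT = PurePosixPath("/u/scratch/gluex/nightly")


def nightly_path(build_date, package="sim-recon", platform=None):
    """Return the nightly build directory for a given date.

    Hypothetical helper: platform=None reproduces the old layout,
    while a platform tag reproduces the new layout, where the tag
    is inserted between the date and the package name.
    """
    base = NIGHTLY_ROOT / build_date.isoformat()
    if platform is not None:
        base = base / platform
    return base / package


old = nightly_path(date(2015, 9, 2))
new = nightly_path(date(2015, 9, 2), platform="Linux_RHEL6-x86_64-gcc4.4.7")
print(old)
print(new)
```

Scripts that hard-code the old layout would need the platform tag added in the same position.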

Review of minutes from August 19

We went over the minutes.

Hall D Package Manager

Nathan made changes to support the use of a version.xml to create templates, as per feedback from the meeting.

Flash ADC Pulse Processing

David convened a meeting of the interested parties. We will ask for a report at a future meeting.

Who is responsible for deleting Git branches?

Our policy of having the branch creator delete branches after they have been merged is still in place, though compliance has been spotty.
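
As an aid to following this policy, a branch creator could list the remote branches that are already merged before deleting their own. The sketch below is not an existing GlueX script; it assumes the remote is named `origin` and the main branch is `master`, and the helper names `parse_merged` and `list_merged` are hypothetical.

```python
import subprocess

# Assumption: branches that should never be offered for deletion.
PROTECTED = {"master"}


def parse_merged(output, remote="origin", protected=PROTECTED):
    """Extract short branch names from `git branch -r --merged` output."""
    prefix = remote + "/"
    branches = []
    for line in output.splitlines():
        name = line.strip()
        # Skip blank lines and the symbolic "origin/HEAD -> origin/master" entry.
        if not name or "->" in name:
            continue
        # Ignore branches that belong to other remotes.
        if not name.startswith(prefix):
            continue
        short = name[len(prefix):]
        if short not in protected:
            branches.append(short)
    return branches


def list_merged(remote="origin", target="master"):
    """Run git in the current repository and return merged remote branches."""
    result = subprocess.run(
        ["git", "branch", "-r", "--merged", f"{remote}/{target}"],
        check=True, capture_output=True, text=True,
    )
    return parse_merged(result.stdout, remote)
```

Each branch reported by `list_merged()` could then be removed by its creator with `git push origin --delete <branch>`.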

Clang on the CUE

Nathan reported that Clang is already installed on the ifarm, but was compiled against GCC 4.8.3, whereas the default compiler there is GCC 4.4.7. That means that all packages might have to be compiled with the newer version. The system for supporting that for GlueX software has not been worked out.

Geant4 Update

  • Richard has now configured the geometry to have the same magnetic field in all layers. This makes it insensitive to the algorithm Geant4 uses for picking the field-setting geometry layer. He is deferring an evaluation of the CPU cost of keeping the field on in the calorimeter volumes to a later stage of development work.
  • He looked at how the current implementation performs with multi-threading turned on in Geant4. Initially the program was crashing. He traced the problem to a bug in the way geometry information is shared by threads. David had previously seen this and a fix was attempted by the Geant4 team. Unfortunately, the fix leads to other serious problems. Richard has put together his own patch in user code so he can keep going. He has also communicated his ideas for a more fundamental fix to the Geant4 folks.
  • With multi-threading working properly, he sees perfect scaling of performance on an 8-core machine, up to 6 threads (as high as he tried).
  • The next step is to put together an end-to-end scheme for hit generation for a single detector system. That way others who might want to contribute will have a complete example to follow. He is planning on doing this with the CDC.

Collaboration Meeting Agenda

We discussed the agenda for the upcoming Collaboration Meeting, October 8-10, 2015 at Jefferson Lab. Possible talks and speakers are:

  • Overview talk, including Data Challenge 3 report (Mark)
  • Geant4 Development Update (Richard)
  • Version Management System (Mark)
  • Offline Monitoring and SWIF (Kei)
  • Calibration Overview (Sean Dobbs)

Data Challenge 3 update

Mark has run through the fake raw data set twice now. He showed plots of the number of jobs in various stages of batch farm progress as a function of time. The fluctuations are large and depend mainly on what other users of the farm are doing.

He is currently working on converting the job submission process to use SWIF to see if that helps throughput.

Commissioning Simulations

Mark transmitted some thoughts from Sean, sent in an email.

  • Spring 2015 Commissioning Simulations. The plan is to generate a few thousand files with the latest tagged sim-recon version only for run 9306.
  • Fall 2015 Commissioning Simulations. Sean thought we should start a discussion about how to set conditions for simulations for a possible Fall run. The Calibration Group has already decided to require a separate CCDB variation name.
    • Justin thought that we should have production underway by the time of the collaboration meeting so people can estimate yields for the run.
    • Given the uncertainty in the conditions we might have, we concluded that we will probably just have to pick some parameters and simulate those, say a 9 GeV coherent edge and a 1300 A magnet current.

Some problems with f250 algorithms?

Mike led us through his recent email describing problems related to FADC hits in the BCAL with negative pulse integrals. The problem occurs for Mode 7 data and can be understood by looking at Mode 8 data where the read-out threshold is greater than half the peak amplitude. The result is early times for pulses late in the window.

We noted that there are two approaches one could take for Mode 8 data: emulate the Mode 7 algorithm, bugs and all, or modify the algorithm to give rational results in the problematic cases. There is a group meeting on the emulation (see Review of Minutes above), and we thought that they would very likely be working on formulating an approach to the problem.

Action Items

  • Convert DC3 to use SWIF (Mark)
  • Re-do Spring simulation with new release (Mark)
  • Implement a complete hit generation scheme for the CDC in Geant4 (Richard)