Difference between revisions of "GlueX Data Challenge Meeting, December 17, 2012"

From GlueXWiki

Revision as of 20:14, 17 December 2012

GlueX Data Challenge Meeting
Monday, December 17, 2012
1:30 pm, EST
JLab: CEBAF Center, F326/327

Agenda

  1. Announcements
  2. Minutes from last time
  3. Data Challenge 1 status
    1. JLab
    2. Grid status
    3. CMU status
  4. Shutdown plan (or continuation plan?)
  5. Work list for post DC-1 period
    1. file archiving
    2. file distribution
    3.  ???
  6. Thoughts on DC-2
    1. What?
    2. How much?
    3. When?

Meeting Connections

To connect from the outside:

Videoconferencing

  1. ESNET:
    • Call ESNET Number 8542553 (this is the preferred connection method).
  2. EVO:
    • A conference has been booked under "GlueX" from 1:00pm until 3:30pm (EST).
    • Direct meeting link
    • To phone into an EVO meeting, from the U.S. call (626) 395-2112 and then enter the EVO meeting code, 13 9993
    • Skype Bridge to EVO

Telephone

  1. Phone: (should not be needed)
    • +1-866-740-1260 : US and Canada
    • +1-303-248-0285 : International
    • then use participant code: 3421244# (the # is needed when using the phone)
    • or www.readytalk.com
      • then type access code 3421244 into "join a meeting" (you need java plugin)

Minutes

Present:

  • CMU: Paul Mattione
  • JLab: Mark Ito (chair), David Lawrence, Yi Qiang, Dmitry Romanov, Elton Smith, Simon Taylor, Beni Zihlmann
  • UConn: Richard Jones

Data Challenge 1 status

Production started at the three sites Wednesday, December 5, as planned.

We updated progress at the various sites:

  • JLab: 678 million events
  • Grid: 3.4 billion events
  • CMU: 270 million events
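As a quick cross-check of the numbers above, the following sketch totals the three reported counts and works out each site's share. The dictionary and the `share` helper are names introduced here for illustration only; they are not part of any GlueX tool.

```python
# Event counts as reported in the minutes above.
events = {
    "JLab": 678_000_000,
    "Grid": 3_400_000_000,
    "CMU": 270_000_000,
}

total = sum(events.values())


def share(site):
    """Return the percentage of all events produced at the given site."""
    return 100.0 * events[site] / total


for site, n in events.items():
    print(f"{site}: {n / 1e9:.3f} billion ({share(site):.1f}%)")
print(f"Total: {total / 1e9:.3f} billion")
```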

3.4 billion events on the grid. Some time was spent correcting problems; spared hazards with crashes.

Hangs occur in mcsmear. To reproduce a hang: take the seeds and re-run; on the second try the files look identical. Cause of the hangs: a deadlock due to exceeding the 30-second time-out, which holds a mutex lock.
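The hang described above involves a mutex that stays held after a time-out. The sketch below is a generic illustration of one way to avoid that pattern, not the mcsmear or JANA code: it bounds the wait for a lock and guarantees the lock is released. The names `output_lock` and `write_with_timeout` are hypothetical, and the 30-second default simply mirrors the figure in the notes.

```python
import threading

# Hypothetical lock guarding a shared output; stands in for the mutex in the notes.
output_lock = threading.Lock()


def write_with_timeout(record, sink, timeout_s=30.0):
    """Append record to sink under the lock.

    Returns False if the lock could not be acquired within timeout_s,
    so the caller can report the problem instead of blocking forever.
    """
    if not output_lock.acquire(timeout=timeout_s):
        return False
    try:
        sink.append(record)
        return True
    finally:
        # Released on every path, including exceptions, so a failure
        # inside the critical section cannot leave the lock held.
        output_lock.release()
```

A caller that gets `False` back knows the lock was contended for longer than the limit and can log the seeds for a later re-run.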

24-hour jobs: partial file, no files.

Jobs finished quickly; 2-3% crashing. Resubmit on failure: multiple submission, up to 30; changed to allow failed jobs to fail.
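The resubmission policy above can be sketched as a bounded retry loop. This is a minimal illustration assuming a job is a callable that raises an exception on failure; `run_with_retries` is a hypothetical helper, not the grid submission system's interface.

```python
def run_with_retries(job, max_attempts=30):
    """Call job() until it succeeds or max_attempts is reached.

    Returns a (succeeded, attempts_used) tuple.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            job()
            return True, attempt
        except Exception:
            # Treat any exception as a failed attempt and try again.
            continue
    return False, max_attempts
```

With `max_attempts=30` this mirrors the original limit of up to 30 submissions; `max_attempts=1` corresponds to letting failed jobs fail.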

The submission node crashed and was replaced with a bigger-memory machine. Running jobs peak out at 7k at once. The other host, the user scheduler, maintains a daemon for each job and needed more memory. The srm that receives the results coming back has 20 TB of disk and is robust at 100 MB, which fills the GB pipe.
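The transfer figures above can be put in perspective with a little arithmetic. The notes do not give units beyond "100 MB" and "GB pipe", so this sketch assumes 100 MB/s of incoming data and a 1 Gb/s network link; both are assumptions, as are the variable names.

```python
MB = 10**6
TB = 10**12

rate_bytes_per_s = 100 * MB   # assumption: "100 MB" means 100 MB/s
link_bits_per_s = 10**9       # assumption: "GB pipe" means a 1 Gb/s link
disk_bytes = 20 * TB          # disk size quoted in the notes

# Fraction of the link consumed by the incoming data.
link_fraction = rate_bytes_per_s * 8 / link_bits_per_s

# Time to fill the disk if that rate were sustained.
hours_to_fill = disk_bytes / rate_bytes_per_s / 3600

print(f"Link utilization: {link_fraction:.0%}")
print(f"Hours to fill 20 TB: {hours_to_fill:.1f}")
```

Under these assumptions the stream takes most of the link, and a sustained transfer would fill the disk in a little over two days.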

100 million events, and go back to debug the code.

10% being used right now; only one person.

Archive all files to the JLab tape library: logs, histos, rest.

Distribution: ship all rest files to UConn, access via srm; have all files spinning at JLab.

SURA grid

skims

srm plug-in

grid certificate, collaboration-wide archive

seg faults in hdgeant; jana hangs; relaunch with random seed


--end of note--