Difference between revisions of "MIT/FutureGrid Data Challenge 2 Production"

From GlueXWiki
 
(2 intermediate revisions by the same user not shown)
Line 33:
  * Back to running smoothly after maintenance last week
  * One FutureGrid site requested we slow production so other users could have some more cycles
- * Slides from all hands meeting [https://indico.fnal.gov/getFile.py/access?contribId=13&sessionId=7&resId=0&materialId=slides&confId=7207 Jan Balewski @ OSG]
+ * Slides on cloud development from all hands meeting [https://indico.fnal.gov/getFile.py/access?contribId=13&sessionId=7&resId=0&materialId=slides&confId=7207 Jan Balewski @ OSG]
+ * Data from all three run numbers (9001-9003) now available on Northwestern's SRM [https://mailman.jlab.org/pipermail/halld-offline/2014-April/001638.html Sean's e-mail]
  [[File:MIT DC2 4.11.14.png]]

Latest revision as of 11:02, 11 April 2014

Resources

  • MIT Reuse Cluster currently provides 17 blades x 8 cores = 136 cores.
  • FutureGrid project currently providing us with ~200 cores at various sites:
    • University of Chicago
    • Indiana University
    • UC San Diego
    • University of Texas - Austin
  • VMs are launched with OpenStack-based tools developed with the FutureGrid project, as a way of exploring the distributed cloud computing model.

Monitoring

Update 3/28/14

  • Running smoothly for the last week with 300+ cores

MIT DC2 3.28.14.png

  • Only running jobs for 9001 thus far:
    • ~5K jobs complete with 25K events each -> ~125M events produced in 1 week
  • May get access to ~100 more cores on FutureGrid
    • Will run jobs for 9002 and 9003 on those and/or adjust some of the nodes currently in use
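
As a quick check of the production figures quoted in the 3/28/14 update, the arithmetic can be sketched as below. The job count, events per job, and one-week window come from the update; using 300 cores as a lower bound for "300+" and the derived per-day and per-core rates are assumptions added here for illustration, not numbers reported by the production team.

```python
# Figures quoted in the 3/28/14 update (approximate, as reported).
jobs_completed = 5_000     # "~5K jobs complete"
events_per_job = 25_000    # "25K events each"
days = 7                   # "in 1 week"

# Assumption: "300+ cores" is taken as exactly 300, a lower bound.
cores = 300

# Total events produced for run 9001 in the first week.
total_events = jobs_completed * events_per_job

# Derived rates; illustrative only, not stated in the update.
events_per_day = total_events / days
events_per_core_day = events_per_day / cores

print(f"total events:        {total_events:,}")
print(f"events per day:      {events_per_day:,.0f}")
print(f"events per core-day: {events_per_core_day:,.0f}")
```

The total reproduces the ~125M events quoted above. Because the true core count was above 300, the per-core rate is, if anything, an overestimate.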

Update 4/4/14

  • Some (monthly) maintenance on FutureGrid sites this past week slowed us down a bit.
  • We're up to a total of 344 cores now, running both 9001 and 9002.
  • Possibly ~100 more cores next week. We will run 9003 on these, or switch some of the current VMs over to it.

MIT DC2 4.4.14.png

Update 4/11/14

  • Back to running smoothly after maintenance last week
  • One FutureGrid site requested we slow production so other users could have some more cycles
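  • Slides on cloud development from the all-hands meeting: Jan Balewski @ OSG (https://indico.fnal.gov/getFile.py/access?contribId=13&sessionId=7&resId=0&materialId=slides&confId=7207)
  • Data from all three run numbers (9001-9003) is now available on Northwestern's SRM; see Sean's e-mail (https://mailman.jlab.org/pipermail/halld-offline/2014-April/001638.html)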

MIT DC2 4.11.14.png