OWG Meeting 27-Jul-2016

= Location and Time =

'''Room:''' '''<font size="+1">CC F326</font>'''

'''Time:''' 2:00pm-3:00pm

= Connection =

<div class="mw-collapsible mw-collapsed">
You can connect using BlueJeans Video conferencing (ID: 120 390 084).    ''(Click "Expand" to the right for details -->):''
<div class="mw-collapsible-content">
(if problems, call phone in conference room: 757-269-6460)

# To join via Polycom room system go to the IP Address: 199.48.152.152 (bjn.vc) and enter the meeting ID: 120390084.
# To join via a Web Browser, go to the page https://bluejeans.com/120390084.
# To join via phone, use one of the following numbers and the Conference ID: 120390084
#* US or Canada: +1 408 740 7256 or
#* US or Canada: +1 888 240 2560
# More information on connecting to bluejeans is available.
</div>
</div>

= Previous Meeting =

= Agenda =

# '''Announcements'''
#* 36-port IB switch (56 Gb/s) w/ cables : "Delivered"
#* gluon43 repaired (motherboard/cpu replaced)
#* RAID disk ordered (8TB disks instead of 6TB -> 184TB usable space)
# '''DAQ'''
#* fADC125 Testing
#* High Luminosity Data Rates ([[Media:20160722_L3status.pdf|L3 min-review slides]])
# '''Front-end Firmware Status'''
# '''L3 Status''' ([[Level-3 Trigger Meetings|meetings]])
# '''AOT'''

== Background Info. for RAID Discussion ==
 
<div class="mw-collapsible mw-collapsed">
* Existing gluon RAID specs    ''(Click "Expand" to the right for details -->):''
<div class="mw-collapsible-content">

Here is the description of our existing RAID servers from the PO attached to PR #334477:

 ITEM 005B:
 DATA STORAGE NODES AS PER THE TABLE IN THE STATEMENT OF WORK ENTITLED "JEFFERSON LAB PHYSICS SERVERS 2013" DATED JULY 24, 2013
   SANDY BRIDGE E5-2630 CPU
   30*3TB SATA ENTERPRISE
   4*500 GB SATA ENTERPRISE
   2*QDR CONNECT X3 (ONE MAY BE LANDED ON THE MOTHERBOARD, OR MAY USE DUAL PORT CARD)
   LSI RAID 9271-8I RAID CARD (OR BETTER) WITH BATTERY BACKUP.
</div>
</div>

<div class="mw-collapsible mw-collapsed">
* Chip's e-mail describing options.    ''(Click "Expand" to the right for details -->):''
<div class="mw-collapsible-content">
 
David et al,

One thing we are doing new with file servers is configuring them as fault-tolerant pairs. For the last year we've been buying two computers plus two 44-disk chassis (front and back disks, so a total of 4 back planes), and putting 2 disk controllers in each file server (2 cables per controller, so 4 cables per file server). Each server connects to each back plane, and if a server dies, the other can control all of each back plane. The pair can easily move 1 GB/s in and 1 GB/s out concurrently, and if a node dies, the sole survivor can manage 800 MB/s in and out. We have been using 8 TB disks for a year now; smaller disks are cheaper, and performance scales with the number of spindles. Everything is 12 Gbps SAS3.

If you want to scale down cost and capacity but keep fault tolerance and performance, here is a small version of the above:

* two file servers, each with one controller
* two 44-disk chassis, each with 30 4-TB disks

This yields high bandwidth: 240 TB raw, 192 TB before the file system, ~170 TB with the file system, and 135 TB at about 80% full. In a pinch you could run 90% full, so this gives 150 TB on top of your current 50.

When you retire your two old servers, you could buy another 20 4-TB drives and add them to the two disk chassis. If you want more future growth potential, use 6 TB drives instead of 4 TB drives. If you want more bandwidth, start with 40+40 disks. Maybe plan to buy a matched pair every 2-3 years for both capacity and performance growth. Don't run anything longer than 5 years.

The pair loaded with 8 TB drives will be around $50K (direct), so under-populating the chassis and using smaller disks will make it a good bit cheaper. For our pair, we get 380 TB at 80% full (to prevent fragmentation), $130 per usable TB.

Shoestring version: buy half a pair with 40*6 TB drives => 135 TB at lower performance, perhaps $20K. Add the mate a year later. They do still make the 36-disk all-in-one (no active-active pairing) for a little bit less per TB, but I think going for active-active would be best.

Chip
</div>
</div>
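
The capacity chain in Chip's e-mail (240 TB raw -> 192 TB after RAID -> ~170 TB after the file system -> 135 TB at 80% full) packs a few conversion factors into one sentence. Here is a minimal Python sketch that reproduces the arithmetic; the 20% RAID overhead and 12% file-system overhead are assumptions back-solved from those numbers, not specifications from the PO:

 # Back-of-envelope check of the capacity and cost figures quoted above.
 # ASSUMPTION: 20% RAID (parity/spare) overhead and 12% file-system overhead,
 # chosen only to reproduce the 240 -> 192 -> ~170 -> 135 TB chain.
 def usable_tb(n_disks, disk_tb, raid_overhead=0.20, fs_overhead=0.12, fill=0.80):
     raw = n_disks * disk_tb                    # total spindle capacity
     after_raid = raw * (1 - raid_overhead)     # left after parity/spares
     after_fs = after_raid * (1 - fs_overhead)  # left after file-system overhead
     return raw, after_raid, after_fs, after_fs * fill
 
 # "Small version": two chassis, each holding 30 4-TB disks
 print(usable_tb(2 * 30, 4))   # -> roughly (240, 192.0, 168.96, 135.17) TB
 
 # Cost per usable TB for the fully loaded 8-TB pair: ~$50K for 380 usable TB
 print(50000 / 380)            # ~131.6 -> "$130 per usable TB"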

''Recharge Wednesday: hot caramel sundae''

= Minutes =

''TBD''
