Raid-to-Silo Transfer Strategy

Below is a proposal for a raid-to-silo transfer strategy for moving Hall D data files from our local RAID server to the JLab tape storage facility. We will update this as our ideas develop.

Elliott Wolin

Dave Lawrence

24-Oct-2013

Notes

We will use the jmirror facility from the Computer Center to transfer the files.
jmirror deletes the link to the file when the transfer is complete. It does not delete directories, only files.
jmirror is fairly smart and reliable. It deletes the hard link only after the file has been safely transferred.
jmirror is run periodically via a cron job; it is not a transfer server system. Each time it runs, it transfers the files it finds.
jmirror will not transfer files actively being written to, nor transfer files twice if invoked twice.
Additional hard links to the data file are untouched by jmirror. These can be used to keep the file on disk after transfer.
If files are kept, they must be deleted in time to make room for new DAQ files. This will require a cleanup strategy and cron scripts to implement it.
The DAQ creates a 10 GB file every 30 seconds, about 1.2 TB/hour. Thus a two-hour run generates about 2.4 TB.
It is preferable to transfer files as they are ready for transfer, and not wait for the run to end before initiating transfer.
The simplest way to implement immediate transfer is for run control to run a script every time the ER closes a file.
Vardan and Carl are working out a simple scheme to allow users to specify such a script and have it run when a file is closed.
Mark I prefers to store files by "run period" with a simple naming scheme (RunPeriod001, RunPeriod002 or similar).
Run periods are just date ranges. Run numbers will NOT be reused, i.e. all run numbers are unique across all run periods.
Due to constraints in the mss, a second level of directories is needed. Mark and I propose simply organizing files by run, e.g. something like Run000001, Run000002, etc.
Run files will have the run number in their names, e.g. something like: Run000001.evio.001, Run000001.evio.002, etc.
A two-hour run will generate around 250 files.
RAID disk partitions do not seem to be needed (see below); they can be implemented later if necessary.
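
As a rough illustration of how the notes above fit together, the sketch below shows what a script run on file close might do: build the RunPeriod/Run directory layout, place the closed file in a staging tree for jmirror to pick up, and add a second hard link so the data stays on disk after jmirror removes its own link. The function names, the staging and retention roots, and the way the run period, run number and file index are passed in are all assumptions made for the example, not part of the actual run control or jmirror interface. Hard links and renames only work within a single filesystem, so both trees are assumed to live on the same RAID volume.

```python
import os
from pathlib import Path
from typing import Tuple


def run_file_path(root: Path, run_period: int, run: int, file_index: int) -> Path:
    """Return <root>/RunPeriodNNN/RunNNNNNN/RunNNNNNN.evio.NNN (hypothetical layout)."""
    run_name = f"Run{run:06d}"
    return (root / f"RunPeriod{run_period:03d}" / run_name
            / f"{run_name}.evio.{file_index:03d}")


def on_file_closed(closed_file: Path, staging_root: Path, keep_root: Path,
                   run_period: int, run: int, file_index: int) -> Tuple[Path, Path]:
    """Move a closed file into the staging tree and add a retention hard link.

    staging_root is assumed to be the tree jmirror scans; keep_root is an
    assumed separate tree holding the extra links that keep files on disk.
    """
    staged = run_file_path(staging_root, run_period, run, file_index)
    kept = run_file_path(keep_root, run_period, run, file_index)
    staged.parent.mkdir(parents=True, exist_ok=True)
    kept.parent.mkdir(parents=True, exist_ok=True)
    os.replace(closed_file, staged)  # rename; must stay on the same filesystem
    os.link(staged, kept)            # second name for the same data blocks
    return staged, kept
```

Once jmirror has transferred the file and removed the staged name, the data would still be reachable through the retention link until a cleanup script removes it.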

Notes for Dec 2013 Online Data Challenge

We plan to use a basic automated file transfer mechanism in December that deletes files on transfer. If someone has the time, we'll try just-in-time deletion.
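
One possible shape for just-in-time deletion is sketched below: a cron-driven cleanup that removes the oldest retained files only when free space drops below a threshold. This is a minimal sketch under stated assumptions, not a worked-out design. The retention directory, the threshold, and the function names are hypothetical. It also assumes each retained file has exactly two names (jmirror's link and the retention link), so a link count of 1 is taken to mean jmirror has already transferred the file and removed its own link.

```python
import os
import shutil
from pathlib import Path
from typing import Callable, List


def free_bytes(path: Path) -> int:
    """Free space, in bytes, on the filesystem holding path."""
    return shutil.disk_usage(path).free


def cleanup(keep_root: Path, min_free_bytes: int,
            get_free: Callable[[Path], int] = free_bytes) -> List[Path]:
    """Delete the oldest retained files until at least min_free_bytes is free."""
    files = sorted((p for p in keep_root.rglob("*") if p.is_file()),
                   key=lambda p: p.stat().st_mtime)
    removed = []
    for f in files:
        if get_free(keep_root) >= min_free_bytes:
            break  # enough room for new DAQ files; keep the rest on disk
        # Assumption: a link count of 1 means jmirror already removed its link,
        # i.e. the file is safely on tape. Files with more links are skipped.
        if f.stat().st_nlink == 1:
            f.unlink()
            removed.append(f)
    return removed
```

The free-space check is passed in as a parameter so the logic can be exercised without filling a disk; a cron job would simply call `cleanup` with the default.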

Proposal
