Difference between revisions of "HOWTO archive a large directory of files to the tape library"

From GlueXWiki

Revision as of 12:08, 13 August 2019

Backups of disk directories at JLab can be created with a script in the hd_utilities repository,

 $HD_UTILITIES_HOME/tar_multi/disk_to_tape_backup.sh

It takes as input positional arguments as follows:

source_dir=$1 # directory to be archived, with full path
tar_multi=$2 # script to guide tar multi-volume archive creation and extraction, with full path
size=$3 # maximum size of each tar file volume (suffix: G, M, or k)

and creates a multi-volume tar archive in a new directory on the write-through cache disk. Recall that large files on the write-through cache are automatically archived to tape. The new directory is

 /cache/halld/home/backups/$source_dir

where $source_dir is the full path to the directory that was used as the first argument to the script (see above). In addition to the multi-volume tar archive, three other files are created in this directory:

  1. $tar_multi: the script used to guide tar (basename only)
  2. README: instructions for how to extract the tar archive
  3. MANIFEST: a listing of the archive files and a list of the files within each archive file
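
The layout described above can be checked after a run. The following is a minimal sketch and is not part of hd_utilities: the function names backup_dir_for and check_backup_dir and the BACKUP_ROOT variable are hypothetical, and the /cache/halld/home/backups prefix is taken from the example output on this page.

```shell
# Hypothetical helpers; not part of hd_utilities.
# Assumption: backups land under this prefix, as in the example on this page.
BACKUP_ROOT=${BACKUP_ROOT:-/cache/halld/home/backups}

# Print the backup directory for a source directory given with its full path.
backup_dir_for() {
    # Strip the leading slash so the result has no doubled "//".
    printf '%s/%s\n' "$BACKUP_ROOT" "${1#/}"
}

# Report which of the three extra files are missing from a backup directory.
# $1 = backup directory, $2 = tar_multi script as passed to the backup script
# Returns 0 only if the script copy, README, and MANIFEST are all present.
check_backup_dir() {
    dir=$1
    rc=0
    for f in "$(basename "$2")" README MANIFEST; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            rc=1
        fi
    done
    return $rc
}
```

For the example on this page, one might run check_backup_dir "$(backup_dir_for /work/halld/home/mpatsyuk/dirc/TImap1)" $HD_UTILITIES_HOME/tar_multi/tar_multi_3.sh once the backup script has finished.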

For example, the command:

$HD_UTILITIES_HOME/tar_multi/disk_to_tape_backup.sh /work/halld/home/mpatsyuk/dirc/TImap1 $HD_UTILITIES_HOME/tar_multi/tar_multi_3.sh 20G

results in the directory

/cache/halld/home/backups/work/halld/home/mpatsyuk/dirc/TImap1

In that directory the README says:

Tue May 21 11:19:36 EDT 2019
To restore files:
tar xvf /cache/halld/home/backups/work/halld/home/mpatsyuk/dirc/TImap1/TImap1.tar -F /cache/halld/home/backups/work/halld/home/mpatsyuk/dirc/TImap1/tar_multi_3.sh --multi-volume
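
The paths inside the archive are relative (they begin with TImap1/, as the MANIFEST listing on this page shows), so the README command extracts into the current working directory. The sketch below wraps that command so that it runs from a chosen destination directory. It is an illustration only: restore_backup and the DRY_RUN variable are hypothetical names, and the archive is assumed to be named after the basename of the backup directory plus .tar, as in the example.

```shell
# Hypothetical wrapper around the README command; not part of hd_utilities.
# $1 = backup directory (full path), $2 = name of the tar_multi script in it,
# $3 = destination directory to extract into
restore_backup() {
    backup_dir=$1
    script=$2
    dest=$3
    # Assumption: the first volume is named <basename of backup dir>.tar
    archive="$backup_dir/$(basename "$backup_dir").tar"
    if [ -n "${DRY_RUN:-}" ]; then
        # Only print the command that would be run.
        echo "cd $dest && tar xvf $archive -F $backup_dir/$script --multi-volume"
        return 0
    fi
    # Run in a subshell so the caller's working directory is unchanged.
    mkdir -p "$dest" &&
        ( cd "$dest" && tar xvf "$archive" -F "$backup_dir/$script" --multi-volume )
}
```

Setting DRY_RUN=1 before calling the function prints the command instead of running it, which is a cheap way to confirm the paths before extracting a large archive.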

The MANIFEST says:

Tue May 21 11:19:36 EDT 2019
/cache/halld/home/backups/work/halld/home/mpatsyuk/dirc/TImap1
total 4192164616
-rw-rw-r-- 1 gluex halld-2          92 May 21 11:19 MANIFEST
-rw-rw-r-- 1 gluex halld-2         225 May 21 11:19 README
-rwxrwxr-x 1 gluex halld-2         636 May 20 14:19 tar_multi_3.sh
-rw-rw-r-- 1 gluex halld-2 21474836480 May 20 14:25 TImap1.tar
-rw-rw-r-- 1 gluex halld-2 21474836480 May 20 15:23 TImap1.tar:10
-rw-rw-r-- 1 gluex halld-2 21474836480 May 21 01:10 TImap1.tar:100
-rw-rw-r-- 1 gluex halld-2 21474836480 May 21 01:16 TImap1.tar:101
...
-rw-rw-r-- 1 gluex halld-2 21474836480 May 21 00:59 TImap1.tar:98
-rw-rw-r-- 1 gluex halld-2 21474836480 May 21 01:04 TImap1.tar:99
tar file contents:
drwxr-sr-x mpatsyuk/halld-2  0 2018-10-16 17:34 TImap1/
-rw-r--r-- mpatsyuk/halld-2 6943 2018-10-09 22:46 TImap1/pdf_x-69.0_y-53.0_th6.25786_phi-146.76.root
-rw-r--r-- mpatsyuk/halld-2 2119303361 2018-10-10 02:27 TImap1/kapi_x-61.0_y-45.0_th5.04162_phi-149.508.root
-rw-r--r-- mpatsyuk/halld-2       6935 2018-10-10 21:09 TImap1/pdf_x-25.0_y57.0_th8.41465_phi92.678.root
...
-rw-r--r-- mpatsyuk/halld-2 1719255521 2018-10-09 12:34 TImap1/kapi_x-93.0_y-95.0_th10.9009_phi-134.664.root
Preparing volume 2 of /cache/halld/home/backups/work/halld/home/mpatsyuk/dirc/TImap1/TImap1.tar.
-rw-r--r-- mpatsyuk/halld-2       6939 2018-10-09 13:32 TImap1/pdf_x-93.0_y55.0_th10.9527_phi132.984.root
...

The tar archive itself is the set of *.tar* files listed in the MANIFEST and stored in the backup directory.
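
Note that the MANIFEST listing is sorted lexically, so TImap1.tar:10 and TImap1.tar:100 appear before TImap1.tar:98. To view the volumes in numeric order, a small helper along the following lines may be useful. This is a sketch with a hypothetical function name, list_volumes, and it assumes, as the listing suggests, that the first volume is the bare .tar file and that later volumes carry a :N suffix.

```shell
# Hypothetical helper; not part of hd_utilities.
# Print the volume file names in a backup directory in numeric volume order.
# $1 = backup directory
list_volumes() {
    dir=$1
    for f in "$dir"/*.tar "$dir"/*.tar:*; do
        # Skip patterns that matched nothing.
        if [ -f "$f" ]; then
            basename "$f"
        fi
    done | sort -t: -k2,2n   # the bare .tar has an empty key, so it sorts first
}
```

Piping the result through wc -l gives a quick count of volumes to compare against the MANIFEST.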