Transition from SWIF to SWIF2

Upcoming Retirement of Auger and SWIF version 1

Beginning on March 1st, 2022, the Auger and SWIF version 1 commands will no longer be available. They have been superseded by the Slurm workload manager and by SWIF version 2 (swif2):

  • Slurm includes GPU and MPI support, features absent in Auger. Detailed documentation about Slurm at JLab is available.
  • SWIF2 is built on top of Slurm and supports running workflows with file staging, both locally and at remote sites. See the SWIF2 manual for details.

For more information on the upcoming change and how it affects job submission, please see the Migration Guide.


Schedule

  • On Tuesday, February 8th, jobs submitted using Auger or SWIF1 will have reduced scheduling capacity to encourage transition to Slurm and SWIF2.
  • On Tuesday, March 1st, the deprecated commands will be removed, Auger and SWIF1 services will be shut down, and documentation will be removed. The swif command will become an alias for swif2.


Necessary modifications

SWIF1 and SWIF2 have essentially the same command line interface, so only minor changes to the launch scripts are necessary. The options --project and --track in swif1 --add-job are replaced by --account and --partition in SWIF2. The valid accounts are listed at Slurm Accounts, and partition information is available on the JLab Slurm Information Page. To ensure backwards compatibility, we keep the PROJECT and TRACK keys in the config files; their values are now used as the account and the partition, respectively.
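For hand-written submission scripts that still use the old option names, the renaming can be done mechanically. The following is a minimal sketch, not part of the official tooling; the function name migrate_swif_options is hypothetical, and it assumes the options are spelled exactly as --project and --track and are followed by a space or an equals sign.

```shell
#!/usr/bin/env bash
# Rewrite SWIF1-style option names to their SWIF2 equivalents.
# Reads text on stdin and writes the converted text to stdout.
# Assumption: the options are spelled exactly --project and --track
# and are followed by a space or '='.
migrate_swif_options() {
    sed -e 's/--project\([ =]\)/--account\1/g' \
        -e 's/--track\([ =]\)/--partition\1/g'
}

# Example: convert an illustrative option string.
echo "--project halld --track production" | migrate_swif_options
```

To convert a whole script, pipe it through the function and review the result before replacing the original file.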

We recommend the following settings:

PROJECT    halld
TRACK      production
OS         general

N.B.: There is no longer a gluex project/account.
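An existing config file can be checked against the recommended settings before jobs are registered. This is a sketch under stated assumptions: the helper names config_value and check_config are hypothetical, and the config file is assumed to hold one whitespace-separated "KEY value" pair per line, as in the settings shown above.

```shell
#!/usr/bin/env bash
# Print the value of a key from a whitespace-separated "KEY value" file.
# Assumption: one setting per line; only the first match is used.
config_value() {
    local key="$1" file="$2"
    awk -v key="$key" '$1 == key { print $2; exit }' "$file"
}

# Compare the three recommended settings against a config file.
# Prints one line per mismatch and returns non-zero if any is found.
check_config() {
    local file="$1" status=0 key want have
    while read -r key want; do
        have="$(config_value "$key" "$file")"
        if [ "$have" != "$want" ]; then
            echo "$key: found '${have}', recommended '${want}'"
            status=1
        fi
    done <<'EOF'
PROJECT halld
TRACK production
OS general
EOF
    return "$status"
}
```

Running check_config jobs.config prints nothing when all three keys match; a file that still uses the retired gluex project would be reported on the PROJECT line.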

Step-by-step guide

1. Update your launch.py from SVN or GitHub.

2. Create the workflow. The name has to match the workflow name given in the jobs.config file.

swif2 create -workflow YOUR_WORKFLOW

3. Register jobs

./launch.py jobs.config MIN_RUN_NUMBER MAX_RUN_NUMBER

Optionally, register jobs for only a limited number of files (e.g. the first 5):

./launch.py jobs.config MIN_RUN_NUMBER MAX_RUN_NUMBER -f '00[0-4]'
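To see which file numbers a pattern such as '00[0-4]' selects, it can be tried out on its own first. The sketch below is illustrative only: the function name matching_files is hypothetical, and it assumes that file numbers are zero-padded to three digits and that the pattern has to match the whole number. How launch.py itself evaluates the pattern is not shown here.

```shell
#!/usr/bin/env bash
# List which three-digit file numbers (000-009) a pattern selects.
# Assumptions: file numbers are zero-padded to three digits and the
# pattern must match the whole number.
matching_files() {
    local pattern="$1" n num
    for n in $(seq 0 9); do
        num="$(printf '%03d' "$n")"
        # grep -x only succeeds if the entire line matches the pattern
        if printf '%s\n' "$num" | grep -qx "$pattern"; then
            printf '%s\n' "$num"
        fi
    done
}

matching_files '00[0-4]'
```

For '00[0-4]' this lists 000 through 004, i.e. the first five files.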

4. Run workflow

swif2 run YOUR_WORKFLOW

5. Check status (ordered by level of detail)

swif2 list
swif2 status -workflow YOUR_WORKFLOW
swif2 status -jobs -workflow YOUR_WORKFLOW

6. Remove workflow

swif2 cancel -delete -workflow YOUR_WORKFLOW

7. The default shell environment is bash.

To use a different shell, add the -shell option to add-job:

add-job ... -shell /bin/tcsh
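Steps 2 to 5 can be chained in a small wrapper script. The following is a sketch rather than an official tool: the names run_cmd and submit_workflow and the DRY_RUN variable are hypothetical, and the swif2 and launch.py invocations are copied from the steps above. With DRY_RUN=1 (the default in this sketch) the commands are only printed, so the sequence can be reviewed before anything is submitted.

```shell
#!/usr/bin/env bash
# Chain steps 2-5 of the guide.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run_cmd() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

submit_workflow() {
    local workflow="$1" config="$2" min_run="$3" max_run="$4"
    # Stop at the first failing step so later steps do not run on errors.
    run_cmd swif2 create -workflow "$workflow" || return 1
    run_cmd ./launch.py "$config" "$min_run" "$max_run" || return 1
    run_cmd swif2 run "$workflow" || return 1
    run_cmd swif2 status -workflow "$workflow"
}

# Placeholder arguments; replace them with real values before use.
submit_workflow YOUR_WORKFLOW jobs.config MIN_RUN_NUMBER MAX_RUN_NUMBER
```

Setting DRY_RUN=0 in the environment executes the commands instead of printing them.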