Difference between revisions of "Policies for Using Online Directories and Accounts"

From GlueXWiki
Revision as of 09:34, 26 November 2013

Draft - for discussion

This note is intended to jump-start a discussion on how to organize online directories and how to use online accounts. Comments, suggestions and criticisms are welcome. Our hope is to come to a general consensus soon, and we expect developers to follow the guidelines shortly thereafter.


Elliott Wolin, Dave Lawrence, Mark Dalton
26-Nov-2013


Below we discuss suggested best practices concerning allocation of directories on the online file server, as well as guidelines for using the various accounts available on the online computer system. Our recommendations are based on long-term experience from CLAS, experience from the Hall D offline software effort, GlueX-doc-1892 (http://argus.phys.uregina.ca/cgi-bin/private/DocDB/ShowDocument?docid=1892), which was developed on this topic as part of the 12 GeV planning process, and discussions with Serguei P and others.

The following sections give background information and discuss general concerns. Skip to the last section if you are interested only in a summary of recommended best practices.


Goals

The single most important goal is to protect the production software deployment directories from accidental overwrite. Note that the worst case is not simply causing the system to hang or crash; rather, it is causing the system to appear to work while in fact it is not working properly. Many days of data taking could be lost if this happens.

Another important goal is to provide a suitable environment for developers. Such an environment should allow for rapid development, testing and installation of new code without compromising production code, and should also allow developers to easily collaborate. Note that we have already implemented a code management/build system that will not be discussed below except to show how it furthers the goals outlined in this document.

Finally, for those of us who will be on call for online systems, another goal is to minimize the number of calls we get at 3am.


Directory and Account Strategy

Almost all directories of interest reside on the gluonfs1 file server in /gluex. This area is visible on all computers and ROCs in the online cluster, and can be seen from all accounts. This includes the main operator account, hdops, as well as individual accounts. It is critical to get the protections in this area correct so that code can be developed and tested by individuals, installed by system installers (via the hdsys account), and used but not overwritten by the hdops account. Of course the hdops account will need to have write access to some directories, for logging, backup and the like.

In analogy with the offline, the main code deployment directory is /gluex/builds, with many named builds appearing underneath. This area must be read-only to hdops, and writeable only from the hdsys installation account (same as for the gluex account in the offline system). Operators and developers must always be able to rely on finding working code in the build areas, with the exception of the devel build, which by its very nature is unstable.

A small number of additional directories need to be treated like the build directory. These will hold production configuration information needed by CODA and other packages, and must be writable only by the hdsys account.
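
The protection rule described above (writable by one installation account, read-only for everyone else) can be spot-checked with a few lines of Python. This is only a sketch: it assumes the protections are expressed as ordinary Unix permission bits rather than ACLs or export options, which this note does not specify, and the function name non_owner_can_write is made up for illustration. It is demonstrated on a scratch directory, not on a real production area.

```python
import os
import stat
import tempfile


def non_owner_can_write(path: str) -> bool:
    """Return True if the group or other write bit is set on path."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IWGRP | stat.S_IWOTH))


# Demonstrate on a scratch directory rather than a real production area.
with tempfile.TemporaryDirectory() as scratch:
    os.chmod(scratch, 0o755)  # owner rwx; group and others read/execute only
    print(non_owner_can_write(scratch))  # False: only the owner may write
    os.chmod(scratch, 0o775)  # group write bit added
    print(non_owner_can_write(scratch))  # True: the directory is no longer protected
```

A directory that follows the policy for build and production configuration areas would be owned by the installation account and would report False here.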

Note that development work should not be done in the directories described above.

None of the directories discussed below will be used for production by operators, and all development work should be done in these directories, preferably from your own account.

Again as in the offline, /gluex/Subsystems should be used by groups of developers to share code, and group protections here should be set up to allow this. Note that in general you need "umask 002" to allow for shared code development in a single area.
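
To illustrate why "umask 002" matters for shared development, the following Python sketch creates a file under two different umask values and reports the resulting permission bits. It runs in a temporary scratch directory and assumes a standard POSIX filesystem without default ACLs; the helper name mode_of_new_file is hypothetical.

```python
import os
import stat
import tempfile


def mode_of_new_file(umask_value: int) -> str:
    """Create a file under the given umask and return its permission bits in octal."""
    previous = os.umask(umask_value)
    try:
        with tempfile.TemporaryDirectory() as scratch:
            path = os.path.join(scratch, "example.txt")
            # Request rw for everyone; the umask removes bits from this request.
            fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o666)
            os.close(fd)
            return oct(stat.S_IMODE(os.stat(path).st_mode))
    finally:
        os.umask(previous)  # always restore the caller's umask


print(mode_of_new_file(0o022))  # 0o644: group members cannot modify the file
print(mode_of_new_file(0o002))  # 0o664: group members can modify the file
```

With a umask of 022, files you create in a shared area cannot be edited by the rest of your group, which defeats the purpose of a shared directory.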

Similarly, /gluex/Users is for individual use. Note you can also use /home/<your-username>, although system protections might be somewhat stricter here than in /gluex/Users.


Summary of Recommended Best Practices

  • Operators should work exclusively from the hdops account, which has write access only to selected logging and backup directories.
  • Production code should be installed via the hdsys account. Installation is the sole purpose of this account, and production directories are writeable only from it.
  • Only selected online experts are allowed to install code via the hdsys account. Contact the Online group if you need code installed.
  • Developers should work from their own personal accounts.
  • Developers can share code in /gluex/Subsystems (same purpose as /group/halld in the offline).
  • All development efforts should use available code management tools.
  • Use the scons-based online build system except in special cases (e.g. EPICS, offline code used in the online).
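
As one possible way to reinforce the account guidelines above, a development script could refuse to run from the shared accounts. The sketch below is illustrative only: the account names hdops and hdsys come from this note, while the function names and the idea of adding such a guard to scripts are assumptions, not part of the existing online build system.

```python
import getpass

# Shared accounts named in this note; development is meant to happen elsewhere.
SHARED_ACCOUNTS = {"hdops", "hdsys"}


def is_shared_account(username: str) -> bool:
    """Return True if username is one of the shared operator/installer accounts."""
    return username in SHARED_ACCOUNTS


def require_personal_account() -> None:
    """Exit with a message if the current user is a shared account."""
    user = getpass.getuser()
    if is_shared_account(user):
        raise SystemExit(
            f"Refusing to run as '{user}': use your personal account for development."
        )
```

A developer tool would call require_personal_account() once at startup, before touching any files.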