Online task list for 2011
Contents
- 1 FY2011 Activity Schedule for Online Computing
- 2 Descriptions of Scheduled Activities
- 2.1 Plan Front-End Software
- 2.2 Plan DAQ Software Event Unblocking
- 2.3 Plan DAQ Software Scripts
- 2.4 Plan DAQ Software Run Control
- 2.5 Plan DAQ Software Code Management
- 2.6 Plan Monitoring Framework
- 2.7 Plan Monitoring Scalers
- 2.8 Plan Monitoring Histograms
- 2.9 Plan Monitoring Remote
- 2.10 Plan Monitoring Hardware Status
- 2.11 Plan Monitoring Process Status
- 2.12 Plan Monitoring Trigger
- 2.13 Plan Alarm Sys
- 2.14 Plan Archiving DAQ Configuration
- 2.15 Plan Archiving Run Info
- 2.16 Plan Archiving Controls
- 2.17 Plan Event Display
- 2.18 Plan Storage Management
- 2.19 Plan Experiment Controls Framework
- 2.20 Plan Experiment Controls Display management
- 2.21 Plan Experiment Controls Backup/Restore
- 2.22 Plan Experiment Controls Magnet PS
- 2.23 Plan Experiment Controls HV
- 2.24 Plan Experiment Controls LV
- 2.25 Plan Experiment Controls Motors
- 2.26 Plan Experiment Controls Gas Systems
- 2.27 Plan Experiment Controls Temperature
- 2.28 Plan Experiment Controls Target
- 2.29 Plan Experiment Controls Interface with DAQ
FY2011 Activity Schedule for Online Computing
The table below contains activities from the 12GeV project schedule in the Online Computing section which have work scheduled for FY2011. Detailed descriptions for the activities are kept at the bottom of the page and can be jumped to by clicking the short description in the table.
A breakdown of each activity into smaller tasks is maintained in an Excel file on the group disk here:
/group/halld/Individual-Schedules/Online_Computing
Activity Line | Activity Name | Man-weeks | Names of people | Comments |
---|---|---|---|---|
1532025 | Plan Front-End Software | 16.5 | D. Lawrence, D. Abbott, B. Moffit | |
1532030 | Plan DAQ Software Event Unblocking | 9 | D. Lawrence, D. Abbott, B. Moffit | |
1532030a | Plan DAQ Software Scripts | 6 | ||
1532030b | Plan DAQ Run Control | 5 | ? + V. Gyurjyan | Mostly Vardan |
1532030c | Plan DAQ Code Management | 4 | ||
1532035 | Plan Monitoring Framework | 6 | ? + V. Gyurjyan | |
1532035a | Plan Monitoring Scalers | 3 | ||
1532035b | Plan Monitoring Histograms | 4 | D. Lawrence | |
1532035c | Plan Remote Monitoring | 3 | ? + V. Gyurjyan | |
1532035d | Plan Monitoring Hardware | 4 | ? + V. Gyurjyan | |
1532035f | Plan Monitoring Processes | 3 | ? + V. Gyurjyan | |
1532035g | Plan Monitoring Trigger | 2 | S. Somov + V. Gyurjyan | |
1532040 | Plan Alarm Systems | 6 | Universities are expected to contribute more | |
1532045 | Plan Archiving DAQ Configuration | 3 | ||
1532045a | Plan Archiving Run Info | 5 | ||
1532045b | Plan Archiving Controls | 5 | ||
1532050 | Plan Event Display | 2 | Universities are expected to contribute more | |
1532055 | Plan Storage Management | 11.3 | Computer Center can help | |
1532060 | Plan Controls Framework | 4 | ? + V. Gyurjyan | |
1532060a | Plan Display Management | 3 | ||
1532060b | Plan Controls Backup/Restore | 3 | ||
1532060c | Plan Controls Magnet PS | 4 | ||
1532060d | Plan Controls HV | 3 | ||
1532060f | Plan Controls LV | 4 | ||
1532060g | Plan Controls Motors | 4 | ||
1532060h | Plan Controls Gas Systems | 4 | ||
1532060j | Plan Controls Temperature | 4 | ||
1532060k | Plan Controls Target | 5 | ||
1532060n | Plan Controls/DAQ interface | 3 | ? + V. Gyurjyan | Mostly Vardan |
1532065 | Trigger Board Initialization | 27 | S. Somov + Electronics Group | Two-year duration |
1532035 | Level 1 Verification | 24 | S. Somov + Electronics Group | Two-year+ duration |
Descriptions of Scheduled Activities
Plan Front-End Software
This will plan the Hall-D specific details of configuring and maintaining the software used in the front-end electronics in the hall. This includes where the CODA 3 configurations will be kept (disk resident XML files, database, ...?), and how we will revert to previous configurations or implement new ones.
This will also include plans for how the translation table needed for the offline will be interfaced with the online. Specifically, if the DAQ system detects module types automatically, the plan will cover how and where it will record these for use in parsing by both the online monitoring system and the offline systems.
Because the online systems can be very sensitive to configuration details, the ability to make changes should probably be limited to certain individuals. This plan should address how access to deployed system configurations will be limited to ensure the integrity of the DAQ system.
Plan DAQ Software Event Unblocking
In production running, the events will arrive entangled, meaning that the fragments of a single event will not all appear in a single, contiguous memory section. Rather, the fragments will be mixed with fragments from other events and must be disentangled (or unblocked) to get a single event that may be analyzed. This will have to be done for monitoring as well as for L3 event filtering, where the ability to save or discard a single event will be required.
This activity will provide a plan for how and where the events will be disentangled (EB, L3/monitoring farm, offline code base, ...?). This will include how the single events will be passed on to the CODA 3 Event Recorder for writing to disk/tape.
Estimates of CPU/memory/bandwidth resources required will be included so they may be added into the overall requirements for the Hall-D online computing resources.
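As a rough illustration of the disentangling step described above, the following Python sketch regroups interleaved fragments by event number. It is only a sketch under stated assumptions: the `(event_number, source_id, payload)` tuple layout, the `unblock` function, and the `roc_a`/`roc_b` source names are hypothetical and do not reflect the actual CODA 3 data format.

```python
from collections import defaultdict


def unblock(fragments, n_sources):
    """Regroup interleaved fragments into complete events.

    `fragments` is an iterable of (event_number, source_id, payload) tuples
    in arrival order. The tuple layout and `n_sources` are hypothetical
    stand-ins for whatever the real data format provides.
    """
    pending = defaultdict(dict)  # event_number -> {source_id: payload}
    for event_number, source_id, payload in fragments:
        pending[event_number][source_id] = payload
        if len(pending[event_number]) == n_sources:
            # Every source has contributed, so the event is complete.
            yield event_number, pending.pop(event_number)


# Fragments from two events, interleaved across two (hypothetical) sources.
stream = [
    (1, "roc_a", b"a1"),
    (2, "roc_a", b"a2"),
    (2, "roc_b", b"b2"),
    (1, "roc_b", b"b1"),
]
events = list(unblock(stream, n_sources=2))
```

Events are emitted in order of completion rather than event number, so a consumer that needs strict ordering (or that must discard single events for L3 filtering) would have to handle that separately. The `pending` dictionary also gives a feel for the memory cost, since it holds every partially assembled event.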
Plan DAQ Software Scripts
Plan for the general organization of scripts used as part of the Hall-D online systems. This will include the languages (python, perl, bash, ...) used for the command-line, batch-mode, cron-job, and GUI scripts. How the scripts will be maintained and how editing access will be restricted will also be included.
Plan DAQ Software Run Control
Plan for implementing the CODA 3 Run Control in the Hall-D online systems. This will include how the configuration will be maintained and how access to editing the configuration will be restricted. Ability to access Run Control from the counting house, the experimental hall, and via a remote, secure connection (for on-call maintenance) will be required. How that will be done while minimizing risk of disrupting operations will be addressed.
Plan DAQ Software Code Management
A plan for maintaining the online code base. This will include the compiled programs, scripts, and configuration files that comprise the online software systems. It will also include a choice of code management system and where it will be hosted. How this integrates with the offline software code base, which will very likely be used as a basis for the L3 event filter, will be addressed.
Plan Monitoring Framework
The substantial number of independent monitoring subsystems developed for Hall D need to be coordinated and results presented to operators in a coherent way. Further, the monitoring system must interact with other independent systems such as the alarm system, archiving system, control system, etc. An overall strategy and architecture must be developed to ensure transparent interoperation among all these systems.
Plan Monitoring Scalers
Scaler information generated by the trigger, DAQ and other systems must be extracted from hardware, then monitored, analyzed and presented to operators and other automated monitoring systems as appropriate. Analyzed and raw scaler information must further be archived, and for critical scalers, archived in multiple places for redundancy. Scaler information in the data stream may need to be diverted into separate data streams for ease of access by the Offline group, and some scaler data may need to be entered into databases.
Finally, alarms need to be generated when automated analysis programs find problems in the scaler data.
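One simple form of automated scaler analysis is to convert successive counter readings into a rate and compare it against limits. The Python sketch below assumes this approach for illustration only; the function names, the `trigger_l1` scaler name, and the limit values are hypothetical, and counter rollover is ignored.

```python
def scaler_rate(prev_count, curr_count, dt_seconds):
    """Rate in Hz from two successive readings of a free-running counter."""
    return (curr_count - prev_count) / dt_seconds


def check_rate(name, rate_hz, low_hz, high_hz):
    """Return an alarm message if the rate is outside [low_hz, high_hz]."""
    if rate_hz < low_hz:
        return f"{name}: rate {rate_hz:.1f} Hz below limit {low_hz:.1f} Hz"
    if rate_hz > high_hz:
        return f"{name}: rate {rate_hz:.1f} Hz above limit {high_hz:.1f} Hz"
    return None  # within limits, no alarm


# Hypothetical readings taken 2 seconds apart.
rate = scaler_rate(prev_count=5000, curr_count=15000, dt_seconds=2.0)
message = check_rate("trigger_l1", rate, low_hz=100.0, high_hz=2000.0)
```

In a real system the returned message would be handed to the alarm system rather than kept in a local variable.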
Plan Monitoring Histograms
Events taken by the DAQ system must be continuously monitored for quality. The histogram monitoring system must extract a sample of events from the DAQ in real-time, analyze them, generate histograms, then present the information to operators and to other automated monitoring systems. The histograms must be archived periodically, and a reset mechanism must exist to clear histograms e.g. at the beginning of a new run. The system must also be able to read and analyze events from a file and operate independently of the system monitoring events in real-time.
Currently the RootSpy framework, developed within the Offline group but with the Online in mind, appears to be the best foundation for event histogram monitoring.
Finally, alarms need to be generated when automated analysis programs find problems in the histogram data.
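The fill, archive, and reset cycle described above can be sketched in a few lines of Python. This is a minimal illustration with a hypothetical `Histogram1D` class; it does not use ROOT or RootSpy and is not meant to represent their interfaces.

```python
class Histogram1D:
    """Fixed-width 1D histogram covering the half-open range [lo, hi)."""

    def __init__(self, n_bins, lo, hi):
        self.n_bins, self.lo, self.hi = n_bins, lo, hi
        self.counts = [0] * n_bins

    def fill(self, x):
        """Add one entry; values outside [lo, hi) are ignored."""
        if self.lo <= x < self.hi:
            index = int((x - self.lo) / (self.hi - self.lo) * self.n_bins)
            # Guard against floating-point rounding at the upper edge.
            self.counts[min(index, self.n_bins - 1)] += 1

    def snapshot(self):
        """Return a copy of the bin contents, e.g. for periodic archiving."""
        return list(self.counts)

    def reset(self):
        """Clear all bins, e.g. at the beginning of a new run."""
        self.counts = [0] * self.n_bins


hist = Histogram1D(n_bins=4, lo=0.0, hi=4.0)
for value in [0.5, 1.5, 1.7, 3.9, 4.0]:  # 4.0 falls outside the range
    hist.fill(value)
archived = hist.snapshot()
hist.reset()
```

Taking a copy in `snapshot` before calling `reset` keeps the archived contents independent of the live histogram.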
Plan Monitoring Remote
A large fraction of detector hardware and some online software is being developed by collaborators from other institutions, and they need to be able to monitor performance of their systems from off site. A system needs to be developed to allow them access to almost all information available to shift personnel, but in a way that satisfies JLab cyber security requirements. In some cases remote collaborators may need to take control of DAQ and other systems to diagnose and repair problems in their systems.
Plan Monitoring Hardware Status
A large amount of detector hardware must be monitored for health during hall operations, beyond what is done by the EPICS-based control system. Hardware may inject status information periodically into the data stream, and processes must extract this information, archive it, and present it to operators. Other information will need to be proactively extracted from the hardware at appropriate times and in such a way as to not interfere with the high-speed DAQ system. Some information will only be extracted during special runs or calibration procedures. And of course action must be taken or alarms must be generated when problems are detected.
This system must be designed to handle a large variety of disparate hardware while minimizing the amount of special programming required and must avoid compromising fast DAQ and other common operations.
Plan Monitoring Process Status
A large number of processes running on a large number of computers in the counting house need to be started, stopped and monitored during operations. These processes run under widely varying conditions. E.g. some need to be started at boot time and run continuously, others just during data taking, others just under special conditions.
The existence and health of all these processes need to be continuously monitored in real time. Alarms need to be generated in case of failed processes, and if operator action is not required they can be restarted automatically. The monitoring system must be highly and easily configurable as the critical process list will change fairly often.
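A configurable critical process list with a per-process restart policy might look something like the Python sketch below. The `WatchedProcess` class, the `evaluate` function, and the process names are all hypothetical; how the set of running processes is actually obtained is left open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WatchedProcess:
    name: str
    auto_restart: bool  # True if a restart needs no operator action


def evaluate(watched, running_names):
    """Return (to_restart, alarms) for watched processes that are not running."""
    to_restart, alarms = [], []
    for proc in watched:
        if proc.name in running_names:
            continue
        # Every missing process raises an alarm, restartable or not.
        alarms.append(f"process {proc.name} is not running")
        if proc.auto_restart:
            to_restart.append(proc.name)
    return to_restart, alarms


# Hypothetical critical process list; in practice this would be loaded from
# an easily edited configuration source.
critical = [
    WatchedProcess("event_recorder", auto_restart=False),
    WatchedProcess("scaler_server", auto_restart=True),
    WatchedProcess("hist_monitor", auto_restart=True),
]
to_restart, alarms = evaluate(critical, running_names={"hist_monitor"})
```

Keeping the decision logic separate from the code that queries and restarts processes makes it easy to change the critical process list without touching anything else.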
Plan Monitoring Trigger
The state-of-the-art high-speed Hall D trigger system must be monitored at all times for proper operation. This includes extraction and monitoring of scalers and other data generated by the trigger hardware. This data must be analyzed and compared to expectations based on understanding of the physics involved and the trigger programming. Alarms must be generated and operators notified if problems are detected.
Plan Alarm Sys
The alarm system is foundational for transmission of information to operators and other systems concerning problems detected by online software. It must accept alarm inputs from a wide variety of programs monitoring a large number of disparate systems. Operators need the ability to view alarms in time sequence and/or priority order, and must be able to acknowledge alarms so they no longer appear on critical alarm screens. It must further be possible to "shelve" alarms for some length of time for known problems that cannot be solved quickly. Alarm history must be preserved and be easily viewed.
Overall alarm system design is best described in "Alarm Management: Seven Effective Methods for Optimum Performance" by Hollifield and Habibi. The SNS EPICS alarm system was designed according to many of the principles in the book, and currently seems to be the best choice for use in Hall D.
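The operator-facing requirements above (time or priority ordering, acknowledging, shelving, preserved history) can be illustrated with a small Python sketch. The `Alarm` and `AlarmLog` classes are hypothetical and are not based on the SNS EPICS alarm system; the convention that a larger `priority` value is more urgent is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class Alarm:
    source: str
    message: str
    priority: int               # larger means more urgent (assumed convention)
    raised_at: float            # seconds on any consistent clock
    acknowledged: bool = False
    shelved_until: float = 0.0  # alarm is hidden while now < shelved_until


class AlarmLog:
    def __init__(self):
        self.history = []  # every alarm ever raised is preserved here

    def raise_alarm(self, alarm):
        self.history.append(alarm)

    def critical_view(self, now, by="priority"):
        """Alarms shown to operators: neither acknowledged nor shelved."""
        active = [a for a in self.history
                  if not a.acknowledged and a.shelved_until <= now]
        if by == "priority":
            return sorted(active, key=lambda a: (-a.priority, a.raised_at))
        return sorted(active, key=lambda a: a.raised_at)


log = AlarmLog()
hv = Alarm("hv", "channel tripped", priority=3, raised_at=10.0)
gas = Alarm("gas", "flow low", priority=1, raised_at=5.0)
daq = Alarm("daq", "dead time high", priority=2, raised_at=7.0)
for alarm in (hv, gas, daq):
    log.raise_alarm(alarm)

gas.shelved_until = 100.0  # known problem, hide it for a while
daq.acknowledged = True    # operator has seen this one

visible = log.critical_view(now=20.0)
later = log.critical_view(now=200.0, by="time")
```

Acknowledged and shelved alarms disappear from the critical view but remain in `history`, and a shelved alarm reappears on its own once the shelving period has expired.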
Plan Archiving DAQ Configuration
Plan Archiving Run Info
Plan Archiving Controls
Plan Event Display
Form a plan for the online event display. This display is expected to be running continuously in the counting house to provide a quick visual of individual events being read in from the DAQ. It will also be used to replay events to monitor data integrity and to help debug the DAQ system. The graphics package used and what features the event display must have will be included in the plan. How the event display will interface with the DAQ system to get events will also be addressed.
Plan Storage Management
Plan for online data storage from the DAQ and online systems. This will include the hardware systems (RAID disks?) to hold the data and how it will be transferred to the Computer Center for permanent storage. Bandwidth requirements for the disk will be included, as it may need to support quick replay analysis while still acquiring data. Slow controls values critical for data replay will also need to be copied into long-term storage, possibly alongside the event-level data, so if/how that is done should be addressed.
Plan Experiment Controls Framework
Plan Experiment Controls Display management
The slow controls system in Hall D will require a single Display Management framework to monitor and control the different components in the Hall. A careful study needs to be done to identify the requirements for the different components of the controls and monitoring system. We will also need to study and test existing display management systems that are easy to interface with EPICS in order to select the Display Management system that best matches Hall D needs.
- Identify applications which need control and monitoring, and for each such application determine what screens it will require. Some of the systems may require a large number of screens, which will call for automated screen generation.
- Study a few of the most eligible frameworks and evaluate their applicability to Hall D systems. It is highly desirable that the framework allows for automated generation of screens.
- Make at least one prototype application utilizing the most favorable display management framework to identify the possible difficulties which we may encounter using it.
- Create a work plan for the next two years for developing the control screens for Hall D controls.
Plan Experiment Controls Backup/Restore
Plan Experiment Controls Magnet PS
Plan Experiment Controls HV
Plan Experiment Controls LV
Plan Experiment Controls Motors
Plan Experiment Controls Gas Systems
Plan Experiment Controls Temperature
Plan Experiment Controls Target
Plan Experiment Controls Interface with DAQ
Plan for the Hall-D specific configuration for interfacing the experiment controls with the DAQ. CODA 3 will include support for full experiment controls, which will be leveraged by the Hall-D online system. This will include checks on various non-DAQ online systems by the DAQ system to help ensure data integrity. How the configurations will be maintained and how access to their modification will be limited will be addressed in the plan.