|
|
The CEBAF accelerator at Jefferson Lab provides high-energy electron beams to each of three experimental halls. In each hall the electron beam interacts with a target, producing a variety of particles or reaction products. Spectrometers are complex particle-detector systems that identify the reaction products and determine their kinematic parameters (energy, momentum, etc.). Two of the experimental halls use high-resolution spectrometers. The third operates with a large solid-angle detector system, the CLAS spectrometer. In all cases a combination of particle detectors and magnetic fields is used to identify the reaction products.

The spectrometers consist of many layers of individual detectors. Multiple layers of wire chambers provide tracking information. These detectors have hundreds to thousands of sense wires per layer, each of which registers a hit when a particle passes nearby. As a result, much of the data stream is devoted to storing wire chamber information, and a large share of the computing resources is devoted to reconstructing the particles’ tracks. Layers of plastic scintillator detectors provide timing and energy information for particles passing through them. They also serve to trigger the detector’s electronic readout. Large calorimeter detectors provide energy and particle-identification information, and several types of Cerenkov detectors provide additional particle-identification information. Magnetic fields are employed in combination with the position-sensitive wire chambers to measure the particles’ momenta as they pass through the spectrometer. The layers of detectors are therefore placed before, after, and inside the region of magnetic field to determine the particles’ paths.

The experiments produce large data sets. For the two experimental halls with high-resolution spectrometers, about 20 layers of detectors are used, creating about 1 kilobyte of data per physics event.
Given typical data rates of about 1 kHz, this amounts to about 100 gigabytes of data per day. For CLAS, the large-acceptance spectrometer, much more data is acquired. Because the CLAS system almost completely surrounds the target region, all particles created in the reaction can be detected simultaneously, compared with one or two particles in the other experimental halls. As a result the number of readout channels grows from 1,200 in the other halls to almost 40,000 for CLAS. The physics event size increases to around 5 kilobytes, so at a 2.5 kHz data rate over one terabyte of data is taken per day, resulting in over 100 terabytes of data per year. This is expected to double in the near future as data acquisition capacities increase.

Acquiring data is only half the battle in extracting the physics. All of the data must be reconstructed to analyze the physics events. Wire chamber data is analyzed for tracks, linking the various "hits" in the wires to physical paths. Because magnetic fields are involved, the tracks are not straight, and each particle’s curvature in the magnetic field must be taken into account. This becomes even more complicated in CLAS, where multiple particles must be tracked through a complicated magnetic field using many more wire chamber channels than in the other spectrometers. The other detectors’ information must also be folded into this analysis: for each track in a set of wire chambers, the corresponding data from the plastic scintillators, calorimeters, and Cerenkov detectors must be correlated. Once the correlations are complete, each track can be identified as a particle and the physics information can be deduced. At this stage the physics analysis begins by forming the various physics observables and evaluating their meaning. The analyses may be done in a single pass or in multiple passes.
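The daily volumes quoted earlier in this section follow from event size multiplied by event rate and the number of seconds in a day. The short Python sketch below reproduces that arithmetic as a sanity check. It assumes decimal units (1 kilobyte = 1000 bytes) and continuous data taking. The function name and the `live_fraction` parameter are illustrative, not part of any Jefferson Lab software.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 s


def daily_volume_bytes(event_size_bytes, event_rate_hz, live_fraction=1.0):
    """Bytes recorded per day for a given event size and event rate.

    live_fraction is a hypothetical knob for the fraction of the day
    spent taking data; the text does not state one, so it defaults to 1.
    """
    return event_size_bytes * event_rate_hz * SECONDS_PER_DAY * live_fraction


# Figures quoted in the text, in decimal units (assumption).
hrs_per_day = daily_volume_bytes(1_000, 1_000)   # high-resolution halls: 1 kB at 1 kHz
clas_per_day = daily_volume_bytes(5_000, 2_500)  # CLAS: 5 kB at 2.5 kHz

print(f"High-resolution hall: {hrs_per_day / 1e9:.1f} GB/day")   # 86.4 GB/day
print(f"CLAS:                 {clas_per_day / 1e12:.2f} TB/day")  # 1.08 TB/day

# Days of continuous running needed to accumulate 100 TB at the CLAS rate.
days_for_100_tb = 100e12 / clas_per_day
print(f"CLAS days for 100 TB: {days_for_100_tb:.0f}")             # 93
```

Under these assumptions the high-resolution halls come out at roughly 86 GB per day, consistent with "about 100 gigabytes", and CLAS at roughly 1.08 TB per day. The yearly figure of over 100 terabytes would then correspond to about 93 days of continuous running. The actual running schedule is not stated here.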
For the high-resolution spectrometers the data involves a single reaction, and the raw data is often analyzed directly for physics information. The CLAS spectrometer’s data set, by contrast, contains multiple physics processes, so a multi-step analysis is required. The first pass extracts tracks, correlates them, and assigns the physics kinematic variables. This serves as a common pass for all analyses. After the first pass the various subgroups begin their analyses, focusing on the physics events in their region of interest to form their physics observables.

Clearly all of these analyses are data-intensive tasks. Significant resources must be devoted to handling the flow of the data: from detector to storage, from storage to analysis, and from analysis back to storage. At the same time, resources are needed to manage the bookkeeping of data location for the various experiments and to keep track of the correlated information in the data stream. Our application will address problems in all of these areas, providing a breakthrough capability that will be useful at Jefferson Lab and beyond. |
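The two-stage CLAS analysis described above (one common reconstruction pass, then per-subgroup selections) can be pictured with a minimal Python sketch. Everything here is hypothetical and for illustration only: the class names, the function names, and the toy "reconstruction", which simply wraps pre-made values. Real tracking and detector correlation are far more involved.

```python
from dataclasses import dataclass


@dataclass
class Track:
    momentum_gev: float  # would come from the track's curvature in the field
    particle: str        # would be assigned after correlating the other detectors


@dataclass
class ReconstructedEvent:
    event_id: int
    tracks: list


def first_pass(raw_events, reconstruct):
    """Common pass: reconstruct every raw event once, for all analyses."""
    return [ReconstructedEvent(i, reconstruct(raw)) for i, raw in enumerate(raw_events)]


def second_pass(events, selection):
    """Subgroup pass: keep only the events in one region of interest."""
    return [ev for ev in events if selection(ev)]


def toy_reconstruct(raw):
    # Stand-in for tracking plus detector correlation (assumption).
    return [Track(p, name) for p, name in raw]


# Toy raw data: each event is a list of (momentum in GeV, particle label) pairs.
raw_events = [
    [(1.2, "e-"), (0.8, "p")],
    [(2.0, "e-")],
    [(1.5, "e-"), (0.4, "pi+"), (0.9, "p")],
]

common = first_pass(raw_events, toy_reconstruct)


def has_proton(ev):
    return any(t.particle == "p" for t in ev.tracks)


def has_pion(ev):
    return any(t.particle == "pi+" for t in ev.tracks)


proton_sample = second_pass(common, has_proton)
pion_sample = second_pass(common, has_pion)

print([ev.event_id for ev in proton_sample])  # [0, 2]
print([ev.event_id for ev in pion_sample])    # [2]
```

The design point this sketch tries to capture is that the expensive reconstruction runs once and its output is shared, while each subgroup applies only a lightweight selection to that common output.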