GlueX Analysis Workshop 2013
Purpose
This tutorial/workshop session gives an introduction to the Hall-D analysis tools. It is open to anyone, inside or outside the GlueX collaboration, who has an interest in doing physics in Hall-D. The tutorial will cover the basics of:
- Physics Analysis Framework
- Using Boosted Decision Trees
- Amplitude Analysis using AMPtools
The tutorial will consist of a few talks with room for questions/discussion.
Location and Time
The tutorial will take place:
DATE: July 10-11, 2013
TIME: 9:00am to 4:30pm EDT (i.e. JLab time).
LOCATION: CEBAF Center L102 - L104
Registration
Remote Participation
Schedule Details (Work in Progress)
Overview (Matt)
- Talk (30 min): Physics Overview and Analysis Strategy
- Define the physics goals
- Discuss detector
- Discuss general analysis strategy
- Talk (15 min): A specific physics example: γp→π+π+π-n
- Exercise (15 min): Generate γp→π+π+π-n with intermediate resonances
Monte Carlo Generators and Detector Simulation (Mark)
- Talk (30 min): Other standard Monte Carlo tools (bggen, genr8, hdgeant, mcsmear)
- Exercise (20 min): Generate PYTHIA data, run hdgeant and mcsmear
JANA/DANA (Paul)
- Talk (35 mins): Intro to JANA/DANA (factories & plugins)
- JANA is a multithreaded, factory-based framework (describe the concept of factories and tags, e.g. BCAL); event processors control the workflow
- The primary executable (hd_root) does virtually nothing by itself; the processor in your plugin tells it what to do
- JANA automatically loops over the events in your input files, executing the primary factory & plugin methods as needed: init(), brun(), evnt(), erun(), fini()
- Track/Shower reconstruction, REST format, danarest plugin
- Basic Analysis Classes: DKinematicData, DChargedTrackHypothesis, DChargedTrack, DNeutralShower, DMCThrown, DBeamPhoton
- hd_dump: Prints data to screen
- Exercise (25 mins):
- Use danarest plugin to do track reconstruction.
- Print basic analysis objects (DMCThrown, etc.) to screen via hd_dump and inside of plugin
- Stretch goal: figure out how to get the run number from inside the evnt() method and print it to screen for each event.
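The callback sequence named in the talk outline can be pictured with a toy driver. This is only a sketch and not the JANA API: `ToyEvent`, `ToyProcessor`, and `runToyLoop` are hypothetical names invented for illustration, and only the method names init(), brun(), evnt(), erun(), fini() come from the outline above. It assumes that brun() and erun() bracket each contiguous block of events sharing a run number.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an event record; not a JANA class.
struct ToyEvent {
    int runNumber;
    int eventNumber;
};

// Hypothetical processor that simply logs every callback it receives.
struct ToyProcessor {
    std::vector<std::string> log;
    bool runOpen = false;

    void init() { log.push_back("init"); }
    void brun(int run) {
        runOpen = true;
        log.push_back("brun " + std::to_string(run));
    }
    void evnt(const ToyEvent& ev) {
        assert(runOpen);  // in this toy model evnt() is always bracketed by brun()/erun()
        log.push_back("evnt " + std::to_string(ev.eventNumber));
    }
    void erun() {
        runOpen = false;
        log.push_back("erun");
    }
    void fini() { log.push_back("fini"); }
};

// Toy driver: init() once, brun()/erun() around each run, evnt() per event, fini() once.
// Assumes the events of a given run are contiguous in the input.
void runToyLoop(ToyProcessor& proc, const std::vector<ToyEvent>& events) {
    proc.init();
    bool inRun = false;
    int currentRun = 0;
    for (const ToyEvent& ev : events) {
        if (!inRun || ev.runNumber != currentRun) {
            if (inRun) proc.erun();
            currentRun = ev.runNumber;
            proc.brun(currentRun);
            inRun = true;
        }
        proc.evnt(ev);
    }
    if (inRun) proc.erun();
    proc.fini();
}
```

Feeding the driver two events from one run followed by one event from another produces the log init, brun, evnt, evnt, erun, brun, evnt, erun, fini. In a real plugin the framework plays the role of `runToyLoop`, and the user supplies only the processor.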
ANALYSIS Library (Paul)
Analysis Basics
- Talk (25 mins):
- User specifies reaction & decay chain, library generates all particle combinations
- Particle Combinations: Why, and why so many? (extra tracks, any PID of the same charge q, decay products of different particles, beware double-counting), DParticleCombo contents
- Basic example plugin: 3pi_n (go over line by line)
- Exercise (25 mins):
- Generate 3pi_n plugin, manually histogram: #combos, #events with >= 1 combo, missing mass (calc manually), p-vs-theta & vertex-z for pi+'s and pi-'s
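For the "calc manually" part of the exercise, the missing mass is the invariant mass of (beam + target - detected particles). The sketch below shows that arithmetic, plus a simple count of how many π+π+π- assignments exist for a given number of charged tracks. Everything here is a hypothetical illustration using only the standard library: `P4`, `fromMassAndMomentum`, `missingMass2`, and `countThreePiCombos` are invented names, not classes or functions from the analysis library, and the count assumes every track of the right charge is tried as a pion with a single beam photon.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Minimal four-momentum (GeV, c = 1); a hypothetical stand-in for a Lorentz-vector class.
struct P4 {
    double e, px, py, pz;
};

P4 operator+(const P4& a, const P4& b) {
    return {a.e + b.e, a.px + b.px, a.py + b.py, a.pz + b.pz};
}

P4 operator-(const P4& a, const P4& b) {
    return {a.e - b.e, a.px - b.px, a.py - b.py, a.pz - b.pz};
}

// Invariant mass squared: E^2 - |p|^2.
double mass2(const P4& p) {
    return p.e * p.e - p.px * p.px - p.py * p.py - p.pz * p.pz;
}

// Build a four-momentum from a mass hypothesis and a three-momentum.
P4 fromMassAndMomentum(double m, double px, double py, double pz) {
    assert(m >= 0.0);
    return {std::sqrt(m * m + px * px + py * py + pz * pz), px, py, pz};
}

// Missing mass squared: (beam + target - sum of detected particles)^2.
double missingMass2(const P4& beam, const P4& target, const std::vector<P4>& detected) {
    P4 missing = beam + target;
    for (const P4& p : detected) missing = missing - p;
    return mass2(missing);
}

// Ways to pick 2 pi+ and 1 pi- from nPos positive and nNeg negative tracks:
// C(nPos, 2) * nNeg. Each additional beam-photon candidate multiplies this further.
long countThreePiCombos(int nPos, int nNeg) {
    assert(nPos >= 0 && nNeg >= 0);
    return static_cast<long>(nPos) * (nPos - 1) / 2 * nNeg;
}
```

With three positive and two negative tracks, `countThreePiCombos(3, 2)` already gives six combinations for a single beam photon, which is one reason per-combo histograms can double-count. Note that `missingMass2` can come out negative for mismeasured events, so take the square root only after checking the sign.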
Kinematic Fitting
- Talk (30 mins):
- What it is, how it works, constraints, how neutrals & magnetic field are handled
- Setting it up automatically and manually (go over line-by-line)
- Exercise (30 mins):
- Auto-kinematic fit 3pi_n, histogram kinematic fit confidence level, missing mass after kinematic fit (conlev > 1%)
- Manually kinematic fit 3pi_n, including histogramming confidence levels, missing mass after kinematic fit (conlev > 1%)
- Stretch goal: histogram pi- px and vx pulls (conlev > 5%)
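The two quantities histogrammed in these exercises can be written down compactly. The sketch below computes a fit confidence level as the upper-tail χ² probability for an integer number of degrees of freedom, and a pull as (measured - fitted) divided by the square root of the difference of the variances. The function names `chi2ConfidenceLevel` and `pull` are hypothetical, the pull sign convention is an assumption (some fitters use fitted - measured), and nothing here is taken from the kinematic fitter itself.

```cpp
#include <cassert>
#include <cmath>

// Upper-tail chi-square probability P(chi2 >= x) for integer ndf,
// using the standard closed-form series for even and odd ndf.
double chi2ConfidenceLevel(double chi2, int ndf) {
    assert(chi2 >= 0.0 && ndf >= 1);
    const double pi = std::acos(-1.0);
    const double half = 0.5 * chi2;
    if (ndf % 2 == 0) {
        // Even ndf: exp(-x/2) * sum_{r=0}^{ndf/2-1} (x/2)^r / r!
        double term = 1.0;
        double sum = 1.0;
        for (int r = 1; r < ndf / 2; ++r) {
            term *= half / r;
            sum += term;
        }
        return std::exp(-half) * sum;
    }
    // Odd ndf: erfc(sqrt(x/2))
    //   + sqrt(2/pi) * exp(-x/2) * sum_{r=1}^{(ndf-1)/2} x^(r-1/2) / (1*3*...*(2r-1))
    double sum = 0.0;
    double term = std::sqrt(chi2);  // r = 1 term
    for (int r = 1; r <= (ndf - 1) / 2; ++r) {
        if (r > 1) term *= chi2 / (2 * r - 1);
        sum += term;
    }
    return std::erfc(std::sqrt(half)) + std::sqrt(2.0 / pi) * std::exp(-half) * sum;
}

// Pull of one fitted variable: (measured - fitted) / sqrt(sigma_meas^2 - sigma_fit^2).
// The fitted uncertainty must be smaller than the measured one.
double pull(double measured, double fitted, double sigmaMeasured, double sigmaFitted) {
    const double variance = sigmaMeasured * sigmaMeasured - sigmaFitted * sigmaFitted;
    assert(variance > 0.0);
    return (measured - fitted) / std::sqrt(variance);
}
```

For correctly modelled uncertainties the confidence level distribution is flat between 0 and 1, and each pull distribution is a unit Gaussian centred at zero, which is what the histograms are meant to check. In ROOT, `TMath::Prob(chi2, ndf)` returns the same upper-tail probability.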
Analysis Details
- Talk (30 mins):
- Thrown/Reconstructed track matching, PID FOM Calculation, new DTrackTimeBased for new PIDs (q+/q- subtlety), Event RF Bunch Selection
- Different object versions (DParticleCombo, hypotheses)
- Exercise (25 mins):
- Manually histogram: PID confidence levels (dE/dx, timing, total)
- Stretch goal: histogram PID confidence level vs. projected track start time for different PIDs for BCAL/TOF
Analysis Actions - Pre-existing
- Talk (25 mins):
- Reaction-Independent (can be called manually): thrown, reconstructed track distros, gen-recon comparison (go over line-by-line on one)
- Reaction-Dependent Histograms (executed for each combo): PID, kinfit results, missing mass, invariant mass (go over line-by-line on one)
- Reaction-Dependent Cuts (executed for each combo): PID, kinfit conlev, missing mass, invariant mass (go over line-by-line on one)
- Histograms of the #events/combos that pass each cut
- Exercise (25 mins):
- Add reaction-independent analysis actions: fill histograms and look at kinematics in ROOT
- Add reaction-dependent analysis actions: hist/cut pid, hist/cut kinematic fit results, hist missing mass before/after kinfit conlev cut
Analysis Actions - Custom
- Talk (10 mins):
- Creating your own custom actions, remind to beware double-counting, warn about multi-threading issues
- Exercise (15 mins):
- Generate custom action to make Dalitz plot of 3 pions.
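The quantities a Dalitz-plot action would fill are two-body invariant masses squared. As a minimal sketch of that calculation for π+π+π-, the code below returns the two π+π- masses squared. `P4`, `makeP4`, `invariantMass2`, and `dalitzCoordinates` are hypothetical names using only the standard library, and ordering the pair (smaller value first) is just one possible convention for handling the two identical π+; filling both orderings is another.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Minimal four-momentum (GeV, c = 1); a hypothetical stand-in for a Lorentz-vector class.
struct P4 {
    double e, px, py, pz;
};

// Build a four-momentum from a mass hypothesis and a three-momentum.
P4 makeP4(double m, double px, double py, double pz) {
    assert(m >= 0.0);
    return {std::sqrt(m * m + px * px + py * py + pz * pz), px, py, pz};
}

// Invariant mass squared of a two-particle system.
double invariantMass2(const P4& a, const P4& b) {
    const double e = a.e + b.e;
    const double px = a.px + b.px;
    const double py = a.py + b.py;
    const double pz = a.pz + b.pz;
    return e * e - px * px - py * py - pz * pz;
}

// Dalitz coordinates for pi+ pi+ pi-: the two pi+ pi- masses squared.
// The result does not depend on which pi+ is passed first.
std::pair<double, double> dalitzCoordinates(const P4& piPlus1, const P4& piPlus2,
                                            const P4& piMinus) {
    double m2a = invariantMass2(piPlus1, piMinus);
    double m2b = invariantMass2(piPlus2, piMinus);
    if (m2a > m2b) std::swap(m2a, m2b);
    return {m2a, m2b};
}
```

Because swapping the two π+ gives the same coordinates, two combos that differ only by that swap describe the same physical point. A custom action should fill it once, which is one concrete case of the double-counting warning in the talk.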
REST & ROOT
- Talk (25 mins):
- REST Skims, TTree format, Making/Using a TSelector, beware double-counting when filling ROOT trees.
- Exercise (30 mins):
- Create a skim of 3pi_n events, analyze the skim data, save the results to a ROOT TTree, analyze the data in ROOT (make some plots)
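The double-counting warning matters once results are stored with one entry per particle combination: the same track can then appear in several entries of the same event. One way to guard against this, sketched below, is to remember which (event, track) pairs have already been used. The record layout `ComboEntry`, its field names, and `uniquePiMinusMomenta` are all hypothetical; the actual tree format covered in the talk may look quite different.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Hypothetical flat record: one entry per particle combination.
struct ComboEntry {
    int eventNumber;
    int piMinusTrackId;      // hypothetical identifier of the pi- candidate track
    double piMinusMomentum;  // GeV
};

// Returns the pi- momenta to histogram, using each (event, track) pair at most once
// even if that track appears in several combinations of the same event.
std::vector<double> uniquePiMinusMomenta(const std::vector<ComboEntry>& entries) {
    std::set<std::pair<int, int>> seen;
    std::vector<double> momenta;
    for (const ComboEntry& entry : entries) {
        // emplace() reports whether the key was newly inserted.
        if (seen.emplace(entry.eventNumber, entry.piMinusTrackId).second) {
            momenta.push_back(entry.piMinusMomentum);
        }
    }
    assert(momenta.size() <= entries.size());
    return momenta;
}
```

The key should contain whatever defines uniqueness for the quantity being plotted: a single track for a momentum spectrum, or the full set of contributing particles for an invariant mass.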
Analyzing Decays
- Talk (5 mins):
- Setting up DReaction
- Exercise (45 mins):
- Make a plugin to analyze: γp→η'p, η'→ηπ+π-, η→γγ
- Do kinematic fitting, output to ROOT, histogram PID, kinfit results, mass peaks, etc.
Boosted Decision Trees (Mike & Justin)
- Talk (45 mins): Multivariate Analysis Overview
- Talk (20 mins):
- Introduce TMVA, setup factory, select input variables, book methods, and training samples
- Exercise (25 mins):
- Train BDT using 3pi_n samples, make plots of input variables and classifier response with TMVAGui
- Talk (20 mins):
- Describe weight files, setup reader, evaluation on larger samples, and storing classifier response in TTree
- Exercise (25 mins):
- Apply BDT to large background sample, compute efficiency and purity
- Talk (10 mins):
- Describe setup for other Multivariate Classifiers
- Exercise (20 mins):
- Try many different classifiers for 3pi_n channel and evaluate with TMVAGui
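For the efficiency and purity exercise above, the bookkeeping can be sketched independently of TMVA. Given classifier responses for a known signal sample and a known background sample, efficiency is the fraction of signal passing the cut and purity is the signal fraction among everything that passes. `SelectionResult` and `evaluateCut` are hypothetical names, and the sketch assumes unweighted events; in practice the two samples would first need to be scaled to their expected relative yields.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Result of applying one cut on the classifier response.
struct SelectionResult {
    double efficiency;  // passing signal / all signal
    double purity;      // passing signal / (passing signal + passing background)
};

// Events with response > cut are kept. Assumes unweighted, non-empty signal input.
SelectionResult evaluateCut(const std::vector<double>& signalResponse,
                            const std::vector<double>& backgroundResponse,
                            double cut) {
    assert(!signalResponse.empty());
    std::size_t signalPass = 0;
    std::size_t backgroundPass = 0;
    for (double r : signalResponse) {
        if (r > cut) ++signalPass;
    }
    for (double r : backgroundResponse) {
        if (r > cut) ++backgroundPass;
    }
    const double efficiency =
        static_cast<double>(signalPass) / static_cast<double>(signalResponse.size());
    const std::size_t totalPass = signalPass + backgroundPass;
    // Purity is undefined when nothing passes; report 0 in that case.
    const double purity =
        totalPass == 0 ? 0.0 : static_cast<double>(signalPass) / static_cast<double>(totalPass);
    return {efficiency, purity};
}
```

Scanning `cut` over the range of the classifier response and plotting efficiency against purity gives a simple way to compare the different classifiers tried in the last exercise.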
Amplitude Analysis (Matt)
- Talk (45 mins): Amplitude Analysis Fitting Strategy
- Exercise (20 mins): performing a single fit and analyzing projections
- Talk (30 mins): Multiple fits, opportunities for extensions
- Exercise (20 mins): performing multiple fits in different bins of a mass spectrum