DSelector

From GlueXWiki

Overview

  • Inherits from TSelector: Can be used with TTree::Process(), PROOF.

Improvements over TSelectors

  • You can access the data in the C-style branches from C++ interface classes. After initial setup, using these classes incurs almost no overhead. With these classes, data can be easily passed into external functions, unlike the hard-coded-in-your-header-file TSelector data members. This enables the creation and usage of library code that can be shared amongst collaborators, including analysis actions.
  • The TTree knows what your reaction was, so the generated DSelector code is tailored to it: DParticleCombo + example code.
  • Array sizes are no longer hard-coded (they are adaptively expanded). This means that you can generate your DSelector with one TTree and keep using it regardless of how much more data you get. With a basic TSelector, you have to regenerate it each time you get more data, in case the new data contains an event that needs a larger array size.
  • You can use the exact same code whether you're running directly over a tree, or you're running with PROOF. With just a TSelector, you have to change a lot of code to switch between them.
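
The adaptive array sizing mentioned in the list above can be pictured with a small sketch. This is not the DSelector implementation: the class GrowableBranchBuffer and its methods are hypothetical names, and the growth policy (roughly doubling) is an assumption chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a branch buffer whose capacity grows on demand.
// Illustration only; the real DSelector classes are not shown here.
class GrowableBranchBuffer {
public:
    explicit GrowableBranchBuffer(std::size_t initialSize) : fData(initialSize, 0.0f) {}

    // Store a value at the given index, enlarging the buffer first if the index does not fit.
    void Fill(std::size_t index, float value) {
        if (index >= fData.size())
            fData.resize(2 * index + 1, 0.0f); // assumed policy: grow geometrically
        fData[index] = value;
    }

    float Get(std::size_t index) const { return fData.at(index); }
    std::size_t Capacity() const { return fData.size(); }

private:
    std::vector<float> fData;
};
```

Because the buffer grows when a larger index arrives, code generated against a small sample keeps working when a later event needs more entries.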

Setting up the software & environment

How you set up your environment to use GlueX software, including this package, depends on where you are working and on what you want to do.

If you work on the JLAB ifarm system

A good starting point for setting up your computing environment is the page: Setting up the GlueX Environment

If you work on a University computer system

Contact your local GlueX software expert and ask how to do this at your institution.

If you want to use your own independent DSelector software install

This section is for you if you already have a GlueX environment but you want your own explicit copy of the DSelector code that you can use and modify at your own risk.

1) Go to the directory where you want the source code to go. Check out the software here:

git clone https://github.com/JeffersonLab/gluex_root_analysis

2) Set the following environment variable to the path of the checked-out gluex_root_analysis directory:

$ROOT_ANALYSIS_HOME

3) After sourcing your standard GlueX environment file, source the environment file appropriate for your shell:

source $ROOT_ANALYSIS_HOME/env_analysis.csh
OR
source $ROOT_ANALYSIS_HOME/env_analysis.sh

4) Build and install the software

cd $ROOT_ANALYSIS_HOME
./make_all.sh

5) Optional, recommended: Add the following to your ~/.rootrc file (create it if it does not exist) to set up gStyle to make nice-looking plots:

Rint.Logon: $ROOT_ANALYSIS_HOME/scripts/rootlogon.C

Creating a DSelector

1) Run the MakeDSelector program to make a DSelector for your TTree. Run it with no arguments for usage instructions.

MakeDSelector

2) When run with no arguments, as above, the program prints the input it requires. It takes three string arguments:

  1. The first is the path to the ROOT file that contains the TTree you want to analyze.
  2. The second is the exact name of the TTree. To find it, open the file with root -l RootFileToBeAnalyzed.root and type TBrowser a at the prompt. The browser window lists the tree name among the file contents on the right-hand side.
  3. The third is the label you choose for your DSelector. The program creates two files named after this label, DSelector_LABEL.h and DSelector_LABEL.C, which contain the framework of your DSelector analysis code.

3) Let's assume that you have generated a ROOT TTree file called "tree_omega.root" and that this file contains a tree named "omega_skim_Tree". Then, the command that will generate a DSelector for this file and tree is:

MakeDSelector tree_omega.root omega_skim_Tree my_selector

The last argument is the name you want to give to your DSelector.
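
The naming rule is worth spelling out, because the generated file name (not the bare label) is what you later pass to Process(). The helper below is a hypothetical illustration of the DSelector_LABEL.h / DSelector_LABEL.C convention described above; it is not part of MakeDSelector.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper: return the header and source file names that,
// per the description above, are derived from the label.
std::pair<std::string, std::string> SelectorFileNames(const std::string& label) {
    const std::string base = "DSelector_" + label;
    return {base + ".h", base + ".C"};
}
```

For the label my_selector this gives DSelector_my_selector.h and DSelector_my_selector.C, which matches the file name used in the Process() calls further down.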

4) When customizing your DSelector be very careful to read ALL of the pre-generated comments in the code. They are there for a reason.

5) Write code similar to the pre-generated code, and everything will work fine.

Using a DSelector

  • Example of analyzing a single root file:
root -l -b my_tree_file.root
.x $ROOT_ANALYSIS_HOME/scripts/Load_DSelector.C
my_tree_name->Process("DSelector_my_selector.C+");
.q
  • If the name of your tree starts with a digit (say 2pi0eta_Tree), it is not a valid C++ identifier, so fetch the tree by name instead:
root -l -b my_tree_file.root
.x $ROOT_ANALYSIS_HOME/scripts/Load_DSelector.C
TTree* locTree = (TTree*)gDirectory->Get("2pi0eta_Tree");
locTree->Process("DSelector_my_selector.C+");
.q
  • Use a TChain to run over many files with a macro such as the one below, saved as do_myanalysis.C and run with root -l do_myanalysis.C (the ++ suffix forces recompilation of the selector):
#include <dirent.h>
#include <cstdio>
#include <string>
#include <vector>
#include <TChain.h>
#include <TROOT.h>
 
// To be replaced for your application:
// MYTREENAME: name of the tree inside the ROOT files
// LOCATIONOFROOTFILES: directory that contains the ROOT files
// STRINGSTUB: substring that the wanted file names must contain
// my_selector: label of your DSelector
 
void do_myanalysis() {
  std::string treename("MYTREENAME");
  TChain chain(treename.c_str());
  std::string loc("LOCATIONOFROOTFILES/");
  std::vector<std::string> RootFilesToDo;
  DIR *dir;
  struct dirent *ent;
  std::string FileStub("STRINGSTUB");
  if ((dir = opendir(loc.c_str())) != NULL) {
    /* collect all directory entries whose name contains FileStub */
    while ((ent = readdir(dir)) != NULL) {
      std::string d = ent->d_name;
      if (d.find(FileStub) != std::string::npos) {
        RootFilesToDo.push_back(d);
      }
    }
    closedir(dir);
  } else {
    /* could not open directory */
    perror(loc.c_str());
    return; // the function is void, so there is no status code to return
  }
  for (unsigned int k = 0; k < RootFilesToDo.size(); k++) {
    std::string f = loc + RootFilesToDo[k];
    chain.Add(f.c_str());
  }
 
  gROOT->ProcessLine(".x $ROOT_ANALYSIS_HOME/scripts/Load_DSelector.C");
  //chain.Process("DSelector_my_selector.C++"); // this will analyze the whole chain of trees
  chain.Process("DSelector_my_selector.C++", "", 100000); // this will analyze the first 100000 events
}
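
The file-selection step of the macro can also be factored into a function that is easy to check on its own. The sketch below is a hypothetical variant, not part of the GlueX scripts: SelectRootFiles is an invented name, and requiring a ".root" suffix and sorting the result are additions beyond what the macro above does (readdir returns entries in no particular order).

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: keep the directory entries whose name contains `stub`
// and ends in ".root", prepend the directory, and sort the result.
std::vector<std::string> SelectRootFiles(const std::string& dir,
                                         const std::vector<std::string>& entries,
                                         const std::string& stub) {
    const std::string suffix = ".root";
    std::vector<std::string> selected;
    for (const std::string& name : entries) {
        const bool hasStub = name.find(stub) != std::string::npos;
        const bool isRoot = name.size() >= suffix.size() &&
            name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
        if (hasStub && isRoot)
            selected.push_back(dir + name);
    }
    std::sort(selected.begin(), selected.end());
    return selected;
}
```

Each returned path could then be passed to chain.Add() exactly as in the loop above.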

Using DSelectors with PROOF-Lite

Overview

  • PROOF is ROOT's solution for running over a TChain multi-threaded.
  • PROOF is for distributed computing, whereas PROOF-Lite is for running multi-threaded on your local machine.
    • Note that each thread has its own copy of the ROOT histograms, and the copies are merged together at the end. So watch your memory usage, because it scales with the number of threads.
    • However, since the objects in each thread are totally isolated from one another, you do not need to worry about locking.
  • The DSelector is automatically set up to work either with or without PROOF (or PROOF-Lite).
  • Note: "cout" statements are written to log files, not to screen. If you want to print to screen, call:
gProofServ->SendAsynMessage("My message"); //must #include "TProofServ.h"
  • Note: PROOF log files can be viewed from the PROOF GUI that gets launched, or you can find them in your ~/.proof/ folder.
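
The fill-privately-then-merge pattern described in this overview can be sketched without ROOT. The code below is an illustration only: Histogram is a plain vector standing in for a ROOT histogram, the function names are hypothetical, and the workers would simply be called one after another here rather than in parallel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Histogram = std::vector<long>; // bin contents only; stand-in for a ROOT histogram

// Each worker fills its own private histogram from its share of the events,
// so no locking is needed while filling.
Histogram FillWorkerHistogram(const std::vector<int>& binIndices, std::size_t numBins) {
    Histogram hist(numBins, 0);
    for (int bin : binIndices)
        if (bin >= 0 && static_cast<std::size_t>(bin) < numBins)
            ++hist[bin];
    return hist;
}

// After all workers finish, the private copies are summed bin by bin.
// Every input histogram is expected to have numBins bins.
Histogram MergeHistograms(const std::vector<Histogram>& perWorker, std::size_t numBins) {
    Histogram merged(numBins, 0);
    for (const Histogram& hist : perWorker)
        for (std::size_t i = 0; i < numBins; ++i)
            merged[i] += hist[i];
    return merged;
}

// Bins held in memory while the workers run: one full copy per worker.
std::size_t TotalBinsInMemory(std::size_t numBins, std::size_t numWorkers) {
    return numBins * numWorkers;
}
```

The last function makes the memory warning concrete: a histogram with a given number of bins costs that many bins per worker until the merge.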

Instructions

1) It is highly recommended that you add the below line to your ~/.rootrc file (create it if it doesn't exist). This is the maximum number of previous sessions (thread files) that PROOF-Lite will keep on your disk. Once the max is reached, it will delete the oldest ones. At the moment the default is 10, so if you execute 2 simultaneous instances of PROOF-Lite with 8 threads each, it will break the first one, unless you increase this value.

Proof.MaxOldSessions 100
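
The reasoning in step 1 is simple arithmetic. The sketch below assumes, as the text suggests, that every thread of every running instance occupies one session slot; the function name is hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: true if the running instances together need more
// session slots than the configured maximum (assumes one slot per thread).
bool ExceedsMaxOldSessions(unsigned int numInstances, unsigned int threadsPerInstance,
                           unsigned int maxOldSessions) {
    return numInstances * threadsPerInstance > maxOldSessions;
}
```

With 2 instances of 8 threads each, 16 slots are needed, which exceeds the default of 10 but fits comfortably within 100.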

2) Each time the DSelector library is built, (re-)build the PROOF DSelector package (the "make_all.sh" script does this automatically):

cd $ROOT_ANALYSIS_HOME/programs/MakePROOFPackage/
./build.sh

3) Launch ROOT, and load the DSelector library:

  • Note: PROOF-Lite will launch a GUI, so run ROOT with -b if you don't want it. It's useful for viewing log files though.
root -l
.x $ROOT_ANALYSIS_HOME/scripts/Load_DSelector.C

4) By default, PROOF-Lite will write the log files into $HOME/.proof which may fill up your quota. To avoid this, the so-called sandbox can be set to the local (or any other) directory with this command:

gEnv->SetValue("ProofLite.Sandbox", "$PWD/.proof/")

N.B.: DPROOFLiteManager::Set_SandBox("./") will not work, since the session has already been started.

5) Run PROOF-Lite:

DPROOFLiteManager::Process_Tree("my_tree_file.root", "my_tree_name", "my_selector.C+", my_num_threads); //my_num_threads = unsigned int

OR: Build your own TChain* (my_tchain) of files and run over it instead:

DPROOFLiteManager::Process_Chain(my_tchain, "my_selector.C+", my_num_threads); //my_num_threads = unsigned int

Note: Depending on what you are doing, you may need to use a different Process_Tree call. See DPROOFLiteManager.h for other calls.

6) Example of modifying the do_myanalysis() ROOT script above for DPROOFLiteManager:

#include "TProof.h"
#include "TProofDebug.h"
R__LOAD_LIBRARY(libDSelector) // includes and library loading belong at file scope, not inside the function
 
...
 
int NumThreads = 6; // your choice of number of threads to use
 
void do_myanalysis() {
  ....
 
  gROOT->ProcessLine(".x $ROOT_ANALYSIS_HOME/scripts/Load_DSelector.C");
  DPROOFLiteManager *dproof = new DPROOFLiteManager();
  dproof->Process_Chain(&chain, "DSelector_my_selector.C++", NumThreads, "outfilehist.root", "outfiletree.root");
}

Reading custom branches

  • If the branch is associated with a particle you can call:
DKinematicData::Get_Fundamental<Float_t>("MyVariable");
DKinematicData::Get_TObject<TVector3>("MyVector");

https://github.com/JeffersonLab/gluex_root_analysis/blob/master/libraries/DSelector/DKinematicData.h

  • Ditto for DParticleCombo:
DParticleCombo::Get_Fundamental<Float_t>("MyVariable");
DParticleCombo::Get_TObject<TVector3>("MyVector");

https://github.com/JeffersonLab/gluex_root_analysis/blob/master/libraries/DSelector/DParticleCombo.h

  • Otherwise, or in general, in your selector you can do:
dTreeInterface->Get_Fundamental<Float_t>("MyVariable");
dTreeInterface->Get_Fundamental<Float_t>("MyVariable", my_array_index);
dTreeInterface->Get_TObject<TVector3>("MyVector");
dTreeInterface->Get_TObject<TVector3>("MyVector", my_array_index);

https://github.com/JeffersonLab/gluex_root_analysis/blob/master/libraries/DSelector/DTreeInterface.h

Making Cuts and Saving the Survivors Into a New TTree

Cloning the Tree

  • To clone the input tree into an output ROOT file (minus whatever your cuts remove), first set the output file name inside of your selector's Init function:
dOutputTreeFileName = "my_tree_file.root"; //"" for none
  • This will automatically create the tree in this file. Then, to save the event (if it passes your cuts), inside of your selector's Process function:
FillOutputTree();
  • If you want to flag some combos as cut, but not the entire event, then for each combo that fails a cut call:
dComboWrapper->Set_IsComboCut(true);
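
The interplay between per-combo flags and the decision to save the event can be sketched as follows. This is a stand-alone illustration with hypothetical names (Combo, confidenceLevel, ApplyCutAndCheckEvent), not DSelector code. It assumes that an event in which every combo has been flagged should not be written; whether the output-tree call itself skips such events is not stated above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for one particle combination and its cut flag.
struct Combo {
    double confidenceLevel = 0.0; // illustrative quantity to cut on
    bool isCut = false;
};

// Flag every combo below the threshold and report whether any combo survives.
bool ApplyCutAndCheckEvent(std::vector<Combo>& combos, double minConfidenceLevel) {
    bool anySurvivor = false;
    for (Combo& combo : combos) {
        if (combo.confidenceLevel < minConfidenceLevel)
            combo.isCut = true; // analogous to flagging the combo as cut
        else
            anySurvivor = true;
    }
    return anySurvivor; // assumed rule: save the event only when this is true
}
```

In a selector, the same loop would run over the combos in Process, and the event would be saved only if the function's result is true.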

Adding Custom Branches

  • To add custom branches, in the selector's Init function, AFTER the call to:
if(locInitializedPriorFlag)
	return; //have already created histograms, etc. below: exit
  • Call (e.g.):
dTreeInterface->Create_Branch_Fundamental<Int_t>("My_Int_Branch");
dTreeInterface->Create_Branch_NoSplitTObject<TLorentzVector>("My_P4_Branch");
dTreeInterface->Create_Branch_FundamentalArray<Float_t>("My_Float_Array", "Name_Of_Branch_Containing_Array_Size", 10); //10: init array size, will expand as needed
dTreeInterface->Create_Branch_ClonesArray<TLorentzVector>("My_P4_Array_Branch", 10); //10: init array size, will expand as needed
  • Then, to fill the branches, inside of your selector's Process function call (e.g.):
dTreeInterface->Fill_Fundamental<Int_t>("My_Int_Branch", 5); //5: e.g. value
dTreeInterface->Fill_Fundamental<Float_t>("My_Float_Array", 4.7, 2); //4.7: e.g. value //2: e.g. array index
dTreeInterface->Fill_TObject<TLorentzVector>("My_P4_Branch", locP4); //locP4: e.g. value
dTreeInterface->Fill_TObject<TLorentzVector>("My_P4_Array_Branch", locP4, 0); //locP4: e.g. value //0: e.g. array index