HOWTO Use TMVA for a Boosted Decision Tree Analysis

From GlueXWiki

Overview

This page summarizes the information needed to get started with the TMVA package from ROOT for Boosted Decision Tree (BDT) and other types of Multivariate Analyses (MVAs). The TMVA webpage, and particularly the TMVA User Guide, document the different types of MVAs provided by the TMVA package in considerable detail, so I won't repeat that material here.

Getting Started

TMVA is a standard package in ROOT (since v5.11), so for most people there should be no source code to check out or compile.

A generic example exists in the TMVA tutorial. As a simple test that TMVA is properly installed in your ROOT version, run the following macro to train a few example MVAs:

root -l $ROOTSYS/tmva/test/TMVAClassification.C

This should result in the creation of a ROOT file named TMVA.root and a weights/ directory containing several .xml and .C files with filenames starting with TMVAClassification. The TMVA.root file contains many useful diagnostic histograms for understanding the performance of the MVAs that were trained for the classification problem you're studying. There is a TMVA GUI which lets you display those diagnostic histograms very easily:

root -l $ROOTSYS/tmva/test/TMVAGui.C
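
Before opening the GUI, it can help to confirm that the training step actually left behind the files described above. The following is a minimal sketch of such a check; `check_tmva_outputs` is a hypothetical helper name (it is not part of ROOT or TMVA), and it assumes the training macro was run in the directory passed as the first argument (default: the current directory).

```shell
# Hypothetical helper (not part of ROOT or TMVA): verify that a training run
# produced TMVA.root and at least one TMVAClassification*.xml weight file.
check_tmva_outputs() {
    dir="${1:-.}"      # directory in which the training macro was run
    missing=0

    # The diagnostic histograms live in TMVA.root
    if [ ! -f "$dir/TMVA.root" ]; then
        echo "missing: $dir/TMVA.root"
        missing=1
    fi

    # Count the .xml weight files whose names start with TMVAClassification
    nxml=0
    for f in "$dir"/weights/TMVAClassification*.xml; do
        if [ -f "$f" ]; then
            nxml=$((nxml + 1))
        fi
    done

    if [ "$nxml" -eq 0 ]; then
        echo "missing: $dir/weights/TMVAClassification*.xml"
        missing=1
    else
        echo "found $nxml weight file(s) in $dir/weights"
    fi

    # Exit status 0 only if everything expected was found
    return "$missing"
}
```

Calling `check_tmva_outputs` with no argument checks the current directory. A non-zero exit status means at least one of the expected outputs was not found, in which case the training log is the place to look.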

The files in the weights/ directory contain the weights for each of the MVA types built in the training macro above, such as the BDT classifier. These weights are then used by a second macro to evaluate each of the classifiers for an independent sample of events:

root -l $ROOTSYS/tmva/test/TMVAClassificationApplication.C
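
Because the application macro depends on the weights/ directory written by the training macro, the two steps have to be run in that order and from the same directory. The sketch below chains them non-interactively in a scratch directory. `run_tmva_tutorial` and the default directory name `tmva_test` are hypothetical choices made for this illustration, and the macro location is assumed to be the `$ROOTSYS/tmva/test` path used on this page. The `-b` (batch, no graphics) and `-q` (quit after the macro finishes) options are standard `root` command-line flags.

```shell
# Hypothetical wrapper (not part of ROOT): run the tutorial training and
# application macros back to back in a scratch directory.
run_tmva_tutorial() {
    workdir="${1:-tmva_test}"      # assumed scratch directory name
    macros="$ROOTSYS/tmva/test"    # macro location used on this page

    mkdir -p "$workdir" || return 1

    # Use a subshell so the caller's working directory is left unchanged
    (
        cd "$workdir" || exit 1

        # Step 1: training; writes TMVA.root and weights/ into $workdir
        root -l -b -q "$macros/TMVAClassification.C" || exit 1

        # Step 2: application; reads the weights/ directory written by step 1
        root -l -b -q "$macros/TMVAClassificationApplication.C" || exit 1
    )
}
```

For example, `run_tmva_tutorial my_first_bdt` would leave all outputs under `my_first_bdt/`. It is still worth inspecting the output files afterwards rather than relying on the exit status alone. The GUI macro is left out on purpose, since it is meant to be used interactively.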

GlueX Specific Example

This framework was used in the July 2013 GlueX Analysis Workshop to select n3pi events using the Analysis TTree framework. More details can be found in the presentations and exercises in Sessions 6a-7b of the workshop.