Machine Learning with TMVA: A ROOT-based Tool for Multivariate Data Analysis
DESY Computing Seminar, Hamburg, 14.1.2008
The TMVA developer team: Andreas Höcker, Peter Speckmeyer, Jörg Stelzer, Helge Voss
General Event Classification Problem
Example: k=2, n=3
[Figure: event classes H1, H2, H3 in the (x1, x2) plane]
General Event Classification Problem
Simple example → I can do it by hand.
[Figure: three candidate decision boundaries between H1, H2, H3 in the (x1, x2) plane: Rectangular Cuts? Linear Boundaries? Non-linear Boundaries?]
General Event Classification Problem: Machine Learning
Classification Problems in HEP
Classifiers in TMVA
Rectangular Cut Optimization
Projective Likelihood Estimator (PDE)
[Figure: reference PDFs]
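
The projective likelihood idea can be sketched in a few lines: each class likelihood is the product of one-dimensional reference PDFs, and the classifier output is the ratio y = L_S / (L_S + L_B). The sketch below is an illustration only, not TMVA code. The Gaussian reference PDFs and the names GaussPdf and likelihoodRatio are assumptions made for this example; in practice the reference PDFs are estimated from the training sample.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 1D reference PDF: a Gaussian with given mean and width.
// (Assumption for illustration; real reference PDFs come from training data.)
struct GaussPdf {
    double mean;
    double sigma;
    double operator()(double x) const {
        const double kPi = 3.14159265358979323846;
        const double z = (x - mean) / sigma;
        return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * kPi));
    }
};

// Likelihood ratio y = L_S / (L_S + L_B). Each likelihood is the product of
// the one-dimensional PDFs evaluated at the event's variable values, so
// correlations between the variables are ignored.
double likelihoodRatio(const std::vector<double>& x,
                       const std::vector<GaussPdf>& sigPdfs,
                       const std::vector<GaussPdf>& bkgPdfs) {
    assert(x.size() == sigPdfs.size() && x.size() == bkgPdfs.size());
    double lS = 1.0;
    double lB = 1.0;
    for (std::size_t k = 0; k < x.size(); ++k) {
        lS *= sigPdfs[k](x[k]);
        lB *= bkgPdfs[k](x[k]);
    }
    return lS / (lS + lB);
}
```

Values of y near 1 are signal-like and values near 0 are background-like. The sketch does not guard against both likelihoods being zero.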
Estimating PDF Kernels
Difficult to automate for arbitrary PDFs
Easy to automate, can create artefacts/suppress information
Automatic, unbiased, but suboptimal
[Figure: original distribution is Gaussian]
Multidimensional PDE
[Figure: test event among H0 and H1 events in the (x1, x2) plane]
Carli-Koblitz, NIM A501, 576 (2003)
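
A multidimensional PDE with range search can be illustrated by counting training events of each class in a volume around the test event. The following is a minimal brute-force sketch under stated assumptions: a fixed box of half-width halfWidth in every variable, equal-size signal and background samples, and the hypothetical names Event and pdeRangeSearch. A practical implementation would replace the linear scan with a tree-based search.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical training event: variable values plus the true class.
struct Event {
    std::vector<double> x;
    bool isSignal;
};

// Estimate the signal probability at the test point as n_S / (n_S + n_B),
// where n_S and n_B count the training events inside a box centred on it.
double pdeRangeSearch(const std::vector<Event>& training,
                      const std::vector<double>& test,
                      double halfWidth) {
    double nS = 0.0;
    double nB = 0.0;
    for (const Event& ev : training) {
        assert(ev.x.size() == test.size());
        bool inside = true;
        for (std::size_t k = 0; k < test.size(); ++k) {
            if (std::fabs(ev.x[k] - test[k]) > halfWidth) {
                inside = false;
                break;
            }
        }
        if (!inside) continue;
        if (ev.isSignal) nS += 1.0; else nB += 1.0;
    }
    // Convention chosen for this sketch: an empty box carries no information.
    if (nS + nB == 0.0) return 0.5;
    return nS / (nS + nB);
}
```

The box size controls the usual trade-off: a small box gives a local but statistically noisy estimate, a large box a smooth but biased one.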
Fisher’s Linear Discriminant Analysis
[Equations not preserved: projection; Fisher coefficients]
W: sum of S and B covariance matrices
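
As an illustration of the Fisher coefficients, the sketch below handles the two-variable case: it builds W as the sum of the signal and background covariance matrices and solves F = W^{-1} (mean_S - mean_B) with the closed-form 2x2 inverse. It is a sketch under assumptions, not TMVA code: the names Stats, classStats, fisherCoefficients and fisherProjection are hypothetical, and any overall normalisation factor or constant offset is dropped because it does not change the ordering of events.

```cpp
#include <array>
#include <cassert>
#include <vector>

using Vec2 = std::array<double, 2>;

// Mean and covariance matrix elements of one class (two variables).
struct Stats {
    Vec2 mean;
    double cxx;
    double cxy;
    double cyy;
};

Stats classStats(const std::vector<Vec2>& events) {
    assert(!events.empty());
    Stats s{{0.0, 0.0}, 0.0, 0.0, 0.0};
    const double n = static_cast<double>(events.size());
    for (const Vec2& e : events) {
        s.mean[0] += e[0] / n;
        s.mean[1] += e[1] / n;
    }
    for (const Vec2& e : events) {
        const double dx = e[0] - s.mean[0];
        const double dy = e[1] - s.mean[1];
        s.cxx += dx * dx / n;
        s.cxy += dx * dy / n;
        s.cyy += dy * dy / n;
    }
    return s;
}

// Fisher coefficients F = W^{-1} (mean_S - mean_B), with W = C_S + C_B.
Vec2 fisherCoefficients(const std::vector<Vec2>& sig,
                        const std::vector<Vec2>& bkg) {
    const Stats s = classStats(sig);
    const Stats b = classStats(bkg);
    const double w00 = s.cxx + b.cxx;
    const double w01 = s.cxy + b.cxy;
    const double w11 = s.cyy + b.cyy;
    const double det = w00 * w11 - w01 * w01;
    assert(det != 0.0);  // W must be invertible
    const double d0 = s.mean[0] - b.mean[0];
    const double d1 = s.mean[1] - b.mean[1];
    Vec2 f{};
    f[0] = (w11 * d0 - w01 * d1) / det;
    f[1] = (-w01 * d0 + w00 * d1) / det;
    return f;
}

// Projection of an event onto the Fisher axis.
double fisherProjection(const Vec2& f, const Vec2& x) {
    return f[0] * x[0] + f[1] * x[1];
}
```

Events with a larger projection are more signal-like; a single cut on the projected value then defines a linear decision boundary in the original variables.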
Artificial Neural Network (ANN)
[Figure: feed-forward network with 1 input layer (N_var discriminating input variables), k hidden layers (M_1 ... M_k nodes), and 1 output layer (1 output variable); inset: typical activation function A]
Weierstrass theorem: an MLP can approximate every continuous function to arbitrary precision with just one hidden layer and an infinite number of nodes.
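
The layer structure described on this slide can be made concrete with a forward pass through a small multilayer perceptron. The sketch below is an illustration under assumptions rather than TMVA's MLP: the names Layer, sigmoid, forwardLayer and mlpResponse are hypothetical, the sigmoid is taken as the activation function A, and the single output node is left linear.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One fully connected layer: weights[j][i] connects input i to node j.
struct Layer {
    std::vector<std::vector<double>> weights;
    std::vector<double> bias;
};

// Sigmoid activation, one common choice for the activation function A.
double sigmoid(double v) { return 1.0 / (1.0 + std::exp(-v)); }

// Compute the node values of one layer from the values of the previous one.
std::vector<double> forwardLayer(const Layer& layer,
                                 const std::vector<double>& in,
                                 bool activate) {
    assert(layer.weights.size() == layer.bias.size());
    std::vector<double> out(layer.bias.size());
    for (std::size_t j = 0; j < out.size(); ++j) {
        assert(layer.weights[j].size() == in.size());
        double v = layer.bias[j];
        for (std::size_t i = 0; i < in.size(); ++i) {
            v += layer.weights[j][i] * in[i];
        }
        out[j] = activate ? sigmoid(v) : v;
    }
    return out;
}

// Propagate the input variables through all layers. Hidden layers apply the
// activation function; the last layer is linear and must have one node.
double mlpResponse(const std::vector<Layer>& layers,
                   const std::vector<double>& x) {
    assert(!layers.empty());
    std::vector<double> values = x;
    for (std::size_t l = 0; l < layers.size(); ++l) {
        const bool isOutput = (l + 1 == layers.size());
        values = forwardLayer(layers[l], values, !isOutput);
    }
    assert(values.size() == 1);
    return values[0];
}
```

Training consists of adjusting the weights and biases so that the response approaches 1 for signal and 0 for background events; that step is not shown here.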
Boosted Decision Trees (BDT)
[Figure: tree node with (S, B) events split into daughter nodes (S1, B1) and (S2, B2)]
BDTs require only little tuning to achieve good performance.
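
The node split pictured on this slide, (S, B) into (S1, B1) and (S2, B2), is typically chosen by maximising a separation gain. The sketch below uses the Gini index p(1 - p) with purity p = S/(S+B) as the separation criterion, which is an assumption made for this example, and scans the cut value for a single variable. The names Sample, gini, separationGain and bestCut are hypothetical. Boosting, which reweights misclassified events and combines many such trees, is not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Gini index p(1-p) of a node with signal weight s and background weight b.
double gini(double s, double b) {
    const double n = s + b;
    if (n <= 0.0) return 0.0;
    const double p = s / n;
    return p * (1.0 - p);
}

// Gain of splitting a parent node into daughters (s1, b1) and (s2, b2):
// weighted impurity of the parent minus that of the two daughters.
double separationGain(double s1, double b1, double s2, double b2) {
    const double n1 = s1 + b1;
    const double n2 = s2 + b2;
    return (n1 + n2) * gini(s1 + s2, b1 + b2)
           - n1 * gini(s1, b1) - n2 * gini(s2, b2);
}

// Hypothetical weighted training sample for one input variable.
struct Sample {
    double value;
    bool isSignal;
    double weight;
};

// Scan all thresholds between neighbouring values and return the cut with
// the largest separation gain.
double bestCut(std::vector<Sample> samples) {
    assert(samples.size() >= 2);
    std::sort(samples.begin(), samples.end(),
              [](const Sample& a, const Sample& b) { return a.value < b.value; });
    double sTot = 0.0, bTot = 0.0;
    for (const Sample& smp : samples) {
        (smp.isSignal ? sTot : bTot) += smp.weight;
    }
    double sLeft = 0.0, bLeft = 0.0;
    double bestGain = -1.0;
    double best = samples.front().value;
    for (std::size_t i = 0; i + 1 < samples.size(); ++i) {
        (samples[i].isSignal ? sLeft : bLeft) += samples[i].weight;
        if (samples[i].value == samples[i + 1].value) continue;  // no cut between ties
        const double gain =
            separationGain(sLeft, bLeft, sTot - sLeft, bTot - bLeft);
        if (gain > bestGain) {
            bestGain = gain;
            best = 0.5 * (samples[i].value + samples[i + 1].value);
        }
    }
    return best;
}
```

A full tree repeats this scan over all input variables at every node and stops when a node becomes too small or too pure.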
Predictive Learning via Rule Ensembles (RuleFit)
Friedman-Popescu, Tech. Rep., Stat. Dept., Stanford U., 2003
RuleFit classifier = sum of rules + linear Fisher term
rules (cut sequence → r_m = 1 if all cuts satisfied, = 0 otherwise)
normalised discriminating event variables
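
The structure named on this slide, a sum of rules plus a linear term, can be written as y(x) = a_0 + sum_m a_m r_m(x) + sum_k b_k x_k, where each rule r_m is 1 if all of its cuts are satisfied and 0 otherwise. The sketch below only evaluates such an ensemble. The names Cut, Rule, ruleValue and ruleFitResponse are hypothetical, and the coefficients are taken as given; fitting them is the actual RuleFit training step and is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One cut: variable index and the allowed interval [min, max].
struct Cut {
    std::size_t var;
    double min;
    double max;
};

// A rule is a sequence of cuts with a fitted coefficient a_m.
struct Rule {
    std::vector<Cut> cuts;
    double coefficient;
};

// r_m(x) = 1 if all cuts of the rule are satisfied, 0 otherwise.
int ruleValue(const Rule& rule, const std::vector<double>& x) {
    for (const Cut& c : rule.cuts) {
        assert(c.var < x.size());
        if (x[c.var] < c.min || x[c.var] > c.max) return 0;
    }
    return 1;
}

// y(x) = a_0 + sum_m a_m r_m(x) + sum_k b_k x_k,
// with x assumed to hold the normalised event variables.
double ruleFitResponse(double offset,
                       const std::vector<Rule>& rules,
                       const std::vector<double>& linearCoeffs,
                       const std::vector<double>& x) {
    assert(linearCoeffs.size() == x.size());
    double y = offset;
    for (const Rule& r : rules) {
        y += r.coefficient * ruleValue(r, x);
    }
    for (std::size_t k = 0; k < x.size(); ++k) {
        y += linearCoeffs[k] * x[k];
    }
    return y;
}
```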
Support Vector Machine
[Figure: separable data in the (x1, x2) plane with optimal hyperplane, margin, and support vectors; non-separable data in (x1, x2) mapped to a space with axes x1, x2, x3]
Data Preprocessing: Decorrelation
[Figure panels: original; SQRT decorr.; PCA decorr.]
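
The square-root decorrelation named in the figure can be sketched for two variables: compute the symmetric square root S of the covariance matrix C (so that S S = C) and transform each event as x' = S^{-1} x, which gives the transformed variables a unit covariance matrix. The closed-form 2x2 square root used below is valid for symmetric positive-definite matrices. The names Mat2, Vec2, sqrtSymmetric2x2 and decorrelate are hypothetical; this is an illustration, not TMVA code.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec2 = std::array<double, 2>;
using Mat2 = std::array<std::array<double, 2>, 2>;

// Symmetric square root of a symmetric positive-definite 2x2 matrix C:
// sqrt(C) = (C + s*I) / t, with s = sqrt(det C) and t = sqrt(tr C + 2 s).
Mat2 sqrtSymmetric2x2(const Mat2& c) {
    const double det = c[0][0] * c[1][1] - c[0][1] * c[1][0];
    assert(det > 0.0 && c[0][0] > 0.0);
    const double s = std::sqrt(det);
    const double t = std::sqrt(c[0][0] + c[1][1] + 2.0 * s);
    Mat2 r{};
    r[0][0] = (c[0][0] + s) / t;
    r[0][1] = c[0][1] / t;
    r[1][0] = c[1][0] / t;
    r[1][1] = (c[1][1] + s) / t;
    return r;
}

// Decorrelated variables x' = sqrt(C)^{-1} x.
Vec2 decorrelate(const Mat2& c, const Vec2& x) {
    const Mat2 r = sqrtSymmetric2x2(c);
    const double det = r[0][0] * r[1][1] - r[0][1] * r[1][0];
    Vec2 out{};
    out[0] = (r[1][1] * x[0] - r[0][1] * x[1]) / det;
    out[1] = (-r[1][0] * x[0] + r[0][0] * x[1]) / det;
    return out;
}
```

This removes linear correlations only; it does not help with non-linear correlations or with non-Gaussian distributions.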
Is There a Best Classifier?
No Single Best
[Table: classifiers (Cuts, Likelihood, PDERS/k-NN, H-Matrix, Fisher, MLP, BDT, RuleFit, SVM) rated against the criteria Performance (no/linear correlations; nonlinear correlations), Speed (training; response), Robustness (overtraining; weak input variables), Curse of dimensionality, and Clarity; the rating symbols are not preserved]
What is TMVA
Using TMVA
Technical Aspects
Users Guide at http://sf.tmva.net: 97 pp., classifier descriptions, code examples; arXiv physics/0703039
Training with TMVA
Template TMVAnalysis.C (also .py) available at $TMVA/macros/ and $ROOTSYS/tmva/test/
[Figure: TMVA GUI]
Evaluation Output
Evaluation results ranked by best signal efficiency and purity (area)
------------------------------------------------------------------------------
MVA           Signal efficiency at bkg eff. (error):   |  Sepa-    Signifi-
Methods:      @B=0.01    @B=0.10    @B=0.30    Area    |  ration:  cance:
------------------------------------------------------------------------------
Fisher      : 0.268(03)  0.653(03)  0.873(02)  0.882   |  0.444    1.189
MLP         : 0.266(03)  0.656(03)  0.873(02)  0.882   |  0.444    1.260
LikelihoodD : 0.259(03)  0.649(03)  0.871(02)  0.880   |  0.441    1.251
PDERS       : 0.223(03)  0.628(03)  0.861(02)  0.870   |  0.417    1.192
RuleFit     : 0.196(03)  0.607(03)  0.845(02)  0.859   |  0.390    1.092
HMatrix     : 0.058(01)  0.622(03)  0.868(02)  0.855   |  0.410    1.093
BDT         : 0.154(02)  0.594(04)  0.838(03)  0.852   |  0.380    1.099
CutsGA      : 0.109(02)  1.000(00)  0.717(03)  0.784   |  0.000    0.000
Likelihood  : 0.086(02)  0.387(03)  0.677(03)  0.757   |  0.199    0.682
------------------------------------------------------------------------------
Testing efficiency compared to training efficiency (overtraining check)
------------------------------------------------------------------------------
MVA           Signal efficiency: from test sample (from training sample)
Methods:      @B=0.01          @B=0.10          @B=0.30
------------------------------------------------------------------------------
Fisher      : 0.268 (0.275)    0.653 (0.658)    0.873 (0.873)
MLP         : 0.266 (0.278)    0.656 (0.658)    0.873 (0.873)
LikelihoodD : 0.259 (0.273)    0.649 (0.657)    0.871 (0.872)
PDERS       : 0.223 (0.389)    0.628 (0.691)    0.861 (0.881)
RuleFit     : 0.196 (0.198)    0.607 (0.616)    0.845 (0.848)
HMatrix     : 0.058 (0.060)    0.622 (0.623)    0.868 (0.868)
BDT         : 0.154 (0.268)    0.594 (0.736)    0.838 (0.911)
CutsGA      : 0.109 (0.123)    1.000 (0.424)    0.717 (0.715)
Likelihood  : 0.086 (0.092)    0.387 (0.379)    0.677 (0.677)
------------------------------------------------------------------------------
[Annotation: better classifiers are ranked higher]
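
The quantity tabulated above, signal efficiency at a fixed background efficiency, can be computed from classifier outputs with a simple counting procedure. The sketch below is one straightforward way to do it and is not claimed to be how TMVA computes the numbers in the table. The function name signalEffAtBkgEff is hypothetical, events are assumed to pass when their output exceeds the cut, and ties in the background outputs are ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Signal efficiency at a given background efficiency: place the cut so that
// the requested fraction of background events passes, then count the
// fraction of signal events above the same cut.
double signalEffAtBkgEff(const std::vector<double>& sigOutputs,
                         std::vector<double> bkgOutputs,
                         double bkgEff) {
    assert(!sigOutputs.empty() && !bkgOutputs.empty());
    assert(bkgEff >= 0.0 && bkgEff <= 1.0);
    // Sort background outputs from most to least signal-like.
    std::sort(bkgOutputs.begin(), bkgOutputs.end(), std::greater<double>());
    // Number of background events allowed above the cut (small epsilon
    // guards against rounding just below an integer).
    const std::size_t nPass =
        static_cast<std::size_t>(bkgEff * bkgOutputs.size() + 1e-9);
    if (nPass >= bkgOutputs.size()) return 1.0;
    const double cut = bkgOutputs[nPass];
    std::size_t nSig = 0;
    for (double y : sigOutputs) {
        if (y > cut) ++nSig;
    }
    return static_cast<double>(nSig) / static_cast<double>(sigOutputs.size());
}
```

Running the same function on the training and on the test sample gives the two numbers compared in the overtraining check: a training efficiency well above the test efficiency, as for PDERS and BDT in the table, indicates overtraining.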
More Evaluation Output
Input Variable Ranking
--- Fisher  : Ranking result (top variable is best ranked)
--- Fisher  : ----------------------------------------------------------------
--- Fisher  : Rank : Variable  : Discr. power
--- Fisher  : ----------------------------------------------------------------
--- Fisher  :    1 : var4      : 2.175e-01
--- Fisher  :    2 : var3      : 1.718e-01
--- Fisher  :    3 : var1      : 9.549e-02
--- Fisher  :    4 : var2      : 2.841e-02
--- Fisher  : ----------------------------------------------------------------
[Annotation: better variables are ranked higher]
Classifier correlation and overlap
--- Factory : Inter-MVA overlap matrix (signal):
--- Factory : ------------------------------
--- Factory :              Likelihood  Fisher
--- Factory : Likelihood:      +1.000  +0.667
--- Factory :     Fisher:      +0.667  +1.000
--- Factory : ------------------------------
Graphical Evaluation
Graphical Evaluation
Visualization Using the GUI
average no. of nodes before/after pruning: 4193 / 968
Choosing a Working Point
Applying the Trained Classifier
Templates TMVApplication.C and ClassApplication.C available at $TMVA/macros/ and $ROOTSYS/tmva/test/
From ClassApplication.C:
    std::vector<std::string> inputVars;
    …
    classifier = new ReadMLP( inputVars );
    for (int i=0; i<nEv; i++) {
       std::vector<double> inputVec = …;
       // inputVec is a vector object, not a pointer, so it is passed directly
       double retval = classifier->GetMvaValue( inputVec );
    }
Extending TMVA
A Word on Treatment of Systematics?
Concluding Remarks
Acknowledgments: The fast development of TMVA would not have been possible without the contribution and feedback from many developers and users, to whom we are indebted. We thank in particular the CERN summer students Matt Jachowski (Stanford) for the implementation of TMVA's new MLP neural network, Yair Mahalalel (Tel Aviv) for a significant improvement of PDERS, and Or Cohen for the development of the general classifier boosting; the Krakow student Andrzej Zemla and his supervisor Marcin Wolter for programming a powerful Support Vector Machine; and Rustem Ospanov for the development of a fast k-NN algorithm. We are grateful to Doug Applegate, Kregg Arms, René Brun and the ROOT team, Tancredi Carli, Zhiyi Liu, Elzbieta Richter-Was, Vincent Tisserand and Alexei Volk for helpful conversations.
Outlook
A Few Toy Examples
Checker Board Example
[Figure annotation: theoretical maximum]
Linear-, Cross-, Circular Correlations
[Figure panels: linear correlations (same for signal and background); linear correlations (opposite for signal and background); circular correlations (same for signal and background)]
Linear-, Cross-, Circular Correlations
[Figure: responses of Likelihood, Likelihood-D, PDERS, Fisher, MLP, and BDT for linear correlations (same for signal and background), cross-linear correlations (opposite for signal and background), and circular correlations (same for signal and background)]
Final Performance
[Figure panels: Linear Example; Cross Example; Circular Example]
Additional Information
Stability with Respect to Irrelevant Variables
[Figure panels: use only two discriminant variables in classifiers; use all discriminant variables in classifiers]
TMVAnalysis.C Script for Training
    void TMVAnalysis( )
    {
       // create Factory
       TFile* outputFile = TFile::Open( "TMVA.root", "RECREATE" );
       TMVA::Factory *factory = new TMVA::Factory( "MVAnalysis", outputFile, "!V" );

       // give training/test trees
       TFile *input = TFile::Open( "tmva_example.root" );
       factory->AddSignalTree    ( (TTree*)input->Get("TreeS"), 1.0 );
       factory->AddBackgroundTree( (TTree*)input->Get("TreeB"), 1.0 );

       // register input variables
       factory->AddVariable( "var1+var2", 'F' );
       factory->AddVariable( "var1-var2", 'F' );
       factory->AddVariable( "var3", 'F' );
       factory->AddVariable( "var4", 'F' );
       factory->PrepareTrainingAndTestTree( "", "NSigTrain=3000:NBkgTrain=3000:SplitMode=Random:!V" );

       // select MVA methods
       factory->BookMethod( TMVA::Types::kLikelihood, "Likelihood",
                            "!V:!TransformOutput:Spline=2:NSmooth=5:NAvEvtPerBin=50" );
       factory->BookMethod( TMVA::Types::kMLP, "MLP",
                            "!V:NCycles=200:HiddenLayers=N+1,N:TestRate=5" );

       // train, test and evaluate
       factory->TrainAllMethods();
       factory->TestAllMethods();
       factory->EvaluateAllMethods();

       outputFile->Close();
       delete factory;
    }
TMVApplication.C Script for Application
    void TMVApplication( )
    {
       // create Reader
       TMVA::Reader *reader = new TMVA::Reader( "!Color" );

       // register the variables
       Float_t var1, var2, var3, var4;
       reader->AddVariable( "var1+var2", &var1 );
       reader->AddVariable( "var1-var2", &var2 );
       reader->AddVariable( "var3", &var3 );
       reader->AddVariable( "var4", &var4 );

       // book classifier(s)
       reader->BookMVA( "MLP classifier", "weights/MVAnalysis_MLP.weights.txt" );

       // prepare event loop
       TFile *input = TFile::Open( "tmva_example.root" );
       TTree* theTree = (TTree*)input->Get("TreeS");
       // … set branch addresses for user TTree

       for (Long64_t ievt=3000; ievt<theTree->GetEntries(); ievt++) {
          theTree->GetEntry(ievt);

          // compute input variables
          var1 = userVar1 + userVar2;
          var2 = userVar1 - userVar2;
          var3 = userVar3;
          var4 = userVar4;

          // calculate classifier output
          Double_t out = reader->EvaluateMVA( "MLP classifier" );
          // do something with it …
       }
       delete reader;
    }

Welcome to the Dougherty County Public Library's Facebook and ...
 
NEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTNEWS ANNOUNCEMENT
NEWS ANNOUNCEMENT
 
C-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docC-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.doc
 
MAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docMAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.doc
 
Mac OS X Guide.doc
Mac OS X Guide.docMac OS X Guide.doc
Mac OS X Guide.doc
 
hier
hierhier
hier
 
WEB DESIGN!
WEB DESIGN!WEB DESIGN!
WEB DESIGN!
 

Jörg Stelzer

  • 1. Machine Learning with TMVA: A ROOT-based Tool for Multivariate Data Analysis. DESY Computing Seminar, Hamburg, 14.1.2008. The TMVA developer team: Andreas Höcker, Peter Speckmeyer, Jörg Stelzer, Helge Voss
  • 18. No Single Best. Comparison table rating each classifier (Cuts, Likelihood, PDERS/k-NN, H-Matrix, Fisher, MLP, BDT, RuleFit, SVM) against the criteria: performance (no/linear correlations, nonlinear correlations), speed (training, response), robustness (overtraining, weak input variables), curse of dimensionality, and clarity.
  • 34. A Few Toy Examples
  • 41. TMVAnalysis.C: Script for Training

        void TMVAnalysis( )
        {
           // create Factory
           TFile* outputFile = TFile::Open( "TMVA.root", "RECREATE" );
           TMVA::Factory *factory = new TMVA::Factory( "MVAnalysis", outputFile, "!V" );

           // give training/test trees
           TFile *input = TFile::Open( "tmva_example.root" );
           factory->AddSignalTree    ( (TTree*)input->Get("TreeS"), 1.0 );
           factory->AddBackgroundTree( (TTree*)input->Get("TreeB"), 1.0 );

           // register input variables
           factory->AddVariable( "var1+var2", 'F' );
           factory->AddVariable( "var1-var2", 'F' );
           factory->AddVariable( "var3", 'F' );
           factory->AddVariable( "var4", 'F' );

           factory->PrepareTrainingAndTestTree( "",
              "NSigTrain=3000:NBkgTrain=3000:SplitMode=Random:!V" );

           // select MVA methods
           factory->BookMethod( TMVA::Types::kLikelihood, "Likelihood",
              "!V:!TransformOutput:Spline=2:NSmooth=5:NAvEvtPerBin=50" );
           factory->BookMethod( TMVA::Types::kMLP, "MLP",
              "!V:NCycles=200:HiddenLayers=N+1,N:TestRate=5" );

           // train, test and evaluate
           factory->TrainAllMethods();
           factory->TestAllMethods();
           factory->EvaluateAllMethods();

           outputFile->Close();
           delete factory;
        }
  • 42. TMVApplication.C: Script for Application

        void TMVApplication( )
        {
           // create Reader
           TMVA::Reader *reader = new TMVA::Reader( "!Color" );

           // register the variables
           Float_t var1, var2, var3, var4;
           reader->AddVariable( "var1+var2", &var1 );
           reader->AddVariable( "var1-var2", &var2 );
           reader->AddVariable( "var3", &var3 );
           reader->AddVariable( "var4", &var4 );

           // book classifier(s)
           reader->BookMVA( "MLP classifier", "weights/MVAnalysis_MLP.weights.txt" );

           // prepare event loop
           TFile *input = TFile::Open( "tmva_example.root" );
           TTree* theTree = (TTree*)input->Get( "TreeS" );
           // … set branch addresses for user TTree

           for (Long64_t ievt=3000; ievt<theTree->GetEntries(); ievt++) {
              theTree->GetEntry(ievt);

              // compute input variables
              var1 = userVar1 + userVar2;
              var2 = userVar1 - userVar2;
              var3 = userVar3;
              var4 = userVar4;

              // calculate classifier output
              Double_t out = reader->EvaluateMVA( "MLP classifier" );
              // do something with it …
           }
           delete reader;
        }