QUARK/GLUON JET TAGGING FOR ALICE:
MACHINE LEARNING FOR PARTICLE PHYSICS
ANDREW JOHN LOWE
Wigner Research Centre for Physics, Hungarian Academy of Sciences
INTRODUCTION
Search strategies for new subatomic particles often depend
on being able to efficiently discriminate between signal and back-
ground processes. Particle physics experiments are expensive, the
competition between rival experiments is intense, and the stakes
are high. This has led to increased interest in advanced statistical
methods to extend the discovery reach of the experiments. We
present a new method that could be used for differentiating be-
tween decays of quarks and gluons at experiments like those at the
Large Hadron Collider (LHC) at CERN. The power to discriminate
between these two types of particle would have a huge impact on
many new physics searches at CERN and beyond.
THE ALICE EXPERIMENT
ALICE (A Large Ion Collider Experiment) is one of seven de-
tector experiments at the LHC. ALICE is focusing on the physics
of strongly interacting matter in heavy-ion (lead nuclei) collisions.
The resulting temperature and energy density are expected to be
high enough to produce quark-gluon plasma, a state of matter wherein
quarks and gluons are freed. Similar conditions are believed to have
existed a fraction of a second after the Big Bang. Recreating this
primordial form of matter and understanding how it evolves is ex-
pected to shed light on questions about how matter is organized,
the mechanism that confines quarks and gluons, and the nature of
strong interactions and how they result in generating the bulk of the
mass of ordinary matter.
Figure 1: Computer generated cut-away view of ALICE.
WHAT IS A JET?
The production of quarks and gluons (collectively known as
partons) via strong interactions is the dominant high-momentum-
transfer process at the LHC. Quarks and gluons are not observed
individually. Instead, we can only measure their decay products.
What we observe is a cone-shaped spray of particles called a jet. The
measured particles are grouped together by a jet algorithm, and the
resultant jets are viewed as a proxy for the initial quarks and gluons
that we can’t measure.
Figure 2: When two high-energy protons collide, the partons that compose them (here only
quarks are depicted in green, red, and blue) can hit each other. Some of these partons (pink
balls) can fly away and "hadronize", forming directional jets of energetic particles (white balls).
From [1].
THE PROBLEM IN A NUTSHELL
Inside ALICE, beams of energetic protons and/or heavy ions
collide. Quarks and gluons emerge and decay into collimated sprays
of particles, and algorithms cluster these decay products into jets.
For each jet, we’d like to know what initiated it: was it a quark or
a gluon? This is an archetypal classification problem that might be
amenable to machine learning.
FEATURE ENGINEERING
There are several differences between quarks and gluons that
motivate observables that might distinguish quark-initiated jets
from gluon-initiated jets. Specifically,
we wish to leverage differences in jet substructure to construct dis-
criminant variables. Many candidate discriminant variables (fea-
tures) were found during a thorough and extensive literature search,
but we also consider various unintuitive combinations of variables,
following the example in [2]. Combining particle attributes with
each other (to form sums, differences, or products, for example)
leads to a rapid proliferation of features. Consequently, we ex-
plore hundreds of experimentally motivated, physically motivated,
and unmotivated single-variable discriminants.
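To illustrate how quickly combining attributes multiplies the number of features, here is a minimal sketch that forms pairwise sums, differences, and products. The attribute names (`n_tracks`, `pt`, `width`) and their values are hypothetical and are not taken from the analysis; the actual feature set was implemented in C++ inside ALIROOT.

```python
from itertools import combinations


def combine_features(jet):
    """Return pairwise sums, differences, and products of a jet's attributes."""
    combined = {}
    # Sort by name so the generated feature names are deterministic.
    for (name_a, a), (name_b, b) in combinations(sorted(jet.items()), 2):
        combined[f"{name_a}+{name_b}"] = a + b
        combined[f"{name_a}-{name_b}"] = a - b
        combined[f"{name_a}*{name_b}"] = a * b
    return combined


# Hypothetical attribute names and values, for illustration only.
jet = {"n_tracks": 4.0, "pt": 25.0, "width": 0.1}
features = combine_features(jet)
print(len(jet), "attributes ->", len(features), "pairwise combinations")
```

With n attributes and three operations, this already yields 3 n (n - 1) / 2 new features, which is why the candidate list grows into the hundreds.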
GETTING & CLEANING DATA
The ALICE analysis software framework ALIROOT was used
to process Monte Carlo simulated data containing many jets. We
inserted our own C++ code with handcrafted features into ALIROOT.
Unphysical (missing) values are denoted by NaNs. We require
that jets have at least two tracks, are fully contained within the
tracker's geometrical acceptance, and are isolated. We then analyse
the floating-point types contained in the data. This is a new subprocess
in particle physics data analysis that we have invented. The
data is contained in a two-dimensional array-like structure, in which
each column contains measurements of one feature, and each row
contains one jet. We plot this below:
[Figure 3 panels: 2 tracks, 3 tracks, 4 tracks, ≥ 5 tracks. Axes: Feature, Jet. Number types: Zero; Normal < ε; Normal > ε; Large unnormal or ∞; NaN.]
Figure 3: "Missingness" and floating-point type map of the data. Feature names and jet ID
numbers have been omitted for clarity.
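A map like the one in Figure 3 could be produced by assigning every value in the table to one of the legend categories. The sketch below is an assumption about how such a classification might look, not the code used for the poster: ε is taken to be the double-precision machine epsilon, and the `large` threshold is a hypothetical stand-in for whatever "large" means in the legend.

```python
import math
import sys

EPS = sys.float_info.epsilon  # double-precision machine epsilon (assumed meaning of ε)


def number_type(x, large=1e300):
    """Classify a float into one of the Figure 3 legend categories.

    The `large` threshold is a hypothetical choice for illustration.
    """
    if math.isnan(x):
        return "NaN"
    if x == 0.0:
        return "Zero"
    if math.isinf(x) or abs(x) > large:
        return "Large or inf"
    if abs(x) < EPS:
        return "Normal < eps"
    return "Normal > eps"


# Rows are jets, columns are features (toy values).
table = [
    [0.0, 1.5, float("nan")],
    [1e-20, float("inf"), 2.0],
]
type_map = [[number_type(v) for v in row] for row in table]
for row in type_map:
    print(row)
```

Colouring each cell of `type_map` by category would give a picture in which duplicated columns and columns dominated by NaN or zero stand out visually.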
We observe that several features appear to be duplicates, and
several are overwhelmingly NaN or zero. These have no predictive
value and are removed. We reset large unnormalised and
infinite values to the largest representable normalised number on
our hardware. Several features have values below the machine
epsilon (ε); these are due to rounding error in floating-point arithmetic,
and are essentially equal to zero. The variation in their values is not
real, and could be misleading to a classifier. We note that processing
very small floating-point values may significantly slow computation;
in extreme cases, instructions may be as much as 100 times
slower [3, 4]. We flush these values to zero, which should speed up
classifier training.
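The cleaning steps described above might be sketched as follows. This is an illustration under stated assumptions: the 95% cut for "overwhelmingly NaN or zero" is hypothetical, only infinite values are clamped (the poster does not define "large"), the sign of a clamped value is preserved, and NaNs are left in place as missing-value markers.

```python
import math
import sys

EPS = sys.float_info.epsilon  # machine epsilon for doubles
MAX = sys.float_info.max      # largest finite double


def drop_uninformative(columns, max_missing=0.95):
    """Drop duplicate columns and columns that are mostly NaN or zero.

    `columns` maps feature name -> list of floats; `max_missing` is a
    hypothetical threshold chosen for illustration.
    """
    kept, seen = {}, set()
    for name, values in columns.items():
        missing = sum(1 for v in values if math.isnan(v) or v == 0.0)
        # NaN never equals itself, so replace it with a marker before comparing.
        key = tuple("nan" if math.isnan(v) else v for v in values)
        if missing / len(values) > max_missing or key in seen:
            continue
        seen.add(key)
        kept[name] = values
    return kept


def clean_value(x):
    """Clamp infinities and flush rounding-level values to zero."""
    if math.isnan(x):
        return x                      # missing values stay NaN
    if math.isinf(x):
        return math.copysign(MAX, x)  # largest finite double, same sign
    if abs(x) < EPS:
        return 0.0                    # treat rounding noise as exactly zero
    return x


cols = {
    "a": [1.0, 2.0, 3.0],
    "a_copy": [1.0, 2.0, 3.0],
    "empty": [0.0, float("nan"), 0.0],
    "b": [1e-20, float("inf"), 2.0],
}
cleaned = {
    name: [clean_value(v) for v in values]
    for name, values in drop_uninformative(cols).items()
}
print(sorted(cleaned))
```

Flushing to zero in the data itself, rather than relying on a hardware flush-to-zero mode, keeps the behaviour independent of compiler and CPU settings.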
JET TRUTH LABELLING
The jets are assigned a "ground-truth" label using information
in the data simulator event record. However, the labelling procedure
is not unambiguous, and there is significant class noise (i.e., there
are mislabelled jets) in the assignments. We have devised a new la-
belling scheme to address the problem of mislabelled jets. We adapt
the method in [5] by extending it to form an ensemble of multiple
different labelling schemes. We then reject all jets for which the en-
semble does not reach a consensus, i.e., we employ "the wisdom of
the crowds". Limiting class noise is critical for two reasons: firstly,
many machine learning classifiers are confused by mislabelled ob-
servations, and this will damage performance; secondly, the perfor-
mance of a classifier is measured with respect to its ability to cor-
rectly predict the assigned labels, so performance estimates are less
meaningful if the labels are uncertain.
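The consensus step can be sketched as below, assuming that "consensus" means unanimous agreement among all schemes; the poster does not state the exact rule, so a majority vote would be an equally plausible reading. The jet identifiers and label strings are hypothetical.

```python
def consensus_label(labels):
    """Return the shared label if every scheme agrees, otherwise None."""
    first = labels[0]
    return first if all(label == first for label in labels) else None


# Hypothetical outputs of several labelling schemes for three jets.
jets = {
    "jet_1": ["q", "q", "q", "q", "q"],
    "jet_2": ["g", "g", "photon", "g", "g"],
    "jet_3": ["g", "g", "g", "g", "g"],
}

accepted = {}
for jet_id, labels in jets.items():
    label = consensus_label(labels)
    if label is not None:  # jets without consensus are rejected
        accepted[jet_id] = label
print(accepted)
```

Requiring agreement trades sample size for label purity: fewer jets survive, but the surviving labels are less likely to be wrong.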
We use an ensemble labeller with five members. To tabulate
their outputs would require ten tables corresponding
to the $\binom{5}{2}$ possible adjacency matrices which, in turn, correspond
to the margins of the 5-dimensional adjacency matrix
that fully describes the relationships between the schemes.
We use a chord diagram to examine these relationships,
which are in agreement with our expectations:
[Chord diagram. Labelling schemes shown: Cone max-pT, Cone QCD-aware, GA max-pT, GA QCD-aware, Reclustered.]
∅: no label
q: light quark
g: gluon
c: charm
b: bottom
γ: photon
Figure 4: Chord diagram showing the relationships between five chosen labelling schemes.
There is overwhelming agreement in the label assignments from each scheme, and (as expected)
variations of the "max-pT" scheme are prone to label a jet as photon-initiated (γ) when a QCD-
aware scheme would label the jet as a gluon or assign no label (as noted in [5]).
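The count of ten pairwise tables follows from $\binom{5}{2} = \frac{5!}{2!\,3!} = 10$. As a rough sketch of what tabulating them would involve, the code below builds one table of label co-occurrences per pair of schemes. The scheme names `A` to `E` and the label strings are placeholders, not the schemes used in the analysis.

```python
from collections import Counter
from itertools import combinations

SCHEMES = ["A", "B", "C", "D", "E"]  # placeholder scheme names


def pairwise_tables(assignments):
    """Count label co-occurrences for every pair of schemes.

    `assignments` is a list with one dict per jet, mapping scheme -> label.
    """
    tables = {pair: Counter() for pair in combinations(SCHEMES, 2)}
    for jet in assignments:
        for a, b in tables:
            tables[(a, b)][(jet[a], jet[b])] += 1
    return tables


# Two toy jets: one where all schemes agree, one where scheme C disagrees.
assignments = [
    dict(zip(SCHEMES, ["q", "q", "q", "q", "q"])),
    dict(zip(SCHEMES, ["g", "g", "photon", "g", "g"])),
]
tables = pairwise_tables(assignments)
print(len(tables), "pairwise tables")
```

A chord diagram conveys the same pairwise counts in a single figure, which is easier to scan than ten separate tables.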
FEATURE RANKING RESULTS
Our data initially contains more than 300 features. Removing
duplicate and highly correlated features halves the number of fea-
tures; first we filter on the absolute value of Pearson correlation to
identify linearly correlated features, then we filter on the absolute
value of Spearman correlation to identify monotonically-related fea-
tures. To optimally search the remaining feature space to find the
variables that provide the best predictive power, we invented a fast
filter-based method that involves ranking variables by information
gain (Kullback-Leibler divergence) or Gini impurity and comparing
their rank with that of random probes injected into the data. We do
this repeatedly for a large number of bootstrap resamplings to yield
a median (or, optionally, a mean) and nonparametric confidence in-
terval estimate for the chosen metric for each feature. We then re-
move features with values of the metric less than that for random
probes, or within one standard deviation:
Figure 5: Box-and-whisker plot showing median information gain for features and six ran-
dom probes (denoted "BOGUS"). To the left is worse, to the right is better.
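The random-probe ranking could look roughly like the sketch below. It is not the poster's implementation: it uses information gain only, an equal-width binning with a hypothetical bin count, uniform random noise as probes, and it reads "within one standard deviation" as one bootstrap standard deviation above the best probe's median. The feature names and toy data are invented for illustration.

```python
import math
import random
from collections import Counter
from statistics import median, pstdev


def entropy(labels):
    """Shannon entropy (in nats) of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log(c / n) for c in Counter(labels).values())


def information_gain(values, labels, bins=4):
    """Information gain of the label given an equal-width binning of a feature."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # guard against constant columns
    groups = {}
    for v, y in zip(values, labels):
        b = min(int((v - lo) / width), bins - 1)
        groups.setdefault(b, []).append(y)
    n = len(labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_with_probes(features, labels, n_probes=6, n_boot=50, seed=0):
    """Keep features whose median gain beats the best random probe."""
    rng = random.Random(seed)
    n = len(labels)
    columns = dict(features)
    probes = [f"BOGUS_{i}" for i in range(n_probes)]
    for name in probes:
        columns[name] = [rng.random() for _ in range(n)]  # pure noise
    scores = {name: [] for name in columns}
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap resample
        y = [labels[i] for i in idx]
        for name, col in columns.items():
            scores[name].append(information_gain([col[i] for i in idx], y))
    med = {name: median(s) for name, s in scores.items()}
    best_probe = max(probes, key=med.get)
    threshold = med[best_probe] + pstdev(scores[best_probe])
    kept = [name for name in features if med[name] > threshold]
    return kept, med


# Toy data: "signal" determines the label, "noise" is unrelated to it.
rng = random.Random(1)
x = [rng.random() for _ in range(200)]
labels = [int(v > 0.5) for v in x]
features = {"signal": x, "noise": [rng.random() for _ in range(200)]}
kept, med = rank_with_probes(features, labels)
print(kept)
```

Because the probes carry no information by construction, their scores give an empirical floor for what a useless feature looks like at this sample size.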
In addition to confirming the power of features already
proposed in past work, this method has found intriguing
new variables that promise better discrimination between
quark- and gluon-initiated jets and are therefore ideal can-
didates for further study.
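The two-stage correlation filter described at the start of this section could be sketched as follows: a Pearson pass to drop linearly related features, then a Spearman pass to drop monotonically related ones. The 0.95 threshold, the greedy keep-first strategy, and the toy feature names are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def ranks(values):
    """Average ranks, so tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def filter_correlated(features, corr, threshold=0.95):
    """Greedily keep a feature only if |corr| with all kept features is below threshold."""
    kept = {}
    for name, col in features.items():
        if all(abs(corr(col, other)) < threshold for other in kept.values()):
            kept[name] = col
    return kept


x = [1.0, 2.0, 3.0, 4.0, 5.0]
features = {
    "x": x,
    "two_x": [2 * v for v in x],        # linearly related to x
    "exp_x": [math.exp(v) for v in x],  # monotonic in x, but not linear
    "other": [2.0, -1.0, 3.0, 0.0, 1.0],
}
after_pearson = filter_correlated(features, pearson)
after_spearman = filter_correlated(after_pearson, spearman)
print(list(after_pearson), list(after_spearman))
```

In this toy case the Pearson pass removes `two_x` but keeps `exp_x`, which is only caught by the rank-based Spearman pass.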
REFERENCES
[1] C. Manuel. The Stopping Power of Hot Nuclear Matter. Physics, 7(97), 2014. doi:
10.1103/Physics.7.97. URL http://link.aps.org/doi/10.1103/Physics.7.97.
[2] J. Gallicchio, J. Huth, M. Kagan, M. D. Schwartz, K. Black, and B. Tweedie.
Multivariate discrimination and the Higgs + W/Z search. JHEP, 04:069, 2011. doi:
10.1007/JHEP04(2011)069.
[3] E. M. Schwarz, M. Schmookler, and S. D. Trong. FPU implementations with denormalized
numbers. IEEE Transactions on Computers, 54(7):825–836, July 2005. ISSN 0018-9340. doi:
10.1109/TC.2005.118.
[4] I. Dooley and L. Kale. Quantifying the interference caused by subnormal floating-point
values. In Proceedings of the Workshop on Operating System Interference in High Performance
Applications, 2006.
[5] A. Buckley and C. Pollard. QCD-aware partonic jet clustering for truth-jet flavour labelling.
Eur. Phys. J., C76(2):71, 2016. doi: 10.1140/epjc/s10052-016-3925-z.
This work was supported by: Hungarian National Research Fund (OTKA) NK106119 and the Wigner GPU Laboratory of the Wigner RCP, Hungarian Academy of Sciences
