Multivariate Analysis of the Vector Boson Fusion Higgs Boson
Brendan Marsh University of Missouri August 8, 2016
Ph.D. Student Supervisor: Antonio De Maria
Supervisor: Prof. Dr. Arnulf Quadt
Abstract
A multivariate analysis is presented for the study of the vector boson
fusion (VBF) Higgs boson decaying to a pair of tau leptons. While the VBF
production mechanism of the Higgs is roughly an order of magnitude lower
in cross section than the dominant gluon-gluon fusion mechanism, it is
shown that VBF produces a distinctive signature that is well suited for
detection by multivariate analyses. A number of discriminant variables are
explored in addition to a direct comparison of different machine learning
toolkits. Ultimately, a statistical significance of 7.9 is achieved for detection
of the VBF Higgs boson in this truth level study.
Contents
1. Motivation and Background
1.1 The Higgs Boson
1.2 Vector Boson Fusion
1.3 Fully Hadronic Decay Mode
1.4 Background Processes
2. Multivariate Analysis
2.1 Monte Carlo Samples
2.2 Preselection Cuts
2.3 Cut Based Analysis
2.4 Decision Trees
2.5 Adaptive Boosting
2.6 Discriminant Variables
2.6.1 Collinear Approximation
2.6.2 Tau Centrality Product
2.6.3 η Variables
2.6.4 Tau-Jet Angular Correlations
2.6.5 Fox-Wolfram Moments
2.6.6 MVA Variables
2.7 TMVA Multivariate Analysis
2.8 Scikit Learn Multivariate Analysis
3. Conclusions
3.1 Outlook for VBF Higgs Analysis
3.2 Suggestions for Future Studies
3.3 Thanks!
References
1. Motivation and Background
1.1 The Higgs Boson
Within the context of the Standard Model (SM),
the Higgs mechanism is necessary for the mass
generation of the W and Z gauge bosons. By
invoking a break in electroweak symmetry, the
Higgs mechanism implies the existence of a spin
zero, neutral particle; we know this particle as the
Higgs boson.
For many years, the Higgs remained elusive in
particle detectors. It was not until July 4, 2012 that
CERN announced that both the CMS and ATLAS
experiments at the Large Hadron Collider (LHC) met
the 5𝜎 discovery benchmark for a new boson with a
mass of roughly 125 GeV that was consistent with
a Higgs boson. It seems the Higgs has finally been
found!
Many studies of the Higgs boson are ongoing as Run II of the LHC is currently approaching an
online integrated luminosity of 20 inverse femtobarns. As our studies of the Higgs progress, the vector
boson fusion production mechanism becomes increasingly important as a detection pathway, in CP
violation studies [1], and in other areas.
1.2 Vector Boson Fusion
A Standard Model Higgs boson may be produced via one of four production mechanisms at the
LHC. In the vector boson fusion (VBF) mechanism, two quarks scatter by each radiating a W or Z
(vector) boson. This pair of vector bosons then fuses to produce a low mass
Higgs boson.
Figure 1 The elementary particles of the Standard Model, labelled with their mass, charge, and spin.
Figure 2 Left: Feynman diagrams of the four Higgs production mechanisms at the LHC, with vector boson
fusion highlighted in red. Right: Corresponding cross section for Higgs production mechanisms.
One can see that the cross section of the gluon-gluon fusion mechanism is roughly an order of
magnitude greater than that of the VBF mechanism for a Higgs of mass 125 GeV [2]. However,
the addition of the two quarks into the final state, visible as highly energetic jets, produces a
distinctive signature that is lacking in gluon-gluon fusion. In terms of measurable quantities,
VBF events may be recognized by the following characteristics:
• Highly η separated jets
• Jets in opposite hemispheres
• High invariant mass of jets
• No central jets above a certain $p_T$
1.3 Fully Hadronic Decay Mode
The 125 GeV Higgs boson most often decays into a $b\bar{b}$ pair; however, this decay mode is not easily
recovered in a sea of $t\bar{t}$ background [3]. The Higgs may additionally decay into a $\tau^+\tau^-$ pair; this is the
decay mode studied in this analysis. Specifically, I investigate the “fully hadronic” decay mode in which
both tau leptons subsequently decay into a tau neutrino and a number of pions, which accounts for
roughly 41% of the ditau branching ratio [2]. A Feynman diagram of the signal process is given below.
Figure 3 The Feynman diagram of the signal process of this study: Higgs boson production via vector boson
fusion, with a subsequent decay into tau leptons, which in turn decay into tau neutrinos and pions.
1.4 Background Processes
A bit like searching for a needle in a haystack, the VBF Higgs process is a rare event that is drowned
out by background processes with similar event characteristics and much higher cross sections. To
detect a small signal in a sea of background, one’s goal is to remove as much of the background as
possible while retaining as many signal events as possible. Thus, it is just as important to
understand the background processes competing with the signal process as it is to
understand the signal process itself. The main background processes relevant to this study are the Z → ττ
and $t\bar{t}$ processes.
Z → ττ + jets
According to the Particle Data Group [2], the Z boson decays into a pair of tau leptons with a
branching ratio of roughly 3.4%. As Z bosons are produced in abundance at the LHC, this channel
introduces a large background with the same final state, a pair of tau leptons. Fortunately, there
are features of VBF that we expect to differ in the case of Z → ττ. Foremost, the invariant mass of the
reconstructed taus should reflect the mass of the particle from which they came, although mass
reconstruction can be difficult (section 2.6.1). For VBF taus we expect to see the mass of the Higgs,
roughly 125 GeV, while for the Z → ττ channel we expect a peak around 91 GeV. Additionally, the
distinctive jet topology of VBF is not expected in the Z → ττ channel.
$t\bar{t}$
Top quarks almost always decay into a W boson and a b quark, and the W boson can then decay into a
tau lepton and a neutrino. Thus, given two top quarks it is possible to have two taus in the final state, so the
$t\bar{t}$ process, also produced in abundance at the LHC, poses another background. However, a number
of features of the $t\bar{t}$ background make it quite easy to eliminate. Very often in the
final state of the $t\bar{t}$ background there exist jets originating from b quarks, while this is rare for VBF final
states. Fortunately, there exist “b-tagging” algorithms capable of labelling jets in the detector that most
likely arise from b quarks. Thus, we may cut out events with b jets, leaving Z → ττ as irreducible
background. Additionally, we do not expect to find any correlations between the tau decay products
and the missing transverse energy, unlike VBF in which they are heavily correlated.
2. Multivariate Analysis
The basic goal of any multivariate analysis (MVA) is to separate signal events from background
events, with as high an efficiency as possible, given some input variables for each event. Most
MVAs take a number of input variables and return a single measure of “signal-likeness”; an event must
exceed a certain threshold in this measure to be considered a signal event.
Before diving into the multivariate techniques used for this analysis, the training samples used to
develop and test the analysis will be described, along with the traditional cut based analysis for VBF
and reasons why it can be improved using a multivariate analysis.
2.1 Monte Carlo Samples
Monte Carlo simulations provide a powerful tool for studying stochastic processes. Here, Powheg
and Pythia 8 Monte Carlo generators were used to simulate truth level events for both VBF and the
relevant background processes at a centre of mass energy of $\sqrt{s} = 13$ TeV. Using these simulated
events, one may train a multivariate analysis method to be applied to real data. The Monte Carlo
samples used for this study are given below.
It is important to note that this was a truth level study only; no reconstruction or trigger level effects
have been incorporated. These effects are non-negligible and should be incorporated in future studies.
2.2 Preselection Cuts
A number of cuts may be applied to the events before any classifier is used. Some of these cuts
correspond to limitations of the ATLAS detector (corresponding to events that would not be well
reconstructed in practice) while others are made specifically to remove background events. The
preselection cuts used for this analysis are given below. If any event does not fulfill the criteria, it is
discarded from the analysis.
$\tau_{vis}\ p_T > 20$ GeV: The transverse momentum of both tau leptons must be greater than 20 GeV
for them to be detected and reconstructed by tau reconstruction algorithms.
$|\eta_\tau| < 2.5$: The absolute value of η, the pseudorapidity, of each tau lepton must be
less than 2.5 for good reconstruction in the tracker.
MET > 20 GeV: The missing transverse energy should be greater than 20 GeV, as we
expect missing energy from neutrinos in the final state.
$p_T^{lead},\ p_T^{sublead} > 20$ GeV: The transverse momentum of the leading and subleading jets should be
greater than 20 GeV for them to be detected.
No b-tagged jets: B-tagging algorithms can identify jets originating from b quarks, thus
b-tagged jets can be cut to eliminate $t\bar{t}$ background. In truth level studies,
one uses the PDG (Particle Data Group) ID to identify and cut b-jets.
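The preselection above amounts to a short sequence of independent requirements on each event. The following is a minimal sketch of how they could be applied; the event layout (a dict with `taus`, `jets`, and `met` entries) and the function name `passes_preselection` are assumptions made for illustration, not the format used in this study.

```python
def passes_preselection(event):
    """Return True if a truth-level event passes every preselection cut.

    Hypothetical event layout (an assumption, not the study's data format):
      event["taus"]: list of dicts with 'pt' [GeV] and 'eta' for the visible taus
      event["jets"]: list of dicts with 'pt' [GeV] and 'is_b' (truth b-jet flag)
      event["met"]:  missing transverse energy [GeV]
    """
    taus = event["taus"]
    # Order jets by transverse momentum so jets[0] is leading, jets[1] subleading.
    jets = sorted(event["jets"], key=lambda jet: jet["pt"], reverse=True)
    if len(taus) < 2 or len(jets) < 2:
        return False
    # Both visible taus: pT > 20 GeV and |eta| < 2.5.
    for tau in taus[:2]:
        if tau["pt"] <= 20.0 or abs(tau["eta"]) >= 2.5:
            return False
    # Missing transverse energy > 20 GeV.
    if event["met"] <= 20.0:
        return False
    # Leading and subleading jets: pT > 20 GeV.
    if jets[0]["pt"] <= 20.0 or jets[1]["pt"] <= 20.0:
        return False
    # Veto any event containing a b-tagged jet.
    if any(jet["is_b"] for jet in jets):
        return False
    return True


example_event = {
    "taus": [{"pt": 45.0, "eta": 0.3}, {"pt": 30.0, "eta": -1.1}],
    "jets": [{"pt": 80.0, "is_b": False}, {"pt": 50.0, "is_b": False}],
    "met": 35.0,
}
keep = passes_preselection(example_event)
```

Events for which the function returns False would simply be discarded before any classifier is trained or applied.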
2.3 Cut Based Analysis
The most basic form of classifier, and one that is often used due to its simplicity and physical
motivation, is a simple cut based analysis. This entails requiring a candidate event to pass a series of
univariate “cuts” which are motivated by knowledge of the signal process. The traditional cuts used to
identify VBF events over background events are given below [4].
$p_T^{lead} > 40$ GeV: VBF produces highly energetic quark jets in the final state, so we expect
to see a leading jet with high transverse momentum.
$p_T^{sublead} > 30$ GeV: There are two quark jets in the final state, thus the subleading jet
should also have high transverse momentum.
$|\eta_{lead} - \eta_{sublead}| > 3$: The jets of VBF have characteristically high separation in
pseudorapidity.
$\eta_{lead} \cdot \eta_{sublead} < 0$: The VBF topology exhibits jets that are back-to-back, lying in
opposite hemispheres of the detector.
$m_{j_{lead}+j_{sublead}} > 300$ GeV: The highly energetic jets show a high invariant mass.
Jets-Taus Centrality: The tau leptons should be detected in the central part of the detector in
comparison to the jets. Explicitly, the pseudorapidity of the taus should
lie within the range spanned by the jets.
The cut based analysis has its advantages; it is very simple to implement, requires no “training” like
the multivariate methods, and the rationale for each of the cuts is grounded in physics. However, while
it excels in its understandability, it often lacks the classification power required to recover rare
processes like the VBF Higgs.
The inferiority of the cut based analysis lies in the assumption that each variable can be cut upon
independently of the others when, in fact, the best cut to make on one variable may depend on another,
or even many others. That is, correlations cannot be accounted for. This issue is addressed by
multivariate classification methods like decision trees.
2.4 Decision Trees
Decision trees, like cut based analyses, split events into groups by setting a threshold on some
variable. However, while the cut based analysis only makes a single round of cuts, decision trees
continue to further subdivide groups, separating signal from background more and more at each step
by making the most efficient cut possible. Additionally, the most efficient cuts are calculated
algorithmically from a set of data used to “train” the decision tree.
Figure 4 A simple decision tree. Here orange represents VBF events while blue represents background events.
At each stage, groups become more purely signal or background by splitting on some variable.
The metric that is normally minimized for each split is the Gini impurity of the resulting groups of
events. It is defined as the probability of incorrectly labelling a random event in the group when labels
are assigned according to the known distribution of signal and background within the group. For a binary
classification problem, the Gini impurity for a group of events is given by the following formula, where
$n_{sig}$ and $n_{bg}$ are the fractions of signal and background events in the group:
$I_G = n_{sig}\,(1 - n_{sig}) + n_{bg}\,(1 - n_{bg})$
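To make the formula concrete, the sketch below computes the Gini impurity of a group and scans all thresholds on a single variable for the split with the lowest event-weighted impurity. It is an illustration only: the function names are hypothetical, events are unweighted, and real toolkits add refinements (event weights, minimum node sizes) that are not shown.

```python
def gini_impurity(n_signal, n_background):
    """Gini impurity of a group, from its signal and background event counts."""
    total = n_signal + n_background
    if total == 0:
        return 0.0
    f_sig = n_signal / total
    f_bkg = n_background / total
    return f_sig * (1.0 - f_sig) + f_bkg * (1.0 - f_bkg)


def best_split(values, is_signal):
    """Scan thresholds on one variable; return (threshold, weighted impurity)."""
    pairs = sorted(zip(values, is_signal))
    n = len(pairs)
    best_threshold, best_impurity = None, float("inf")
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between two identical values
        threshold = 0.5 * (pairs[i - 1][0] + pairs[i][0])
        left, right = pairs[:i], pairs[i:]
        n_sig_left = sum(1 for _, sig in left if sig)
        n_sig_right = sum(1 for _, sig in right if sig)
        impurity = (
            len(left) * gini_impurity(n_sig_left, len(left) - n_sig_left)
            + len(right) * gini_impurity(n_sig_right, len(right) - n_sig_right)
        ) / n
        if impurity < best_impurity:
            best_threshold, best_impurity = threshold, impurity
    return best_threshold, best_impurity


mixed = gini_impurity(50, 50)   # maximally mixed group
pure = gini_impurity(10, 0)     # pure signal group
split = best_split([1.0, 2.0, 3.0, 4.0], [False, False, True, True])
```

A maximally mixed group has an impurity of 0.5 and a pure group has an impurity of 0, so each split drives the resulting groups towards purity.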
Unlike a cut based analysis, which can only form rectangular signal regions in the variable phase
space, decision trees can be grown to approximate arbitrarily complex decision functions. However,
decision trees, too, are not without their flaws. The intuition of a cut based analysis is lost since the
splits are generated algorithmically. Additionally, it is very easy to grow a tree so deep that it
begins to recognize individual points in the training data, becoming artificially complex.
This phenomenon is well known in the field of machine learning, and is commonly known as
“overtraining”. To address this issue, a technique known as boosting is used, as opposed to older
“pruning” methods which grow full decision trees and then backtrack to discard unimportant splits.
2.5 Adaptive Boosting
Adaptive boosting, or AdaBoost, is a general method that can be applied to a number of
classifiers, such as decision trees, to improve reliability, performance, and resistance to
overtraining. In the context of adaptive boosting of decision trees, the single decision tree is replaced
by a “forest” consisting of hundreds of decision trees which are restricted to only a few levels, such
as the one above. As a whole, this forest of decision trees is called a boosted decision tree (BDT),
and the output of the BDT is a weighted sum of the outputs of each individual tree.
Each individual decision tree is called a “weak learner” in the sense that it is only one of many
classifiers in the forest, each of limited power on its own. Here is where the adaptive boosting comes in; each weak learner is trained
iteratively to improve upon the previous one. The first weak learner is trained as a normal decision
tree from the training data. However, the results of the first weak learner are then used to weight the
importance of the training data for the next weak learner; points that were classified correctly receive
small weights while incorrectly classified points receive large weights. In this way, the next weak
learner is trained focusing on points that have not been classified well by the previous weak learner.
This process continues such that each weak learner focuses on correcting mistakes of the last,
improving at each step. The process is visualised below.
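The reweighting loop described above can be sketched in a few lines. This is a minimal illustration of discrete AdaBoost using one-level trees (decision stumps) as weak learners, with labels +1 for signal and -1 for background; it is not the TMVA or Scikit Learn implementation, and all function names and the toy data are assumptions made for the example.

```python
import math


def stump_predict(row, feature, threshold, polarity):
    """One-level decision tree: returns +1 or -1 for a single event."""
    return polarity if row[feature] > threshold else -polarity


def fit_stump(X, y, w):
    """Find the stump with the smallest weighted misclassification rate."""
    best = None
    for feature in range(len(X[0])):
        values = sorted(set(row[feature] for row in X))
        # Midpoints between neighbouring values, plus one threshold below all
        # points so that a constant prediction is also available.
        thresholds = [0.5 * (a + b) for a, b in zip(values, values[1:])]
        thresholds.append(values[0] - 1.0)
        for threshold in thresholds:
            for polarity in (1, -1):
                err = sum(
                    wi for row, yi, wi in zip(X, y, w)
                    if stump_predict(row, feature, threshold, polarity) != yi
                )
                if best is None or err < best[3]:
                    best = (feature, threshold, polarity, err)
    return best


def adaboost_train(X, y, n_rounds):
    """Train n_rounds stumps; each round reweights the training events."""
    n = len(X)
    w = [1.0 / n] * n  # the first weak learner sees unweighted data
    forest = []
    for _ in range(n_rounds):
        feature, threshold, polarity, err = fit_stump(X, y, w)
        err = min(max(err, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)  # weight of this tree
        forest.append((alpha, feature, threshold, polarity))
        # Misclassified events gain weight, correctly classified ones lose it.
        w = [
            wi * math.exp(-alpha * yi * stump_predict(row, feature, threshold, polarity))
            for row, yi, wi in zip(X, y, w)
        ]
        total = sum(w)
        w = [wi / total for wi in w]
    return forest


def adaboost_score(forest, row):
    """BDT-style output: weighted sum of the individual tree outputs."""
    return sum(a * stump_predict(row, f, t, p) for a, f, t, p in forest)


# Toy data: signal (+1) on both sides of a background (-1) band, which no
# single cut on the variable can separate.
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y = [1, 1, -1, -1, 1, 1]
forest = adaboost_train(X, y, 3)
```

On this toy sample a single stump necessarily misclassifies some events, while the weighted sum of three stumps separates the two classes, which is the behaviour boosting is designed to produce.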
Figure 5 Training of an AdaBoost classifier. The first classifier trains on unweighted data, then
reweights the data for the next and so on to produce the final classifier.
Figure 6 A view of the transverse plane depicting the collinear
approximation. The tau neutrinos travel collinearly with the tau leptons
such that their sum matches the missing transverse energy.
2.6 Discriminant Variables
When training a BDT, a balance should be found between the number of variable inputs to the
BDT and the performance of the BDT. Additionally, while BDTs are known to handle correlated
variables quite well, it is superfluous to include two strongly correlated variables, only one of which
adds discriminatory power to the classification.
Much of my work this summer was spent investigating variables, both common and newly
devised, to search for new discriminating variables for use in a multivariate analysis. The most
important in the analysis was the ditau mass, calculated via the collinear approximation.
2.6.1 Collinear Approximation
In the case of VBF, the mass of the ditau should correspond to the mass of the Higgs, for Z → ττ
the mass of the Z boson, and for $t\bar{t}$ we expect no clear peak. Thus, there are good physical motivations
for the use of the ditau mass in our MVA. However, in order to fully reconstruct the ditau one needs
the missing neutrinos. The collinear approximation accounts for the missing neutrinos by making the
following assumptions.
1. The tau neutrinos are perfectly collinear with their associated tau lepton.
2. The missing transverse energy is entirely due to the tau neutrinos.
Under these approximations, the magnitude
of the neutrino momenta becomes completely
determined by the missing transverse energy.
One is then left with a simple matter of
constructing the neutrinos collinearly with the
taus such that the sum of the neutrinos is
precisely the missing transverse energy.
The collinear approximation is not always
applicable; when the tau leptons are emitted
back to back in the φ plane, the missing transverse
energy cannot be uniquely divided between the two neutrinos.
This leads to a simple constraint between taus:
$\cos\Delta\phi > -0.99$
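The two assumptions above turn the reconstruction into a pair of linear equations in the transverse plane: the missing transverse energy is written as a sum of two neutrino momenta, each a non-negative multiple of the corresponding visible tau momentum. The sketch below solves that system and returns the ditau mass. It additionally assumes that the tau mass and the mass of the visible decay products are negligible, in which case the ditau mass is the visible mass divided by the square root of the product of the two visible momentum fractions; the function name and argument layout are hypothetical.

```python
import math


def collinear_ditau_mass(vis1, vis2, met_x, met_y):
    """Ditau mass in the collinear approximation, or None if not applicable.

    vis1, vis2: (px, py, pz, E) of the visible tau decay products [GeV].
    met_x, met_y: components of the missing transverse energy [GeV].
    """
    px1, py1, pz1, e1 = vis1
    px2, py2, pz2, e2 = vis2
    pt1, pt2 = math.hypot(px1, py1), math.hypot(px2, py2)
    cos_dphi = (px1 * px2 + py1 * py2) / (pt1 * pt2)
    det = px1 * py2 - py1 * px2
    # Reject back-to-back (or exactly parallel) taus: the system is degenerate.
    if cos_dphi <= -0.99 or det == 0.0:
        return None
    # Solve a * pT(vis1) + b * pT(vis2) = MET for the neutrino scale factors.
    a = (met_x * py2 - met_y * px2) / det
    b = (px1 * met_y - py1 * met_x) / det
    if a < 0.0 or b < 0.0:
        return None  # a neutrino would point opposite to its tau
    # Fraction of each tau's momentum carried by its visible decay products.
    x1, x2 = 1.0 / (1.0 + a), 1.0 / (1.0 + b)
    m_vis_sq = (
        (e1 + e2) ** 2
        - (px1 + px2) ** 2
        - (py1 + py2) ** 2
        - (pz1 + pz2) ** 2
    )
    return math.sqrt(max(m_vis_sq, 0.0) / (x1 * x2))


# Toy event built from two massless taus with momenta (60, 0, 20) and
# (-30, 40, -10) GeV, of which 60% and 50% is visible respectively.
e_vis1 = math.sqrt(36.0**2 + 12.0**2)
e_vis2 = math.sqrt(15.0**2 + 20.0**2 + 5.0**2)
mass = collinear_ditau_mass(
    (36.0, 0.0, 12.0, e_vis1), (-15.0, 20.0, -5.0, e_vis2), 9.0, 20.0
)
```

In the toy event the missing transverse energy is exactly the sum of the two neutrino momenta, so the function recovers the invariant mass of the original tau pair; with detector smearing this would no longer hold exactly.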
Historically, the collinear approximation has relied upon using the charged decay products of the
tau leptons, be it either 1-prong or 3-prong decays. However, the decay products may also include a
neutral pion. Recently, tau substructure algorithms have become available that allow for reconstruction
of the entire visible (charged + neutral) tau [5]. One of my first studies was on the marked improvement
in the collinear approximation as a result of using the entire visible tau.
Figure 7 The collinear approximation using the charged tau leptons (left) and the full visible tau leptons (right).
The blue histograms represent VBF and red represents combined backgrounds scaled appropriately. All
distributions normalized to unity, and units are in GeV.
As you can see, there is a remarkable improvement using tau substructure techniques to
reconstruct the visible tau. In future studies, I suggest applying smearing of the transverse momentum
or otherwise modelling imprecision in the detector to see if the collinear approximation remains as
robust as it is in this truth study. Needless to say, this variable made it to the final MVA.
2.6.2 Tau Centrality Product
In the context of VBF topology, centrality has been used as a flag indicating whether or not a tau
lepton is centrally located in the detector with respect to the jets. Explicitly, a tau lepton is central if
its pseudorapidity lies in the range spanned by the leading and subleading jet. To generalize this
binary variable to a continuous variable, which is more powerful in multivariate analyses, the
following definition has been suggested [6].
$C_\tau := \exp\left[-\left(\frac{\eta_\tau - \eta_{avg}}{\Delta\eta}\right)^2\right]$, where $\eta_{avg} := \frac{\eta_{lead} + \eta_{sublead}}{2}$, $\Delta\eta := \eta_{lead} - \eta_{sublead}$
A perfectly central tau lepton (with exactly the average η of the jets) will have a centrality of one,
while a tau lepton far from the average η of the jets will have centrality close to zero. Note that if the
jets are not well separated in η, the centrality also approaches zero.
The authors of this continuous centrality variable used the centrality of the two taus as independent
variables. However, I found the two variables to have an 88% positive correlation for VBF. By taking
the product of the two tau centralities, a single uncorrelated variable is achieved with greater
separation power than either of the individual centralities.
$C_{prod} := C_{\tau_1} \cdot C_{\tau_2} = \exp\left[-\left(\frac{\eta_{\tau_1} - \eta_{avg}}{\Delta\eta}\right)^2 - \left(\frac{\eta_{\tau_2} - \eta_{avg}}{\Delta\eta}\right)^2\right]$
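The continuous centrality and its product are simple to evaluate per event. The following sketch implements the two definitions directly; the function names are hypothetical, and returning zero for jets with identical pseudorapidity is a convention chosen here to match the limiting behaviour described in the text.

```python
import math


def tau_centrality(eta_tau, eta_lead, eta_sublead):
    """Continuous centrality of one tau with respect to the two leading jets."""
    eta_avg = 0.5 * (eta_lead + eta_sublead)
    delta_eta = eta_lead - eta_sublead
    if delta_eta == 0.0:
        return 0.0  # convention (assumption): unseparated jets give zero
    return math.exp(-(((eta_tau - eta_avg) / delta_eta) ** 2))


def centrality_product(eta_tau1, eta_tau2, eta_lead, eta_sublead):
    """Product of the two tau centralities, used as a single MVA input."""
    return (tau_centrality(eta_tau1, eta_lead, eta_sublead)
            * tau_centrality(eta_tau2, eta_lead, eta_sublead))


central = tau_centrality(0.0, 2.0, -2.0)            # tau exactly midway
product = centrality_product(1.0, -1.0, 2.0, -2.0)  # both taus between the jets
```

Because both centralities lie between zero and one, so does their product, and it is close to one only when both taus sit near the midpoint of well-separated jets.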
[Figure 7 panels: Collinear Approximation Ditau Mass (Charged) and Collinear Approximation Ditau Mass (Visible); horizontal axes from 0 to 160 GeV, vertical axes: Events (normalized).]
Figure 8 The centrality of the individual tau leptons (left and centre) vs. the product of tau centrality (right).
Given the redundancy of the correlated variables and increased separation power of the product
variable, it was the centrality product variable that made it to the final multivariate analysis.
2.6.3 𝜼 Variables
Variables explicitly related to the pseudorapidity of the leading and subleading jets are common in
analyses of the VBF Higgs, including the cut based analysis already presented. On the surface, these
variables seem well suited to multivariate analysis as well given their separation power. However, I
found that these traditional VBF variables are highly correlated with the invariant mass of the jets.
Figure 9 Δη (centre) and $\eta_{lead} \cdot \eta_{sublead}$ (right) of the leading and subleading jets, along with their correlations to
the invariant mass of the jets (left).
Given the strong correlations within this group of variables, I was not surprised to find that
eliminating Δη and $\eta_{lead} \cdot \eta_{sublead}$ from the MVA led to no decrease in performance of the BDT. The
invariant mass of the jets displayed the greatest separation power (see figure 11); thus, despite their
prevalence in traditional VBF studies, I have chosen to exclude Δη and $\eta_{lead} \cdot \eta_{sublead}$ from the final
analysis.
2.6.4 Tau-Jet Angular Correlations
The Higgs boson is a spin 0 particle; Z bosons are spin 1 particles. My Ph.D. supervisor and I were
interested in whether or not this difference in spin quantum number manifests itself in angular
correlations between the tau leptons themselves or between tau leptons and the leading and
subleading jet. A number of variables were investigated, boosted into different reference frames,
probing any angular correlations.
[Figure 8 panels: Tau 0 Centrality, Tau 1 Centrality, and Tau Centrality Product. Figure 9 panels: Jets dEta and Jets Eta Product. Vertical axes: Events (normalized).]
[Plots: Jets Plane / Taus Plane Angle and Jets Plane Eta; vertical axes: Events (normalized).]
Selected Angular Variables
Taus ΔR: The ΔR separation of the two tau leptons.
Taus φ Centrality: The same as the continuous tau centrality variable, but in φ instead of η.
Jets-Taus Plane Angle: Total angle between the two planes formed by the tau leptons and the jets.
Jets Plane η: η of the normal vector to the plane formed by the two jets.
The angular relationships amongst the tau leptons and jets, beyond the expected VBF jet topology,
seem to be subtle if they exist at all. While the ΔR of the taus above shows modest separation, its inclusion
in the MVA yielded no improvement, and unfortunately the angle between the tau plane and jet plane
seems indistinguishable between VBF and background. Boosting to various centre of mass reference
frames generally had little effect on separation power.
2.6.5 Fox-Wolfram Moments
The Fox-Wolfram moments are a set of event descriptors that are currently under investigation for
use in replacing traditional cuts with these more advanced metrics [7]. The moments arise from
superpositions of spherical harmonics, defined as follows.
[Equation not recovered: definition of the Fox-Wolfram moments and of the weights $W_{i,j}^{x}$.]
Above, the sum goes over any number of objects in the event (such as the leading and subleading
jet for the VBF topology), $\Omega_{i,j}$ corresponds to the total angle between the i’th and j’th objects, and $P_l$
are the Legendre polynomials. The weight term $W_{i,j}^{x}$
may take many forms, as given above.
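As an illustration of the ingredients named above (a double sum over objects, the angle between each pair, the Legendre polynomials, and a weight), the sketch below evaluates moments of the commonly used form $H_l = \sum_{i,j} W_{i,j}\, P_l(\cos\Omega_{i,j})$. The two weight choices shown, a unit weight $1/N^2$ and a transverse momentum weight $p_{T,i}\,p_{T,j}/(\sum p_T)^2$, and their normalisations are assumptions made for this example and may differ from the definitions used in the study.

```python
import math


def legendre(l, x):
    """Legendre polynomial P_l(x), built with the three-term recurrence."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, x
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def fox_wolfram_moment(momenta, l, weight="unit"):
    """Moment H_l for a list of (px, py, pz) momenta of non-zero magnitude.

    weight="unit" uses 1/N^2; weight="pt" uses pT_i * pT_j / (sum pT)^2.
    Both normalisations are assumptions made for this sketch.
    """
    n = len(momenta)
    pts = [math.hypot(px, py) for px, py, _ in momenta]
    sum_pt = sum(pts)
    total = 0.0
    for i, a in enumerate(momenta):
        for j, b in enumerate(momenta):
            norm = math.sqrt(sum(c * c for c in a)) * math.sqrt(sum(c * c for c in b))
            cos_omega = sum(u * v for u, v in zip(a, b)) / norm
            cos_omega = max(-1.0, min(1.0, cos_omega))  # guard against rounding
            if weight == "unit":
                w = 1.0 / (n * n)
            else:
                w = pts[i] * pts[j] / sum_pt**2
            total += w * legendre(l, cos_omega)
    return total


# Two back-to-back jets: even moments equal one, odd moments vanish.
jets = [(30.0, 0.0, 40.0), (-30.0, 0.0, -40.0)]
moments = [fox_wolfram_moment(jets, l) for l in range(4)]
```

With either normalisation the zeroth moment is one by construction, so the discriminating information sits in the higher moments.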
A preliminary study of the Fox-Wolfram moments in the analysis of VBF has shown that the
moments display considerable separation power; however, including them in the multivariate analysis
has not improved the classification efficiency. Included below are plots of two sets of Fox-Wolfram
moments. On the left, only the leading and subleading jets were considered, and the best weight was
found to be the unit weight. On the right, both tau leptons are also included as objects in the moment
calculations, for which the transverse momentum weighting scheme was found to be best.
[Plots: Tau 1 Phi Centrality and Taus dR; vertical axes: Events (normalized).]
[Plots: Correlation Matrix (signal) and Correlation Matrix (background), linear correlation coefficients in %, for the variables ditauMass, mjj, sumPT, PTsum, and tausCentrality.]
Figure 10 The first four Fox-Wolfram moments considering only jets, with a unit weighting (left). The first four
Fox-Wolfram moments considering jets and tau leptons, with transverse momentum weight (right).
While only the first four moments are displayed here for brevity, the odd and even moments
are highly correlated though distinct. Unfortunately, my time has run short to fully investigate
the Fox-Wolfram moments as potentially useful discriminating variables in the multivariate
analysis. For future studies, I would suggest exploring the “modified” Fox-Wolfram moments,
which are invariant to Lorentz boosts, and any correlations that may exist between the
moments and the MVA variables already in use.
2.6.6 MVA Variables
The final list of variables for use in the multivariate analysis was pruned down starting with roughly
ten variables that showed the strongest separation power. After identifying correlations and removing
variables that led to no improvement in classification efficiency, the following variables remain in the
final analysis.
$m_{\tau_1+\tau_2}$: The invariant mass of the ditau, reconstructed via the collinear approximation using
the full visible tau leptons.
$m_{j_{lead}+j_{sublead}}$: The invariant mass of the leading and subleading jets.
$C_{\tau_1} \cdot C_{\tau_2}$: The product of the centrality of the two tau leptons.
$p_T^{lead} + p_T^{sublead}$: The scalar sum of the transverse momenta of the leading and subleading jets.
$p_T^{lead+sublead}$: The transverse momentum of the vector sum of the leading and subleading jets.
Figure 11 Discriminatory variables for the multivariate analysis. The blue histograms represent VBF and red
represents combined backgrounds scaled appropriately. All distributions normalized to unity, masses and
momenta are in units of GeV.
2.7 TMVA Multivariate Analysis
This multivariate analysis was performed at a centre-of-mass energy of $\sqrt{s} = 13$ TeV and at an
integrated luminosity of 20 inverse femtobarns, corresponding roughly to current Run II conditions at
the LHC. The ROOT analysis framework (or my preference, the Python adaptation PyROOT) provides
a toolkit for multivariate analysis known as TMVA [8]. This toolkit was utilized to train a boosted
decision tree using the discriminant variables presented in section 2.6.6. I was interested in comparing
the performance of TMVA with the well-known Python machine learning library Scikit Learn. To this
end, a boosted decision tree was optimized in TMVA and compared with an identically parameterized
boosted decision tree trained in Scikit Learn.
Optimization of the BDT parameters in TMVA was
carried out by scanning one parameter at a time,
such as the number of trees or the tree depth. A full multivariate
sweep over parameter settings and variables was simply
too computationally expensive and out of the scope of this
project. Should one like to take this analysis to the next
step, I would recommend performing such a multivariate
sweep over BDT parameter settings. The final
values of the BDT parameters that were found to be
important are given to the left.
When training and testing any multivariate method, one must be careful to weight the training data
correctly; while we have a similar amount of training data for both the VBF and background processes,
in reality the number of background events is much larger than the number of signal events. Thus a
weight needs to be applied to events from each process to correct for their relative abundance.
$W = \frac{N_{exp.}}{\mathcal{L}_{MC}\,\sigma} = \frac{\mathcal{L}_{exp.}\,\sigma}{\mathcal{L}_{MC}\,\sigma}$
where $\mathcal{L}_{MC}\,\sigma$ is provided by the Monte Carlo sample.
Cross sections were determined for each Monte Carlo sample from the TWiki cross section
summaries of the MC15 samples for Run II analyses. Given these cross sections and an integrated
luminosity of 20 inverse femtobarns, the expected number of events may be calculated. Additionally,
the percentage of events that pass the preselection criteria presented earlier may be calculated per
sample, and then applied to determine the expected number of events after preselection.
Process     Cross Section [pb]          Events at $\mathcal{L} = 20\ \mathrm{fb}^{-1}$     Events (Preselected)
VBF         $9.993941 \times 10^{-2}$     1,999                     398
Z → ττ      $1.950632 \times 10^{3}$      39,012,642                1,148,098
$t\bar{t}$  $4.515915 \times 10^{2}$      9,031,830                 26,394
As was expected from eliminating b-tagged jets, the $t\bar{t}$ background is more than decimated, leaving
Z → ττ as the main background. Roughly speaking, the signal to combined background ratio is a
staggering 1:3000!
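The expected yields and the per-event weights follow from the cross section and the luminosity alone. The sketch below reproduces the arithmetic using the cross sections quoted in the table; the function names and the number of generated events passed to `event_weight` are hypothetical, since the sizes of the Monte Carlo samples are not given here.

```python
def expected_events(cross_section_pb, luminosity_fb):
    """Expected yield N = sigma * L, using 1 fb^-1 = 1000 pb^-1."""
    return cross_section_pb * luminosity_fb * 1000.0


def event_weight(cross_section_pb, luminosity_fb, n_generated):
    """Per-event weight: expected yield over the number of generated MC events."""
    return expected_events(cross_section_pb, luminosity_fb) / n_generated


LUMINOSITY_FB = 20.0
cross_sections_pb = {
    "VBF": 9.993941e-2,
    "Z->tautau": 1.950632e3,
    "ttbar": 4.515915e2,
}
yields = {
    name: expected_events(xsec, LUMINOSITY_FB)
    for name, xsec in cross_sections_pb.items()
}

# Hypothetical sample size, for illustration only.
example_weight = event_weight(cross_sections_pb["VBF"], LUMINOSITY_FB, 100000)

# Signal to combined background ratio after preselection, from the table.
background_per_signal = (1148098 + 26394) / 398
```

The final line gives roughly 2950 background events per signal event, which is the origin of the 1:3000 ratio quoted above.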
The metric for defining the optimal cut value of the classifier is the statistical significance defined
as follows, where “s” is the number of signal events and “b” the number of background events. For a
Poisson random variable, the standard deviation is the square root of the expected number of
events, $\sqrt{s + b}$. The following statistical significance therefore measures the number of signal
events in units of one standard deviation.
$\text{Statistical Significance} := \frac{s}{\sqrt{s + b}} \approx \frac{s}{\sqrt{b}}$ for $b \gg s$
Thus, this definition of the statistical significance can either be interpreted as the number of signal
events relative to one standard deviation or, if b is much larger than s, as is usual, the number of signal
events over the background fluctuation level.
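Choosing the optimal cut then amounts to scanning the classifier output and evaluating this significance for the weighted events that survive each candidate cut. The sketch below shows one simple way to do this; the function names and the toy scores and weights are assumptions for illustration, and TMVA performs its own scan internally.

```python
import math


def significance(s, b):
    """Statistical significance s / sqrt(s + b)."""
    if s + b <= 0.0:
        return 0.0
    return s / math.sqrt(s + b)


def best_cut(sig_scores, sig_weights, bkg_scores, bkg_weights):
    """Scan cut values on the classifier output; keep events with score >= cut.

    Returns (cut value, significance) for the cut maximising s / sqrt(s + b).
    """
    best_value, best_significance = None, -1.0
    for cut in sorted(set(sig_scores) | set(bkg_scores)):
        s = sum(w for x, w in zip(sig_scores, sig_weights) if x >= cut)
        b = sum(w for x, w in zip(bkg_scores, bkg_weights) if x >= cut)
        z = significance(s, b)
        if z > best_significance:
            best_value, best_significance = cut, z
    return best_value, best_significance


# Toy classifier outputs with per-event weights (illustrative values only).
cut, z = best_cut(
    sig_scores=[0.9, 0.8, 0.2], sig_weights=[1.0, 1.0, 1.0],
    bkg_scores=[0.1, 0.3, 0.85], bkg_weights=[10.0, 10.0, 1.0],
)

# Preselected yields from the table, before any cut on the classifier.
no_cut = significance(398.0, 1174098.0)
```

Applied to the preselected yields without any classifier cut, the same formula gives a significance of only about 0.37, which illustrates how much of the final sensitivity comes from the cut on the BDT output.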
The TMVA output classifier along with the optimal cut value after training a boosted decision tree
using the parameters given above is shown below.
[Figure: TMVA overtraining check for classifier BDT. Normalised distributions (1/N) dN/dx of the BDT response for signal and background, test and training samples. Kolmogorov-Smirnov test: signal (background) probability = 0.008 (0.016).]
[Figure: Cut efficiencies and optimal cut value. Signal efficiency, background efficiency, signal purity, signal efficiency × purity, and significance S/√(S+B) versus the cut value applied on the BDT output. For 398 signal and 1174098 background events, the maximum S/√(S+B) is 7.9024 when cutting at 0.1453.]
The final statistical significance of the classifier reaches 7.9, although the significance curve becomes
noisy, most likely due to statistical fluctuations arising from the heavily weighted background events. Even
under a conservative reading of the curve, the statistical significance is roughly 6 at minimum. The full interpretation
of the outcome will be discussed in the conclusion.
2.8 Scikit Learn Multivariate Analysis
Scikit Learn (SKL) is a free, general-purpose machine learning library for Python [9]. Given its popularity
and ease of use, I was interested to see how SKL compares to TMVA in terms of final classifier
efficiency, ease of use, and configurability.
SKL supports all of the machine learning methods implemented by TMVA and many more, and in
the case of boosted decision trees supports many of the same configuration options. However,
unlike TMVA, SKL does not directly provide the user with plots (classifier output distributions,
optimum cuts, correlation matrices) via a nice GUI. Code had to be written to randomize the training and
test samples, to view the classifier output distribution, to calculate the maximum statistical
significance, and to perform other tasks.
For a direct comparison of TMVA and SKL, a boosted decision tree was trained in SKL with
the same parameters as were used for TMVA. The resulting classifier output is given below.
[Figure: SKL BDT classifier output for signal and background, with the statistical significance overlaid as a green line (not to scale). Max. statistical significance: 3.5]
SKL performed worse in several regards. As the shape of the classifier output distributions shows, signal and
background overlap far more than for TMVA, even with identical training settings. As a result, the maximum
statistical significance (the green line, not to scale) is roughly only half of that achieved by TMVA.
Additionally, SKL took almost five times longer to train the BDT.
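The SKL workflow described in this section (shuffling and splitting the samples, training an AdaBoost forest of shallow trees, and reading off the classifier output) can be sketched as follows. This is not the code used for the study: the toy Gaussian data stand in for the real discriminant variables, and the hyperparameters (tree depth, number of trees, learning rate) are placeholders rather than the TMVA settings.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Toy stand-in for five discriminant variables: two shifted Gaussian blobs.
n = 2000
X = np.vstack([
    rng.normal(1.0, 1.0, size=(n, 5)),  # "signal"
    rng.normal(0.0, 1.0, size=(n, 5)),  # "background"
])
y = np.concatenate([np.ones(n), np.zeros(n)])

# Shuffle and split into equally sized training and test samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0, stratify=y
)

# Forest of shallow decision trees boosted with AdaBoost.
# The weak learner is passed positionally so the call works across
# scikit-learn versions that name this argument differently.
bdt = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=3),
    n_estimators=100,
    learning_rate=0.5,
    random_state=0,
)
bdt.fit(X_train, y_train)

# Continuous classifier output, analogous to the BDT response in TMVA.
scores = bdt.decision_function(X_test)
accuracy = bdt.score(X_test, y_test)
print(f"test accuracy: {accuracy:.3f}")
```

In a real analysis, the per-process event weights would be passed to `fit` through its `sample_weight` argument, and the `scores` array would feed the significance scan.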
3. Conclusions
3.1 Outlook for VBF Higgs Analysis
Overall, the development of a multivariate analysis for the detection of a VBF Higgs boson
decaying to a pair of tau leptons with subsequent hadronic decays was quite successful. A theoretical
basis was developed to understand the signal process and main backgrounds at play. With only a few
basic preselection cuts, the vast majority of the tt̄ background was eliminated, leaving the Z→ττ process
as the main background. From knowledge of the underlying physics, a number of candidate
discriminant variables were explored for use in the multivariate analysis. Deserving of special attention
is the reconstructed ditau mass using the collinear approximation, which has shown very promising
improvements in mass resolution with the introduction of the tau substructure reconstruction algorithm.
Some of the variables typically associated with vector boson fusion, such as the distinctively large
separation in pseudorapidity of the leading and subleading jet, were found to be highly correlated and
did not make it into the final analysis. Both TMVA and Scikit Learn were used to train boosted decision
trees; TMVA provided faster results with better classification power, and a convenient interface for
producing plots. The final statistical significance of the VBF signal reached 7.9.
Many aspects of the study, including the final statistical significance, must be kept in context. First
and foremost, every part of this study was carried out purely on the truth level: no trigger level effects
were accounted for, no detector effects beyond simple preselection cuts on pseudorapidity ranges
were included, and no reconstruction level effects were considered. These effects may be important
and should be taken into account in further analyses. Additionally, every algorithm, in particular
the b-tagging, tau ID and tau substructure algorithms, has an associated efficiency. On truth level, these
efficiencies are not modelled, and they will further decrease performance on the reconstruction level.
Nevertheless, I hope that this multivariate analysis serves as a useful proof of concept for a full scale
multivariate analysis in which all of the above issues are addressed. Finally, I hope this study has
provided insight into the nature of the vector boson fusion production pathway of the Higgs and into
associated variables that may be used in the analysis.
3.2 Suggestions for Future Studies
The collinear approximation performed surprisingly, perhaps suspiciously, well once the entire
visible tau was used as opposed to only the charged tau decay products. It is possible that the collinear
approximation is in fact valid much of the time; however, I strongly suspect that
it will not work as well on reconstructed data. One way to study this within a truth-level study is
to “smear” the transverse momentum of all objects in the event (add zero-mean Gaussian noise)
to simulate reconstruction inaccuracy and observe how well the collinear approximation holds
up. Additionally, one could test explicitly how collinear the neutrinos are with their respective tau leptons
by studying the ∆R between the neutrino and tau on the truth level.
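Both suggestions are simple to prototype. The sketch below shows one possible form of the smearing and of the ∆R calculation; the function names and the choice of a noise width proportional to the transverse momentum are assumptions made for illustration, not a model of the actual ATLAS resolution.

```python
import math
import random


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(d_eta^2 + d_phi^2), with d_phi wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def smear_pt(pt, rel_resolution, rng):
    """Add zero-mean Gaussian noise of width rel_resolution * pt (assumed form)."""
    return max(0.0, pt + rng.gauss(0.0, rel_resolution * pt))


rng = random.Random(1)

# Smear a 50 GeV object many times with an assumed 10% resolution.
smeared = [smear_pt(50.0, 0.10, rng) for _ in range(10_000)]
mean_pt = sum(smeared) / len(smeared)
print(f"mean smeared pT: {mean_pt:.2f} GeV")

# Hypothetical tau and neutrino directions that straddle the phi = +/- pi boundary.
print(f"Delta R: {delta_r(0.0, 3.0, 0.0, -3.0):.4f}")
```

Wrapping ∆φ matters here: two objects at φ = 3.0 and φ = −3.0 are close in angle, not almost 2π apart.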
While there were over 750,000 Z→ττ events and over 6,000,000 tt̄ events in the Monte Carlo
samples, only about 40,000 background events in total survived the preselection cuts, and only half of those
events were used to train the boosted decision tree while the other half was used for testing. In
comparison, over 300,000 VBF events made it past preselection to the multivariate analysis stage.
Although the initial number of events is very large for the background processes, the BDT training could
have made use of far more. For further Monte Carlo studies, I would suggest increasing the
statistics, at least for the Z→ττ background, to a few million events to ensure that enough
events make it past preselection to the BDT training.
The Fox-Wolfram moments have shown promising separation power, and may be very powerful
given a correct tuning to the VBF topology. In this study, moments calculated using just the leading
and subleading jet were experimented with in addition to a few studies using both the jets and the two
tau leptons. Further analyses may explore different combinations of objects to use in the moments,
perhaps even a third jet or no jets at all, in addition to finding the optimal weighting term to use.
Additionally, there exist modified Fox-Wolfram moments that are invariant under Lorentz boosts, which
may provide clearer results. In any case, it will need to be demonstrated that the Fox-Wolfram moments
provide new information about the event that is not contained in the five variables presented for the
analysis in this study if they are to be useful in a multivariate analysis.
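For anyone picking up this suggestion, a Fox-Wolfram moment of order l is a weighted sum of Legendre polynomials of the opening angles between all pairs of chosen objects. The sketch below assumes a transverse-momentum weighting, W_ij = pT_i pT_j / (Σ pT)^2, which is one possible choice and not necessarily the weighting used in this study; the function names and the example jets are illustrative only.

```python
import math


def legendre(l, x):
    """Legendre polynomial P_l(x) via the Bonnet recurrence."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def momentum(pt, eta, phi):
    """Cartesian three-momentum from (pT, eta, phi)."""
    return (pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta))


def fox_wolfram(objects, l):
    """H_l = sum_ij W_ij P_l(cos Omega_ij), with assumed weights pT_i pT_j / (sum pT)^2.

    objects is a list of (pT, eta, phi) tuples with non-zero pT.
    """
    pt_sum = sum(pt for pt, _, _ in objects)
    vecs = [momentum(*obj) for obj in objects]
    total = 0.0
    for (pt_i, _, _), v_i in zip(objects, vecs):
        for (pt_j, _, _), v_j in zip(objects, vecs):
            dot = sum(a * b for a, b in zip(v_i, v_j))
            norm = math.sqrt(sum(a * a for a in v_i) * sum(b * b for b in v_j))
            # Clamp against rounding so the argument stays within [-1, 1].
            cos_omega = max(-1.0, min(1.0, dot / norm))
            total += pt_i * pt_j / pt_sum**2 * legendre(l, cos_omega)
    return total


# Two equally hard, exactly back-to-back jets in opposite hemispheres.
jets = [(50.0, 2.0, 0.0), (50.0, -2.0, math.pi)]
for order in range(4):
    print(f"H_{order} = {fox_wolfram(jets, order):.3f}")
```

Swapping the list of objects (jets only, jets plus taus, a third jet) or the weight expression is then a small change, which makes it easy to compare the combinations mentioned above.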
3.3 Thanks!
I can’t express my gratitude enough for the opportunity to study here in Göttingen for the
summer; it has been an eye-opening and truly enjoyable experience to live abroad and get a taste of
particle physics. To everyone within the institute, thank you for your kindness and help over the
summer; you’re all brilliant physicists and even better people. Finally, I have to thank my Ph.D.
student supervisor Antonio De Maria for organizing a great project for me to work on, for his help
whenever it was needed, and his fantastic taste in music.
References
[1] “Test of CP Invariance in vector-boson fusion production of the Higgs boson using the Optimal
Observable method in the ditau decay channel with the ATLAS detector”.
arXiv:1602.04516v1
[2] K.A. Olive et al. (Particle Data Group), Chin. Phys. C, 38, 090001 (2014).
[3] “Search for the bb̄ decay of the Standard Model Higgs boson in associated (W/Z)H
production with the ATLAS detector”. arXiv:1409.6212v2
[4] “Prospects for the Search for a Standard Model Higgs Boson in ATLAS using Vector Boson
Fusion”. arXiv:hep-ph/0402254v1
[5] “Reconstruction of hadronic decay products of tau leptons with the ATLAS experiment”.
arXiv:1512.05955
[6] “Evidence for the Higgs-boson Yukawa coupling to tau leptons with the ATLAS detector”.
arXiv:1501.04943
[7] “Fox-Wolfram Moments in Higgs Physics”. arXiv:1212.4436
[8] A. Hoecker, P. Speckmayer, J. Stelzer, J. Therhaag, E. von Toerne, and H. Voss, TMVA -
Toolkit for Multivariate Data Analysis, PoS ACAT 040 (2007), arXiv:physics/0703039
[9] Scikit-learn: Machine Learning in Python, Pedregosa et al., JMLR 12, pp. 2825-2830, 2011.

InternshipReport

  • 1. Multivariate Analysis of the Vector Boson Fusion Higgs Boson Brendan Marsh University of Missouri August 8, 2016 Ph.D. Student Supervisor: Antonio De Maria Supervisor: Prof. Dr. Arnulf Quadt Abstract A multivariate analysis is presented for the study of the vector boson fusion (VBF) Higgs boson decaying to a pair of tau leptons. While the VBF production mechanism of the Higgs is roughly an order of magnitude lower in cross section than the dominant gluon-gluon fusion mechanism, it is shown that VBF produces a distinctive signature that is well suited for detection by multivariate analyses. A number of discriminant variables are explored in addition to a direct comparison of different machine learning toolkits. Ultimately, a statistical significance of 7.9 is achieved for detection of the VBF Higgs boson in this truth level study.
  • 2. Multivariate Analysis of the Vector Boson Fusion Higgs Boson 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 . . . . . . . . . . . . . . . . . . . . . . . . . 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 . . . . . . . . . . . . . . . . . . . . . . . . . . 14 . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 Contents 1. Motivation and Background 1.1 The Higgs Boson 1.2 Vector Boson Fusion 1.3 Fully Hadronic Decay Mode 1.4 Background Processes 2. 
Multivariate Analysis 2.1 Monte Carlo Samples 2.2 Preselection Cuts 2.3 Cut Based Analysis 2.4 Decision Trees 2.5 Adaptive Boosting 2.6 Discriminant Variables 2.6.1 Collinear Approximation 2.6.2 Tau Centrality Product 2.6.3 πœ‚ Variables 2.6.4 Tau-Jet Angular Correlations 2.6.5 Fox-Wolfram Moments 2.6.6 MVA Variables 2.7 TMVA Multivariate Analysis 2.8 Scikit Learn Multivariate Analysis 3. Conclusions 3.1 Outlook for VBF Higgs Analysis 3.2 Suggestions for Future Studies 3.3 Thanks! References
  • 3. Multivariate Analysis of the Vector Boson Fusion Higgs Boson 2 1. Motivation and Background 1.1 The Higgs Boson Within the context of the Standard Model (SM), the Higgs mechanism is necessary for the mass generation of the W and Z gauge bosons. By invoking a break in electroweak symmetry, the Higgs mechanism implies the existence of a spin zero, neutral particle; we know this particle as the Higgs boson. For many years, the Higgs remained elusive in particle detectors. It was not until July 4, 2012 that CERN announced that both the CMS and ATLAS experiments at the large hadron collider (LHC) met the 5𝜎 discovery benchmark for a new boson with a mass of roughly 125 GeV that was consistent with a Higgs boson. It seems the Higgs has finally been found! Many studies of the Higgs boson are ongoing as Run II of the LHC is currently approaching an online integrated luminosity of 20 inverse femtobarns. As our studies of the Higgs progress, the vector boson fusion production mechanism becomes increasingly important as a detection pathway, in CP violation studies [1], and in other areas. 1.2 Vector Boson Fusion A standard model Higgs boson may be produced via one of four production mechanisms at the LHC. The vector boson fusion (VBF) mechanism involves the scattering of two quarks via the exchange of a W or Z (vector) boson. This pair of vector bosons then fuses to produce a low mass Higgs boson. Figure 2 Left: Feynman diagrams of the four Higgs production mechanisms at the LHC, with vector boson fusion highlighted in red. Right: Corresponding cross section for Higgs production mechanisms. One can see from the cross section that the gluon-gluon mechanism is roughly an order of magnitude greater than that of the VBF mechanism for a Higgs of mass 125 GeV [2]. However, the addition of the two quarks into the final state, visible as highly energetic jets, produces a Figure 1 The elementary particles of the Standard Model, labelled with their mass, charge, and spin.
  • 4. Multivariate Analysis of the Vector Boson Fusion Higgs Boson 3 distinctive signature that is lacking in gluon-gluon fusion. In terms of measurable quantities, VBF events may be recognized by the following characteristics: β€’ Highly πœ‚ separated jets β€’ Jets in opposite hemispheres β€’ High invariant mass of jets β€’ No central jets above a certain 𝑝$ 1.3 Fully Hadronic Decay Mode The 125 GeV Higgs boson most often decays into a 𝑏𝑏 pair, however this decay mode is not easily recovered in a sea of 𝑑𝑑 background [3]. The Higgs additionally may decay into a 𝜏( 𝜏) pair; this is the decay mode studied in this analysis. Specifically, I investigate the β€œfully hadronic” decay mode in which both tau leptons subsequently decay into a tau neutrino and a number of pions, which accounts for roughly 41% of the branching ratio[2]. A Feynman diagram of the signal process is given below. Figure 3 The Feynman diagram of the signal process of this study; a Higgs boson production via vector boson fusion with a subsequent decay into tau leptons, a tau neutrino, and pion. 1.4 Background Processes A bit like searching for a needle in a haystack, the VBF Higgs process is a rare event that is drowned out by background processes with similar event characteristics and much higher cross sections. To detect a small signal in a sea of background, one’s goal is to remove as much of the background as possible while retaining as many signal events as possible. Thus, it is equally as important to understand the background processes competing with your signal process as it is important to understand your signal process. The main background processes relevant to this study are the Zβ†’ 𝜏𝜏 and 𝑑𝑑 processes. Zβ†’ 𝜏𝜏 + 𝑗𝑒𝑑𝑠 According to the particle data group [2], the Z boson decays into a pair of tau leptons with a branching ratio of roughly 3.4%. As Z bosons are produced in excess at the LHC, this channel introduces a large background with the same final state, a pair of tau leptons. 
Fortunately, there do exist features of VBF that we expect to differ in the case of Zβ†’ 𝜏𝜏. Foremost, the invariant mass of the reconstructed taus should reflect the mass of the particle from which it came, although mass reconstruction can be difficult (section 2.6.1). For VBF taus we expect to see the mass of the Higgs, roughly 125 GeV, while for the Zβ†’ 𝜏𝜏 channel we expect a peak around 91 GeV. Additionally, the distinctive jet topology of VBF is not expected in the Zβ†’ 𝜏𝜏 channel. 𝑑𝑑 Top quarks almost always decay into W boson – b quark pairs, with the W boson then emitting a tau lepton. Thus, given two top quarks it is possible to have two taus in the final state. Therefore 𝑑𝑑 background, also produced in excess at the LHC, poses another background process. However, there exist a number of features of the 𝑑𝑑 background that make it quite easy to eliminate. Very often in the , ,
  • 5. Multivariate Analysis of the Vector Boson Fusion Higgs Boson 4 final state of the 𝑑𝑑 background there exist jets originating from b quarks, while this is rare for VBF final states. Fortunately, there exist β€œb-tagging” algorithms capable of labelling jets in the detector that most likely arise from b quarks. Thus, we may cut out events with b jets, leaving Zβ†’ 𝜏𝜏 as irreducible background. Additionally, we do not expect to find any correlations between the tau decay products and the missing transverse energy, unlike VBF in which they are heavily correlated. 2. Multivariate Analysis The basic goal of any multivariate analysis (MVA) is to classify signal events over background events, with as high of an efficiency as possible, given some input variables for each event. Most MVAs take a number of input variables and return a single measure of β€œsignal-likeness”, which must hit a certain threshold to be considered a signal event. Before diving into the multivariate techniques used for this analysis, the training samples used to develop and test the analysis will be described, along with the traditional cut based analysis for VBF and reasons why it can be improved using a multivariate analysis. 2.1 Monte Carlo Samples Monte Carlo simulations provide a powerful tool for studying stochastic processes. Here, Powheg and Pythia 8 Monte Carlo generators were used to simulate truth level events for both VBF and the relevant background processes at a centre of mass energy of 𝑠 = 13 TeV. Using these simulated events, one may train a multivariate analysis method to be applied to real data. The Monte Carlo samples used for this study are given below. It is important to note that this was truth level study only; no reconstruction or trigger level effects have been incorporated. These effects are non-negligible and should incorporated in future studies. 2.2 Preselection Cuts A number of cuts may be applied to the events before any classifier is used. 
Some of these cuts correspond to limitations of the ATLAS detector (corresponding to events that would not be well reconstructed in practice) while others are made specifically to remove background events. The preselection cuts used for this analysis are given below. If any event does not fulfill the criteria, it is discarded from the analysis. The transverse momentum of both tau leptons must be at least 20 GeV to be detected and reconstructed by tau reconstruction algorithms. The absolute value of πœ‚, the pseudorapidity, of each tau lepton must be less than 2.5 for good reconstruction in the tracker. The missing transverse energy should be greater than 20 GeV, as we expect missing energy from neutrinos in the final state. 𝜏678 p$ > 20 GeV |πœ‚;| < 2.5 MET > 20 GeV
  • 6. Multivariate Analysis of the Vector Boson Fusion Higgs Boson 5 The transverse momentum of the leading and subleading jet should be greater than 20 GeV to be detected. B-tagging algorithms can identify jets originating from b quarks, thus b- tagged jets can be cut to eliminate 𝑑𝑑 background. In truth level studies, one uses the PDG (Particle Data Group) ID to identify and cut b-jets. 2.3 Cut Based Analysis The most basic form of classifier, and the one that is often used due to its simplicity and physical motivation, is a simple cut based analysis. This entails requiring a candidate event to pass a series of univariate β€œcuts” which are motivated by knowledge of the signal process. The traditional cuts used to identify VBF events over background events are given below [4]. VBF produces highly energetic quark jets into the final state, we expect to see a leading jet with high transverse momentum. There are two quark jets into the final state, thus the subleading jet should also have high transverse momentum. The jets of VBF have characteristically high separation in pseudorapidity. The VBF topology exhibits jets that are back-to-back. The highly energetic jets show a high invariant mass. The tau leptons should be detected in the central part of the detector in comparison to the jets. Explicitly, the pseudorapidity of the taus should lie between the range spanned by the jets. The cut based analysis has its advantages; it is very simple to implement, requires no β€œtraining” like the multivariate methods, and the rationale for each of the cuts is grounded in physics. However, while it excels in its understandability, it often lacks the classification power required to recover rare processes like the VBF Higgs. The inferiority of the cut based analysis lies in the assumption that each variable can be cut upon independently of the others when, in fact, the best cut to make on one variable may depend on another, or even many others. 
That is, correlations cannot be accounted for. This issue is addressed by multivariate classification methods like decision trees. 2.4 Decision Trees Decision trees, like cut based analyses, split events into groups by setting a threshold on some variable. However, while the cut based analysis only makes a single round of cuts, decision trees continue to further subdivide groups, separating signal from background more and more at each step by making the most efficient cut possible. Additionally, the most efficient cuts are calculated algorithmically from a set of data used to β€œtrain” the decision tree. p$ <=>? > 40 GeV p$ 8@A<=>? > 30 GeV |πœ‚<=>? βˆ’ πœ‚8@A<=>?| > 3 πœ‚<=>? βˆ— πœ‚8@A<=>? < 0 π‘šEFGHI(EJKLFGHI > 300 GeV Jets-Taus Centrality p$ <=>? , p$ 8@A<=>? > 20 GeV No b-tagged jets
  • 7. Multivariate Analysis of the Vector Boson Fusion Higgs Boson 6 Figure 4 A simple decision tree. Here orange represents VBF events while blue represents background events. At each stage, groups become more purely signal or background by splitting on some variable. The metric that is normally minimized for each split is the Gini impurity of the current group of events. It is defined as the probability of incorrectly labelling a random event in the group based on the known distribution of signal and background within the group. For a binary classification problem, the Gini impurity for a group of events is given by the following formula: 𝐼O = 𝑛87Q βˆ— 1 βˆ’ πœ‚87Q + 𝑛AQ βˆ— 1 βˆ’ 𝑛AQ Unlike a cut based analysis, which can only form rectangular signal regions in the variable phase space, decision trees can be grown to approximate arbitrarily complex decision functions. However, decision trees, too, are not without their flaws. The intuition of a cut based analysis is lost since the splits are generated algorithmically. Additionally, it is very easy to grow a tree that is too deep that begins to train itself to recognize individual points in the training data, becoming artificially complex. This phenomenon is well known in the field of machine learning, and is commonly known as β€œovertraining”. To address this issue, a technique known as boosting is performed as opposed to older β€œpruning” methods which grow full decision trees then backtrack and discard unimportant splits. 2.5 Adaptive Boosting Adaptive boosting, or AdaBoost, is a general method that can be applied to a number of classifiers, such as decisions trees, to improve reliability, performance, and resistance to overtraining. In the context of adaptive boosting of decision trees, the single decision tree is replaced by a β€œforest” consisting of hundreds of decision trees which are restricted to only a few levels, such as the one above. 
As a whole, this forest of decision trees is called a boosted decision tree (BDT), and the output of the BDT is a weighted sum of the outputs of each individual tree. Each individual decision tree is called a β€œweak learner” in the sense that it is only one of many classifiers in the forest. Here is where the adaptive boosting comes in; each weak learner is trained iteratively to improve upon the previous one. The first weak learner is trained as a normal decision tree from the training data. However, the results of the first weak learner are then used to weight the importance of the training data for the next weak learner; points that were classified correctly receive small weights while incorrectly classified points receive large weights. In this way, the next weak learner is trained focusing on points that have not been classified well by the previous weak learner. This process continues such that each weak learner focuses on correcting mistakes of the last, improving at each step. The process is visualised below.
Figure 5 Training of an AdaBoost classifier. The first classifier trains on unweighted data, then reweights the data for the next, and so on to produce the final classifier.

2.6 Discriminant Variables

When training a BDT, a balance should be found between the number of input variables and the performance of the BDT. Additionally, while BDTs are known to handle correlated variables quite well, it is superfluous to include two strongly correlated variables when only one of them adds discriminatory power to the classification. Much of my work this summer was spent investigating variables, both common and newly devised, in search of new discriminating variables for use in a multivariate analysis. The most important in the analysis was the ditau mass, calculated via the collinear approximation.

2.6.1 Collinear Approximation

For VBF, the mass of the ditau system should correspond to the mass of the Higgs; for Z→ττ, to the mass of the Z boson; and for $t\bar{t}$ we expect no clear peak. Thus, there are good physical motivations for the use of the ditau mass in our MVA. However, in order to fully reconstruct the ditau system one needs the missing neutrinos. The collinear approximation accounts for the missing neutrinos by making the following assumptions.

1. The tau neutrinos are perfectly collinear with their associated tau lepton.
2. The missing transverse energy is entirely due to the tau neutrinos.

Under these assumptions, the magnitudes of the neutrino momenta are completely determined by the missing transverse energy. One is then left with the simple matter of constructing the neutrinos collinearly with the taus such that the sum of the neutrino momenta is precisely the missing transverse energy.

Figure 6 A view of the transverse plane depicting the collinear approximation. The tau neutrinos travel collinearly with the tau leptons such that their sum matches the missing transverse energy.
The collinear approximation is not always applicable: when the tau leptons are emitted back to back in the transverse (φ) plane, the missing transverse energy cannot be uniquely decomposed along the two tau directions. This leads to a simple constraint between the taus:

$$\cos\Delta\phi > -0.99$$
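The construction described above amounts to solving a 2×2 linear system for the two neutrino transverse momenta and then summing four-vectors. The following Python sketch illustrates this under stated assumptions: the function names and the (pt, eta, phi, mass) tuple layout are hypothetical, neutrinos are taken as massless, and rejecting negative neutrino momenta is an extra choice made here rather than something specified in this study.

```python
import math

def four_vector(pt, eta, phi, mass=0.0):
    """Return (E, px, py, pz) from collider coordinates."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)

def invariant_mass(vectors):
    """Invariant mass of the sum of (E, px, py, pz) four-vectors."""
    e, px, py, pz = (sum(v[i] for v in vectors) for i in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def collinear_mass(tau1, tau2, met_x, met_y, min_cos_dphi=-0.99):
    """Ditau mass in the collinear approximation, or None if not applicable.

    tau1, tau2: visible taus as (pt, eta, phi, mass).
    met_x, met_y: components of the missing transverse energy.
    """
    pt1, eta1, phi1, m1 = tau1
    pt2, eta2, phi2, m2 = tau2
    # Back-to-back taus: the decomposition of the MET is degenerate.
    if math.cos(phi1 - phi2) <= min_cos_dphi:
        return None
    det = math.sin(phi2 - phi1)
    if abs(det) < 1e-9:          # taus parallel in phi: also degenerate
        return None
    # Solve met = nu_pt1 * (cos phi1, sin phi1) + nu_pt2 * (cos phi2, sin phi2).
    nu_pt1 = (met_x * math.sin(phi2) - met_y * math.cos(phi2)) / det
    nu_pt2 = (met_y * math.cos(phi1) - met_x * math.sin(phi1)) / det
    if nu_pt1 < 0.0 or nu_pt2 < 0.0:
        return None              # unphysical solution (assumption: reject)
    vectors = [
        four_vector(pt1, eta1, phi1, m1),
        four_vector(pt2, eta2, phi2, m2),
        four_vector(nu_pt1, eta1, phi1),   # neutrino collinear with tau 1
        four_vector(nu_pt2, eta2, phi2),   # neutrino collinear with tau 2
    ]
    return invariant_mass(vectors)

# Toy event: the neutrinos carry 40% and 50% of the tau transverse momenta.
met_x = 24.0 * math.cos(0.3) + 25.0 * math.cos(2.0)
met_y = 24.0 * math.sin(0.3) + 25.0 * math.sin(2.0)
print(collinear_mass((36.0, 0.5, 0.3, 0.0), (25.0, -1.0, 2.0, 0.0), met_x, met_y))
```

In this idealized toy event the two assumptions hold exactly, so the reconstructed mass coincides with the mass of the full ditau system; with real detector resolution this would no longer be the case.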
Historically, the collinear approximation has relied upon using the charged decay products of the tau leptons, from either 1-prong or 3-prong decays. However, the decay products may also include a neutral pion. Recently, tau substructure algorithms have become available that allow for reconstruction of the entire visible (charged + neutral) tau [5]. One of my first studies was on the marked improvement in the collinear approximation as a result of using the entire visible tau.

Figure 7 The collinear approximation using the charged tau decay products (left) and the full visible tau leptons (right). The blue histograms represent VBF and red represents combined backgrounds scaled appropriately. All distributions are normalized to unity, and units are in GeV.

As you can see, there is a remarkable improvement when tau substructure techniques are used to reconstruct the visible tau. In future studies, I suggest applying smearing of the transverse momentum or otherwise modelling imprecision in the detector to see if the collinear approximation remains as robust as it is in this truth study. Needless to say, this variable made it to the final MVA.

2.6.2 Tau Centrality Product

In the context of VBF topology, centrality has been used as a flag indicating whether or not a tau lepton is centrally located in the detector with respect to the jets. Explicitly, a tau lepton is central if its pseudorapidity lies in the range spanned by the leading and subleading jet. To generalize this binary variable to a continuous variable, which is more powerful in multivariate analyses, the following definition has been suggested [6].

$$C_{\tau} := \exp\left[-\left(\frac{\eta_{\tau} - \eta_{avg}}{\Delta\eta}\right)^{2}\right]$$

where

$$\eta_{avg} := \frac{\eta_{lead} + \eta_{sublead}}{2}, \qquad \Delta\eta := \eta_{lead} - \eta_{sublead}$$
A perfectly central tau lepton (with exactly the average η of the jets) will have a centrality of one, while a tau lepton far from the average η of the jets will have a centrality close to zero. Note that if the jets are not well separated in η, the centrality also approaches zero. The authors of this continuous centrality variable used the centralities of the two taus as independent variables. However, I found the two variables to have an 88% positive correlation for VBF. By taking the product of the two tau centralities, the two correlated inputs are replaced by a single variable with greater separation power than either of the individual centralities.

$$C_{prod} := C_{\tau_0} \cdot C_{\tau_1} = \exp\left[-\left(\frac{\eta_{\tau_0} - \eta_{avg}}{\Delta\eta}\right)^{2} - \left(\frac{\eta_{\tau_1} - \eta_{avg}}{\Delta\eta}\right)^{2}\right]$$

[Figure 7 panels: Collinear Approximation Ditau Mass (Charged) and Collinear Approximation Ditau Mass (Visible); horizontal axes 0 to 160 GeV, vertical axes normalized events.]
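As a small illustration of the centrality definitions above, the following Python sketch evaluates the single-tau centrality and the centrality product. The function names are hypothetical, and returning zero when the two jets have identical η is a choice made here to follow the limiting behaviour described in the text.

```python
import math

def centrality(eta, eta_lead, eta_sublead):
    """Continuous centrality of an object at eta with respect to two jets."""
    delta_eta = eta_lead - eta_sublead
    if delta_eta == 0.0:
        # Jets not separated in eta: the centrality approaches zero.
        return 0.0
    eta_avg = 0.5 * (eta_lead + eta_sublead)
    return math.exp(-((eta - eta_avg) / delta_eta) ** 2)

def centrality_product(eta_tau0, eta_tau1, eta_lead, eta_sublead):
    """Product of the two tau centralities."""
    return (centrality(eta_tau0, eta_lead, eta_sublead)
            * centrality(eta_tau1, eta_lead, eta_sublead))

# Jets at eta = +2 and -2: a tau at eta = 0 is perfectly central.
print(centrality(0.0, 2.0, -2.0))                 # 1.0
print(centrality_product(0.0, 2.0, 2.0, -2.0))    # exp(-0.25)
```

Because each centrality lies between zero and one, the product is large only when both taus sit between the jets, which is the configuration expected for VBF.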
Figure 8 The centrality of the individual tau leptons (left and centre) vs. the product of the tau centralities (right).

Given the redundancy of the correlated variables and the increased separation power of the product variable, it was the centrality product that made it to the final multivariate analysis.

2.6.3 η Variables

Variables explicitly related to the pseudorapidity of the leading and subleading jets are common in analyses of the VBF Higgs, including the cut based analysis already presented. On the surface, these variables seem well suited to multivariate analysis as well, given their separation power. However, I found that these traditional VBF variables are highly correlated with the invariant mass of the jets.

Figure 9 Δη (centre) and $\eta_{lead} \cdot \eta_{sublead}$ (right) of the leading and subleading jets, along with their correlations to the invariant mass of the jets (left).

Given the strong correlations within this group of variables, I was not surprised to find that eliminating Δη and $\eta_{lead} \cdot \eta_{sublead}$ from the MVA led to no decrease in performance of the BDT. The invariant mass of the jets displayed the greatest separation power (see Figure 11); thus, despite their prevalence in traditional VBF studies, I have chosen to exclude Δη and $\eta_{lead} \cdot \eta_{sublead}$ from the final analysis.

2.6.4 Tau-Jet Angular Correlations

The Higgs boson is a spin 0 particle; Z bosons are spin 1 particles. My Ph.D. supervisor and I were interested in whether or not this difference in spin manifests itself in angular correlations between the tau leptons themselves or between the tau leptons and the leading and subleading jets. A number of variables were investigated, boosted into different reference frames, to probe any angular correlations.
[Figure 8 panels: Tau 0 Centrality, Tau 1 Centrality, Tau Centrality Product (horizontal axes 0 to 1). Figure 9 panels: Jets dEta, Jets Eta Product. Vertical axes: normalized events.]
[Figure panels: Jets Plane / Taus Plane Angle, Jets Plane Eta; vertical axes: normalized events.]

Selected Angular Variables

Taus ΔR: The ΔR separation of the two tau leptons.
Taus φ Centrality: The same as the continuous tau centrality variable, but in φ instead of η.
Jets-Taus Plane Angle: Total angle between the two planes formed by the tau leptons and the jets.
Jets Plane η: η of the normal vector to the plane formed by the two jets.

The angular relationships amongst the tau leptons and jets, beyond the expected VBF jet topology, seem to be subtle if they exist at all. While the ΔR of the taus shows modest separation, its inclusion in the MVA yielded no improvement, and unfortunately the angle between the tau plane and the jet plane seems indistinguishable between VBF and background. Boosting to various centre-of-mass reference frames generally had little effect on separation power.

2.6.5 Fox-Wolfram Moments

The Fox-Wolfram moments are a set of event descriptors that are currently under investigation as more advanced replacements for traditional cuts [7]. The moments arise from superpositions of spherical harmonics and are defined as follows.

$$H_l^x := \sum_{i,j} W_{i,j}^x \, P_l(\cos\Omega_{i,j})$$

Above, the sum runs over any number of objects in the event (such as the leading and subleading jets for the VBF topology), $\Omega_{i,j}$ is the total angle between the i’th and j’th objects, and $P_l$ are the Legendre polynomials. The weight term $W_{i,j}^x$ may take many forms, such as a unit weight or a transverse momentum weight. A preliminary study of the Fox-Wolfram moments in the analysis of VBF has shown that the moments display considerable separation power; however, including them in the multivariate analysis has not improved the classification efficiency. Included below are plots of two sets of Fox-Wolfram moments.
On the left, only the leading and subleading jets were considered, and the best weight was found to be the unit weight. On the right, both tau leptons are also included as objects in the moment calculations, for which the transverse momentum weighting scheme was found to be best.

[Figure panels: Tau 1 Phi Centrality (horizontal axis 0 to 1), Taus dR (horizontal axis 0.5 to 3.5); vertical axes: normalized events.]
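To make the moment calculation concrete, the sketch below evaluates Fox-Wolfram moments for a list of objects given as (pt, eta, phi). It is an illustration under assumptions rather than the code used in this study: the function names are hypothetical, the double sum includes the i = j terms, and both weights are normalized so that the zeroth moment equals one (unit weight 1/N², transverse momentum weight pT_i pT_j / (Σ pT)²), which is one common convention.

```python
import math

def legendre(l, x):
    """Legendre polynomial P_l(x) via the Bonnet recurrence."""
    if l == 0:
        return 1.0
    prev, curr = 1.0, x
    for n in range(1, l):
        prev, curr = curr, ((2 * n + 1) * x * curr - n * prev) / (n + 1)
    return curr

def momentum(pt, eta, phi):
    """Three-momentum (px, py, pz) from collider coordinates."""
    return (pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta))

def cos_angle(p, q):
    """Cosine of the total angle between two three-vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return max(-1.0, min(1.0, dot / norm))

def fox_wolfram(objects, l, weight="unit"):
    """Moment H_l for objects given as (pt, eta, phi); weight is 'unit' or 'pt'."""
    n = len(objects)
    pt_sum = sum(obj[0] for obj in objects)
    vectors = [momentum(*obj) for obj in objects]
    total = 0.0
    for i in range(n):
        for j in range(n):
            if weight == "unit":
                w = 1.0 / (n * n)
            elif weight == "pt":
                w = objects[i][0] * objects[j][0] / (pt_sum ** 2)
            else:
                raise ValueError("unknown weight: " + weight)
            total += w * legendre(l, cos_angle(vectors[i], vectors[j]))
    return total

# Two exactly back-to-back jets: odd moments vanish, even moments equal one.
jets = [(50.0, 1.0, 0.0), (50.0, -1.0, math.pi)]
print([round(fox_wolfram(jets, l), 6) for l in range(5)])
```

For two exactly back-to-back objects of equal weight the odd moments vanish and the even moments equal one, which gives a quick sanity check of any implementation.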
Figure 10 The first four Fox-Wolfram moments considering only jets, with a unit weighting (left). The first four Fox-Wolfram moments considering jets and tau leptons, with transverse momentum weight (right).

While only the first four moments are displayed here for brevity, the odd and even moments are highly correlated though distinct. Unfortunately, my time ran short before I could fully investigate the Fox-Wolfram moments as potentially useful discriminating variables in the multivariate analysis. For future studies, I would suggest exploring the “modified” Fox-Wolfram moments, which are invariant under Lorentz boosts, and exploring any correlations that may exist between the moments and the MVA variables already in use.

2.6.6 MVA Variables

The final list of variables for use in the multivariate analysis was pruned down from roughly ten variables that showed the strongest separation power. After identifying correlations and removing variables that led to no improvement in classification efficiency, the following variables remain in the final analysis.

$m_{\tau_0 + \tau_1}$: The invariant mass of the ditau, reconstructed via the collinear approximation using the full visible tau leptons.
$m_{j_{lead} + j_{sublead}}$: The invariant mass of the leading and subleading jets.
$C_{\tau_0} \cdot C_{\tau_1}$: The product of the centrality of the two tau leptons.
$p_T^{lead} + p_T^{sublead}$: The scalar sum of the transverse momenta of the leading and subleading jets.
$p_T^{lead + sublead}$: The transverse momentum of the vector sum of the leading and subleading jets.

[Figure: Correlation matrices for signal and background between the variables ditauMass, mjj, sumPT, PTsum and tausCentrality; linear correlation coefficients in %.]
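The three jet-based inputs in the list above follow directly from the jet four-vectors. The Python sketch below shows one way to compute them; the function names, the (pt, eta, phi, mass) tuple layout and the dictionary keys are hypothetical and chosen only for illustration.

```python
import math

def four_vector(pt, eta, phi, mass=0.0):
    """Return (E, px, py, pz) from collider coordinates."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)

def jet_variables(lead, sublead):
    """Jet-based MVA inputs for two jets given as (pt, eta, phi, mass)."""
    v1, v2 = four_vector(*lead), four_vector(*sublead)
    e, px, py, pz = (a + b for a, b in zip(v1, v2))
    m2 = e * e - px * px - py * py - pz * pz
    return {
        "dijet_mass": math.sqrt(max(m2, 0.0)),   # invariant mass of the pair
        "scalar_sum_pt": lead[0] + sublead[0],   # pT(lead) + pT(sublead)
        "vector_sum_pt": math.hypot(px, py),     # pT of the summed four-vector
    }

# Two forward jets, back to back in phi, as is typical of the VBF topology.
print(jet_variables((100.0, 2.0, 0.0, 0.0), (100.0, -2.0, math.pi, 0.0)))
```

The example also shows why the two transverse momentum variables carry different information: for jets that balance each other in the transverse plane the scalar sum is large while the transverse momentum of the vector sum is close to zero.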
Figure 11 Discriminatory variables for the multivariate analysis. The blue histograms represent VBF and red represents combined backgrounds scaled appropriately. All distributions are normalized to unity; masses and momenta are in units of GeV.

2.7 TMVA Multivariate Analysis

This multivariate analysis was performed at a centre-of-mass energy of $\sqrt{s} = 13$ TeV and an integrated luminosity of 20 inverse femtobarns, corresponding roughly to current Run II conditions at the LHC. The ROOT analysis framework (or my preference, its Python interface PyROOT) provides a toolkit for multivariate analysis known as TMVA [8]. This toolkit was used to train a boosted decision tree using the discriminant variables presented in section 2.6.6. I was interested in comparing the performance of TMVA with the well-known Python machine learning library Scikit Learn. To this end, a boosted decision tree was optimized in TMVA and compared with an identically parameterized boosted decision tree trained in Scikit Learn.

Optimization of the BDT parameters in TMVA was carried out with single scans over parameters such as the number of trees or the tree depth. A full multivariate sweep over parameter settings and variables was simply too computationally expensive and outside the scope of this project. Should one like to take this analysis to the next step, I would recommend performing such a multivariate sweep over BDT parameter settings. The final configuration of the BDT parameters that were found to be important is given to the left.
When training and testing any multivariate method, one must be careful to weight the training data correctly; while we have a similar amount of training data for both the VBF and background processes, in reality the number of background events is much larger than the number of signal events. Thus a weight needs to be applied to events from each process to correct for their relative abundance.

$$W = \frac{N_{exp.}}{\mathcal{L}_{MC}\,\sigma} = \frac{\mathcal{L}_{exp.}\,\sigma}{\mathcal{L}_{MC}\,\sigma}$$

where $\mathcal{L}_{MC}\,\sigma$ is provided by the Monte Carlo sample. Cross sections were determined for each Monte Carlo sample from the TWiki cross section summaries of the MC15 samples for Run II analyses. Given these cross sections and an integrated luminosity of 20 inverse femtobarns, the expected number of events may be calculated. Additionally, the percentage of events that pass the preselection criteria presented earlier may be calculated per sample, and then applied to determine the expected number of events after preselection.

Process         Cross Section [pb]      Events at ℒ = 20 fb^-1      Events (Preselected)
VBF             9.993941 × 10^-2        1,999                       398
Z→ττ            1.950632 × 10^3         39,012,642                  1,148,098
$t\bar{t}$      4.515915 × 10^2         9,031,830                   26,394

As was expected from eliminating b-tagged jets, the $t\bar{t}$ background is more than decimated, leaving Z→ττ as the main background. Roughly speaking, the ratio of signal to combined background is a staggering 1:3000!

The metric for defining the optimal cut value of the classifier is the statistical significance defined as follows, where “s” is the number of signal events and “b” the number of background events. For a Poisson random variable, the standard deviation is the square root of the expected number of events, here $\sqrt{s + b}$. The following statistical significance then measures the number of signal events relative to one standard deviation.
$$\text{Statistical Significance} := \frac{s}{\sqrt{s + b}} \approx \frac{s}{\sqrt{b}} \quad \text{for } b \gg s$$

Thus, this definition of the statistical significance can either be interpreted as the number of signal events relative to one standard deviation or, if b is much larger than s, as is usual, the number of signal events over the background fluctuation level. The TMVA output classifier along with the optimal cut value after training a boosted decision tree using the parameters given above is shown below.

[Figure: TMVA overtraining check for classifier BDT. BDT response distributions for signal and background, test and training samples. Kolmogorov-Smirnov test: signal (background) probability = 0.008 (0.016).]

[Figure: Cut efficiencies and optimal cut value. Signal efficiency, background efficiency, signal purity, signal efficiency*purity and significance S/√(S+B) versus the cut value applied on the BDT output. For 398 signal and 1174098 background events the maximum S/√(S+B) is 7.9024 when cutting at 0.1453.]
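The yield and significance arithmetic of this section can be sketched as follows. The function names are hypothetical, the per-event weight is written as the expected yield divided by the number of Monte Carlo events (an interpretation of the weight formula), and the classifier scores in the usage example are invented toy numbers rather than output of this analysis.

```python
import math

def expected_events(cross_section_pb, luminosity_fb_inv):
    """Expected yield N = sigma * L, converting pb to fb (1 pb = 1000 fb)."""
    return cross_section_pb * 1000.0 * luminosity_fb_inv

def event_weight(n_expected, n_monte_carlo):
    """Per-event weight that scales a Monte Carlo sample to its expected yield."""
    return n_expected / n_monte_carlo

def significance(s, b):
    """Statistical significance s / sqrt(s + b)."""
    return s / math.sqrt(s + b) if s + b > 0 else 0.0

def best_cut(scores_sig, scores_bg, w_sig, w_bg, cuts):
    """Scan cut values on the classifier output and keep the most significant."""
    best = (0.0, None)
    for cut in cuts:
        s = w_sig * sum(1 for x in scores_sig if x > cut)
        b = w_bg * sum(1 for x in scores_bg if x > cut)
        z = significance(s, b)
        if z > best[0]:
            best = (z, cut)
    return best

# VBF yield at 20 inverse femtobarns, from the cross section in the table.
print(round(expected_events(9.993941e-2, 20.0)))      # 1999
# Significance after preselection only, before any cut on the BDT output.
print(round(significance(398, 1148098 + 26394), 2))   # 0.37
# Toy scan with invented scores and weights.
print(best_cut([0.9, 0.8, 0.2], [0.1, 0.3, 0.85], 1.0, 10.0, [0.0, 0.5]))
```

With the preselected yields from the table, the significance before any cut on the classifier is only about 0.37, which illustrates how much of the final significance comes from the cut on the BDT output.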
The final statistical significance of the classifier reaches 7.9, although the significance curve becomes noisy, most likely due to statistical fluctuations arising from such heavily weighted background events. By any interpretation, the statistical significance can be said to be roughly 6 at minimum. The full interpretation of the outcome is discussed in the conclusion.

2.8 Scikit Learn Multivariate Analysis

Scikit Learn (SKL) is a free, general purpose machine learning library for Python [9]. Given its popularity and ease of use, I was interested to see how SKL compares to TMVA in terms of final classifier efficiency, ease of use, and configurability. SKL supports all of the machine learning methods implemented by TMVA and many more, and in the case of boosted decision trees supports many of the same configuration options. However, unlike TMVA, SKL does not directly provide the user with plots (classifier output distributions, optimum cuts, correlation matrices) via a GUI. Code had to be written to randomize the training and test samples, to view the output classifier distribution, to calculate the maximum statistical significance, and for other tasks.

For a direct comparison of TMVA and SKL, a boosted decision tree was trained in SKL with the same parameters as were used for TMVA. The resulting output classifier is given below.

[Figure: SKL output classifier distributions for signal and background. Max. Statistical Significance: 3.5]

SKL performed worse in many regards. As can be seen from the shape of the output classifier distributions, there is much more overlap between signal and background even when trained identically to TMVA, leading to only roughly half the statistical significance achieved by TMVA (the significance is drawn as the green line, not to scale). Additionally, SKL took almost five times longer to train the BDT.

3. Conclusions

3.1 Outlook for VBF Higgs Analysis

Overall, the development of a multivariate analysis for the detection of a VBF Higgs boson decaying to a pair of tau leptons with subsequent hadronic decays was quite successful. A theoretical basis was developed to understand the signal process and the main backgrounds at play. With only a few basic preselection cuts, the vast majority of the $t\bar{t}$ background was eliminated, leaving the Z→ττ process as the main background. From knowledge of the underlying physics, a number of candidate discriminant variables were explored for use in the multivariate analysis. Deserving of special attention is the ditau mass reconstructed using the collinear approximation, which has shown very promising improvements in mass resolution with the introduction of tau substructure reconstruction algorithms. Some of the variables typically associated with vector boson fusion, such as the distinctively large separation in pseudorapidity of the leading and subleading jets, were found to be highly correlated and did not make it into the final analysis. Both TMVA and Scikit Learn were used to train boosted decision
trees; TMVA provided faster results with better classification power, and a convenient interface for producing plots. The final statistical significance of the VBF signal reached 7.9.

Many aspects of the study, including the final statistical significance, must be kept in context. First and foremost, all aspects of this study were calculated purely at truth level: no trigger level effects were accounted for, no detector effects beyond simple preselection cuts on pseudorapidity ranges were accounted for, and no reconstruction level effects were considered. These effects may be important and should be taken into account in further analyses. Additionally, every algorithm, in particular the b-tagging, tau ID and tau substructure algorithms, has an associated efficiency. At truth level, these efficiencies are not modelled, and they will further decrease performance at reconstruction level. Nevertheless, I hope that this multivariate analysis serves as a useful proof of concept for a full scale multivariate analysis in which all of the above issues are addressed. Finally, I hope this study has provided insight into the nature of the vector boson fusion production pathway of the Higgs and into associated variables that may be used in the analysis.

3.2 Suggestions for Future Studies

The collinear approximation performed surprisingly, perhaps suspiciously, well once the entire visible tau was used as opposed to the charged tau products. It is possible that the collinear approximation is in fact a valid approximation much of the time; however, I have strong suspicions that it will not work as well on reconstructed data. One way this could be studied, still within a truth study, is by “smearing” (adding zero mean Gaussian noise to) the transverse momentum of all objects in the event to simulate reconstruction inaccuracy, and then observing how well the collinear approximation holds up.
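The smearing suggested above could look like the following Python sketch. It is an assumption-laden illustration: the function names are hypothetical, the noise width is taken as a fixed fraction of each object's transverse momentum, and the 10% default is an arbitrary placeholder rather than a measured detector resolution.

```python
import random

def smear_pt(pt, resolution, rng):
    """Add zero mean Gaussian noise with width resolution * pt to pt."""
    return max(0.0, pt + rng.gauss(0.0, resolution * pt))

def smear_event(objects, resolution=0.1, seed=None):
    """Smear the pt of every (pt, eta, phi) object; eta and phi are unchanged."""
    rng = random.Random(seed)
    return [(smear_pt(pt, resolution, rng), eta, phi) for pt, eta, phi in objects]

# Toy event with two taus and two jets, smeared with a 10% relative resolution.
event = [(45.0, 0.3, 1.2), (38.0, -0.8, -2.0), (110.0, 2.5, 0.4), (70.0, -2.1, 2.9)]
print(smear_event(event, resolution=0.1, seed=42))
```

For the collinear approximation to be tested consistently, the missing transverse energy would need to be recomputed or smeared as well, since it enters the reconstruction directly.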
Additionally, one could explicitly test just how collinear the neutrinos are with their respective tau leptons by studying the ΔR between the neutrino and the tau at truth level.

While there were over 750,000 Z→ττ events and over 6,000,000 $t\bar{t}$ events in the Monte Carlo samples, only about 40,000 total background events survived the preselection cuts, and only half of those events were used to train the boosted decision tree while the other half was used for testing. In comparison, over 300,000 VBF events made it past preselection to the multivariate analysis stage. Although the initial number of events is very large for the background processes, I could have actually used far more while training the BDT. For further Monte Carlo studies, I would suggest increasing the statistics, at least for the Z→ττ background, to at least a couple of million events to ensure that enough events make it past preselection to the BDT training.

The Fox-Wolfram moments have shown promising separation power and may be very powerful given a correct tuning to the VBF topology. In this study, moments calculated using just the leading and subleading jets were experimented with, in addition to a few studies using both the jets and the two tau leptons. Further analyses may explore different combinations of objects to use in the moments, perhaps even a third jet or no jets at all, in addition to finding the optimal weighting term to use. Additionally, there exist modified Fox-Wolfram moments that are invariant under Lorentz boosts, which may provide clearer results. In any case, if they are to be useful in a multivariate analysis, it will need to be demonstrated that the Fox-Wolfram moments provide new information about the event that is not contained in the five variables presented for the analysis in this study.

3.3 Thanks!

I can’t express my gratitude enough for the opportunity to study here in Göttingen for the summer. It has been an eye opening and truly enjoyable experience to live abroad and get a taste of particle physics. To everyone within the institute, thank you for your kindness and help over the summer; you’re all brilliant physicists and even better people. Finally, I have to thank my Ph.D. student supervisor Antonio De Maria for organizing a great project for me to work on, for his help whenever it was needed, and for his fantastic taste in music.
References

[1] “Test of CP Invariance in vector-boson fusion production of the Higgs boson using the Optimal Observable method in the ditau decay channel with the ATLAS detector”. arXiv:1602.04516v1
[2] K.A. Olive et al. (Particle Data Group), Chin. Phys. C, 38, 090001 (2014).
[3] “Search for the $b\bar{b}$ decay of the Standard Model Higgs boson in associated (W/Z)H production with the ATLAS detector”. arXiv:1409.6212v2
[4] “Prospects for the Search for a Standard Model Higgs Boson in ATLAS using Vector Boson Fusion”. arXiv:hep-ph/0402254v1
[5] “Reconstruction of hadronic decay products of tau leptons with the ATLAS experiment”. arXiv:1512.05955
[6] “Evidence for the Higgs-boson Yukawa coupling to tau leptons with the ATLAS detector”. arXiv:1501.04943
[7] “Fox-Wolfram Moments in Higgs Physics”. arXiv:1212.4436
[8] A. Hoecker, P. Speckmayer, J. Stelzer, J. Therhaag, E. von Toerne, and H. Voss, TMVA - Toolkit for Multivariate Data Analysis, PoS ACAT 040 (2007), arXiv:physics/0703039
[9] Pedregosa et al., Scikit-learn: Machine Learning in Python, JMLR 12, pp. 2825-2830, 2011.