3. Introduction
• The physical detector causes the reconstructed jet transverse momentum pT to differ from that of the true particle-level jet
• Jet energy corrections adjust pT so that it agrees on average with the particle-level jet pT
• The standard corrections are determined from basic kinematic quantities of the jet
• Machine learning makes it possible to include more information and obtain better corrections
• This has been done successfully for b jets using a deep feed-forward neural network
• This study, however, is about generically applicable DNN-based corrections
CMS ML Forum 08.09.2021 Jet Energy Corrections with DNN Regression 3
4. Dataset
• QCD HT-binned samples, 2016 configuration
• /QCD_HT*_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/RunIISummer16MiniAODv3*/MINIAODSIM
• Custom ML JEC dataset by A. Popov (ULB)
• Forked and added SV angles for initial coordinates in ParticleNet
• 10M jets are used for the training set, 2M for the validation set, and 2M for the test set
5. Data distribution
• Same shape for all jet flavours
• Flat in (pT , η) at low pT
• Steeply falling in pT at high pT
• Proportions of b, c, uds, and g jets fixed as 1 : 1 : 2 : 2
Figure: Number of jets in bins of (pT^gen, |η^gen|), and number of jets per flavour (u, d, s, c, b, g, unknown).
6. Training features
• Event level
• pT , log pT , η, φ, ρ, mass, area
• multiplicity, pTD, σ², num pv
• Charged PF candidates
• pT , η, φ, ∆pT , ∆η, ∆φ
• dxy, dz, dxy significance, normalized χ2
• num hits, num pixel hits, lost hits
• particle id, pv association quality
• Neutral PF candidates
• pT , η, φ, ∆pT , ∆η, ∆φ
• particle id, hcal energy fraction
• Secondary vertices
• pT , η, φ, ∆pT , ∆η, ∆φ, mass
• flight distance, significance, num tracks
7. Feature engineering
• Create event-level features, multiplicity, pTD, and σ², that help with quark-gluon discrimination
Figure: Distributions of multiplicity, pTD, and σ² (fraction of jets/bin) for quark and gluon jets with 80 < pT^gen < 100 GeV, 0 < |η^gen| < 1.3.
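As an illustration, pTD can be computed directly from the constituent transverse momenta (multiplicity is simply the constituent count; σ², a width-type moment, is omitted here). This is a sketch of the standard definition, not code from the study:

```python
import numpy as np

def ptd(pt):
    """pTD discriminant: sqrt(sum pT^2) / sum pT over one jet's constituents.
    Approaches 1 for a single hard constituent and 1/sqrt(n) for n equal ones."""
    pt = np.asarray(pt, dtype=float)
    return np.sqrt(np.sum(pt**2)) / np.sum(pt)

# A quark-like jet (few hard constituents) gets a higher pTD than a
# gluon-like jet (many soft constituents), which drives the separation
quark_like = ptd([50.0, 5.0, 2.0])
gluon_like = ptd([10.0] * 20)
```

This ordering (quark-like above gluon-like) is what makes pTD useful for quark-gluon discrimination.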
8. Feature engineering
• Relative features for all constituents
• ∆pT,i = pT,i^pf / pT^jet
• ∆ηi = sgn(η^jet)(ηi^pf − η^jet)
• ∆φi = (φi^pf − φ^jet + π) mod 2π − π
• One hot encode categorical features
• particle id and primary vertex association quality
• e.g. neutral pid:
[1, 2, 22, 130] -> [
[1, 0, 0, 0], [0, 1, 0, 0],
[0, 0, 1, 0], [0, 0, 0, 1]
]
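The relative features and the one-hot encoding can be sketched directly from the definitions above (function names are illustrative, not from the study's code):

```python
import numpy as np

def delta_phi(phi_pf, phi_jet):
    # Wrap the azimuthal difference into [-pi, pi)
    return (phi_pf - phi_jet + np.pi) % (2 * np.pi) - np.pi

def relative_features(pt_pf, eta_pf, phi_pf, pt_jet, eta_jet, phi_jet):
    """Per-constituent features relative to the jet axis, as defined above."""
    d_pt = pt_pf / pt_jet
    d_eta = np.sign(eta_jet) * (eta_pf - eta_jet)
    d_phi = delta_phi(phi_pf, phi_jet)
    return d_pt, d_eta, d_phi

def one_hot(values, categories):
    """One-hot encode a categorical feature given its allowed values,
    e.g. neutral particle id with categories [1, 2, 22, 130]."""
    categories = list(categories)
    out = np.zeros((len(values), len(categories)))
    for row, v in zip(out, values):
        row[categories.index(v)] = 1.0
    return out
```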
9. Target and loss
• Regression target ŷ = log(pT^gen / pT)
• The correction factor is thus e^y, where y is the NN output
• MAE loss function L = (1/N) Σ_{i=1}^{N} |yi − ŷi| · I(|ŷi| < 1)
• The last factor rejects the 0.8% of jets where the target is far off
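A numpy sketch of this masked MAE (in training it would be implemented as a TensorFlow loss; note the normalization is over all N jets, per the formula above):

```python
import numpy as np

def masked_mae(y_pred, y_true, cut=1.0):
    """MAE with an indicator that zeroes out jets whose target |y_true|
    exceeds the cut, dropping the small fraction with a pathological target."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    mask = np.abs(y_true) < cut
    # Mean over ALL jets, with masked jets contributing zero
    return np.mean(np.abs(y_pred - y_true) * mask)
```

The correction to apply at inference is then simply `np.exp(y_pred)`.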
Figure: Distribution of the regression target log(pT^gen/pT) (fraction of jets/bin).
10. Choice of ML models
• For every jet there are global features as well as
constituents
• Jet constituents form a permutation invariant set
• Number of constituents varies from jet to jet
• Order doesn’t matter
• ⇒ Special treatment is required to use them in ML models
• Deep Sets and Dynamic Graph CNN are examples of NN
architectures allowing for unordered sets to be consumed
• They have been used for jet tagging in Energy Flow
Networks and ParticleNet respectively
• Modified versions of Deep Sets and ParticleNet are used
to include all available information, both global features
and constituents!
11. Deep Sets
• A 2020 JEC study with Deep Sets is used as the baseline
• Procedure
• An MLP F : xi → yi is applied to every constituent xi
• Weights of the MLP are shared among all constituents
• The learned representations yi are aggregated using a permutation-invariant operation
• Here the sum over all constituents, Σi yi, is chosen
• This is based on the theorem that any function G({xi}) invariant under permutations of its inputs can be represented in the form ρ(Σi F(xi)); the downstream MLP plays the role of ρ
• Concatenate with the global features and feed into an MLP
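The procedure can be illustrated with a minimal numpy sketch: random weights stand in for a trained shared MLP F, and the sum pooling makes the jet embedding independent of constituent order. This is an illustration of the principle, not the study's model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared per-constituent MLP F (one hidden ReLU layer here);
# the same weights are applied to every constituent
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)

def F(x):
    h = np.maximum(x @ W1 + b1, 0.0)  # ReLU
    return h @ W2 + b2

def deep_sets_embed(constituents):
    """Sum-pool the shared-MLP outputs: invariant to constituent order."""
    return F(constituents).sum(axis=0)

jet = rng.normal(size=(10, 4))          # 10 constituents, 4 features each
shuffled = jet[rng.permutation(10)]
# The embedding is identical for any ordering of the constituents
assert np.allclose(deep_sets_embed(jet), deep_sets_embed(shuffled))
```

In the actual network, this pooled embedding is concatenated with the global features before the final MLP.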
12. Deep Sets architecture
Figure: (a) Complete network: three Deep Sets blocks process the charged constituents (n = (64, 128, 256)), the neutral constituents (n = (64, 128, 256)), and the secondary vertices (n = (32, 64, 128)); their outputs are concatenated with the global features and fed through fully connected ReLU layers of 1024, 512, 256, 128, and 64 units, ending in a single output unit. (b) Deep Sets block: three Dense + BatchNorm + ReLU layers applied elementwise to the constituents, followed by a permutation-invariant aggregation.
13. ParticleNet
• Started from H. Qu’s Keras version of ParticleNet
• Edge convolution
• Begin with coordinates in pseudorapidity-azimuth space
• Find the k nearest neighboring particles of each particle using the coordinates
• "Edge features" are constructed from the constituent features using the k-nearest-neighbor indices
• Feed into a shared MLP to update each particle in the graph (in practice implemented with convolution layers)
• Perform a permutation-invariant aggregation; the mean is chosen, as in the ParticleNet paper
• Subsequent EdgeConv blocks use the learned feature vectors as coordinates (hence "dynamic")
• Concatenate with global features and feed into MLP
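The k-NN search and edge-feature construction can be sketched in numpy. The edge-feature definition used here, concatenating the central particle's features with the neighbor-minus-center differences, follows the DGCNN convention and is an assumption for illustration:

```python
import numpy as np

def knn_indices(coords, k):
    """Indices of the k nearest neighbors of each particle (excluding
    itself) in the given coordinate space, e.g. (eta, phi)."""
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)          # exclude self-pairing
    return np.argsort(d2, axis=1)[:, :k]

def edge_features(features, coords, k):
    """For each particle i and neighbor j, concatenate (x_i, x_j - x_i);
    result has shape (n_particles, k, 2 * n_features)."""
    idx = knn_indices(coords, k)
    center = np.repeat(features[:, None, :], k, axis=1)
    neighbors = features[idx]
    return np.concatenate([center, neighbors - center], axis=-1)

# Three particles in (eta, phi): the first two are close, the third far away
coords = np.array([[0.0, 0.0], [0.1, 0.0], [2.0, 2.0]])
feats = np.array([[1.0], [2.0], [3.0]])
ef = edge_features(feats, coords, k=1)    # nearest neighbor of each particle
```

The shared MLP would then act on each (i, j) edge, followed by the mean aggregation over the k neighbors.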
14. ParticleNet architecture
Figure: (a) Complete network: three EdgeConv stacks process the charged constituents and the neutral constituents (each with k = 16 and c = (64, 64, 64), (128, 128, 128), (256, 256, 256)) and the secondary vertices (k = 8 and c = (32, 32, 32), (64, 64, 64), (128, 128, 128)), each starting from its coordinates and ending in global average pooling; the pooled outputs are concatenated with the global features and fed through fully connected ReLU layers of 512, 256, 128, and 64 units, ending in a single output unit. (b) EdgeConv block: a k-NN search on the coordinates yields neighbor indices, edge features are built from the constituents, passed through three Dense + BatchNorm + ReLU layers, and aggregated.
15. Training
• Two models are trained
• Deep Sets with 1.47M parameters
• ParticleNet with 1.20M parameters
• Using TensorFlow 2.4.1
• MirroredStrategy on two Nvidia GeForce RTX 3090 cards
• Adam optimizer
• Batch size 1024
• Learning rate 2 × 10⁻³, reduced by a factor of 5 when the validation loss plateaus
• Regularization through early stopping callback
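The setup above maps onto a Keras configuration roughly as follows. This is a configuration sketch only: `build_model`, `train_ds`, and `val_ds` are placeholders for the actual model constructor and tf.data pipelines, and the callback arguments are assumptions matching the bullets (a plateau reduction by a factor of 5 means `factor=0.2`):

```python
import tensorflow as tf

# Configuration sketch; build_model, train_ds, and val_ds are placeholders
strategy = tf.distribute.MirroredStrategy()   # data-parallel over the two GPUs
with strategy.scope():
    model = build_model()
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=2e-3),
        loss="mae",  # in practice, the masked MAE from the loss slide
    )

callbacks = [
    # "reduced by a factor of 5 when validation loss plateaus"
    tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.2),
    # regularization through early stopping
    tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=10,
                                     restore_best_weights=True),
]
# Batch size 1024 is set when batching the tf.data pipeline itself
model.fit(train_ds, validation_data=val_ds, epochs=100, callbacks=callbacks)
```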
16. Effective data pipeline
Figure: Naive and parallel data handling in TensorFlow.
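The benefit of the parallel pipeline is that batch preparation overlaps with the training step instead of blocking it. A minimal stdlib sketch of the prefetch idea (in TensorFlow this is what `Dataset.prefetch` and `num_parallel_calls` provide):

```python
import queue
import threading

def prefetch(batches, buffer_size=2):
    """Consume `batches` in a background thread, keeping up to `buffer_size`
    items ready so that data preparation overlaps with the consumer
    (the idea behind tf.data's prefetch())."""
    q = queue.Queue(maxsize=buffer_size)
    done = object()                      # sentinel marking exhaustion

    def worker():
        for item in batches:
            q.put(item)
        q.put(done)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = q.get()
        if item is done:
            return
        yield item

# Toy usage: the "batches" here are just integers standing in for real data
result = list(prefetch(iter(range(5))))
```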
17. Loss
• Deep Sets
• min training loss 0.0784
• min validation loss 0.0792
• ParticleNet
• min training loss 0.0776
• min validation loss 0.0785
Figure: (a) Deep Sets loss; (b) ParticleNet loss.
19. All jet response
Figure: Median response and IQR/median for the response vs pT^gen for all jets, in 0 < |η^gen| < 2.5 and 2.5 < |η^gen| < 5, comparing Standard, Deep Sets, and ParticleNet corrections, with ratio panels.
20. uds jet response
Figure: Median response and IQR/median for the response vs pT^gen for uds jets, in 0 < |η^gen| < 2.5 and 2.5 < |η^gen| < 5 (Standard, Deep Sets, ParticleNet), with ratio panels.
21. Gluon jet response
Figure: Median response and IQR/median for the response vs pT^gen for gluon jets, in 0 < |η^gen| < 2.5 and 2.5 < |η^gen| < 5 (Standard, Deep Sets, ParticleNet), with ratio panels.
22. b jet response
Figure: Median response and IQR/median for the response vs pT^gen for b jets, in 0 < |η^gen| < 2.5 and 2.5 < |η^gen| < 5 (Standard, Deep Sets, ParticleNet), with ratio panels.
23. c jet response
Figure: Median response and IQR/median for the response vs pT^gen for c jets, in 0 < |η^gen| < 2.5 and 2.5 < |η^gen| < 5 (Standard, Deep Sets, ParticleNet), with ratio panels.
24. Flavour difference
Figure: Median response per flavour (g, u, d, s, c, b) for pT^gen > 30 GeV, in 0 < |η^gen| < 2.5 and 2.5 < |η^gen| < 5 (Standard, Deep Sets, ParticleNet).
25. Summary
• Improved pT resolution w.r.t. the standard corrections
• 10-15% for uds jets, 10% for b & c jets and around 8%
for g jets in the central region
• 10-20% for uds jets and 5-20% for the rest of the jets in
the forward region
• Reduced flavour differences
• Factor of 3 improvement in central region and 30% in
forward region
• ParticleNet vs Deep Sets
• The ParticleNet model has 270k fewer parameters
• Despite this, ParticleNet achieves slightly better resolution, especially for jets with higher pT
• ParticleNet also shows slightly smaller flavour differences in the response
• However, Deep Sets has fewer GPU-intensive operations and is faster to train
26. Extra material
27. Residual response
Figure: Residual response differences R_uds − R_b and R_uds − R_c vs pT^gen, in 0 < |η^gen| < 2.5 and 2.5 < |η^gen| < 5 (Standard, Deep Sets, ParticleNet).
28. Residual response
Figure: Residual response differences R_uds − R_g and R_b − R_g vs pT^gen, in 0 < |η^gen| < 2.5 and 2.5 < |η^gen| < 5 (Standard, Deep Sets, ParticleNet).
29. Residual response
Figure: Residual response differences R_c − R_g and R_c − R_b vs pT^gen, in 0 < |η^gen| < 2.5 and 2.5 < |η^gen| < 5 (Standard, Deep Sets, ParticleNet).