This document discusses machine learning applications and provides examples. It begins with an overview of machine learning algorithms being used in parallel to combine results from individual classifiers and extract all possible information from datasets. It then provides examples of mobile marketplaces using machine learning for fraud detection, personalization, and other applications. It concludes by discussing how machine learning can be incorporated into design to make use of visual, aural, corporal, and environmental inputs.
Taking forward change in technology-enhanced education - Richard Hall
My presentation for the JISC-funded Strategy Cascade: Taking forward change in technology-enhanced education workshop, run by Mark Johnson [University of Bolton] and Keith Smythe [Edinburgh Napier University]. See: http://strategycascade.wordpress.com/
We’ve seen many major industries undergo dramatic change in the last decade (i.e. manufacturing, newspapers, and customer service). With the introduction of MOOCs, adaptive learning systems, and content-delivery platforms, higher education doesn’t seem as “untouchable” as it used to. How can you stay ahead of the trends and stay relevant in this new world of technology-enhanced education?
Evidence-based practice in technology-enhanced learning - Jisc
How much do we know about what works in technology-enhanced learning in higher education?
How can universities and course teams ensure that they’re making most effective use of technology to improve students’ learning experience?
In this workshop you will hear from a range of universities on how they explore impact and what they’ve discovered about what works, and share any findings of your own.
We will also discuss how the evidence base can be brought together and made more accessible.
How does technology-enhanced learning contribute to teaching excellence? - Jisc
Speakers:
Sarah Davies, head of higher education and student experience, Jisc
Dr Rhona Sharpe, deputy HR director and head of OCSLD, Oxford Brookes University
Prof Paul Bartholomew, pro vice-chancellor student experience, Ulster University
The introduction of the Teaching Excellence Framework (TEF) has focused attention on how technology-enhanced learning contributes to teaching excellence, and how we can begin to evidence this.
In this session our speakers will consider what strategies universities can use to engage staff and students in order to make the most of technology to support learning, teaching and the student experience.
We also discuss how pedagogy can drive take-up of technology enhanced learning, and how technology-enhanced approaches can contribute to the TEF.
Presentation slides from the IC2020 conference
https://webikeo.fr/webinar/ic-2-partie-1
Yoan Chabot, Thomas Labbé, Jixiong Liu, Raphaël Troncy
DAGOBAH: A context-independent semantic annotation system for tabular data
Crude-Oil Scheduling Technology: moving from simulation to optimization - Brenno Menezes
Scheduling technology in today's crude-oil refining industries, whether commercial or homegrown, relies on complex simulation of scenarios in which the user alone makes many different decisions manually in the search for feasible solutions over a limited time horizon, i.e., trial-and-error heuristics. As a normal outcome, schedulers abandon these solutions and return to their simpler spreadsheet simulators due to: (i) the time-consuming effort of configuring and managing numerous scheduling scenarios, and (ii) the need to update premises and situations that are constantly changing. Moving to solutions based on optimization rather than simulation, this lecture describes the next steps in the refactoring of scheduling technology at PETROBRAS, treating separately the graphical user interface (GUI) and data-communication developments (non-modeling related) from the modeling and process-engineering work, in an automated decision-making environment with built-in problem-representation facilities and integrated data-handling features, among other techniques, in a smart scheduling frontline.
Results of the GPUs for GEC Competition held at GECCO 2013.
Organizers
Daniele Loiacono, Politecnico di Milano
Antonino Tumeo, Pacific Northwest National Laboratory
Webpage
http://gpu.geccocompetitions.com
Recent Developments in Computational Methods for the Analysis of Ducted Prope... - João Baltazar
This paper presents an overview of the recent developments at IST and MARIN in applying computational methods for the hydrodynamic analysis of ducted propellers. The developments focus on the propeller performance prediction in open water conditions using Boundary Element Methods and Reynolds-averaged Navier-Stokes solvers. The paper starts with an estimation of the numerical errors involved in both methods. Then, the different viscous mechanisms involved in the ducted propeller flow are discussed and numerical procedures for the potential flow solution proposed. Finally, the numerical predictions are compared with experimental measurements.
Bryan Thompson, Chief Scientist and Founder at SYSTAP, LLC, at MLconf NYC - MLconf
Graph Traversal at 30 billion edges per second with NVIDIA GPUs: I will discuss current research on the MapGraph platform. MapGraph is a new and disruptive technology for ultra-fast processing of large graphs on commodity many-core hardware. On a single GPU you can analyze the bitcoin transaction graph in 0.35 seconds. With MapGraph on 64 NVIDIA K20 GPUs, you can traverse a scale-free graph of 4.3 billion directed edges in 0.13 seconds, for a throughput of 32 billion traversed edges per second (32 GTEPS). I will explain why GPUs are an interesting option for data-intensive applications, how we map graphs onto many-core processors, and what the future looks like for the MapGraph platform.
MapGraph provides a familiar vertex-centric abstraction, but its GPU acceleration is hundreds of times faster than main-memory CPU-only technologies and up to 100,000 times faster than graph technologies based on MapReduce or key-value stores such as HBase, Titan, and Accumulo. Learn more at http://MapGraph.io.
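The vertex-centric ("think like a vertex") abstraction mentioned above can be illustrated with a minimal, CPU-only Python sketch of frontier-based breadth-first search; the function and the tiny graph are illustrative only and are not MapGraph's actual API.

```python
from collections import defaultdict

def vertex_bfs(edges, source):
    """Frontier-based BFS in the vertex-centric style: each round expands
    the current frontier by one hop, recording the first-visit depth."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
    depth = {source: 0}
    frontier = [source]
    level = 0
    while frontier:
        level += 1
        next_frontier = []
        for u in frontier:            # "scatter": each frontier vertex
            for v in adj[u]:          # pushes updates along its out-edges
                if v not in depth:    # "apply": first writer wins
                    depth[v] = level
                    next_frontier.append(v)
        frontier = next_frontier
    return depth

# Tiny directed graph: 0 -> 1 -> 3 and 0 -> 2 -> 3
depths = vertex_bfs([(0, 1), (0, 2), (1, 3), (2, 3)], 0)
```

On a GPU, the inner scatter loop is what gets parallelized across many threads; the Python version only shows the traversal pattern.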
Despite the existence of data analysis tools such as R, SQL, and Excel, these tools are still insufficient to cope with today's big-data analysis needs.
The author proposes a CUI (Character User Interface) toolset with dozens of functions to neatly handle tabular data in TSV (Tab Separated Values) files.
It implements many basic and useful functions that are not found in existing software, with each function following the Unix philosophy and covering the most frequent pre-analysis tasks during the initial exploratory stage of data analysis projects.
It also greatly speeds up basic analysis tasks, such as drawing cross tables and Venn diagrams, for which existing software inevitably requires rather complicated programming and debugging.
Here, tabular data mainly means TSV (Tab-Separated Values) files as well as other CSV (Comma Separated Value)-type files which are all widely used for storing data and suitable for data analysis.
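As an illustration of the kind of Unix-philosophy filter the abstract describes, here is a minimal Python sketch of one such function, a cross table over TSV input. The function name and interface are hypothetical, not the author's toolset.

```python
import csv
import io
from collections import Counter

def crosstab_tsv(tsv_text, row_col, col_col):
    """Count co-occurrences of two TSV columns, in the spirit of a small
    Unix-style filter: TSV in, TSV cross table out."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    counts = Counter((r[row_col], r[col_col]) for r in rows)
    col_vals = sorted({c for _, c in counts})
    row_vals = sorted({r for r, _ in counts})
    lines = ["\t".join([""] + col_vals)]            # header row
    for r in row_vals:
        cells = [str(counts.get((r, c), 0)) for c in col_vals]
        lines.append("\t".join([r] + cells))
    return "\n".join(lines)

data = "shop\titem\nA\tx\nA\tx\nA\ty\nB\tx\n"
table = crosstab_tsv(data, "shop", "item")
```

Because the output is itself TSV, the result composes with other filters in a pipeline, which is the point of the Unix-philosophy design the abstract advocates.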
Presentation by Arjen Markus (Deltares) at the Data Science Symposium 2018, during Delft Software Days - Edition 2018. Thursday 15 November 2018, Delft.
• Used SAS for exploratory analysis of the data, then found the optimal model by testing many candidate models; performed residual analysis and model diagnostics followed by forecast analysis.
• We used time plots, distribution analysis, correlation analysis and stationarity analysis to find the optimal model. The forecast analysis confirmed that the model fit well. It provided a simple parametric function that can be used to describe the volatility evolution, and a simple approach to calculating the value at risk of a financial position in risk management.
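The value-at-risk calculation mentioned in the second bullet can be sketched as one-period parametric VaR under an assumed normal-returns model; the function and the position/volatility figures below are illustrative, not the project's actual SAS code.

```python
from statistics import NormalDist

def parametric_var(position, mu, sigma, confidence=0.99):
    """One-period parametric VaR under a normal-returns assumption:
    the loss threshold exceeded with probability (1 - confidence)."""
    z = NormalDist().inv_cdf(1.0 - confidence)  # ~ -2.326 at 99%
    worst_return = mu + z * sigma               # quantile of the return
    return -position * worst_return             # positive number = loss

# A $1M position with zero mean daily return and 2% daily volatility:
var_99 = parametric_var(1_000_000, mu=0.0, sigma=0.02, confidence=0.99)
```

A fitted volatility model like the one the bullet describes would supply a time-varying `sigma` to the same formula.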
Abstract: Iterative stencils represent the core computational kernel of many applications belonging to different domains, from scientific computing to finance. Given the complex dependencies and the low computation-to-memory-access ratio, these kernels represent a challenging acceleration target on every architecture. This is especially true for FPGAs, whose direct hardware execution offers the possibility of high performance and power efficiency, but where the non-fixed architecture can lead to very large solution spaces to be explored.
In this work, we build upon a previously presented FPGA-based acceleration methodology for iterative stencil algorithms, in which we provide a dataflow architectural template that implements optimal on-chip buffering and scales almost linearly in performance using a technique denoted as iterations queuing. In particular, we propose a set of design improvements and develop an accurate analytical performance model that can be used to support exploration of the design space. Experimental results obtained by implementing a set of benchmarks from different application domains on a Xilinx VC707 board show average performance and power-efficiency increases over the previous work of around 22x and 8x respectively, and a prediction error that is on average less than 1%.
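For readers unfamiliar with the kernel class, a minimal Python sketch of an iterative stencil (a 1D 3-point Jacobi sweep) is below; it illustrates only the dependency pattern and low compute-to-memory ratio the abstract refers to, and says nothing about the paper's FPGA dataflow template.

```python
def jacobi_1d(u, iters):
    """Iterative 3-point stencil: each sweep replaces every interior
    point with the average of its neighbours (boundary values fixed)."""
    u = list(u)
    for _ in range(iters):
        nxt = list(u)
        for i in range(1, len(u) - 1):
            # One multiply-add per point, but three memory reads:
            # the low arithmetic intensity that makes stencils hard.
            nxt[i] = 0.5 * (u[i - 1] + u[i + 1])
        u = nxt
    return u

# Heat-equation-like relaxation: endpoints pinned at 0 and 1
result = jacobi_1d([0.0, 0.0, 0.0, 0.0, 1.0], iters=200)
```

The sweeps converge to the linear profile between the fixed endpoints; "iterations queuing" in the paper refers to chaining such sweeps in hardware rather than looping in software.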
For the full video of this presentation, please visit:
https://www.embedded-vision.com/platinum-members/embedded-vision-alliance/embedded-vision-training/videos/pages/may-2019-embedded-vision-summit-yu
For more information about embedded vision, please visit:
http://www.embedded-vision.com
Chen-Ping Yu, Co-founder and CEO of Phiar, presents the "Separable Convolutions for Efficient Implementation of CNNs and Other Vision Algorithms" tutorial at the May 2019 Embedded Vision Summit.
Separable convolutions are an important technique for implementing efficient convolutional neural networks (CNNs), made popular by MobileNet’s use of depthwise separable convolutions. But separable convolutions are not a new concept, and their utility is not limited to CNNs. Separable convolutions have been widely studied and employed in classical computer vision algorithms as well, in order to reduce computation demands.
We begin this talk with an introduction to separable convolutions. We then explore examples of their application in classical computer vision algorithms and in efficient CNNs, comparing some recent neural network models. We also examine practical considerations of when and how to best utilize separable convolutions in order to maximize their benefits.
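The core idea of the talk can be checked in a few lines of pure Python: a kernel that factors into the outer product of a column and a row vector can be applied as two cheap 1D passes instead of one 2D pass. The Sobel-like kernel and image below are invented for illustration.

```python
def conv2d(img, k):
    """Valid-mode 2D correlation with a small kernel (pure Python)."""
    kh, kw = len(k), len(k[0])
    out = []
    for i in range(len(img) - kh + 1):
        row_out = []
        for j in range(len(img[0]) - kw + 1):
            row_out.append(sum(img[i + a][j + b] * k[a][b]
                               for a in range(kh) for b in range(kw)))
        out.append(row_out)
    return out

# A separable kernel is an outer product of a column and a row vector:
col, row = [1, 2, 1], [1, 0, -1]
k2d = [[c * r for r in row] for c in col]   # 3x3 Sobel-like kernel

img = [[float(i * j % 7) for j in range(6)] for i in range(6)]

full = conv2d(img, k2d)                # one 3x3 pass: 9 mults per pixel
vert = conv2d(img, [[c] for c in col]) # 3x1 pass
sep = conv2d(vert, [row])              # then 1x3 pass: 6 mults per pixel
```

For a k x k kernel the cost drops from k² to 2k multiplies per output pixel, which is the saving that both classical vision pipelines and depthwise-separable CNN layers exploit.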
Adjusting primitives for graph : SHORT REPORT / NOTES - Subhajit Sahu
Short notes on adjusting primitives for graph algorithms such as PageRank. Compressed Sparse Row (CSR) is an adjacency-list-based graph representation.
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
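The float-vs-bfloat16 storage comparison in these notes can be mimicked in pure Python by truncating values to bfloat16 precision before and after each accumulation. This is an illustrative emulation of the storage-type effect, not the CUDA/OpenMP code the notes benchmark.

```python
import struct

def to_bf16(x):
    """Round a Python float to bfloat16 precision by truncating a
    float32 to its top 16 bits, emulating a low-precision storage type."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

def sum_with_storage(xs, convert):
    """Element sum where every loaded value and the running accumulator
    live in the given storage type."""
    acc = 0.0
    for x in xs:
        acc = convert(acc + convert(x))
    return acc

xs = [0.001] * 10_000
exact = sum(xs)                        # ~10.0 in double precision
bf16 = sum_with_storage(xs, to_bf16)   # stalls: bf16 has ~8 mantissa bits
```

Once the accumulator grows past a few tenths, adding 0.001 no longer changes its 8-bit mantissa, so the naive bf16 sum stalls far below the true value; this is why low-precision storage is usually paired with a higher-precision accumulator or a pairwise/tree reduction.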
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ... - Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components and processes them in topological order, one level at a time. This enables ranks to be calculated in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It comes, however, with the precondition that the input graph contain no dead ends. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. The slowdown on the GPU is likely caused by the submission of a large number of small workloads, and is expected to be a non-issue when the computation is performed on massive graphs.
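For reference, the Monolithic PageRank baseline that Levelwise PageRank is compared against can be sketched as a plain power iteration in Python, with dead-end (dangling) rank redistributed globally; this is a generic sketch, not the report's CPU/GPU implementation.

```python
def pagerank(adj, n, damping=0.85, iters=100):
    """Standard (monolithic) PageRank by power iteration: every vertex
    is updated in every iteration. adj maps u -> list of out-neighbours."""
    rank = [1.0 / n] * n
    for _ in range(iters):
        nxt = [(1.0 - damping) / n] * n
        dangling = 0.0
        for u in range(n):
            out = adj.get(u, [])
            if not out:
                dangling += rank[u]      # dead end: share rank globally
            else:
                share = rank[u] / len(out)
                for v in out:
                    nxt[v] += damping * share
        for v in range(n):
            nxt[v] += damping * dangling / n
        rank = nxt
    return rank

# 3-vertex cycle: symmetric, so all ranks converge to 1/3
ranks = pagerank({0: [1], 1: [2], 2: [0]}, 3)
```

Levelwise PageRank would instead iterate each strongly connected component to convergence in topological order; the abstract's "no dead ends" precondition exists because the global dangling-rank redistribution above breaks that per-level independence.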
2. Armando Benitez - @jabenitez - Data x Design - Jul 18, 2016
LHC - CERN
• Located on the outskirts of Geneva, on the France-Switzerland border
• 27 km in circumference
• The tunnel is buried around 50 to 175 m underground
4. Armando Benitez - @jabenitez - Data x Design - Jul 18, 2016
Multiple Algorithms in Parallel: Boosted Decision Trees, Bayesian Neural Networks, Matrix Elements
Using another ML algorithm to combine the results of the individual classifiers.
Purpose: extract all possible information from the dataset.
The combination produces an output from which all measurements are obtained.
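The combination step described on this slide, feeding the outputs of individual classifiers into another ML algorithm, is often called stacking. Below is a toy Python sketch with a tiny perceptron as the combining model; the base-classifier scores and labels are invented for illustration.

```python
def perceptron_train(features, labels, epochs=50, lr=0.1):
    """Tiny perceptron used as the combining ('stacking') model over
    the scores produced by the individual base classifiers."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            if pred != y:  # mistake-driven update
                for i in range(len(w)):
                    w[i] += lr * (y - pred) * x[i]
                b += lr * (y - pred)
    return w, b

def combine(w, b, scores):
    """Final decision from the combined base-classifier scores."""
    return 1 if sum(wi * si for wi, si in zip(w, scores)) + b > 0 else 0

# Each row: scores from two hypothetical base classifiers for one event
base_scores = [[0.9, 0.8], [0.8, 0.9], [0.2, 0.1], [0.1, 0.3]]
labels = [1, 1, 0, 0]                  # 1 = signal, 0 = background
w, b = perceptron_train(base_scores, labels)
preds = [combine(w, b, s) for s in base_scores]
```

The learned weights play the role the slide describes: they decide how much to trust each individual classifier when producing the single combined output.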
5. Armando Benitez - @jabenitez - Data x Design - Jul 18, 2016 5
Mobile Market Place
6. Armando Benitez - @jabenitez - Data x Design - Jul 18, 2016
Data Processing and Modelling
[Pipeline diagram: APIs + MQs (context, event, data); Data Lake (HBase, Cassandra, etc.); Stream Processing (event); Batch Processing (data); Model Generator (Feature Selection → Model Training → Model Evaluation → Model Assembly); Decision Engine; transaction grade; Real-Time Layer; Batch Processing Layer.]
Data Science:
1. Fraud Detection
2. Search
3. Recommendations
4. Notifications
5. Ratings
6. Merchant Intelligence
7. Engagement Optimization
8. Marketing Optimization
9. App Personalization
10. Ad Network Support
11. Image / Speech Recognition
Theory (Math, Algorithms) → Proof-of-Concept (R, Python, Scala, C++) → Spark Implementation (Scalability, Robustness) → Platform Integration
7. Armando Benitez - @jabenitez - Data x Design - Jul 18, 2016
Fraud Detection
• Very small number of fraud cases
• Large number of good transactions
• Many different “types” of anomalies. Hard for algorithms to learn from positive examples what the anomalies look like
• Future anomalies may look nothing like any of the anomalous examples we’ve seen so far
8. Armando Benitez - @jabenitez - Data x Design - Jul 18, 2016
Personalization
• Offers targeted for each user
• Use browsing history and shopping habits to determine products the user is most likely to buy
• Similarity among users
• Similarity among items
• Catalog search results
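The "similarity among users / items" bullets can be sketched with plain cosine similarity over a toy ratings matrix; the user names and purchase counts below are hypothetical.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Rows = users, columns = items (hypothetical purchase counts)
ratings = {
    "alice": [5, 4, 0],
    "bob":   [4, 5, 1],
    "carol": [0, 0, 5],
}

def most_similar_user(target):
    """'Similarity among users': rank the other users by cosine
    similarity to the target's rating vector."""
    others = [(u, cosine(ratings[target], v))
              for u, v in ratings.items() if u != target]
    return max(others, key=lambda t: t[1])[0]

match = most_similar_user("alice")
```

A user-based recommender would then surface items the most similar user bought that the target has not; the item-based variant applies the same similarity over the matrix's columns instead of its rows.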
9. Armando Benitez - @jabenitez - Data x Design - Jul 18, 2016 9
Incorporating ML to Design
Visual Inputs
Aural Inputs
Corporal Inputs
Environmental Inputs
10. Armando Benitez - @jabenitez - Data x Design - Jul 18, 2016
Creating Dialogue
• Machine learning algorithms are capable of discovering patterns in the data presented to them. How can we make use of this?
• Find discovery opportunities that are only possible with the help of machine learning
• Designers and programmers should establish a strong collaboration to find ground-breaking applications
• Understand the rules, to know which ones to bend or break
12. Armando Benitez - @jabenitez - Data x Design - Jul 18, 2016
Search Strategy
Initial objects → Found it!
[Figure: two panels of Invariant Mass (GeV/c²) vs Events/(0.05).]
FIG. 16: Ξb mass distribution of background events from J/ψ sideband events after all selection cuts have been applied (top), and these events (red squares) on top of the signal observed in right-sign combination events (open circles) (bottom).
3. Ξb reconstruction on Λb → J/ψ Λ(pπ) MC events.
We applied our Ξb selection to 30K generated Λb → J/ψ Λ(pπ) MC events. This is p17 MC with the same cuts at generation level as those applied to our Ξb MC, and reprocessed with the same extended configuration as used on data. No events survived after selection.
VI. CONCLUSIONS
By using a simple set of cuts we observe a signal peak with a mass of 5.774 ± 0.011 (stat) ± 0.022 (sys) GeV/c² and a width of 0.037 ± 0.008 GeV/c², with a significance of 5.53 and S/√B = 7.80. This peak is shown in Fig. 12 and the results of the fit are in Table II. This supports the previous report of the observation using Boosted Decision Trees [6]. We measure the relative production ratio to be
f(b → Ξb⁻)·Br(Ξb⁻ → J/ψ Ξ⁻(→ Λπ⁻)) / f(b → Λb)·Br(Λb → J/ψ Λ) = 0.376 ± 0.119 (stat) ± 0.188 (syst).
[1] D. Buskulic et al., Phys. Lett. B 384, 449 (1996).
[2] P. Abreu et al., Z. Phys. C 68, 541 (1995).
[3] Common Samples Group, http://wwwd0.fnal.gov/Run2Physics/cs/.
[4] See description of “J/psi & dimuon mass continuum” at http://d0server1.fnal.gov/users/nomerot/Run2A/BANA/Dskim.html.
[5] Reconstruction of B hadron signals at DØ, DØ Note 4481.
[6] DØ Note 5401.
DØ Note 5403
Version 4.1, June 5, 2007
Observation of the heavy baryon Ξb⁻
E. De La Cruz Burelo, H.A. Neal, and J. Qian (University of Michigan)
B. Abbott (University of Oklahoma)
G.D. Alexeev, Yu.P. Merekov, G.A. Panov, A.M. Rozhdestvensky, L.S. Vertogradov, Yu.L. Vertogradova (Joint Institute for Nuclear Research, Russia)
Using approximately 1.3 fb⁻¹ of data collected by the upgraded DØ detector in Run II of the Tevatron, the Ξb⁻ state has been observed in the decay mode J/ψ(→ µ⁺µ⁻) Ξ⁻ (Ξ⁻ → Λπ⁻, Λ → pπ). A tracking algorithm which allows a more efficient method of reconstructing tracks with large impact parameters was used in order to increase the efficiency of reconstructing the Λ and Ξ⁻. We observe the Ξb⁻ with a significance of √(2Δln L) = 5.53 and S/√B = 7.80, with a mass of 5.774 ± 0.011 (stat) ± 0.022 (sys) GeV/c². We measure the relative production ratio to be
f(b → Ξb⁻)·Br(Ξb⁻ → J/ψ Ξ⁻(→ Λπ⁻)) / f(b → Λb)·Br(Λb → J/ψ Λ) = 0.376 ± 0.119 (stat) ± 0.188 (syst).
Data Cleaning: signal to background 20:1
Initial objects → Data Cleaning → Machine Learning → Found it!
9.4.2 Observed Results
[Figure: tb+tqb DT output vs event yield, linear and log scale; D0 Run II Prelim. 2.3 fb⁻¹, p17+p20 e+µ channel, 1-2 b-tags, 2-4 jets.]
Traditional searches / Small-signal analysis: signal to background 1:20
13. Armando Benitez - @jabenitez - Data x Design - Jul 18, 2016
Decision Trees
Task: separate signal from background.
Issue: a single split on X or Y is not enough!
Solution: use a series of consecutive splits, generating a tree structure.
Figure 8.1: 2D plane of a simple classification problem, and a Decision Tree solving the classification problem of signal and background.
14. Armando Benitez - @jabenitez - Data x Design - Jul 18, 2016
Decision Trees
Split 1: on the X variable. Events either pass (P1) or fail (F1) cut C1.
15. Armando Benitez - @jabenitez - Data x Design - Jul 18, 2016
Decision Trees
Split 2: recover events that failed split 1 (F: C1, P: C2).
Repeat and continue the splitting process until events are classified.
16. Armando Benitez - @jabenitez - Data x Design - Jul 18, 2016
Decision Trees
After 4 splits, the signal and background regions are separated. Done!
Toy model: only 2 variables, easy to determine cut values.
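The splitting procedure walked through on slides 13-16 can be condensed into a small Python sketch: recursively choose the axis-aligned cut that most reduces Gini impurity until each region is pure. The toy points below mimic the 2-variable example and are not the deck's actual data.

```python
def gini(labels):
    """Gini impurity of a set of 0/1 labels (0 = pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(points, labels):
    """Try every axis-aligned threshold; return the (axis, value) pair
    with the lowest weighted child impurity."""
    best = None
    for axis in (0, 1):
        for t in sorted({p[axis] for p in points}):
            left = [l for p, l in zip(points, labels) if p[axis] < t]
            right = [l for p, l in zip(points, labels) if p[axis] >= t]
            score = (len(left) * gini(left)
                     + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, axis, t)
    return best[1], best[2]

def grow(points, labels, depth=0, max_depth=4):
    """Recursively split until a leaf is pure (or depth runs out)."""
    if not labels:
        return 0
    if gini(labels) == 0.0 or depth == max_depth:
        return round(sum(labels) / len(labels))   # leaf: majority class
    axis, t = best_split(points, labels)
    lp = [(p, l) for p, l in zip(points, labels) if p[axis] < t]
    rp = [(p, l) for p, l in zip(points, labels) if p[axis] >= t]
    return (axis, t,
            grow([p for p, _ in lp], [l for _, l in lp], depth + 1, max_depth),
            grow([p for p, _ in rp], [l for _, l in rp], depth + 1, max_depth))

def classify(tree, p):
    """Walk the tree: internal nodes are (axis, t, left, right) tuples."""
    while isinstance(tree, tuple):
        axis, t, left, right = tree
        tree = left if p[axis] < t else right
    return tree

# Toy 2D data: signal clustered at large x and y, background elsewhere
points = [(1, 1), (1, 4), (4, 1), (4, 4), (5, 5), (5, 4)]
labels = [0, 0, 0, 1, 1, 1]   # 1 = signal, 0 = background
tree = grow(points, labels)
preds = [classify(tree, p) for p in points]
```

Each recursive call is one of the slides' "splits"; on data this simple, two cuts (one on x, one on y) already leave every region pure.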
17. Armando Benitez - @jabenitez - Data x Design - Jul 18, 2016 17
A/B Testing
18. Armando Benitez - @jabenitez - Data x Design - Jul 18, 2016
• Consultant @ BMO Capital Markets
• Previously:
• Data Scientist @ Paytm Labs
• Researcher - ATLAS Experiment @ CERN
• Researcher - Fermilab National Laboratory
18
Background
19. Armando Benitez - @jabenitez - Data x Design - Jul 18, 2016
Anomaly Detection
• Fit model on training set
• On a cross-validation/test example, predict
• Possible evaluation metrics:
• True positive, false positive, false negative, true negative
• Precision/Recall
• F1-score
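The evaluation metrics listed on this slide can be computed directly from the confusion-matrix counts; a small Python sketch with invented labels is below, chosen to show why precision/recall beat plain accuracy on imbalanced fraud-style data.

```python
def f1_report(y_true, y_pred):
    """Precision, recall and F1 from true/predicted 0-1 labels.
    On imbalanced data (slide: very few fraud cases) these are far
    more informative than accuracy."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# 1 = anomaly. Predicting "all normal" would score 80% accuracy here
# yet have F1 = 0, which is why the slide lists these metrics instead.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
precision, recall, f1 = f1_report(y_true, y_pred)
```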
20. Armando Benitez - @jabenitez - Data x Design - Jul 18, 2016
Standard Model (SM)
• The SM describes the world around us
• Components: 24 particles of matter and 4 mediators
• Interactions of the particles are explained by the mediators
• Does not include: gravity, dark matter and dark energy
21. Armando Benitez - @jabenitez - Data x Design - Jul 18, 2016
Identity Resolution
• What? Identify products having similar properties (name, colour, size) as a unique product.
• Why? Recommender systems trained on these products would produce better recommendations (non-repetitive).
• How? Classify pairs as match or non-match based on how similar they are, making use of known catalog features.
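The match/non-match pair classification described above can be sketched in Python with simple similarity features over known catalog fields. The rule-based decision here is a stand-in for the trained classifier the slide implies, and the products are invented.

```python
from difflib import SequenceMatcher

def product_features(a, b):
    """Pairwise similarity features from known catalog fields; a pair
    classifier would be trained on vectors like these."""
    return {
        "name_sim": SequenceMatcher(None, a["name"].lower(),
                                    b["name"].lower()).ratio(),
        "same_colour": a["colour"] == b["colour"],
        "same_size": a["size"] == b["size"],
    }

def is_match(features, name_threshold=0.8):
    """Stand-in for the learned match/non-match classifier: a simple
    rule over the similarity features."""
    return (features["name_sim"] >= name_threshold
            and features["same_colour"] and features["same_size"])

p1 = {"name": "Acme Running Shoe", "colour": "red", "size": "9"}
p2 = {"name": "ACME running shoe", "colour": "red", "size": "9"}
p3 = {"name": "Bolt Sandal", "colour": "blue", "size": "9"}

match_12 = is_match(product_features(p1, p2))
match_13 = is_match(product_features(p1, p3))
```

Matched pairs would then be collapsed into a single catalog entry, which is what keeps the recommender's output non-repetitive.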