NYAI #7 - Using Data Science to Operationalize Machine Learning by Matthew Russell

NYAI #7 (SPEAKER SERIES):
Data Science to Operationalize
Machine Learning (Matthew Russell)
& Computational Creativity
(Dr. Cole D. Ingraham DMA)

OPERATIONALIZING MACHINE
LEARNING WITH DATA SCIENCE
Matthew A. Russell
Chief Technology Officer
November 2016

WHAT WE DO
Cognitive Computing platform
that understands human
communication
OFFICE LOCATIONS:
Nashville
Washington
New York
London
INVESTORS:
Goldman Sachs, Credit
Suisse, Nasdaq, In-Q-Tel, HCA
& Lemhi Ventures
RESULTS PROVEN IN:
Government
Financial Services
Health Care
Data Science
STRATEGIC PARTNERS
DIGITAL REASONING

AGENDA
• The best way to operationalize machine learning is with data
science
• Data science teams that can accomplish more experiments in less
time will outperform those that don’t

KNOWLEDGE GRAPH:
ENTITIES ORGANIZED IN RELATIONSHIP, SPACE, AND TIME

HUMAN LANGUAGE IS HIGHLY PLASTIC
Would you rather try to build something awesome by sculpting plastic or by
composing Legos?

BETTER ABSTRACTIONS YIELD BETTER OUTCOMES
Practitioners of equal ability will be able
to build far more useful things with Legos
than by sculpting plastic with artisan
tools.

[Figure sequence: a sample news article annotated at each stage of the analysis pipeline: Metadata, Tokens, Phrases, Entities, Concepts, Temporal Reasoning, Assertions, Relationships, Concept Resolution.]
[Tokens: each word is labeled with its part of speech (Noun, Plural Noun, Modal Verb, Verb, Determiner, Adjective, Preposition, ...), then with the corresponding tag codes (NN, NNS, MD, VB, DT, JJ, IN, ...).]
[Phrases: the tagged tokens are grouped into Noun Phrase and Verb Phrase chunks.]
[Temporal Reasoning: date annotations *08-MAY-2013 and *07-MAY-2013.]

Concept | Mention | Predicate | Related Entity | Fact Category | Sentiment | Sentence
- | World powers | end | North Korea | Action | Negative | World powers must end the “vicious circle” of responding to periodic North Korean provocations with actions that reward such behavior, South Korean President Park Geun-hye told Congress yesterday.
Park Geun-hye | South Korean President Park Geun-hye | tell | Congress | Statement | Negative | World powers must end the “vicious circle” of responding to periodic North Korean provocations with actions that reward such behavior, South Korean President Park Geun-hye told Congress yesterday.
North Korea | North Korea’s threats | undermine | Korean Peninsula | Conflict | Negative | North Korea’s threats, including nuclear and missile tests, undermine security on the Korean peninsula and will be “met decisively,” she said.
Park Geun-hye | she | say | North Korea | Statement | Negative | North Korea’s threats, including nuclear and missile tests, undermine security on the Korean peninsula and will be “met decisively,” she said.
South Korean Government | strong South Korean government | ensure | North Korea | Communication | Positive | A strong South Korean government “backed by the might of our alliance” ensures that “no North Korean provocation can succeed,” she said.
Park Geun-hye | she | say | South Korean Government | Statement | Positive | A strong South Korean government “backed by the might of our alliance” ensures that “no North Korean provocation can succeed,” she said.
North Korea | North Korea | threaten | South Korea | Conflict | Negative | Park said there has been a historical pattern in which North Korea threatens South Korea and, after a period of international sanctions, nations try “to patch things up” by offering “concessions and rewards” to the Pyongyang government.
Park Geun-hye | Park | say | North Korea | Statement | Negative | Park said there has been a historical pattern in which North Korea threatens South Korea and, after a period of international sanctions, nations try “to patch things up” by offering “concessions and rewards” to the Pyongyang government.
- | nations | patch up | Pyongyang government | Communication | Negative | Park said there has been a historical pattern in which North Korea threatens South Korea and, after a period of international sanctions, nations try “to patch things up” by offering “concessions and rewards” to the Pyongyang government.
North Korea | North Korea | advance | its nuclear weapons capabilities | Motion | Negative | In the meantime, North Korea continues to advance its nuclear weapons capabilities, she said.
Park Geun-hye | she | say | North Korea | Statement | Negative | In the meantime, North Korea continues to advance its nuclear weapons capabilities, she said.
Park Geun-hye | she | say | vicious circle | Statement | Negative | “It’s time to put an end to this vicious circle,” she said, drawing a standing ovation.
Park Geun-hye | she | draw | standing ovation | Action | Positive | “It’s time to put an end to this vicious circle,” she said, drawing a standing ovation.
Park Geun-hye | Park’s address | follow | President Obama | Communication | Neutral | Park’s address to a joint meeting of Congress yesterday followed talks Tuesday with President Obama…
- | the two leaders | display | unity | Relationship | Neutral | ...at which the two leaders sought to display unity between the United States and South Korea in response to North Korean threats.
- | two longtime allies | be | united | Relationship | Positive | Obama said the two longtime allies are “as united as ever.”
President Barack Obama | Obama | say | two longtime allies | Statement | Positive | Obama said the two longtime allies are “as united as ever.”
Park Geun-hye | Park | make | first trip abroad | Travel | Neutral | Park, three months into her presidency, is making her first trip abroad to mark the 60th anniversary of the U.S.-South Korean alliance.
Park Geun-hye | Park | mark | 60th anniversary | Relationship | Neutral | Park, three months into her presidency, is making her first trip abroad to mark the 60th anniversary of the U.S.-South Korean alliance.
- | Two nations | expand | cooperation | Relationship | Positive | The two nations are seeking to expand cooperation on trade and energy as well as security.
Park Geun-hye | Park | thank | United States | Communication | Neutral | Park thanked the United States for its support in the Korean War, singling out for recognition four lawmakers who are veterans of that conflict…
Park Geun-hye | Park | stress | importance | Communication | Neutral | …and she stressed the importance South Korea places on the alliance in the face of security challenges.
South Korea | South Korea | maintain | readiness | Status | Positive | South Korea is maintaining the “highest level of readiness” and responding to North Korea’s actions “resolutely but calmly,” she said…

KNOWLEDGE GRAPHS: THE NEXT WAVE OF INNOVATION
• Document analysis is becoming commoditized
• The synthesis of knowledge graphs from a corpus is the next frontier
• Knowledge graphs will accelerate conversational interfaces/agents
• Conversational interfaces are a key enabler of the Internet of Things

EXPERIMENTAL ILLUSTRATION
11/25/2016 NYAI
[Figures: sample text annotated through the pipeline stages, with temporal annotations *April 2013, *18-Jun-2013, *26-Apr-2013, *22-Apr-2013, and *2013.]

KNOWLEDGE GRAPHS: ENTITIES IN RELATIONSHIP, TIME, & SPACE
[Figure: knowledge graph; recoverable node labels: Canon, Hong Kong, Park Geun-hye.]

THESIS
• The best way to operationalize machine learning is with data science
• Practicing data science requires careful application of the scientific method
with repeatable and well-defined experiments

REASONS TO OPERATIONALIZE MACHINE LEARNING
• Increase revenue
• Decrease operational expenses
• Curtail Risk

PHYSICS REFRESHER
• Machines do work
• Work = Force x distance
• Power = Work / time

MOST IMPORTANT KPI FOR DATA SCIENCE
• Power output is the most important KPI for data science practitioners
to optimize
• Work ~ Experiment
• Power ~ Experiments per unit time
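
The analogy above can be made concrete with a small calculation. This is only a minimal sketch: the function name `power`, the choice of "per day" as the time unit, and the two example teams are assumptions made for illustration, not figures from the talk.

```python
from datetime import timedelta


def power(experiments_completed: int, elapsed: timedelta) -> float:
    """Experiments per day, by analogy with Power = Work / time."""
    days = elapsed.total_seconds() / 86_400  # seconds in one day
    if days <= 0:
        raise ValueError("elapsed time must be positive")
    return experiments_completed / days


# Hypothetical teams: the same amount of work, finished in different time spans.
team_a = power(12, timedelta(days=5))
team_b = power(12, timedelta(days=10))
print(f"team A: {team_a:.1f} experiments/day, team B: {team_b:.1f} experiments/day")
```

Both teams did the same work, but the team that finished in half the time has twice the power output, which is the quantity the KPI tracks.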

OPTIMIZE FOR POWER OUTPUT
• Optimize for power output by doing more experiments in less time
• Doing it with…
• Better tools*
• Better experiments*
• Better know-how
• Better teamwork

BEST PRACTICES FOR EXPERIMENTS
• An experiment should yield an artifact that tests a hypothesis
• Repeatable experiments yield momentum
• Repeatability => Collaboration => Innovation => Momentum
• Progress should be measured with scorecards
• Think:
• Chemistry lab
• Test-driven development
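
One way to picture a scorecard is a small table with one row per experiment, ranked by a single summary metric. The sketch below is an assumption-laden illustration: the `ScorecardRow` type, the use of F1 as the ranking metric, and the experiment names and scores are all hypothetical, not taken from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScorecardRow:
    experiment: str   # hypothetical experiment identifier
    precision: float
    recall: float

    @property
    def f1(self) -> float:
        # Harmonic mean of precision and recall; 0.0 when both are zero.
        total = self.precision + self.recall
        return 0.0 if total == 0 else 2 * self.precision * self.recall / total


def rank(rows):
    """Return rows ordered from best to worst F1."""
    return sorted(rows, key=lambda row: row.f1, reverse=True)


def render(rows) -> str:
    """Format the ranked rows as a plain-text scorecard."""
    lines = [f"{'experiment':<12}{'precision':>10}{'recall':>8}{'f1':>8}"]
    for row in rank(rows):
        lines.append(
            f"{row.experiment:<12}{row.precision:>10.2f}{row.recall:>8.2f}{row.f1:>8.2f}"
        )
    return "\n".join(lines)


# Made-up results for two experiments.
rows = [
    ScorecardRow("exp-001", precision=0.90, recall=0.80),
    ScorecardRow("exp-002", precision=0.95, recall=0.85),
]
print(render(rows))
```

Because every experiment lands in the same table with the same metrics, a collaborator can see at a glance whether a new run moved the score.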

AN EXPERIMENT IS THE FUNDAMENTAL UNIT OF WORK
• An Experiment is a tuple:
• Versioned Training Data
• Versioned Evaluation Data
• Versioned Source Code
• Versioned Hyperparameters
• Versioned Tests
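
The tuple above maps naturally onto an immutable record. The following is a minimal sketch under stated assumptions: the class name `Experiment`, its field names, the `fingerprint` helper, and the example version strings are hypothetical, and each version is assumed to be an opaque string such as a Git commit hash or a dataset tag.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Experiment:
    """One experiment: five independently versioned components."""
    training_data_version: str
    evaluation_data_version: str
    source_code_version: str
    hyperparameters_version: str
    tests_version: str

    def fingerprint(self) -> str:
        # Stable short ID derived from all five versions, so that identical
        # inputs always map to the same experiment identifier.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()[:12]


# Hypothetical version labels, for illustration only.
baseline = Experiment("train-v3", "eval-v2", "9f1c2ab", "hp-v7", "tests-v4")
variant = Experiment("train-v3", "eval-v2", "9f1c2ab", "hp-v8", "tests-v4")
print(baseline.fingerprint(), variant.fingerprint())
```

Changing any single component, here only the hyperparameters, yields a different fingerprint, which makes it easy to tell whether two runs are really the same experiment.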

BARE MINIMUMS FOR EXPERIMENTATION
• Vagrant
• Jupyter Notebook
• Git
• Insatiable Appetite for Automation

• Define a hypothesis with a quantifiable outcome that can be tested:
• I can teach a machine to diagnose cancer from medical reports with precision
of 95% and recall of 85%.
• Build a model that yields an “IF CANCER” document label
• Yielding a “WHICH CANCER” document label naturally follows
• Test the outcome:
• Build a predictive model that “reads” the pathology reports and predicts
cancer with a quantifiable confidence level
• Wash, Rinse, Repeat…
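
The hypothesis above can be written as an executable check, in the spirit of test-driven development. This is a sketch only: the function names and the toy labels are assumptions for illustration, and the thresholds default to the 95% precision and 85% recall stated in the hypothesis.

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall for boolean "IF CANCER" document labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for truth, pred in pairs if truth and pred)
    fp = sum(1 for truth, pred in pairs if not truth and pred)
    fn = sum(1 for truth, pred in pairs if truth and not pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def hypothesis_holds(y_true, y_pred, min_precision=0.95, min_recall=0.85):
    """True when the model meets both thresholds on the evaluation data."""
    precision, recall = precision_recall(y_true, y_pred)
    return precision >= min_precision and recall >= min_recall


# Toy evaluation set (made up): 20 positive and 20 negative reports.
y_true = [True] * 20 + [False] * 20
# Hypothetical model output: finds 18 of the 20 positives, no false alarms.
y_pred = [True] * 18 + [False] * 2 + [False] * 20

print(precision_recall(y_true, y_pred))
print(hypothesis_holds(y_true, y_pred))
```

Each repetition of the loop swaps in a new model or new data and reruns the same check, so the outcome of every experiment is a clear pass or fail against the stated hypothesis.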

[Figure sequence: a sample medical report annotated through the same pipeline stages.]
[Tokens: part-of-speech and token-type tags (CD, NN, ABV, ABR, VB, DT, IN, JJ, CT, Sym, Neg, EX, ...).]
[Entities and concepts: abbreviations expanded to Computerized Tomography; measurements normalized (100 milliliters Isovue-370 (iopamidol); 0.7 centimeters (7 millimeters); >1 centimeter; 4-6 millimeters); negated findings marked Negative.]

Medical Entity Flag
lung nodule Yes
bronchial wall Yes
pulmonary embolism No
lobe infiltrate No
pleural effusion No
pericardial effusion No
pleural mass No
pericardial mass No
mediastinum No
hilum No
aortic aneurysm No
heart No
abdomen No
lungs No
lymph nodes No

SUMMARY
• The best way to operationalize machine learning is with data science
• Data science necessarily involves highly repeatable experiments
that are contextualized within the scientific method
• The most important KPI for data science teams is number of
experiments per unit time
• Data science teams that thoughtfully consider this KPI while
accomplishing more experiments in less time will outperform those
that don’t

I HAVE THE HONOR TO BE,
YOUR OBEDIENT SERVANT…
M.R.
• Matthew A. Russell
• @ptwobrussell
• LinkedIn
• Gmail
• Twitter
• Digital Reasoning
• http://digitalreasoning.com
• @dreasoning

DIGITAL REASONING COGNITIVE COMPUTING AND DATA
SCIENCE RECOGNITION

WHAT OUR CUSTOMERS & PARTNERS ARE SAYING …
“Using Synthesys gives our team the
means to discover potential
problems and act on them before
they ripen into actual problems”
Vinny Tortorella, Chief Compliance &
Surveillance Officer
“Digital Reasoning provides the
proactive identification of potential
risks across our business and
continuous learning of resulting
reviews”
Will Davis, Global Head of Compliance
& Operational Risk Control Technology
“Congratulations to Digital Reasoning
on being recognized as a leader in Big
Data Text Analytics. We are excited to be
working with Digital Reasoning and its
award winning technology”
Valarie Bannert-Thurner, Global Head,
Risk & Surveillance Solutions

WHAT OTHERS ARE SAYING …
37
“Banks now want to go one step
further, and are looking at acquiring
technology that can spot and prevent
inappropriate communication or
fraudulent activity… There is a huge
market for this right now,” said Sang
Lee, founding partner at Aite Group
“Digital Reasoning applies AI to
understand human communication to
ferret out suspicious
activity. Over time, this class of service
may become indispensable”, Gartner Cool
Vendor Smart Machines
"By continually learning from context,
Synthesys reveals insights that normally
go undetected, helping to avoid the “I-
don’t-know-what-I-don’t-know”
problem of most other analytics tools"
