Black Magic
How to apply ML to real-world problems
ML is a (Magic) Hammer
It is a great tool for some purposes
“If all you have is a hammer, everything looks like a nail”
Hello!
I am Evion Kim
Lead Machine Learning Engineer @ Mattermark
Senior Software Engineer / Data Scientist @ LinkedIn
M.S., Computer Science @ Stanford University
B.S., Computer Science @ KAIST
Today’s Talk
◇ Machine Learning - the concept
◇ Mattermark?
◇ Funding Extraction Problem @ Mattermark & Some Magic Spells
This talk is...
Not about
# Deep academic technical knowledge about ML algorithms
# Data infrastructure
About
# How to transform a real-world problem into an ML problem
# Tips and tricks for Machine Learning based problem solving
1. Machine Learning
The powerful hammer
def traditional(x):
    # The rule is written by hand: y = x * (x + 1)
    return x * (x + 1)
Traditional way
y = x * (x+1)
ML way
Data (x -> y):
2 -> 6
3 -> 12
4 -> 20
5 -> 30
6 -> 42
Data + Model (DEEP LEARNING?) -> Trained Model: y = x * (x+1)
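The contrast between the two ways can be sketched in a few lines of Python. This is only an illustration, not code from the talk: it assumes the "model" is a quadratic a*x^2 + b*x + c (the slide itself only hints at "deep learning?") and fits it to the five data points by least squares with exact fractions. The names solve, fit_quadratic and trained_model are made up for this sketch.

```python
from fractions import Fraction

# Training data from the slide: x -> y
DATA = [(2, 6), (3, 12), (4, 20), (5, 30), (6, 42)]

def solve(a, b):
    """Solve the linear system a @ w = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Pick a row with a non-zero pivot and swap it into place.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        # Normalize the pivot row, then clear this column in the other rows.
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[n] for row in m]

def fit_quadratic(data):
    """Least-squares fit of y = a*x^2 + b*x + c via the normal equations."""
    feats = [[Fraction(x) ** 2, Fraction(x), Fraction(1)] for x, _ in data]
    ys = [Fraction(y) for _, y in data]
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(f[i] * f[j] for f in feats) for j in range(3)] for i in range(3)]
    xty = [sum(f[i] * y for f, y in zip(feats, ys)) for i in range(3)]
    return solve(xtx, xty)

# "Training": the coefficients come from the data, not from a hand-written rule.
a, b, c = fit_quadratic(DATA)

def trained_model(x):
    return a * x * x + b * x + c

print(a, b, c)           # learned coefficients
print(trained_model(7))  # prediction for an input that was not in the data
```

Nobody typed in the rule here: the program recovers the coefficients from examples alone, which is the point of the "ML way" slide. Real problems rarely fit a known formula this cleanly.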
It is not SKYNET
… at least not “yet”.
It is a tool that can be used for some problems
2. What is Mattermark?
(just a quick advertisement)
3. Case Study: Funding Extraction
- And Dark Magic spells we learned
ML @ Big(ger) vs. Small(er)
Big(ger) company
# Millions upon millions of training data points
# Precision requirement: not that high
Small(er) company
# Far fewer training data points
# Very high precision requirement
Spell 1: Know your Enemy
What’s the bottleneck?
# Scalability or Accuracy?
# Precision or Recall?
# Engineering or Machine Learning?
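The precision-versus-recall question can be made concrete with a small calculation. The sketch below is illustrative only; the article IDs and the function name precision_recall are made up, and the two sets stand for "items the system flagged" and "items that are really positive".

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two sets of positive items."""
    true_positives = len(predicted & actual)
    # Precision: what fraction of the flagged items are really positive?
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: what fraction of the really positive items were flagged?
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall

# Hypothetical example: article IDs flagged as funding news vs. the real ones.
predicted = {"a1", "a2", "a3", "a4"}
actual = {"a1", "a2", "a5"}
precision, recall = precision_recall(predicted, actual)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A product that publishes extracted numbers to customers usually cares more about precision; a product that feeds a human review queue can often trade some precision for recall.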
Problem to solve
~$156 billion: total VC funding in 2015
~8,532: VC funding events in 2015
Spell 2: Slice and Dice
Divide a big, chunky problem into smaller, ML-solvable problems.
Smaller Problems
# Classify Funding Articles
# Classify Funding Sentences
# Extract Funding Entities
# Confidence Scorer
Classify Funding Article
TF-IDF + SVM Classifier -> YES / NO
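A TF-IDF plus linear SVM article classifier can be sketched with the standard library alone. This is not Mattermark's implementation: the eight training headlines are invented, the SVM is a bare-bones linear one trained by sub-gradient descent on the regularized hinge loss with no bias term, and all function names are hypothetical. In practice a library such as scikit-learn would normally do this work.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def fit_idf(docs):
    """Smoothed inverse document frequency for every token in the corpus."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}

def tfidf(doc, idf):
    """L2-normalized TF-IDF vector as a sparse dict; unseen tokens are dropped."""
    counts = Counter(tok for tok in tokenize(doc) if tok in idf)
    vec = {tok: tf * idf[tok] for tok, tf in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {tok: v / norm for tok, v in vec.items()}

def dot(w, vec):
    return sum(w.get(tok, 0.0) * v for tok, v in vec.items())

def train_linear_svm(vectors, labels, epochs=200, lr=0.1, reg=0.01):
    """Sub-gradient descent on the L2-regularized hinge loss (no bias term)."""
    w = {}
    for _ in range(epochs):
        for vec, y in zip(vectors, labels):  # y is +1 (funding) or -1 (other)
            violated = y * dot(w, vec) < 1.0
            for tok in w:                    # gradient step for the L2 penalty
                w[tok] *= 1.0 - lr * reg
            if violated:                     # gradient step for the hinge loss
                for tok, v in vec.items():
                    w[tok] = w.get(tok, 0.0) + lr * y * v
    return w

# Invented toy corpus: four funding headlines, four other headlines.
DOCS = [
    "acme raises $5 million in series a funding",
    "startup closes series b round led by investors",
    "beta corp secures $20 million funding round",
    "gamma raised seed funding from angel investors",
    "acme launches new mobile app for customers",
    "startup hires new chief technology officer",
    "beta corp opens office in new york",
    "gamma releases quarterly product update",
]
LABELS = [+1, +1, +1, +1, -1, -1, -1, -1]

idf = fit_idf(DOCS)
weights = train_linear_svm([tfidf(d, idf) for d in DOCS], LABELS)

def is_funding(article):
    """YES / NO decision: a positive score means 'funding article'."""
    return dot(weights, tfidf(article, idf)) > 0.0

print(is_funding("delta raises $3 million series a funding round"))
print(is_funding("delta launches new product for customers"))
```

The classifier only sees word statistics, so it works as a cheap first filter: later stages can then spend their effort on articles that are likely to mention a funding event.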
Spell 3: Understand your Domain
Analyze and understand the problem space you are working on.
Funding Sentence Patterns
# Amount / Series / Investors: "...has closed a $3.5m Series A funding round led by Inter Capital, ..."
# Investors: "Intel Capital led the round with participation from other investors that included Horizons Ventures"
# Amount / Series: "...has raised $3.5 million in Series A Funding"
Classify Funding Sentences
Word2Vec + Semantic Role Labeling (SRL) + Gradient Boosting Classifier
Extract Funding Entities
Regex Parsing + Named Entity Recognition
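The regex half of this step can be sketched against the example sentences from the patterns slide. The patterns, the unit table and the function name extract_funding are assumptions made for illustration, not the expressions used in production; investor names are left out because they would come from the named entity recognition part.

```python
import re

# Amount such as "$3.5m" or "$3.5 million" (illustrative pattern, not exhaustive).
AMOUNT_RE = re.compile(
    r"\$(?P<number>\d+(?:\.\d+)?)\s*(?P<unit>million|billion|m|b|k)\b",
    re.IGNORECASE,
)
# Round name such as "Seed" or "Series A".
SERIES_RE = re.compile(r"\b(?P<series>Seed|Series\s+[A-Z])\b")

UNIT_MULTIPLIER = {"k": 1e3, "m": 1e6, "million": 1e6, "b": 1e9, "billion": 1e9}

def extract_funding(sentence):
    """Return the funding amount in USD and the round name, or None if absent."""
    amount_match = AMOUNT_RE.search(sentence)
    series_match = SERIES_RE.search(sentence)
    amount = None
    if amount_match:
        unit = amount_match.group("unit").lower()
        amount = round(float(amount_match.group("number")) * UNIT_MULTIPLIER[unit])
    return {
        "amount_usd": amount,
        "series": series_match.group("series") if series_match else None,
    }

print(extract_funding("...has closed a $3.5m Series A funding round led by Inter Capital, ..."))
print(extract_funding("...has raised $3.5 million in Series A Funding"))
```

Both phrasings normalize to the same structured record, which is what makes the downstream confidence scoring and deduplication possible.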
Spell 4: Probabilistic
Training and using probabilistic models sometimes helps a lot.
Confidence Scoring
“What’s the probability that this extracted information is correct?”
0~1 probability score
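One common way to get a 0~1 score is a logistic function over signals from the earlier stages. The sketch below assumes that approach; the talk does not say which model was used. The feature names, weights and bias are invented for illustration, and in a real system the weights would be learned from labeled extractions rather than typed in.

```python
import math

# Hypothetical weights; in practice they would be learned from labeled data.
WEIGHTS = {
    "article_score": 2.0,    # output of the article classifier
    "sentence_score": 1.5,   # output of the sentence classifier
    "amount_found": 1.0,     # 1.0 if an amount was parsed
    "investor_found": 0.5,   # 1.0 if an investor entity was recognized
}
BIAS = -3.0

def confidence(features):
    """Map stage signals to a probability-like score between 0 and 1."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

strong = confidence({"article_score": 1.0, "sentence_score": 1.0,
                     "amount_found": 1.0, "investor_found": 1.0})
weak = confidence({"article_score": 0.0, "sentence_score": 0.0,
                   "amount_found": 0.0, "investor_found": 0.0})
print(f"strong={strong:.3f} weak={weak:.3f}")
```

A single calibrated number per extraction makes it possible to set thresholds that match the precision the product needs.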
Spell 5: Human + Machine
Let the mighty human being help with part of the job.
Human Administration
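Combining a confidence score with human review usually comes down to thresholds. The following sketch assumes three outcomes (publish automatically, send to a person, discard); the threshold values, the record fields and the function name route are hypothetical.

```python
def route(extractions, auto_accept=0.9, auto_reject=0.2):
    """Split extractions into accepted, needs-human-review and rejected lists."""
    accepted, review, rejected = [], [], []
    for item in extractions:
        if item["confidence"] >= auto_accept:
            accepted.append(item)   # confident enough to publish as is
        elif item["confidence"] <= auto_reject:
            rejected.append(item)   # almost certainly wrong, drop it
        else:
            review.append(item)     # uncertain: let a human decide
    return accepted, review, rejected

# Hypothetical extractions with scores from a confidence scorer.
items = [
    {"company": "A", "confidence": 0.97},
    {"company": "B", "confidence": 0.55},
    {"company": "C", "confidence": 0.05},
]
accepted, review, rejected = route(items)
print(len(accepted), len(review), len(rejected))
```

Moving the thresholds trades human workload against precision, and the reviewed items can be fed back as new training data.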
~$156 billion: total VC funding in 2015
~8,532: VC funding events in 2015
Summary
ML is a powerful Hammer
Spell 1: Know your enemy
Spell 2: Slice and Dice
Spell 3: Understand your domain
Spell 4: Probabilistic
Spell 5: Human + Machine
We are hiring!
Thanks!
Any questions?
You can find me at:
◇ in/evionkim
◇ twitter@evion12
◇ evion12@gmail.com

Data By The Bay 2016 - Black Magic: How to apply Machine Learning to real-world problems