Towards an Optimizer for MLbase
Ameet Talwalkar
Collaborators: Evan Sparks, Michael Franklin, Michael I. Jordan, Tim Kraska
UC Berkeley
Problem: Scalable implementations difficult for ML Developers…
[Diagram labels: Meta-Statistics; ML Contract + Code]
[Screenshot: MATLAB product page]
Key Features
• High-level language for numerical computation, visualization, and application development
• Interactive environment for iterative exploration, design, and problem solving
• Mathematical functions for linear algebra, statistics, Fourier analysis, filtering, optimization, numerical integration, and solving ordinary differential equations
• Built-in graphics for visualizing data and tools for creating custom plots
• Development tools for improving code quality and maintainability and maximizing performance
• Tools for building applications with custom graphical interfaces
• Functions for integrating MATLAB based algorithms with external applications and languages such as C, Java, .NET, and Microsoft® Excel®
The Language of Technical Computing
MATLAB® is a high-level language and interactive environment for numerical computation, visualization, and programming. Using MATLAB, you can analyze data, develop algorithms, and create models and applications. The language, tools, and built-in math functions enable you to explore multiple approaches and reach a solution faster than with spreadsheets or traditional programming languages, such as C/C++ or Java™.
You can use MATLAB for a range of applications, including signal processing and communications, image and video processing, control systems, test and measurement, computational finance, and computational biology. More than a million engineers and scientists in industry and academia use MATLAB, the language of technical computing.
Analyzing and visualizing data using the MATLAB desktop. The MATLAB environment also lets you write programs and develop algorithms and applications.
CHALLENGE: Can we simplify distributed ML development?
Problem: ML is difficult for End Users…
Too many ways to preprocess…
Too many knobs…
Difficult to debug…
Doesn’t scale…
Too many algorithms…
CHALLENGE: Can we automate ML pipeline construction?
MLbase
MLbase aims to simplify development and deployment of scalable ML pipelines
[Stack diagram, top to bottom: MLOpt, MLI, MLlib, Apache Spark]
Spark: Cluster computing system designed for iterative computation
MLlib: Spark’s core ML library
MLI: API to simplify ML development
MLOpt: Declarative layer to automate hyperparameter tuning
MLOpt and MLI are experimental testbeds
MLlib
+ Scalable and fast
+ Simple development environment
+ Part of Spark’s robust ecosystem
[Diagram: SparkSQL, Spark Streaming, MLlib (machine learning), and GraphX (graph) on top of Apache Spark]
Active Development
Initial Release
• Developed by MLbase team in AMPLab (11 contributors)
• Scala, Java
• Shipped with Spark v0.8 (Sep 2013)
15 months later…
• 60+ contributors from various organizations
• Scala, Java, Python
• Many more algorithms and ML utilities
• Improved documentation / code examples, API stability
• Latest release part of Spark v1.1 (Sept 2014)
MLlib, MLI and Roadmap
• MLI: Shield ML Developers from low-level details
  • Provide familiar mathematical operators in distributed setting (tables, matrices, optimization primitives)
  • Standard APIs defining ML algorithms and feature extractors
• MLlib v1.2 will include ML API inspired by MLI
• Longer term for MLlib
  • Scalable implementations of standard ML algorithms and underlying optimization primitives
  • Further support for ML pipeline development (including hyperparameter tuning using ideas from MLOpt)
Feedback and Contributions Encouraged!
Vision
MLOpt
Planner for Large-scale Predictive Analytic Queries
[Diagram labels: SQL, Result; PAQ, Model; ML]
✦ User declaratively specifies task
✦ PAQ = Predictive Analytic Query
✦ Search through MLlib to find the best model/pipeline
SELECT e.sender, e.subject, e.message
FROM Emails e
WHERE e.user = 'Bob'
AND PREDICT(e.spam, e.message) = false GIVEN LabeledData
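One way to read the query above is as an ordinary filter whose predicate is a model trained on the labeled data. The sketch below illustrates only that reading. The function names `train_spam_model` and `run_paq`, the keyword-based "model", and the sample records are all hypothetical placeholders, not part of MLOpt or MLlib, and the real planner would search over actual models instead.

```python
from typing import Callable, Dict, List, Tuple

Email = Dict[str, str]


def train_spam_model(labeled: List[Tuple[str, bool]]) -> Callable[[str], bool]:
    """Toy stand-in for the planner's model search (hypothetical).

    Flags a message as spam if it contains a word seen only in spam examples.
    """
    spam_words, ham_words = set(), set()
    for text, is_spam in labeled:
        (spam_words if is_spam else ham_words).update(text.lower().split())
    markers = spam_words - ham_words
    return lambda message: any(w in markers for w in message.lower().split())


def run_paq(emails: List[Email], labeled: List[Tuple[str, bool]], user: str):
    predict_spam = train_spam_model(labeled)           # PREDICT(...) GIVEN LabeledData
    return [
        (e["sender"], e["subject"], e["message"])      # SELECT
        for e in emails                                # FROM Emails e
        if e["user"] == user                           # WHERE e.user = 'Bob'
        and not predict_spam(e["message"])             # AND PREDICT(...) = false
    ]


# Made-up sample data, for illustration only.
labeled_data = [("win cash now", True), ("lunch at noon", False)]
emails = [
    {"user": "Bob", "sender": "ann", "subject": "food", "message": "lunch at noon"},
    {"user": "Bob", "sender": "x", "subject": "$$$", "message": "win cash now"},
    {"user": "Eve", "sender": "ann", "subject": "hi", "message": "lunch at noon"},
]
result = run_paq(emails, labeled_data, "Bob")
```

Here `result` holds only Bob's non-spam message. The point of the declarative form is that the user never names the model. Choosing it is the planner's job.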
A Standard ML Pipeline
Data → Feature Extraction → Model Training → Final Model
✦ In practice, model building is an iterative process of continuous refinement
✦ Our grand vision is to automate the construction of these pipelines
Training A Model
✦ For each point in dataset
  ✦ compute gradient
  ✦ update model
  ✦ repeat until converged
✦ Requires multiple passes
✦ Common access pattern
  ✦ Naive Bayes, Trees, etc.
✦ Minutes to train an SVM on 200GB of data on a 16-node cluster
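The loop described above can be sketched in a few lines. This is a minimal illustration, assuming a least-squares linear model trained with full-batch gradient descent on a single machine. The slide does not say which loss or update rule is used, and `train_linear_model`, its parameters, and the three data points are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[Sequence[float], float]  # (features, label)


def train_linear_model(data: List[Point], learning_rate: float = 0.5,
                       tol: float = 1e-6, max_passes: int = 10_000):
    """Full-batch gradient descent for least squares (assumed loss)."""
    dim = len(data[0][0])
    w = [0.0] * dim
    for passes in range(1, max_passes + 1):
        grad = [0.0] * dim
        for x, y in data:                                   # for each point in dataset
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j in range(dim):
                grad[j] += err * x[j] / len(data)           # compute gradient
        w = [wi - learning_rate * g for wi, g in zip(w, grad)]  # update model
        if max(abs(g) for g in grad) < tol:                 # repeat until converged
            return w, passes
    return w, max_passes


# Toy data generated from y = 1 + 2 * x; the first feature is a constant bias term.
data = [([1.0, 0.0], 1.0), ([1.0, 1.0], 3.0), ([1.0, 2.0], 5.0)]
weights, passes = train_linear_model(data)
```

Even on three points the loop needs many passes over the data before the gradient becomes small, which is the access pattern the slide highlights.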
The Tricky Part
[Diagram axes: Algorithms, Hyper-Parameters, Featurization]
✦ Algorithms
  ✦ Logistic Regression, SVM, Tree-based, etc.
✦ Algorithm hyper-parameters
  ✦ Learning Rate, Regularization, etc.
✦ Featurization
  ✦ Text: n-grams, TF-IDF
  ✦ Images: Gabor filters, random convolutions
  ✦ Random projection? Scaling?
✦ Start with one aspect of the pipeline: model selection
Automated Model Selection
One Approach
[Plot axes: Learning Rate, Regularization; marked point: Best answer]
✦ Try it all!
  ✦ Search over all hyperparameters, algorithms, features, etc.
✦ Drawbacks
  ✦ Expensive to compute models
  ✦ Hyperparameter space is large
✦ Some version of this still often done in practice!
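The "try it all" approach amounts to an exhaustive grid search. The sketch below assumes a caller-supplied function that trains one model and returns its validation error. `grid_search`, `toy_validation_error`, and the grid values are hypothetical and exist only to make the example runnable.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]


def grid_search(train_and_validate: Callable[[Config], float],
                grid: Dict[str, List[float]]) -> Tuple[float, Config, int]:
    """Train one model per grid point; return (best error, best config, models trained)."""
    names = list(grid)
    best_error, best_config, trained = None, None, 0
    for values in product(*(grid[n] for n in names)):
        config = dict(zip(names, values))
        error = train_and_validate(config)   # the expensive step: a full training run
        trained += 1
        if best_error is None or error < best_error:
            best_error, best_config = error, config
    return best_error, best_config, trained


def toy_validation_error(config: Config) -> float:
    """Hypothetical stand-in for training a model and measuring validation error."""
    return (config["learning_rate"] - 0.1) ** 2 + (config["regularization"] - 0.01) ** 2


grid = {
    "learning_rate": [0.001, 0.01, 0.1, 1.0],
    "regularization": [0.0, 0.01, 0.1],
}
best_error, best_config, models_trained = grid_search(toy_validation_error, grid)
```

With 4 and 3 candidate values, the search trains 4 × 3 = 12 models. The count multiplies with every hyperparameter added, which is the drawback listed above.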
A Better Approach
[Plot axes: Learning Rate, Regularization; marked point: Best answer]
✦ Better resource utilization
  ✦ through batching
✦ Algorithmic Speedups
  ✦ via early stopping / bandits
✦ Improved Search
  ✦ e.g., via randomization
A Tale Of 3 Optimizations
Better Resource Utilization
Algorithmic Speedups
Improved Search
Better Resource Utilization
✦ Typical model update requires 2-4 flops/double
✦ But modern memory much slower than processors
  ✦ We can do 25 flops / double read!
  ✦ This equates to 6-8 model updates per double we read, assuming models fit in cache
✦ Train multiple models simultaneously
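The batching idea can be illustrated by counting how many feature values are read from the dataset. In the sketch below, one pass over the data computes gradients for every candidate model, so each value is read once and reused for all of them. It is a pure-Python counting model under an assumed least-squares loss, not a benchmark and not the MLbase implementation. `batched_gradients` and the sample data are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[Sequence[float], float]  # (features, label)


def batched_gradients(data: List[Point], models: List[List[float]]):
    """One pass over the data that accumulates a gradient for every model."""
    dim = len(models[0])
    grads = [[0.0] * dim for _ in models]
    doubles_read = 0
    for x, y in data:
        doubles_read += len(x)               # each feature value is read once ...
        for m, w in enumerate(models):       # ... and reused for every model
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                grads[m][j] += err * xj
    return grads, doubles_read


data = [([1.0, 0.0], 1.0), ([1.0, 1.0], 3.0), ([1.0, 2.0], 5.0)]
models = [[0.0, 0.0], [1.0, 1.0], [1.0, 2.0]]   # three candidate weight vectors

# Batched: a single pass serves all three models.
batched, batched_reads = batched_gradients(data, models)

# Unbatched baseline: one full pass over the data per model.
one_by_one = [batched_gradients(data, [w]) for w in models]
unbatched = [g[0] for g, _ in one_by_one]
unbatched_reads = sum(reads for _, reads in one_by_one)
```

Both variants produce the same gradients, but the batched pass reads 6 values where the one-model-at-a-time baseline reads 18. On real hardware the saving depends on the models fitting in cache, as the slide notes.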
What Do We See In Spark?
✦ 2x and 5x increase in models trained/sec with batching
✦ Overhead from virtualization, network, etc.
✦ These numbers are with vector-matrix multiplies
✦ Can do better when rewriting in terms of matrix-matrix multiplies
Bandit Search
[Plot axes: Learning Rate, Regularization; marked point: Best answer]
✦ Each point is a trained model
✦ Some models look bad early
  ✦ So we give up early!
✦ Can frame this as a multi-armed bandit problem
  ✦ Each model is an arm
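The slide does not name a specific bandit algorithm, so the sketch below swaps in successive halving, one simple strategy in this family. Every arm gets a little training, the worse half is dropped, and the survivors continue. `successive_halving`, `toy_partial_train`, and the candidate learning rates are hypothetical.

```python
from typing import Callable, List, Tuple


def successive_halving(configs: List[float],
                       partial_train: Callable[[float, int], float],
                       rounds: int) -> Tuple[float, int]:
    """Return (best surviving config, total training units spent)."""
    survivors = list(configs)
    units_spent = 0
    for r in range(1, rounds + 1):
        scored = []
        for c in survivors:
            scored.append((partial_train(c, r), c))   # validation error after r units
            units_spent += 1
        scored.sort(key=lambda pair: pair[0])
        keep = max(1, len(scored) // 2)               # give up early on the worse half
        survivors = [c for _, c in scored[:keep]]
    return survivors[0], units_spent


def toy_partial_train(learning_rate: float, units: int) -> float:
    """Hypothetical error curve: improves with training, best near learning_rate = 0.1."""
    return abs(learning_rate - 0.1) + 1.0 / (units + 1)


arms = [0.0001, 0.001, 0.01, 0.05, 0.1, 0.2, 0.5, 1.0]   # each model is an arm
best_arm, units_spent = successive_halving(arms, toy_partial_train, rounds=3)
full_cost = len(arms) * 3                                # training every arm for 3 units
```

With 8 arms and 3 rounds the survivors shrink 8 → 4 → 2 → 1, so the search spends 8 + 4 + 2 = 14 training units instead of the 24 needed to train every model to the end.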
What Search Method?
[Figure: Comparison of Search Methods Across Learning Problems. Panels by method (GRID, NELDER_MEAD, POWELL, RANDOM, SMAC, SPEARMINT, TPE) and dataset (australian, breast, diabetes, fourclass, splice); x-axis: Method and Maximum Calls (16, 81, 256, 625); y-axis: Dataset and Validation Error (0.0 to 0.5)]
✦ Various derivative-free optimization techniques
  ✦ Simple ones (Grid, Random)
  ✦ Classic Derivative-Free (Nelder-Mead, Powell’s method)
  ✦ Bayesian (SMAC, TPE, Spearmint)
✦ What should we do?
  ✦ Tried on 5 datasets, optimized over 4 hyperparameters!
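Of the methods listed, random search is the simplest to sketch. The example below assumes log-uniform sampling of each hyperparameter, a common choice that the slides do not specify. `random_search`, `toy_validation_error`, and the search ranges are hypothetical. The call budgets in the figure (16, 81, 256, 625) equal k^4 for k = 2 to 5, which is consistent with a grid of k values on each of the 4 hyperparameters, though the slides do not state this.

```python
import math
import random
from typing import Callable, Dict, Tuple

Config = Dict[str, float]


def random_search(train_and_validate: Callable[[Config], float],
                  space: Dict[str, Tuple[float, float]],
                  max_calls: int, seed: int = 0) -> Tuple[float, Config]:
    """Sample max_calls configurations log-uniformly; return (best error, best config)."""
    rng = random.Random(seed)
    best = None
    for _ in range(max_calls):
        config = {
            name: math.exp(rng.uniform(math.log(lo), math.log(hi)))
            for name, (lo, hi) in space.items()
        }
        error = train_and_validate(config)
        if best is None or error < best[0]:
            best = (error, config)
    return best


def toy_validation_error(config: Config) -> float:
    """Hypothetical stand-in for a training run; minimized at (0.1, 0.01)."""
    return ((math.log10(config["learning_rate"]) + 1.0) ** 2
            + (math.log10(config["regularization"]) + 2.0) ** 2)


space = {"learning_rate": (1e-4, 1.0), "regularization": (1e-4, 1.0)}
grid_budgets = [k ** 4 for k in (2, 3, 4, 5)]    # [16, 81, 256, 625]
results = {calls: random_search(toy_validation_error, space, calls)
           for calls in grid_budgets}
```

Because every run here uses the same seed, a larger budget extends the same sequence of samples, so the best error can only stay the same or improve as the number of calls grows.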
Putting It All Together
[Figure: Model Convergence Over Time. x-axis: Time elapsed (m), 0 to 800; y-axis: Best Validation Error Seen So Far; series by search method: Grid − Unoptimized, Random − Optimized, TPE − Optimized]
✦ First version of MLbase optimizer
✦ 30GB dense images (240K x 16K)
✦ 2 model families, 5 hyperparams
✦ Baseline: grid search
✦ Our method: combination of
  ✦ Batching
  ✦ Early stopping / Bandit
  ✦ Random or TPE
20x speedup compared to grid search: 15 minutes vs 5 hours!
Does It Scale?
✦ 1.5TB dataset (1.2M x 160K)
✦ 128 nodes, thousands of passes over data
✦ Tried 32 models in 15 hours
✦ Good answer after 11 hours
[Figure: Convergence of Model Accuracy on 1.5TB Dataset. x-axis: Time elapsed (h); y-axis: Best Validation Error Seen So Far]
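The quoted dataset sizes follow from the matrix dimensions if each entry is stored as an 8-byte double and "K" and "M" mean thousand and million. Both are assumptions, since the slides do not state the storage format. The helper `dense_size_bytes` is hypothetical.

```python
def dense_size_bytes(rows: int, cols: int, bytes_per_value: int = 8) -> int:
    """Size of a dense rows x cols matrix, assuming 8-byte doubles."""
    return rows * cols * bytes_per_value


small = dense_size_bytes(240_000, 16_000)       # the "30GB dense images (240K x 16K)" set
large = dense_size_bytes(1_200_000, 160_000)    # the "1.5TB dataset (1.2M x 160K)"

small_gb = small / 1e9     # 30.72
large_tb = large / 1e12    # 1.536
scale_up = large // small  # 50
```

Under these assumptions the two datasets come to about 30.7 GB and 1.54 TB, matching the rounded figures on the slides, and the larger one is 50 times the size of the smaller.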
Future Work
Data → Feature Extraction → Model Training → Final Model
Automated ML Pipeline Construction
[Diagram: example pipeline. Labels include Data, Image Parser, Normalizer, Convolver, Symmetric Rectifier (ident, abs), Pooler (sqrt, mean; ident, mean), Zipper, Global Patch Extractor, Patch Whitener, KMeans Clusterer, Feature Extractor, Label Extractor, Linear Solver, Linear Mapper, Model, Test Data, Error Computer, and Test Error. Legend: No Hyperparameters, A few Hyperparameters, Lotsa Hyperparameters]
Slide courtesy of Evan Sparks
Other Future Work
✦ Ensembling
✦ Leverage sampling
✦ Better parallelism for smaller datasets
✦ Multiple hypothesis testing issues
MLOpt: Declarative layer that aims to automate ML pipeline construction
MLI: API to simplify ML development
MLlib: Spark’s core ML library
Spark: Cluster computing system designed for iterative computation
THANKS! QUESTIONS?
www.mlbase.org

Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SF

Soumith Chintala - Increasing the Impact of AI Through Better Software
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime Changes
 

Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SF

  • 2. Problem: Scalable implementations difficult for ML developers… [Diagram labels: Meta-Statistics; ML Contract + Code; ML. Embedded screenshot: MATLAB product overview, "The Language of Technical Computing", listing key features such as a high-level language for numerical computation, an interactive environment, mathematical functions, and built-in graphics.]
  • 4. Problem: Scalable implementations difficult for ML developers… [Diagram labels: Meta-Statistics; ML Contract + Code; ML] CHALLENGE: Can we simplify distributed ML development?
  • 5. Problem: ML is difficult for end users… Too many ways to preprocess… Too many knobs… Difficult to debug… Doesn't scale… Too many algorithms…
  • 6. Problem: ML is difficult for end users… Too many ways to preprocess… Too many knobs… Difficult to debug… Doesn't scale… Too many algorithms… CHALLENGE: Can we automate ML pipeline construction?
  • 7. MLbase [Stack diagram: MLOpt, MLI, MLlib, Apache Spark] MLbase aims to simplify development and deployment of scalable ML pipelines. Spark: Cluster computing system designed for iterative computation. MLlib: Spark's core ML library. MLI: API to simplify ML development. MLOpt: Declarative layer to automate hyperparameter tuning.
  • 8. MLbase [Stack diagram: MLOpt, MLI, MLlib, Apache Spark] MLbase aims to simplify development and deployment of scalable ML pipelines. MLOpt and MLI are experimental testbeds. Spark: Cluster computing system designed for iterative computation. MLlib: Spark's core ML library. MLI: API to simplify ML development. MLOpt: Declarative layer to automate hyperparameter tuning.
  • 9. MLlib: + Scalable and fast + Simple development environment + Part of Spark's robust ecosystem [Diagram: Apache Spark with SparkSQL, Spark Streaming, MLlib (machine learning), GraphX (graph)]
  • 10. Active Development. Initial Release: • Developed by MLbase team in AMPLab (11 contributors) • Scala, Java • Shipped with Spark v0.8 (Sep 2013). 15 months later…: • 60+ contributors from various organizations • Scala, Java, Python • Many more algorithms and ML utilities • Improved documentation / code examples, API stability • Latest release part of Spark v1.1 (Sept 2014)
  • 11. MLlib, MLI and Roadmap • MLI: Shield ML developers from low-level details • Provide familiar mathematical operators in distributed setting (tables, matrices, optimization primitives) • Standard APIs defining ML algorithms and feature extractors • MLlib v1.2 will include ML API inspired by MLI • Longer term for MLlib • Scalable implementations of standard ML algorithms and underlying optimization primitives • Further support for ML pipeline development (including hyperparameter tuning using ideas from MLOpt)
  • 12. MLlib, MLI and Roadmap • MLI: Shield ML developers from low-level details • Provide familiar mathematical operators in distributed setting (tables, matrices, optimization primitives) • Standard APIs defining ML algorithms and feature extractors • MLlib v1.2 will include ML API inspired by MLI • Longer term for MLlib • Scalable implementations of standard ML algorithms and underlying optimization primitives • Further support for ML pipeline development (including hyperparameter tuning using ideas from MLOpt) Feedback and Contributions Encouraged!
  • 14. Planner for Large-scale Predictive Analytic Queries [Diagram: SQL, Result; PAQ, Model; ML] ✦ User declaratively specifies task ✦ PAQ = Predictive Analytic Query ✦ Search through MLlib to find the best model/pipeline. Example query: SELECT e.sender, e.subject, e.message FROM Emails e WHERE e.user = 'Bob' AND PREDICT(e.spam, e.message) = false GIVEN LabeledData
  • 15. A Standard ML Pipeline: Data → Feature Extraction → Model Training → Final Model ✦ In practice, model building is an iterative process of continuous refinement ✦ Our grand vision is to automate the construction of these pipelines
  • 16. Training A Model ✦ For each point in dataset ✦ compute gradient ✦ update model ✦ repeat until converged ✦ Requires multiple passes ✦ Common access pattern ✦ Naive Bayes, Trees, etc. ✦ Minutes to train an SVM on 200GB of data on a 16-node cluster
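
The loop on slide 16 (for each point, compute the gradient, update the model, repeat until converged) can be sketched in plain Python. This is only an illustration using logistic regression on toy data. The helper name `train_logistic_sgd`, the learning rate, the pass limit, and the stopping tolerance are assumptions made for the example, not MLlib's implementation.

```python
import math

def sigmoid(z):
    # Logistic function, written to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic_sgd(points, labels, learning_rate=0.5, max_passes=100, tol=1e-3):
    """Stochastic gradient descent for logistic regression (hypothetical helper)."""
    weights = [0.0] * len(points[0])
    passes = 0
    for passes in range(1, max_passes + 1):       # repeat until converged
        largest_step = 0.0
        for x, y in zip(points, labels):          # for each point in the dataset
            prediction = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
            for j, xj in enumerate(x):
                # Gradient of the log-loss for one point is (prediction - y) * x_j.
                step = learning_rate * (prediction - y) * xj
                weights[j] -= step                # update model
                largest_step = max(largest_step, abs(step))
        if largest_step < tol:                    # assumed stopping rule
            break
    return weights, passes

# Toy data: the first feature is a constant bias term.
points = [[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]]
labels = [0, 0, 1, 1]
weights, passes = train_logistic_sgd(points, labels)
predictions = [
    1 if sigmoid(sum(w * xi for w, xi in zip(weights, x))) > 0.5 else 0
    for x in points
]
```

Every pass touches every data point, which is why the slide stresses multiple passes: on a 200GB dataset the cost of re-reading the data dominates.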
  • 17. The Tricky Part ✦ Algorithms ✦ Logistic Regression, SVM, Tree-based, etc. ✦ Algorithm hyper-parameters ✦ Learning Rate, Regularization, etc. ✦ Featurization ✦ Text: n-grams, TF-IDF ✦ Images: Gabor filters, random convolutions ✦ Random projection? Scaling? [Diagram axes: Algorithms, Hyper Parameters, Featurization]
  • 18. A Standard ML Pipeline: Data → Feature Extraction → Model Training → Final Model ✦ In practice, model building is an iterative process of continuous refinement ✦ Our grand vision is to automate the construction of these pipelines ✦ Start with one aspect of the pipeline - model selection. Automated Model Selection
  • 19. One Approach ✦ Try it all! ✦ Search over all hyperparameters, algorithms, features, etc. ✦ Drawbacks ✦ Expensive to compute models ✦ Hyperparameter space is large ✦ Some version of this still often done in practice! [Plot axes: Learning Rate, Regularization; marker: Best answer]
  • 20. A Better Approach ✦ Better resource utilization ✦ through batching ✦ Algorithmic Speedups ✦ via early stopping / bandits ✦ Improved Search ✦ e.g., via randomization [Plot axes: Learning Rate, Regularization; marker: Best answer]
  • 22. Better Resource Utilization ✦ Typical model update requires 2-4 flops/double ✦ But modern memory much slower than processors ✦ We can do 25 flops / double read! ✦ This equates to 6-8 model updates per double we read, assuming models fit in cache ✦ Train multiple models simultaneously
  • 23. What Do We See In Spark? ✦ 2x and 5x increase in models trained/sec with batching ✦ Overhead from virtualization, network, etc.
  • 24. What Do We See In Spark? ✦ These numbers are with vector-matrix multiplies ✦ Can do better when rewriting in terms of matrix-matrix multiplies
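
The batching idea from slides 22 to 24 is to read each data point once and apply it to several models before moving on. A minimal pure-Python sketch follows, assuming squared loss with an L2 penalty. The function `batched_sgd_pass` and the `(learning_rate, l2_penalty)` configuration pairs are hypothetical names for illustration; the real gains described on the slides come from cache-resident models and matrix-matrix multiplies, which this sketch does not attempt to reproduce.

```python
def batched_sgd_pass(points, labels, models, configs):
    """One pass over the data that updates every model in the batch.

    `models` is a list of weight vectors and `configs` a parallel list of
    (learning_rate, l2_penalty) pairs. Returns the number of data reads.
    """
    reads = 0
    for x, y in zip(points, labels):
        reads += 1  # the point is read once...
        for weights, (learning_rate, l2_penalty) in zip(models, configs):
            # ...and reused for every model in the batch.
            error = sum(w * xi for w, xi in zip(weights, x)) - y
            for j, xj in enumerate(x):
                weights[j] -= learning_rate * (error * xj + l2_penalty * weights[j])
    return reads

# Toy regression data generated by y = 2 * x.
points = [[1.0], [2.0], [3.0]]
labels = [2.0, 4.0, 6.0]
configs = [(0.1, 0.0), (0.05, 0.0), (0.1, 0.1)]   # three hyperparameter settings
models = [[0.0] for _ in configs]

total_reads = 0
for _ in range(50):
    total_reads += batched_sgd_pass(points, labels, models, configs)
```

Training the three models separately would need three times as many data reads for the same number of passes; here the count depends only on the data size and the number of passes.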
  • 26. Bandit Search ✦ Each point is a trained model ✦ Some models look bad early ✦ So we give up early! ✦ Can frame this as a multi-armed bandit problem ✦ Each model is an arm [Plot axes: Learning Rate, Regularization; marker: Best answer]
  • 27. Bandit Search ✦ Each point is a trained model ✦ Some models look bad early ✦ So we give up early! ✦ Can frame this as a multi-armed bandit problem ✦ Each model is an arm
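
One way to picture "give up early" is an elimination loop: train every surviving model a little, then drop any whose validation error is too far from the current best. The sketch below is an assumption-laden illustration, not the rule used in MLbase. The `slack` threshold, the function `bandit_search`, and the simulated error curves (an error floor approached geometrically) are all invented for the example.

```python
def bandit_search(configs, train_step, validation_error, rounds=5, slack=0.5):
    """Treat each configuration as an arm and eliminate arms that look bad early."""
    states = {name: None for name in configs}
    alive = set(configs)
    steps_spent = 0
    for _ in range(rounds):
        errors = {}
        for name in sorted(alive):
            states[name] = train_step(configs[name], states[name])  # a bit more training
            steps_spent += 1
            errors[name] = validation_error(states[name])
        best = min(errors.values())
        # Keep only arms within (1 + slack) of the best error seen this round.
        alive = {name for name in alive if errors[name] <= best * (1 + slack)}
    winner = min(alive, key=lambda name: validation_error(states[name]))
    return winner, steps_spent

# Simulated models: each config is just the error floor the model converges to.
def train_step(floor, state):
    steps = 0 if state is None else state[1]
    return (floor, steps + 1)

def validation_error(state):
    floor, steps = state
    return floor + (1 - floor) * 0.5 ** steps

configs = {"a": 0.1, "b": 0.3, "c": 0.6}
winner, steps_spent = bandit_search(configs, train_step, validation_error)
```

In this toy run, training all three models for five rounds would cost 15 training steps; elimination stops spending effort on the weaker arms after the second and third rounds.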
  • 29. What Search Method? ✦ Various derivative-free optimization techniques ✦ Simple ones (Grid, Random) ✦ Classic Derivative-Free (Nelder-Mead, Powell's method) ✦ Bayesian (SMAC, TPE, Spearmint) ✦ What should we do? ✦ Tried on 5 datasets, optimized over 4 hyperparameters! [Figure: Comparison of Search Methods Across Learning Problems. Panels by method (GRID, NELDER_MEAD, POWELL, RANDOM, SMAC, SPEARMINT, TPE) and dataset (australian, breast, diabetes, fourclass, splice); x-axis: Maximum Calls (16, 81, 256, 625); y-axis: Validation Error (0.0 to 0.5)]
  • 30. What Search Method? [Figure: Comparison of Search Methods Across Learning Problems. Panels by method (GRID, NELDER_MEAD, POWELL, RANDOM, SMAC, SPEARMINT, TPE) and dataset (australian, breast, diabetes, fourclass, splice); x-axis: Maximum Calls (16, 81, 256, 625); y-axis: Validation Error (0.0 to 0.5)]
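
The two simplest methods in the comparison, grid and random search, differ only in how they pick configurations under a fixed call budget. The sketch below contrasts them on two hyperparameters. The objective `validation_error` is a synthetic stand-in for training and validating a model, and the log-scale search range of 1e-4 to 1 is an assumption. The budgets on the slide (16, 81, 256, 625) equal 2^4 through 5^4, which would match grids of 2 to 5 values per axis over four hyperparameters; this example uses 4 values per axis over two hyperparameters to reach 16 calls.

```python
import itertools
import math
import random

def validation_error(learning_rate, regularization):
    # Synthetic objective with its minimum (0.1) at learning_rate=1e-2, regularization=1e-3.
    return 0.1 + 0.05 * (
        (math.log10(learning_rate) + 2) ** 2 + (math.log10(regularization) + 3) ** 2
    )

def grid_search(values_per_axis):
    # Evenly spaced exponents between -4 and 0 on each axis.
    exponents = [-4 + 4 * i / (values_per_axis - 1) for i in range(values_per_axis)]
    grid = [(10 ** a, 10 ** b) for a, b in itertools.product(exponents, repeat=2)]
    return min(grid, key=lambda c: validation_error(*c)), len(grid)

def random_search(budget, seed=0):
    # Sample both hyperparameters log-uniformly from the same range.
    rng = random.Random(seed)
    samples = [(10 ** rng.uniform(-4, 0), 10 ** rng.uniform(-4, 0)) for _ in range(budget)]
    return min(samples, key=lambda c: validation_error(*c)), len(samples)

grid_best, grid_calls = grid_search(values_per_axis=4)   # 4 x 4 = 16 calls
random_best, random_calls = random_search(budget=16)     # same budget
```

Which method wins depends on the objective and the seed, so the sketch makes no claim about the ranking; the Bayesian methods on the slide (SMAC, TPE, Spearmint) additionally use past evaluations to choose the next configuration.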
  • 31. Putting It All Together ✦ First version of MLbase optimizer ✦ 30GB dense images (240K x 16K) ✦ 2 model families, 5 hyperparams ✦ Baseline: grid search ✦ Our method: combination of ✦ Batching ✦ Early stopping / Bandit ✦ Random or TPE. 20x speedup compared to grid search: 15 minutes vs 5 hours! [Figure: Model Convergence Over Time. x-axis: Time elapsed (m), 0 to 800; y-axis: Best Validation Error Seen So Far; series: Grid - Unoptimized, Random - Optimized, TPE - Optimized]
  • 32. Does It Scale? ✦ 1.5TB dataset (1.2M x 160K) ✦ 128 nodes, thousands of passes over data ✦ Tried 32 models in 15 hours ✦ Good answer after 11 hours [Figure: Convergence of Model Accuracy on 1.5TB Dataset. x-axis: Time elapsed (h); y-axis: Best Validation Error Seen So Far]
  • 33. Future Work: Data → Feature Extraction → Model Training → Final Model. Automated ML Pipeline Construction
  • 34. [Pipeline diagram. Nodes: Data, Image Parser, Normalizer, Convolver, Symmetric Rectifier, Pooler, Zipper, Linear Solver, Patch Extractor, Patch Whitener, KMeans Clusterer, Feature Extractor, Label Extractor, Linear Mapper, Model, Test Data, Error Computer, Test Error. Operator labels: sqrt,mean; ident,abs; ident,mean; Global. Legend: No Hyperparameters, A few Hyperparameters, Lotsa Hyperparameters.] Slide courtesy of Evan Sparks
  • 35. Other Future Work ✦ Ensembling ✦ Leverage sampling ✦ Better parallelism for smaller datasets ✦ Multiple hypothesis testing issues
  • 36. MLOpt: Declarative layer that aims to automate ML pipeline construction. MLI: API to simplify ML development. MLlib: Spark's core ML library. Spark: Cluster computing system designed for iterative computation. [Stack diagram: MLOpt, MLI, MLlib, Apache Spark] THANKS! QUESTIONS? www.mlbase.org