SlideShare a Scribd company logo
An Open Source Machine Learning Server 
for Developers 
@PredictionIO #PredictionIO 
Simon Chan 
simon@prediction.io
Thank you for having me here today! 
• Simon Chan - CEO of PredictionIO 
• A small team of Data Scientists and Engineers 
• Mainly based in Silicon Valley, also London and Hong Kong
Top Github Open Source 
• Over 5000 developers engaged 
• Powering over 200 applications
Talk Focus: 
• Machine Learning - A (Very) Brief Review 
• Challenges We Face When Building PredictionIO
Machine Learning is Simple?
I am going to give an 
example that will make 
you… HUNGRY!
F FOOD Club – Menu 
FOOD 
CLUB
Coding time…. 
# Using PredictionIO 
# Collect Data 
cli = predictionio.EventClient("<my_app_id>") 
cli.record_user_action_on_item("buy", "John", “BulgogiA") 
# Predict top preferences 
eng = predictionio.EngineClient("<my_engine_url>") 
rec = eng.send_query({"uid" : "John", "n" : 5})
The Magic Behind: Engine 
1. Data Sourcing and Preparation 
2. Algorithm 
3. Serving 
4. Evaluation
Challenges and Solutions
Architectural Challenge 1 
Workflow Co-ordination on a Distributed Cluster
Needs: 
•Support multiple distributed engines 
•Support multiple algorithms to execute in parallel 
How to coordinate the workflow when you have 
more pending tasks than processing units?
Attempt #1 
Use a database system to store tasks, and 
have a pool of workers pull tasks from it. 
•Inefficient. Database becomes bottleneck 
and potentially single point of failure.
Attempt #2 
Use an Akka cluster. 
Akka is a toolkit and runtime for building highly 
concurrent, distributed, and fault tolerant event-driven 
applications on the JVM. 
•Fundamentally the same problem with the above. 
•Need to build management suite on top.
Solution 
Apache Spark: directed acyclic graph 
(DAG) scheduling 
Adapts to many different infrastructure: 
Apache Spark standalone cluster, Apache 
Hadoop 2 YARN, Apache Mesos. 
Source: http://upload.wikimedia.org/wikipedia/commons/3/39/Directed_acyclic_graph_3.svg
Solution Source Code: 
http://github.com/predictionio
Architectural Challenge 2 
Distributed In-memory Model Retrieval
Needs: 
•Engines produce models that are 
distributed across a cluster. 
Requires a way to serve these distributed 
in-memory models to queries in real-time.
Solution 
All PredictionIO engine instances are launched 
inside a “SparkContext”. 
A SparkContext represents the connection to a 
Spark cluster, and can be used to create RDDs, 
accumulators and broadcast variables on that 
cluster. 
Source: http://bighadoop.files.wordpress.com/2014/04/spark-architecture.png
•When an engine is local to a single 
machine, it loads the model to its memory. 
•When an engine is distributed, 
SparkContext will automatically load the 
model on a cluster.
Conceptual Code for the Solution 
val sc = SparkContext(conf) 
... 
val model = 
if (model_is_distributed) { 
if (model_is_persisted) { 
sc.objectFile(model_on_HDFS) 
} else { 
engine.algo.train() 
} 
} else { 
... 
} 
}
PredictionIO 0.8
Built-in Engines: 
•Item Recommendation 
•Item Rank 
•Item Similarity
Create an Engine Instance Project…. 
$ pio instance io.prediction.engines.itemrec 
$ cd io.prediction.engines.itemrec 
$ pio register
Collect Event Data…. 
cli = predictionio.EventClient("<app_id>") 
cli.record_user_action_on_item("like", "John", “bulgogi_12”) 
cli.record_user_action_on_item("view", "John", “bimbimbap_13”)
Configurate the Engine Instance settings 
in params/datasource.json 
{ 
"appId": <app_id>, 
"actions": [ 
"view", "like", ... 
], ... 
}
Train the Data Model 
$ pio train 
Deploy the Engine Instance 
$ pio deploy
Retrieve Prediction Results 
from predictionio import EngineClient 
client = EngineClient(url="http://localhost:8000") 
prediction = client.send_query({"uid": "John", "n": 3}) 
print prediction 
Output 
{u'items': [{u'272': 9.929327011108398}, {u'313': 
9.92607593536377}, {u’347': 9.92170524597168}]}
You can also…. 
• Change algorithm 
• Tune algorithm parameter 
• Compare and evaluate algorithm 
• Add custom business logics
SDKs for: 
• Python 
• Ruby 
• PHP 
• Java / Andriod 
• Scala 
• Node.js 
• iOS 
• Meteor 
• more….
Also, 
build your own Engine!
Applications 
of 
Machine Learning 
Speech Recognition 
Personal Newsfeed 
SPAM Filtering 
Recommendation 
Driverless Car 
Churn Prediction 
Ad Targeting 
Fraud Detection 
{
감사합니다 
Korean Documentation (Beta)! 
http://docs.prediction.io/kr 
- @PredictionIO 
- prediction.io - Newsletters 
- github.com/predictionio

More Related Content

What's hot

Big Wins with Small Data: PredictionIO in Ecommerce
Big Wins with Small Data: PredictionIO in EcommerceBig Wins with Small Data: PredictionIO in Ecommerce
Big Wins with Small Data: PredictionIO in Ecommerce
David Jones
 
Using Azure Machine Learning Models
Using Azure Machine Learning ModelsUsing Azure Machine Learning Models
Using Azure Machine Learning Models
Eng Teong Cheah
 
Azure によるスピードレイヤの分析アーキテクチャ
Azure によるスピードレイヤの分析アーキテクチャAzure によるスピードレイヤの分析アーキテクチャ
Azure によるスピードレイヤの分析アーキテクチャ
Deep Learning Lab(ディープラーニング・ラボ)
 
Augmenting Machine Learning with Databricks Labs AutoML Toolkit
Augmenting Machine Learning with Databricks Labs AutoML ToolkitAugmenting Machine Learning with Databricks Labs AutoML Toolkit
Augmenting Machine Learning with Databricks Labs AutoML Toolkit
Databricks
 
What is going on - Application diagnostics on Azure - TechDays Finland
What is going on - Application diagnostics on Azure - TechDays FinlandWhat is going on - Application diagnostics on Azure - TechDays Finland
What is going on - Application diagnostics on Azure - TechDays Finland
Maarten Balliauw
 
Azure Machine Learning tutorial
Azure Machine Learning tutorialAzure Machine Learning tutorial
Azure Machine Learning tutorial
Giacomo Lanciano
 
An introduction to Machine Learning with scikit-learn (October 2018)
An introduction to Machine Learning with scikit-learn (October 2018)An introduction to Machine Learning with scikit-learn (October 2018)
An introduction to Machine Learning with scikit-learn (October 2018)
Julien SIMON
 
StreamSQL Feature Store (Apache Pulsar Summit)
StreamSQL Feature Store (Apache Pulsar Summit)StreamSQL Feature Store (Apache Pulsar Summit)
StreamSQL Feature Store (Apache Pulsar Summit)
Simba Khadder
 
Best practices with Microsoft Graph: Making your applications more performant...
Best practices with Microsoft Graph: Making your applications more performant...Best practices with Microsoft Graph: Making your applications more performant...
Best practices with Microsoft Graph: Making your applications more performant...
Microsoft Tech Community
 
Feature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systemsFeature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systems
Andrzej Michałowski
 
Optimizing PV energy yield with Elasticsearch and graphQL
Optimizing PV energy yield with Elasticsearch and graphQLOptimizing PV energy yield with Elasticsearch and graphQL
Optimizing PV energy yield with Elasticsearch and graphQL
Chijioke “CJ” Ejimuda
 
Sergii Baidachnyi ITEM 2018
Sergii Baidachnyi ITEM 2018Sergii Baidachnyi ITEM 2018
Sergii Baidachnyi ITEM 2018
ITEM
 
Mastering Azure Monitor
Mastering Azure MonitorMastering Azure Monitor
Mastering Azure Monitor
Richard Conway
 
Simplifying Model Management with MLflow
Simplifying Model Management with MLflowSimplifying Model Management with MLflow
Simplifying Model Management with MLflow
Databricks
 
Sundar Ranganathan, NetApp + Vinod Iyengar, H2O.ai - Driverless AI integratio...
Sundar Ranganathan, NetApp + Vinod Iyengar, H2O.ai - Driverless AI integratio...Sundar Ranganathan, NetApp + Vinod Iyengar, H2O.ai - Driverless AI integratio...
Sundar Ranganathan, NetApp + Vinod Iyengar, H2O.ai - Driverless AI integratio...
Sri Ambati
 
Analytics in the Cloud
Analytics in the CloudAnalytics in the Cloud
Analytics in the Cloud
Ross McNeely
 
Krish Azure AI webinar
Krish Azure AI webinarKrish Azure AI webinar
Krish Azure AI webinar
Kamal Pandey
 
App development with constraint layout, kotlin &amp; firebase
App development with constraint layout, kotlin &amp; firebaseApp development with constraint layout, kotlin &amp; firebase
App development with constraint layout, kotlin &amp; firebase
Pankaj Rai
 
Autonomous analytics on streaming data
Autonomous analytics on streaming dataAutonomous analytics on streaming data
Autonomous analytics on streaming data
Claudiu Barbura
 
Hopsworks MLOps World talk june 21
Hopsworks MLOps World talk june 21Hopsworks MLOps World talk june 21
Hopsworks MLOps World talk june 21
Jim Dowling
 

What's hot (20)

Big Wins with Small Data: PredictionIO in Ecommerce
Big Wins with Small Data: PredictionIO in EcommerceBig Wins with Small Data: PredictionIO in Ecommerce
Big Wins with Small Data: PredictionIO in Ecommerce
 
Using Azure Machine Learning Models
Using Azure Machine Learning ModelsUsing Azure Machine Learning Models
Using Azure Machine Learning Models
 
Azure によるスピードレイヤの分析アーキテクチャ
Azure によるスピードレイヤの分析アーキテクチャAzure によるスピードレイヤの分析アーキテクチャ
Azure によるスピードレイヤの分析アーキテクチャ
 
Augmenting Machine Learning with Databricks Labs AutoML Toolkit
Augmenting Machine Learning with Databricks Labs AutoML ToolkitAugmenting Machine Learning with Databricks Labs AutoML Toolkit
Augmenting Machine Learning with Databricks Labs AutoML Toolkit
 
What is going on - Application diagnostics on Azure - TechDays Finland
What is going on - Application diagnostics on Azure - TechDays FinlandWhat is going on - Application diagnostics on Azure - TechDays Finland
What is going on - Application diagnostics on Azure - TechDays Finland
 
Azure Machine Learning tutorial
Azure Machine Learning tutorialAzure Machine Learning tutorial
Azure Machine Learning tutorial
 
An introduction to Machine Learning with scikit-learn (October 2018)
An introduction to Machine Learning with scikit-learn (October 2018)An introduction to Machine Learning with scikit-learn (October 2018)
An introduction to Machine Learning with scikit-learn (October 2018)
 
StreamSQL Feature Store (Apache Pulsar Summit)
StreamSQL Feature Store (Apache Pulsar Summit)StreamSQL Feature Store (Apache Pulsar Summit)
StreamSQL Feature Store (Apache Pulsar Summit)
 
Best practices with Microsoft Graph: Making your applications more performant...
Best practices with Microsoft Graph: Making your applications more performant...Best practices with Microsoft Graph: Making your applications more performant...
Best practices with Microsoft Graph: Making your applications more performant...
 
Feature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systemsFeature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systems
 
Optimizing PV energy yield with Elasticsearch and graphQL
Optimizing PV energy yield with Elasticsearch and graphQLOptimizing PV energy yield with Elasticsearch and graphQL
Optimizing PV energy yield with Elasticsearch and graphQL
 
Sergii Baidachnyi ITEM 2018
Sergii Baidachnyi ITEM 2018Sergii Baidachnyi ITEM 2018
Sergii Baidachnyi ITEM 2018
 
Mastering Azure Monitor
Mastering Azure MonitorMastering Azure Monitor
Mastering Azure Monitor
 
Simplifying Model Management with MLflow
Simplifying Model Management with MLflowSimplifying Model Management with MLflow
Simplifying Model Management with MLflow
 
Sundar Ranganathan, NetApp + Vinod Iyengar, H2O.ai - Driverless AI integratio...
Sundar Ranganathan, NetApp + Vinod Iyengar, H2O.ai - Driverless AI integratio...Sundar Ranganathan, NetApp + Vinod Iyengar, H2O.ai - Driverless AI integratio...
Sundar Ranganathan, NetApp + Vinod Iyengar, H2O.ai - Driverless AI integratio...
 
Analytics in the Cloud
Analytics in the CloudAnalytics in the Cloud
Analytics in the Cloud
 
Krish Azure AI webinar
Krish Azure AI webinarKrish Azure AI webinar
Krish Azure AI webinar
 
App development with constraint layout, kotlin &amp; firebase
App development with constraint layout, kotlin &amp; firebaseApp development with constraint layout, kotlin &amp; firebase
App development with constraint layout, kotlin &amp; firebase
 
Autonomous analytics on streaming data
Autonomous analytics on streaming dataAutonomous analytics on streaming data
Autonomous analytics on streaming data
 
Hopsworks MLOps World talk june 21
Hopsworks MLOps World talk june 21Hopsworks MLOps World talk june 21
Hopsworks MLOps World talk june 21
 

Viewers also liked

PredictionIO - The 1st International Conference on Predictive APIs and Apps
PredictionIO - The 1st International Conference on Predictive APIs and AppsPredictionIO - The 1st International Conference on Predictive APIs and Apps
PredictionIO - The 1st International Conference on Predictive APIs and Apps
predictionio
 
Prediction io–final 2014-jp-handout
Prediction io–final 2014-jp-handoutPrediction io–final 2014-jp-handout
Prediction io–final 2014-jp-handout
Ha Phuong
 
Discovery
DiscoveryDiscovery
Discovery
Pat Ferrel
 
Co-occurrence Based Recommendations with Mahout, Scala and Spark
Co-occurrence Based Recommendations with Mahout, Scala and SparkCo-occurrence Based Recommendations with Mahout, Scala and Spark
Co-occurrence Based Recommendations with Mahout, Scala and Spark
sscdotopen
 
Machine learning in the enterprise
Machine learning in the enterpriseMachine learning in the enterprise
Machine learning in the enterprise
Jesus Rodriguez
 
The Minister's Black Veil - in class notes
The Minister's Black Veil - in class notesThe Minister's Black Veil - in class notes
The Minister's Black Veil - in class notes
lramirezcruz
 
Practical Machine Learning
Practical Machine LearningPractical Machine Learning
Practical Machine Learning
David Jones
 
PredictionIO
PredictionIOPredictionIO
PredictionIO
500 Startups
 
The Universal Recommender
The Universal RecommenderThe Universal Recommender
The Universal Recommender
Pat Ferrel
 
Introduction to Mahout and Machine Learning
Introduction to Mahout and Machine LearningIntroduction to Mahout and Machine Learning
Introduction to Mahout and Machine Learning
Varad Meru
 

Viewers also liked (10)

PredictionIO - The 1st International Conference on Predictive APIs and Apps
PredictionIO - The 1st International Conference on Predictive APIs and AppsPredictionIO - The 1st International Conference on Predictive APIs and Apps
PredictionIO - The 1st International Conference on Predictive APIs and Apps
 
Prediction io–final 2014-jp-handout
Prediction io–final 2014-jp-handoutPrediction io–final 2014-jp-handout
Prediction io–final 2014-jp-handout
 
Discovery
DiscoveryDiscovery
Discovery
 
Co-occurrence Based Recommendations with Mahout, Scala and Spark
Co-occurrence Based Recommendations with Mahout, Scala and SparkCo-occurrence Based Recommendations with Mahout, Scala and Spark
Co-occurrence Based Recommendations with Mahout, Scala and Spark
 
Machine learning in the enterprise
Machine learning in the enterpriseMachine learning in the enterprise
Machine learning in the enterprise
 
The Minister's Black Veil - in class notes
The Minister's Black Veil - in class notesThe Minister's Black Veil - in class notes
The Minister's Black Veil - in class notes
 
Practical Machine Learning
Practical Machine LearningPractical Machine Learning
Practical Machine Learning
 
PredictionIO
PredictionIOPredictionIO
PredictionIO
 
The Universal Recommender
The Universal RecommenderThe Universal Recommender
The Universal Recommender
 
Introduction to Mahout and Machine Learning
Introduction to Mahout and Machine LearningIntroduction to Mahout and Machine Learning
Introduction to Mahout and Machine Learning
 

Similar to [2C2]PredictionIO

Prediction io 架構與整合 -DataCon.TW-2017
Prediction io 架構與整合 -DataCon.TW-2017Prediction io 架構與整合 -DataCon.TW-2017
Prediction io 架構與整合 -DataCon.TW-2017
William Lee
 
Story ofcorespring infodeck
Story ofcorespring infodeckStory ofcorespring infodeck
Story ofcorespring infodeck
Makarand Bhatambarekar
 
pio_present
pio_presentpio_present
pio_present
Gladson Manuel
 
WebNet Conference 2012 - Designing complex applications using html5 and knock...
WebNet Conference 2012 - Designing complex applications using html5 and knock...WebNet Conference 2012 - Designing complex applications using html5 and knock...
WebNet Conference 2012 - Designing complex applications using html5 and knock...
Fabio Franzini
 
Yufeng Guo | Coding the 7 steps of machine learning | Codemotion Madrid 2018
Yufeng Guo |  Coding the 7 steps of machine learning | Codemotion Madrid 2018 Yufeng Guo |  Coding the 7 steps of machine learning | Codemotion Madrid 2018
Yufeng Guo | Coding the 7 steps of machine learning | Codemotion Madrid 2018
Codemotion
 
Introduction to Client Side Dev in SharePoint Workshop
Introduction to Client Side Dev in SharePoint WorkshopIntroduction to Client Side Dev in SharePoint Workshop
Introduction to Client Side Dev in SharePoint Workshop
Mark Rackley
 
Google App Engine for Java
Google App Engine for JavaGoogle App Engine for Java
Google App Engine for Java
Lars Vogel
 
Google Cloud Platform
Google Cloud Platform Google Cloud Platform
Google Cloud Platform
Francesco Marchitelli
 
gDayX 2013 - Advanced AngularJS - Nicolas Embleton
gDayX 2013 - Advanced AngularJS - Nicolas EmbletongDayX 2013 - Advanced AngularJS - Nicolas Embleton
gDayX 2013 - Advanced AngularJS - Nicolas Embleton
George Nguyen
 
Google App Engine for Java
Google App Engine for JavaGoogle App Engine for Java
Google App Engine for Java
Lars Vogel
 
App engine devfest_mexico_10
App engine devfest_mexico_10App engine devfest_mexico_10
App engine devfest_mexico_10
Chris Schalk
 
I want my model to be deployed ! (another story of MLOps)
I want my model to be deployed ! (another story of MLOps)I want my model to be deployed ! (another story of MLOps)
I want my model to be deployed ! (another story of MLOps)
AZUG FR
 
Agile Machine Learning for Real-time Recommender Systems
Agile Machine Learning for Real-time Recommender SystemsAgile Machine Learning for Real-time Recommender Systems
Agile Machine Learning for Real-time Recommender Systems
Johann Schleier-Smith
 
DEVICE CHANNELS
DEVICE CHANNELSDEVICE CHANNELS
DEVICE CHANNELS
Assaf Biton
 
Deploying Machine Learning Models to Production
Deploying Machine Learning Models to ProductionDeploying Machine Learning Models to Production
Deploying Machine Learning Models to Production
Anass Bensrhir - Senior Data Scientist
 
React django
React djangoReact django
React django
Heber Silva
 
Google I/O 2021 Recap
Google I/O 2021 RecapGoogle I/O 2021 Recap
Google I/O 2021 Recap
furusin
 
Introduction to Backbone.js & Marionette.js
Introduction to Backbone.js & Marionette.jsIntroduction to Backbone.js & Marionette.js
Introduction to Backbone.js & Marionette.js
Return on Intelligence
 
using Mithril.js + postgREST to build and consume API's
using Mithril.js + postgREST to build and consume API'susing Mithril.js + postgREST to build and consume API's
using Mithril.js + postgREST to build and consume API's
Antônio Roberto Silva
 
Presentation Tier optimizations
Presentation Tier optimizationsPresentation Tier optimizations
Presentation Tier optimizations
Anup Hariharan Nair
 

Similar to [2C2]PredictionIO (20)

Prediction io 架構與整合 -DataCon.TW-2017
Prediction io 架構與整合 -DataCon.TW-2017Prediction io 架構與整合 -DataCon.TW-2017
Prediction io 架構與整合 -DataCon.TW-2017
 
Story ofcorespring infodeck
Story ofcorespring infodeckStory ofcorespring infodeck
Story ofcorespring infodeck
 
pio_present
pio_presentpio_present
pio_present
 
WebNet Conference 2012 - Designing complex applications using html5 and knock...
WebNet Conference 2012 - Designing complex applications using html5 and knock...WebNet Conference 2012 - Designing complex applications using html5 and knock...
WebNet Conference 2012 - Designing complex applications using html5 and knock...
 
Yufeng Guo | Coding the 7 steps of machine learning | Codemotion Madrid 2018
Yufeng Guo |  Coding the 7 steps of machine learning | Codemotion Madrid 2018 Yufeng Guo |  Coding the 7 steps of machine learning | Codemotion Madrid 2018
Yufeng Guo | Coding the 7 steps of machine learning | Codemotion Madrid 2018
 
Introduction to Client Side Dev in SharePoint Workshop
Introduction to Client Side Dev in SharePoint WorkshopIntroduction to Client Side Dev in SharePoint Workshop
Introduction to Client Side Dev in SharePoint Workshop
 
Google App Engine for Java
Google App Engine for JavaGoogle App Engine for Java
Google App Engine for Java
 
Google Cloud Platform
Google Cloud Platform Google Cloud Platform
Google Cloud Platform
 
gDayX 2013 - Advanced AngularJS - Nicolas Embleton
gDayX 2013 - Advanced AngularJS - Nicolas EmbletongDayX 2013 - Advanced AngularJS - Nicolas Embleton
gDayX 2013 - Advanced AngularJS - Nicolas Embleton
 
Google App Engine for Java
Google App Engine for JavaGoogle App Engine for Java
Google App Engine for Java
 
App engine devfest_mexico_10
App engine devfest_mexico_10App engine devfest_mexico_10
App engine devfest_mexico_10
 
I want my model to be deployed ! (another story of MLOps)
I want my model to be deployed ! (another story of MLOps)I want my model to be deployed ! (another story of MLOps)
I want my model to be deployed ! (another story of MLOps)
 
Agile Machine Learning for Real-time Recommender Systems
Agile Machine Learning for Real-time Recommender SystemsAgile Machine Learning for Real-time Recommender Systems
Agile Machine Learning for Real-time Recommender Systems
 
DEVICE CHANNELS
DEVICE CHANNELSDEVICE CHANNELS
DEVICE CHANNELS
 
Deploying Machine Learning Models to Production
Deploying Machine Learning Models to ProductionDeploying Machine Learning Models to Production
Deploying Machine Learning Models to Production
 
React django
React djangoReact django
React django
 
Google I/O 2021 Recap
Google I/O 2021 RecapGoogle I/O 2021 Recap
Google I/O 2021 Recap
 
Introduction to Backbone.js & Marionette.js
Introduction to Backbone.js & Marionette.jsIntroduction to Backbone.js & Marionette.js
Introduction to Backbone.js & Marionette.js
 
using Mithril.js + postgREST to build and consume API's
using Mithril.js + postgREST to build and consume API'susing Mithril.js + postgREST to build and consume API's
using Mithril.js + postgREST to build and consume API's
 
Presentation Tier optimizations
Presentation Tier optimizationsPresentation Tier optimizations
Presentation Tier optimizations
 

More from NAVER D2

[211] 인공지능이 인공지능 챗봇을 만든다
[211] 인공지능이 인공지능 챗봇을 만든다[211] 인공지능이 인공지능 챗봇을 만든다
[211] 인공지능이 인공지능 챗봇을 만든다
NAVER D2
 
[233] 대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing: Maglev Hashing Scheduler i...
[233] 대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing: Maglev Hashing Scheduler i...[233] 대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing: Maglev Hashing Scheduler i...
[233] 대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing: Maglev Hashing Scheduler i...
NAVER D2
 
[215] Druid로 쉽고 빠르게 데이터 분석하기
[215] Druid로 쉽고 빠르게 데이터 분석하기[215] Druid로 쉽고 빠르게 데이터 분석하기
[215] Druid로 쉽고 빠르게 데이터 분석하기
NAVER D2
 
[245]Papago Internals: 모델분석과 응용기술 개발
[245]Papago Internals: 모델분석과 응용기술 개발[245]Papago Internals: 모델분석과 응용기술 개발
[245]Papago Internals: 모델분석과 응용기술 개발
NAVER D2
 
[236] 스트림 저장소 최적화 이야기: 아파치 드루이드로부터 얻은 교훈
[236] 스트림 저장소 최적화 이야기: 아파치 드루이드로부터 얻은 교훈[236] 스트림 저장소 최적화 이야기: 아파치 드루이드로부터 얻은 교훈
[236] 스트림 저장소 최적화 이야기: 아파치 드루이드로부터 얻은 교훈
NAVER D2
 
[235]Wikipedia-scale Q&A
[235]Wikipedia-scale Q&A[235]Wikipedia-scale Q&A
[235]Wikipedia-scale Q&A
NAVER D2
 
[244]로봇이 현실 세계에 대해 학습하도록 만들기
[244]로봇이 현실 세계에 대해 학습하도록 만들기[244]로봇이 현실 세계에 대해 학습하도록 만들기
[244]로봇이 현실 세계에 대해 학습하도록 만들기
NAVER D2
 
[243] Deep Learning to help student’s Deep Learning
[243] Deep Learning to help student’s Deep Learning[243] Deep Learning to help student’s Deep Learning
[243] Deep Learning to help student’s Deep Learning
NAVER D2
 
[234]Fast & Accurate Data Annotation Pipeline for AI applications
[234]Fast & Accurate Data Annotation Pipeline for AI applications[234]Fast & Accurate Data Annotation Pipeline for AI applications
[234]Fast & Accurate Data Annotation Pipeline for AI applications
NAVER D2
 
Old version: [233]대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing
Old version: [233]대형 컨테이너 클러스터에서의 고가용성 Network Load BalancingOld version: [233]대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing
Old version: [233]대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing
NAVER D2
 
[226]NAVER 광고 deep click prediction: 모델링부터 서빙까지
[226]NAVER 광고 deep click prediction: 모델링부터 서빙까지[226]NAVER 광고 deep click prediction: 모델링부터 서빙까지
[226]NAVER 광고 deep click prediction: 모델링부터 서빙까지
NAVER D2
 
[225]NSML: 머신러닝 플랫폼 서비스하기 & 모델 튜닝 자동화하기
[225]NSML: 머신러닝 플랫폼 서비스하기 & 모델 튜닝 자동화하기[225]NSML: 머신러닝 플랫폼 서비스하기 & 모델 튜닝 자동화하기
[225]NSML: 머신러닝 플랫폼 서비스하기 & 모델 튜닝 자동화하기
NAVER D2
 
[224]네이버 검색과 개인화
[224]네이버 검색과 개인화[224]네이버 검색과 개인화
[224]네이버 검색과 개인화
NAVER D2
 
[216]Search Reliability Engineering (부제: 지진에도 흔들리지 않는 네이버 검색시스템)
[216]Search Reliability Engineering (부제: 지진에도 흔들리지 않는 네이버 검색시스템)[216]Search Reliability Engineering (부제: 지진에도 흔들리지 않는 네이버 검색시스템)
[216]Search Reliability Engineering (부제: 지진에도 흔들리지 않는 네이버 검색시스템)
NAVER D2
 
[214] Ai Serving Platform: 하루 수 억 건의 인퍼런스를 처리하기 위한 고군분투기
[214] Ai Serving Platform: 하루 수 억 건의 인퍼런스를 처리하기 위한 고군분투기[214] Ai Serving Platform: 하루 수 억 건의 인퍼런스를 처리하기 위한 고군분투기
[214] Ai Serving Platform: 하루 수 억 건의 인퍼런스를 처리하기 위한 고군분투기
NAVER D2
 
[213] Fashion Visual Search
[213] Fashion Visual Search[213] Fashion Visual Search
[213] Fashion Visual Search
NAVER D2
 
[232] TensorRT를 활용한 딥러닝 Inference 최적화
[232] TensorRT를 활용한 딥러닝 Inference 최적화[232] TensorRT를 활용한 딥러닝 Inference 최적화
[232] TensorRT를 활용한 딥러닝 Inference 최적화
NAVER D2
 
[242]컴퓨터 비전을 이용한 실내 지도 자동 업데이트 방법: 딥러닝을 통한 POI 변화 탐지
[242]컴퓨터 비전을 이용한 실내 지도 자동 업데이트 방법: 딥러닝을 통한 POI 변화 탐지[242]컴퓨터 비전을 이용한 실내 지도 자동 업데이트 방법: 딥러닝을 통한 POI 변화 탐지
[242]컴퓨터 비전을 이용한 실내 지도 자동 업데이트 방법: 딥러닝을 통한 POI 변화 탐지
NAVER D2
 
[212]C3, 데이터 처리에서 서빙까지 가능한 하둡 클러스터
[212]C3, 데이터 처리에서 서빙까지 가능한 하둡 클러스터[212]C3, 데이터 처리에서 서빙까지 가능한 하둡 클러스터
[212]C3, 데이터 처리에서 서빙까지 가능한 하둡 클러스터
NAVER D2
 
[223]기계독해 QA: 검색인가, NLP인가?
[223]기계독해 QA: 검색인가, NLP인가?[223]기계독해 QA: 검색인가, NLP인가?
[223]기계독해 QA: 검색인가, NLP인가?
NAVER D2
 

More from NAVER D2 (20)

[211] 인공지능이 인공지능 챗봇을 만든다
[211] 인공지능이 인공지능 챗봇을 만든다[211] 인공지능이 인공지능 챗봇을 만든다
[211] 인공지능이 인공지능 챗봇을 만든다
 
[233] 대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing: Maglev Hashing Scheduler i...
[233] 대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing: Maglev Hashing Scheduler i...[233] 대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing: Maglev Hashing Scheduler i...
[233] 대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing: Maglev Hashing Scheduler i...
 
[215] Druid로 쉽고 빠르게 데이터 분석하기
[215] Druid로 쉽고 빠르게 데이터 분석하기[215] Druid로 쉽고 빠르게 데이터 분석하기
[215] Druid로 쉽고 빠르게 데이터 분석하기
 
[245]Papago Internals: 모델분석과 응용기술 개발
[245]Papago Internals: 모델분석과 응용기술 개발[245]Papago Internals: 모델분석과 응용기술 개발
[245]Papago Internals: 모델분석과 응용기술 개발
 
[236] 스트림 저장소 최적화 이야기: 아파치 드루이드로부터 얻은 교훈
[236] 스트림 저장소 최적화 이야기: 아파치 드루이드로부터 얻은 교훈[236] 스트림 저장소 최적화 이야기: 아파치 드루이드로부터 얻은 교훈
[236] 스트림 저장소 최적화 이야기: 아파치 드루이드로부터 얻은 교훈
 
[235]Wikipedia-scale Q&A
[235]Wikipedia-scale Q&A[235]Wikipedia-scale Q&A
[235]Wikipedia-scale Q&A
 
[244]로봇이 현실 세계에 대해 학습하도록 만들기
[244]로봇이 현실 세계에 대해 학습하도록 만들기[244]로봇이 현실 세계에 대해 학습하도록 만들기
[244]로봇이 현실 세계에 대해 학습하도록 만들기
 
[243] Deep Learning to help student’s Deep Learning
[243] Deep Learning to help student’s Deep Learning[243] Deep Learning to help student’s Deep Learning
[243] Deep Learning to help student’s Deep Learning
 
[234]Fast & Accurate Data Annotation Pipeline for AI applications
[234]Fast & Accurate Data Annotation Pipeline for AI applications[234]Fast & Accurate Data Annotation Pipeline for AI applications
[234]Fast & Accurate Data Annotation Pipeline for AI applications
 
Old version: [233]대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing
Old version: [233]대형 컨테이너 클러스터에서의 고가용성 Network Load BalancingOld version: [233]대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing
Old version: [233]대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing
 
[226]NAVER 광고 deep click prediction: 모델링부터 서빙까지
[226]NAVER 광고 deep click prediction: 모델링부터 서빙까지[226]NAVER 광고 deep click prediction: 모델링부터 서빙까지
[226]NAVER 광고 deep click prediction: 모델링부터 서빙까지
 
[225]NSML: 머신러닝 플랫폼 서비스하기 & 모델 튜닝 자동화하기
[225]NSML: 머신러닝 플랫폼 서비스하기 & 모델 튜닝 자동화하기[225]NSML: 머신러닝 플랫폼 서비스하기 & 모델 튜닝 자동화하기
[225]NSML: 머신러닝 플랫폼 서비스하기 & 모델 튜닝 자동화하기
 
[224]네이버 검색과 개인화
[224]네이버 검색과 개인화[224]네이버 검색과 개인화
[224]네이버 검색과 개인화
 
[216]Search Reliability Engineering (부제: 지진에도 흔들리지 않는 네이버 검색시스템)
[216]Search Reliability Engineering (부제: 지진에도 흔들리지 않는 네이버 검색시스템)[216]Search Reliability Engineering (부제: 지진에도 흔들리지 않는 네이버 검색시스템)
[216]Search Reliability Engineering (부제: 지진에도 흔들리지 않는 네이버 검색시스템)
 
[214] Ai Serving Platform: 하루 수 억 건의 인퍼런스를 처리하기 위한 고군분투기
[214] Ai Serving Platform: 하루 수 억 건의 인퍼런스를 처리하기 위한 고군분투기[214] Ai Serving Platform: 하루 수 억 건의 인퍼런스를 처리하기 위한 고군분투기
[214] Ai Serving Platform: 하루 수 억 건의 인퍼런스를 처리하기 위한 고군분투기
 
[213] Fashion Visual Search
[213] Fashion Visual Search[213] Fashion Visual Search
[213] Fashion Visual Search
 
[232] TensorRT를 활용한 딥러닝 Inference 최적화
[232] TensorRT를 활용한 딥러닝 Inference 최적화[232] TensorRT를 활용한 딥러닝 Inference 최적화
[232] TensorRT를 활용한 딥러닝 Inference 최적화
 
[242]컴퓨터 비전을 이용한 실내 지도 자동 업데이트 방법: 딥러닝을 통한 POI 변화 탐지
[242]컴퓨터 비전을 이용한 실내 지도 자동 업데이트 방법: 딥러닝을 통한 POI 변화 탐지[242]컴퓨터 비전을 이용한 실내 지도 자동 업데이트 방법: 딥러닝을 통한 POI 변화 탐지
[242]컴퓨터 비전을 이용한 실내 지도 자동 업데이트 방법: 딥러닝을 통한 POI 변화 탐지
 
[212]C3, 데이터 처리에서 서빙까지 가능한 하둡 클러스터
[212]C3, 데이터 처리에서 서빙까지 가능한 하둡 클러스터[212]C3, 데이터 처리에서 서빙까지 가능한 하둡 클러스터
[212]C3, 데이터 처리에서 서빙까지 가능한 하둡 클러스터
 
[223]기계독해 QA: 검색인가, NLP인가?
[223]기계독해 QA: 검색인가, NLP인가?[223]기계독해 QA: 검색인가, NLP인가?
[223]기계독해 QA: 검색인가, NLP인가?
 

Recently uploaded

PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
Daiki Mogmet Ito
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...
Zilliz
 
Data structures and Algorithms in Python.pdf
Data structures and Algorithms in Python.pdfData structures and Algorithms in Python.pdf
Data structures and Algorithms in Python.pdf
TIPNGVN2
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
Claudio Di Ciccio
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 

Recently uploaded (20)

PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...
 
Data structures and Algorithms in Python.pdf
Data structures and Algorithms in Python.pdfData structures and Algorithms in Python.pdf
Data structures and Algorithms in Python.pdf
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 

[2C2]PredictionIO

  • 1. An Open Source Machine Learning Server for Developers @PredictionIO #PredictionIO Simon Chan simon@prediction.io
  • 2. Thank you for having me here today! • Simon Chan - CEO of PredictionIO • A small team of Data Scientists and Engineers • Mainly based in Silicon Valley, also London and Hong Kong
  • 3. Top Github Open Source • Over 5000 developers engaged • Powering over 200 applications
  • 4. Talk Focus: • Machine Learning - A (Very) Brief Review • Challenges We Face When Building PredictionIO
  • 6. I am going to give an example that will make you… HUNGRY!
  • 7. F FOOD Club – Menu FOOD CLUB
  • 8.
  • 9. Coding time…. # Using PredictionIO # Collect Data cli = predictionio.EventClient("<my_app_id>") cli.record_user_action_on_item("buy", "John", “BulgogiA") # Predict top preferences eng = predictionio.EngineClient("<my_engine_url>") rec = eng.send_query({"uid" : "John", "n" : 5})
  • 10. The Magic Behind: Engine 1. Data Sourcing and Preparation 2. Algorithm 3. Serving 4. Evaluation
  • 11.
  • 12.
  • 14. Architectural Challenge 1 Workflow Co-ordination on a Distributed Cluster
  • 15. Needs: •Support multiple distributed engines •Support multiple algorithms to execute in parallel How to coordinate the workflow when you have more pending tasks than processing units?
  • 16. Attempt #1 Use a database system to store tasks, and have a pool of workers pull tasks from it. •Inefficient. Database becomes bottleneck and potentially single point of failure.
  • 17. Attempt #2 Use an Akka cluster. Akka is a toolkit and runtime for building highly concurrent, distributed, and fault tolerant event-driven applications on the JVM. •Fundamentally the same problem with the above. •Need to build management suite on top.
  • 18. Solution Apache Spark: directed acyclic graph (DAG) scheduling Adapts to many different infrastructure: Apache Spark standalone cluster, Apache Hadoop 2 YARN, Apache Mesos. Source: http://upload.wikimedia.org/wikipedia/commons/3/39/Directed_acyclic_graph_3.svg
  • 19. Solution Source Code: http://github.com/predictionio
  • 20. Architectural Challenge 2 Distributed In-memory Model Retrieval
  • 21. Needs: •Engines produce models that are distributed across a cluster. Requires a way to serve these distributed in-memory models to queries in real-time.
  • 22. Solution All PredictionIO engine instances are launched inside a “SparkContext”. A SparkContext represents the connection to a Spark cluster, and can be used to create RDDs, accumulators and broadcast variables on that cluster. Source: http://bighadoop.files.wordpress.com/2014/04/spark-architecture.png
  • 23. •When an engine is local to a single machine, it loads the model to its memory. •When an engine is distributed, SparkContext will automatically load the model on a cluster.
  • 24. Conceptual Code for the Solution val sc = SparkContext(conf) ... val model = if (model_is_distributed) { if (model_is_persisted) { sc.objectFile(model_on_HDFS) } else { engine.algo.train() } } else { ... } }
  • 26. Built-in Engines: •Item Recommendation •Item Rank •Item Similarity
  • 27. Create an Engine Instance Project…. $ pio instance io.prediction.engines.itemrec $ cd io.prediction.engines.itemrec $ pio register
  • 28. Collect Event Data…. cli = predictionio.EventClient("<app_id>") cli.record_user_action_on_item("like", "John", “bulgogi_12”) cli.record_user_action_on_item("view", "John", “bimbimbap_13”)
  • 29. Configurate the Engine Instance settings in params/datasource.json { "appId": <app_id>, "actions": [ "view", "like", ... ], ... }
  • 30. Train the Data Model $ pio train Deploy the Engine Instance $ pio deploy
  • 31. Retrieve Prediction Results from predictionio import EngineClient client = EngineClient(url="http://localhost:8000") prediction = client.send_query({"uid": "John", "n": 3}) print prediction Output {u'items': [{u'272': 9.929327011108398}, {u'313': 9.92607593536377}, {u’347': 9.92170524597168}]}
  • 32. You can also…. • Change algorithm • Tune algorithm parameter • Compare and evaluate algorithm • Add custom business logics
  • 33. SDKs for: • Python • Ruby • PHP • Java / Andriod • Scala • Node.js • iOS • Meteor • more….
  • 34. Also, build your own Engine!
  • 35. Applications of Machine Learning Speech Recognition Personal Newsfeed SPAM Filtering Recommendation Driverless Car Churn Prediction Ad Targeting Fraud Detection {
  • 36. 감사합니다 Korean Documentation (Beta)! http://docs.prediction.io/kr - @PredictionIO - prediction.io - Newsletters - github.com/predictionio