Fikrimuhal TRHUG 2016 Machine Learning

Sukru Hasdemir

Predictive Analytics is the next step after batch and stream processing.

Large-Scale Machine Learning
Şükrü Hasdemir, Fikrimuhal
TRHUG 2016

Contents
Batch, Stream, Predictive Analytics
Machine Learning
Apache Spark
ML Example: Recommender Systems
Machine Learning Performance

Machine Learning
• Batch processing shows what happened in the past
• Stream processing shows what’s happening now
• Machine Learning predicts the future

Apache Spark
A fast, general-purpose framework for Big Data analytics
The most active open source Big Data project
Faster than Hadoop MapReduce thanks to in-memory computation
Can be used from Java, Scala, Python, and R, through an interactive REPL or notebooks

Apache Spark
Spark includes rich libraries for a variety of purposes.

Apache Spark
Compatible with the open source Big Data ecosystem
Hadoop YARN
Mesos
“Standalone”
Cloud:
AWS EMR
Azure HDInsight
Google Cloud Dataproc

Personalized Recommendation Systems
Takes personal preferences into account instead of offering the most popular items to all users.
Applications: e-commerce, video, music, news…
Increases customer engagement and revenue
Amazon attributes 25% of its revenue to its recommendation system
Netflix Prize: $1M for a 10% improvement in recommender performance
Requires collection and analysis of user-item interaction data, using Machine Learning and business rules.

Recommendation Algorithms
Content-Based Filtering
Uses product features
Collaborative Filtering
Uses actions of other users
Explicit/implicit feedback
Neighborhood models
User/item based
Latent Factor Models
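
As an illustration of the item-based neighborhood model listed above, here is a minimal sketch in plain Python. The user names, item labels, and ratings are made up for the example, and the scoring rule (a cosine-similarity-weighted average of the user's own ratings) is one common choice, not necessarily the one used in the talk.

```python
from math import sqrt

# Hypothetical explicit ratings: user -> {item: rating}. Illustrative data only.
ratings = {
    "ana": {"A": 5, "B": 4, "C": 1},
    "bora": {"A": 4, "B": 5, "D": 2},
    "cem": {"C": 5, "D": 4},
    "deniz": {"A": 5, "C": 1},
}

def item_vectors(ratings):
    """Invert user -> item ratings into item -> user ratings."""
    items = {}
    for user, user_ratings in ratings.items():
        for item, r in user_ratings.items():
            items.setdefault(item, {})[user] = r
    return items

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    common = set(u) & set(v)
    dot = sum(u[k] * v[k] for k in common)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def recommend(user, ratings, top_n=2):
    """Rank unseen items by the similarity-weighted average of the user's ratings."""
    items = item_vectors(ratings)
    seen = ratings[user]
    scores = {}
    for candidate in items:
        if candidate in seen:
            continue
        num = den = 0.0
        for item, r in seen.items():
            sim = cosine(items[candidate], items[item])
            num += sim * r
            den += abs(sim)
        if den > 0:
            scores[candidate] = num / den
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(recommend("deniz", ratings))
```

Here "deniz" rated A highly, and B is rated similarly to A by other users, so B ranks ahead of D. A user-based variant would compare rows (users) instead of columns (items).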

Matrix Factorization Model
Source: https://databricks-training.s3.amazonaws.com/img/matrix_factorization.png
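
The matrix factorization idea can be sketched as follows: each user and each item gets a short vector of latent factors, and a rating is approximated by their dot product. Spark MLlib ships an ALS-based implementation; to stay dependency-free, this sketch swaps in plain stochastic gradient descent instead. The ratings, the rank, and the hyperparameters (`lr`, `reg`, `epochs`) are assumptions chosen for illustration.

```python
import random
from math import sqrt

# Hypothetical (user, item, rating) triples; illustrative data only.
RATINGS = [
    (0, 0, 5), (0, 1, 4), (0, 2, 1),
    (1, 0, 4), (1, 1, 5), (1, 3, 2),
    (2, 2, 5), (2, 3, 4), (2, 0, 1),
    (3, 2, 4), (3, 3, 5), (3, 1, 1),
]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rmse(ratings, P, Q):
    """Root mean squared error of the predictions P[u] . Q[i] on known ratings."""
    return sqrt(sum((r - dot(P[u], Q[i])) ** 2 for u, i, r in ratings) / len(ratings))

def init_factors(n_rows, rank, rng):
    """Small random latent factors, one row per user or item."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(rank)] for _ in range(n_rows)]

def train(ratings, P, Q, epochs=300, lr=0.02, reg=0.05, seed=0):
    """Fit user factors P and item factors Q in place with regularized SGD."""
    rng = random.Random(seed)
    samples = list(ratings)
    rank = len(P[0])
    for _ in range(epochs):
        rng.shuffle(samples)
        for u, i, r in samples:
            err = r - dot(P[u], Q[i])
            for k in range(rank):
                pu, qi = P[u][k], Q[i][k]  # use the old values for both updates
                P[u][k] += lr * (err * qi - reg * pu)
                Q[i][k] += lr * (err * pu - reg * qi)
    return P, Q

rng = random.Random(42)
P = init_factors(4, 2, rng)  # 4 users, rank 2
Q = init_factors(4, 2, rng)  # 4 items, rank 2
before = rmse(RATINGS, P, Q)
train(RATINGS, P, Q)
after = rmse(RATINGS, P, Q)
print(f"training RMSE before: {before:.3f}, after: {after:.3f}")
print(f"predicted rating for user 0, item 3: {dot(P[0], Q[3]):.2f}")
```

The last line predicts a rating for a user-item pair that is missing from the data, which is the step a recommender uses to rank unseen items.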

Real World: Performance
Cross-Validation, hyperparameter optimization
Better metrics: Ranking performance metrics
MAP, NDCG, precisionAt(k), …
K. Järvelin and J. Kekäläinen, "IR evaluation methods for retrieving highly relevant documents"
Online tests
Ensemble (hybrid) models
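
The ranking metrics named above can be sketched in a few lines of plain Python. Conventions vary between libraries; this sketch assumes average precision is normalized by the number of relevant items and that NDCG uses a log2 discount with raw gains. The function names and sample data are illustrative.

```python
from math import log2

def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    return sum(1 for item in recommended[:k] if item in relevant) / k

def average_precision(recommended, relevant):
    """Mean of precision values at each rank where a relevant item appears."""
    hits = 0
    total = 0.0
    for rank, item in enumerate(recommended, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(recommended_by_user, relevant_by_user):
    """MAP: average precision averaged over users."""
    users = list(recommended_by_user)
    return sum(
        average_precision(recommended_by_user[u], relevant_by_user[u]) for u in users
    ) / len(users)

def ndcg_at_k(recommended, gains, k):
    """Discounted cumulative gain of the top k, divided by the ideal ordering's DCG."""
    dcg = sum(
        gains.get(item, 0) / log2(rank + 1)
        for rank, item in enumerate(recommended[:k], start=1)
    )
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / log2(rank + 1) for rank, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg else 0.0

recommended = ["a", "b", "c", "d"]
relevant = {"a", "c"}
gains = {"a": 3, "c": 1}  # graded relevance, hypothetical values

print(precision_at_k(recommended, relevant, 2))
print(average_precision(recommended, relevant))
print(ndcg_at_k(recommended, gains, 4))
```

Unlike RMSE on predicted ratings, these metrics reward putting relevant items near the top of the list, which is closer to what a user actually sees.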
