Apache Spark:
The Analytics Operating System
Anjul Bhambhri
Vice President, IBM Big Data Engineering
IBM: 100 years of (supporting) innovation
Deep Blue, SQL, RISC, DNA Transistor, Magnetic Tape, Linux, PC, Fortran, DRAM, Mainframe, Watson, Floppy Disk, UPC, Punch Card
Apache Spark: The Analytics Operating System
At IBM, We Love Spark!
Enhance it! Spark Technology Center @ SF
Offer it! On-prem and on the cloud
Leverage it! Inside our products
IBM Cloud Data Services, now featuring Spark, is open for data
IBM is Building on Apache Spark
• IBM Analytics
• IBM Commerce
• IBM Watson
• IBM Research
• IBM Cloud
Quarks from IBM
Announced Feb 2016
• Open-source platform for
building IoT applications
• Light-weight & embeddable
• Integrates with Spark
• Lambda Architecture and Spark enable efficient batch and streaming analytics
• Visualization at every step of data discovery enables better self service
The Weather Company clusters running hot:
• ~30 billion API requests per day
• ~120 million active mobile users
• #3 most active mobile user base
• Billions of events per day (1.3M/sec)
• ~360 PB of traffic daily
• Need to keep data forever
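As a rough sanity check, the per-second and per-day figures above can be converted into each other. The sketch below is plain Python and takes the slide's numbers at face value; it assumes the rates are sustained averages and that 1 PB is 10^15 bytes, neither of which the slide states.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_day(rate_per_sec: float) -> float:
    """Convert a sustained per-second rate into a daily total."""
    return rate_per_sec * SECONDS_PER_DAY


def per_sec(total_per_day: float) -> float:
    """Convert a daily total into an average per-second rate."""
    return total_per_day / SECONDS_PER_DAY


events_per_day = per_day(1.3e6)              # from "1.3M/sec"
api_requests_per_sec = per_sec(30e9)         # from "~30 billion API requests per day"
traffic_tb_per_sec = per_sec(360e15) / 1e12  # from "~360 PB of traffic daily", assuming decimal PB

print(f"{events_per_day / 1e9:.1f} billion events/day")   # 112.3 billion events/day
print(f"{api_requests_per_sec:,.0f} API requests/sec")    # 347,222 API requests/sec
print(f"{traffic_tb_per_sec:.2f} TB/sec")                 # 4.17 TB/sec
```

Under these assumptions, 1.3M events per second works out to roughly 112 billion events per day, which is consistent with the slide's "billions of events per day".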
The use case:
Efficient batch + streaming analysis
Self-serve data science
BI / visualization tool support
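The batch + streaming combination described here follows the Lambda Architecture pattern: a batch layer periodically recomputes a view from the full history, a speed layer keeps an incremental view of recent events, and queries merge the two. The sketch below is a plain-Python illustration of that idea only, not Spark code and not The Weather Company's implementation; the class name `LambdaCounter` and the station codes are hypothetical.

```python
from collections import Counter


class LambdaCounter:
    """Toy Lambda Architecture: counts events per key using two views."""

    def __init__(self):
        self.master_log = []         # append-only record of every event (kept "forever")
        self.batch_view = Counter()  # recomputed from the full log on each batch run
        self.speed_view = Counter()  # incremental counts since the last batch run

    def ingest(self, key):
        """Streaming path: record the event and update the real-time view."""
        self.master_log.append(key)
        self.speed_view[key] += 1

    def run_batch(self):
        """Batch path: rebuild the view from all history, then reset the speed view."""
        self.batch_view = Counter(self.master_log)
        self.speed_view.clear()

    def query(self, key):
        """Serving path: merge the batch and speed views."""
        return self.batch_view[key] + self.speed_view[key]


views = LambdaCounter()
for station in ["SFO", "JFK", "SFO"]:
    views.ingest(station)
views.run_batch()      # batch view now covers the first three events
views.ingest("SFO")    # arrives after the batch run, so only the speed view sees it
print(views.query("SFO"))  # 3
```

In a Spark deployment the batch recomputation and the incremental update would typically be a Spark batch job and a Spark Streaming job respectively; the merge-at-query-time step is the part this sketch is meant to show.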
An IBM Business
Spark for daily weather
Spark in Health Care
Health Care Data Lakes
• Improve how healthcare is delivered
• Collect and combine data from dozens of sources
• Clinical, Operational, Financial
• Inside and outside your enterprise
Benefits
• Better medical outcomes for patients
• Control cost and improve quality
SystemML on Spark
• Predictive Risk Modeling
• Right patient intervention relating to adverse health events
Spark in Telecom
The challenge:
• Improve customer satisfaction rates
• Multiple channels for customer interactions
• Very large data volumes
The need:
• Create a 360 degree view of a customer
• Stitch all interactions across channels into a “Customer Experience Journey”
• Classify interaction sentiment and take necessary actions
• Spark Streaming brings all the data together
• Spark Core is used to process and transform text and voice data
• Spark MLlib algorithms stitch interactions on a journey and score “sentiment”
• Spark SQL drives interactive queries via visual dashboards
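The "stitch and score" step can be pictured as grouping interactions by customer, ordering them in time across channels, and attaching a sentiment score to each journey. The following is a minimal plain-Python sketch of that idea, not the Spark MLlib pipeline the slide refers to; the `Interaction` record, the word lists, and the scoring rule are all hypothetical stand-ins for a trained model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    customer_id: str
    channel: str     # e.g. "voice", "chat", "email"
    timestamp: int   # seconds since an arbitrary epoch
    text: str        # message body or call transcript


# Hypothetical toy lexicon; a real system would use a trained classifier.
POSITIVE = {"thanks", "great", "resolved"}
NEGATIVE = {"dropped", "angry", "cancel"}


def sentiment(text: str) -> int:
    """Score text as (positive word count) minus (negative word count)."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def stitch_journeys(interactions):
    """Group interactions per customer and order each group by time."""
    journeys = defaultdict(list)
    for item in interactions:
        journeys[item.customer_id].append(item)
    for steps in journeys.values():
        steps.sort(key=lambda step: step.timestamp)
    return dict(journeys)


def journey_score(steps) -> int:
    """Aggregate sentiment over every interaction in one journey."""
    return sum(sentiment(step.text) for step in steps)


events = [
    Interaction("c1", "chat", 200, "call dropped again, I may cancel"),
    Interaction("c1", "voice", 100, "my call dropped"),
    Interaction("c2", "email", 150, "issue resolved, thanks"),
]
journeys = stitch_journeys(events)
for customer, steps in journeys.items():
    print(customer, [s.channel for s in steps], journey_score(steps))
```

A strongly negative journey score, such as the one for customer `c1` here, is the kind of signal that would trigger the "necessary actions" mentioned above.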
[Architecture diagram. Labels: PUB / SUB (MQTT / WebSockets / Flume / Kafka); Voice & Text Data; Interaction & Journey Data; Journey Dashboards]
Apache Spark:
The Analytics Operating System
THANK YOU!

Keynote at Spark Summit East, Anjul Bhambhri
