Lambda-less Stream Processing @Scale in LinkedIn

DataWorks Summit/Hadoop Summit
Yi Pan (Apache Samza PMC/Committer)
Kartik Paramasivam (Mgr - Streams Infra)
June, 2016
Agenda
• Rise of Stream Processing Applications
• Some Hard Problems in Stream Processing
–Data Accuracy
–Reprocessing
• Conclusion
Newsfeed
Cyber-security
Internet of Things
Data Accuracy
• Can Stream Processing generate accurate results?
– Yes, but it is not trivial.
Case Study
[Diagram: Ads rendered in an HTML page at 1:00pm emit AdViewEvents to the AdQuality processor; at 1:01pm AdClickEvents are sent to the same processor.]
Did AdClick happen within 2min of AdView?
– Yes: Good Ad
– No: Bad Ad
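The rule on this slide can be sketched in a few lines of Python. This is only an illustration of the decision itself, assuming both timestamps are already known; the names `classify_ad` and `CLICK_WINDOW` are hypothetical and do not come from the talk.

```python
from datetime import datetime, timedelta

# Threshold from the slide: a click counts only if it happens
# within 2 minutes of the view. The constant name is hypothetical.
CLICK_WINDOW = timedelta(minutes=2)


def classify_ad(view_time, click_time):
    """Return "good" if the click happened within CLICK_WINDOW after the view."""
    if click_time is None:
        return "bad"  # no click observed at all
    delay = click_time - view_time
    if timedelta(0) <= delay <= CLICK_WINDOW:
        return "good"
    return "bad"


view = datetime(2016, 6, 1, 13, 0)
click = datetime(2016, 6, 1, 13, 1)
print(classify_ad(view, click))  # good
```

The hard part, covered next, is that in a real stream the two events may not be available at the same time or in this order.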
Delays in Event Stream
[Diagram: DATACENTER 1 and DATACENTER 2 each contain an LB, a Services Tier, Kafka, and the Ad Quality Processor (Samza); Kafka is mirrored between the datacenters. An AdViewEvent enters through the LB.]
Delays in Event Stream: Late Arrival
[Diagram: same two-datacenter layout with Real Time Processing (Samza) and mirrored Kafka. An AdClick Event enters through the LB.]
Delays in Event Stream: Out of Order Arrival
[Diagram: same two-datacenter layout with Real Time Processing (Samza) and mirrored Kafka. An AdClick Event enters through the LB.]
Lambda at LinkedIn
[Diagram: three stages. Ingestion: Clients (browser, devices, ..), Services Tier, Kafka. Processing: Real Time Processing (Samza) and Batch Processing (Hadoop/Spark). Serving: Espresso and Voldemort R/O (bulk upload).]
Data Accuracy - with Lambda
• Basic assumption: batch jobs have the full dataset
• But, how about edges?
Smaller batch size == more edges!
[Graph: system time axis from 10:00 to 13:00.]
Graph kudos to Stream Processing 101 from Tyler Akidau
(https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101)
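To make the "more edges" point concrete, here is a small Python sketch. It assumes a made-up workload (one view per minute, each followed by a click one minute later) and counts how many view/click pairs are split across batch boundaries; the function name `straddling_pairs` is hypothetical.

```python
def straddling_pairs(pairs, batch_minutes):
    """Count (view_minute, click_minute) pairs that land in different batches."""
    return sum(
        1 for view, click in pairs
        if view // batch_minutes != click // batch_minutes
    )


# Synthetic data: a view at every minute for 3 hours, clicked 1 minute later.
pairs = [(m, m + 1) for m in range(180)]

print(straddling_pairs(pairs, 60))  # 3
print(straddling_pairs(pairs, 10))  # 18
```

Each split pair is one that a batch job cannot join correctly unless it also looks into the neighbouring batch.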
Fixing Lambda
[Diagram: the Lambda pipeline (Ingestion: Clients (browser, devices, ….), Services Tier, Kafka; Processing: Real Time Processing (Samza) and Batch Processing (Hadoop/Spark); Serving: Espresso and Voldemort R/O via bulk upload) with an added Kafka Audit Check that provides a Safe Start Time.]
Observation
• Data accuracy is still very hard with Lambda
– An additional system (e.g. Kafka Audit) has to be used to safely start the batch jobs
• Duplication in online/offline systems:
– Development cost
– Operational overhead
– Maintenance overhead
Going Lambda-less
• Handle late arrivals and out of order arrivals
• Eventually correct results
– Compute results at end of ‘window’.
– Re-compute when events arrive late
• Influenced by “Google MillWheel”
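A minimal Python sketch of "eventually correct" windowing, under stated assumptions: event times are plain seconds, windows are 2-minute tumbling windows, and some external timer calls `on_window_end`. The class and method names are hypothetical and this is not Samza's window operator.

```python
from collections import defaultdict

WINDOW_SECONDS = 120  # 2-minute tumbling windows, as in the slides


class EventuallyCorrectCounter:
    """Counts events per event-time window and re-emits on late arrivals."""

    def __init__(self):
        self.counts = defaultdict(int)  # window start -> event count
        self.closed = set()             # windows whose first result was emitted
        self.output = []                # (window_start, count, kind)

    def on_event(self, event_time):
        start = event_time - event_time % WINDOW_SECONDS
        self.counts[start] += 1
        if start in self.closed:
            # Late arrival: the window already fired, so emit a correction.
            self.output.append((start, self.counts[start], "recomputed"))

    def on_window_end(self, start):
        # Assumed to be called by a timer at start + WINDOW_SECONDS.
        self.closed.add(start)
        self.output.append((start, self.counts[start], "initial"))


counter = EventuallyCorrectCounter()
counter.on_event(0)
counter.on_event(60)
counter.on_window_end(0)  # first result for the window starting at 0
counter.on_event(90)      # late event for the same window
print(counter.output)     # [(0, 2, 'initial'), (0, 3, 'recomputed')]
```

Downstream consumers have to accept that a later result for the same window replaces the earlier one.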
Going Lambda-less
[Diagram: AdViewEvent and AdClickEvent streams flow from Kafka through the AdQuality processor and back to Kafka; events stamped 1:00pm to 1:02pm fall in window(“1:00pm”, “2min”), which spans 1:00pm to 1:02pm.]
Window output is computed at the end of the window (2min after the window is created).
Handling ‘late arrival’
[Diagram: AdViewEvent and AdClickEvent streams flow from Kafka through the AdQuality processor and back to Kafka; a 1:01pm event arrives late for window(“1:00pm”, “2min”).]
Late arrival: re-compute window(“1:00pm”, “2min”).
Handling ‘out of order arrival’
[Diagram: AdViewEvent and AdClickEvent streams flow from Kafka through the AdQuality processor and back to Kafka; the 1:01pm and 1:02pm events arrive first.]
First: null join result in window(“1:00pm”, “2min”).
Then: the 1:00pm and 1:01pm events arrive out of order; re-compute window(“1:00pm”, “2min”).
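The null-then-recompute behaviour can be sketched as a two-sided join that stores whichever event arrives first. This is an illustrative Python sketch only: the join key `ad_id`, the class `AdQualityJoin`, and the use of plain dicts as the store are assumptions, not the talk's implementation.

```python
JOIN_WINDOW = 120  # seconds; a click must follow its view within 2 minutes


class AdQualityJoin:
    """Joins views and clicks by a (hypothetical) ad id, in either arrival order."""

    def __init__(self):
        self.views = {}   # ad_id -> view timestamp
        self.clicks = {}  # ad_id -> click timestamp

    def _result(self, ad_id):
        view = self.views.get(ad_id)
        click = self.clicks.get(ad_id)
        if view is None or click is None:
            return None  # null join result: the other side has not arrived yet
        return "good" if 0 <= click - view <= JOIN_WINDOW else "bad"

    def on_view(self, ad_id, ts):
        self.views[ad_id] = ts
        return self._result(ad_id)

    def on_click(self, ad_id, ts):
        self.clicks[ad_id] = ts
        return self._result(ad_id)


join = AdQualityJoin()
print(join.on_click("ad-1", 60))  # None (click arrived before its view)
print(join.on_view("ad-1", 0))    # good (re-computed once the view arrives)
```

A real job would also expire stored events once they are too old to matter; that is left out here.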
Samza based Solution
[Diagram: a Samza job with SamzaContainer-0 and SamzaContainer-1 running Task1, Task2, and Task3; AdView and AdClicks are read from Kafka, and the job has a changelog in Kafka.]
Events are saved into a RocksDB-based local message store, which is backed up durably in Kafka.
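The idea of a local store backed by a changelog can be sketched as follows. This is a toy model, not Samza code: a Python dict stands in for RocksDB, a list stands in for the Kafka changelog topic, and the class name `ChangelogBackedStore` is hypothetical.

```python
class ChangelogBackedStore:
    """In-memory stand-in for a local store whose writes go to a changelog."""

    def __init__(self, changelog):
        self.changelog = changelog  # a list standing in for a Kafka topic
        self.data = {}              # a dict standing in for RocksDB

    def put(self, key, value):
        self.changelog.append((key, value))  # record the write durably first
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)

    @classmethod
    def restore(cls, changelog):
        """Rebuild local state by replaying the changelog from the start."""
        store = cls(changelog)
        for key, value in changelog:
            store.data[key] = value  # the last write per key wins
        return store


log = []
store = ChangelogBackedStore(log)
store.put("ad-1", 0)
store.put("ad-1", 5)

# Simulate losing the container: rebuild the store from the changelog alone.
recovered = ChangelogBackedStore.restore(log)
print(recovered.get("ad-1"))  # 5
```

Reads stay local and fast, while the changelog is only needed when a task has to be restored elsewhere.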
Performance
[Diagram: the same Samza job layout: SamzaContainer-0 and SamzaContainer-1 with Task1, Task2, and Task3, AdView and AdClicks from Kafka, and a changelog in Kafka.]
Performance of Samza’s local RocksDB store:
- 1.1 million TPS (read/write) on a single machine (SSD)
- Largest production job has 1.5 terabytes of local state
Reprocessing
• What is reprocessing?
– Process events that happened in the past.
Case Study: Title Standardization
[Diagram: a member changes ‘Title’ on their LinkedIn Profile (Before: Architect; After: Chief Technology Nerd); the Title Standardizer serves Search, Ads, ….]
Title Standardizer - Implementation
[Diagram: Member Database (Espresso) -> Databus -> Profile Updates -> (Samza) Title-Standardizer, which uses a Machine Learning model -> Kafka -> output.]
Reprocessing - dealing with bugs
[Diagram: Member Database (Espresso) -> Databus -> Profile Updates -> (Samza) Title-Standardizer, which uses a Machine Learning model -> Kafka -> output; the job's input is rewound 4 hours.]
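Rewinding by a fixed duration amounts to finding the first log position at or after a target time. A minimal Python sketch, assuming a single log whose entries carry non-decreasing timestamps in seconds; the function name `offset_for_rewind` is hypothetical, and a real deployment would rely on whatever timestamp-to-offset lookup its log provides.

```python
from bisect import bisect_left


def offset_for_rewind(timestamps, now, rewind_seconds):
    """Return the first offset whose timestamp is >= now - rewind_seconds.

    `timestamps` holds one non-decreasing timestamp per log offset.
    """
    return bisect_left(timestamps, now - rewind_seconds)


# One entry per hour for six hours; "now" is the newest entry.
timestamps = [0, 3600, 7200, 10800, 14400, 18000]
print(offset_for_rewind(timestamps, now=18000, rewind_seconds=4 * 3600))  # 1
```

The job then resumes from that offset and re-emits results for everything that followed.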
Reprocessing - entire Dataset
[Diagram 1: Member Database (Espresso) -> Databus -> Profile Updates -> (Samza) Title-Standardizer -> Kafka -> output, now with a Machine Learning model (NEW); the job bootstraps from a Database Backup (NFS) with set offset=0.]
[Diagram 2: two (Samza) Title-Standardizer jobs use the Machine Learning model (NEW). One reads Profile Updates from the Member Database (Espresso) through Databus; the other reads the Bootstrap Backup (Database Backup on NFS, set offset=0) through Databus. Both write output to Kafka.]
[Diagram 3: the same two (Samza) Title-Standardizer jobs, followed by a (Samza) Merge and Store Results job after Kafka.]
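The slides do not say how "Merge and Store Results" decides between an online result and a reprocessed one. One plausible rule, shown here purely as an assumption, is to keep the result derived from the newest profile version for each member. The function `merge_results` and the `(version, title)` layout are hypothetical.

```python
def merge_results(stored, incoming):
    """Keep, per member, the result derived from the newest profile version.

    Both arguments map member_id -> (profile_version, standardized_title).
    """
    merged = dict(stored)
    for member_id, (version, title) in incoming.items():
        if member_id not in merged or version > merged[member_id][0]:
            merged[member_id] = (version, title)
    return merged


online = {"m1": (7, "Chief Technology Officer")}
reprocessed = {"m1": (5, "Architect"), "m2": (3, "Engineer")}

merged = merge_results(online, reprocessed)
print(merged["m1"])  # (7, 'Chief Technology Officer')
print(merged["m2"])  # (3, 'Engineer')
```

With this rule a stale backup record never overwrites a fresher online update, and the outcome does not depend on which job writes first.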
Reprocessing - Caveats
• Stream processors are fast. They can DoS the system if you reprocess
– Control max-concurrency of your job
– Quotas for Kafka, databases
• Reprocessing a 100 TB source?
– Capacity?
– Saturation of NICs, top-of-rack switches
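One common way to keep a reprocessing job from overwhelming downstream systems is a token bucket. The sketch below is a generic Python illustration, not a Samza or Kafka quota feature; the class name `TokenBucket` and its parameters are hypothetical, and the clock is injectable so the behaviour can be checked without sleeping.

```python
import time


class TokenBucket:
    """Simple token bucket to cap how fast a reprocessing job issues requests."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.last = clock()

    def try_acquire(self, n=1):
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False


# Fake clock so the example is deterministic.
fake_now = [0.0]
bucket = TokenBucket(rate_per_sec=10, capacity=5, clock=lambda: fake_now[0])

burst = [bucket.try_acquire() for _ in range(6)]
print(burst)  # [True, True, True, True, True, False]

fake_now[0] = 0.5  # half a second later, 5 tokens have been refilled
print(bucket.try_acquire())  # True
```

A job would call `try_acquire` before each downstream read or write and back off when it returns False.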
Reprocessing larger datasets
[Diagram: online path: Databus -> Profile Updates -> (Samza) Title-Standardizer with Machine Learning model -> Kafka -> output. Reprocessing path: Database Dump in HDFS -> (Samza) Title-Standardizer running on Hadoop with ML Model in HDFS. A (Samza) Merge and Store Results job combines the results.]
Experimentation
[Diagram: Database Dump in HDFS -> (Samza) Title-Standardizer running on Hadoop with ML Model in HDFS -> Output in HDFS.]
● Offline experimentation before pushing the logic online
○ Most datasets are already available in Hadoop (at LinkedIn)
○ Fast iteration with minimum impact to production
Conclusion
1. It is possible to avoid code duplication (hot/cold path) to support
– Accuracy
– Reprocessing
2. Some Lambda-related problems still linger when reprocessing entire datasets
– e.g. merging online/reprocessing results
References
• MillWheel: http://research.google.com/pubs/pub41378.html
• DataFlow: http://research.google.com/pubs/pub41378.html
• Samza: http://samza.apache.org/
• Window Operator in Samza: https://issues.apache.org/jira/browse/SAMZA-552
• Lambda Architecture: https://www.manning.com/books/big-data
• Stream Processing 101: https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101
• Stream Processing 102: https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-102
Thank You!