SlideShare a Scribd company logo
Building Streaming Applications
With
Apache Storm 1.1
Meetup
Hortonworks, April 20th, 2017
Presenters
• Sriharsha Chintalapani, Storm & Kafka
Committer, PMC @ Hortonworks
• Karthik Deivasigamani, Walmart Labs
• Roshan Naik, Storm Contributor, Flume
Commiter @ Hortonworks
• Hugo Louro, Storm Committer, PMC @
Hortonworks
Apache Storm
Apache Storm Brief History
• 2010 - First Streaming Framework - Backtype
• 2011 – Acquired by and Deployed at Twitter
• 2013 - Open Sourced into Apache
• Present – Large Scale Production Deployments
– Yahoo 3500+ Nodes
– Alibaba 1PB of Data per Day
Prior Releases Highlights
• 0.9.x
• Storm becomes an Apache TLP
• First Official Apache Release
• Expanded Kafka, HDFS, HBase Integration
• 0.10.x
• Multi Tenancy
• Rolling Upgrades
• Improved Logging (Log4j2)
• JDBC, Event Hubs, Hive Integration
Prior Releases Highlights
• 1.0
– Pacemaker (Replaces Zookeeper for Heartbeats)
– Security (Kerberos/Digest Authentication)
– Nimbus HA (Eliminates Single Point of Failure)
– Supervisor Health Checks
– Resource Aware Scheduler
Prior Releases Highlights
• 1.0
– Stateful Bolts
– Automatic Checkpointing/Snapshots
• ABS [2], Chandy-Lamport [3] Algorithms
– Streaming Windows
• Sliding, Tumbling, Watermarks, Out of Order Tuples
– Dynamic Log Levels
– Distributed Log Search
– Worker Profiling
– Solr, Cassandra, Elastic Search, MQTT Integration
Apache Storm 1.1.0
March 29, 2017
• Streaming SQL
• Improved Apache Kafka Integration
• PMML Support (Machine Learning)
• Druid Integration
• OpenTSDB Integration
Apache Storm 1.1.0
March 29, 2017
• AWS Kinesis Support
• HDFS Spout
• Other Enhancements
–Flux
–Topology Deployment
–Resource Aware Scheduler
Streaming SQL
• Apache Calcite for Query Parsing/Planning
• Define Topology Using SQL Like Query
• SQL Compiled and Transformed onto a Trident
Topology
• Streaming Onto/From Arbitrary Data Sources
– Kafka, Redis, HDFS, MongoDB
– Extensible Implementing ISqlTridentDataSource
Streaming SQL
• Tuple Filtering
• Projections
• CSV, TSV, and Avro input/output formats
• User Defined Functions (UDFs)
• User fine control of Parallelism of Generated
Components
Streaming SQL - Aggregate UDF
Streaming SQL – Example [1]
• Read Apache HTTPD server logs from Kafka
• Filter out everything but error log events
• Write the error events onto a Kafka topic
Streaming SQL – Example [1]
Improved Apache Kafka Integration
• Enhanced configuration API
• Support Consumer Groups
• Pluggable Translators Kafka Record ->Tuple
• Support for Topic Wildcards
• Support Multiple Streams, Topics/Stream
• Trident Kafka supporting Kafka 0.10 onwards
• Integrates with Secure Kafka Environments
Improved Apache Kafka Integration
PMML Support (Machine Learning)
• Predictive Model Markup Language
• Describes Model Learned by ML algorithms
• PmmlPredictorBolt Computes Predicted Scores
for Live Tuples according to PMML Model
• PMML Model Uploaded or Downloaded from
Distributed Cache
PMML Support (Machine Learning)
Storm 1.1.0 Improvements
• Flux
– Visualization in Storm UI
• Specify the resource requirements (Memory/CPU) for
individual topology components (Spouts/Bolts)
• Topology Deployment
– Alternative to Uber Jar
– storm jar --jars /path/to/local/jar --artifacts `resolve Maven
dependencies` -- arfifactRepository `additional Maven
repos`
• Specify the resource requirements (Memory/CPU) for
individual topology components (Spouts/Bolts)
Try Storm 1.1.0
https://hortonworks.com/hadoop-tutorial/processing-
trucking-iot-data-with-apache-storm/
Apache Storm 2.0
• Storm Code entirely in Java (no more Clojure)
• Performance Improvements
• Worker/Threading Model Redesign
• Apache Beam Integration
• Bounded Spouts
• Metrics Enhancements
• Worker-Classloader Isolation
• Improved Backpressure
• Dynamic Topology Updates
References
• [1] Taylor Goetz Presentation @ DataWorks/Hadoop Summit, Munich 2017
• [2] http://arxiv.org/pdf/1506.08603v1.pdf
• [3] http://research.microsoft.com/en-us/um/people/lamport/pubs/chandy.pdf

More Related Content

What's hot

Being Ready for Apache Kafka - Apache: Big Data Europe 2015
Being Ready for Apache Kafka - Apache: Big Data Europe 2015Being Ready for Apache Kafka - Apache: Big Data Europe 2015
Being Ready for Apache Kafka - Apache: Big Data Europe 2015
Michael Noll
 
Fraud Detection for Israel BigThings Meetup
Fraud Detection  for Israel BigThings MeetupFraud Detection  for Israel BigThings Meetup
Fraud Detection for Israel BigThings Meetup
Gwen (Chen) Shapira
 
Scaling Apache Storm - Strata + Hadoop World 2014
Scaling Apache Storm - Strata + Hadoop World 2014Scaling Apache Storm - Strata + Hadoop World 2014
Scaling Apache Storm - Strata + Hadoop World 2014
P. Taylor Goetz
 
Kafka connect-london-meetup-2016
Kafka connect-london-meetup-2016Kafka connect-london-meetup-2016
Kafka connect-london-meetup-2016
Gwen (Chen) Shapira
 
Emerging technologies /frameworks in Big Data
Emerging technologies /frameworks in Big DataEmerging technologies /frameworks in Big Data
Emerging technologies /frameworks in Big Data
Rahul Jain
 
Kafka & Hadoop - for NYC Kafka Meetup
Kafka & Hadoop - for NYC Kafka MeetupKafka & Hadoop - for NYC Kafka Meetup
Kafka & Hadoop - for NYC Kafka Meetup
Gwen (Chen) Shapira
 
The Future of Apache Storm
The Future of Apache StormThe Future of Apache Storm
The Future of Apache Storm
P. Taylor Goetz
 
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
Erik Onnen
 
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Amazon Web Services
 
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...Introducing Kafka Streams, the new stream processing library of Apache Kafka,...
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...
Michael Noll
 
Spark streaming + kafka 0.10
Spark streaming + kafka 0.10Spark streaming + kafka 0.10
Spark streaming + kafka 0.10
Joan Viladrosa Riera
 
Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...
Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...
Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...
Chris Fregly
 
Tuning and Monitoring Deep Learning on Apache Spark
Tuning and Monitoring Deep Learning on Apache SparkTuning and Monitoring Deep Learning on Apache Spark
Tuning and Monitoring Deep Learning on Apache Spark
Databricks
 
Stream processing using Apache Storm - Big Data Meetup Athens 2016
Stream processing using Apache Storm - Big Data Meetup Athens 2016Stream processing using Apache Storm - Big Data Meetup Athens 2016
Stream processing using Apache Storm - Big Data Meetup Athens 2016
Adrianos Dadis
 
Real Time Data Streaming using Kafka & Storm
Real Time Data Streaming using Kafka & StormReal Time Data Streaming using Kafka & Storm
Real Time Data Streaming using Kafka & Storm
Ran Silberman
 
kafka for db as postgres
kafka for db as postgreskafka for db as postgres
kafka for db as postgres
PivotalOpenSourceHub
 
Kafka and Spark Streaming
Kafka and Spark StreamingKafka and Spark Streaming
Kafka and Spark Streaming
datamantra
 
Real-Time Big Data at In-Memory Speed, Using Storm
Real-Time Big Data at In-Memory Speed, Using StormReal-Time Big Data at In-Memory Speed, Using Storm
Real-Time Big Data at In-Memory Speed, Using Storm
Nati Shalom
 
Developing Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaDeveloping Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache Kafka
Joe Stein
 
Apache Storm In Retail Context
Apache Storm In Retail ContextApache Storm In Retail Context
Apache Storm In Retail Context
Karthik Deivasigamani
 

What's hot (20)

Being Ready for Apache Kafka - Apache: Big Data Europe 2015
Being Ready for Apache Kafka - Apache: Big Data Europe 2015Being Ready for Apache Kafka - Apache: Big Data Europe 2015
Being Ready for Apache Kafka - Apache: Big Data Europe 2015
 
Fraud Detection for Israel BigThings Meetup
Fraud Detection  for Israel BigThings MeetupFraud Detection  for Israel BigThings Meetup
Fraud Detection for Israel BigThings Meetup
 
Scaling Apache Storm - Strata + Hadoop World 2014
Scaling Apache Storm - Strata + Hadoop World 2014Scaling Apache Storm - Strata + Hadoop World 2014
Scaling Apache Storm - Strata + Hadoop World 2014
 
Kafka connect-london-meetup-2016
Kafka connect-london-meetup-2016Kafka connect-london-meetup-2016
Kafka connect-london-meetup-2016
 
Emerging technologies /frameworks in Big Data
Emerging technologies /frameworks in Big DataEmerging technologies /frameworks in Big Data
Emerging technologies /frameworks in Big Data
 
Kafka & Hadoop - for NYC Kafka Meetup
Kafka & Hadoop - for NYC Kafka MeetupKafka & Hadoop - for NYC Kafka Meetup
Kafka & Hadoop - for NYC Kafka Meetup
 
The Future of Apache Storm
The Future of Apache StormThe Future of Apache Storm
The Future of Apache Storm
 
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
 
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
 
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...Introducing Kafka Streams, the new stream processing library of Apache Kafka,...
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...
 
Spark streaming + kafka 0.10
Spark streaming + kafka 0.10Spark streaming + kafka 0.10
Spark streaming + kafka 0.10
 
Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...
Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...
Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...
 
Tuning and Monitoring Deep Learning on Apache Spark
Tuning and Monitoring Deep Learning on Apache SparkTuning and Monitoring Deep Learning on Apache Spark
Tuning and Monitoring Deep Learning on Apache Spark
 
Stream processing using Apache Storm - Big Data Meetup Athens 2016
Stream processing using Apache Storm - Big Data Meetup Athens 2016Stream processing using Apache Storm - Big Data Meetup Athens 2016
Stream processing using Apache Storm - Big Data Meetup Athens 2016
 
Real Time Data Streaming using Kafka & Storm
Real Time Data Streaming using Kafka & StormReal Time Data Streaming using Kafka & Storm
Real Time Data Streaming using Kafka & Storm
 
kafka for db as postgres
kafka for db as postgreskafka for db as postgres
kafka for db as postgres
 
Kafka and Spark Streaming
Kafka and Spark StreamingKafka and Spark Streaming
Kafka and Spark Streaming
 
Real-Time Big Data at In-Memory Speed, Using Storm
Real-Time Big Data at In-Memory Speed, Using StormReal-Time Big Data at In-Memory Speed, Using Storm
Real-Time Big Data at In-Memory Speed, Using Storm
 
Developing Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaDeveloping Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache Kafka
 
Apache Storm In Retail Context
Apache Storm In Retail ContextApache Storm In Retail Context
Apache Storm In Retail Context
 

Similar to Building Streaming Applications with Apache Storm 1.1

Past, Present, and Future of Apache Storm
Past, Present, and Future of Apache StormPast, Present, and Future of Apache Storm
Past, Present, and Future of Apache Storm
P. Taylor Goetz
 
Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks Technical Workshop: Interactive Query with Apache Hive Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks
 
Cloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureCloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azure
Timothy Spann
 
messaging.pptx
messaging.pptxmessaging.pptx
messaging.pptx
NParakh1
 
Using Apache NiFi with Apache Pulsar for Fast Data On-Ramp
Using Apache NiFi with Apache Pulsar for Fast Data On-RampUsing Apache NiFi with Apache Pulsar for Fast Data On-Ramp
Using Apache NiFi with Apache Pulsar for Fast Data On-Ramp
Timothy Spann
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !
Guido Schmutz
 
Introducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar Series
Introducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar SeriesIntroducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar Series
Introducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar Series
Amazon Web Services
 
Trend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache BigtopTrend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache Bigtop
Evans Ye
 
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Chris Fregly
 
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks
 
Music city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeMusic city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lake
Timothy Spann
 
Introduction to Apache NiFi And Storm
Introduction to Apache NiFi And StormIntroduction to Apache NiFi And Storm
Introduction to Apache NiFi And Storm
Jungtaek Lim
 
PMIx Tiered Storage Support
PMIx Tiered Storage SupportPMIx Tiered Storage Support
PMIx Tiered Storage Support
rcastain
 
Stinger.Next by Alan Gates of Hortonworks
Stinger.Next by Alan Gates of HortonworksStinger.Next by Alan Gates of Hortonworks
Stinger.Next by Alan Gates of Hortonworks
Data Con LA
 
What's New in IBM Streams V4.1
What's New in IBM Streams V4.1What's New in IBM Streams V4.1
What's New in IBM Streams V4.1
lisanl
 
Search On Hadoop Frontier Meetup
Search On Hadoop Frontier MeetupSearch On Hadoop Frontier Meetup
Search On Hadoop Frontier Meetupgregchanan
 
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Kai Wähner
 
Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...
Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...
Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...
Trivadis
 
DEVNET-1106 Upcoming Services in OpenStack
DEVNET-1106	Upcoming Services in OpenStackDEVNET-1106	Upcoming Services in OpenStack
DEVNET-1106 Upcoming Services in OpenStack
Cisco DevNet
 
Apache Content Technologies
Apache Content TechnologiesApache Content Technologies
Apache Content Technologies
gagravarr
 

Similar to Building Streaming Applications with Apache Storm 1.1 (20)

Past, Present, and Future of Apache Storm
Past, Present, and Future of Apache StormPast, Present, and Future of Apache Storm
Past, Present, and Future of Apache Storm
 
Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks Technical Workshop: Interactive Query with Apache Hive Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks Technical Workshop: Interactive Query with Apache Hive
 
Cloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureCloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azure
 
messaging.pptx
messaging.pptxmessaging.pptx
messaging.pptx
 
Using Apache NiFi with Apache Pulsar for Fast Data On-Ramp
Using Apache NiFi with Apache Pulsar for Fast Data On-RampUsing Apache NiFi with Apache Pulsar for Fast Data On-Ramp
Using Apache NiFi with Apache Pulsar for Fast Data On-Ramp
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !
 
Introducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar Series
Introducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar SeriesIntroducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar Series
Introducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar Series
 
Trend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache BigtopTrend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache Bigtop
 
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
 
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
 
Music city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeMusic city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lake
 
Introduction to Apache NiFi And Storm
Introduction to Apache NiFi And StormIntroduction to Apache NiFi And Storm
Introduction to Apache NiFi And Storm
 
PMIx Tiered Storage Support
PMIx Tiered Storage SupportPMIx Tiered Storage Support
PMIx Tiered Storage Support
 
Stinger.Next by Alan Gates of Hortonworks
Stinger.Next by Alan Gates of HortonworksStinger.Next by Alan Gates of Hortonworks
Stinger.Next by Alan Gates of Hortonworks
 
What's New in IBM Streams V4.1
What's New in IBM Streams V4.1What's New in IBM Streams V4.1
What's New in IBM Streams V4.1
 
Search On Hadoop Frontier Meetup
Search On Hadoop Frontier MeetupSearch On Hadoop Frontier Meetup
Search On Hadoop Frontier Meetup
 
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
 
Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...
Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...
Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...
 
DEVNET-1106 Upcoming Services in OpenStack
DEVNET-1106	Upcoming Services in OpenStackDEVNET-1106	Upcoming Services in OpenStack
DEVNET-1106 Upcoming Services in OpenStack
 
Apache Content Technologies
Apache Content TechnologiesApache Content Technologies
Apache Content Technologies
 

Recently uploaded

哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
axoqas
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
2023240532
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
74nqk8xf
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
ewymefz
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
Subhajit Sahu
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
jerlynmaetalle
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
u86oixdj
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
ewymefz
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
vcaxypu
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Boston Institute of Analytics
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
ocavb
 
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
nscud
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
enxupq
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
slg6lamcq
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
MaleehaSheikh2
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
nscud
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
Opendatabay
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
pchutichetpong
 

Recently uploaded (20)

哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
 
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
 

Building Streaming Applications with Apache Storm 1.1

  • 1. Building Streaming Applications With Apache Storm 1.1 Meetup Hortonworks, April 20th, 2017
  • 2. Presenters • Sriharsha Chintalapani, Storm & Kafka Committer, PMC @ Hortonworks • Karthik Deivasigamani, Walmart Labs • Roshan Naik, Storm Contributor, Flume Commiter @ Hortonworks • Hugo Louro, Storm Committer, PMC @ Hortonworks
  • 4. Apache Storm Brief History • 2010 - First Streaming Framework - Backtype • 2011 – Acquired by and Deployed at Twitter • 2013 - Open Sourced into Apache • Present – Large Scale Production Deployments – Yahoo 3500+ Nodes – Alibaba 1PB of Data per Day
  • 5. Prior Releases Highlights • 0.9.x • Storm becomes an Apache TLP • First Official Apache Release • Expanded Kafka, HDFS, HBase Integration • 0.10.x • Multi Tenancy • Rolling Upgrades • Improved Logging (Log4j2) • JDBC, Event Hubs, Hive Integration
  • 6. Prior Releases Highlights • 1.0 – Pacemaker (Replaces Zookeeper for Heartbeats) – Security (Kerberos/Digest Authentication) – Nimbus HA (Eliminates Single Point of Failure) – Supervisor Health Checks – Resource Aware Scheduler
  • 7. Prior Releases Highlights • 1.0 – Stateful Bolts – Automatic Checkpointing/Snapshots • ABS [2], Chandy-Lamport [3] Algorithms – Streaming Windows • Sliding, Tumbling, Watermarks, Out of Order Tuples – Dynamic Log Levels – Distributed Log Search – Worker Profiling – Solr, Cassandra, Elastic Search, MQTT Integration
  • 8. Apache Storm 1.1.0 March 29, 2017 • Streaming SQL • Improved Apache Kafka Integration • PMML Support (Machine Learning) • Druid Integration • OpenTSDB Integration
  • 9. Apache Storm 1.1.0 March 29, 2017 • AWS Kinesis Support • HDFS Spout • Other Enhancements –Flux –Topology Deployment –Resource Aware Scheduler
  • 10. Streaming SQL • Apache Calcite for Query Parsing/Planning • Define Topology Using SQL Like Query • SQL Compiled and Transformed onto a Trident Topology • Streaming Onto/From Arbitrary Data Sources – Kafka, Redis, HDFS, MongoDB – Extensible Implementing ISqlTridentDataSource
  • 11. Streaming SQL • Tuple Filtering • Projections • CSV, TSV, and Avro input/output formats • User Defined Functions (UDFs) • User fine control of Parallelism of Generated Components
  • 12. Streaming SQL - Aggregate UDF
  • 13. Streaming SQL – Example [1] • Read Apache HTTPD server logs from Kafka • Filter out everything but error log events • Write the error events onto a Kafka topic
  • 14. Streaming SQL – Example [1]
  • 15. Improved Apache Kafka Integration • Enhanced configuration API • Support Consumer Groups • Pluggable Translators Kafka Record ->Tuple • Support for Topic Wildcards • Support Multiple Streams, Topics/Stream • Trident Kafka supporting Kafka 0.10 onwards • Integrates with Secure Kafka Environments
  • 16. Improved Apache Kafka Integration
  • 17. PMML Support (Machine Learning) • Predictive Model Markup Language • Describes Model Learned by ML algorithms • PmmlPredictorBolt Computes Predicted Scores for Live Tuples according to PMML Model • PMML Model Uploaded or Downloaded from Distributed Cache
  • 19. Storm 1.1.0 Improvements • Flux – Visualization in Storm UI • Specify the resource requirements (Memory/CPU) for individual topology components (Spouts/Bolts) • Topology Deployment – Alternative to Uber Jar – storm jar --jars /path/to/local/jar --artifacts `resolve Maven dependencies` -- arfifactRepository `additional Maven repos` • Specify the resource requirements (Memory/CPU) for individual topology components (Spouts/Bolts)
  • 21. Apache Storm 2.0 • Storm Code entirely in Java (no more Clojure) • Performance Improvements • Worker/Threading Model Redesign • Apache Beam Integration • Bounded Spouts • Metrics Enhancements • Worker-Classloader Isolation • Improved Backpressure • Dynamic Topology Updates
  • 22. References • [1] Taylor Goetz Presentation @ DataWorks/Hadoop Summit, Munich 2017 • [2] http://arxiv.org/pdf/1506.08603v1.pdf • [3] http://research.microsoft.com/en-us/um/people/lamport/pubs/chandy.pdf