Real-time Stream Processing
using
Apache Apex
Bhupesh Chawda
bhupesh@apache.org
DataTorrent Software
Apache Apex - Stream Processing
● YARN-native - Uses the Hadoop YARN framework for resource negotiation
● Highly Scalable - Scales statically as well as dynamically
● Highly Performant - Can reach single-digit millisecond end-to-end latency
● Fault Tolerant - Automatically recovers from failures - without manual intervention
● Stateful - Guarantees that no state will be lost
● Easily Operable - Exposes an easy API for developing Operators (part of an
application) and Applications
Project History
● Project development started in 2012 at DataTorrent
● Open-sourced in July 2015
● Apache Apex started incubation in August 2015
● 50+ committers from Apple, GE, Capital One, DirecTV, Silver Spring Networks,
Barclays, Ampool and DataTorrent
● Mentors from Class Software, MapR and Hortonworks
● Soon to be a top level Apache project
Apex Platform Overview
An Apex Application is a DAG
(Directed Acyclic Graph)
● A DAG is composed of vertices (Operators) and edges (Streams).
● A Stream is a sequence of data tuples which connects operators at end-points called Ports
● An Operator takes one or more input streams, performs computations & emits one or more output streams
● Each operator is either the user's business logic or a built-in operator from the open source library
● An operator may have multiple instances that run in parallel
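The DAG structure described above can be modeled in a few lines. This is a toy sketch in plain Java and not the Apex API: the class `MiniDag` and its method signatures are hypothetical, and ports are left out so that a stream is simply a directed edge between two named operators. It shows why acyclicity matters: a valid upstream-to-downstream ordering exists only when the streams do not form a cycle.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical toy model of an application DAG; this is NOT the Apex API. */
final class MiniDag {
    // operator name -> names of the operators it streams to
    private final Map<String, List<String>> downstream = new LinkedHashMap<>();

    MiniDag addOperator(String name) {
        downstream.putIfAbsent(name, new ArrayList<>());
        return this;
    }

    /** A stream is modeled as a directed edge between two operators. */
    MiniDag addStream(String from, String to) {
        addOperator(from);
        addOperator(to);
        downstream.get(from).add(to);
        return this;
    }

    /** Kahn's algorithm: upstream-to-downstream order, or an exception if there is a cycle. */
    List<String> topologicalOrder() {
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        for (String op : downstream.keySet()) inDegree.put(op, 0);
        for (List<String> targets : downstream.values())
            for (String t : targets) inDegree.merge(t, 1, Integer::sum);

        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : inDegree.entrySet())
            if (e.getValue() == 0) ready.add(e.getKey());

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String op = ready.remove();
            order.add(op);
            for (String t : downstream.get(op))
                if (inDegree.merge(t, -1, Integer::sum) == 0) ready.add(t);
        }
        if (order.size() != downstream.size())
            throw new IllegalStateException("streams form a cycle; not a DAG");
        return order;
    }

    public static void main(String[] args) {
        MiniDag dag = new MiniDag()
                .addStream("reader", "parser")
                .addStream("parser", "counter")
                .addStream("counter", "writer");
        System.out.println(dag.topologicalOrder()); // prints [reader, parser, counter, writer]
    }
}
```

The operator names in `main` are made up for illustration.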
Hadoop 1.0 vs 2.0 - YARN
Apex as a YARN Application
● YARN (Hadoop 2.0) replaces the MapReduce-specific resource management
of Hadoop 1.0 with a more generic Resource Management
Framework.
● Apex uses YARN for resource management
and HDFS for persistent storage
Support for Windowing
● Apex splits incoming tuples into finite time slices - Streaming Windows
○ Transparent to the user
○ Apex Default = 500 ms
● Checkpointing and book-keeping done at Streaming window boundary
● Applications may need to perform computations in windows - Application Windows
○ Specified as a multiple of Streaming Window size
○ Callbacks to user operator logic
■ beginWindow(long windowId)
■ endWindow()
○ Example - An application which computes some aggregates and emits them every minute. Here
application window size = 60 secs = 120 Streaming Windows (at the default of 500 ms)
● Sliding and Tumbling Application windows are supported natively
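The relation between the two window types is simple arithmetic: with a 500 ms streaming window, a 60-second application window spans 60,000 / 500 = 120 streaming windows. The sketch below works that out and shows how a per-minute aggregate might be built on the two callbacks. It is plain Java, not the Apex operator API: only the names `beginWindow(long windowId)` and `endWindow()` come from the slides, while the class, the `process` method and the `emitted` list are hypothetical. It also assumes the callbacks fire once per application window.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a user operator; only the two callback names come from the slides. */
final class MinuteSumOperator {
    static final long STREAMING_WINDOW_MS = 500;      // default streaming window size
    static final long APPLICATION_WINDOW_MS = 60_000; // one-minute application window

    /** Application window size expressed as a count of streaming windows. */
    static long streamingWindowsPerApplicationWindow() {
        return APPLICATION_WINDOW_MS / STREAMING_WINDOW_MS;
    }

    private long sum;
    final List<Long> emitted = new ArrayList<>();

    void beginWindow(long windowId) { sum = 0; }          // start a fresh aggregate
    void process(long value)        { sum += value; }     // hypothetical per-tuple callback
    void endWindow()                { emitted.add(sum); } // emit the aggregate at the boundary

    public static void main(String[] args) {
        System.out.println(streamingWindowsPerApplicationWindow()); // prints 120

        MinuteSumOperator op = new MinuteSumOperator();
        op.beginWindow(1);
        op.process(4);
        op.process(6);
        op.endWindow();
        op.beginWindow(2);
        op.process(5);
        op.endWindow();
        System.out.println(op.emitted); // prints [10, 5]
    }
}
```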
Buffer Server
● Staging area for outgoing tuples
● Downstream operators connect to upstream Buffer Server to subscribe for tuples
● Plays a role in recovery by replaying data to the downstream operator from a
particular checkpoint
● Spooling to disk is also supported
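The staging and replay roles listed above can be pictured with a small in-memory model. This is a hypothetical sketch, not the Buffer Server implementation: the class `ToyBufferServer` and all of its methods are invented for illustration, tuples are plain strings, and disk spooling is omitted. Dropping windows once they are committed is an assumption made here to keep the buffer bounded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical in-memory model of a staging buffer keyed by window id. */
final class ToyBufferServer {
    private final TreeMap<Long, List<String>> tuplesByWindow = new TreeMap<>();

    /** Upstream side: stage an outgoing tuple under the window it belongs to. */
    void publish(long windowId, String tuple) {
        tuplesByWindow.computeIfAbsent(windowId, id -> new ArrayList<>()).add(tuple);
    }

    /** Recovery side: replay every tuple from windows after the checkpointed one, in window order. */
    List<String> replayAfter(long checkpointWindowId) {
        List<String> out = new ArrayList<>();
        for (List<String> tuples : tuplesByWindow.tailMap(checkpointWindowId, false).values())
            out.addAll(tuples);
        return out;
    }

    /** Assumption: windows at or before the committed window are never requested again. */
    void purgeUpTo(long committedWindowId) {
        tuplesByWindow.headMap(committedWindowId, true).clear();
    }

    public static void main(String[] args) {
        ToyBufferServer buffer = new ToyBufferServer();
        buffer.publish(1, "a");
        buffer.publish(2, "b");
        buffer.publish(2, "c");
        buffer.publish(3, "d");
        System.out.println(buffer.replayAfter(1)); // prints [b, c, d]
        buffer.purgeUpTo(2);
        System.out.println(buffer.replayAfter(0)); // prints [d]
    }
}
```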
Fault Tolerance - Checkpointing
● During checkpointing all operator state is written to HDFS asynchronously
● This is decentralized and happens independently for each operator
● If all operators in the DAG have checkpointed a particular window, then that window
is said to be committed and all previous checkpoints are purged
Operator:              O1   O2   O3   O4
Checkpoint #:           3    3    3    2
Checkpoint Window #:  180  180  180  120
Committed Checkpoint # = 2
Committed Window # = 120
Checkpoint Window = 60 Streaming Windows
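Read this way, the committed window is the newest window that every operator has checkpointed, which is the minimum of the per-operator latest checkpoint windows. A minimal sketch of that rule, using the numbers from the diagram (the class and method names are hypothetical, and it assumes each operator has also checkpointed every earlier checkpoint window):

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

final class CommittedWindowExample {
    /** The newest window checkpointed by every operator: the minimum over all operators. */
    static long committedWindow(Map<String, Long> latestCheckpointWindowByOperator) {
        return Collections.min(latestCheckpointWindowByOperator.values());
    }

    public static void main(String[] args) {
        Map<String, Long> latest = new LinkedHashMap<>();
        latest.put("O1", 180L);
        latest.put("O2", 180L);
        latest.put("O3", 180L);
        latest.put("O4", 120L); // O4 has not yet checkpointed window 180
        System.out.println(committedWindow(latest)); // prints 120
    }
}
```

Once O4 also checkpoints window 180, the minimum moves to 180 and the checkpoints for window 120 become eligible for purging.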
Recovery
● The Apex Application Master detects the failure
of an operator from missing heartbeats
or from windows that are
not progressing
● All operators downstream of the failed
operator are restarted from the last
committed checkpoint to recover
their state.
● Data is replayed from the same checkpoint
by the Buffer Server
● Recovery is automatic and does not require
manual intervention.
Scalability - Partitioning
● Operators can be “replicated” (partitioned) into
multiple instances to cope with high-speed
input streams.
● Can be specified at Application launch time
● User can control the distribution of tuples to
downstream partitions.
● An automatic Unifier merges the tuples emitted by the partitions
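Partitioning followed by unification can be sketched as a keyed count. The example below is plain Java with hypothetical names (`PartitionedCount`, `partitionFor`, `countPerPartition`, `unify`) and is not how Apex implements partitioners or unifiers. It assumes hash-based distribution, so every occurrence of a key reaches the same partition, and a unifier that sums the partial counts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class PartitionedCount {
    /** Assumed distribution rule: the same key always maps to the same partition. */
    static int partitionFor(String key, int partitions) {
        return Math.floorMod(key.hashCode(), partitions);
    }

    /** Each partition counts only the keys routed to it. */
    static List<Map<String, Integer>> countPerPartition(List<String> keys, int partitions) {
        List<Map<String, Integer>> partials = new ArrayList<>();
        for (int i = 0; i < partitions; i++) partials.add(new TreeMap<>());
        for (String key : keys)
            partials.get(partitionFor(key, partitions)).merge(key, 1, Integer::sum);
        return partials;
    }

    /** The unifier merges the partial results into a single stream of totals. */
    static Map<String, Integer> unify(List<Map<String, Integer>> partials) {
        Map<String, Integer> total = new TreeMap<>();
        for (Map<String, Integer> partial : partials)
            partial.forEach((key, count) -> total.merge(key, count, Integer::sum));
        return total;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "a", "c", "b", "a");
        System.out.println(unify(countPerPartition(keys, 3))); // prints {a=3, b=2, c=1}
    }
}
```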
Scalability - Dynamic scaling
● Auto scaling is also supported: the number of partitions may increase or
decrease automatically based on the incoming load, and the behavior can be customized by the user
● User has to define the trigger for auto scaling:
○ Example - Increase partitions if latency goes above 100 ms.
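A latency-based trigger like the one in the example might look as follows. This is only a sketch of the decision rule, not the Apex mechanism for repartitioning: the class `LatencyScalingPolicy`, the scale-down threshold, and the partition limits are all assumptions added for illustration. Only the 100 ms scale-up threshold comes from the slide.

```java
/** Hypothetical scaling rule: one partition up or down per evaluation, within fixed limits. */
final class LatencyScalingPolicy {
    static int desiredPartitions(int current, double latencyMs,
                                 double scaleUpAboveMs, double scaleDownBelowMs,
                                 int minPartitions, int maxPartitions) {
        int desired = current;
        if (latencyMs > scaleUpAboveMs) {
            desired = current + 1;   // falling behind: add a partition
        } else if (latencyMs < scaleDownBelowMs) {
            desired = current - 1;   // plenty of headroom: remove a partition
        }
        // Clamp to the allowed range so the trigger can never run away.
        return Math.max(minPartitions, Math.min(maxPartitions, desired));
    }

    public static void main(String[] args) {
        // Thresholds: scale up above 100 ms, scale down below 10 ms (the latter is assumed).
        System.out.println(desiredPartitions(2, 150.0, 100.0, 10.0, 1, 8)); // prints 3
        System.out.println(desiredPartitions(2, 50.0, 100.0, 10.0, 1, 8));  // prints 2
        System.out.println(desiredPartitions(2, 5.0, 100.0, 10.0, 1, 8));   // prints 1
    }
}
```

Stepping by one partition at a time and keeping a gap between the two thresholds is a design choice made here to avoid oscillating between partition counts.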
Apex Processing Semantics
● AT_LEAST_ONCE (default): Windows are processed at least once
● AT_MOST_ONCE: Windows are processed at most once
○ During recovery, all downstream operators are fast-forwarded to the window of latest checkpoint
● EXACTLY_ONCE: Windows are processed exactly once
○ Checkpoint every window
○ Checkpointing becomes blocking
Apex Guarantees
● Apex guarantees no loss of data or computational state; state is checkpointed
periodically
● Automatic recovery ensures that processing resumes from where it left off
● Order of incoming data is guaranteed to be maintained
○ Not applicable in case of partitioning of operators
● Events in a window are always replayed in the same window in case of failures
Application Specification
1. Add Operators
2. Add Streams
Logical and Physical DAGs
Apex Malhar Library
Decision Making in < 2ms
1. Performance requirements
a. A system which can provide very low latency for decision making
(40 ms)
b. Ability to handle large volumes of data and ever-changing rules (1,000
events per 20 ms burst)
c. 99.5% uptime, which allows about 1.8 days of downtime in a year
➔ Apex achieved:
◆ 2 ms latency against the requirement of 40 ms
◆ Was able to handle a 2,000-event burst against the requirement of a 1,000-event
burst, at a net rate of 70,000 events/s
◆ 99.9995% uptime against the requirement of 99.5% uptime
2. Relevant Roadmap
3. Enterprise grade
4. A healthy and diverse community and committers, i.e. not controlled by one
vendor
Talk Slides: http://www.slideshare.net/ilganeli/nextgen-decision-making-in-under-2ms
DataTorrent Blog: https://www.datatorrent.com/blog/next-gen-decision-making-in-under-2-milliseconds/
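The uptime percentages quoted for this evaluation translate into a yearly downtime budget by simple arithmetic: downtime = (100 - uptime%) / 100 × seconds per year. The sketch below works this out for the required and the achieved figure. The class and method names are hypothetical, and a 365-day year is assumed.

```java
final class UptimeBudget {
    static final double SECONDS_PER_YEAR = 365.0 * 24 * 60 * 60; // assumes a non-leap year

    /** Downtime allowed per year, in seconds, for a given uptime percentage. */
    static double downtimeSecondsPerYear(double uptimePercent) {
        return SECONDS_PER_YEAR * (100.0 - uptimePercent) / 100.0;
    }

    public static void main(String[] args) {
        // Required: 99.5% uptime, which comes to roughly 1.8 days of downtime per year.
        System.out.printf("99.5%% uptime: %.1f days/year%n",
                downtimeSecondsPerYear(99.5) / 86_400);
        // Achieved: 99.9995% uptime, which comes to roughly 2.6 minutes per year.
        System.out.printf("99.9995%% uptime: %.1f minutes/year%n",
                downtimeSecondsPerYear(99.9995) / 60);
    }
}
```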
Decision making in < 2ms contd..
● Comparison finally boiled down to
○ Apache Storm
○ Apache Flink
○ Apache Apex
● Some problems in Storm and Flink among others
○ Nimbus is a single point of failure
○ Bolts / Spouts / Operators share a JVM. Hard to debug
○ No dynamic topologies
○ Restarting entire topologies in case of failures
Resources
● Mailing List
○ Developers dev@apex.incubator.apache.org
○ Users users@apex.incubator.apache.org
● Apache Apex http://apex.apache.org/
● Github
○ Apex Core: http://github.com/apache/incubator-apex-core
○ Apex Malhar: http://github.com/apache/incubator-apex-malhar
● DataTorrent: http://www.datatorrent.com
● Twitter @ApacheApex Follow - https://twitter.com/apacheapex
● Facebook https://www.facebook.com/ApacheApex/
● Meetup http://www.meetup.com/topics/apache-apex
● Startup Program Free Enterprise License for Startups, Universities, Non-Profits
Thank you!
Please send your questions to bhupesh@apache.org

More Related Content

What's hot

Apex as yarn application
Apex as yarn applicationApex as yarn application
Apex as yarn application
Chinmay Kolhatkar
 
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache ApexHadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
Apache Apex
 
Apache Apex Meetup at Cask
Apache Apex Meetup at CaskApache Apex Meetup at Cask
Apache Apex Meetup at Cask
Apache Apex
 
IoT Ingestion & Analytics using Apache Apex - A Native Hadoop Platform
 IoT Ingestion & Analytics using Apache Apex - A Native Hadoop Platform IoT Ingestion & Analytics using Apache Apex - A Native Hadoop Platform
IoT Ingestion & Analytics using Apache Apex - A Native Hadoop Platform
Apache Apex
 
Apache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and ApplicationsApache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and Applications
Thomas Weise
 
Introduction to Apache Apex and writing a big data streaming application
Introduction to Apache Apex and writing a big data streaming application  Introduction to Apache Apex and writing a big data streaming application
Introduction to Apache Apex and writing a big data streaming application
Apache Apex
 
Smart Partitioning with Apache Apex (Webinar)
Smart Partitioning with Apache Apex (Webinar)Smart Partitioning with Apache Apex (Webinar)
Smart Partitioning with Apache Apex (Webinar)
Apache Apex
 
Apache Apex connector with Kafka 0.9 consumer API
Apache Apex connector with Kafka 0.9 consumer APIApache Apex connector with Kafka 0.9 consumer API
Apache Apex connector with Kafka 0.9 consumer API
Apache Apex
 
Reactive systems
Reactive systemsReactive systems
Reactive systems
Naresh Chintalcheru
 
Developing streaming applications with apache apex (strata + hadoop world)
Developing streaming applications with apache apex (strata + hadoop world)Developing streaming applications with apache apex (strata + hadoop world)
Developing streaming applications with apache apex (strata + hadoop world)
Apache Apex
 
Capital One's Next Generation Decision in less than 2 ms
Capital One's Next Generation Decision in less than 2 msCapital One's Next Generation Decision in less than 2 ms
Capital One's Next Generation Decision in less than 2 ms
Apache Apex
 
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark StreamingIntro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Apache Apex
 
DataTorrent Presentation @ Big Data Application Meetup
DataTorrent Presentation @ Big Data Application MeetupDataTorrent Presentation @ Big Data Application Meetup
DataTorrent Presentation @ Big Data Application Meetup
Thomas Weise
 
Fault-Tolerant File Input & Output
Fault-Tolerant File Input & OutputFault-Tolerant File Input & Output
Fault-Tolerant File Input & Output
Apache Apex
 
Introduction to Apache Apex
Introduction to Apache ApexIntroduction to Apache Apex
Introduction to Apache Apex
Apache Apex
 
Autonomous workload rebalancing in kafka
Autonomous workload rebalancing in kafkaAutonomous workload rebalancing in kafka
Autonomous workload rebalancing in kafka
Indrajeet Kumar
 
Sql disaster recovery
Sql disaster recoverySql disaster recovery
Sql disaster recovery
Sqlperfomance
 
Apache Gearpump next-gen streaming engine
Apache Gearpump next-gen streaming engineApache Gearpump next-gen streaming engine
Apache Gearpump next-gen streaming engine
Tianlun Zhang
 
Spark Meetup:DataScience@Concur - Reacting to RT events to control throughput
Spark Meetup:DataScience@Concur - Reacting to RT events to control throughputSpark Meetup:DataScience@Concur - Reacting to RT events to control throughput
Spark Meetup:DataScience@Concur - Reacting to RT events to control throughput
Anikate Singh
 
Intro to Apache Apex @ Women in Big Data
Intro to Apache Apex @ Women in Big DataIntro to Apache Apex @ Women in Big Data
Intro to Apache Apex @ Women in Big Data
Apache Apex
 

What's hot (20)

Apex as yarn application
Apex as yarn applicationApex as yarn application
Apex as yarn application
 
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache ApexHadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
 
Apache Apex Meetup at Cask
Apache Apex Meetup at CaskApache Apex Meetup at Cask
Apache Apex Meetup at Cask
 
IoT Ingestion & Analytics using Apache Apex - A Native Hadoop Platform
 IoT Ingestion & Analytics using Apache Apex - A Native Hadoop Platform IoT Ingestion & Analytics using Apache Apex - A Native Hadoop Platform
IoT Ingestion & Analytics using Apache Apex - A Native Hadoop Platform
 
Apache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and ApplicationsApache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and Applications
 
Introduction to Apache Apex and writing a big data streaming application
Introduction to Apache Apex and writing a big data streaming application  Introduction to Apache Apex and writing a big data streaming application
Introduction to Apache Apex and writing a big data streaming application
 
Smart Partitioning with Apache Apex (Webinar)
Smart Partitioning with Apache Apex (Webinar)Smart Partitioning with Apache Apex (Webinar)
Smart Partitioning with Apache Apex (Webinar)
 
Apache Apex connector with Kafka 0.9 consumer API
Apache Apex connector with Kafka 0.9 consumer APIApache Apex connector with Kafka 0.9 consumer API
Apache Apex connector with Kafka 0.9 consumer API
 
Reactive systems
Reactive systemsReactive systems
Reactive systems
 
Developing streaming applications with apache apex (strata + hadoop world)
Developing streaming applications with apache apex (strata + hadoop world)Developing streaming applications with apache apex (strata + hadoop world)
Developing streaming applications with apache apex (strata + hadoop world)
 
Capital One's Next Generation Decision in less than 2 ms
Capital One's Next Generation Decision in less than 2 msCapital One's Next Generation Decision in less than 2 ms
Capital One's Next Generation Decision in less than 2 ms
 
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark StreamingIntro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
 
DataTorrent Presentation @ Big Data Application Meetup
DataTorrent Presentation @ Big Data Application MeetupDataTorrent Presentation @ Big Data Application Meetup
DataTorrent Presentation @ Big Data Application Meetup
 
Fault-Tolerant File Input & Output
Fault-Tolerant File Input & OutputFault-Tolerant File Input & Output
Fault-Tolerant File Input & Output
 
Introduction to Apache Apex
Introduction to Apache ApexIntroduction to Apache Apex
Introduction to Apache Apex
 
Autonomous workload rebalancing in kafka
Autonomous workload rebalancing in kafkaAutonomous workload rebalancing in kafka
Autonomous workload rebalancing in kafka
 
Sql disaster recovery
Sql disaster recoverySql disaster recovery
Sql disaster recovery
 
Apache Gearpump next-gen streaming engine
Apache Gearpump next-gen streaming engineApache Gearpump next-gen streaming engine
Apache Gearpump next-gen streaming engine
 
Spark Meetup:DataScience@Concur - Reacting to RT events to control throughput
Spark Meetup:DataScience@Concur - Reacting to RT events to control throughputSpark Meetup:DataScience@Concur - Reacting to RT events to control throughput
Spark Meetup:DataScience@Concur - Reacting to RT events to control throughput
 
Intro to Apache Apex @ Women in Big Data
Intro to Apache Apex @ Women in Big DataIntro to Apache Apex @ Women in Big Data
Intro to Apache Apex @ Women in Big Data
 

Viewers also liked

kakadu fishing tours - Surf & Sun
kakadu fishing tours - Surf & Sunkakadu fishing tours - Surf & Sun
kakadu fishing tours - Surf & Sun
Ashley Smith
 
About Middleton with Surf and Sun
About Middleton with Surf and SunAbout Middleton with Surf and Sun
About Middleton with Surf and Sun
Ashley Smith
 
Presentacion unidad 2
Presentacion unidad 2Presentacion unidad 2
Presentacion unidad 2
abelardo
 
Surf Lessons at Middleton SA 15th July 2013 with Surf & Sun
Surf Lessons at Middleton SA 15th July 2013  with Surf & Sun Surf Lessons at Middleton SA 15th July 2013  with Surf & Sun
Surf Lessons at Middleton SA 15th July 2013 with Surf & Sun
Ashley Smith
 
Great Accommodations in Victor Harbor while out for a Surfing Vacation
Great Accommodations in Victor Harbor while out for a Surfing VacationGreat Accommodations in Victor Harbor while out for a Surfing Vacation
Great Accommodations in Victor Harbor while out for a Surfing Vacation
Ashley Smith
 
LENGUA. Tema 4
LENGUA. Tema 4LENGUA. Tema 4
LENGUA. Tema 4
anacanoHBS
 
Surf Lessons Middleton SA 15th October 2013 with Surf & Sun
Surf Lessons Middleton SA 15th October 2013 with Surf & Sun Surf Lessons Middleton SA 15th October 2013 with Surf & Sun
Surf Lessons Middleton SA 15th October 2013 with Surf & Sun
Ashley Smith
 
Surf Lessons at Middleton SA 7th December 2013 with Surf & Sun
Surf Lessons at Middleton SA 7th December 2013 with Surf & Sun Surf Lessons at Middleton SA 7th December 2013 with Surf & Sun
Surf Lessons at Middleton SA 7th December 2013 with Surf & Sun
Ashley Smith
 
Potencial germinativo
Potencial germinativoPotencial germinativo
Potencial germinativo
BRYAN EDSON VILCATOMA HUANUQUEÑO
 
A Fantastic Day to Enjoy Customer Photos from Middleton - Surf & Sun
A Fantastic Day to Enjoy Customer Photos from  Middleton - Surf & SunA Fantastic Day to Enjoy Customer Photos from  Middleton - Surf & Sun
A Fantastic Day to Enjoy Customer Photos from Middleton - Surf & Sun
Ashley Smith
 
Surf Lessons at Middleton SA 22nd December with Surf & Sun
Surf Lessons at Middleton SA 22nd December   with Surf & Sun Surf Lessons at Middleton SA 22nd December   with Surf & Sun
Surf Lessons at Middleton SA 22nd December with Surf & Sun
Ashley Smith
 
Kenya Go To Market Report
Kenya Go To Market ReportKenya Go To Market Report
Kenya Go To Market ReportKerri Klidas
 
Carpeta virtual
Carpeta virtualCarpeta virtual
Carpeta virtual
Alex Jonathan Yagloa
 
Surf Lessons Middleton SA 5th October 2013 with Surf & Sun
Surf Lessons Middleton SA 5th October 2013 with Surf & Sun Surf Lessons Middleton SA 5th October 2013 with Surf & Sun
Surf Lessons Middleton SA 5th October 2013 with Surf & Sun
Ashley Smith
 
Surf & Sun : Surf Lesson Activity at Middleton SA 2014
Surf & Sun : Surf  Lesson Activity at Middleton SA 2014Surf & Sun : Surf  Lesson Activity at Middleton SA 2014
Surf & Sun : Surf Lesson Activity at Middleton SA 2014
Ashley Smith
 
Surf Lessons at Middleton SA 17th January 2014 with Surf & Sun
Surf Lessons at Middleton SA 17th January 2014 with Surf & SunSurf Lessons at Middleton SA 17th January 2014 with Surf & Sun
Surf Lessons at Middleton SA 17th January 2014 with Surf & Sun
Ashley Smith
 

Viewers also liked (17)

kakadu fishing tours - Surf & Sun
kakadu fishing tours - Surf & Sunkakadu fishing tours - Surf & Sun
kakadu fishing tours - Surf & Sun
 
About Middleton with Surf and Sun
About Middleton with Surf and SunAbout Middleton with Surf and Sun
About Middleton with Surf and Sun
 
Presentacion unidad 2
Presentacion unidad 2Presentacion unidad 2
Presentacion unidad 2
 
Surf Lessons at Middleton SA 15th July 2013 with Surf & Sun
Surf Lessons at Middleton SA 15th July 2013  with Surf & Sun Surf Lessons at Middleton SA 15th July 2013  with Surf & Sun
Surf Lessons at Middleton SA 15th July 2013 with Surf & Sun
 
Great Accommodations in Victor Harbor while out for a Surfing Vacation
Great Accommodations in Victor Harbor while out for a Surfing VacationGreat Accommodations in Victor Harbor while out for a Surfing Vacation
Great Accommodations in Victor Harbor while out for a Surfing Vacation
 
Stephanie Stevenson Resume
Stephanie Stevenson ResumeStephanie Stevenson Resume
Stephanie Stevenson Resume
 
LENGUA. Tema 4
LENGUA. Tema 4LENGUA. Tema 4
LENGUA. Tema 4
 
Surf Lessons Middleton SA 15th October 2013 with Surf & Sun
Surf Lessons Middleton SA 15th October 2013 with Surf & Sun Surf Lessons Middleton SA 15th October 2013 with Surf & Sun
Surf Lessons Middleton SA 15th October 2013 with Surf & Sun
 
Surf Lessons at Middleton SA 7th December 2013 with Surf & Sun
Surf Lessons at Middleton SA 7th December 2013 with Surf & Sun Surf Lessons at Middleton SA 7th December 2013 with Surf & Sun
Surf Lessons at Middleton SA 7th December 2013 with Surf & Sun
 
Potencial germinativo
Potencial germinativoPotencial germinativo
Potencial germinativo
 
A Fantastic Day to Enjoy Customer Photos from Middleton - Surf & Sun
A Fantastic Day to Enjoy Customer Photos from  Middleton - Surf & SunA Fantastic Day to Enjoy Customer Photos from  Middleton - Surf & Sun
A Fantastic Day to Enjoy Customer Photos from Middleton - Surf & Sun
 
Surf Lessons at Middleton SA 22nd December with Surf & Sun
Surf Lessons at Middleton SA 22nd December   with Surf & Sun Surf Lessons at Middleton SA 22nd December   with Surf & Sun
Surf Lessons at Middleton SA 22nd December with Surf & Sun
 
Kenya Go To Market Report
Kenya Go To Market ReportKenya Go To Market Report
Kenya Go To Market Report
 
Carpeta virtual
Carpeta virtualCarpeta virtual
Carpeta virtual
 
Surf Lessons Middleton SA 5th October 2013 with Surf & Sun
Surf Lessons Middleton SA 5th October 2013 with Surf & Sun Surf Lessons Middleton SA 5th October 2013 with Surf & Sun
Surf Lessons Middleton SA 5th October 2013 with Surf & Sun
 
Surf & Sun : Surf Lesson Activity at Middleton SA 2014
Surf & Sun : Surf  Lesson Activity at Middleton SA 2014Surf & Sun : Surf  Lesson Activity at Middleton SA 2014
Surf & Sun : Surf Lesson Activity at Middleton SA 2014
 
Surf Lessons at Middleton SA 17th January 2014 with Surf & Sun
Surf Lessons at Middleton SA 17th January 2014 with Surf & SunSurf Lessons at Middleton SA 17th January 2014 with Surf & Sun
Surf Lessons at Middleton SA 17th January 2014 with Surf & Sun
 

Similar to Introduction to Apache Apex - CoDS 2016

Apache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data 2016: Next Gen Big Data Analytics with Apache ApexApache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
Apache Apex
 
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and TransformIntro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
Apache Apex
 
Next Gen Big Data Analytics with Apache Apex
Next Gen Big Data Analytics with Apache Apex Next Gen Big Data Analytics with Apache Apex
Next Gen Big Data Analytics with Apache Apex
DataWorks Summit/Hadoop Summit
 
Actionable Insights with Apache Apex at Apache Big Data 2017 by Devendra Tagare
Actionable Insights with Apache Apex at Apache Big Data 2017 by Devendra TagareActionable Insights with Apache Apex at Apache Big Data 2017 by Devendra Tagare
Actionable Insights with Apache Apex at Apache Big Data 2017 by Devendra Tagare
Apache Apex
 
Architectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark StreamingArchitectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark Streaming
Apache Apex
 
Intro to Apache Apex - Next Gen Native Hadoop Platform - Hackac
Intro to Apache Apex - Next Gen Native Hadoop Platform - HackacIntro to Apache Apex - Next Gen Native Hadoop Platform - Hackac
Intro to Apache Apex - Next Gen Native Hadoop Platform - Hackac
Apache Apex
 
An adaptive and eventually self healing framework for geo-distributed real-ti...
An adaptive and eventually self healing framework for geo-distributed real-ti...An adaptive and eventually self healing framework for geo-distributed real-ti...
An adaptive and eventually self healing framework for geo-distributed real-ti...
Angad Singh
 
Introduction to Apache Apex by Thomas Weise
Introduction to Apache Apex by Thomas WeiseIntroduction to Apache Apex by Thomas Weise
Introduction to Apache Apex by Thomas Weise
Big Data Spain
 
GE IOT Predix Time Series & Data Ingestion Service using Apache Apex (Hadoop)
GE IOT Predix Time Series & Data Ingestion Service using Apache Apex (Hadoop)GE IOT Predix Time Series & Data Ingestion Service using Apache Apex (Hadoop)
GE IOT Predix Time Series & Data Ingestion Service using Apache Apex (Hadoop)
Apache Apex
 
Stateful streaming data pipelines
Stateful streaming data pipelinesStateful streaming data pipelines
Stateful streaming data pipelines
Timothy Farkas
 
Apache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and Applications Apache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and Applications
Comsysto Reply GmbH
 
Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex
Apache Apex
 
Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...
Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...
Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...
Dataconomy Media
 
Gatling - Bordeaux JUG
Gatling - Bordeaux JUGGatling - Bordeaux JUG
Gatling - Bordeaux JUGslandelle
 
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache ApexApache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Apex
 
Kafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ Uber
Kafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ UberKafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ Uber
Kafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ Uber
confluent
 
February 2016 HUG: Apache Apex (incubating): Stream Processing Architecture a...
February 2016 HUG: Apache Apex (incubating): Stream Processing Architecture a...February 2016 HUG: Apache Apex (incubating): Stream Processing Architecture a...
February 2016 HUG: Apache Apex (incubating): Stream Processing Architecture a...
Yahoo Developer Network
 
Apache Apex - Hadoop Users Group
Apache Apex - Hadoop Users GroupApache Apex - Hadoop Users Group
Apache Apex - Hadoop Users Group
Pramod Immaneni
 
Apache flink
Apache flinkApache flink
Apache flink
pranay kumar
 
Wayfair Storefront Performance Monitoring with InfluxEnterprise by Richard La...
Wayfair Storefront Performance Monitoring with InfluxEnterprise by Richard La...Wayfair Storefront Performance Monitoring with InfluxEnterprise by Richard La...
Wayfair Storefront Performance Monitoring with InfluxEnterprise by Richard La...
InfluxData
 

Similar to Introduction to Apache Apex - CoDS 2016 (20)

Apache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data 2016: Next Gen Big Data Analytics with Apache ApexApache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
 
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and TransformIntro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
 
Next Gen Big Data Analytics with Apache Apex
Next Gen Big Data Analytics with Apache Apex Next Gen Big Data Analytics with Apache Apex
Next Gen Big Data Analytics with Apache Apex
 
Actionable Insights with Apache Apex at Apache Big Data 2017 by Devendra Tagare
Actionable Insights with Apache Apex at Apache Big Data 2017 by Devendra TagareActionable Insights with Apache Apex at Apache Big Data 2017 by Devendra Tagare
Actionable Insights with Apache Apex at Apache Big Data 2017 by Devendra Tagare
 
Architectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark StreamingArchitectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark Streaming
 
Intro to Apache Apex - Next Gen Native Hadoop Platform - Hackac
Intro to Apache Apex - Next Gen Native Hadoop Platform - HackacIntro to Apache Apex - Next Gen Native Hadoop Platform - Hackac
Intro to Apache Apex - Next Gen Native Hadoop Platform - Hackac
 
An adaptive and eventually self healing framework for geo-distributed real-ti...
An adaptive and eventually self healing framework for geo-distributed real-ti...An adaptive and eventually self healing framework for geo-distributed real-ti...
An adaptive and eventually self healing framework for geo-distributed real-ti...
 
Introduction to Apache Apex by Thomas Weise
Introduction to Apache Apex by Thomas WeiseIntroduction to Apache Apex by Thomas Weise
Introduction to Apache Apex by Thomas Weise
 
GE IOT Predix Time Series & Data Ingestion Service using Apache Apex (Hadoop)
GE IOT Predix Time Series & Data Ingestion Service using Apache Apex (Hadoop)GE IOT Predix Time Series & Data Ingestion Service using Apache Apex (Hadoop)
GE IOT Predix Time Series & Data Ingestion Service using Apache Apex (Hadoop)
 
Stateful streaming data pipelines
Stateful streaming data pipelinesStateful streaming data pipelines
Stateful streaming data pipelines
 
Apache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and Applications Apache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and Applications
 
Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex
 
Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...
Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...
Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...
 
Gatling - Bordeaux JUG
Gatling - Bordeaux JUGGatling - Bordeaux JUG
Gatling - Bordeaux JUG
 
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache ApexApache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
 
Kafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ Uber
Kafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ UberKafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ Uber
Kafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ Uber
 
February 2016 HUG: Apache Apex (incubating): Stream Processing Architecture a...
February 2016 HUG: Apache Apex (incubating): Stream Processing Architecture a...February 2016 HUG: Apache Apex (incubating): Stream Processing Architecture a...
February 2016 HUG: Apache Apex (incubating): Stream Processing Architecture a...
 
Apache Apex - Hadoop Users Group
Apache Apex - Hadoop Users GroupApache Apex - Hadoop Users Group
Apache Apex - Hadoop Users Group
 
Apache flink
Apache flinkApache flink
Apache flink
 
Wayfair Storefront Performance Monitoring with InfluxEnterprise by Richard La...
Wayfair Storefront Performance Monitoring with InfluxEnterprise by Richard La...Wayfair Storefront Performance Monitoring with InfluxEnterprise by Richard La...
Wayfair Storefront Performance Monitoring with InfluxEnterprise by Richard La...
 

Introduction to Apache Apex - CoDS 2016

  • 1. Real-time Stream Processing using Apache Apex Bhupesh Chawda bhupesh@apache.org DataTorrent Software
  • 2. Apache Apex - Stream Processing ● YARN - Native - Uses Hadoop YARN framework for resource negotiation ● Highly Scalable - Scales statically as well as dynamically ● Highly Performant - Can reach single digit millisecond end-to-end latency ● Fault Tolerant - Automatically recovers from failures - without manual intervention ● Stateful - Guarantees that no state will be lost ● Easily Operable - Exposes an easy API for developing Operators (part of an application) and Applications
  • 3. Project History ● Project development started in 2012 at DataTorrent ● Open-sourced in July 2015 ● Apache Apex started incubation in August 2015 ● 50+ committers from Apple, GE, Capital One, DirecTV, Silver Spring Networks, Barclays, Ampool and DataTorrent ● Mentors from Class Software, MapR and Hortonworks ● Soon to be a top level Apache project
  • 5. An Apex Application is a DAG (Directed Acyclic Graph) ● A DAG is composed of vertices (Operators) and edges (Streams). ● A Stream is a sequence of data tuples which connects operators at end-points called Ports ● An Operator takes one or more input streams, performs computations & emits one or more output streams ● Each operator is USER’s business logic, or built-in operator from our open source library ● Operator may have multiple instances that run in parallel
  • 6. Hadoop 1.0 vs 2.0 - YARN
  • 7. Apex as a YARN Application ● YARN (Hadoop 2.0) replaces MapReduce with a more generic Resource Management Framework. ● Apex uses YARN for resource management and HDFS for persistent storage
  • 8. Support for Windowing ● Apex splits incoming tuples into finite time slices - Streaming Windows ○ Transparent to the user ○ Apex Default = 500 ms ● Checkpointing and book-keeping done at Streaming window boundary ● Applications may need to perform computations in windows - Application Windows ○ Specified as a multiple of Streaming Window size ○ Callbacks to user operator logic ■ beginWindow(long windowId) ■ endWindow() ○ Example - An application which computes some aggregates and emits them every minute. Here application window size = 60 secs = 120 Streaming Windows at the default 500 ms ● Sliding and Tumbling Application windows are supported natively
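The window arithmetic and the two callbacks on this slide can be sketched in a few lines. This is a plain-Python model, not Apex's Java API: `MinuteSum`, `run`, and `streaming_windows_per_app_window` are hypothetical names, and the only values taken from the slide are the 500 ms default and the `beginWindow` / `endWindow` callback pair.

```python
STREAMING_WINDOW_MS = 500  # default streaming window size stated on the slide


def streaming_windows_per_app_window(app_window_ms, streaming_window_ms=STREAMING_WINDOW_MS):
    """Return how many streaming windows make up one application window."""
    if app_window_ms % streaming_window_ms != 0:
        raise ValueError("application window must be a multiple of the streaming window")
    return app_window_ms // streaming_window_ms


class MinuteSum:
    """Hypothetical operator: sums tuples and emits the total when a window ends."""

    def __init__(self):
        self.total = 0
        self.emitted = []

    def begin_window(self, window_id):
        # Mirrors beginWindow(long windowId): reset per-window state.
        self.total = 0

    def process(self, value):
        self.total += value

    def end_window(self):
        # Mirrors endWindow(): emit the aggregate for the window that just closed.
        self.emitted.append(self.total)


def run(operator, windows):
    """Drive the operator through a list of application windows (each a list of tuples)."""
    for window_id, tuples in enumerate(windows):
        operator.begin_window(window_id)
        for value in tuples:
            operator.process(value)
        operator.end_window()
    return operator.emitted
```

With the default 500 ms streaming window, a 60-second application window works out to 120 streaming windows in this model.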
  • 9. Buffer Server ● Staging area for outgoing tuples ● Downstream operators connect to upstream Buffer Server to subscribe for tuples ● Plays a role in recovery by replaying data to the downstream operator from a particular checkpoint ● Spooling to disk is also supported
  • 10. Fault Tolerance - Checkpointing ● During checkpointing all operator state is written to HDFS asynchronously ● This is decentralized and happens independently for each operator ● If all operators in the DAG have checkpointed a particular window, then that window is said to be committed and all previous checkpoints are purged ● Example (figure): operators O1, O2, O3, O4 are at checkpoint # 3, 3, 3, 2, i.e. checkpoint window # 180, 180, 180, 120, so committed checkpoint # = 2 and committed window # = 120 (checkpoint window = 60 Streaming Windows)
  • 11. Recovery ● Apex Application Master detects the failure of an operator from missing heartbeats or from windows that are not progressing ● All operators downstream of the failed operator are restarted from the last committed checkpoint to recover their state. ● Data is replayed from the same checkpoint by the Buffer Server ● Recovery is automatic and does not require manual intervention.
  • 12. Scalability - Partitioning ● Operators can be “replicated” (partitioned) into multiple instances to cope with high speed input streams. ● Can be specified at Application launch time ● User can control the distribution of tuples to downstream partitions. ● Automatic Unifier to unify the tuples
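The committed-window rule on this slide amounts to taking the minimum over each operator's latest checkpointed window. The sketch below is a plain-Python model under that reading, using the figure's values (O1 to O3 at window 180, O4 at window 120); the function names and the dictionary layout are assumptions, not Apex's implementation.

```python
def committed_window(latest_checkpoint_by_operator):
    """A window is committed once every operator has checkpointed it,
    so the committed window is the minimum of the latest checkpoints."""
    return min(latest_checkpoint_by_operator.values())


def purge(checkpoints_by_operator, committed):
    """Drop checkpoints older than the committed window; keep the rest."""
    return {
        op: [window for window in windows if window >= committed]
        for op, windows in checkpoints_by_operator.items()
    }


# Values from the slide's figure: checkpoint window = 60 streaming windows.
latest = {"O1": 180, "O2": 180, "O3": 180, "O4": 120}
history = {
    "O1": [60, 120, 180],
    "O2": [60, 120, 180],
    "O3": [60, 120, 180],
    "O4": [60, 120],
}
committed = committed_window(latest)
remaining = purge(history, committed)
```

Here `committed` is 120, the checkpoint at window 60 is purged for every operator, and O1 to O3 keep window 180 because O4 has not reached it yet.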
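A minimal sketch of the partition-then-unify pattern described on this slide, written as plain Python rather than against Apex's API. It assumes key-based routing with a CRC32 hash and a word-count workload; `route`, `partitioned_count`, and `unify` are hypothetical names chosen for illustration.

```python
import zlib
from collections import Counter


def route(key, num_partitions):
    """Pick a partition for a key with a deterministic hash (an assumed policy)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partitioned_count(words, num_partitions):
    """Each partition counts only the words routed to it."""
    partitions = [Counter() for _ in range(num_partitions)]
    for word in words:
        partitions[route(word, num_partitions)][word] += 1
    return partitions


def unify(partials):
    """Unifier: merge the partial counts from all partitions into one result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


words = ["a", "b", "a", "c", "b", "a"]
partials = partitioned_count(words, 3)
merged = unify(partials)
```

Because routing is by key, each word is counted in exactly one partition, and the unified result matches what a single unpartitioned instance would have produced.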
  • 13. Scalability - Dynamic scaling ● Auto scaling is also supported. Number of partitions may automatically increase or decrease based on the incoming load. Can be customized by the user ● User has to define the trigger for auto scaling: ○ Example - Increase partitions if latency goes above 100 ms.
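A user-defined scaling trigger like the one in the example could look roughly as follows. Only the 100 ms upper threshold comes from the slide; the lower threshold, the partition bounds, and the function name are assumptions for illustration, and this is not Apex's partitioner interface.

```python
def next_partition_count(current, latency_ms, high_ms=100, low_ms=20,
                         min_partitions=1, max_partitions=8):
    """Return the partition count to use next, given the observed latency.

    Scale up by one when latency exceeds high_ms, scale down by one when it
    falls below low_ms, and otherwise leave the count unchanged.
    """
    if latency_ms > high_ms:
        return min(current + 1, max_partitions)
    if latency_ms < low_ms:
        return max(current - 1, min_partitions)
    return current
```

Keeping a gap between the two thresholds avoids flapping between partition counts when latency hovers near a single cut-off.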
  • 14. Apex Processing Semantics ● AT_LEAST_ONCE (default): Windows are processed at least once ● AT_MOST_ONCE: Windows are processed at most once ○ During recovery, all downstream operators are fast-forwarded to the window of latest checkpoint ● EXACTLY_ONCE: Windows are processed exactly once ○ Checkpoint every window ○ Checkpointing becomes blocking
  • 15. Apex Guarantees ● Apex guarantees No loss of data and computational state - Checkpointed periodically ● Automatic recovery ensures that processing resumes from where it left off ● Order of incoming data is guaranteed to be maintained ○ Not applicable in case of partitioning of operators ● Events in a window are always replayed in the same window in case of failures
  • 16. Application Specification 1. Add Operators 2. Add Streams
  • 19. Decision Making in < 2ms 1. Performance requirements a. A system that provides very low latency for decision making (40 ms) b. Ability to handle large volumes of data and ever-changing rules (1,000 events per 20 ms burst) c. 99.5% uptime, which is about 1.8 days of downtime in a year ➔ Apex achieved: ◆ 2 ms latency against the requirement of 40 ms ◆ Handled bursts of 2,000 events against the requirement of 1,000, at a net rate of 70,000 events/s ◆ 99.9995% uptime against the requirement of 99.5% 2. Relevant Roadmap 3. Enterprise grade 4. A healthy and diverse community and committers, i.e. not controlled by one vendor Talk Slides: http://www.slideshare.net/ilganeli/nextgen-decision-making-in-under-2ms DataTorrent Blog: https://www.datatorrent.com/blog/next-gen-decision-making-in-under-2-milliseconds/
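The uptime figures on this slide can be converted to yearly downtime with a one-line calculation. The sketch assumes a 365-day year; the function name is hypothetical.

```python
def downtime_days_per_year(uptime_percent, days_per_year=365):
    """Convert an uptime percentage into days of downtime per year."""
    return (100.0 - uptime_percent) / 100.0 * days_per_year


required = downtime_days_per_year(99.5)       # about 1.8 days
achieved = downtime_days_per_year(99.9995)    # a small fraction of a day
achieved_minutes = achieved * 24 * 60         # about 2.6 minutes
```

Under that assumption, the 99.5% requirement allows roughly 1.8 days of downtime a year, while 99.9995% corresponds to under three minutes.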
  • 20. Decision making in < 2ms contd.. ● Comparison finally boiled down to ○ Apache Storm ○ Apache Flink ○ Apache Apex ● Some problems in Storm and Flink among others ○ Nimbus is a single point of failure ○ Bolts / Spouts / Operators share a JVM. Hard to debug ○ No dynamic topologies ○ Restarting entire topologies in case of failures
  • 21. Resources ● Mailing List ○ Developers dev@apex.incubator.apache.org ○ Users users@apex.incubator.apache.org ● Apache Apex http://apex.apache.org/ ● Github ○ Apex Core: http://github.com/apache/incubator-apex-core ○ Apex Malhar: http://github.com/apache/incubator-apex-malhar ● DataTorrent: http://www.datatorrent.com ● Twitter @ApacheApex Follow - https://twitter.com/apacheapex ● Facebook https://www.facebook.com/ApacheApex/ ● Meetup http://www.meetup.com/topics/apache-apex ● Startup Program Free Enterprise License for Startups, Universities, Non-Profits
  • 22. Thank you! Please send your questions to bhupesh@apache.org