Apache Apex as a YARN Application
Chinmay Kolhatkar (chinmay@apache.org)
Mar 22, 2016
Apache Apex Meetup
Agenda
• Directed Acyclic Graph
• Apex as a YARN Application
• Application Components of Apex
• Lifecycle of Apex as a YARN Application
Directed Acyclic Graph (DAG)
• Defines the compute stages of a streaming application
• Defines how tuples flow between Operators via Streams
[Figure: DAG of four compute stages, Compute 1 to Compute 4]
DAG Components
• Tuple
● Atomic data that flows over a stream
• Operator
● Basic compute unit per tuple
• Stream
● Connector abstraction between operators
● Tuples flow over this
[Figure: Operator 1 and Operator 2 connected by a Stream carrying tuple 1, tuple 2 and tuple 3]
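The three components on this slide can be sketched as follows. This is a minimal illustration in Python, not the Apex Java API: the names `Stream`, `Operator`, `process`, `drain` and `run_pipeline` are hypothetical and exist only for this example.

```python
from collections import deque
from typing import Any, Callable, List, Optional


class Stream:
    """Hypothetical connector: a FIFO queue of tuples between two operators."""

    def __init__(self) -> None:
        self._queue: deque = deque()

    def put(self, tup: Any) -> None:
        self._queue.append(tup)

    def drain(self) -> List[Any]:
        """Remove and return all queued tuples in arrival order."""
        items = list(self._queue)
        self._queue.clear()
        return items


class Operator:
    """Hypothetical compute unit: applies `fn` to one tuple at a time."""

    def __init__(self, fn: Callable[[Any], Any], output: Optional[Stream] = None) -> None:
        self.fn = fn
        self.output = output

    def process(self, tup: Any) -> Any:
        result = self.fn(tup)
        if self.output is not None:
            self.output.put(result)  # emit the result downstream
        return result


def run_pipeline(inputs: List[Any]) -> List[Any]:
    """Operator 1 doubles each tuple; Operator 2 adds one to whatever arrives."""
    stream = Stream()
    op1 = Operator(lambda x: x * 2, output=stream)
    op2 = Operator(lambda x: x + 1)
    for tup in inputs:
        op1.process(tup)
    return [op2.process(tup) for tup in stream.drain()]
```

Calling `run_pipeline([1, 2, 3])` pushes three tuples through Operator 1, over the stream, and into Operator 2.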
DAG Types
[Figure: Logical DAG with operators O1, O2, O3, O4 and O5]
• Logical Plan
● Logical representation of computation
● Defines operators, streams and dataflow
• Physical Plan
● Deployable plan on cluster
● Contains partition information of operators
● Has ready-to-deploy serialized operator instances
[Figure: Physical DAG in which O1 and O2 each run as partitions P1, P2 and P3, followed by a node labeled U and operators O3, O4 and O5]
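The step from logical to physical plan can be illustrated with a small sketch. It only shows how partition counts multiply operator instances; the function name `to_physical` and the `partitions` mapping are hypothetical, and the node labeled U in the figure is deliberately left out because the slides do not describe how it is placed.

```python
from typing import Dict, List, Tuple


def to_physical(operators: List[str], partitions: Dict[str, int]) -> List[Tuple[str, str]]:
    """Expand each logical operator into (operator, partition) instances.

    Operators without an entry in `partitions` get a single instance, P1.
    """
    physical: List[Tuple[str, str]] = []
    for op in operators:
        count = partitions.get(op, 1)
        for i in range(1, count + 1):
            physical.append((op, f"P{i}"))
    return physical


# Mirrors the figure: O1 and O2 have three partitions each, the rest have one.
plan = to_physical(["O1", "O2", "O3", "O4", "O5"], {"O1": 3, "O2": 3})
```

Here five logical operators become nine deployable instances, which is the kind of information the physical plan carries in addition to the logical one.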
Apex as YARN application
[Figure: a generic YARN application (ResourceManager with AsM + Scheduler, a NodeManager (NM) on each node, YarnClient, AppMaster, YarnContainers) next to its Apex counterpart (DTCLI/StrAMClient as YarnClient, StrAM as AppMaster, StrAMChild in each YarnContainer hosting operators O1, O2 and O3). In both, the components communicate over ClientRMProtocol, AMRMProtocol and ContainerManager Protocol.]
Application Components of Apex - StrAMClient
• Part of dtcli client interface
• Invoked by “launch” command of dtcli
• Tasks:
● Copy the required application package files into HDFS
● Validate the Logical Plan
● Serialize the Logical Plan to HDFS
● Launch the Application Master, i.e. StrAM
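The slides do not say what validating the logical plan involves. One check that fits the DAG model is confirming that the streams contain no cycle; the sketch below does that with Kahn's algorithm. The function name `is_acyclic` and the representation of streams as (source, sink) pairs are assumptions made for this example, not the StrAMClient implementation.

```python
from collections import defaultdict, deque
from typing import Iterable, Tuple


def is_acyclic(streams: Iterable[Tuple[str, str]]) -> bool:
    """Return True if the (source, sink) stream pairs form a DAG."""
    successors = defaultdict(list)
    in_degree = defaultdict(int)
    nodes = set()
    for src, dst in streams:
        successors[src].append(dst)
        in_degree[dst] += 1
        nodes.update((src, dst))

    # Start from operators with no incoming stream and peel the graph layer by layer.
    ready = deque(n for n in nodes if in_degree[n] == 0)
    visited = 0
    while ready:
        node = ready.popleft()
        visited += 1
        for nxt in successors[node]:
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                ready.append(nxt)

    # Any operator left unvisited sits on a cycle.
    return visited == len(nodes)
```

A plan that fails a check like this could be rejected on the client before anything is written to HDFS or submitted to YARN.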
Application Components of Apex - StrAM
• Streaming Application Master
• Started by StrAMClient on a YarnContainer
• Tasks:
● Convert logical plan to physical plan
● Serialize operators to HDFS
● Request resources from the ResourceManager
● Start StrAMChild in YarnContainer(s)
● Monitor StrAMChild using ContainerManager protocol
● Generate Application statistics
● Host results on WebService (dtManage)
● Fault Tolerance
● Checkpointing/Committing Application States
● Support Security
● Shutdown Application
Application Components of Apex - StrAMChild
• Deployed on YarnContainer
• Started by NodeManager as instructed by StrAM
• Instance of StreamingContainer
• Contains Operators (compute-related)
• Contains BufferServer (stream-related)
• Tasks:
● Send heartbeats to StrAM regularly
● Execute commands from StrAM
● Shut down or kill itself if instructed
● Manage lifecycle of an Operator
● Network communication using BufferServer
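The heartbeat relationship between StrAMChild and StrAM can be pictured with a small bookkeeping sketch. It assumes StrAM remembers when each container last reported and treats a container as stale after a timeout; the class `HeartbeatMonitor`, its methods and the 30-second value are hypothetical, and the slides do not state what StrAM does when heartbeats stop.

```python
from typing import Dict, List


class HeartbeatMonitor:
    """Hypothetical master-side record of the last heartbeat per container."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self.last_seen: Dict[str, float] = {}

    def heartbeat(self, container_id: str, now: float) -> None:
        """Record that `container_id` reported in at time `now` (seconds)."""
        self.last_seen[container_id] = now

    def stale(self, now: float) -> List[str]:
        """Return containers whose last heartbeat is older than the timeout."""
        return sorted(
            cid for cid, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


monitor = HeartbeatMonitor(timeout_s=30.0)  # timeout chosen only for illustration
monitor.heartbeat("container-1", now=0.0)
monitor.heartbeat("container-2", now=20.0)
overdue = monitor.stale(now=40.0)
```

Timestamps are passed in explicitly so the example stays deterministic; a real monitor would read a clock.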
Lifecycle of Apex/YARN Application - Start
[Figure: ResourceManager (AsM + Scheduler), a NodeManager (NM) on each node, DTCLI/StrAMClient (YarnClient), HDFS, StrAM (AppMaster), and three YarnContainers running StrAMChild with operators O1 and O2, O3, and O4; links labeled ClientRMProtocol, AMRMProtocol and ContainerManager Protocol]
1) Access cluster information
2) Copies files from HDFS
3) Submit Application to RM
4) StrAM registers with RM
5) StrAM sends heartbeats regularly
6) StrAM requests containers with specifications
7) StrAMChild reads serialized operator from HDFS
8) StrAMChild starts operator lifecycle
Lifecycle of Apex/YARN Application - Running
[Figure: ResourceManager (AsM + Scheduler), a NodeManager (NM) on each node, DTCLI/StrAMClient (YarnClient), HDFS, StrAM (AppMaster), and three YarnContainers running StrAMChild with operators O1 and O2, O3, and O4; links labeled ClientRMProtocol, AMRMProtocol and ContainerManager Protocol]
1) StrAMChild sends heartbeats
2) StrAMChild sends operator data
3) StrAM sends regular heartbeats to RM
4) Query status of application
Lifecycle of Apex/YARN Application - Shutdown
[Figure: ResourceManager (AsM + Scheduler), a NodeManager (NM) on each node, DTCLI/StrAMClient (YarnClient), HDFS, StrAM (AppMaster), and three YarnContainers running StrAMChild with operators O1 and O2, O3, and O4; links labeled ClientRMProtocol, AMRMProtocol and ContainerManager Protocol]
1) Connect on WebService REST API
2) Send command to StrAM
3) Send shutdown signal to StrAMChild
4) StrAMChild finishes operator lifecycle
5) Check if all containers are freed
6) StrAM unregisters itself
7) StrAM exits
8) Check if application has shut down
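Steps 3 to 7 of the shutdown sequence can be traced with a toy simulation. It is a sketch of the ordering only, assuming every child finishes as soon as it is signaled; the function `graceful_shutdown`, the state strings and the log messages are hypothetical and do not correspond to Apex or YARN calls.

```python
from typing import Dict, List


def graceful_shutdown(containers: Dict[str, str]) -> List[str]:
    """Walk steps 3 to 7 for `containers` (id -> state) and return an event log."""
    log: List[str] = []
    for cid in sorted(containers):
        log.append(f"signal {cid}")      # step 3: shutdown signal to the child
        containers[cid] = "FINISHED"     # step 4: child ends its operator lifecycle

    # Step 5: the master only proceeds once every container has been freed.
    if all(state == "FINISHED" for state in containers.values()):
        log.append("unregister")         # step 6
        log.append("exit")               # step 7
    return log


events = graceful_shutdown({"c2": "RUNNING", "c1": "RUNNING"})
```

The point of the ordering is that the master is the last component to leave, unlike the kill path where the ResourceManager removes all containers directly.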
Lifecycle of Apex/YARN Application - Kill
[Figure: ResourceManager (AsM + Scheduler), a NodeManager (NM) on each node, DTCLI/StrAMClient (YarnClient), HDFS, StrAM (AppMaster), and three YarnContainers running StrAMChild with operators O1 and O2, O3, and O4; links labeled ClientRMProtocol, AMRMProtocol and ContainerManager Protocol]
1) Send kill-app command to YARN
2) RM kills all containers
Summary – Apex platform
• Enables YARN to be used for Streaming Applications
• Takes care of YARN-specific work
• Users can focus on the business logic defined in Operators
Resources
• Apache Apex website - http://apex.incubator.apache.org/
• Subscribe - http://apex.incubator.apache.org/community.html
• Download - http://apex.incubator.apache.org/downloads.html
• Twitter - @ApacheApex; Follow - https://twitter.com/apacheapex
• Facebook - https://www.facebook.com/ApacheApex/
• Meetup - http://www.meetup.com/topics/apache-apex
• Startup Program – Free Enterprise License for startups, Universities, Non-Profits
Upcoming events...
• March 24th 9am PST - Fault Tolerance and Processing Semantics with Apache Apex
• March 28th 6pm PST - Low-latency ingestion and analytics with Apache Kafka and Apache Apex (Hadoop)
• ...

More Related Content

What's hot

DataTorrent Presentation @ Big Data Application Meetup
DataTorrent Presentation @ Big Data Application MeetupDataTorrent Presentation @ Big Data Application Meetup
DataTorrent Presentation @ Big Data Application Meetup
Thomas Weise
 
Architectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark StreamingArchitectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark Streaming
Apache Apex
 
Introduction to Apache Apex
Introduction to Apache ApexIntroduction to Apache Apex
Introduction to Apache Apex
Apache Apex
 
Actionable Insights with Apache Apex at Apache Big Data 2017 by Devendra Tagare
Actionable Insights with Apache Apex at Apache Big Data 2017 by Devendra TagareActionable Insights with Apache Apex at Apache Big Data 2017 by Devendra Tagare
Actionable Insights with Apache Apex at Apache Big Data 2017 by Devendra Tagare
Apache Apex
 
Smart Partitioning with Apache Apex (Webinar)
Smart Partitioning with Apache Apex (Webinar)Smart Partitioning with Apache Apex (Webinar)
Smart Partitioning with Apache Apex (Webinar)
Apache Apex
 
Intro to Apache Apex - Next Gen Native Hadoop Platform - Hackac
Intro to Apache Apex - Next Gen Native Hadoop Platform - HackacIntro to Apache Apex - Next Gen Native Hadoop Platform - Hackac
Intro to Apache Apex - Next Gen Native Hadoop Platform - Hackac
Apache Apex
 
Java High Level Stream API
Java High Level Stream APIJava High Level Stream API
Java High Level Stream API
Apache Apex
 
Developing streaming applications with apache apex (strata + hadoop world)
Developing streaming applications with apache apex (strata + hadoop world)Developing streaming applications with apache apex (strata + hadoop world)
Developing streaming applications with apache apex (strata + hadoop world)
Apache Apex
 
Introduction to Apache Apex
Introduction to Apache ApexIntroduction to Apache Apex
Introduction to Apache Apex
Apache Apex
 
Intro to Apache Apex @ Women in Big Data
Intro to Apache Apex @ Women in Big DataIntro to Apache Apex @ Women in Big Data
Intro to Apache Apex @ Women in Big Data
Apache Apex
 
Intro to YARN (Hadoop 2.0) & Apex as YARN App (Next Gen Big Data)
Intro to YARN (Hadoop 2.0) & Apex as YARN App (Next Gen Big Data)Intro to YARN (Hadoop 2.0) & Apex as YARN App (Next Gen Big Data)
Intro to YARN (Hadoop 2.0) & Apex as YARN App (Next Gen Big Data)
Apache Apex
 
Building your first aplication using Apache Apex
Building your first aplication using Apache ApexBuilding your first aplication using Apache Apex
Building your first aplication using Apache Apex
Yogi Devendra Vyavahare
 
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache ApexApache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Apex
 
Ingestion file copy using apex
Ingestion   file copy using apexIngestion   file copy using apex
Ingestion file copy using apex
Apache Apex
 
Extending The Yahoo Streaming Benchmark to Apache Apex
Extending The Yahoo Streaming Benchmark to Apache ApexExtending The Yahoo Streaming Benchmark to Apache Apex
Extending The Yahoo Streaming Benchmark to Apache Apex
Apache Apex
 
Apache Apex Meetup at Cask
Apache Apex Meetup at CaskApache Apex Meetup at Cask
Apache Apex Meetup at Cask
Apache Apex
 
Fault Tolerance and Processing Semantics in Apache Apex
Fault Tolerance and Processing Semantics in Apache ApexFault Tolerance and Processing Semantics in Apache Apex
Fault Tolerance and Processing Semantics in Apache Apex
Apache Apex Organizer
 
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark StreamingIntro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Apache Apex
 
Introduction to Apache Apex and writing a big data streaming application
Introduction to Apache Apex and writing a big data streaming application  Introduction to Apache Apex and writing a big data streaming application
Introduction to Apache Apex and writing a big data streaming application
Apache Apex
 
Ingesting Data from Kafka to JDBC with Transformation and Enrichment
Ingesting Data from Kafka to JDBC with Transformation and EnrichmentIngesting Data from Kafka to JDBC with Transformation and Enrichment
Ingesting Data from Kafka to JDBC with Transformation and Enrichment
Apache Apex
 

What's hot (20)

DataTorrent Presentation @ Big Data Application Meetup
DataTorrent Presentation @ Big Data Application MeetupDataTorrent Presentation @ Big Data Application Meetup
DataTorrent Presentation @ Big Data Application Meetup
 
Architectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark StreamingArchitectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark Streaming
 
Introduction to Apache Apex
Introduction to Apache ApexIntroduction to Apache Apex
Introduction to Apache Apex
 
Actionable Insights with Apache Apex at Apache Big Data 2017 by Devendra Tagare
Actionable Insights with Apache Apex at Apache Big Data 2017 by Devendra TagareActionable Insights with Apache Apex at Apache Big Data 2017 by Devendra Tagare
Actionable Insights with Apache Apex at Apache Big Data 2017 by Devendra Tagare
 
Smart Partitioning with Apache Apex (Webinar)
Smart Partitioning with Apache Apex (Webinar)Smart Partitioning with Apache Apex (Webinar)
Smart Partitioning with Apache Apex (Webinar)
 
Intro to Apache Apex - Next Gen Native Hadoop Platform - Hackac
Intro to Apache Apex - Next Gen Native Hadoop Platform - HackacIntro to Apache Apex - Next Gen Native Hadoop Platform - Hackac
Intro to Apache Apex - Next Gen Native Hadoop Platform - Hackac
 
Java High Level Stream API
Java High Level Stream APIJava High Level Stream API
Java High Level Stream API
 
Developing streaming applications with apache apex (strata + hadoop world)
Developing streaming applications with apache apex (strata + hadoop world)Developing streaming applications with apache apex (strata + hadoop world)
Developing streaming applications with apache apex (strata + hadoop world)
 
Introduction to Apache Apex
Introduction to Apache ApexIntroduction to Apache Apex
Introduction to Apache Apex
 
Intro to Apache Apex @ Women in Big Data
Intro to Apache Apex @ Women in Big DataIntro to Apache Apex @ Women in Big Data
Intro to Apache Apex @ Women in Big Data
 
Intro to YARN (Hadoop 2.0) & Apex as YARN App (Next Gen Big Data)
Intro to YARN (Hadoop 2.0) & Apex as YARN App (Next Gen Big Data)Intro to YARN (Hadoop 2.0) & Apex as YARN App (Next Gen Big Data)
Intro to YARN (Hadoop 2.0) & Apex as YARN App (Next Gen Big Data)
 
Building your first aplication using Apache Apex
Building your first aplication using Apache ApexBuilding your first aplication using Apache Apex
Building your first aplication using Apache Apex
 
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache ApexApache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
 
Ingestion file copy using apex
Ingestion   file copy using apexIngestion   file copy using apex
Ingestion file copy using apex
 
Extending The Yahoo Streaming Benchmark to Apache Apex
Extending The Yahoo Streaming Benchmark to Apache ApexExtending The Yahoo Streaming Benchmark to Apache Apex
Extending The Yahoo Streaming Benchmark to Apache Apex
 
Apache Apex Meetup at Cask
Apache Apex Meetup at CaskApache Apex Meetup at Cask
Apache Apex Meetup at Cask
 
Fault Tolerance and Processing Semantics in Apache Apex
Fault Tolerance and Processing Semantics in Apache ApexFault Tolerance and Processing Semantics in Apache Apex
Fault Tolerance and Processing Semantics in Apache Apex
 
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark StreamingIntro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
 
Introduction to Apache Apex and writing a big data streaming application
Introduction to Apache Apex and writing a big data streaming application  Introduction to Apache Apex and writing a big data streaming application
Introduction to Apache Apex and writing a big data streaming application
 
Ingesting Data from Kafka to JDBC with Transformation and Enrichment
Ingesting Data from Kafka to JDBC with Transformation and EnrichmentIngesting Data from Kafka to JDBC with Transformation and Enrichment
Ingesting Data from Kafka to JDBC with Transformation and Enrichment
 

Similar to Apache Apex as a YARN Apllication

Apache Apex as YARN Application
Apache Apex as YARN ApplicationApache Apex as YARN Application
Apache Apex as YARN Application
Chinmay Kolhatkar
 
Spark on yarn
Spark on yarnSpark on yarn
Spark on yarn
datamantra
 
Ingestion and Dimensions Compute and Enrich using Apache Apex
Ingestion and Dimensions Compute and Enrich using Apache ApexIngestion and Dimensions Compute and Enrich using Apache Apex
Ingestion and Dimensions Compute and Enrich using Apache Apex
Apache Apex
 
Introduction to Apache Apex
Introduction to Apache ApexIntroduction to Apache Apex
Introduction to Apache Apex
Chinmay Kolhatkar
 
BigDataSpain 2016: Stream Processing Applications with Apache Apex
BigDataSpain 2016: Stream Processing Applications with Apache ApexBigDataSpain 2016: Stream Processing Applications with Apache Apex
BigDataSpain 2016: Stream Processing Applications with Apache Apex
Thomas Weise
 
Apache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and Applications Apache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and Applications
Comsysto Reply GmbH
 
Flink Streaming @BudapestData
Flink Streaming @BudapestDataFlink Streaming @BudapestData
Flink Streaming @BudapestData
Gyula Fóra
 
SF-TAP: Scalable and Flexible Traffic Analysis Platform (USENIX LISA 2015)
SF-TAP: Scalable and Flexible Traffic Analysis Platform (USENIX LISA 2015)SF-TAP: Scalable and Flexible Traffic Analysis Platform (USENIX LISA 2015)
SF-TAP: Scalable and Flexible Traffic Analysis Platform (USENIX LISA 2015)
Yuuki Takano
 
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
DataWorks Summit/Hadoop Summit
 
Stream Processing use cases and applications with Apache Apex by Thomas Weise
Stream Processing use cases and applications with Apache Apex by Thomas WeiseStream Processing use cases and applications with Apache Apex by Thomas Weise
Stream Processing use cases and applications with Apache Apex by Thomas Weise
Big Data Spain
 
Apache Apex - BufferServer
Apache Apex - BufferServerApache Apex - BufferServer
Apache Apex - BufferServer
Pradeep Dalvi
 
YOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at NetflixYOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at Netflix
Brendan Gregg
 
Practical 7 - Using Wireshark Tutorial and Hands-on
Practical 7 - Using Wireshark Tutorial and Hands-onPractical 7 - Using Wireshark Tutorial and Hands-on
Practical 7 - Using Wireshark Tutorial and Hands-on
QaisSaifQassim
 
Apache Arrow Flight Overview
Apache Arrow Flight OverviewApache Arrow Flight Overview
Apache Arrow Flight Overview
Jacques Nadeau
 
Porting a Streaming Pipeline from Scala to Rust
Porting a Streaming Pipeline from Scala to RustPorting a Streaming Pipeline from Scala to Rust
Porting a Streaming Pipeline from Scala to Rust
Evan Chan
 
Acl Tcam
Acl TcamAcl Tcam
Acl Tcam
amit_monty
 
BigDataSpain 2016: Introduction to Apache Apex
BigDataSpain 2016: Introduction to Apache ApexBigDataSpain 2016: Introduction to Apache Apex
BigDataSpain 2016: Introduction to Apache Apex
Thomas Weise
 
ApacheCon North America 2014 - Apache Hadoop YARN: The Next-generation Distri...
ApacheCon North America 2014 - Apache Hadoop YARN: The Next-generation Distri...ApacheCon North America 2014 - Apache Hadoop YARN: The Next-generation Distri...
ApacheCon North America 2014 - Apache Hadoop YARN: The Next-generation Distri...
Zhijie Shen
 
BKK16-106 ODP Project Update
BKK16-106 ODP Project UpdateBKK16-106 ODP Project Update
BKK16-106 ODP Project Update
Linaro
 
Apache Storm
Apache StormApache Storm
Apache Storm
Rajind Ruparathna
 

Similar to Apache Apex as a YARN Apllication (20)

Apache Apex as YARN Application
Apache Apex as YARN ApplicationApache Apex as YARN Application
Apache Apex as YARN Application
 
Spark on yarn
Spark on yarnSpark on yarn
Spark on yarn
 
Ingestion and Dimensions Compute and Enrich using Apache Apex
Ingestion and Dimensions Compute and Enrich using Apache ApexIngestion and Dimensions Compute and Enrich using Apache Apex
Ingestion and Dimensions Compute and Enrich using Apache Apex
 
Introduction to Apache Apex
Introduction to Apache ApexIntroduction to Apache Apex
Introduction to Apache Apex
 
BigDataSpain 2016: Stream Processing Applications with Apache Apex
BigDataSpain 2016: Stream Processing Applications with Apache ApexBigDataSpain 2016: Stream Processing Applications with Apache Apex
BigDataSpain 2016: Stream Processing Applications with Apache Apex
 
Apache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and Applications Apache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and Applications
 
Flink Streaming @BudapestData
Flink Streaming @BudapestDataFlink Streaming @BudapestData
Flink Streaming @BudapestData
 
SF-TAP: Scalable and Flexible Traffic Analysis Platform (USENIX LISA 2015)
SF-TAP: Scalable and Flexible Traffic Analysis Platform (USENIX LISA 2015)SF-TAP: Scalable and Flexible Traffic Analysis Platform (USENIX LISA 2015)
SF-TAP: Scalable and Flexible Traffic Analysis Platform (USENIX LISA 2015)
 
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
 
Stream Processing use cases and applications with Apache Apex by Thomas Weise
Stream Processing use cases and applications with Apache Apex by Thomas WeiseStream Processing use cases and applications with Apache Apex by Thomas Weise
Stream Processing use cases and applications with Apache Apex by Thomas Weise
 
Apache Apex - BufferServer
Apache Apex - BufferServerApache Apex - BufferServer
Apache Apex - BufferServer
 
YOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at NetflixYOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at Netflix
 
Practical 7 - Using Wireshark Tutorial and Hands-on
Practical 7 - Using Wireshark Tutorial and Hands-onPractical 7 - Using Wireshark Tutorial and Hands-on
Practical 7 - Using Wireshark Tutorial and Hands-on
 
Apache Arrow Flight Overview
Apache Arrow Flight OverviewApache Arrow Flight Overview
Apache Arrow Flight Overview
 
Porting a Streaming Pipeline from Scala to Rust
Porting a Streaming Pipeline from Scala to RustPorting a Streaming Pipeline from Scala to Rust
Porting a Streaming Pipeline from Scala to Rust
 
Acl Tcam
Acl TcamAcl Tcam
Acl Tcam
 
BigDataSpain 2016: Introduction to Apache Apex
BigDataSpain 2016: Introduction to Apache ApexBigDataSpain 2016: Introduction to Apache Apex
BigDataSpain 2016: Introduction to Apache Apex
 
ApacheCon North America 2014 - Apache Hadoop YARN: The Next-generation Distri...
ApacheCon North America 2014 - Apache Hadoop YARN: The Next-generation Distri...ApacheCon North America 2014 - Apache Hadoop YARN: The Next-generation Distri...
ApacheCon North America 2014 - Apache Hadoop YARN: The Next-generation Distri...
 
BKK16-106 ODP Project Update
BKK16-106 ODP Project UpdateBKK16-106 ODP Project Update
BKK16-106 ODP Project Update
 
Apache Storm
Apache StormApache Storm
Apache Storm
 

More from Apache Apex

Low Latency Polyglot Model Scoring using Apache Apex
Low Latency Polyglot Model Scoring using Apache ApexLow Latency Polyglot Model Scoring using Apache Apex
Low Latency Polyglot Model Scoring using Apache Apex
Apache Apex
 
From Batch to Streaming with Apache Apex Dataworks Summit 2017
From Batch to Streaming with Apache Apex Dataworks Summit 2017From Batch to Streaming with Apache Apex Dataworks Summit 2017
From Batch to Streaming with Apache Apex Dataworks Summit 2017
Apache Apex
 
Apache Big Data EU 2016: Building Streaming Applications with Apache Apex
Apache Big Data EU 2016: Building Streaming Applications with Apache ApexApache Big Data EU 2016: Building Streaming Applications with Apache Apex
Apache Big Data EU 2016: Building Streaming Applications with Apache Apex
Apache Apex
 
Hadoop Interacting with HDFS
Hadoop Interacting with HDFSHadoop Interacting with HDFS
Hadoop Interacting with HDFS
Apache Apex
 
Introduction to Real-Time Data Processing
Introduction to Real-Time Data ProcessingIntroduction to Real-Time Data Processing
Introduction to Real-Time Data Processing
Apache Apex
 
Introduction to Yarn
Introduction to YarnIntroduction to Yarn
Introduction to Yarn
Apache Apex
 
Introduction to Map Reduce
Introduction to Map ReduceIntroduction to Map Reduce
Introduction to Map Reduce
Apache Apex
 
HDFS Internals
HDFS InternalsHDFS Internals
HDFS Internals
Apache Apex
 
Intro to Big Data Hadoop
Intro to Big Data HadoopIntro to Big Data Hadoop
Intro to Big Data Hadoop
Apache Apex
 
Kafka to Hadoop Ingest with Parsing, Dedup and other Big Data Transformations
Kafka to Hadoop Ingest with Parsing, Dedup and other Big Data TransformationsKafka to Hadoop Ingest with Parsing, Dedup and other Big Data Transformations
Kafka to Hadoop Ingest with Parsing, Dedup and other Big Data Transformations
Apache Apex
 
Building Your First Apache Apex (Next Gen Big Data/Hadoop) Application
Building Your First Apache Apex (Next Gen Big Data/Hadoop) ApplicationBuilding Your First Apache Apex (Next Gen Big Data/Hadoop) Application
Building Your First Apache Apex (Next Gen Big Data/Hadoop) Application
Apache Apex
 
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and TransformIntro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
Apache Apex
 
Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex
Apache Apex
 
Apache Beam (incubating)
Apache Beam (incubating)Apache Beam (incubating)
Apache Beam (incubating)
Apache Apex
 
Making sense of Apache Bigtop's role in ODPi and how it matters to Apache Apex
Making sense of Apache Bigtop's role in ODPi and how it matters to Apache ApexMaking sense of Apache Bigtop's role in ODPi and how it matters to Apache Apex
Making sense of Apache Bigtop's role in ODPi and how it matters to Apache Apex
Apache Apex
 
Apache Apex & Bigtop
Apache Apex & BigtopApache Apex & Bigtop
Apache Apex & Bigtop
Apache Apex
 
Building Your First Apache Apex Application
Building Your First Apache Apex ApplicationBuilding Your First Apache Apex Application
Building Your First Apache Apex Application
Apache Apex
 

More from Apache Apex (17)

Low Latency Polyglot Model Scoring using Apache Apex
Low Latency Polyglot Model Scoring using Apache ApexLow Latency Polyglot Model Scoring using Apache Apex
Low Latency Polyglot Model Scoring using Apache Apex
 
From Batch to Streaming with Apache Apex Dataworks Summit 2017
From Batch to Streaming with Apache Apex Dataworks Summit 2017From Batch to Streaming with Apache Apex Dataworks Summit 2017
From Batch to Streaming with Apache Apex Dataworks Summit 2017
 
Apache Big Data EU 2016: Building Streaming Applications with Apache Apex
Apache Big Data EU 2016: Building Streaming Applications with Apache ApexApache Big Data EU 2016: Building Streaming Applications with Apache Apex
Apache Big Data EU 2016: Building Streaming Applications with Apache Apex
 
Hadoop Interacting with HDFS
Hadoop Interacting with HDFSHadoop Interacting with HDFS
Hadoop Interacting with HDFS
 
Introduction to Real-Time Data Processing
Introduction to Real-Time Data ProcessingIntroduction to Real-Time Data Processing
Introduction to Real-Time Data Processing
 
Introduction to Yarn
Introduction to YarnIntroduction to Yarn
Introduction to Yarn
 
Introduction to Map Reduce
Introduction to Map ReduceIntroduction to Map Reduce
Introduction to Map Reduce
 
HDFS Internals
HDFS InternalsHDFS Internals
HDFS Internals
 
Intro to Big Data Hadoop
Intro to Big Data HadoopIntro to Big Data Hadoop
Intro to Big Data Hadoop
 
Kafka to Hadoop Ingest with Parsing, Dedup and other Big Data Transformations
Kafka to Hadoop Ingest with Parsing, Dedup and other Big Data TransformationsKafka to Hadoop Ingest with Parsing, Dedup and other Big Data Transformations
Kafka to Hadoop Ingest with Parsing, Dedup and other Big Data Transformations
 
Building Your First Apache Apex (Next Gen Big Data/Hadoop) Application
Building Your First Apache Apex (Next Gen Big Data/Hadoop) ApplicationBuilding Your First Apache Apex (Next Gen Big Data/Hadoop) Application
Building Your First Apache Apex (Next Gen Big Data/Hadoop) Application
 
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and TransformIntro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
 
Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex
 
Apache Beam (incubating)
Apache Beam (incubating)Apache Beam (incubating)
Apache Beam (incubating)
 
Making sense of Apache Bigtop's role in ODPi and how it matters to Apache Apex
Making sense of Apache Bigtop's role in ODPi and how it matters to Apache ApexMaking sense of Apache Bigtop's role in ODPi and how it matters to Apache Apex
Making sense of Apache Bigtop's role in ODPi and how it matters to Apache Apex
 
Apache Apex & Bigtop
Apache Apex & BigtopApache Apex & Bigtop
Apache Apex & Bigtop
 
Building Your First Apache Apex Application
Building Your First Apache Apex ApplicationBuilding Your First Apache Apex Application
Building Your First Apache Apex Application
 

Recently uploaded

ppt on the brain chip neuralink.pptx
ppt  on   the brain  chip neuralink.pptxppt  on   the brain  chip neuralink.pptx
ppt on the brain chip neuralink.pptx
Reetu63
 
一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理
一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理
一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理
kgyxske
 
The Comprehensive Guide to Validating Audio-Visual Performances.pdf
The Comprehensive Guide to Validating Audio-Visual Performances.pdfThe Comprehensive Guide to Validating Audio-Visual Performances.pdf
The Comprehensive Guide to Validating Audio-Visual Performances.pdf
kalichargn70th171
 
Kubernetes at Scale: Going Multi-Cluster with Istio
Kubernetes at Scale:  Going Multi-Cluster  with IstioKubernetes at Scale:  Going Multi-Cluster  with Istio
Kubernetes at Scale: Going Multi-Cluster with Istio
Severalnines
 
一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理
dakas1
 

Apache Apex as a YARN Application

  • 1. Apache Apex as YARN Application
      Chinmay Kolhatkar (chinmay@apache.org), Mar 22, 2016, Apache Apex Meetup
  • 2. Agenda
      • Directed Acyclic Graph
      • Apex as a YARN Application
      • Application Components of Apex
      • Lifecycle of Apex as a YARN Application
  • 3. Directed Acyclic Graph (DAG)
      • Defines compute stages of streaming application
      • Defines tuple flow across Operators via Stream
      [Figure: DAG of four compute stages, Compute 1 to Compute 4]
  • 4. DAG Components
      • Tuple
          ● Atomic data that flows over a stream
      • Operator
          ● Basic compute unit per tuple
      • Stream
          ● Connector abstraction between operators
          ● Tuples flow over this
      [Figure: Operator 1 connected to Operator 2 by a Stream carrying tuple 1, tuple 2, tuple 3]
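The three components on this slide can be pictured with a small toy model. This is only a sketch in plain Python, not Apex code: `stream` and `operator` are hypothetical helper names, and the tuples are plain integers chosen for illustration.

```python
# Toy model of the slide's vocabulary; none of these names come from Apex.

def stream(upstream):
    """A stream: carries tuples, in order, from one operator to the next."""
    for t in upstream:
        yield t

def operator(compute, upstream):
    """An operator: applies its compute function to each incoming tuple."""
    for t in upstream:
        yield compute(t)

# Three tuples flow through two operators connected by streams.
tuples = [1, 2, 3]
doubled = operator(lambda t: t * 2, stream(tuples))      # stands in for Operator 1
result = list(operator(lambda t: t + 1, stream(doubled)))  # stands in for Operator 2
print(result)
```

Each tuple is handled independently, one at a time, which is the sense in which an operator is the "basic compute unit per tuple".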
  • 5. DAG Types
      • Logical Plan
          ● Logical representation of computation
          ● Defines operators, streams and dataflow
      • Physical Plan
          ● Deployable plan on cluster
          ● Contains partition information of operators
          ● Has ready-to-deploy serialized operator instances
      [Figure: Logical DAG with operators O1 to O5; Physical DAG with partitions O1 P1 to P3 and O2 P1 to P3, a node U, and O3, O4, O5]
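A rough sketch of how a logical plan might expand into a physical one is given below. It is a simplified model, not the Apex planner: the function `to_physical`, the edge list between O1 to O5, the one-to-one wiring between equally partitioned operators, and the `U(...)` merge node placed in front of an unpartitioned consumer are all assumptions made for illustration.

```python
def to_physical(operators, streams):
    """Expand logical operators into partition instances and wire them up.

    operators: dict of logical name -> partition count (assumed representation)
    streams:   list of (upstream name, downstream name) pairs
    """
    instances = {
        name: [f"{name}.P{i}" for i in range(1, n + 1)] if n > 1 else [name]
        for name, n in operators.items()
    }
    edges = []
    for src, dst in streams:
        ups, downs = instances[src], instances[dst]
        if len(ups) == len(downs):
            # Same partition count: connect partition i to partition i.
            edges += list(zip(ups, downs))
        elif len(downs) == 1:
            # Several partitions feed one instance: merge through a U node first.
            merge = f"U({src})"
            edges += [(u, merge) for u in ups]
            edges.append((merge, downs[0]))
        else:
            # Fallback for this sketch only: connect every upstream to every downstream.
            edges += [(u, d) for u in ups for d in downs]
    return instances, edges

# Partition counts follow the slide's figure; the stream topology is assumed.
operators = {"O1": 3, "O2": 3, "O3": 1, "O4": 1, "O5": 1}
streams = [("O1", "O2"), ("O2", "O3"), ("O3", "O4"), ("O3", "O5")]
instances, edges = to_physical(operators, streams)
for edge in edges:
    print(edge)
```

The logical plan stays small (five operators, four streams), while the physical plan carries the per-partition detail needed for deployment.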
  • 6. Apex as YARN application
      [Figure: YARN architecture next to its Apex counterpart. YARN side: ResourceManager (AsM + Scheduler), a NodeManager (NM) on each Node, YarnClient, AppMaster, YarnContainers. Apex side: DTCLI with StrAMClient (YarnClient), StrAM (AppMaster), one StrAMChild YarnContainer hosting O1 and O2, one StrAMChild YarnContainer hosting O3. Connections are labelled ClientRM Protocol, AMRM Protocol and ContainerManager Protocol.]
  • 7. Application Components of Apex - StrAMClient
      • Part of dtcli client interface
      • Invoked by “launch” command of dtcli
      • Tasks:
          ● Copy the required application package files into HDFS
          ● Validate Logical Plan
          ● Serialize Logical Plan to HDFS
          ● Launch Application Master, i.e. StrAM
  • 8. Application Components of Apex - StrAM
      • Streaming Application Master
      • Started by StrAMClient on a YarnContainer
      • Tasks:
          ● Convert logical plan to physical plan
          ● Serialize operators to HDFS
          ● Request resources from ResourceManager
          ● Start StrAMChild in YarnContainer(s)
          ● Monitor StrAMChild using ContainerManager protocol
          ● Generate Application statistics
          ● Host results on WebService (dtManage)
          ● Fault Tolerance
          ● Checkpointing/Committing Application States
          ● Support Security
          ● Shutdown Application
  • 9. Application Components of Apex - StrAMChild
      • Deployed on YarnContainer
      • Started by NodeManager as instructed by StrAM
      • Instance of StreamingContainer
      • Contains Operators (compute-related)
      • Contains BufferServer (stream-related)
      • Tasks:
          ● Regularly send heartbeat to StrAM
          ● Execute commands from StrAM
          ● Shutdown or Kill self if instructed
          ● Manage lifecycle of an Operator
          ● Network communication using BufferServer
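The heartbeat relationship between StrAMChild and StrAM can be sketched as a small bookkeeping class. This is a generic illustration of heartbeat monitoring, not Apex's implementation: `HeartbeatMonitor`, the container ids, and the 30-second timeout are hypothetical.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat time per container (hypothetical helper)."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_seen = {}

    def heartbeat(self, container_id, now):
        # Called whenever a container reports in; 'now' is a time in seconds.
        self.last_seen[container_id] = now

    def expired(self, now):
        # Containers whose last heartbeat is older than the timeout.
        return sorted(
            cid for cid, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )

monitor = HeartbeatMonitor(timeout_s=30)   # timeout value is an assumption
monitor.heartbeat("container_1", now=0)
monitor.heartbeat("container_2", now=0)
monitor.heartbeat("container_1", now=25)   # container_2 goes silent
late = monitor.expired(now=40)
print(late)
```

A master that notices a silent container in this way has the information it needs to trigger recovery, which is where the fault-tolerance task listed for StrAM would come in.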
  • 10. Lifecycle of Apex/YARN Application - Start
      1) Access cluster information
      2) Copies files from HDFS
      3) Submit Application to RM
      4) StrAM registers with RM
      5) StrAM sends heartbeats regularly
      6) StrAM requests containers with specifications
      7) StrAMChild reads serialized operator from HDFS
      8) StrAMChild starts operator lifecycle
      [Figure: DTCLI/StrAMClient (YarnClient), HDFS, ResourceManager (AsM + Scheduler), a NodeManager (NM) on each Node, StrAM (AppMaster), and three StrAMChild YarnContainers hosting O1 and O2, O3, and O4; connections labelled ClientRMProtocol, AMRMProtocol and ContainerManager Protocol]
  • 11. Lifecycle of Apex/YARN Application - Running
      1) StrAMChild sends heartbeats
      2) StrAMChild sends operator data
      3) StrAM sends regular heartbeats to RM
      4) Query status of application
      [Figure: same cluster layout as in the Start slide]
  • 12. Lifecycle of Apex/YARN Application - Shutdown
      1) Connect on WebService REST API
      2) Send command to StrAM
      3) Send shutdown signal to StrAMChild
      4) StrAMChild finishes operator lifecycle
      5) Check if all containers are freed
      6) StrAM unregisters itself
      7) StrAM exits
      8) Check if application has shutdown
      [Figure: same cluster layout as in the Start slide]
  • 13. Lifecycle of Apex/YARN Application - Kill
      1) Send kill-app command to YARN
      2) RM kills all containers
      [Figure: same cluster layout as in the Start slide]
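The difference between the Shutdown and Kill paths can be illustrated with a toy model. It is a sketch under assumptions, not Apex behavior verified against the code: the `Container` class is hypothetical, and the idea that a kill skips operator teardown is an inference from the Kill slide listing no "finishes operator lifecycle" step.

```python
class Container:
    """Hypothetical stand-in for a StrAMChild container; not an Apex class."""

    def __init__(self, operators):
        self.operators = list(operators)
        self.finished = []     # operators that completed their lifecycle
        self.running = True

    def shutdown(self):
        # Graceful path: operators finish their lifecycle, then the container exits.
        self.finished = list(self.operators)
        self.running = False

    def kill(self):
        # Kill path (assumed): the container stops with no operator teardown.
        self.running = False

def all_freed(containers):
    """Mirrors the 'check if all containers are freed' step of the Shutdown slide."""
    return all(not c.running for c in containers)

graceful = [Container(["O1", "O2"]), Container(["O3"]), Container(["O4"])]
killed = [Container(["O1", "O2"]), Container(["O3"]), Container(["O4"])]

for c in graceful:
    c.shutdown()
for c in killed:
    c.kill()

print(all_freed(graceful), [c.finished for c in graceful])
print(all_freed(killed), [c.finished for c in killed])
```

Both paths end with every container freed; only the graceful one records the operators as having finished.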
  • 14. Summary – Apex platform
      • Enables YARN to be used for Streaming Applications
      • Takes care of YARN specific work
      • User can focus on business logic defined in Operators
  • 16. Resources
      • Apache Apex website - http://apex.incubator.apache.org/
      • Subscribe - http://apex.incubator.apache.org/community.html
      • Download - http://apex.incubator.apache.org/downloads.html
      • Twitter - @ApacheApex; Follow - https://twitter.com/apacheapex
      • Facebook - https://www.facebook.com/ApacheApex/
      • Meetup - http://www.meetup.com/topics/apache-apex
      • Startup Program – Free Enterprise License for startups, Universities, Non-Profits
  • 17. Upcoming events...
      • March 24th 9am PST - Fault Tolerance and Processing Semantics with Apache Apex
      • March 28th 6pm PST - Low-latency ingestion and analytics with Apache Kafka and Apache Apex (Hadoop)
      • ...