SlideShare a Scribd company logo
1 of 34
Download to read offline
Building Modern Data Pipelines Using
Apache Pulsar, Heron and BookKeeper
Karthik	Ramasamy	
Cofounder
Streamlio
2
Information Age
Streaming	Data	is	key	
Ká !
3
Increasingly Connected World
Internet of Things
30	B	connected	devices	by	2020
Health Care
153	Exabytes	(2013)	->	2314	Exabytes	(2020)
Machine Data
40%	of	digital	universe	by	2020
Connected Vehicles
Data	transferred	per	vehicle	per	month	
4	MB	->	5	GB
Digital Assistants (Predictive Analytics)
$2B	(2012)	->	$6.5B	(2019)	[1]	
Siri/Cortana/Google	Now
Augmented/Virtual Reality
$150B	by	2020	[2]	
Oculus/HoloLens/Magic	Leap
Ñ
!+
>
4
Traditional Batch Processing
Challenges	
Introduces	too	much	“decision	latency”	
Responses	are	delivered	“a[er	the	fact”	
Maximum	value	of	the	idenfied	situaon	is	lost	
Decisions	are	made	on	old	and	stale	data	
Data	at	Rest
Store Analyze Act
5
The New Era: Streaming Data/Fast Data
Events	are	analyzed	and	processed	in	real-me	as	they	arrive	
Decisions	are	mely,	contextual	and	based	on	fresh	data	
Decision	latency	is	eliminated	
Data	in	moon
Modern	Data	Pipelines Streaming	Data	Pipelines
6
Streaming Data Pipeline
A	Streaming	Data	Pipeline		
captures	events	for	analysis,		
process	as	they	arrive	and		
produce	con=nuous	results
7
Streaming Data Pipeline
Data	Lake
Product	Features
Adhoc	Features
Counng
Machine	Learning
Extract	Transform	Load	(ETL)
Click	Data
Engagement	Data
Email	sends
User	events
8
Streaming Use Cases
Algorithmic	trading	
Online	fraud	detecon	
Geo	fencing	
Proximity/locaon	tracking	
Intrusion	detecon	systems	
Traffic	management
Real	me	recommendaons	
Churn	detecon	
Internet	of	things	
Social	media/data	analycs	
Gaming	data	feed	
IoT	Edge	Analycs
9
Streaming Stack
REAL TIME
STACK
Collectors
s
Compute
J
Messaging
a
Storage
b
10
State of the World
Aggregation
Systems
Messaging
Systems
Result
Engine
HDFS
Queryable
Engines
11
Towards Simplification & Unification
Data	Ingeson Data	Processing
Results	StorageData	Storage
12
Why Apache Pulsar?
Ordering	
Guaranteed	ordering
Mul=-tenancy	
A	single	cluster	can	support	many	
tenants	and	use	cases
High	throughput	
Can	reach	1.8	M	messages/s	in	a	
single	par==on
Durability	
Data	replicated	and	synced	to	disk
Geo-replica=on	
Out	of	box	support	for	geographically	
distributed	applica=ons
Unified	messaging	model	
Support	both	Topic	&	Queue	
seman=c	in	a	single	model
Delivery	Guarantees	
At	least	once,	at	most	once	and	effec=vely	
once
Low	Latency	
Low	publish	latency	of	5ms	at	99pct
Highly	scalable	
Can	support	millions	of	topics
13
Pulsar Architecture
Stateless	Serving
BROKER	
Clients interact only with brokers
No state is stored in brokers
BOOKIES	
Apache BookKeeper as the storage
Storage is append only
Provides high performance, low latency
Durability	
No data loss. fsync before acknowledgement
14
Pulsar Architecture
Separa=on	of	Storage	and	Serving
SERVING
Brokers can be added independently
Traffic can be shifted quickly across brokers
STORAGE	
Bookies can be added independently
New bookies will ramp up traffic quickly
15
Segment Centric Architecture
16
Pulsar Architecture
Geo	Replica=on
GEO REPLICATION
Asynchronous replication
Integrated in the broker message flow
Simple configuration to add/remove regions
Topic	(T1) Topic	(T1)
Topic	(T1)
Subscripon	
(S1)
Subscripon	
(S1)
Producer		
(P1)
Consumer		
(C1)
Producer		
(P3)
Producer		
(P2)
Consumer		
(C2)
Data	Center	A Data	Center	B
Data	Center	C
17
Pulsar in Production
3+	years	
Serves	2.3	million	topics	
100	billion	messages/day	
Average	latency	<	5	ms	
99%	15	ms	(strong	durability	guarantees)	
Zero	data	loss	
80+	applicaons	
Self	served	provisioning	
Full-mesh	cross-datacenter	replicaon	-	
8+	data	centers
18
Heron Design Goals
Efficiency	
Reduce	resource	consumpon
Support	for	diverse	workloads	
Throughput	vs	latency	sensive
Support	for	mul=ple	seman=cs	
Atmost	once,	Atleast	once,	
Effecvely	once
Na=ve	Mul=-Language	Support	
C++,	Java,	Python
Task	Isola=on	
Ease	of	debug-ability/isolaon/profiling	
Support	for	back	pressure	
Topologies	should	be	self	adjusng
Use	of	containers	
Runs	in	schedulers	-	Kubernetes	&	DCOS	&	
many	more
Mul=-level	APIs	
Procedural,	Funconal	and	Declarave	for	
diverse	applicaons
Diverse	deployment	models	
Run	as	a	service	or	pure	library
19
Heron Data Model
%
%
%
%
%
Spout 1
Spout 2
Bolt 1
Bolt 2
Bolt 3
Bolt 4
Bolt 5
20
Writing Heron Topologies
Procedural - Low Level API
Directly	write	your	spouts	and	bolts
Functional - Mid Level API
Use	of	maps,	flat	maps,	transform,	windows
Declarative - SQL (in the works)
Use	of	declarave	language	-	specify	what	you	
want,	system	will	figure	it	out.
,
%
21
Topology Execution
Topology Master
ZK
Cluster
Stream
Manager
I1 I2 I3 I4
Stream
Manager
I1 I2 I3 I4
Logical Plan,
Physical Plan and
Execution State
Sync Physical Plan
DATA CONTAINER DATA CONTAINER
Metrics
Manager
Metrics
Manager
Metrics
Manager
Health
Manager
MASTER
CONTAINER
22
Heron @Twitter
LARGEST	CLUSTER
100’s	of	TOPOLOGIES
BILLIONS	OF	MESSAGES100’s	OF	TERABYTESREDUCED	INCIDENTS
GOOD	NIGHT	SLEEP
3X - 5X reduction in resource usage
23
Heron Topology Complexity
24
Heron Topology Scale
CONTAINERS - 1 TO 600 INSTANCES - 10 TO 6000
25
Heron Happy Facts :)
!	No	more	pages	during	midnight	for	Heron	team	
"	Very	rare	incidents	for	Heron	customer	teams	
#	Easy	to	debug	during	incident	for	quick	turn	around	
$	Reduced	resource	ulizaon	saving	cost
26
Observations
Computaon	across	batch/streaming	is	similar	
Expressed	as	DAGS	
Run	in	parallel	on	the	cluster	
Intermediate	results	need	not	be	materialized	
Funconal/Declarave	APIs	
Storage	is	the	key	
Messaging/Storage	are	two	faces	of	the	same	coin	
They	serve	the	same	data
27
Storage Requirements
Requirements	for	a	real-=me	storage	placorm
Be	able	to	write	and	read	streams	of	records	with	low	latency,	storage	durability	
Data	storage	should	be	durable,	consistent	and	fault	tolerant	
Enable	clients	to	stream	or	tail	ledgers	to	propagate	data	as	they’re	wriren	
Store	and	provide	access	to	both	historic	and	real-me	data
28
BookKeeper in Pipelines
Durable	Messaging,	Scalable	Compute	and	Stream	Storage
29
BookKeeper in Production
Enterprise	Grade	Stream	Storage
4+	years	at	Twirer	and	Yahoo,	2+	years	at	Salesforce	
Mulple	use	cases	from	messaging	to	storage	
Database	replicaon,	Message	store,	Stream	compung	…	
600+	bookies	in	one	single	cluster	
Data	is	stored	from	days	to	a	year	
Millions	of	log	streams	
1	trillion	records/day,	17	PB/day
30
Companies using the projects
Enterprise	Grade	Stream	Storage
31
Streamlio - Unified Architecture
Interactive
Querying
Storm API
Trident/Apache
Beam
SQL
Application
Builder
Pulsar
API
BK/
HDFS
API
Kubernetes
Metadata
Management
Operational
Monitoring
Chargeback
Security
Authentication
Quota
Management
Rules
Engine
Kafka
API
33
GET	IN	TOUCH
C O N T A C T 	 U S
@kramasamy
955	Alma	Street,	Palo	Alto,	CA
karthik@streaml.io
Modern Data Pipelines

More Related Content

What's hot

Pinterest - Big Data Machine Learning Platform at Pinterest
Pinterest - Big Data Machine Learning Platform at PinterestPinterest - Big Data Machine Learning Platform at Pinterest
Pinterest - Big Data Machine Learning Platform at PinterestAlluxio, Inc.
 
Solving Data Discovery Challenges at Lyft with Amundsen, an Open-source Metad...
Solving Data Discovery Challenges at Lyft with Amundsen, an Open-source Metad...Solving Data Discovery Challenges at Lyft with Amundsen, an Open-source Metad...
Solving Data Discovery Challenges at Lyft with Amundsen, an Open-source Metad...Databricks
 
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...Databricks
 
Developing Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaDeveloping Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaJoe Stein
 
Tutorial - Modern Real Time Streaming Architectures
Tutorial - Modern Real Time Streaming ArchitecturesTutorial - Modern Real Time Streaming Architectures
Tutorial - Modern Real Time Streaming ArchitecturesKarthik Ramasamy
 
Building real time analytics applications using pinot : A LinkedIn case study
Building real time analytics applications using pinot : A LinkedIn case studyBuilding real time analytics applications using pinot : A LinkedIn case study
Building real time analytics applications using pinot : A LinkedIn case studyKishore Gopalakrishna
 
Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...
Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...
Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...Flink Forward
 
Databricks Platform.pptx
Databricks Platform.pptxDatabricks Platform.pptx
Databricks Platform.pptxAlex Ivy
 
Challenges in Building a Data Pipeline
Challenges in Building a Data PipelineChallenges in Building a Data Pipeline
Challenges in Building a Data PipelineManish Kumar
 
Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...
Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...
Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...Yael Garten
 
DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDatabricks
 
MongoDB World 2019: Finding the Right MongoDB Atlas Cluster Size: Does This I...
MongoDB World 2019: Finding the Right MongoDB Atlas Cluster Size: Does This I...MongoDB World 2019: Finding the Right MongoDB Atlas Cluster Size: Does This I...
MongoDB World 2019: Finding the Right MongoDB Atlas Cluster Size: Does This I...MongoDB
 
Using Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational CacheUsing Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational CacheDremio Corporation
 
Data Lake or Data Warehouse? Data Cleaning or Data Wrangling? How to Ensure t...
Data Lake or Data Warehouse? Data Cleaning or Data Wrangling? How to Ensure t...Data Lake or Data Warehouse? Data Cleaning or Data Wrangling? How to Ensure t...
Data Lake or Data Warehouse? Data Cleaning or Data Wrangling? How to Ensure t...Anastasija Nikiforova
 
Spark with Delta Lake
Spark with Delta LakeSpark with Delta Lake
Spark with Delta LakeKnoldus Inc.
 
Data pipelines from zero to solid
Data pipelines from zero to solidData pipelines from zero to solid
Data pipelines from zero to solidLars Albertsson
 
Overview on Azure Machine Learning
Overview on Azure Machine LearningOverview on Azure Machine Learning
Overview on Azure Machine LearningJames Serra
 

What's hot (20)

Pinterest - Big Data Machine Learning Platform at Pinterest
Pinterest - Big Data Machine Learning Platform at PinterestPinterest - Big Data Machine Learning Platform at Pinterest
Pinterest - Big Data Machine Learning Platform at Pinterest
 
Hadoop 기반 빅데이터 이해
Hadoop 기반 빅데이터 이해Hadoop 기반 빅데이터 이해
Hadoop 기반 빅데이터 이해
 
Spark etl
Spark etlSpark etl
Spark etl
 
Solving Data Discovery Challenges at Lyft with Amundsen, an Open-source Metad...
Solving Data Discovery Challenges at Lyft with Amundsen, an Open-source Metad...Solving Data Discovery Challenges at Lyft with Amundsen, an Open-source Metad...
Solving Data Discovery Challenges at Lyft with Amundsen, an Open-source Metad...
 
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...
 
Developing Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaDeveloping Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache Kafka
 
Tutorial - Modern Real Time Streaming Architectures
Tutorial - Modern Real Time Streaming ArchitecturesTutorial - Modern Real Time Streaming Architectures
Tutorial - Modern Real Time Streaming Architectures
 
Building real time analytics applications using pinot : A LinkedIn case study
Building real time analytics applications using pinot : A LinkedIn case studyBuilding real time analytics applications using pinot : A LinkedIn case study
Building real time analytics applications using pinot : A LinkedIn case study
 
Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...
Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...
Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...
 
Databricks Platform.pptx
Databricks Platform.pptxDatabricks Platform.pptx
Databricks Platform.pptx
 
Challenges in Building a Data Pipeline
Challenges in Building a Data PipelineChallenges in Building a Data Pipeline
Challenges in Building a Data Pipeline
 
From Data Warehouse to Lakehouse
From Data Warehouse to LakehouseFrom Data Warehouse to Lakehouse
From Data Warehouse to Lakehouse
 
Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...
Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...
Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...
 
DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
MongoDB World 2019: Finding the Right MongoDB Atlas Cluster Size: Does This I...
MongoDB World 2019: Finding the Right MongoDB Atlas Cluster Size: Does This I...MongoDB World 2019: Finding the Right MongoDB Atlas Cluster Size: Does This I...
MongoDB World 2019: Finding the Right MongoDB Atlas Cluster Size: Does This I...
 
Using Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational CacheUsing Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational Cache
 
Data Lake or Data Warehouse? Data Cleaning or Data Wrangling? How to Ensure t...
Data Lake or Data Warehouse? Data Cleaning or Data Wrangling? How to Ensure t...Data Lake or Data Warehouse? Data Cleaning or Data Wrangling? How to Ensure t...
Data Lake or Data Warehouse? Data Cleaning or Data Wrangling? How to Ensure t...
 
Spark with Delta Lake
Spark with Delta LakeSpark with Delta Lake
Spark with Delta Lake
 
Data pipelines from zero to solid
Data pipelines from zero to solidData pipelines from zero to solid
Data pipelines from zero to solid
 
Overview on Azure Machine Learning
Overview on Azure Machine LearningOverview on Azure Machine Learning
Overview on Azure Machine Learning
 

Similar to Modern Data Pipelines

Streaming Pipelines in Kubernetes Using Apache Pulsar, Heron and BookKeeper
Streaming Pipelines in Kubernetes Using Apache Pulsar, Heron and BookKeeperStreaming Pipelines in Kubernetes Using Apache Pulsar, Heron and BookKeeper
Streaming Pipelines in Kubernetes Using Apache Pulsar, Heron and BookKeeperKarthik Ramasamy
 
DataEd Online: Demystifying Big Data
DataEd Online: Demystifying Big DataDataEd Online: Demystifying Big Data
DataEd Online: Demystifying Big DataDATAVERSITY
 
Data-Ed: Demystifying Big Data
Data-Ed: Demystifying Big DataData-Ed: Demystifying Big Data
Data-Ed: Demystifying Big DataData Blueprint
 
Research issues in the big data and its Challenges
Research issues in the big data and its ChallengesResearch issues in the big data and its Challenges
Research issues in the big data and its ChallengesKathirvel Ayyaswamy
 
Modern data integration | Diyotta
Modern data integration | Diyotta Modern data integration | Diyotta
Modern data integration | Diyotta diyotta
 
Harness the Power of Big Data with Oracle
Harness the Power of Big Data with OracleHarness the Power of Big Data with Oracle
Harness the Power of Big Data with OracleSai Janakiram Penumuru
 
Move It Don't Lose It: Is Your Big Data Collecting Dust?
Move It Don't Lose It: Is Your Big Data Collecting Dust?Move It Don't Lose It: Is Your Big Data Collecting Dust?
Move It Don't Lose It: Is Your Big Data Collecting Dust?Jennifer Walker
 
Analytics in IoT
Analytics in IoTAnalytics in IoT
Analytics in IoTwesley Dias
 
Sean gately internet of things
Sean gately   internet of thingsSean gately   internet of things
Sean gately internet of thingsProductCamp SoCal
 
Big Data : Risks and Opportunities
Big Data : Risks and OpportunitiesBig Data : Risks and Opportunities
Big Data : Risks and OpportunitiesKenny Huang Ph.D.
 
State of Big Data Markets
State of Big Data MarketsState of Big Data Markets
State of Big Data MarketsKyle Redinger
 
Presentación Marcos Grilanda /Amazon Web Services - eCommerce Day Santiago 2017
Presentación Marcos Grilanda /Amazon Web Services - eCommerce Day Santiago 2017Presentación Marcos Grilanda /Amazon Web Services - eCommerce Day Santiago 2017
Presentación Marcos Grilanda /Amazon Web Services - eCommerce Day Santiago 2017eCommerce Institute
 
Matthew Johnston - Big Data Futures Outlook BCM
Matthew Johnston - Big Data Futures Outlook BCMMatthew Johnston - Big Data Futures Outlook BCM
Matthew Johnston - Big Data Futures Outlook BCMHoi Lan Leong
 
Big data and Internet
Big data and InternetBig data and Internet
Big data and InternetSanoj Kumar
 

Similar to Modern Data Pipelines (20)

Streaming Pipelines in Kubernetes Using Apache Pulsar, Heron and BookKeeper
Streaming Pipelines in Kubernetes Using Apache Pulsar, Heron and BookKeeperStreaming Pipelines in Kubernetes Using Apache Pulsar, Heron and BookKeeper
Streaming Pipelines in Kubernetes Using Apache Pulsar, Heron and BookKeeper
 
Big Data et eGovernment
Big Data et eGovernmentBig Data et eGovernment
Big Data et eGovernment
 
DataEd Online: Demystifying Big Data
DataEd Online: Demystifying Big DataDataEd Online: Demystifying Big Data
DataEd Online: Demystifying Big Data
 
Data-Ed: Demystifying Big Data
Data-Ed: Demystifying Big DataData-Ed: Demystifying Big Data
Data-Ed: Demystifying Big Data
 
S B Goyal
S B GoyalS B Goyal
S B Goyal
 
Research issues in the big data and its Challenges
Research issues in the big data and its ChallengesResearch issues in the big data and its Challenges
Research issues in the big data and its Challenges
 
Modern data integration | Diyotta
Modern data integration | Diyotta Modern data integration | Diyotta
Modern data integration | Diyotta
 
Harness the Power of Big Data with Oracle
Harness the Power of Big Data with OracleHarness the Power of Big Data with Oracle
Harness the Power of Big Data with Oracle
 
Move It Don't Lose It: Is Your Big Data Collecting Dust?
Move It Don't Lose It: Is Your Big Data Collecting Dust?Move It Don't Lose It: Is Your Big Data Collecting Dust?
Move It Don't Lose It: Is Your Big Data Collecting Dust?
 
Information Overload Phenomena
Information Overload PhenomenaInformation Overload Phenomena
Information Overload Phenomena
 
Analytics in IoT
Analytics in IoTAnalytics in IoT
Analytics in IoT
 
Sean gately internet of things
Sean gately   internet of thingsSean gately   internet of things
Sean gately internet of things
 
Big Data : Risks and Opportunities
Big Data : Risks and OpportunitiesBig Data : Risks and Opportunities
Big Data : Risks and Opportunities
 
Big Data
Big DataBig Data
Big Data
 
Big Data
Big DataBig Data
Big Data
 
State of Big Data Markets
State of Big Data MarketsState of Big Data Markets
State of Big Data Markets
 
Presentación Marcos Grilanda /Amazon Web Services - eCommerce Day Santiago 2017
Presentación Marcos Grilanda /Amazon Web Services - eCommerce Day Santiago 2017Presentación Marcos Grilanda /Amazon Web Services - eCommerce Day Santiago 2017
Presentación Marcos Grilanda /Amazon Web Services - eCommerce Day Santiago 2017
 
Matthew Johnston - Big Data Futures Outlook BCM
Matthew Johnston - Big Data Futures Outlook BCMMatthew Johnston - Big Data Futures Outlook BCM
Matthew Johnston - Big Data Futures Outlook BCM
 
Bigdata " new level"
Bigdata " new level"Bigdata " new level"
Bigdata " new level"
 
Big data and Internet
Big data and InternetBig data and Internet
Big data and Internet
 

More from Karthik Ramasamy

Scaling Apache Pulsar to 10 PB/day
Scaling Apache Pulsar to 10 PB/dayScaling Apache Pulsar to 10 PB/day
Scaling Apache Pulsar to 10 PB/dayKarthik Ramasamy
 
Pulsar summit-keynote-final
Pulsar summit-keynote-finalPulsar summit-keynote-final
Pulsar summit-keynote-finalKarthik Ramasamy
 
Apache Pulsar Seattle - Meetup
Apache Pulsar Seattle - MeetupApache Pulsar Seattle - Meetup
Apache Pulsar Seattle - MeetupKarthik Ramasamy
 
Unifying Messaging, Queueing & Light Weight Compute Using Apache Pulsar
Unifying Messaging, Queueing & Light Weight Compute Using Apache PulsarUnifying Messaging, Queueing & Light Weight Compute Using Apache Pulsar
Unifying Messaging, Queueing & Light Weight Compute Using Apache PulsarKarthik Ramasamy
 
Creating Data Fabric for #IOT with Apache Pulsar
Creating Data Fabric for #IOT with Apache PulsarCreating Data Fabric for #IOT with Apache Pulsar
Creating Data Fabric for #IOT with Apache PulsarKarthik Ramasamy
 
Linked In Stream Processing Meetup - Apache Pulsar
Linked In Stream Processing Meetup - Apache PulsarLinked In Stream Processing Meetup - Apache Pulsar
Linked In Stream Processing Meetup - Apache PulsarKarthik Ramasamy
 
Exactly once in Apache Heron
Exactly once in Apache HeronExactly once in Apache Heron
Exactly once in Apache HeronKarthik Ramasamy
 
Twitter's Real Time Stack - Processing Billions of Events Using Distributed L...
Twitter's Real Time Stack - Processing Billions of Events Using Distributed L...Twitter's Real Time Stack - Processing Billions of Events Using Distributed L...
Twitter's Real Time Stack - Processing Billions of Events Using Distributed L...Karthik Ramasamy
 
Storm@Twitter, SIGMOD 2014 paper
Storm@Twitter, SIGMOD 2014 paperStorm@Twitter, SIGMOD 2014 paper
Storm@Twitter, SIGMOD 2014 paperKarthik Ramasamy
 
Storm@Twitter, SIGMOD 2014
Storm@Twitter, SIGMOD 2014Storm@Twitter, SIGMOD 2014
Storm@Twitter, SIGMOD 2014Karthik Ramasamy
 

More from Karthik Ramasamy (11)

Scaling Apache Pulsar to 10 PB/day
Scaling Apache Pulsar to 10 PB/dayScaling Apache Pulsar to 10 PB/day
Scaling Apache Pulsar to 10 PB/day
 
Apache Pulsar @Splunk
Apache Pulsar @SplunkApache Pulsar @Splunk
Apache Pulsar @Splunk
 
Pulsar summit-keynote-final
Pulsar summit-keynote-finalPulsar summit-keynote-final
Pulsar summit-keynote-final
 
Apache Pulsar Seattle - Meetup
Apache Pulsar Seattle - MeetupApache Pulsar Seattle - Meetup
Apache Pulsar Seattle - Meetup
 
Unifying Messaging, Queueing & Light Weight Compute Using Apache Pulsar
Unifying Messaging, Queueing & Light Weight Compute Using Apache PulsarUnifying Messaging, Queueing & Light Weight Compute Using Apache Pulsar
Unifying Messaging, Queueing & Light Weight Compute Using Apache Pulsar
 
Creating Data Fabric for #IOT with Apache Pulsar
Creating Data Fabric for #IOT with Apache PulsarCreating Data Fabric for #IOT with Apache Pulsar
Creating Data Fabric for #IOT with Apache Pulsar
 
Linked In Stream Processing Meetup - Apache Pulsar
Linked In Stream Processing Meetup - Apache PulsarLinked In Stream Processing Meetup - Apache Pulsar
Linked In Stream Processing Meetup - Apache Pulsar
 
Exactly once in Apache Heron
Exactly once in Apache HeronExactly once in Apache Heron
Exactly once in Apache Heron
 
Twitter's Real Time Stack - Processing Billions of Events Using Distributed L...
Twitter's Real Time Stack - Processing Billions of Events Using Distributed L...Twitter's Real Time Stack - Processing Billions of Events Using Distributed L...
Twitter's Real Time Stack - Processing Billions of Events Using Distributed L...
 
Storm@Twitter, SIGMOD 2014 paper
Storm@Twitter, SIGMOD 2014 paperStorm@Twitter, SIGMOD 2014 paper
Storm@Twitter, SIGMOD 2014 paper
 
Storm@Twitter, SIGMOD 2014
Storm@Twitter, SIGMOD 2014Storm@Twitter, SIGMOD 2014
Storm@Twitter, SIGMOD 2014
 

Recently uploaded

Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1ranjankumarbehera14
 
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling ManjurJual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjurptikerjasaptiker
 
Capstone in Interprofessional Informatic // IMPACT OF COVID 19 ON EDUCATION
Capstone in Interprofessional Informatic  // IMPACT OF COVID 19 ON EDUCATIONCapstone in Interprofessional Informatic  // IMPACT OF COVID 19 ON EDUCATION
Capstone in Interprofessional Informatic // IMPACT OF COVID 19 ON EDUCATIONLakpaYanziSherpa
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...nirzagarg
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNKTimothy Spann
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
Data Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdfData Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdftheeltifs
 
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With OrangePredicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With OrangeThinkInnovation
 
Dubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls DubaiDubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls Dubaikojalkojal131
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...nirzagarg
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...gajnagarg
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteedamy56318795
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Researchmichael115558
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...gajnagarg
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制vexqp
 
PLE-statistics document for primary schs
PLE-statistics document for primary schsPLE-statistics document for primary schs
PLE-statistics document for primary schscnajjemba
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraGovindSinghDasila
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareGraham Ware
 

Recently uploaded (20)

Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1
 
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling ManjurJual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
 
Capstone in Interprofessional Informatic // IMPACT OF COVID 19 ON EDUCATION
Capstone in Interprofessional Informatic  // IMPACT OF COVID 19 ON EDUCATIONCapstone in Interprofessional Informatic  // IMPACT OF COVID 19 ON EDUCATION
Capstone in Interprofessional Informatic // IMPACT OF COVID 19 ON EDUCATION
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
 
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit RiyadhCytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Data Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdfData Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdf
 
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With OrangePredicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
 
Dubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls DubaiDubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls Dubai
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
 
PLE-statistics document for primary schs
PLE-statistics document for primary schsPLE-statistics document for primary schs
PLE-statistics document for primary schs
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 

Modern Data Pipelines