SlideShare a Scribd company logo
Hadoop Cluster
Hadoop Cluster
Speed Layer
Storm/Spark (Real
time processing)
Batch Layer Analytics
ELT
Kafka Cluster (Mirror)
Mirroring
Kafka Cluster
Hadoop Client for
Camus
Hive/Pig
DWH
HBase
Master Plan (Lambda Architecture)
Variety Sources
REST/IS
Kafka Cluster A
REST/IS
REST/IS
Kafka Cluster B
Mirroring
Kafka Cluster C
Kafka Cluster D
Topic : “Data”
Topic : “Data”
Kafka (Mirroring) Configuration
Integration of all the sources
coming to cluster A and B with
topic Name “Data”
Integration of all the sources
coming to cluster A and B with
topic Name “Data”
Host Configuration
-Quad-Core AMD Opteron(TM)
Processor
-8GB RAM
-320Gb
Batch Layer (Writing the data
into hadoop cluster)
Speed Layer (Feeding data from
Kafka to Storm )

More Related Content

What's hot

Apache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series dataApache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series data
Patrick McFadin
 

What's hot (20)

Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...
 
Akka in Production - ScalaDays 2015
Akka in Production - ScalaDays 2015Akka in Production - ScalaDays 2015
Akka in Production - ScalaDays 2015
 
Streaming Big Data & Analytics For Scale
Streaming Big Data & Analytics For ScaleStreaming Big Data & Analytics For Scale
Streaming Big Data & Analytics For Scale
 
Real-time personal trainer on the SMACK stack
Real-time personal trainer on the SMACK stackReal-time personal trainer on the SMACK stack
Real-time personal trainer on the SMACK stack
 
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise SearchAn Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
 
Sa introduction to big data pipelining with cassandra & spark west mins...
Sa introduction to big data pipelining with cassandra & spark   west mins...Sa introduction to big data pipelining with cassandra & spark   west mins...
Sa introduction to big data pipelining with cassandra & spark west mins...
 
Real-Time Anomaly Detection with Spark MLlib, Akka and Cassandra
Real-Time Anomaly Detection  with Spark MLlib, Akka and  CassandraReal-Time Anomaly Detection  with Spark MLlib, Akka and  Cassandra
Real-Time Anomaly Detection with Spark MLlib, Akka and Cassandra
 
Reactive dashboard’s using apache spark
Reactive dashboard’s using apache sparkReactive dashboard’s using apache spark
Reactive dashboard’s using apache spark
 
Using the SDACK Architecture to Build a Big Data Product
Using the SDACK Architecture to Build a Big Data ProductUsing the SDACK Architecture to Build a Big Data Product
Using the SDACK Architecture to Build a Big Data Product
 
Analyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and CassandraAnalyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and Cassandra
 
Reactive app using actor model & apache spark
Reactive app using actor model & apache sparkReactive app using actor model & apache spark
Reactive app using actor model & apache spark
 
The How and Why of Fast Data Analytics with Apache Spark
The How and Why of Fast Data Analytics with Apache SparkThe How and Why of Fast Data Analytics with Apache Spark
The How and Why of Fast Data Analytics with Apache Spark
 
Real Time Data Processing Using Spark Streaming
Real Time Data Processing Using Spark StreamingReal Time Data Processing Using Spark Streaming
Real Time Data Processing Using Spark Streaming
 
Getting Started Running Apache Spark on Apache Mesos
Getting Started Running Apache Spark on Apache MesosGetting Started Running Apache Spark on Apache Mesos
Getting Started Running Apache Spark on Apache Mesos
 
Apache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series dataApache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series data
 
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
 
C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...
C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...
C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...
 
Lambda Architecture Using SQL
Lambda Architecture Using SQLLambda Architecture Using SQL
Lambda Architecture Using SQL
 
Alpine academy apache spark series #1 introduction to cluster computing wit...
Alpine academy apache spark series #1   introduction to cluster computing wit...Alpine academy apache spark series #1   introduction to cluster computing wit...
Alpine academy apache spark series #1 introduction to cluster computing wit...
 
Near Real Time Indexing Kafka Messages into Apache Blur: Presented by Dibyend...
Near Real Time Indexing Kafka Messages into Apache Blur: Presented by Dibyend...Near Real Time Indexing Kafka Messages into Apache Blur: Presented by Dibyend...
Near Real Time Indexing Kafka Messages into Apache Blur: Presented by Dibyend...
 

Viewers also liked

Salesforce API Series: Fast Parallel Data Loading with the Bulk API Webinar
Salesforce API Series: Fast Parallel Data Loading with the Bulk API WebinarSalesforce API Series: Fast Parallel Data Loading with the Bulk API Webinar
Salesforce API Series: Fast Parallel Data Loading with the Bulk API Webinar
Salesforce Developers
 

Viewers also liked (20)

Big Data and Fast Data - Lambda Architecture in Action
Big Data and Fast Data - Lambda Architecture in ActionBig Data and Fast Data - Lambda Architecture in Action
Big Data and Fast Data - Lambda Architecture in Action
 
Demystifying salesforce for developers
Demystifying salesforce for developersDemystifying salesforce for developers
Demystifying salesforce for developers
 
Extreme Salesforce Data Volumes Webinar
Extreme Salesforce Data Volumes WebinarExtreme Salesforce Data Volumes Webinar
Extreme Salesforce Data Volumes Webinar
 
How Apache Kafka is transforming Hadoop, Spark and Storm
How Apache Kafka is transforming Hadoop, Spark and StormHow Apache Kafka is transforming Hadoop, Spark and Storm
How Apache Kafka is transforming Hadoop, Spark and Storm
 
Large volume data analysis on the Typesafe Reactive Platform - Big Data Scala...
Large volume data analysis on the Typesafe Reactive Platform - Big Data Scala...Large volume data analysis on the Typesafe Reactive Platform - Big Data Scala...
Large volume data analysis on the Typesafe Reactive Platform - Big Data Scala...
 
7가지 동시성 모델 람다아키텍처
7가지 동시성 모델  람다아키텍처7가지 동시성 모델  람다아키텍처
7가지 동시성 모델 람다아키텍처
 
Handling of Large Data by Salesforce
Handling of Large Data by SalesforceHandling of Large Data by Salesforce
Handling of Large Data by Salesforce
 
Big Data Day LA 2015 - Event Driven Architecture for Web Analytics by Peyman ...
Big Data Day LA 2015 - Event Driven Architecture for Web Analytics by Peyman ...Big Data Day LA 2015 - Event Driven Architecture for Web Analytics by Peyman ...
Big Data Day LA 2015 - Event Driven Architecture for Web Analytics by Peyman ...
 
Case Study: Elasticsearch Ingest Using StreamSets @ Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets @ Cisco IntercloudCase Study: Elasticsearch Ingest Using StreamSets @ Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets @ Cisco Intercloud
 
Machine learning at Scale with Apache Spark
Machine learning at Scale with Apache SparkMachine learning at Scale with Apache Spark
Machine learning at Scale with Apache Spark
 
Salesforce REST API
Salesforce  REST API Salesforce  REST API
Salesforce REST API
 
Understanding the Salesforce Architecture: How We Do the Magic We Do
Understanding the Salesforce Architecture: How We Do the Magic We DoUnderstanding the Salesforce Architecture: How We Do the Magic We Do
Understanding the Salesforce Architecture: How We Do the Magic We Do
 
Big Data Architectural Patterns and Best Practices on AWS
Big Data Architectural Patterns and Best Practices on AWSBig Data Architectural Patterns and Best Practices on AWS
Big Data Architectural Patterns and Best Practices on AWS
 
Salesforce API Series: Fast Parallel Data Loading with the Bulk API Webinar
Salesforce API Series: Fast Parallel Data Loading with the Bulk API WebinarSalesforce API Series: Fast Parallel Data Loading with the Bulk API Webinar
Salesforce API Series: Fast Parallel Data Loading with the Bulk API Webinar
 
Building a Lambda Architecture with Elasticsearch at Yieldbot
Building a Lambda Architecture with Elasticsearch at YieldbotBuilding a Lambda Architecture with Elasticsearch at Yieldbot
Building a Lambda Architecture with Elasticsearch at Yieldbot
 
Introduction to Apache NiFi - Seattle Scalability Meetup
Introduction to Apache NiFi - Seattle Scalability MeetupIntroduction to Apache NiFi - Seattle Scalability Meetup
Introduction to Apache NiFi - Seattle Scalability Meetup
 
Microservice-based Architecture on the Salesforce App Cloud
Microservice-based Architecture on the Salesforce App CloudMicroservice-based Architecture on the Salesforce App Cloud
Microservice-based Architecture on the Salesforce App Cloud
 
Large Data Management Strategies
Large Data Management StrategiesLarge Data Management Strategies
Large Data Management Strategies
 
람다아키텍처
람다아키텍처람다아키텍처
람다아키텍처
 
Kafka at Scale: Multi-Tier Architectures
Kafka at Scale: Multi-Tier ArchitecturesKafka at Scale: Multi-Tier Architectures
Kafka at Scale: Multi-Tier Architectures
 

Kafka Lambda architecture with mirroring

  • 1. Hadoop Cluster Hadoop Cluster Speed Layer Storm/Spark (Real time processing) Batch Layer Analytics ELT Kafka Cluster (Mirror) Mirroring Kafka Cluster Hadoop Client for Camus Hive/Pig DWH HBase Master Plan (Lambda Architecture) Variety Sources REST/IS
  • 2. Kafka Cluster A REST/IS REST/IS Kafka Cluster B Mirroring Kafka Cluster C Kafka Cluster D Topic : “Data” Topic : “Data” Kafka (Mirroring) Configuration Integration of all the sources coming to cluster A and B with topic Name “Data” Integration of all the sources coming to cluster A and B with topic Name “Data” Host Configuration -Quad-Core AMD Opteron(TM) Processor -8GB RAM -320Gb Batch Layer (Writing the data into hadoop cluster) Speed Layer (Feeding data from Kafka to Storm )