Untangling Cluster Management with Helix

Helix team @ LinkedIn
Kishore Gopalakrishna
http://www.linkedin.com/in/kgopalak
@kishoreg1980
     Recruiting Solutions                  1
Outline


 • What is Helix
 • Use case 1: distributed data store
 • Architecture
 • Use case 2: consumer group
 • Helix at LinkedIn
 • Q&A


What is Helix




  Cluster management framework for distributed systems
  using a declarative state model




Distributed system examples




Motivation

 • A system starts out simple…
 • …but gets complex in the real world
 • …as you address real requirements
     • Scale
     • Failover
     • Bootstrapping

                          Application
                           client library

                           Call Routing
                             System

          Replica 1                         …
          Replica 2                         …
Motivation




 • These are cluster management problems
     • Scale
     • Failover
     • Bootstrapping
 • Helix solves them once…
 • …so you can focus on your system




Use-Case: Distributed Data Store

 • Distributed




                          P.1




      Node 1            Node 2     Node 3


Use-Case: Distributed Data Store

 • Distributed
 • Partitioned




  P.1    P.2     P.3   P.5     P.6    P.7   P.9    P.10    P.11
  P.4                  P.8                  P.12



        Node 1               Node 2               Node 3


Use-Case: Distributed Data Store

 • Distributed
 • Partitioned
 • Replicated




  P.1    P.2     P.3   P.5      P.6    P.7   P.9    P.10    P.11
  P.4    P.5     P.6   P.8      P.1    P.2   P.12   P.3     P.4
  P.9    P.10          P.11     P.12         P.7    P.8

        Node 1                Node 2               Node 3


Partition Layout

 • Highly Available
 • Master accepts writes
 • Balanced distribution
                                                            Master
                                                            Slave




  P.1    P.2     P.3   P.5      P.6    P.7   P.9    P.10      P.11
  P.4    P.5     P.6   P.8      P.1    P.2   P.12   P.3       P.4
  P.9    P.10          P.11     P.12         P.7    P.8

        Node 1                Node 2               Node 3


Failover




                                                            Master
                                                            Slave




  P.1    P.2     P.3   P.5      P.6    P.7   P.9    P.10      P.11
  P.4    P.5     P.6   P.8      P.1    P.2   P.12   P.3       P.4
  P.9    P.10          P.11     P.12         P.7    P.8

        Node 1                Node 2               Node 3
Add Capacity


  P.1    P.5     P.9
  P.10   P.12    P.8
                                                            Master
        Node 4                                              Slave




  P.1    P.2     P.3   P.5      P.6    P.7   P.9    P.10      P.11
  P.4    P.5     P.6   P.8      P.1    P.2   P.12   P.3       P.4
  P.9    P.10          P.11     P.12         P.7    P.8

        Node 1                Node 2               Node 3
Use-case requirements

  • Partition constraints
     • 1 master per partition
     • Balance partitions across cluster
     • No single-point-of-failure: replicas on different nodes
  • Handle failures: transfer mastership
  • Elasticity
     • Distribute workload across added nodes
     • Minimize partition movement
  • Meet SLAs
     • Throttle concurrent data movement
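
The partition constraints listed on this slide can be checked mechanically. The following is a minimal Python sketch, not Helix code: the layout format (a partition name mapped to a list of (node, role) pairs), the function name check_layout and the max_skew parameter are all assumptions made for illustration.

```python
from collections import Counter

def check_layout(layout, max_skew=1):
    """Return a list of constraint violations; an empty list means the layout passes.

    layout maps a partition name to a list of (node, role) pairs (assumed format).
    """
    violations = []
    replicas_per_node = Counter()
    for partition, replicas in sorted(layout.items()):
        nodes = [node for node, _role in replicas]
        masters = [node for node, role in replicas if role == "MASTER"]
        # Constraint: exactly 1 master per partition.
        if len(masters) != 1:
            violations.append(f"{partition}: expected 1 master, found {len(masters)}")
        # Constraint: replicas of one partition live on different nodes.
        if len(set(nodes)) != len(nodes):
            violations.append(f"{partition}: replicas share a node")
        replicas_per_node.update(nodes)
    # Constraint: balanced distribution, measured here as max minus min replica count.
    # Nodes that host nothing do not appear in the layout and are not counted.
    if replicas_per_node:
        skew = max(replicas_per_node.values()) - min(replicas_per_node.values())
        if skew > max_skew:
            violations.append(f"cluster: replica skew {skew} exceeds {max_skew}")
    return violations

# Hypothetical two-partition layout with three replicas each.
layout = {
    "P.1": [("node1", "MASTER"), ("node2", "SLAVE"), ("node3", "SLAVE")],
    "P.2": [("node2", "MASTER"), ("node3", "SLAVE"), ("node1", "SLAVE")],
}
print(check_layout(layout))
```

A check like this only validates a layout that already exists. Choosing a layout that also minimizes partition movement and throttles data transfer is the harder part, and this sketch does not attempt it.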




Generalizing cluster management



                   STATE MACHINE




          CONSTRAINTS              OBJECTIVE
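
One way to read the three labels on this slide is as plain data: a state machine (states and legal transitions), constraints (limits per state) and an objective (the state each replica should reach). The sketch below is illustrative Python under that reading. The OFFLINE to SLAVE to MASTER path follows the controller execution flow slide in this deck and the single-master limit follows the use-case requirements; the reverse transitions and the names TRANSITIONS, STATE_LIMITS, transition_path and within_limits are assumptions, not the Helix API.

```python
from collections import deque

# State machine: legal single-step transitions (downward steps are assumed).
TRANSITIONS = {
    "OFFLINE": ["SLAVE"],
    "SLAVE": ["MASTER", "OFFLINE"],
    "MASTER": ["SLAVE"],
}

# Constraint: at most this many replicas of one partition may hold a state.
STATE_LIMITS = {"MASTER": 1}

def transition_path(current, target):
    """Shortest list of states from current to target, or None if unreachable."""
    queue = deque([[current]])
    seen = {current}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in TRANSITIONS[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

def within_limits(replica_states):
    """True when no limited state is held by too many replicas of a partition."""
    return all(replica_states.count(state) <= limit
               for state, limit in STATE_LIMITS.items())

# Objective for one replica: reach MASTER starting from OFFLINE.
print(transition_path("OFFLINE", "MASTER"))
print(within_limits(["MASTER", "SLAVE", "SLAVE"]))
```

Keeping the model declarative means the same search and the same limit check work for any state model expressed in these two tables.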

Helix Based System Roles

                                                                                 PARTICIPANT
    IDEAL STATE

                                                                                 SPECTATOR
                                    Controller


                                                       Partition routing
                                                             logic
   CURRENT STATE
                         RESPONSE        COMMAND




   P.1     P.2     P.3          P.5        P.6   P.7       P.9       P.10  P.11

   P.4     P.5     P.6          P.8        P.1   P.2       P.12      P.3   P.4

   P.9     P.10                 P.11       P.12            P.7       P.8


         Node 1                       Node 2                     Node 3

Controller Execution Flow



             N1   P1   P2               SLAVE              N1   P1   P2
                                          S
             N2   P2   P3                                  N2   P2   P3


             N3   P3   P1                                  N3   P3   P1

                                                                           N1
                             O                        M
                            OFFLINE               MASTER

                                      REBALANCER                           N2

                                                            P1:OS
                                                           P1:SM
             N1   P1   P2

                                                                           N3
             N2   P2   P3
                                      ZooKeeper

SPECTATORS   N3   P3   P1



                                                           MESSAGE QUEUE
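
The flow on this slide can be approximated as a diff between the ideal state and the current state, emitting one transition message for every replica that is not yet where it should be. A minimal Python sketch follows. It assumes nested dicts for both states and a hypothetical NEXT_HOP table, and it models the idea only: it does not use the Helix API or ZooKeeper.

```python
# Next single-step state on the way from `have` to `want`.
# OFFLINE -> SLAVE -> MASTER follows the slide; the downward steps are assumed.
NEXT_HOP = {
    ("OFFLINE", "SLAVE"): "SLAVE",
    ("OFFLINE", "MASTER"): "SLAVE",
    ("SLAVE", "MASTER"): "MASTER",
    ("SLAVE", "OFFLINE"): "OFFLINE",
    ("MASTER", "SLAVE"): "SLAVE",
    ("MASTER", "OFFLINE"): "SLAVE",
}

def compute_messages(ideal, current):
    """Return (node, partition, from_state, to_state) tuples, one step per replica.

    Replicas present in `current` but absent from `ideal` are ignored here.
    """
    messages = []
    for node in sorted(ideal):
        for partition in sorted(ideal[node]):
            want = ideal[node][partition]
            have = current.get(node, {}).get(partition, "OFFLINE")
            if have != want:
                messages.append((node, partition, have, NEXT_HOP[(have, want)]))
    return messages

def apply_messages(current, messages):
    """Simulate participants executing each message and reporting the new state."""
    for node, partition, _from_state, to_state in messages:
        current.setdefault(node, {})[partition] = to_state
    return current

ideal = {"N1": {"P1": "MASTER", "P2": "SLAVE"}}
current = {"N1": {"P2": "SLAVE"}}

first = compute_messages(ideal, current)    # P1: OFFLINE -> SLAVE on N1
apply_messages(current, first)
second = compute_messages(ideal, current)   # P1: SLAVE -> MASTER on N1
apply_messages(current, second)
```

The two rounds correspond to the P1 messages shown in the diagram. A real controller would also have to respect the state constraints while it sequences messages, for example by demoting the old master before promoting a new one; that ordering is left out of this sketch.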
Controller fault tolerance




                             ZooKeeper




               Controller    Controller   Controller
                  1             2            3




               LEADER        STANDBY      STANDBY




Controller fault tolerance




                             ZooKeeper




               Controller    Controller   Controller
                  1             2            3




               OFFLINE       LEADER       STANDBY
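
The two slides show Controller 1 going OFFLINE and Controller 2 taking over as LEADER. A minimal Python sketch of that hand-over is below. It assumes leadership goes to the first live controller in registration order, which is one common ZooKeeper-style recipe and not necessarily the mechanism Helix uses; assign_roles and its arguments are hypothetical names.

```python
def assign_roles(controllers, live):
    """Map each controller to LEADER, STANDBY or OFFLINE.

    controllers: controller ids in registration order (assumed ordering rule).
    live: set of controller ids that currently hold a live session.
    """
    roles = {}
    leader_chosen = False
    for controller in controllers:
        if controller not in live:
            roles[controller] = "OFFLINE"
        elif not leader_chosen:
            # First live controller in order becomes the single leader.
            roles[controller] = "LEADER"
            leader_chosen = True
        else:
            roles[controller] = "STANDBY"
    return roles

controllers = ["controller1", "controller2", "controller3"]
before = assign_roles(controllers, live={"controller1", "controller2", "controller3"})
after = assign_roles(controllers, live={"controller2", "controller3"})
print(before)
print(after)
```

Because the roles are recomputed from the set of live controllers, a leader failure needs no special case: the next evaluation promotes a standby.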




Participant Plug-in code




Spectator Plug-in code




Benefits

 • Cluster operations “just work”
   – Bootstrapping
   – Failover
   – Add nodes
 • Global vs Local
   – Helix Controller
       • Global knowledge
       • Makes cluster decisions
   – Participant
       • Local knowledge
       • Follows orders




Consumer group




Consumer group: Scaling




Consumer group: Fault tolerance




Consumer group: state model


                   ONLINE     MAX=1




                   OFFLINE


Helix usage at LinkedIn (Pictures)

 • Espresso
   – a timeline-consistent, distributed data store
 • Databus
   – a change data capture service
 • Search as a Service
   – a multi-tenant service for multiple search applications
 • More planned




Summary

 • Building Distributed Data Systems is hard
   – Abstraction and modularity are key
 • Helix: A generic framework for cluster management
 • Simple programming model: declarative state machine




Helix: Future Roadmap


• Features
   • Span multiple data centers
   • Load balancing


• Announcement
   • Open source: https://github.com/linkedin/helix
   • Apache incubation
   • New contributors
Questions?






Apache Helix presentation at SOCC 2012


Editor's Notes

  1. Partitioned queue consumption: let's say there are 6 queues and some consumers to consume from these queues. The requirement is simple: the queues must be divided equally among the consumers. On top of that, we need partition affinity while consuming, instead of randomly picking from any queue.
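
The requirement in this note can be sketched as an assignment function that gives every queue exactly one owner (in line with the ONLINE MAX=1 constraint on the consumer group state model slide), spreads queues evenly, and keeps a queue with its previous consumer where possible. This is illustrative Python only: the function name assign, the previous mapping and the tie-breaking rule are assumptions, not how Helix computes the assignment.

```python
def assign(queues, consumers, previous=None):
    """Return {consumer: [queues]} with an even split and sticky ownership.

    previous maps queue -> consumer from the last assignment (assumed input).
    """
    if not consumers:
        raise ValueError("at least one consumer is required")
    previous = previous or {}
    assignment = {consumer: [] for consumer in consumers}
    cap = -(-len(queues) // len(consumers))  # ceiling of queues per consumer
    unplaced = []
    for queue in queues:
        owner = previous.get(queue)
        # Affinity: keep the old owner if it is still alive and under the cap.
        if owner in assignment and len(assignment[owner]) < cap:
            assignment[owner].append(queue)
        else:
            unplaced.append(queue)
    for queue in unplaced:
        # Balance: hand the queue to the least-loaded consumer (first one on ties).
        target = min(consumers, key=lambda consumer: len(assignment[consumer]))
        assignment[target].append(queue)
    return assignment

queues = ["q1", "q2", "q3", "q4", "q5", "q6"]
two = assign(queues, ["c1", "c2"])
owners = {queue: consumer for consumer, held in two.items() for queue in held}
three = assign(queues, ["c1", "c2", "c3"], previous=owners)
print(two)
print(three)
```

With six queues, going from two consumers to three moves only the two queues needed to give the new consumer its share; the other four stay where they were.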