© Hortonworks Inc. 2017
Apache Ratis
In Search of a Usable Raft Library
Tsz-Wo Nicholas Sze
11/15/2017
Brown Bag Lunch Talk
About Me
• Tsz-Wo Nicholas Sze, Ph.D.
– Software Engineer at Hortonworks
– PMC member/Committer of Apache Hadoop
– Active contributor and committer of Apache Ratis
– Ph.D. from University of Maryland, College Park
– MPhil & BEng from Hong Kong University of Sci & Tech
Architecting the Future of Big Data
Agenda
• A brief introduction to Raft
• Apache Ratis
– Features and use cases
– Demo
• Project Status
– Current development
– Future work
Consensus
• What is consensus?
– Multiple servers agree on a value
• Typical use cases
– Log replication
– Replicated state machines
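The replicated state machine idea can be sketched in a few lines. This is only an illustration, not Raft itself and not any Ratis API: the `Replica` class and the `(key, value)` command format are made up for the example. The point it shows is that if every server applies the same log in the same order to a deterministic state machine, all servers end in the same state.

```python
class Replica:
    """Hypothetical replica: a log plus a state machine (here, a dict)."""

    def __init__(self):
        self.log = []      # ordered commands agreed on by the consensus layer
        self.state = {}    # state derived only from applying the log

    def append(self, command):
        self.log.append(command)

    def apply_all(self):
        # Deterministic replay: the same log always produces the same state.
        self.state = {}
        for key, value in self.log:
            self.state[key] = value
        return self.state


# The job of consensus is to make every replica hold this same sequence.
agreed_log = [("x", 1), ("y", 2), ("x", 3)]

replicas = [Replica() for _ in range(3)]
for replica in replicas:
    for command in agreed_log:
        replica.append(command)

states = [replica.apply_all() for replica in replicas]
```

Agreeing on the log order is the hard part that Paxos and Raft solve; the replay itself is trivial once the order is fixed.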
Consensus Algorithms
• Paxos (1990)
– Works, but hard to understand
– Hard to implement (correctly)
• Raft (2014)
• “In Search of an Understandable Consensus Algorithm”
– by Diego Ongaro and John Ousterhout
– USENIX ATC’14, https://raft.github.io
• Motivations
– Easy to understand
– Easy to prove
– Easy to implement
Raft Basics
• Leader Election
– Every server starts as a Follower
– After a randomized timeout, a Follower becomes a Candidate and starts a leader election
• Candidate sends requestVote to other servers
– It becomes the leader once it gets a majority of the votes.
• Append Entries
– Clients send requests to the Leader
– Leader forwards the requests to the Followers
• Leader sends appendEntries to Followers
– When there are no client requests, the Leader also sends empty appendEntries to Followers to maintain leadership
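The election steps above can be sketched as a small single-process simulation. This is a simplified illustration under stated assumptions, not the Ratis implementation: the `Server` class, the `run_election` helper and the 150 to 300 ms timeout range are chosen for the example, and the vote handler omits the log up-to-date check that real Raft requires.

```python
import random


class Server:
    def __init__(self, server_id):
        self.server_id = server_id
        self.role = "FOLLOWER"     # every server starts as a Follower
        self.current_term = 0
        self.voted_for = None      # at most one vote per term

    def request_vote(self, term, candidate_id):
        """Simplified requestVote handler; returns True if the vote is granted."""
        if term > self.current_term:
            # A newer term resets the vote and forces the Follower role.
            self.current_term = term
            self.voted_for = None
            self.role = "FOLLOWER"
        if term == self.current_term and self.voted_for in (None, candidate_id):
            self.voted_for = candidate_id
            return True
        return False


def run_election(servers, rng):
    # Randomized timeouts make it likely that one server times out first.
    timeouts = {s.server_id: rng.uniform(150, 300) for s in servers}
    candidate = min(servers, key=lambda s: timeouts[s.server_id])

    candidate.role = "CANDIDATE"
    candidate.current_term += 1
    candidate.voted_for = candidate.server_id
    votes = 1  # the Candidate votes for itself

    for peer in servers:
        if peer is not candidate and peer.request_vote(
                candidate.current_term, candidate.server_id):
            votes += 1

    if votes > len(servers) // 2:   # strict majority of the group
        candidate.role = "LEADER"
    return candidate, votes


servers = [Server(i) for i in range(5)]
leader, votes = run_election(servers, random.Random(42))
```

In a real deployment the votes arrive over RPC and some may be lost or denied, which is why only a majority, not all votes, is required.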
Raft Library
• Our Motivations
– Use Raft in Ozone
• “In Search of a Usable Raft Library”
– A long list of Raft implementations is available
– None of them is a general library ready to be consumed by other projects.
– Most of them are tied to another project or a part of another project.
• We need a Raft library!
Apache Ratis – A Raft Library
• A brand new, incubating Apache project
– Open source, open development
– Apache License 2.0
– Written in Java 8
• Contributions are welcome!
Ratis: Standard Raft Features
• Leader Election + Log Replication
– Automatically elect a leader among the servers in a Raft group
– Randomized timeout for avoiding split votes
– Log is replicated in the Raft group
• Membership Changes
– Members in a Raft group can be reconfigured at runtime
– Replication factor can be changed at runtime
• Log Compaction
– Snapshot is taken periodically
– Send snapshot instead of a long log history.
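Log compaction can be illustrated with a toy log. This is a sketch under simplifying assumptions, not how Ratis stores its log or snapshots: `CompactingLog` and its methods are hypothetical, the state machine is a plain dict, and every appended entry is treated as already committed. It shows a snapshot replacing a prefix of the log, and a lagging follower receiving the snapshot instead of the compacted history.

```python
class CompactingLog:
    """Toy log with snapshot-based compaction (all entries assumed committed)."""

    def __init__(self):
        self.entries = []          # (index, key, value); indices start at 1
        self.snapshot_state = {}   # state machine contents as of snapshot_index
        self.snapshot_index = 0    # last log index folded into the snapshot

    def last_index(self):
        return self.entries[-1][0] if self.entries else self.snapshot_index

    def append(self, key, value):
        index = self.last_index() + 1
        self.entries.append((index, key, value))
        return index

    def take_snapshot(self):
        # Fold every entry into the snapshot, then drop the entries.
        for index, key, value in self.entries:
            self.snapshot_state[key] = value
            self.snapshot_index = index
        self.entries = []

    def catch_up(self, follower_index):
        """Return (snapshot, entries) to send to a follower at follower_index."""
        if follower_index < self.snapshot_index:
            # The entries the follower needs were compacted away.
            snapshot = (self.snapshot_index, dict(self.snapshot_state))
            return snapshot, list(self.entries)
        return None, [e for e in self.entries if e[0] > follower_index]


log = CompactingLog()
for i in range(1, 6):
    log.append("k%d" % (i % 2), i)
log.take_snapshot()
log.append("k2", 6)

snapshot, entries = log.catch_up(0)       # far behind: gets the snapshot
no_snapshot, tail = log.catch_up(5)       # up to date with the snapshot
```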
Ratis: Pluggability
• Pluggable state machine
– Application must define its state machine
– Example: a key-value map
• Pluggable RPC
– Users may provide their own RPC implementation
– Default implementations: gRPC, Netty, Hadoop RPC
• Pluggable Raft log
– Users may provide their own log implementation
– The default implementation stores the log in local files
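The pluggable state machine idea can be sketched as follows. The `StateMachine` interface, `KeyValueStateMachine` and `ToyServer` below are hypothetical names for illustration; the actual Ratis interfaces are Java and differ in shape. The sketch only shows the division of labor: the library side knows the interface, and the application supplies the implementation (here, the key-value map mentioned above).

```python
from abc import ABC, abstractmethod


class StateMachine(ABC):
    """Hypothetical plug-in interface; not the real Ratis interface."""

    @abstractmethod
    def apply(self, entry):
        """Apply one committed log entry and return a reply."""

    @abstractmethod
    def query(self, request):
        """Answer a read-only request from the current state."""


class KeyValueStateMachine(StateMachine):
    def __init__(self):
        self.data = {}

    def apply(self, entry):
        op, key, value = entry
        if op == "put":
            self.data[key] = value
            return "OK"
        if op == "delete":
            if key in self.data:
                del self.data[key]
                return "OK"
            return "NOT_FOUND"
        raise ValueError("unknown operation: %r" % (op,))

    def query(self, request):
        return self.data.get(request)


class ToyServer:
    """Stand-in for the library side: it only sees the StateMachine interface."""

    def __init__(self, state_machine):
        self.state_machine = state_machine
        self.log = []

    def submit(self, entry):
        # A real library would replicate and commit the entry before applying it.
        self.log.append(entry)
        return self.state_machine.apply(entry)


server = ToyServer(KeyValueStateMachine())
replies = [
    server.submit(("put", "a", 1)),
    server.submit(("put", "b", 2)),
    server.submit(("delete", "a", None)),
    server.submit(("delete", "zzz", None)),
]
```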
Data Intensive Applications
• In Raft,
– All transactions and the data are written in the log
– Not suitable for data intensive applications
• In Ratis
– An application can choose not to write all the data to the log
– State machine data and log data can be separately managed
– See the FileStore example in ratis-example
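One possible way to keep bulk data out of the log is sketched below. This is an assumption made for illustration, not a description of how the FileStore example does it: `DataStore` and `MetadataLog` are made-up classes, and the choice of a SHA-256 digest as the reference is arbitrary. The log records only a short descriptor, while the payload lives in storage managed by the state machine.

```python
import hashlib


class DataStore:
    """Hypothetical state-machine-managed storage for bulk data (in memory)."""

    def __init__(self):
        self.blobs = {}

    def put(self, data):
        digest = hashlib.sha256(data).hexdigest()
        self.blobs[digest] = data
        return digest

    def get(self, digest):
        return self.blobs[digest]


class MetadataLog:
    """Toy log that records small descriptors, never the payload itself."""

    def __init__(self):
        self.entries = []

    def append(self, name, digest, length):
        self.entries.append({"name": name, "sha256": digest, "length": length})


store = DataStore()
log = MetadataLog()

payload = b"x" * 1_000_000                      # 1 MB of application data
digest = store.put(payload)                     # bulk data goes to the state machine
log.append("big-file", digest, len(payload))    # the log gets a short record

entry = log.entries[0]
restored = store.get(entry["sha256"])
```

The log entry stays a few dozen bytes regardless of the payload size, which is the property that matters for data intensive applications.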
Ratis: Asynchronous APIs
• Using gRPC bi-directional stream API
– Netty and Hadoop RPC can support async but it is not yet implemented
• Server-to-server
– Asynchronous append entries
• Client-to-server
– Asynchronous client requests
– RATIS-113: just committed yesterday (11/14/2017)
– Need to fix out-of-order issues RATIS-140 and RATIS-141
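Asynchronous requests can arrive in a different order than they were sent. A common, generic way to restore order is to tag each request with a sequence number and buffer early arrivals. The sketch below shows that technique only; it is not the fix adopted in RATIS-140 or RATIS-141, and `InOrderDelivery` is a hypothetical name.

```python
class InOrderDelivery:
    """Delivers asynchronously arriving requests in sequence-number order."""

    def __init__(self, first_seq=0):
        self.next_seq = first_seq   # the next sequence number to deliver
        self.pending = {}           # early arrivals, keyed by sequence number
        self.delivered = []

    def receive(self, seq, request):
        self.pending[seq] = request
        # Deliver every request that is now contiguous with what came before.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1


receiver = InOrderDelivery()
for seq, request in [(1, "b"), (0, "a"), (3, "d"), (2, "c")]:
    receiver.receive(seq, request)
```

A production version would also bound the buffer size and handle retransmitted duplicates, both of which are left out here.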
General Ratis Use Cases
• You already have a service running on a single server.
• You want to:
– (1) replicate the server log/states to multiple machines
• The replication number/cluster membership can be changed at runtime
• It can tolerate server failures.
• or
– (2) have an HA (highly available) service
• When a server fails, another server will automatically take over.
• Clients automatically fail over to the new server.
• Apache Ratis is for you!
Ozone/HDFS Use Cases
• Replicating open containers
– HDFS-11519
• Ozone: Implement XceiverServerSpi and XceiverClientSpi using Ratis
• Committed on 4/3/2017
• Support HA in SCM
– HDFS-11443
• Ozone: Support SCM multiple instances for HA
• Replacing the current Namenode HA solution
– No JIRA yet
Example: ArithmeticStateMachine
• Maintain a variable map (variable -> value)
– Users define the variables and expressions
– Then, submit assignment messages
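A variable map driven by assignment messages can be sketched like this. It is an illustration only and not the ArithmeticStateMachine from the Ratis examples: the class name `ArithmeticSketch`, the textual message format and the supported operators are assumptions made for the example.

```python
import ast
import math
import operator

_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}


class ArithmeticSketch:
    """Keeps a variable -> value map, updated by assignment messages."""

    def __init__(self):
        self.variables = {}

    def assign(self, message):
        """Handle an assignment message such as 'c = sqrt(a*a + b*b)'."""
        name, expression = (part.strip() for part in message.split("=", 1))
        value = self._evaluate(ast.parse(expression, mode="eval").body)
        self.variables[name] = value
        return value

    def _evaluate(self, node):
        # Walk the parsed expression; only a small, safe subset is accepted.
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return self.variables[node.id]   # KeyError if the variable is undefined
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](
                self._evaluate(node.left), self._evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -self._evaluate(node.operand)
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id == "sqrt" and len(node.args) == 1):
            return math.sqrt(self._evaluate(node.args[0]))
        raise ValueError("unsupported expression")


machine = ArithmeticSketch()
machine.assign("a = 3")
machine.assign("b = 4")
c = machine.assign("c = sqrt(a*a + b*b)")
```

Because each assignment is deterministic given the current map, replaying the same messages in the same order on every server yields the same variables everywhere.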
Demo: ArithmeticStateMachine
• Pythagorean
– Put a = 3 and b = 4
– Find c
• Special thanks to Marton Elek for working on RATIS-95 (CLI to run examples) so that this demo is possible!
Gauss-Legendre
• A.k.a. the arithmetic–geometric mean method
• A fast algorithm to compute pi
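For reference, the Gauss-Legendre iteration itself is short. The sketch below is a standalone version using Python's `decimal` module, not the demo's state machine messages; the function name `gauss_legendre_pi` and the choice of ten guard digits are assumptions. Starting from a = 1, b = 1/√2, t = 1/4, p = 1, each round computes the arithmetic and geometric means, and π is approximated by (a + b)² / (4t).

```python
from decimal import Decimal, localcontext


def gauss_legendre_pi(digits):
    """Approximate pi to roughly `digits` significant digits."""
    with localcontext() as ctx:
        ctx.prec = digits + 10            # a few guard digits against rounding
        a = Decimal(1)
        b = Decimal(1) / Decimal(2).sqrt()
        t = Decimal(1) / Decimal(4)
        p = Decimal(1)
        # The number of correct digits roughly doubles with every iteration.
        for _ in range(max(1, digits.bit_length())):
            a_next = (a + b) / 2          # arithmetic mean
            b = (a * b).sqrt()            # geometric mean (uses the old a)
            t -= p * (a - a_next) ** 2
            a = a_next
            p *= 2
        return (a + b) ** 2 / (4 * t)


pi_50 = gauss_legendre_pi(50)
```

Six iterations are enough for 50 digits here, which is what makes the method a "fast" way to compute π.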
Example: FileStore (RATIS-122)
• Maintain a file map (key -> file)
• Support only
– Read, Write, Delete
• But not other operations such as
– List, Rename, etc.
• Asynchronous & In-order
– A client may submit multiple write requests
• Write to multiple files at the same time
• Each file may have multiple write requests
• File data is managed by the state machine
– But not stored in the Raft log
Ratis: Development Status
• A brief history
– 2016-03: Project started at Hortonworks
– 2016-04: First commit “leader election (without tests)”
– 2017-01: Entered Apache incubation.
– 2017-03: Started preparing the first Alpha release (RATIS-53).
– 2017-04: Hadoop Ozone branch started using Ratis (HDFS-11519)!
– 2017-05: First Release 0.1.0-alpha
– 2017-11: Preparing the second Release 0.2.0-alpha
Work in Progress
• Multi-Raft
– General idea: Allow a server to join multiple Raft groups (RATIS-91)
• When the number of groups is large, it is hard to manage the logs
• May assume SSD for log storage
– Short term goal: Allow a server to join a small number of groups
• “Small” means 2 or 3
• Use a small number of pre-configured storage locations
• Big benefit: servers could transition from one group to another group
• Replicated Map (RATIS-40)
– A sorted map similar to ZooKeeper/etcd/LogCabin
– Intended to be used in production
Ratis: Future Work
• Performance
– Fix gRPC async bugs: RATIS-140 and RATIS-141
– Use FileStore as a benchmark
• Metrics
• Security
• API Specification
• Documentation
– Project web site
• Jenkins Builds
– Checkstyle configuration
Contributors
– Animesh Trivedi, Anu Engineer, Arpit Agarwal, Brent,
– Chen Liang, Chris Nauroth, Devaraj Das, Enis Soztutar,
– garvit, Hanisha Koneru, Hugo Louro, Jakob Homan,
– Jian He, Jing Chen, Jing Zhao, Jitendra Pandey, Junping Du,
– kaiyangzhang, Karl Heinz Marbaise, Li Lu, Lokesh Jain,
– Marton Elek, Mayank Bansal, Mingliang Liu,
– Mukul Kumar Singh, Sen Zhang, Sriharsha Chintalapani,
– Tsz Wo Nicholas Sze, Uma Maheswara Rao G,
– Venkat Ranganathan, Wangda Tan, Weiqing Yang,
– Will Xu, Xiaobing Zhou, Xiaoyu Yao, Yubo Xu, yue liu,
– Zhiyuan Yang
Thank you!
Contributions are welcome!
• http://incubator.apache.org/projects/ratis.html
• dev@ratis.incubator.apache.org
Architecting the Future of Big Data
Page 23

More Related Content

What's hot

Introduction to DataFusion An Embeddable Query Engine Written in Rust
Introduction to DataFusion  An Embeddable Query Engine Written in RustIntroduction to DataFusion  An Embeddable Query Engine Written in Rust
Introduction to DataFusion An Embeddable Query Engine Written in RustAndrew Lamb
 
Room 1 - 3 - Lê Anh Tuấn - Build a High Performance Identification at GHTK wi...
Room 1 - 3 - Lê Anh Tuấn - Build a High Performance Identification at GHTK wi...Room 1 - 3 - Lê Anh Tuấn - Build a High Performance Identification at GHTK wi...
Room 1 - 3 - Lê Anh Tuấn - Build a High Performance Identification at GHTK wi...Vietnam Open Infrastructure User Group
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to RedisDvir Volk
 
HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseHBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseenissoz
 
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...GetInData
 
State of the Trino Project
State of the Trino ProjectState of the Trino Project
State of the Trino ProjectMartin Traverso
 
Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...
Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...
Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...HostedbyConfluent
 
Apache Arrow Workshop at VLDB 2019 / BOSS Session
Apache Arrow Workshop at VLDB 2019 / BOSS SessionApache Arrow Workshop at VLDB 2019 / BOSS Session
Apache Arrow Workshop at VLDB 2019 / BOSS SessionWes McKinney
 
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...Flink Forward
 
Iceberg: a fast table format for S3
Iceberg: a fast table format for S3Iceberg: a fast table format for S3
Iceberg: a fast table format for S3DataWorks Summit
 
Apache Tez - A unifying Framework for Hadoop Data Processing
Apache Tez - A unifying Framework for Hadoop Data ProcessingApache Tez - A unifying Framework for Hadoop Data Processing
Apache Tez - A unifying Framework for Hadoop Data ProcessingDataWorks Summit
 
Securing Hadoop with Apache Ranger
Securing Hadoop with Apache RangerSecuring Hadoop with Apache Ranger
Securing Hadoop with Apache RangerDataWorks Summit
 
2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da...
2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da...2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da...
2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da...Andrew Lamb
 
Getting Started with Infrastructure as Code
Getting Started with Infrastructure as CodeGetting Started with Infrastructure as Code
Getting Started with Infrastructure as CodeWinWire Technologies Inc
 
How is Kafka so Fast?
How is Kafka so Fast?How is Kafka so Fast?
How is Kafka so Fast?Ricardo Paiva
 
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark ApplicationsTop 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark ApplicationsCloudera, Inc.
 
Maxim Fateev - Beyond the Watermark- On-Demand Backfilling in Flink
Maxim Fateev - Beyond the Watermark- On-Demand Backfilling in FlinkMaxim Fateev - Beyond the Watermark- On-Demand Backfilling in Flink
Maxim Fateev - Beyond the Watermark- On-Demand Backfilling in FlinkFlink Forward
 

What's hot (20)

Introduction to DataFusion An Embeddable Query Engine Written in Rust
Introduction to DataFusion  An Embeddable Query Engine Written in RustIntroduction to DataFusion  An Embeddable Query Engine Written in Rust
Introduction to DataFusion An Embeddable Query Engine Written in Rust
 
Room 1 - 3 - Lê Anh Tuấn - Build a High Performance Identification at GHTK wi...
Room 1 - 3 - Lê Anh Tuấn - Build a High Performance Identification at GHTK wi...Room 1 - 3 - Lê Anh Tuấn - Build a High Performance Identification at GHTK wi...
Room 1 - 3 - Lê Anh Tuấn - Build a High Performance Identification at GHTK wi...
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseHBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBase
 
Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
 
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
 
State of the Trino Project
State of the Trino ProjectState of the Trino Project
State of the Trino Project
 
Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...
Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...
Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...
 
Apache Arrow Workshop at VLDB 2019 / BOSS Session
Apache Arrow Workshop at VLDB 2019 / BOSS SessionApache Arrow Workshop at VLDB 2019 / BOSS Session
Apache Arrow Workshop at VLDB 2019 / BOSS Session
 
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
 
Iceberg: a fast table format for S3
Iceberg: a fast table format for S3Iceberg: a fast table format for S3
Iceberg: a fast table format for S3
 
Apache Tez - A unifying Framework for Hadoop Data Processing
Apache Tez - A unifying Framework for Hadoop Data ProcessingApache Tez - A unifying Framework for Hadoop Data Processing
Apache Tez - A unifying Framework for Hadoop Data Processing
 
Securing Hadoop with Apache Ranger
Securing Hadoop with Apache RangerSecuring Hadoop with Apache Ranger
Securing Hadoop with Apache Ranger
 
2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da...
2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da...2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da...
2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da...
 
Integrating Apache Spark and NiFi for Data Lakes
Integrating Apache Spark and NiFi for Data LakesIntegrating Apache Spark and NiFi for Data Lakes
Integrating Apache Spark and NiFi for Data Lakes
 
Getting Started with Infrastructure as Code
Getting Started with Infrastructure as CodeGetting Started with Infrastructure as Code
Getting Started with Infrastructure as Code
 
How is Kafka so Fast?
How is Kafka so Fast?How is Kafka so Fast?
How is Kafka so Fast?
 
Grafana
GrafanaGrafana
Grafana
 
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark ApplicationsTop 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
 
Maxim Fateev - Beyond the Watermark- On-Demand Backfilling in Flink
Maxim Fateev - Beyond the Watermark- On-Demand Backfilling in FlinkMaxim Fateev - Beyond the Watermark- On-Demand Backfilling in Flink
Maxim Fateev - Beyond the Watermark- On-Demand Backfilling in Flink
 

Similar to Apache Ratis - In Search of a Usable Raft Library

HDFS- What is New and Future
HDFS- What is New and FutureHDFS- What is New and Future
HDFS- What is New and FutureDataWorks Summit
 
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage SchemesScaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage SchemesDataWorks Summit
 
Mapreduce over snapshots
Mapreduce over snapshotsMapreduce over snapshots
Mapreduce over snapshotsenissoz
 
Democratizing Memory Storage
Democratizing Memory StorageDemocratizing Memory Storage
Democratizing Memory StorageDataWorks Summit
 
Hadoop operations-2014-strata-new-york-v5
Hadoop operations-2014-strata-new-york-v5Hadoop operations-2014-strata-new-york-v5
Hadoop operations-2014-strata-new-york-v5Chris Nauroth
 
February 2014 HUG : Tez Details and Insides
February 2014 HUG : Tez Details and InsidesFebruary 2014 HUG : Tez Details and Insides
February 2014 HUG : Tez Details and InsidesYahoo Developer Network
 
La big datacamp2014_vikram_dixit
La big datacamp2014_vikram_dixitLa big datacamp2014_vikram_dixit
La big datacamp2014_vikram_dixitData Con LA
 
Design Choices for Cloud Data Platforms
Design Choices for Cloud Data PlatformsDesign Choices for Cloud Data Platforms
Design Choices for Cloud Data PlatformsAshish Mrig
 
Big Data Day LA 2015 - What's new and next in Apache Tez by Bikas Saha of Hor...
Big Data Day LA 2015 - What's new and next in Apache Tez by Bikas Saha of Hor...Big Data Day LA 2015 - What's new and next in Apache Tez by Bikas Saha of Hor...
Big Data Day LA 2015 - What's new and next in Apache Tez by Bikas Saha of Hor...Data Con LA
 
Big Data Conference April 2015
Big Data Conference April 2015Big Data Conference April 2015
Big Data Conference April 2015Aaron Benz
 
Hadoop Summit Europe 2015 - YARN Present and Future
Hadoop Summit Europe 2015 - YARN Present and FutureHadoop Summit Europe 2015 - YARN Present and Future
Hadoop Summit Europe 2015 - YARN Present and FutureVinod Kumar Vavilapalli
 
Apache Hadoop YARN 2015: Present and Future
Apache Hadoop YARN 2015: Present and FutureApache Hadoop YARN 2015: Present and Future
Apache Hadoop YARN 2015: Present and FutureDataWorks Summit
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impalamarkgrover
 
Apache Hadoop YARN: best practices
Apache Hadoop YARN: best practicesApache Hadoop YARN: best practices
Apache Hadoop YARN: best practicesDataWorks Summit
 
Big data processing engines, Atlanta Meetup 4/30
Big data processing engines, Atlanta Meetup 4/30Big data processing engines, Atlanta Meetup 4/30
Big data processing engines, Atlanta Meetup 4/30Ashish Narasimham
 
YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez Hortonworks
 
Apache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with HadoopApache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with HadoopHortonworks
 
What's New in Apache Hive 3.0?
What's New in Apache Hive 3.0?What's New in Apache Hive 3.0?
What's New in Apache Hive 3.0?DataWorks Summit
 
What's New in Apache Hive 3.0 - Tokyo
What's New in Apache Hive 3.0 - TokyoWhat's New in Apache Hive 3.0 - Tokyo
What's New in Apache Hive 3.0 - TokyoDataWorks Summit
 

Similar to Apache Ratis - In Search of a Usable Raft Library (20)

HDFS- What is New and Future
HDFS- What is New and FutureHDFS- What is New and Future
HDFS- What is New and Future
 
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage SchemesScaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
 
Mapreduce over snapshots
Mapreduce over snapshotsMapreduce over snapshots
Mapreduce over snapshots
 
Democratizing Memory Storage
Democratizing Memory StorageDemocratizing Memory Storage
Democratizing Memory Storage
 
Hadoop operations-2014-strata-new-york-v5
Hadoop operations-2014-strata-new-york-v5Hadoop operations-2014-strata-new-york-v5
Hadoop operations-2014-strata-new-york-v5
 
Containers and Big Data
Containers and Big Data Containers and Big Data
Containers and Big Data
 
February 2014 HUG : Tez Details and Insides
February 2014 HUG : Tez Details and InsidesFebruary 2014 HUG : Tez Details and Insides
February 2014 HUG : Tez Details and Insides
 
La big datacamp2014_vikram_dixit
La big datacamp2014_vikram_dixitLa big datacamp2014_vikram_dixit
La big datacamp2014_vikram_dixit
 
Design Choices for Cloud Data Platforms
Design Choices for Cloud Data PlatformsDesign Choices for Cloud Data Platforms
Design Choices for Cloud Data Platforms
 
Big Data Day LA 2015 - What's new and next in Apache Tez by Bikas Saha of Hor...
Big Data Day LA 2015 - What's new and next in Apache Tez by Bikas Saha of Hor...Big Data Day LA 2015 - What's new and next in Apache Tez by Bikas Saha of Hor...
Big Data Day LA 2015 - What's new and next in Apache Tez by Bikas Saha of Hor...
 
Big Data Conference April 2015
Big Data Conference April 2015Big Data Conference April 2015
Big Data Conference April 2015
 
Hadoop Summit Europe 2015 - YARN Present and Future
Hadoop Summit Europe 2015 - YARN Present and FutureHadoop Summit Europe 2015 - YARN Present and Future
Hadoop Summit Europe 2015 - YARN Present and Future
 
Apache Hadoop YARN 2015: Present and Future
Apache Hadoop YARN 2015: Present and FutureApache Hadoop YARN 2015: Present and Future
Apache Hadoop YARN 2015: Present and Future
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impala
 
Apache Hadoop YARN: best practices
Apache Hadoop YARN: best practicesApache Hadoop YARN: best practices
Apache Hadoop YARN: best practices
 
Big data processing engines, Atlanta Meetup 4/30
Big data processing engines, Atlanta Meetup 4/30Big data processing engines, Atlanta Meetup 4/30
Big data processing engines, Atlanta Meetup 4/30
 
YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez
 
Apache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with HadoopApache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with Hadoop
 
What's New in Apache Hive 3.0?
What's New in Apache Hive 3.0?What's New in Apache Hive 3.0?
What's New in Apache Hive 3.0?
 
What's New in Apache Hive 3.0 - Tokyo
What's New in Apache Hive 3.0 - TokyoWhat's New in Apache Hive 3.0 - Tokyo
What's New in Apache Hive 3.0 - Tokyo
 

Recently uploaded

Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)jennyeacort
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作qr0udbr0
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样umasea
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanyChristoph Pohl
 
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....kzayra69
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfAlina Yurenko
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based projectAnoyGreter
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Andreas Granig
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...OnePlan Solutions
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmSujith Sukumaran
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024StefanoLambiase
 

Recently uploaded (20)

Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....

Apache Ratis - In Search of a Usable Raft Library

  • 1. © Hortonworks Inc. 2017 Apache Ratis: In Search of a Usable Raft Library. Tsz-Wo Nicholas Sze, 11/15/2017, Brown Bag Lunch Talk
  • 2. About Me • Tsz-Wo Nicholas Sze, Ph.D. – Software Engineer at Hortonworks – PMC member/Committer of Apache Hadoop – Active contributor and committer of Apache Ratis – Ph.D. from University of Maryland, College Park – MPhil & BEng from Hong Kong University of Sci & Tech
  • 3. Agenda • A brief introduction to Raft • Apache Ratis – Features and use cases – Demo • Project Status – Current development – Future work
  • 4. Consensus • What is consensus? – Multiple servers agree on a value • Typical use cases – Log replication – Replicated state machines
  • 5. Consensus Algorithms • Paxos (1990) – Works but is hard to understand – Hard to implement (correctly) • Raft (2014) – “In Search of an Understandable Consensus Algorithm” by Diego Ongaro and John Ousterhout – USENIX ATC’14, https://raft.github.io • Motivations – Easy to understand – Easy to prove – Easy to implement
  • 6. Raft Basics • Leader Election – Every server starts as a Follower – A Follower times out after a randomized interval, becomes a Candidate and starts a leader election • The Candidate sends requestVote to the other servers – It becomes the Leader once it gets a majority of the votes • Append Entries – Clients send requests to the Leader – The Leader forwards the requests to the Followers • The Leader sends appendEntries to the Followers – When there are no client requests, the Leader also sends empty appendEntries to the Followers to maintain leadership
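The election rules on slide 6 can be sketched as a toy model. This is not Ratis code: the class and method names are hypothetical, and the 150 to 300 ms timeout range is an assumption chosen only for illustration.

```java
import java.util.Random;

/** Toy model of the election rules on slide 6; hypothetical names, not Ratis code. */
final class ElectionSketch {
    enum Role { FOLLOWER, CANDIDATE, LEADER }

    /** Picks a timeout uniformly in [minMs, maxMs) so servers rarely time out together. */
    static int randomizedTimeoutMs(Random random, int minMs, int maxMs) {
        return minMs + random.nextInt(maxMs - minMs);
    }

    /** A candidate wins once it holds votes from a strict majority of the group. */
    static boolean hasMajority(int votesGranted, int groupSize) {
        return votesGranted > groupSize / 2;
    }

    /** Role of a candidate after tallying requestVote replies; its own vote is counted. */
    static Role afterElection(int votesFromPeers, int groupSize) {
        int votes = 1 + votesFromPeers;
        return hasMajority(votes, groupSize) ? Role.LEADER : Role.CANDIDATE;
    }

    public static void main(String[] args) {
        // The 150-300 ms range is an assumed example value.
        Random random = new Random();
        System.out.println("timeout = " + randomizedTimeoutMs(random, 150, 300) + " ms");
        System.out.println(afterElection(1, 3)); // 2 of 3 votes
        System.out.println(afterElection(1, 5)); // 2 of 5 votes
    }
}
```

With two of three votes the candidate becomes the leader; with two of five it stays a candidate and would retry after another randomized timeout.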
  • 7. Raft Library • Our Motivations – Use Raft in Ozone • “In Search of a Usable Raft Library” – A long list of Raft implementations is available – None of them is a general library ready to be consumed by other projects – Most of them are tied to, or are a part of, another project • We need a Raft library!
  • 8. Apache Ratis – A Raft Library • A brand new, incubating Apache project – Open source, open development – Apache License 2.0 – Written in Java 8 • Contributions are welcome!
  • 9. Ratis: Standard Raft Features • Leader Election + Log Replication – Automatically elect a leader among the servers in a Raft group – Randomized timeout for avoiding split votes – Log is replicated in the Raft group • Membership Changes – Members in a Raft group can be reconfigured at runtime – Replication factor can be changed at runtime • Log Compaction – A snapshot is taken periodically – Send a snapshot instead of a long log history
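The log compaction idea on slide 9 can be sketched as follows. This is a simplified in-memory model with hypothetical names, assuming the state is a key-value map; it is not how Ratis stores its log or snapshots.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy snapshot-based log compaction; hypothetical names, not Ratis code. */
final class LogCompactionSketch {
    private final List<String[]> log = new ArrayList<>();          // each entry is {key, value}
    private final Map<String, String> snapshot = new HashMap<>();  // state covered by the snapshot
    private long snapshotIndex = 0;                                // entries folded into the snapshot

    void append(String key, String value) {
        log.add(new String[] {key, value});
    }

    /** Folds every logged entry into the snapshot, then drops those entries. */
    void takeSnapshot() {
        for (String[] entry : log) {
            snapshot.put(entry[0], entry[1]);
        }
        snapshotIndex += log.size();
        log.clear();
    }

    /** Rebuilds the state as a lagging server would: snapshot first, then the remaining log. */
    Map<String, String> currentState() {
        Map<String, String> state = new HashMap<>(snapshot);
        for (String[] entry : log) {
            state.put(entry[0], entry[1]);
        }
        return state;
    }

    int logSize() { return log.size(); }

    long snapshotIndex() { return snapshotIndex; }

    public static void main(String[] args) {
        LogCompactionSketch server = new LogCompactionSketch();
        server.append("x", "1");
        server.append("x", "2");
        server.takeSnapshot();
        server.append("y", "3");
        System.out.println(server.logSize() + " log entry after compaction, state = " + server.currentState());
    }
}
```

A server that is far behind can be sent the snapshot plus the short remaining log instead of the full history.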
  • 10. Ratis: Pluggability • Pluggable state machine – The application must define its state machine – Example: a key-value map • Pluggable RPC – Users may provide their own RPC implementation – Default implementations: gRPC, Netty, Hadoop RPC • Pluggable Raft log – Users may provide their own log implementation – The default implementation stores the log in local files
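The key-value map mentioned on slide 10 could look roughly like the sketch below. The `ToyStateMachine` interface and the `PUT`/`GET` entry format are assumptions made for illustration; the actual Ratis state machine interface is different and richer.

```java
import java.util.HashMap;
import java.util.Map;

/** Key-value map whose state changes only by applying log entries in order. */
final class KeyValueStateMachine implements ToyStateMachine {
    private final Map<String, String> map = new HashMap<>();

    @Override
    public String apply(String entry) {
        // Assumed entry formats: "PUT key value" or "GET key".
        String[] parts = entry.split(" ", 3);
        if (parts[0].equals("PUT") && parts.length == 3) {
            map.put(parts[1], parts[2]);
            return "OK";
        }
        if (parts[0].equals("GET") && parts.length == 2) {
            return map.getOrDefault(parts[1], "NOT_FOUND");
        }
        return "BAD_REQUEST";
    }

    public static void main(String[] args) {
        ToyStateMachine stateMachine = new KeyValueStateMachine();
        System.out.println(stateMachine.apply("PUT color blue"));
        System.out.println(stateMachine.apply("GET color"));
    }
}

/** Hypothetical plug-in point for illustration; this is not the Ratis API. */
interface ToyStateMachine {
    /** Applies one committed log entry and returns a reply for the client. */
    String apply(String entry);
}
```

Because every replica applies the same entries in the same order, every replica ends up with the same map.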
  • 11. Data Intensive Applications • In Raft – All transactions and the data are written to the log – Not suitable for data intensive applications • In Ratis – An application can choose not to write all the data to the log – State machine data and log data can be managed separately – See the FileStore example in ratis-example
  • 12. Ratis: Asynchronous APIs • Using the gRPC bi-directional stream API – Netty and Hadoop RPC can support async, but it is not yet implemented • Server-to-server – Asynchronous append entries • Client-to-server – Asynchronous client requests – RATIS-113: just committed yesterday (11/14/2017) – Need to fix out-of-order issues RATIS-140 and RATIS-141
  • 13. General Ratis Use Cases • You already have a service running on a single server. • You want to: – (1) replicate the server log/states to multiple machines • The replication number/cluster membership can be changed at runtime • It can tolerate server failures – or (2) have an HA (highly available) service • When a server fails, another server will automatically take over • Clients automatically fail over to the new server • Apache Ratis is for you!
  • 14. Ozone/HDFS Use Cases • Replicating open containers – HDFS-11519 • Ozone: Implement XceiverServerSpi and XceiverClientSpi using Ratis • Committed on 4/3/2017 • Support HA in SCM – HDFS-11443 • Ozone: Support SCM multiple instances for HA • Replacing the current Namenode HA solution – No JIRA yet
  • 15. Example: ArithmeticStateMachine • Maintain a variable map (variable -> value) – Users define the variables and expressions – Then, submit assignment messages
  • 16. Demo: ArithmeticStateMachine • Pythagorean – Put a = 3 and b = 4 – Find c • Special thanks to Marton Elek for working on RATIS-95 (CLI to run examples) so that this demo is possible!
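The variable map and the Pythagorean demo from slides 15 and 16 can be imitated with a small stand-alone sketch. The class below is hypothetical and runs in a single process; the real ArithmeticStateMachine in ratis-example has its own expression and message types and replicates assignments through Raft.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToDoubleFunction;

/** Toy variable map (variable -> value); not the ratis-example implementation. */
final class VariableMapSketch {
    private final Map<String, Double> variables = new HashMap<>();

    /** Evaluates the expression against the current map and stores the result. */
    double assign(String name, ToDoubleFunction<Map<String, Double>> expression) {
        double value = expression.applyAsDouble(variables);
        variables.put(name, value);
        return value;
    }

    public static void main(String[] args) {
        VariableMapSketch stateMachine = new VariableMapSketch();
        stateMachine.assign("a", vars -> 3.0);
        stateMachine.assign("b", vars -> 4.0);
        // c = sqrt(a^2 + b^2), evaluated from the values already in the map.
        double c = stateMachine.assign("c",
            vars -> Math.sqrt(vars.get("a") * vars.get("a") + vars.get("b") * vars.get("b")));
        System.out.println("c = " + c);
    }
}
```

Each assignment plays the role of one message submitted to the state machine; with a = 3 and b = 4 the computed c is 5.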
  • 17. Gauss-Legendre • A.k.a. the arithmetic–geometric mean method • A fast algorithm to compute pi
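For reference, the Gauss-Legendre iteration named on slide 17 can be written in plain Java as below. This is a stand-alone sketch in double precision with a hypothetical class name, not the demo's Ratis expressions.

```java
/** Gauss-Legendre (arithmetic-geometric mean) iteration for pi in double precision. */
final class GaussLegendreSketch {
    static double pi(int iterations) {
        double a = 1.0;                  // arithmetic mean sequence
        double b = 1.0 / Math.sqrt(2.0); // geometric mean sequence
        double t = 0.25;
        double p = 1.0;
        for (int i = 0; i < iterations; i++) {
            double aNext = (a + b) / 2.0;
            b = Math.sqrt(a * b);
            t -= p * (a - aNext) * (a - aNext);
            a = aNext;
            p *= 2.0;
        }
        return (a + b) * (a + b) / (4.0 * t);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 3; n++) {
            System.out.println(n + " iteration(s): " + pi(n));
        }
    }
}
```

The number of correct digits roughly doubles with each iteration, so three iterations already reach the limit of double precision.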
  • 18. Example: FileStore (RATIS-122) • Maintain a file map (key -> file) • Support only – Read, Write, Delete • But not other operations such as – List, Rename, etc. • Asynchronous & In-order – A client may submit multiple write requests • It may write to multiple files at the same time • Each file may have multiple write requests • File data is managed by the state machine – It is not stored in the Raft log
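The separation between file data and log data described on slide 18 can be illustrated with a toy model. The names and the log entry format are assumptions; the sketch is single-process and leaves out replication, asynchrony and ordering, so it is not the FileStore example itself.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy file map (key -> file) that keeps file bytes out of the log. */
final class FileStoreSketch {
    /** Stand-in for the Raft log: records what happened, not the file bytes. */
    private final List<String> log = new ArrayList<>();
    /** Stand-in for state machine storage, keyed by file name. */
    private final Map<String, byte[]> files = new HashMap<>();

    void write(String key, byte[] data) {
        files.put(key, data.clone());
        log.add("WRITE " + key + " length=" + data.length); // metadata only
    }

    /** Returns a copy of the file bytes, or null if the key is absent. */
    byte[] read(String key) {
        byte[] data = files.get(key);
        return data == null ? null : data.clone();
    }

    void delete(String key) {
        files.remove(key);
        log.add("DELETE " + key);
    }

    List<String> logEntries() {
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        FileStoreSketch store = new FileStoreSketch();
        store.write("notes.txt", "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(store.read("notes.txt"), StandardCharsets.UTF_8));
        store.delete("notes.txt");
        System.out.println(store.logEntries());
    }
}
```

The log grows by one short entry per operation regardless of file size, which is the property that makes this layout attractive for data intensive applications.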
  • 19. Ratis: Development Status • A brief history – 2016-03: Project started at Hortonworks – 2016-04: First commit “leader election (without tests)” – 2017-01: Entered Apache incubation – 2017-03: Started preparing the first Alpha release (RATIS-53) – 2017-04: Hadoop Ozone branch started using Ratis (HDFS-11519)! – 2017-05: First Release 0.1.0-alpha – 2017-11: Preparing the second Release 0.2.0-alpha
  • 20. Work in Progress • Multi-Raft – General idea: Allow a server to join multiple Raft groups (RATIS-91) • When the number of groups is large, it is hard to manage the logs • May assume SSD for log storage – Short term goal: Allow a server to join a small number of groups • “Small” means 2 or 3 • Use a small number of preconfigured storage locations • Big benefit: servers could transition from one group to another group • Replicated Map (RATIS-40) – A sorted map similar to ZooKeeper/etcd/LogCabin – Intended to be used in production
  • 21. Ratis: Future Work • Performance – Fix gRPC async bugs: RATIS-140 and RATIS-141 – Use FileStore as a benchmark • Metrics • Security • API Specification • Documentation – Project web site • Jenkins Builds – Checkstyle configuration
  • 22. Contributors – Animesh Trivedi, Anu Engineer, Arpit Agarwal, Brent, Chen Liang, Chris Nauroth, Devaraj Das, Enis Soztutar, garvit, Hanisha Koneru, Hugo Louro, Jakob Homan, Jian He, Jing Chen, Jing Zhao, Jitendra Pandey, Junping Du, kaiyangzhang, Karl Heinz Marbaise, Li Lu, Lokesh Jain, Marton Elek, Mayank Bansal, Mingliang Liu, Mukul Kumar Singh, Sen Zhang, Sriharsha Chintalapani, Tsz Wo Nicholas Sze, Uma Maheswara Rao G, Venkat Ranganathan, Wangda Tan, Weiqing Yang, Will Xu, Xiaobing Zhou, Xiaoyu Yao, Yubo Xu, yue liu, Zhiyuan Yang
  • 23. Thank you! Contributions are welcome! • http://incubator.apache.org/projects/ratis.html • dev@ratis.incubator.apache.org • Java question: What is <?> in above?