Submit Search
Upload
Apache Ratis - In Search of a Usable Raft Library
•
2 likes
•
42,414 views
T
Tsz-Wo (Nicholas) Sze
Follow
A talk on Apache Ratis.
Read less
Read more
Software
Report
Share
Report
Share
1 of 23
Download now
Download to read offline
Recommended
High throughput data replication over RAFT
High throughput data replication over RAFT
DataWorks Summit
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
DataWorks Summit
Trend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache Bigtop
Evans Ye
Flink Batch Processing and Iterations
Flink Batch Processing and Iterations
Sameer Wadkar
Apache Spark Core – Practical Optimization
Apache Spark Core – Practical Optimization
Databricks
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic Datasets
Alluxio, Inc.
Hive+Tez: A performance deep dive
Hive+Tez: A performance deep dive
t3rmin4t0r
Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)
Ryan Blue
Recommended
High throughput data replication over RAFT
High throughput data replication over RAFT
DataWorks Summit
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
DataWorks Summit
Trend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache Bigtop
Evans Ye
Flink Batch Processing and Iterations
Flink Batch Processing and Iterations
Sameer Wadkar
Apache Spark Core – Practical Optimization
Apache Spark Core – Practical Optimization
Databricks
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic Datasets
Alluxio, Inc.
Hive+Tez: A performance deep dive
Hive+Tez: A performance deep dive
t3rmin4t0r
Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)
Ryan Blue
Introduction to DataFusion An Embeddable Query Engine Written in Rust
Introduction to DataFusion An Embeddable Query Engine Written in Rust
Andrew Lamb
Room 1 - 3 - Lê Anh Tuấn - Build a High Performance Identification at GHTK wi...
Room 1 - 3 - Lê Anh Tuấn - Build a High Performance Identification at GHTK wi...
Vietnam Open Infrastructure User Group
Introduction to Redis
Introduction to Redis
Dvir Volk
HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBase
enissoz
Dataflow with Apache NiFi
Dataflow with Apache NiFi
DataWorks Summit/Hadoop Summit
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
GetInData
State of the Trino Project
State of the Trino Project
Martin Traverso
Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...
Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...
HostedbyConfluent
Apache Arrow Workshop at VLDB 2019 / BOSS Session
Apache Arrow Workshop at VLDB 2019 / BOSS Session
Wes McKinney
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward
Iceberg: a fast table format for S3
Iceberg: a fast table format for S3
DataWorks Summit
Apache Tez - A unifying Framework for Hadoop Data Processing
Apache Tez - A unifying Framework for Hadoop Data Processing
DataWorks Summit
Securing Hadoop with Apache Ranger
Securing Hadoop with Apache Ranger
DataWorks Summit
2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da...
2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da...
Andrew Lamb
Integrating Apache Spark and NiFi for Data Lakes
Integrating Apache Spark and NiFi for Data Lakes
DataWorks Summit/Hadoop Summit
Getting Started with Infrastructure as Code
Getting Started with Infrastructure as Code
WinWire Technologies Inc
How is Kafka so Fast?
How is Kafka so Fast?
Ricardo Paiva
Grafana
Grafana
NoelMc Grath
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Cloudera, Inc.
Maxim Fateev - Beyond the Watermark- On-Demand Backfilling in Flink
Maxim Fateev - Beyond the Watermark- On-Demand Backfilling in Flink
Flink Forward
HDFS- What is New and Future
HDFS- What is New and Future
DataWorks Summit
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
DataWorks Summit
More Related Content
What's hot
Introduction to DataFusion An Embeddable Query Engine Written in Rust
Introduction to DataFusion An Embeddable Query Engine Written in Rust
Andrew Lamb
Room 1 - 3 - Lê Anh Tuấn - Build a High Performance Identification at GHTK wi...
Room 1 - 3 - Lê Anh Tuấn - Build a High Performance Identification at GHTK wi...
Vietnam Open Infrastructure User Group
Introduction to Redis
Introduction to Redis
Dvir Volk
HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBase
enissoz
Dataflow with Apache NiFi
Dataflow with Apache NiFi
DataWorks Summit/Hadoop Summit
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
GetInData
State of the Trino Project
State of the Trino Project
Martin Traverso
Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...
Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...
HostedbyConfluent
Apache Arrow Workshop at VLDB 2019 / BOSS Session
Apache Arrow Workshop at VLDB 2019 / BOSS Session
Wes McKinney
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward
Iceberg: a fast table format for S3
Iceberg: a fast table format for S3
DataWorks Summit
Apache Tez - A unifying Framework for Hadoop Data Processing
Apache Tez - A unifying Framework for Hadoop Data Processing
DataWorks Summit
Securing Hadoop with Apache Ranger
Securing Hadoop with Apache Ranger
DataWorks Summit
2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da...
2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da...
Andrew Lamb
Integrating Apache Spark and NiFi for Data Lakes
Integrating Apache Spark and NiFi for Data Lakes
DataWorks Summit/Hadoop Summit
Getting Started with Infrastructure as Code
Getting Started with Infrastructure as Code
WinWire Technologies Inc
How is Kafka so Fast?
How is Kafka so Fast?
Ricardo Paiva
Grafana
Grafana
NoelMc Grath
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Cloudera, Inc.
Maxim Fateev - Beyond the Watermark- On-Demand Backfilling in Flink
Maxim Fateev - Beyond the Watermark- On-Demand Backfilling in Flink
Flink Forward
What's hot
(20)
Introduction to DataFusion An Embeddable Query Engine Written in Rust
Introduction to DataFusion An Embeddable Query Engine Written in Rust
Room 1 - 3 - Lê Anh Tuấn - Build a High Performance Identification at GHTK wi...
Room 1 - 3 - Lê Anh Tuấn - Build a High Performance Identification at GHTK wi...
Introduction to Redis
Introduction to Redis
HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBase
Dataflow with Apache NiFi
Dataflow with Apache NiFi
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
State of the Trino Project
State of the Trino Project
Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...
Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...
Apache Arrow Workshop at VLDB 2019 / BOSS Session
Apache Arrow Workshop at VLDB 2019 / BOSS Session
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Iceberg: a fast table format for S3
Iceberg: a fast table format for S3
Apache Tez - A unifying Framework for Hadoop Data Processing
Apache Tez - A unifying Framework for Hadoop Data Processing
Securing Hadoop with Apache Ranger
Securing Hadoop with Apache Ranger
2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da...
2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da...
Integrating Apache Spark and NiFi for Data Lakes
Integrating Apache Spark and NiFi for Data Lakes
Getting Started with Infrastructure as Code
Getting Started with Infrastructure as Code
How is Kafka so Fast?
How is Kafka so Fast?
Grafana
Grafana
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Maxim Fateev - Beyond the Watermark- On-Demand Backfilling in Flink
Maxim Fateev - Beyond the Watermark- On-Demand Backfilling in Flink
Similar to Apache Ratis - In Search of a Usable Raft Library
HDFS- What is New and Future
HDFS- What is New and Future
DataWorks Summit
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
DataWorks Summit
Mapreduce over snapshots
Mapreduce over snapshots
enissoz
Democratizing Memory Storage
Democratizing Memory Storage
DataWorks Summit
Hadoop operations-2014-strata-new-york-v5
Hadoop operations-2014-strata-new-york-v5
Chris Nauroth
Containers and Big Data
Containers and Big Data
DataWorks Summit
February 2014 HUG : Tez Details and Insides
February 2014 HUG : Tez Details and Insides
Yahoo Developer Network
La big datacamp2014_vikram_dixit
La big datacamp2014_vikram_dixit
Data Con LA
Design Choices for Cloud Data Platforms
Design Choices for Cloud Data Platforms
Ashish Mrig
Big Data Day LA 2015 - What's new and next in Apache Tez by Bikas Saha of Hor...
Big Data Day LA 2015 - What's new and next in Apache Tez by Bikas Saha of Hor...
Data Con LA
Big Data Conference April 2015
Big Data Conference April 2015
Aaron Benz
Hadoop Summit Europe 2015 - YARN Present and Future
Hadoop Summit Europe 2015 - YARN Present and Future
Vinod Kumar Vavilapalli
Apache Hadoop YARN 2015: Present and Future
Apache Hadoop YARN 2015: Present and Future
DataWorks Summit
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impala
markgrover
Apache Hadoop YARN: best practices
Apache Hadoop YARN: best practices
DataWorks Summit
Big data processing engines, Atlanta Meetup 4/30
Big data processing engines, Atlanta Meetup 4/30
Ashish Narasimham
YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez
Hortonworks
Apache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with Hadoop
Hortonworks
What's New in Apache Hive 3.0?
What's New in Apache Hive 3.0?
DataWorks Summit
What's New in Apache Hive 3.0 - Tokyo
What's New in Apache Hive 3.0 - Tokyo
DataWorks Summit
Similar to Apache Ratis - In Search of a Usable Raft Library
(20)
HDFS- What is New and Future
HDFS- What is New and Future
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Mapreduce over snapshots
Mapreduce over snapshots
Democratizing Memory Storage
Democratizing Memory Storage
Hadoop operations-2014-strata-new-york-v5
Hadoop operations-2014-strata-new-york-v5
Containers and Big Data
Containers and Big Data
February 2014 HUG : Tez Details and Insides
February 2014 HUG : Tez Details and Insides
La big datacamp2014_vikram_dixit
La big datacamp2014_vikram_dixit
Design Choices for Cloud Data Platforms
Design Choices for Cloud Data Platforms
Big Data Day LA 2015 - What's new and next in Apache Tez by Bikas Saha of Hor...
Big Data Day LA 2015 - What's new and next in Apache Tez by Bikas Saha of Hor...
Big Data Conference April 2015
Big Data Conference April 2015
Hadoop Summit Europe 2015 - YARN Present and Future
Hadoop Summit Europe 2015 - YARN Present and Future
Apache Hadoop YARN 2015: Present and Future
Apache Hadoop YARN 2015: Present and Future
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impala
Apache Hadoop YARN: best practices
Apache Hadoop YARN: best practices
Big data processing engines, Atlanta Meetup 4/30
Big data processing engines, Atlanta Meetup 4/30
YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez
Apache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with Hadoop
What's New in Apache Hive 3.0?
What's New in Apache Hive 3.0?
What's New in Apache Hive 3.0 - Tokyo
What's New in Apache Hive 3.0 - Tokyo
Recently uploaded
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
soniya singh
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
jennyeacort
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
qr0udbr0
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
umasea
Professional Resume Template for Software Developers
Professional Resume Template for Software Developers
Vinodh Ram
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
Christoph Pohl
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....
kzayra69
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
Alina Yurenko
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
AxelRicardoTrocheRiq
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
OPEN KNOWLEDGE GmbH
MYjobs Presentation Django-based project
MYjobs Presentation Django-based project
AnoyGreter
Asset Management Software - Infographic
Asset Management Software - Infographic
Hr365.us smith
EY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
Neo4j
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
VICTOR MAESTRE RAMIREZ
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
Wave PLM
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
Andreas Granig
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
OnePlan Solutions
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Christina Lin
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
Sujith Sukumaran
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
StefanoLambiase
Recently uploaded
(20)
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
Professional Resume Template for Software Developers
Professional Resume Template for Software Developers
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
MYjobs Presentation Django-based project
MYjobs Presentation Django-based project
Asset Management Software - Infographic
Asset Management Software - Infographic
EY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Apache Ratis - In Search of a Usable Raft Library
1.
© Hortonworks Inc.
2017 Apache Ratis In Search of a Usable Raft Library Tsz-Wo Nicholas Sze 11/15/2017 Brown Bag Lunch Talk Page 1
2.
© Hortonworks Inc.
2017 About Me • Tsz-Wo Nicholas Sze, Ph.D. – Software Engineer at Hortonworks – PMC member/Committer of Apache Hadoop – Active contributor and committer of Apache Ratis – Ph.D. from University of Maryland, College Park – MPhil & BEng from Hong Kong University of Sci & Tech Page 2 Architecting the Future of Big Data
3.
© Hortonworks Inc.
2017 Agenda • A brief introduction of RAFT • Apache Ratis – Features and use cases – Demo • Project Status – Current development – Future works Page 3 Architecting the Future of Big Data
4.
© Hortonworks Inc.
2017 Consensus • What is consensus? – Multiple servers to agree a value • Typical use cases – Log replication – Replicated state machines Page 4 Architecting the Future of Big Data
5.
© Hortonworks Inc.
2017 Consensus Algorithms • Paxos (1990) – Work but hard to understand – Hard to implement (correctly) • Raft (2014) • “In Search of an Understandable Consensus Algorithm” – by Diego Ongaro and John Ousterhout – USENIX ATC’14, https://raft.github.io • Motivations – Easy to understand – Easy to prove – Easy to implement Page 5 Architecting the Future of Big Data
6.
© Hortonworks Inc.
2017 Raft Basic • Leader Election – Servers are started as a Follower – Randomly timeout to become Candidate and start a leader election • Candidate sends requestVote to other servers – It becomes the leader once it gets a majority of the votes. • Append Entries – Clients send requests to the Leader – Leader forwards the requests to the Followers • Leader sends appendEntries to Followers – When there is no client requests, Leader also sends empty appendEntries to Followers to maintain leadership Page 6 Architecting the Future of Big Data
7.
© Hortonworks Inc.
2017 Raft Library • Our Motations – Use Raft in Ozone • “In Search of a Usable Raft Library” – A long list of Raft implementations is available – None of them a general library ready to be consumed by other projects. – Most of them are tied to another project or a part of another project. • We need a Raft library! Page 7 Architecting the Future of Big Data
8.
© Hortonworks Inc.
2017 Apache Ratis – A Raft Library • A brand new, incubating Apache project – Open source, open development – Apache License 2.0 – Written in Java 8 • Contributions are welcome! Page 8 Architecting the Future of Big Data
9.
© Hortonworks Inc.
2017 Ratis: Standard Raft Features • Leader Election + Log Replication – Automatically elect a leader among the servers in a Raft group – Randomized timeout for avoiding split votes – Log is replicated in the Raft group • Membership Changes – Members in a Raft group can be re-configurated in runtime – Replication factor can be changed in runtime • Log Compaction – Snapshot is taken periodically – Send snapshot instead of a long log history. Page 9 Architecting the Future of Big Data
10.
© Hortonworks Inc.
2017 Ratis: Pluggability • Pluggable state machine – Application must define its state machine – Example: a key-value map • Pluggable RPC – Users may provide their own RPC implementation – Default implementations: gRPC, Netty, Hadoop RPC • Pluggable Raft log – Users may provide their own log implementation – The default implementation stores log in local files Page 10 Architecting the Future of Big Data
11.
© Hortonworks Inc.
2017 Data Intensive Applications • In Raft, – All transactions and the data are written in the log – Not suitable for data intensive applications • In Ratis – Application could choose to not written all the data to log – State machine data and log data can be separately managed – See the FileStore example in ratis-example Page 11 Architecting the Future of Big Data
12.
© Hortonworks Inc.
2017 Ratis: Asynchronous APIs • Using gRPC bi-directional stream API – Netty and Hadoop RPC can support async but not yet implementated • Server-to-server – Asynchronous append entires • Client-to-server – Asynchronous client requests – RATIS-113: just committed yesterday (11/14/2017) – Need to fix out-of-order issues RATIS-140 and RATIS-141 Page 12 Architecting the Future of Big Data
13.
© Hortonworks Inc.
2017 General Ratis Use Cases • You already have a service running on a single server. • You want to: – (1) replicate the server log/states to multiple machines • The replication number/cluster membership can be changed in runtime • It can tolerate server failures. • or – (2) have a HA (highly available) service • When a server fails, another server will automatically take over. • Clients automatically failover to the new server. • Apache Ratis is for you! Page 13 Architecting the Future of Big Data
14.
© Hortonworks Inc.
2017 Ozone/HDFS Use Cases • Replicating open containers – HDFS-11519 • Ozone: Implement XceiverServerSpi and XceiverClientSpi using Ratis • Committed on 4/3/2017 • Support HA in SCM – HDFS-11443 • Ozone: Support SCM multiple instances for HA • Replacing the current Namenode HA solution – No JIRA yet Page 14 Architecting the Future of Big Data
15.
© Hortonworks Inc.
2017 Example: ArithmeticStateMachine • Maintain a variable map (variable -> value) – Users define the variables and expresions – Then, submit assignment messages Page 15 Architecting the Future of Big Data
16.
© Hortonworks Inc.
2017 Demo: ArithmeticStateMachine • Pythagorean – Put a = 3 and b = 4 – Find c • Special thanks to Marton Elek for working on RATIS-95 (CLI to run examples) so that this demo is possible! Page 16 Architecting the Future of Big Data
17.
© Hortonworks Inc.
2017 Gauss-Legendre • A.k.a the arithmetic–geometric mean method • A fast algorithm to compute pi Page 17 Architecting the Future of Big Data
18.
© Hortonworks Inc.
2017 Example: FileStore (RATIS-122) • Maintain a file map (key -> file) • Support only – Read, Write, Delete • But not other operations such as – List, Rename, etc. • Asynchronous & In-order – Client may submit multiple write requests to • Write to multiple files at the same time • Each file may have multiple write requests • File data is managed by the state machine – But not store in the raft log Page 18 Architecting the Future of Big Data
19.
© Hortonworks Inc.
2017 Ratis: Development Status • A brief history – 2016-03: Project started at Hortonworks – 2016-04: First commit “leader election (without tests)” – 2017-01: Entered Apache incubation. – 2017-03: Started preparing the first Alpha release (RATIS-53). – 2017-04: Hadoop Ozone branch started using Ratis (HDFS-11519)! – 2017-05: First Release 0.1.0-alpha – 2017-11: Preparing the second Release 0.2.0-alpha Page 19 Architecting the Future of Big Data
20.
© Hortonworks Inc.
2017 Work in Progress • Multi-Raft – General idea: Allow a server to join multiple Raft groups (RATIS-91) • When the #groups is large, it is hard to manage the logs • May assume SSD for log storage – Short term goal: Allow a server to join a small number of groups • “Small” means 2 or 3 • Use a small number of pre-configurated storage locations • Big benefit: servers could transition from one group to another group • Replicated Map (RATIS-40) – A sorted map similar to ZooKeeper/etcd/LogCabin – Intended to be used in production Page 20 Architecting the Future of Big Data
21.
© Hortonworks Inc.
2017 Ratis: Future Works • Performance – Fix gRPC async bugs: RATIS-140 and RATIS-141 – Use FileStore as benchmarks • Metrics • Security • API Specification • Documentations – Project web site • Jerkins Builds – Checkstyle configuration Page 21 Architecting the Future of Big Data
22.
© Hortonworks Inc.
2017 Contributors – Animesh Trivedi, Anu Engineer, Arpit Agarwal, Brent, – Chen Liang, Chris Nauroth, Devaraj Das, Enis Soztutar, – garvit, Hanisha Koneru, Hugo Louro, Jakob Homan, – Jian He, Jing Chen, Jing Zhao, Jitendra Pandey, Junping Du, – kaiyangzhang, Karl Heinz Marbaise, Li Lu, Lokesh Jain, – Marton Elek, Mayank Bansal, Mingliang Liu, – Mukul Kumar Singh, Sen Zhang, Sriharsha Chintalapani, – Tsz Wo Nicholas Sze, Uma Maheswara Rao G, – Venkat Ranganathan, Wangda Tan, Weiqing Yang, – Will Xu, Xiaobing Zhou, Xiaoyu Yao, Yubo Xu, yue liu, – Zhiyuan Yang Page 22 Architecting the Future of Big Data
23.
© Hortonworks Inc.
2017 Thank you! Contributions are welcome! • http://incubator.apache.org/projects/ratis.html • dev@ratis.incubator.apache.org Java question: What is <?> in above? ! Architecting the Future of Big Data Page 23
Download now