Infinite Gerrit

Proposed implementation of Apache Cassandra as a DFS storage backend for JGit, presented at the Gerrit User Summit 2016, Google, Mountain View, CA.


  1. Infinite Gerrit: Haithem Jarraya, GerritForge (Haithem@gerritforge.com, http://www.gerritforge.com)
  2. About Haithem
     • Haithem Jarraya: Big Data consultant based in London who has worked in several industries, including telecommunications, advertising, financial services, travel, and government.
     • Expertise in the real-time big data ecosystem: Apache Kafka, Apache Cassandra, Apache Spark.
  3. About GerritForge
     • Founded in 2009 in London, UK
     • Mission: integrate Gerrit with the enterprise
  4. Agenda
     • Vision for Infinite Gerrit
     • Why Cassandra?
     • DfsObjDatabase – Cassandra proposed schema
       • Pack list
       • Packs
  5. Gerrit – End Goal
     • Every Gerrit master accepts writes, which in turn increases throughput.
     • More flexibility for scaling up.
     • Zero downtime, zero data loss.
     • Lower Gerrit operational cost by automatically scaling down instances.
  6. Gerrit – Challenges
     • Distributed storage
     • Database replication
     • Concurrent ref updates
     • Index updates
     • Cache consistency
     • Shared sessions
     • Agreement protocol between nodes
  8. C* in brief
     • Each table has its rows distributed across N token ranges, and each token range is replicated R times.
     • Fast writes: commit log (durable writes), memtable, then flushed to SSTables.
     • Fast reads: bloom filter, row key cache, row cache, SSTable index entry.
     • Compaction, repairs, materialized views (3.x)…
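The first point on this slide (rows spread over token ranges, each range replicated R times) can be illustrated with a small sketch. This is only a toy model and not Cassandra's implementation: the `Ring` class, the `token` helper, the node names, and the use of MD5 as the hash are all assumptions made for illustration, and real clusters typically assign many tokens per node.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a key onto the ring. MD5 is an assumption chosen for this sketch."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class Ring:
    """Toy token ring: one token per node, R replicas per token range."""

    def __init__(self, nodes, replication_factor):
        if not 1 <= replication_factor <= len(nodes):
            raise ValueError("replication factor must be between 1 and the node count")
        self.rf = replication_factor
        # Each node owns the range ending at its own token.
        self.entries = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, partition_key: str):
        """Return the R nodes holding the row: the range owner plus its successors."""
        # First node whose token is >= the key's token, wrapping around the ring.
        start = bisect.bisect_left(self.tokens, token(partition_key)) % len(self.entries)
        return [self.entries[(start + i) % len(self.entries)][1] for i in range(self.rf)]


ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
print(ring.replicas("some-pack-id"))
```

With N nodes and replication factor R, every row lives on R distinct nodes, so a single node failure leaves R - 1 copies available for reads and writes.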
  9. DFS – Storage Layer for JGit
     • org...storage.dfs.DfsObjDatabase
     • org….storage.dfs.DfsRefDatabase
  10. Packs C* schema – Packs
      CREATE TABLE git_store.packs (
          id uuid,
          ext text,
          offset bigint,
          value blob,
          PRIMARY KEY ((id, ext, offset))
      )
  11. Packs C* schema – Pack list
      CREATE TABLE git_store.pack_list (
          name text,
          id uuid,
          ext text,
          size bigint,
          PRIMARY KEY (name, id, ext)
      ) WITH CLUSTERING ORDER BY (id DESC, ext ASC)
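To make the two proposed tables more concrete, here is a minimal in-memory sketch of how pack data might flow through them. It rests on several assumptions the slides do not state: that `offset` is the starting byte of a fixed-size block of the pack file, that `name` is the repository name, and that `size` is the file length in bytes. The class `InMemoryPackStore`, its methods, and the block size are hypothetical, and no Cassandra driver is involved.

```python
import uuid

BLOCK_SIZE = 4  # hypothetical and tiny, so that the chunking is easy to see


class InMemoryPackStore:
    """Dictionaries standing in for git_store.packs and git_store.pack_list."""

    def __init__(self, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.packs = {}      # (id, ext, offset) -> block bytes, like the packs table
        self.pack_list = {}  # name -> {(id, ext): size}, like the pack_list table

    def write_pack(self, repo, pack_id, ext, data):
        # One row per block, keyed by the block's starting offset (assumption).
        for offset in range(0, len(data), self.block_size):
            self.packs[(pack_id, ext, offset)] = data[offset:offset + self.block_size]
        self.pack_list.setdefault(repo, {})[(pack_id, ext)] = len(data)

    def list_packs(self, repo):
        rows = [(pid, ext, size) for (pid, ext), size in self.pack_list.get(repo, {}).items()]
        # Mimic CLUSTERING ORDER BY (id DESC, ext ASC) with two stable sorts.
        # Python's UUID ordering is used here; Cassandra's may differ.
        rows.sort(key=lambda r: r[1])
        rows.sort(key=lambda r: r[0], reverse=True)
        return rows

    def read(self, pack_id, ext, position, length):
        # Fetch only the blocks that overlap [position, position + length).
        out = bytearray()
        end = position + length
        offset = (position // self.block_size) * self.block_size
        while offset < end:
            block = self.packs.get((pack_id, ext, offset))
            if block is None:  # read past the end of the stored file
                break
            out += block[max(position - offset, 0):min(end - offset, len(block))]
            offset += self.block_size
        return bytes(out)


store = InMemoryPackStore()
pack_id = uuid.uuid4()
store.write_pack("my-repo", pack_id, "pack", b"0123456789")
print(store.list_packs("my-repo"))
print(store.read(pack_id, "pack", 3, 5))
```

Under these assumptions, a random read touches only the blocks covering the requested byte range, each addressed directly by its full partition key (id, ext, offset), while the pack list for a repository is a single-partition read.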
  12. LWB – HTTP route per repo
  13. Questions?
  14. Resources
      • http://cassandra.apache.org
      • http://www.datastax.com
      • http://eclipse.org/jgit
      • Dynamo paper: http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf
      • Bigtable paper: http://static.googleusercontent.com/media/research.google.com/en//archive/bigtable-osdi06.pdf
