Infinite Gerrit

Proposed implementation of Apache Cassandra as the DFS storage backend for JGit, presented at Gerrit User Summit 2016, Google, Mountain View, CA.

Published in: Technology

  1. Infinite Gerrit. Haithem Jarraya, GerritForge, Haithem@gerritforge.com, http://www.gerritforge.com
  2. About Haithem
      • Haithem Jarraya is a Big Data consultant based in London who has worked in a range of industries: telecommunications, advertising, financial services, travel and government.
      • Expertise in the real-time big data ecosystem: Apache Kafka, Apache Cassandra, Apache Spark.
  3. About GerritForge
      • Founded in 2009 in London, UK.
      • Mission: integrate Gerrit with the enterprise.
  4. Agenda
      • Vision for Infinite Gerrit
      • Why Cassandra?
      • DfsObjDatabase: proposed Cassandra schema (pack list, packs)
  5. Gerrit – End Goal
      • Every Gerrit master accepts writes, which in turn increases throughput.
      • More flexibility for scaling up.
      • Zero downtime, zero data loss.
      • Lower Gerrit operational cost by automatically scaling down instances.
  6. Gerrit – Challenges
      • Distributed storage
      • Database replication
      • Concurrent ref updates
      • Index updates
      • Cache consistency
      • Shared sessions
      • Agreement protocol between nodes
  7. Gerrit – Challenges (same list as slide 6)
  8. C* in brief
      • Each table has its rows distributed across N token ranges, and each token range is replicated R times.
      • Fast writes: commit log (durable writes), memtable, then flushed to SSTables.
      • Fast reads: bloom filter, row key cache, row cache, SSTable index entry.
      • Compaction, repairs, materialized views (3.x), …
  9. DFS – Storage Layer for JGit
      • org...storage.dfs.DfsObjDatabase
      • org….storage.dfs.DfsRefDatabase
  10. Packs C* schema – Packs
        CREATE TABLE git_store.packs (
          id uuid,
          ext text,
          offset bigint,
          value blob,
          PRIMARY KEY ((id, ext, offset))
        )
  11. Packs C* schema – Pack list
        CREATE TABLE git_store.pack_list (
          name text,
          id uuid,
          ext text,
          size bigint,
          PRIMARY KEY (name, id, ext)
        ) WITH CLUSTERING ORDER BY (id DESC, ext ASC)
  12. LWB – HTTP route per repo
  13. Questions?
  14. Resources
      • http://cassandra.apache.org
      • http://www.datastax.com
      • http://eclipse.org/jgit
      • Dynamo paper: http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf
      • Bigtable paper: http://static.googleusercontent.com/media/research.google.com/en//archive/bigtable-osdi06.pdf
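
The two tables on slides 10 and 11 can be read as a chunk store plus a per-repository index. The following is a minimal in-memory sketch in Python of how they might fit together; it is not the implementation presented in the talk. It assumes, since the slides do not say so, that `offset` is the byte offset at which a fixed-size chunk of a pack file starts and that `name` in `pack_list` is the repository name. `CHUNK_SIZE`, `write_pack`, `list_packs` and `read_range` are hypothetical names used only for illustration; they are not JGit or Cassandra APIs, and plain dictionaries stand in for the tables.

```python
import uuid

# Hypothetical chunk size, kept tiny so the behaviour is easy to trace.
CHUNK_SIZE = 4

# Stand-in for git_store.packs: the whole (id, ext, offset) tuple is the key,
# mirroring PRIMARY KEY ((id, ext, offset)).
packs = {}

# Stand-in for git_store.pack_list: one partition per name, holding
# (id, ext, size) rows.
pack_list = {}


def write_pack(repo, pack_id, ext, data):
    """Store a pack file as chunks and register it in the pack list."""
    for offset in range(0, len(data), CHUNK_SIZE):
        packs[(pack_id, ext, offset)] = data[offset:offset + CHUNK_SIZE]
    pack_list.setdefault(repo, []).append((pack_id, ext, len(data)))


def list_packs(repo):
    """Return rows ordered like CLUSTERING ORDER BY (id DESC, ext ASC).

    Simplification: ids are compared as Python UUIDs, which may differ from
    Cassandra's own ordering of the uuid type.
    """
    rows = sorted(pack_list.get(repo, []), key=lambda row: row[1])
    # Python's sort is stable, so ext stays ascending within each id.
    return sorted(rows, key=lambda row: row[0], reverse=True)


def read_range(pack_id, ext, position, length):
    """Read up to `length` bytes starting at `position` by fetching chunks."""
    end = position + length
    offset = (position // CHUNK_SIZE) * CHUNK_SIZE  # chunk holding `position`
    out = bytearray()
    while offset < end:
        chunk = packs.get((pack_id, ext, offset))
        if chunk is None:  # read past the end of the stored file
            break
        start = max(position - offset, 0)
        stop = min(end - offset, len(chunk))
        out += chunk[start:stop]
        offset += CHUNK_SIZE
    return bytes(out)
```

Under these assumptions every chunk lives in its own partition, so a range read is a series of single-key lookups at offsets that can be computed from the requested position, while `list_packs` corresponds to reading one `pack_list` partition in its clustering order.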
