Data2breakfast - Introduction to the Apache Cassandra NoSQL Database

  1. February 6, 2018. Introduction to the Apache Cassandra NoSQL database. Nicolas MENOUX, Chief Data Technology Officer
  2. From RDBMS to NoSQL 1970-… 2000-…
  3. NoSQL genesis: the #nosql movement in 2009 (Johan Oskarsson)
  4. NoSQL, a new paradigm • Large volumes of data • Fast concurrent user queries • Dynamic schemas (schemaless) • Auto-sharding • Replication • Horizontally scalable • Eventually consistent
  5. Eventually Consistent
  6. CAP Theorem (only 2 of 3)
  7. No more relations!
  8. NoSQL databases
  9. A good blend! • Peer-to-Peer • Key-Value pairs • Tunable consistency • Wide rows • Fast write throughput (timeline labels: July 2008, Google Code; 2007; 2006)
  10. Cassandra characteristics February 2010
  11. Cassandra in Production (figures as listed on the slide): 75 000 nodes, 10 PB • 2 500 nodes, 420 PB • 1 000 000 000 000 reqs/day • 11 500 000 reqs/sec • 100 nodes, 250 PB
  12. Multi-Datacenter • Cassandra on Raspberry Pis • http://blablatech.com/blog/cassandra-gets-in-blablacar
  13. Fault tolerance: data replication
  14. Built-in WAN sync
  15. How Cassandra stores data (nodes & vnodes) Hash(partition key) =
  16. Data Model: SELECT id, age FROM users WHERE id=xxx • Data Models: SELECT age, count(*) FROM users GROUP BY age • SELECT id, nom FROM users JOIN names ON users.id = names.id • … • NoSQL = from query to data model
  17. Data model: column oriented
  18. Cassandra: data types
  19. Primary Key
  20. Cassandra: primary keys • Compound primary key • Composite partition key
  21. Cassandra Query Language (CQL) • TTL • Timestamp
  22. Performance: linear scalability
  23. Write path flush
  24. Last Write Wins (LWW)
  25. Last Write Wins (LWW)
  26. Compactions
  27. Compaction strategies: SizeTieredCompactionStrategy (STCS) • LeveledCompactionStrategy (LCS) • TimeWindowCompactionStrategy (TWCS) • DateTieredCompactionStrategy (DTCS)
  28. Tunable consistency
  29. Write consistency (RF=3, ONE)
  30. Write consistency (RF=3, QUORUM) • Multi-DC delay • Alternative: LOCAL_QUORUM
  31. Read consistency (RF=3, QUORUM) • Background node repair: read_repair_chance (10%)
  32. Read consistency (RF=3, LOCAL_QUORUM)
  33. Conclusion: Cassandra is not a general-purpose solution, but it is very good at: scalability • write (and read) performance • no single point of failure (NoSPoF) • data model • native multi-DC • Spark integration • time-series data. https://academy.datastax.com
  34. Q&A
      Ubuntu:
        sudo apt-get update
        sudo apt-get upgrade
        sudo apt-get install default-jdk
        echo "deb http://www.apache.org/dist/cassandra/debian 311x main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list
        sudo apt-key adv --keyserver pool.sks-keyservers.net --recv-key A278B781FE4B2BDA
        sudo apt-get update
        sudo apt-get install cassandra
        sudo systemctl start cassandra.service
      Docker:
        docker run --name cassandra-server -d cassandra
        sleep 20
        docker run -it --link cassandra-server:cassandra --rm cassandra cqlsh cassandra
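
The tunable-consistency slides (28 to 32) pair a replication factor with a consistency level, for example RF=3 with ONE or QUORUM. The arithmetic behind them can be sketched in a few lines of Python. This is an illustration only, not Cassandra code: the function names `quorum`, `required_acks` and `reads_see_latest_write` are hypothetical, and the sketch assumes a single datacenter (LOCAL_QUORUM applies the same quorum formula to the local datacenter's replication factor). It relies on two well-known rules: a quorum is floor(RF / 2) + 1 replicas, and a read is guaranteed to overlap the latest acknowledged write when W + R > RF.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must answer at QUORUM: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def required_acks(level: str, replication_factor: int) -> int:
    """Replicas that must acknowledge a request at the given level.

    Only the three levels needed for the slides are modelled here
    (an assumption of this sketch, not the full list Cassandra offers).
    """
    levels = {
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return levels[level]


def reads_see_latest_write(write_level: str, read_level: str,
                           replication_factor: int) -> bool:
    """True when W + R > RF, i.e. read and write replica sets must overlap."""
    w = required_acks(write_level, replication_factor)
    r = required_acks(read_level, replication_factor)
    return w + r > replication_factor


if __name__ == "__main__":
    rf = 3
    for write_level, read_level in [("ONE", "ONE"),
                                    ("QUORUM", "QUORUM"),
                                    ("ONE", "ALL")]:
        overlap = reads_see_latest_write(write_level, read_level, rf)
        print(f"RF={rf} write={write_level} read={read_level} overlap={overlap}")
```

With RF=3, QUORUM means 2 replicas, so QUORUM writes combined with QUORUM reads overlap (2 + 2 > 3), while ONE writes combined with ONE reads do not (1 + 1 = 2), which is where eventual consistency and background repair come in.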