C* Summit EU 2013: The Cassandra Experience at Orange

Speaker: Jean Armel Luce — Senior Software Engineer/Cassandra Admin at Orange
Video: http://www.youtube.com/watch?v=mefOE9K7sLI&list=PLqcm6qE9lgKLoYaakl3YwIWP4hmGsHm5e&index=28
At Orange, Jean Armel has helped develop an open source tool for migrating data to Cassandra. He and his team needed the NoSQL solution Apache Cassandra to sustain the growth in requests and data volume required by their application, PnS. In this session, Jean Armel starts with an overview of the Orange application PnS, then explains why they chose Apache Cassandra and how they migrated their data without any interruption of service. He also shows how the application behaves after the migration.

  • Here are some clickable links to the three open source projects mentioned in these slides:

    https://github.com/Orange-OpenSource/mod_dup

    https://github.com/Orange-OpenSource/YACassandraPDO

    http://libdbi-drivers.cvs.sourceforge.net/viewvc/libdbi-drivers/libdbi-drivers/?pathrev=Branch-2012-07-02-cassandra

C* Summit EU 2013: The Cassandra Experience at Orange

  1. 1. The Cassandra Experience at Orange
     Project PnS 3.0
     Jean Armel Luce, Orange France/DSIF/DF/SDF
     V1.0
  2. 2. Summary
     §  Short description of PnS. Why did we choose C*?
     §  Our migration strategy
     §  After the migration …
     §  Analytics with Hadoop/Pig/Hive over Cassandra
     §  Contributions and open sourced modules from Orange & conclusions
     Jean Armel Luce - Orange-DOP-PnS 3.0, Cassandra Summit Europe – October 17 2013
  3. 3. Short description of PnS3
  4. 4. PnS – Short description
     §  PnS means Profiles and Syndication: PnS is a highly available service for collecting and serving live data about Orange customers
     §  End users of PnS are:
        –  Orange customers (logged in to the portal www.orange.fr)
        –  Sellers in Orange shops
        –  Some services in Orange (advertisements, …)
  5. 5. PnS – The Big Picture
     [Diagram]
     §  End users → PnS: millions of HTTP requests (REST or SOAP)
     §  PnS: fast and highly available web service to get or set data stored by PnS: postProcessing(data1), postProcessing(data2), postProcessing(data3), …, postProcessing(datax)
     §  Data providers → PnS: thousands of files (CSV or XML), scheduled data injection
     §  PnS → Database: DB queries, R/W operations
  6. 6. PnS2 – Architecture
     §  2 DCs architecture for high availability
     §  Until 2012, data were stored in 2 different backends:
        –  MySQL cluster (for volatile data)
        –  PostgreSQL « cluster » (sharding and replication)
     [Diagram labels: Bagnolet and Sophia Antipolis; web services (reads and writes); batch updates]
  7. 7. Timeline – Key dates of PnS 3.0
     •  Study phase (2010 to 2012, during PnS 2)
        We did a large study of a few NoSQL databases (Cassandra, MongoDB, Riak, HBase, Hypertable, …)
        → We chose Cassandra as the single backend for PnS
     •  Design phase (06/2012)
        We started the design phase of PnS 3.0
     •  Proof of concept (09/2012)
        We started a 1st (small) Cassandra cluster in production for a non-critical application: 1 table, key-value access
     •  Production phase (04/2013)
        Migration of the 1st subset of PnS data from MySQL cluster to Cassandra in production
     •  Complete migration (05/2013 to 12/2013)
        Migration of all other subsets of data from MySQL cluster and PostgreSQL to Cassandra
        Add new nodes to the cluster (from 8 nodes in each DC to 16 nodes in each DC)
        Add a 3rd datacenter for analytics
  8. 8. PnS – Why did we choose Cassandra?
     §  Cassandra fits our requirements:
        –  Very high availability: PnS2 = 99.95% availability, and we want to improve it
        –  Low latency: 20 ms < response time of PnS2 web service < 150 ms, and we want to improve it
        –  Scalability: higher load and higher volume in the coming years? Unpredictable; better scalability brings new business
     §  And also:
        –  Ease of use: Cassandra is easy to administer and operate
        –  Some features that I like (rack awareness, CL per request, …)
        –  Cassandra is very efficient for simple requests: « SELECT mycol1, mycol2, …, mycolx FROM mytable WHERE myprimarykey = 'mycustomerid' »
  9. 9. Our migration strategy
  10. 10. The migration - Input
     §  During the migration, we need to:
        –  maintain a very high availability
        –  maintain (or lower) the latency
        –  guarantee no functional regression
     §  Question: how can we migrate the data to Cassandra without any interruption of service?
  11. 11. The migration: Step by step processing
     §  Subdivision of data into many subsets according to many criteria:
        –  Same source of data
        –  Relationships between data
     §  And then, migrate each subset 1 by 1
     §  Definition of a generic process for all the subsets
     [Diagram: Subdivision into subsets → go to 1st subset → Migration of the data of the subset → Check/validation of the migration → Switch queries to Cassandra for the subset → go to next subset]
  12. 12. The migration: Tools and tips
     §  The migration strategy is based on 2 main facilities:
        –  mod_dup: an Apache module developed (and open sourced) by Orange teams. mod_dup can duplicate web requests, filter them on some criteria, substitute characters (regexp), and send the duplicated requests to another pool of web servers. It is used to fill the legacy (relational) database and the Cassandra database simultaneously during the migration of a subset.
        –  The timestamp management by Cassandra: each data item stored in C* is timestamped. It is possible to set this timestamp when inserting, updating, or deleting a data item in Cassandra. When Cassandra retrieves a data item, it returns the value having the most recent timestamp. We use this feature to distinguish the values stored before the migration started from the values inserted during or after the migration.
     [Diagram: HTTP requests → PNS 2 → mod_dup → PNS 3]
  13. 13. The migration: initial state (Step 0)
     [Diagram]
     §  End users → WebServer: HTTP REST/SOAP reads and writes
     §  WebServer → PNS 2 DB: SQL read/write
     §  Data providers → BatchInjector: file transfer via FTP or CFT
  14. 14. The migration: double feed (Step 1)
     [Diagram]
     §  Duplicate HTTP update streams from end users: mod_dup → WebServer → CQL update → PNS 3 Cassandra DB
     §  Duplicate streams (files) from data providers: Data providers → BatchInjector → PNS 3 Cassandra DB
  15. 15. The migration: copy data from PnS2 to PnS3 (Step 2)
     [Diagram]
     §  Batch injection into the PNS 3 Cassandra DB with timestamp = start date of extraction
     §  The double feed stays active: HTTP writes via mod_dup → WebServer, and Data providers → BatchInjector
  16. 16. The migration: control (Step 3)
     [Diagram]
     §  Synchro Control (SQL on the PnS 2 side, CQL on the PNS 3 Cassandra DB side)
     §  The double feed stays active: HTTP writes via mod_dup → WebServer, and Data providers → BatchInjector
  17. 17. The migration: switch reads (Step 4)
     §  100% of reads are now on Cassandra
     [Diagram labels: End users; HTTP read requests; HTTP REST/SOAP read and write; HTTP write via mod_dup; 2 WebServers; PNS 2 DB; Data providers; file transfer via FTP or CFT; 2 BatchInjectors; PNS 3 Cassandra DB]
  18. 18. The migration: stop double feed (Step 5)
     §  100% of reads on Cassandra
     §  100% of writes on Cassandra for HTTP requests
     §  100% of writes on Cassandra for data injection
     [Diagram labels: End users; HTTP read/write requests; HTTP write via mod_dup; 2 WebServers; PNS 2 DB; Data providers; file transfer via FTP or CFT; 2 BatchInjectors; PNS 3 Cassandra DB]
  19. 19. The migration
     §  Using this procedure:
        –  We can migrate to Cassandra without any interruption of service
        –  It is possible to switch progressively to Cassandra rather than doing a one-shot switch
        –  During the control phase, we can take time (a few days, a few weeks) to check that everything is OK before switching to Cassandra
        –  It is possible to easily roll back the migration of a subset if errors are found during the control phase, without losing any update
     §  Doesn't work if the queries are not idempotent.
     §  After the migration, we can easily duplicate production requests (entirely or partially) and send them to a bench platform thanks to mod_dup
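
The role of timestamps in steps 1 and 2 can be illustrated with a small simulation. This is only a sketch under stated assumptions: `LastWriteWinsStore` and `migrate_subset` are hypothetical names, the store is a toy in-memory stand-in rather than a Cassandra driver, and ties between equal timestamps are resolved more simply than in Cassandra itself. It shows why a bulk copy stamped with the extraction start date cannot overwrite a value written through the double feed afterwards.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Tuple


@dataclass
class Cell:
    value: str
    timestamp: int  # write timestamp chosen by the writer (e.g. microseconds)


@dataclass
class LastWriteWinsStore:
    """Toy in-memory stand-in for one table (hypothetical, not a driver API)."""
    cells: Dict[str, Cell] = field(default_factory=dict)

    def write(self, key: str, value: str, timestamp: int) -> None:
        current = self.cells.get(key)
        # Keep the incoming value only if it is strictly newer than the stored one.
        if current is None or timestamp > current.timestamp:
            self.cells[key] = Cell(value, timestamp)

    def read(self, key: str) -> Optional[str]:
        cell = self.cells.get(key)
        return cell.value if cell else None


def migrate_subset(
    legacy_snapshot: Dict[str, str],
    live_updates: Iterable[Tuple[str, str, int]],
    extraction_start: int,
) -> LastWriteWinsStore:
    store = LastWriteWinsStore()
    # Step 1 (double feed): duplicated live writes carry their own, later timestamps.
    for key, value, ts in live_updates:
        store.write(key, value, ts)
    # Step 2 (bulk copy): every extracted row is stamped with the extraction start date,
    # so it loses against any value written during the migration.
    for key, value in legacy_snapshot.items():
        store.write(key, value, extraction_start)
    return store


if __name__ == "__main__":
    snapshot = {"alice": "old-email", "bob": "bob-email"}
    updates = [("alice", "new-email", 1500)]  # written after extraction started
    result = migrate_subset(snapshot, updates, extraction_start=1000)
    print(result.read("alice"), result.read("bob"))
```

Because the outcome depends only on timestamps, the order in which the bulk copy and the duplicated writes arrive does not matter in this model, which is also why the approach relies on idempotent queries.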
  20. 20. After the migration …
  21. 21. The latency
     §  Comparison before/after migration to Cassandra
     §  Some graphs about the latency of the web services are very explicit:
     [Graphs: service push mail and service push webxms, with the dates of migration to C* marked]
  22. 22. The latency
     §  Read and write latencies are now in microseconds in the datanodes
     §  Thanks to [image] and [image]
     §  This latency will be improved by (tests in progress):
        ALTER TABLE syndic WITH compaction = { 'class' : 'LeveledCompactionStrategy', 'sstable_size_in_mb' : ?? };
  23. 23. The availability
     •  We got a few hardware failures and network outages
     •  No impact on QoS:
        –  no error returned by the application
        –  no real impact on latency
  24. 24. The scalability
     •  PnS activity is always increasing (volume of data and requests/sec)
     •  How to measure the capacity of a cluster?
        Capacity of a C* cluster = capacity of a node * number of nodes (true if all nodes are identical)
     •  There are 2 ways to deal with the expansion of activity:
        –  scale up (add more resources such as CPU, disks, RAM to each node)
        –  scale out (add new nodes to the cluster)
  25. 25. The scalability
     •  Thanks to vnodes (available since Cassandra 1.2), it is easy to scale out
     •  With NetworkTopologyStrategy, make sure to distribute the nodes evenly across the racks
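
The capacity rule from the scalability slides (cluster capacity = node capacity * number of nodes, for identical nodes) can be turned into a small sizing helper. This is a rough sketch, not a sizing method from the talk: `nodes_needed` is a hypothetical function, the per-node capacity of 700 requests/sec is an invented figure, and headroom for failures or compaction is not modelled. Rounding up to a multiple of the rack count reflects the advice to spread nodes evenly across racks.

```python
import math


def nodes_needed(target_load: float, per_node_capacity: float, racks: int = 1) -> int:
    """Number of identical nodes needed for target_load, evenly spread over racks.

    target_load and per_node_capacity use the same unit (e.g. requests/sec).
    """
    if target_load <= 0 or per_node_capacity <= 0 or racks < 1:
        raise ValueError("load and capacity must be positive, racks must be >= 1")
    # Linear model: cluster capacity = per-node capacity * number of nodes.
    raw = math.ceil(target_load / per_node_capacity)
    # Round up to a multiple of the rack count so each rack holds the same number of nodes.
    return math.ceil(raw / racks) * racks


if __name__ == "__main__":
    # Hypothetical figures: 10,000 requests/sec, 700 requests/sec per node, 4 racks.
    print(nodes_needed(10_000, 700, racks=4))
```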
  26. 26. Analytics with Hadoop/Pig/Hive over Cassandra
  27. 27. Basic architecture of the Cassandra cluster
     §  Cluster without Hadoop: 2 datacenters, 16 nodes in each DC
     §  RF (DC1, DC2) = (3, 3)
     §  Requests from web servers in DC1 are sent to C* nodes in DC1
     §  Requests from web servers in DC2 are sent to C* nodes in DC2
     [Diagram: Pool of web servers DC1 → DC1; Pool of web servers DC2 → DC2]
  28. 28. Architecture of the Cassandra cluster with the datacenter for analytics
     §  Cluster with Hadoop: 3 datacenters, 16 nodes in DC1, 16 nodes in DC2, 4 nodes in DC3
     §  RF (DC1, DC2, DC3) = (3, 3, 1)
     §  Because RF = 1 in DC3, we shall need less storage space in this datacenter
     §  We favor cheaper disks (SATA) in DC3 rather than SSDs or FusionIO cards
     §  Works better with the HSHA Thrift server (tests in progress)
  29. 29. Architecture of the Cassandra cluster with the datacenter for analytics
     [Diagram: Pool of web servers DC1 → DC1; Pool of web servers DC2 → DC2; DC3]
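
The storage effect of RF (DC1, DC2, DC3) = (3, 3, 1) can be estimated with simple arithmetic. The sketch below is an illustration under stated assumptions: `storage_per_datacenter` is a hypothetical helper, the 2 TB of unique data is an invented figure (the slides do not say whether their volume figure includes replicas), data is assumed to be spread evenly over nodes, and compression and compaction overhead are ignored.

```python
from typing import Dict, Tuple


def storage_per_datacenter(
    unique_data_tb: float, topology: Dict[str, Tuple[int, int]]
) -> Dict[str, Dict[str, float]]:
    """Estimate storage per datacenter and per node.

    topology maps a datacenter name to (number_of_nodes, replication_factor).
    """
    result = {}
    for dc, (nodes, rf) in topology.items():
        total = unique_data_tb * rf  # each DC stores rf full copies of the data
        result[dc] = {"total_tb": total, "per_node_tb": total / nodes}
    return result


if __name__ == "__main__":
    # Node counts and replication factors as stated on the slides.
    topology = {"DC1": (16, 3), "DC2": (16, 3), "DC3": (4, 1)}
    for dc, sizes in storage_per_datacenter(2.0, topology).items():
        print(dc, sizes)
```

Under these assumptions DC3 holds a third of the data stored in DC1 or DC2 in total, although each of its 4 nodes carries more than a node in the 16-node datacenters.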
  30. 30. Contributions and open sourced modules from Orange & conclusions
  31. 31. Contributions and open sourced modules
     §  Open sourced by Orange
        –  PHP driver for Cassandra: https://github.com/Orange-OpenSource/YACassandraPDO
           Thanks to Sandro Lex & Mathieu Lornac
        –  mod_dup (migration to Cassandra): https://github.com/Orange-OpenSource/mod_dup
           Thanks to Jonas Wustrack & Emmanuel Courreges
     §  Other contributions
        –  C driver (libdbi driver): http://libdbi-drivers.cvs.sourceforge.net/viewvc/libdbi-drivers/libdbi-drivers/?pathrev=Branch-2012-07-02-cassandra
           Thanks to Emmanuel Courreges
  32. 32. Conclusions
     §  With Cassandra, we have improved our QoS
     §  We are able to open our service to new opportunities
     §  There is an ecosystem around C* (Hadoop, Hive, Pig, Storm, Shark, …), which offers more capabilities. However, we would love to have some of the components (Hive) integrated in C* core (as Pig is)
     §  PnS3 works better than PnS2 and is hopefully cheaper
  33. 33. Thank you
  34. 34. Questions
  35. 35. A few answers about hardware / OS version / Java version / Cassandra version
     §  Hardware: 16 nodes in each DC at the end of 2013:
        –  6 CPU Intel® Xeon® 2.00 GHz
        –  24 GB RAM
        –  FusionIO 320 GB MLC
     §  OS: Ubuntu Precise (12.04 LTS)
     §  Cassandra version: 1.2.2 (with a few patches backported from 1.2.3)
     §  Java version: Java7u7 (not recommended, upgrade scheduled soon)
  36. 36. A few answers about data and requests
     §  Data types:
        –  elementary types: boolean, integer, string, date
        –  collection types
        –  complex types: json, xml (between 1 and 20 KB)
     §  Volume: 6 TB at the end of 2013
     §  Requests: 10,000 requests/sec at the end of 2013
        –  80% get
        –  20% set
     §  Consistency level used by PnS:
        –  ONE (95% of the queries)
        –  LOCAL_QUORUM (5% of the queries)
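
The two consistency levels listed above differ in how many local replicas must answer a request. The sketch below is a simplified model rather than driver code: `replicas_required` and `reads_see_latest_write` are hypothetical helpers, and only ONE and LOCAL_QUORUM are covered. It uses the standard quorum formula, floor(RF / 2) + 1, and the usual overlap rule that a read is guaranteed to reach at least one replica holding the latest write when read replicas + write replicas > RF within the datacenter.

```python
def replicas_required(consistency_level: str, local_rf: int) -> int:
    """Replicas in the local datacenter that must acknowledge a request."""
    if local_rf < 1:
        raise ValueError("replication factor must be >= 1")
    if consistency_level == "ONE":
        return 1
    if consistency_level == "LOCAL_QUORUM":
        return local_rf // 2 + 1
    raise ValueError(f"unsupported consistency level: {consistency_level}")


def reads_see_latest_write(read_cl: str, write_cl: str, local_rf: int) -> bool:
    """True when read and write replica sets must overlap in the local datacenter."""
    read_replicas = replicas_required(read_cl, local_rf)
    write_replicas = replicas_required(write_cl, local_rf)
    return read_replicas + write_replicas > local_rf


if __name__ == "__main__":
    rf = 3  # replication factor in DC1 and DC2, as stated on the slides
    print(replicas_required("ONE", rf), replicas_required("LOCAL_QUORUM", rf))
    print(reads_see_latest_write("ONE", "ONE", rf))
    print(reads_see_latest_write("LOCAL_QUORUM", "LOCAL_QUORUM", rf))
```

With RF = 3, ONE waits for a single replica while LOCAL_QUORUM waits for two, which is the trade-off between latency and consistency behind the 95% / 5% split.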