Issues and Tips for Big Data on Cassandra

Shotaro Kamio
Architecture and Core Technology dept., DU, Rakuten, Inc.

Contents

1   Big Data Problem in Rakuten
2   Contributions to Cassandra Project
3   System Architecture
4   Details and Tips
5   Conclusion

Big Data Problem in Rakuten

• User data increases exponentially.
   – Double its size every second year.
• More than 1 billion records.
• We need a scalable solution to handle this big data.

[Chart: total data size by month-year, Jun-97 to Dec-10; annotation: x2 in 2 years]

Importance of Data Store in Rakuten


• Rakuten has a lot of data
   – User data, item data, reviews, etc.
• Expect connectivity to Hadoop
• High-performance, fault-tolerant, scalable
  storage is necessary → Cassandra

[Diagram: Service A, Service B, Service C, … on top of data stores Data A, Data B]

Performance of New System (Cassandra)

• Store all data in 1 day
   – Achieved 15,000 updates/sec with quorum.
   – 50 times faster than DB.
• Good read throughput
   – Handle more than 100 read threads at a time.

[Chart: update throughput, DB vs. New; New reaches 15,000 updates/sec, x 50]

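As a rough cross-check of the figures on this slide, the sketch below works out how many updates a sustained 15,000 updates/sec covers in one day, and how many replica acknowledgements a quorum write needs. The replication factor of 3 is the one quoted later in the deck; the quorum formula (floor(RF / 2) + 1) is the standard Cassandra definition; the class and method names are illustrative only.

```java
final class ThroughputSketch {
    /** Number of updates completed in 24 hours at a sustained rate. */
    static long updatesPerDay(long updatesPerSecond) {
        return updatesPerSecond * 24L * 60L * 60L;
    }

    /** Replica acknowledgements required for a QUORUM operation. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    public static void main(String[] args) {
        // 15,000 updates/sec is the rate quoted on the slide.
        System.out.println("updates per day: " + updatesPerDay(15_000L)); // 1296000000
        // Replication factor 3 is the value mentioned in the sizing slide.
        System.out.println("quorum for RF=3: " + quorum(3));               // 2
    }
}
```

At that rate one day covers about 1.3 billion updates, which is consistent with loading "more than 1 billion records" in a day.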
Contributions to Cassandra Project


• Tested 0.7.x - 0.8.x

• Bug reports / feedback to JIRA
   – CASSANDRA-2212, 2297, 2406, 2557, 2626 and more
   – Bugs related to specific conditions, secondary indexes, and large
     datasets.
• Contributed patches
   – Discussed in later slides.

JIRA: Overflow in bytesPastMark(..)


•   https://issues.apache.org/jira/browse/CASSANDRA-2297


• Hit the error on a row larger than 60GB
     – The row is in column families of super column type


• The bytesPastMark method was fixed to return a long value.

JIRA: Stack overflow while compacting


•   https://issues.apache.org/jira/browse/CASSANDRA-2626


• A long series of compactions causes a stack overflow.
← This occurs with a large dataset.

• Helped with debugging.

Challenges in OSS


• Not well tested with real big data.
→ Rakuten can give a lot of feedback to the community.
   – Bug reports, patches, and communication.
• OSS becomes much more stable.

Contribution of Patches


• Column name aliasing
  – Encodes column names in a compact way.
  – Useful to reduce data size for structured (relational)
    data.
  – Reduces SSTable size by 15%.
• Variable-length quantity (VLQ) compression
  – Reduces encoding overhead in columns.
  – Reduces SSTable size by 17%.

VLQ Compression Patch


• The serializer was changed to use VLQ encoding.
• A typical column has fixed-length fields of:
   –   2 bytes for column name length
   –   1 byte for flag
   –   8 bytes for TTL, deletion time
   –   8 bytes for timestamp
   –   4 bytes for length of value.
• These encoding overheads are reduced.

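The fixed fields listed above add up to 23 bytes per column (2 + 1 + 8 + 8 + 4). The following is a minimal sketch of a generic big-endian variable-length quantity encoder, to show why small lengths shrink to a single byte. It is not the code from the patch: the class name, method names, and sample values are assumptions made for illustration.

```java
import java.util.Arrays;

final class VlqSketch {
    /**
     * Encodes a non-negative value as a big-endian VLQ: 7 payload bits per
     * byte, with the high bit set on every byte except the last one.
     */
    static byte[] encode(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("value must be non-negative");
        }
        byte[] tmp = new byte[9];            // 63 bits need at most 9 groups of 7
        int pos = tmp.length;
        tmp[--pos] = (byte) (value & 0x7F);  // last byte: continuation bit clear
        value >>>= 7;
        while (value != 0) {
            tmp[--pos] = (byte) ((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        return Arrays.copyOfRange(tmp, pos, tmp.length);
    }

    /** Decodes a byte array that holds exactly one VLQ-encoded value. */
    static long decode(byte[] bytes) {
        long value = 0;
        for (byte b : bytes) {
            value = (value << 7) | (b & 0x7F);
        }
        return value;
    }

    public static void main(String[] args) {
        // Hypothetical samples: a short name length, a value length, and a
        // microsecond-resolution timestamp.
        long[] samples = {4L, 300L, 1_300_000_000_000_000L};
        for (long v : samples) {
            byte[] encoded = encode(v);
            System.out.println(v + " -> " + encoded.length
                    + " byte(s), round-trips to " + decode(encoded));
        }
    }
}
```

With this scheme a name length or value length below 128 takes 1 byte instead of 2 or 4, while a large timestamp gains little, so the saving comes mostly from the length and flag-like fields.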
System Architecture

[Diagram: components shown are source DBs, Batch jobs, Data feeder, Backup, the Cassandra 1 and Cassandra 2 clusters, and Services]

Planning: Schema Design


• Data modeling is a key to scalability.
• Design the schema
   – Query patterns for super columns and normal columns.
• Think about queries based on use cases.
   – Batch operations to reduce the number of requests, because Thrift has
     communication overhead.
• Secondary Index
   – We used it to find updated data.
• Choose the partitioner appropriately.
   – One partitioner per cluster.

Secondary Index


• Pros
   – Useful to query based on a column value.
   – It can reduce consistency problems.
   – For example, to query updated data based on update-time.
• Cons
   – Performance of a complex query depends on the data.
      E.g., Year == 2011 and Price < 100

A Bit of Detail on Secondary Index


   Works like a hash + filters.
    1. Pick up a row which has a key for the index (hash).
    2. Apply filters.
        – Collect the result if all filters are matched.
    3. Repeat until the requested number of rows is obtained.

E.g., Year == 2011 and Price < 100

Key      Year     Price
Key1     2011     (none)
Key2     2011     1,000
Key3     2011     10
Key4     2011     10,000
Key5     2011     200

Many keys with Year = 2011, but only a few results.

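The hash + filter behaviour described on this slide can be sketched as follows, using the slide's own five sample rows. This is a simplified in-memory model, not Cassandra's implementation: the maps, class name, and method names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class IndexScanSketch {
    /** Builds the slide's sample rows: row key -> (column name -> value). */
    static Map<String, Map<String, Integer>> sampleRows() {
        Map<String, Map<String, Integer>> rows = new LinkedHashMap<>();
        int[] prices = {-1, 1000, 10, 10000, 200}; // -1 marks "no Price column" (Key1)
        for (int i = 0; i < prices.length; i++) {
            Map<String, Integer> columns = new LinkedHashMap<>();
            columns.put("Year", 2011);
            if (prices[i] >= 0) {
                columns.put("Price", prices[i]);
            }
            rows.put("Key" + (i + 1), columns);
        }
        return rows;
    }

    /** Index on Year: column value -> row keys holding that value. */
    static Map<Integer, List<String>> buildYearIndex(Map<String, Map<String, Integer>> rows) {
        Map<Integer, List<String>> index = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, Integer>> row : rows.entrySet()) {
            Integer year = row.getValue().get("Year");
            if (year == null) {
                continue;
            }
            List<String> keys = index.get(year);
            if (keys == null) {
                keys = new ArrayList<>();
                index.put(year, keys);
            }
            keys.add(row.getKey());
        }
        return index;
    }

    /** Step 1: index lookup on Year. Step 2: filter on Price. Step 3: stop at limit. */
    static List<String> query(Map<String, Map<String, Integer>> rows,
                              Map<Integer, List<String>> yearIndex,
                              int year, int priceBelow, int limit) {
        List<String> results = new ArrayList<>();
        List<String> candidates = yearIndex.get(year);
        if (candidates == null) {
            return results;
        }
        for (String key : candidates) {
            if (results.size() >= limit) {
                break;
            }
            Integer price = rows.get(key).get("Price");
            if (price != null && price < priceBelow) {
                results.add(key);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Integer>> rows = sampleRows();
        Map<Integer, List<String>> index = buildYearIndex(rows);
        System.out.println("candidates examined: " + index.get(2011).size());
        System.out.println("matches: " + query(rows, index, 2011, 100, 10));
    }
}
```

All five rows are candidates for Year == 2011, but only Key3 passes the Price filter. With a large index and a selective filter, the loop keeps scanning candidates without filling the requested row count, which is where the cost comes from.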
A Bit of Detail on Secondary Index (2)


   Consider the frequency of results for the query
     – Very few results in a large data set → the query might time
       out.
   Careful data/query design is necessary at this moment.
   An improvement is being discussed: CASSANDRA-2915

Planning: Data Size Estimation


• Estimate future data volume
• Serialization overhead: x 3 - 4
   – Big overhead for small data.
   – We improved this with custom patches and compression code
      • Cassandra 1.0 can use Snappy/Deflate compression.
• Replication: x 3 (depends on your decision)
• Compaction: x 2 or above

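The multipliers on this slide can be combined into a rough disk estimate. The sketch below simply multiplies them; the 1 TB raw data volume is a hypothetical input, and the class and method names are illustrative.

```java
final class DiskEstimateSketch {
    /** Raw size multiplied by serialization, replication, and compaction factors. */
    static double estimate(double rawSize, double serializationFactor,
                           int replicationFactor, double compactionFactor) {
        return rawSize * serializationFactor * replicationFactor * compactionFactor;
    }

    public static void main(String[] args) {
        double rawTb = 1.0; // hypothetical raw data volume in TB
        // Serialization x3 to x4, replication x3, compaction x2 (slide values).
        double low = estimate(rawTb, 3.0, 3, 2.0);
        double high = estimate(rawTb, 4.0, 3, 2.0);
        System.out.println("cluster-wide disk needed: " + low + " to " + high + " TB");
    }
}
```

For 1 TB of raw data this gives 18 to 24 TB across the cluster, and since the compaction factor is "x 2 or above", the figure is a lower bound rather than a target.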
Other Factors for Data Size


• Obsolete SSTables
   – Disk usage may stay high after compaction.
   – Cassandra 0.8.x relies on GC to remove obsolete SSTables.
   – Improved in 1.0.

• How to balance data distribution
   – Disk usage can be unbalanced (ByteOrderedPartitioner).
   – Partitioning, key design, initial token assignment.
   – Very helpful if you know the data in advance.

• Backup scheme affects disk space
   – Need backup space.
   – Discussed later.

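As an illustration of initial token assignment, the sketch below computes evenly spaced tokens using the commonly cited formula for RandomPartitioner, token(i) = i * 2^127 / N. This is an assumption for the example: with ByteOrderedPartitioner, which the slide mentions, tokens have to be chosen from the actual key distribution instead. The class and method names are illustrative.

```java
import java.math.BigInteger;

final class InitialTokenSketch {
    /** Evenly spaced tokens over the range [0, 2^127) for nodeCount nodes. */
    static BigInteger[] evenTokens(int nodeCount) {
        if (nodeCount <= 0) {
            throw new IllegalArgumentException("nodeCount must be positive");
        }
        BigInteger range = BigInteger.ONE.shiftLeft(127); // 2^127
        BigInteger divisor = BigInteger.valueOf(nodeCount);
        BigInteger[] tokens = new BigInteger[nodeCount];
        for (int i = 0; i < nodeCount; i++) {
            tokens[i] = range.multiply(BigInteger.valueOf(i)).divide(divisor);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Hypothetical 4-node cluster.
        BigInteger[] tokens = evenTokens(4);
        for (int i = 0; i < tokens.length; i++) {
            System.out.println("node " + i + ": initial token " + tokens[i]);
        }
    }
}
```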
Configuration


• We adopted Cassandra 0.8.x + custom patches.
• Without mmap
   – No noticeable difference in performance
   – Easier to monitor and debug memory usage and GC-related
     issues
• ulimit
   – Avoid file descriptor shortage. Needs more than the number of db
     files. Bug??
   – “memlock unlimited” for JNA
   – Create /etc/security/limits.d/cassandra.conf (Red Hat)

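A limits file along the lines described on this slide might look like the fragment below. It uses the standard limits.conf column layout (domain, type, item, value). The `cassandra` user name and the `nofile` value are assumptions for illustration; the deck only states that memlock should be unlimited and that the descriptor limit must exceed the number of db files.

```
# /etc/security/limits.d/cassandra.conf
# <domain>    <type>  <item>    <value>
cassandra     -       memlock   unlimited
# Assumed value: pick a limit above the number of SSTable files plus sockets.
cassandra     -       nofile    100000
```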
JVM / GC


• Full GC has to be avoided at all times.
• The JVM cannot make good use of a heap larger than 15G.
   – Slow GC. Can be unstable.
   – Don’t put too much data/cache in the heap.
   – Off-heap cache is available in 0.8.1
• Cassandra may use more memory than the heap size.
   – ulimit -d 25000000 (max data segment size)
   – ulimit -v 75000000 (max virtual memory size)
• Benchmarking is needed to find appropriate parameters.

Parameter Tuning for Failure Detector


•   Cassandra uses the Phi Accrual Failure Detector
     – The Φ Accrual Failure Detector [SRDS'04]
•   Failure detection errors occur when a node is handling too much
    access and/or GC is running
•   Depends on number of nodes:
     – Larger cluster, larger number.

    // phi grows with the time elapsed since the last heartbeat arrived.
    double phi(long tnow)
    {
        int size = arrivalIntervals_.size();
        double log = 0d;
        if ( size > 0 )
        {
            double t = tnow - tLast_;
            double probability = p(t);
            log = (-1) * Math.log10( probability );
        }
        return log;
    }

    // Probability that a heartbeat is still outstanding after time t, using
    // an exponential model with the observed mean arrival interval.
    double p(double t)
    {
        double mean = mean();
        double exponent = (-1)*(t)/mean;
        return Math.pow(Math.E, exponent);
    }

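Because p(t) = e^(-t/mean), the code above reduces to phi = t / (mean * ln 10), so the silence needed to reach a given threshold is t = threshold * ln(10) * mean. The sketch below works through that arithmetic. The 1-second mean heartbeat interval is a hypothetical input, and the threshold of 8 is used as an example value (it is the usual default of Cassandra's `phi_convict_threshold` setting); the class and method names are illustrative.

```java
final class PhiSketch {
    /** phi for an exponential arrival model: -log10(exp(-elapsed / mean)). */
    static double phi(double elapsedMillis, double meanIntervalMillis) {
        return -Math.log10(Math.exp(-elapsedMillis / meanIntervalMillis));
    }

    /** Silence needed before phi reaches the threshold: threshold * ln(10) * mean. */
    static double millisUntilConvicted(double threshold, double meanIntervalMillis) {
        return threshold * Math.log(10.0) * meanIntervalMillis;
    }

    public static void main(String[] args) {
        double meanMillis = 1000.0; // hypothetical mean heartbeat interval
        for (double threshold : new double[] {8.0, 12.0}) {
            double t = millisUntilConvicted(threshold, meanMillis);
            System.out.printf("threshold %.0f -> about %.0f ms of silence (phi = %.2f)%n",
                    threshold, t, phi(t, meanMillis));
        }
    }
}
```

With a 1-second mean interval, a threshold of 8 corresponds to roughly 18.4 seconds without a heartbeat, so a long GC pause or an overloaded node can cross it. Raising the threshold lengthens that window proportionally.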
Hardware


• Benchmarking is important when deciding on hardware.
   – Requirements for performance, data size, etc.
   – Cassandra is good at utilizing CPU cores.
• Network ports will become a bottleneck when scaling out…
   – Large number of low-spec servers or
   – Small number of high-spec servers.



     Our case:
     • High-spec CPU and SSD drives
     • 2 clusters (active and test cluster)

Customize Hector Library


• Queries can time out on Cassandra:
   – When Cassandra is temporarily under high load.
   – Requests for a large result set
   – Timeouts of secondary index queries
• Hector retries forever when a query times out.
• The client cannot detect the infinite loop.
• Customization:
   – Return an exception to the client after 3 timeouts.

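The customization described above amounts to a bounded retry. The sketch below shows the idea in plain Java; it does not use Hector's API or reproduce the actual change, and the class name, method name, and simulated query are assumptions for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;

final class BoundedRetrySketch {
    /**
     * Runs the query, retrying on timeout, and rethrows the last
     * TimeoutException once maxTimeouts attempts have all timed out.
     */
    static <T> T callWithTimeoutLimit(Callable<T> query, int maxTimeouts) throws Exception {
        if (maxTimeouts < 1) {
            throw new IllegalArgumentException("maxTimeouts must be at least 1");
        }
        TimeoutException last = null;
        for (int attempt = 1; attempt <= maxTimeouts; attempt++) {
            try {
                return query.call();
            } catch (TimeoutException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last; // every attempt timed out: surface it to the client
    }

    public static void main(String[] args) {
        try {
            // Simulated query that always times out.
            callWithTimeoutLimit(() -> {
                throw new TimeoutException("simulated timeout");
            }, 3);
        } catch (Exception e) {
            System.out.println("gave up after 3 timeouts: " + e.getMessage());
        }
    }
}
```

Other exception types are not retried here; they propagate to the caller on the first occurrence.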
Testing: Data Consistency Check Tool


   • We wanted to make sure data is not corrupted within
      Cassandra.
   • Made a tool to check the data consistency.

[Diagram: input data (arrives periodically: insert, update, delete) goes to Process A, which inserts, updates, and deletes data in Cassandra. Process B compares the data in another database with that in Cassandra.]

Testing: Data Consistency Check Tool (2)


   Compares only the keys of the data, not the contents.
   Useful to diagnose which part is wrong in the test phase.
   We found another team’s bug as well.

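A key-only comparison like the one described can be sketched as a pair of set differences. This is not the tool itself: the key sets, class name, and method name are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

final class KeyDiffSketch {
    /** Keys present in expected but absent from actual, in sorted order. */
    static Set<String> missingFrom(Set<String> expected, Set<String> actual) {
        Set<String> diff = new TreeSet<>(expected);
        diff.removeAll(actual);
        return diff;
    }

    public static void main(String[] args) {
        // Hypothetical key sets read from the reference database and from Cassandra.
        Set<String> referenceKeys = new HashSet<>(Arrays.asList("a", "b", "c"));
        Set<String> cassandraKeys = new HashSet<>(Arrays.asList("b", "c", "d"));

        System.out.println("missing in Cassandra: "
                + missingFrom(referenceKeys, cassandraKeys));
        System.out.println("unexpected in Cassandra: "
                + missingFrom(cassandraKeys, referenceKeys));
    }
}
```

Keys missing in Cassandra point to lost inserts, and unexpected keys point to lost deletes, which helps narrow down which operation went wrong.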
Repair


• Some types of queries don’t trigger read repair.
• Nodetool repair is tricky on big data.
   – Disk usage
   – Time consuming
→ Read all data afterward instead: read repair

• Discussion of improvements is ongoing:
   – CASSANDRA-2699

Backup Scheme

  Backup might be required to shorten recovery time.
1. Snapshot to local disk
    – Plan disk size at the server estimation phase.
2. Full backup of input data
    – We had to feed the full data several times for various reasons:
       e.g., logic change, schema change, data corruption, etc.

[Diagram: incoming data is fed into Cassandra and also kept as a backup; Cassandra holds multiple snapshots]

Conclusion


• Rakuten uses Cassandra with big data.
• We’ll continue contributing to OSS.

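Finally...

Please allow me a little advertising...

We are hiring! We are actively recruiting mid-career hires!

Rakuten's Mission

Empower people and society (through the Internet),
and transform and enrich society through our own success.

Rakuten's Goal
              To become No.1
         Internet Service Company
                in the World

If you share Rakuten's Mission & Goal, please contact us!

       tech-career@mail.rakuten.com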

Rakuten Group, Inc.
 
Rakuten Services and Infrastructure Team.pdf
Rakuten Services and Infrastructure Team.pdfRakuten Services and Infrastructure Team.pdf
Rakuten Services and Infrastructure Team.pdf
Rakuten Group, Inc.
 
The Data Platform Administration Handling the 100 PB.pdf
The Data Platform Administration Handling the 100 PB.pdfThe Data Platform Administration Handling the 100 PB.pdf
The Data Platform Administration Handling the 100 PB.pdf
Rakuten Group, Inc.
 
Supporting Internal Customers as Technical Account Managers.pdf
Supporting Internal Customers as Technical Account Managers.pdfSupporting Internal Customers as Technical Account Managers.pdf
Supporting Internal Customers as Technical Account Managers.pdf
Rakuten Group, Inc.
 
Making Cloud Native CI_CD Services.pdf
Making Cloud Native CI_CD Services.pdfMaking Cloud Native CI_CD Services.pdf
Making Cloud Native CI_CD Services.pdf
Rakuten Group, Inc.
 
How We Defined Our Own Cloud.pdf
How We Defined Our Own Cloud.pdfHow We Defined Our Own Cloud.pdf
How We Defined Our Own Cloud.pdf
Rakuten Group, Inc.
 
Travel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech infoTravel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech info
Rakuten Group, Inc.
 
Travel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech infoTravel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech info
Rakuten Group, Inc.
 
OWASPTop10_Introduction
OWASPTop10_IntroductionOWASPTop10_Introduction
OWASPTop10_Introduction
Rakuten Group, Inc.
 
Introduction of GORA API Group technology
Introduction of GORA API Group technologyIntroduction of GORA API Group technology
Introduction of GORA API Group technology
Rakuten Group, Inc.
 
100PBを越えるデータプラットフォームの実情
100PBを越えるデータプラットフォームの実情100PBを越えるデータプラットフォームの実情
100PBを越えるデータプラットフォームの実情
Rakuten Group, Inc.
 
社内エンジニアを支えるテクニカルアカウントマネージャー
社内エンジニアを支えるテクニカルアカウントマネージャー社内エンジニアを支えるテクニカルアカウントマネージャー
社内エンジニアを支えるテクニカルアカウントマネージャー
Rakuten Group, Inc.
 

More from Rakuten Group, Inc. (20)

コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話
コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話
コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話
 
楽天における安全な秘匿情報管理への道のり
楽天における安全な秘匿情報管理への道のり楽天における安全な秘匿情報管理への道のり
楽天における安全な秘匿情報管理への道のり
 
What Makes Software Green?
What Makes Software Green?What Makes Software Green?
What Makes Software Green?
 
Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product At...
Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product At...Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product At...
Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product At...
 
DataSkillCultureを浸透させる楽天の取り組み
DataSkillCultureを浸透させる楽天の取り組みDataSkillCultureを浸透させる楽天の取り組み
DataSkillCultureを浸透させる楽天の取り組み
 
大規模なリアルタイム監視の導入と展開
大規模なリアルタイム監視の導入と展開大規模なリアルタイム監視の導入と展開
大規模なリアルタイム監視の導入と展開
 
楽天における大規模データベースの運用
楽天における大規模データベースの運用楽天における大規模データベースの運用
楽天における大規模データベースの運用
 
楽天サービスを支えるネットワークインフラストラクチャー
楽天サービスを支えるネットワークインフラストラクチャー楽天サービスを支えるネットワークインフラストラクチャー
楽天サービスを支えるネットワークインフラストラクチャー
 
楽天の規模とクラウドプラットフォーム統括部の役割
楽天の規模とクラウドプラットフォーム統括部の役割楽天の規模とクラウドプラットフォーム統括部の役割
楽天の規模とクラウドプラットフォーム統括部の役割
 
Rakuten Services and Infrastructure Team.pdf
Rakuten Services and Infrastructure Team.pdfRakuten Services and Infrastructure Team.pdf
Rakuten Services and Infrastructure Team.pdf
 
The Data Platform Administration Handling the 100 PB.pdf
The Data Platform Administration Handling the 100 PB.pdfThe Data Platform Administration Handling the 100 PB.pdf
The Data Platform Administration Handling the 100 PB.pdf
 
Supporting Internal Customers as Technical Account Managers.pdf
Supporting Internal Customers as Technical Account Managers.pdfSupporting Internal Customers as Technical Account Managers.pdf
Supporting Internal Customers as Technical Account Managers.pdf
 
Making Cloud Native CI_CD Services.pdf
Making Cloud Native CI_CD Services.pdfMaking Cloud Native CI_CD Services.pdf
Making Cloud Native CI_CD Services.pdf
 
How We Defined Our Own Cloud.pdf
How We Defined Our Own Cloud.pdfHow We Defined Our Own Cloud.pdf
How We Defined Our Own Cloud.pdf
 
Travel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech infoTravel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech info
 
Travel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech infoTravel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech info
 
OWASPTop10_Introduction
OWASPTop10_IntroductionOWASPTop10_Introduction
OWASPTop10_Introduction
 
Introduction of GORA API Group technology
Introduction of GORA API Group technologyIntroduction of GORA API Group technology
Introduction of GORA API Group technology
 
100PBを越えるデータプラットフォームの実情
100PBを越えるデータプラットフォームの実情100PBを越えるデータプラットフォームの実情
100PBを越えるデータプラットフォームの実情
 
社内エンジニアを支えるテクニカルアカウントマネージャー
社内エンジニアを支えるテクニカルアカウントマネージャー社内エンジニアを支えるテクニカルアカウントマネージャー
社内エンジニアを支えるテクニカルアカウントマネージャー
 

Recently uploaded

De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 

Recently uploaded (20)

De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 

Cassandra conference

  • 1. Issues and Tips for Big Data on Cassandra. Shotaro Kamio, Architecture and Core Technology dept., DU, Rakuten, Inc.
  • 2. Contents: 1 Big Data Problem in Rakuten; 2 Contributions to Cassandra Project; 3 System Architecture; 4 Details and Tips; 5 Conclusion
  • 4. Big Data Problem in Rakuten • User data increases exponentially. – More than 1 billion records. – Doubles in size every second year. • We need a scalable solution to handle this big data. (Chart: total size by month-year, Jun-97 to Dec-10, with a "x2 in 2 years" annotation.)
  • 5. Importance of Data Store in Rakuten • Rakuten has a lot of data – User data, item data, reviews, etc. • Connectivity to Hadoop is expected • High-performance, fault-tolerant, scalable storage is necessary → Cassandra (Diagram: Service A, Service B, Service C, …; Data A, Data B.)
  • 6. Performance of New System (Cassandra) • Store all data in 1 day – Achieved 15,000 updates/sec with quorum. – 50 times faster than DB. • Good read throughput – Handles more than 100 read threads at a time. (Chart: DB vs. New, x 50, 15,000 updates/sec.)
  • 8. Contributions to Cassandra Project • Tested 0.7.x - 0.8.x • Bug reports / feedback to JIRA – CASSANDRA-2212, 2297, 2406, 2557, 2626 and more – Bugs related to specific conditions, secondary indexes, and large datasets. • Contributed patches – Discussed in later slides.
  • 9. JIRA: Overflow in bytesPastMark(..) • https://issues.apache.org/jira/browse/CASSANDRA-2297 • Hit the error on a row larger than 60GB – The row is in a column family of super column type • The bytesPastMark method was fixed to return a long value.
  • 10. JIRA: Stack overflow while compacting • https://issues.apache.org/jira/browse/CASSANDRA-2626 • A long series of compactions causes a stack overflow. ← This occurs with large datasets. • Helped with debugging.
  • 11. Challenges in OSS • Not well tested with real big data. → Rakuten can give a lot of feedback to the community. – Bug reports, patches, and communication. • OSS becomes much more stable.
  • 12. Contribution of Patches • Column name aliasing – Encodes column names in a compact way. – Useful for reducing data size for structured (relational) data. – Reduces SSTable size by 15%. • Variable-length quantity (VLQ) compression – Reduces encoding overhead in columns. – Reduces SSTable size by 17%.
  • 13. VLQ Compression Patch • The serializer is changed to use VLQ encoding. • A typical column has fixed-length fields of: – 2 bytes for column name length – 1 byte for flag – 8 bytes for TTL, deletion time – 8 bytes for timestamp – 4 bytes for length of value. • This encoding overhead is reduced.
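To make the idea on slide 13 concrete, here is a minimal sketch of a variable-length integer codec. It is not the code of the actual patch: the class name `VlqSketch` is hypothetical, and the byte order (7-bit groups, least-significant group first) is an assumption, since the slides do not state it. The point is only that a small number, such as a short value length, takes 1 byte instead of a fixed 4.

```java
import java.io.ByteArrayOutputStream;

/** Illustrative base-128 variable-length codec (hypothetical; not the patch itself). */
final class VlqSketch {

    /** Encodes a non-negative value using 7 payload bits per byte. */
    static byte[] encode(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values are not handled in this sketch");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (value >= 0x80) {
            out.write((int) ((value & 0x7F) | 0x80)); // high bit set: more bytes follow
            value >>>= 7;
        }
        out.write((int) value); // final byte: high bit clear
        return out.toByteArray();
    }

    /** Decodes a value produced by encode(). */
    static long decode(byte[] bytes) {
        long result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break; // last byte of this number
            }
            shift += 7;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(encode(5).length);    // a value length of 5 fits in 1 byte
        System.out.println(encode(300).length);  // 300 needs 2 bytes
        System.out.println(decode(encode(300))); // round trip
    }
}
```

The fixed-length fields listed on the slide add up to 2 + 1 + 8 + 8 + 4 = 23 bytes per column, so the saving depends on how many of those fields usually hold small numbers.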
  • 15. System Architecture (diagram): DB … DB, Batch, Data feeder, Cassandra 1, Cassandra 2, Services, Backup
  • 17. Planning: Schema Design • Data modeling is a key to scalability. • Design schema – Query patterns for super columns and normal columns. • Think of queries based on use cases. – Use batch operations to reduce the number of requests, because Thrift has communication overhead. • Secondary index – We used it to find updated data. • Choose the partitioner appropriately. – One partitioner per cluster.
  • 18. Secondary Index • Pros – Useful for queries based on a column value. – It can reduce consistency problems. – For example, to query updated data based on update time. • Cons – Performance of a complex query depends on the data. E.g., Year == 2011 and Price < 100
  • 19. A Bit of Detail on Secondary Index • Works like a hash + filters. 1. Pick up a row which has a key for the index (hash). 2. Apply filters. – Collect the result if all filters match. 3. Repeat until the requested number of rows is obtained. E.g., Year == 2011 and Price < 100: Key1 Year = 2011; Key2 Year = 2011, Price = 1,000; Key3 Year = 2011, Price = 10; Key4 Year = 2011, Price = 10,000; Key5 Year = 2011, Price = 200. Many keys with Year = 2011, but few results.
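The "hash + filters" behaviour on slide 19 can be sketched as below. This is a toy model, not Cassandra code: `IndexScanSketch` and its methods are hypothetical names, the row values are the ones from the slide's example, and Key1 is given no Price because none appears in the transcript.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of an indexed lookup followed by filtering (hypothetical names). */
final class IndexScanSketch {

    /** Outcome of a query: matching keys and how many candidate rows were read. */
    static final class Result {
        final List<String> keys = new ArrayList<>();
        int rowsScanned = 0;
    }

    /** Rows reachable through the index entry Year == 2011; value is the Price column. */
    static Map<String, Integer> rowsForYear2011() {
        Map<String, Integer> rows = new LinkedHashMap<>();
        rows.put("Key1", null);   // no Price shown for Key1 in the transcript
        rows.put("Key2", 1000);
        rows.put("Key3", 10);
        rows.put("Key4", 10000);
        rows.put("Key5", 200);
        return rows;
    }

    /** Walks the indexed candidates and applies the Price filter until 'limit' rows match. */
    static Result query(Map<String, Integer> candidates, int maxPriceExclusive, int limit) {
        Result result = new Result();
        for (Map.Entry<String, Integer> row : candidates.entrySet()) {
            result.rowsScanned++;
            Integer price = row.getValue();
            if (price != null && price < maxPriceExclusive) {
                result.keys.add(row.getKey());
            }
            if (result.keys.size() >= limit) {
                break; // requested number of rows obtained
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Result r = query(rowsForYear2011(), 100, 10);
        System.out.println(r.keys + " after scanning " + r.rowsScanned + " rows");
    }
}
```

With a limit larger than the number of matches, every candidate row is read to return a single key, which is the cost pattern the slide warns about.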
  • 20. A Bit of Detail on Secondary Index (2) • Consider the frequency of results for the query – Very few results in a large data set → the query might time out. • Careful data/query design is necessary at this moment. • An improvement is being discussed: CASSANDRA-2915
  • 21. Planning: Data Size Estimation • Estimate future data volume • Serialization overhead: x 3 - 4 – Big overhead for small data. – We improved this with custom patches and compression code • Cassandra 1.0 can use Snappy/Deflate compression. • Replication: x 3 (depends on your decision) • Compaction: x 2 or above
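The multipliers on slide 21 compound, which a small calculation makes visible. The sketch below simply multiplies them together; the class name `DiskEstimateSketch` and the 100 GB input are hypothetical, and treating the factors as independent multipliers is a simplifying assumption.

```java
/** Back-of-the-envelope disk estimate from the slide's multipliers (hypothetical helper). */
final class DiskEstimateSketch {

    /**
     * @param rawGb               size of the source data in GB
     * @param serializationFactor on-disk overhead of serialization (slide: 3 to 4)
     * @param replicationFactor   number of replicas (slide: 3, depends on your decision)
     * @param compactionHeadroom  free-space multiplier needed for compaction (slide: 2 or above)
     */
    static double requiredDiskGb(double rawGb, double serializationFactor,
                                 int replicationFactor, double compactionHeadroom) {
        return rawGb * serializationFactor * replicationFactor * compactionHeadroom;
    }

    public static void main(String[] args) {
        // Hypothetical input: 100 GB of raw data.
        System.out.println(requiredDiskGb(100, 3.0, 3, 2.0)); // lower end of the range
        System.out.println(requiredDiskGb(100, 4.0, 3, 2.0)); // upper end of the range
    }
}
```

Under these assumptions the cluster-wide requirement is 18 to 24 times the raw size, before any space reserved for snapshots or backups.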
  • 22. Other Factors for Data Size • Obsolete SSTables – Disk usage may stay high after compaction. – Cassandra 0.8.x relies on GC to remove obsolete SSTables. – Improved in 1.0. • How to balance data distribution – Disk usage can be unbalanced (ByteOrderedPartitioner). – Partitioning, key design, initial token assignment. – Very helpful if you know the data in advance. • Backup scheme affects disk space – Backup space is needed. – Discussed later.
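As an illustration of initial token assignment, the sketch below computes evenly spaced tokens. It assumes RandomPartitioner, whose token space runs from 0 to 2^127; the slides do not say which partitioner was finally used, and with ByteOrderedPartitioner the tokens would instead have to follow the actual key distribution. `TokenSketch` is a hypothetical name.

```java
import java.math.BigInteger;

/** Evenly spaced initial tokens, assuming RandomPartitioner's 0..2^127 ring (hypothetical helper). */
final class TokenSketch {

    static final BigInteger RING_SIZE = BigInteger.ONE.shiftLeft(127); // 2^127

    /** Returns one token per node: token_i = i * 2^127 / nodeCount. */
    static BigInteger[] evenTokens(int nodeCount) {
        if (nodeCount <= 0) {
            throw new IllegalArgumentException("nodeCount must be positive");
        }
        BigInteger[] tokens = new BigInteger[nodeCount];
        for (int i = 0; i < nodeCount; i++) {
            tokens[i] = RING_SIZE.multiply(BigInteger.valueOf(i))
                                 .divide(BigInteger.valueOf(nodeCount));
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (BigInteger token : evenTokens(4)) {
            System.out.println(token);
        }
    }
}
```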
  • 23. Configuration • We adopted Cassandra 0.8.x + custom patches. • Without mmap – No noticeable difference in performance – Easier to monitor and debug memory usage and GC-related issues • ulimit – Avoid file descriptor shortage. Need more than the number of db files. Bug?? – “memlock unlimited” for JNA – Create /etc/security/limits.d/cassandra.conf (Redhat)
  • 24. JVM / GC • Full GC has to be avoided at all times. • The JVM cannot utilize a large heap over 15G. – Slow GC. Can be unstable. – Don’t put too much data/cache into the heap. – Off-heap cache is available in 0.8.1 • Cassandra may use more memory than the heap size. – ulimit -d 25000000 (max data segment size) – ulimit -v 75000000 (max virtual memory size) • Benchmarks are needed to find appropriate parameters.
  • 25. Parameter Tuning for Failure Detector • Cassandra uses the Phi Accrual Failure Detector – The Φ Accrual Failure Detector [SRDS'04] • Failure detection errors occur when a node is under too much access and/or GC is running • Depends on the number of nodes: – Larger cluster, larger number. Code excerpt shown on the slide:

      // phi grows with the time elapsed since the last heartbeat arrived
      double phi(long tnow)
      {
          int size = arrivalIntervals_.size();
          double log = 0d;
          if ( size > 0 )
          {
              double t = tnow - tLast_;
              double probability = p(t);
              log = (-1) * Math.log10( probability );
          }
          return log;
      }

      // probability that a heartbeat arrives later than t, given the mean arrival interval
      double p(double t)
      {
          double mean = mean();
          double exponent = (-1)*(t)/mean;
          return Math.pow(Math.E, exponent);
      }
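A self-contained version of the same calculation makes the tuning trade-off easier to see. This is a sketch, not Cassandra's class: `PhiSketch` and `isConvicted` are hypothetical names, and it assumes the "number" that grows with cluster size is the conviction threshold (`phi_convict_threshold` in cassandra.yaml).

```java
/** Stand-alone phi calculation with an exponential arrival model (hypothetical helper). */
final class PhiSketch {

    /** phi = -log10(P(heartbeat later than t)), with P = e^(-t / mean). */
    static double phi(double millisSinceLastHeartbeat, double meanIntervalMillis) {
        double probability = Math.exp(-millisSinceLastHeartbeat / meanIntervalMillis);
        return -Math.log10(probability);
    }

    /** A node is treated as down once phi exceeds the configured threshold. */
    static boolean isConvicted(double phi, double threshold) {
        return phi > threshold;
    }

    public static void main(String[] args) {
        double mean = 1000.0; // hypothetical mean heartbeat interval in ms
        for (double silence : new double[] {1000, 10000, 20000}) {
            double phi = phi(silence, mean);
            System.out.println(silence + " ms silent: phi=" + phi
                    + " convicted at 8? " + isConvicted(phi, 8.0));
        }
    }
}
```

Because phi is simply t / (mean × ln 10) in this model, a GC pause or load spike that delays heartbeats raises phi linearly, and a higher threshold tolerates a proportionally longer silence before a node is marked down.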
  • 26. Hardware • Benchmarking is important for deciding on hardware. – Requirements for performance, data size, etc. – Cassandra is good at utilizing CPU cores. • Network ports will be the bottleneck when scaling out… – Large number of low-spec servers or – Small number of high-spec servers. • Our case: – High-spec CPU and SSD drives – 2 clusters (active and test cluster)
  • 28. Customize Hector Library • Queries can time out on Cassandra: – When Cassandra is temporarily under high load. – Requests for large result sets – Timeouts of secondary index queries • Hector retries forever when a query times out. • The client cannot detect the infinite loop. • Customization: – After 3 timeouts, return an exception to the client.
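The customization on slide 28 amounts to a bounded retry. The sketch below shows that pattern in plain Java; it does not use or reproduce Hector's API, and `BoundedRetrySketch` and `callWithRetries` are hypothetical names. It also reads the slide's "3 timeouts" as "give up on the third timeout", which is an interpretation.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;

/** Generic bounded-retry wrapper (hypothetical; not Hector code). */
final class BoundedRetrySketch {

    /**
     * Runs the query, retrying on timeout, and rethrows the timeout to the
     * caller once maxTimeouts timeouts have been seen.
     */
    static <T> T callWithRetries(Callable<T> query, int maxTimeouts) throws Exception {
        int timeouts = 0;
        while (true) {
            try {
                return query.call();
            } catch (TimeoutException e) {
                timeouts++;
                if (timeouts >= maxTimeouts) {
                    throw e; // surface the failure instead of looping forever
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        final int[] attempts = {0};
        String result = callWithRetries(() -> {
            if (attempts[0]++ < 2) {
                throw new TimeoutException("simulated timeout");
            }
            return "ok";
        }, 3);
        System.out.println(result + " after " + attempts[0] + " attempts");
    }
}
```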
  • 30. Testing: Data Consistency Check Tool • We wanted to make sure data is not corrupted within Cassandra. • Made a tool to check data consistency. (Diagram: input data comes in periodically as inserts, updates, and deletes; Process A inserts, updates, and deletes data in the Cassandra database; another Process B compares the data with that in Cassandra.)
  • 31. Testing: Data Consistency Check Tool (2) • Compares only the keys of data, not the contents. • Useful for diagnosing which part is wrong in the test phase. • We found another team’s bug as well.
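A key-only comparison like the one described on slides 30 and 31 can be sketched with two set differences. This is not the actual tool: `KeyDiffSketch` and its method names are hypothetical, and it assumes both key sets fit in memory, which would not hold for a data set of the size discussed in this talk.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

/** Key-only consistency check between expected input and stored data (hypothetical helper). */
final class KeyDiffSketch {

    /** Keys that should exist after applying the input, but are absent from the store. */
    static Set<String> missingInStore(Set<String> expectedKeys, Set<String> storedKeys) {
        Set<String> missing = new TreeSet<>(expectedKeys);
        missing.removeAll(storedKeys);
        return missing;
    }

    /** Keys present in the store that the input says should not exist (e.g. failed deletes). */
    static Set<String> unexpectedInStore(Set<String> expectedKeys, Set<String> storedKeys) {
        Set<String> unexpected = new TreeSet<>(storedKeys);
        unexpected.removeAll(expectedKeys);
        return unexpected;
    }

    public static void main(String[] args) {
        Set<String> expected = new TreeSet<>(Arrays.asList("user:1", "user:2", "user:3"));
        Set<String> stored = new TreeSet<>(Arrays.asList("user:2", "user:3", "user:9"));
        System.out.println("missing:    " + missingInStore(expected, stored));
        System.out.println("unexpected: " + unexpectedInStore(expected, stored));
    }
}
```

Comparing keys only keeps the check cheap, at the cost of not noticing a row whose key is present but whose contents are stale.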
  • 32. Repair • Some types of queries don’t trigger read repair. • Nodetool repair is tricky on big data. – Disk usage – Time consuming → Read all data afterward: Read repair • Discussion of improvements is ongoing: – CASSANDRA-2699
  • 34. Backup Scheme • Backup might be required to shorten recovery time. 1. Snapshot to local disk – Plan disk size at the server estimation phase. 2. Full backup of input data – We ran a full data feed several times for various reasons, e.g., logic change, schema change, data corruption, etc. (Diagram: DB … DB, incoming data, Cassandra, Backup, Snapshots.)
  • 36. Conclusion • Rakuten uses Cassandra with big data. • We’ll continue contributing to OSS.
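  • 38. We are hiring! We are actively recruiting mid-career hires! • Rakuten's Mission: to empower people and society (through the Internet), and to transform and enrich society through our own success. • Rakuten's Goal: To become No.1 Internet Service Company in the World • If you share Rakuten's Mission and Goal, please get in touch! tech-career@mail.rakuten.com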