C* Summit 2013: Time for a New Relationship - Intuit's Journey from RDBMS to Cassandra by Mohit Anchlia
 

This session covers the journey of Intuit’s Consumer Financial Platform, which is built to scale to petabytes of data. The original system used a major RDBMS; from there, we redesigned it to take advantage of the distributed nature of Cassandra. This talk walks through our transition, including the data model used for the final product. As with any large system transition, we learned many hard lessons, and we will discuss those and share our experiences.


    Presentation Transcript

    • Intuit Proprietary & Confidential
        The Consumer Financial Platform (CFP)
        Mohit Anchlia, Architect, Intuit
    • Agenda
        • Background
        • Problem statement
        • Idea of a Platform
        • Why Cassandra?
        • CFP Stack
        • CFP Cassandra Data Model
        • Learning in Production
        • Q&A
    • Background
        • Intuit is the maker of TurboTax, Quicken, QuickBooks and many other products for SBUs.
        • Many services work together to deliver an awesome product experience.
    • Problem Statement (Service explosion)
        • Service explosion over the years
            – Code duplication
            – Cross-cutting concerns
            – Data silos (information silos)
            – Operational challenges: schema design, installs
            – Added overhead to test and to repeat tests in production (slow prototyping)
    • Idea of a Platform
        • Brings information together to avoid data silos
        • Quick turnaround time
        • Plug and play service framework
        • Don’t need IT and operations
        • Highly personalized experience
        • Security
        • Share data between products, between users, to plug ‘n’ play
    • Data Platform/Tier
        • Principles: a highly available, highly scalable, fast, easy-to-operate, software-only solution for structured and unstructured data (blobs)
        • Projection: a petabyte in 2-3 years
        • Support: critical application with a 99.99% SLA (labeled "5 nines" on the slide; 99.99% is four nines)
        • But wait ... no stress
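
The slide above pairs a 99.99% figure with the label "5 nines", and the two targets differ by a factor of ten in allowed downtime. The following is a minimal sketch of that arithmetic only; the function name and the 365-day year are assumptions made for illustration, not something stated in the talk.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years (assumption)


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability target."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return MINUTES_PER_YEAR * unavailable_fraction


# Compare four nines (99.99%) with five nines (99.999%).
for target in (99.99, 99.999):
    print(f"{target}% -> {allowed_downtime_minutes(target):.1f} min/year")
```

Under these assumptions, four nines leaves roughly 53 minutes of downtime per year and five nines roughly 5 minutes.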
    • Traditional RDBMS?
        • Challenges with availability and scalability
        • Sharding works well, but introduces new challenges as well
    • NoSQL?
        • Easy?
        • Core use cases: most of the use cases don’t need transactions, and with good design, consistency can be managed properly.
        • Evaluated HBase, MongoDB and Cassandra.
    • Why Cassandra?
        • Scalable
            – Easy to scale horizontally
        • Availability
            – Highly available, can be designed for no SPOF
            – Easy to set up clusters and replication between DCs
            – Fast snapshots
            – Rolling upgrades
        • Operations
            – Easy to install and operate
            – Easy to make schema changes
        • Fast
            – Given the right hardware, Cassandra provides low-latency response times.
    • High Level CFP Stack
        [Diagram labels: Data Platform, Services Platform, Mule ESB, Queue Service, Cache Service, Cassandra, RedHat Storage (DFS), Analytics Platform, Mule ESB (services), HBase, Hadoop, Search Engine, MPP, Flume]
        • MuleSoft ESB for business logic orchestration, with frameworks for additional authoring
        • Cassandra-powered schemaless database wrapped in entity and relationship logic
        • RHS: a distributed file system for blob storage
        • Hadoop/HBase/Solr/CEP to meet batch processing and near-real-time analytics
    • CFP Active/Active Multi-Data Center
        [Diagram labels, repeated for DC-A and DC-B: Data Platform, Services Platform, Cassandra, RedHat Storage (DFS), Analytics Platform, Hadoop, Mule, Load Balancer; Replication links between the data centers; Global Load Balancer]
        • 30-minute session stickiness
        • Provides HA
        • Low latency
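
The session stickiness mentioned on the slide above can be illustrated with a small routing sketch. This is not Intuit's load balancer configuration: the `StickyRouter` class, the round-robin choice for new sessions, and the decision to refresh the 30-minute window on every request are all assumptions made for the example. The slide only states the 30-minute stickiness and the two data centers, DC-A and DC-B.

```python
from dataclasses import dataclass, field

STICKINESS_SECONDS = 30 * 60  # 30-minute stickiness window, as on the slide


@dataclass
class StickyRouter:
    """Hypothetical global load balancer that pins a session to one data center."""

    data_centers: list[str]
    healthy: set[str] = field(default_factory=set)
    _pins: dict[str, tuple[str, float]] = field(default_factory=dict)
    _next: int = 0

    def __post_init__(self) -> None:
        # Assume every data center is healthy unless told otherwise.
        if not self.healthy:
            self.healthy = set(self.data_centers)

    def _pick_healthy(self) -> str:
        # Round-robin over the configured data centers, skipping unhealthy ones.
        for _ in range(len(self.data_centers)):
            dc = self.data_centers[self._next % len(self.data_centers)]
            self._next += 1
            if dc in self.healthy:
                return dc
        raise RuntimeError("no healthy data center available")

    def route(self, session_id: str, now: float) -> str:
        """Return the data center for this request; `now` is a time in seconds."""
        pin = self._pins.get(session_id)
        if pin is not None:
            dc, expires_at = pin
            if now < expires_at and dc in self.healthy:
                # Assumption: each request refreshes the stickiness window.
                self._pins[session_id] = (dc, now + STICKINESS_SECONDS)
                return dc
        # No valid pin (new session, expired, or pinned DC is down): re-pin.
        dc = self._pick_healthy()
        self._pins[session_id] = (dc, now + STICKINESS_SECONDS)
        return dc


router = StickyRouter(["DC-A", "DC-B"])
first = router.route("session-1", now=0.0)
assert router.route("session-1", now=600.0) == first  # still pinned after 10 minutes
```

In this sketch, a session fails over to the other data center as soon as its pinned one is marked unhealthy, which is one way the active/active layout could provide HA.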
    • CFP Schema
        • Represented as a graph
            – Entity
            – Relationships
        • Additional CF for indexes
            – Inverted indexes driven by schema
        [Diagram labels: Entity User, Entity Document, Index CF]
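
To make the graph-plus-index idea on the slide above concrete, here is a minimal in-memory sketch in which plain dictionaries stand in for column families. The row-key layout, the `GraphStore` class, the `INDEXED_ATTRIBUTES` schema, and the attribute names are all hypothetical; the slides name only the Entity, Relationship and Index CF concepts, not how CFP lays them out.

```python
from collections import defaultdict

# Hypothetical schema: which attributes of each entity type get an inverted index.
INDEXED_ATTRIBUTES = {"User": {"email"}, "Document": {"tax_year"}}


class GraphStore:
    """Dictionaries standing in for three column families (an illustration only)."""

    def __init__(self) -> None:
        self.entities = {}                      # row key: entity id -> attribute columns
        self.relationships = defaultdict(dict)  # row key: source id -> (relation, target) columns
        self.index = defaultdict(set)           # row key: "type:attr:value" -> entity ids

    def put_entity(self, entity_type: str, entity_id: str, attributes: dict) -> None:
        self.entities[entity_id] = {"_type": entity_type, **attributes}
        # The schema decides which attributes are written to the index CF.
        for name in INDEXED_ATTRIBUTES.get(entity_type, ()):
            if name in attributes:
                self.index[f"{entity_type}:{name}:{attributes[name]}"].add(entity_id)

    def relate(self, source_id: str, relation: str, target_id: str) -> None:
        # One column per edge, so all edges of an entity live in a single row.
        self.relationships[source_id][(relation, target_id)] = True

    def find(self, entity_type: str, name: str, value) -> list:
        """Look up entity ids through the inverted index."""
        return sorted(self.index.get(f"{entity_type}:{name}:{value}", ()))

    def related(self, source_id: str, relation: str) -> list:
        """Follow edges of one relation type from a source entity."""
        edges = self.relationships.get(source_id, {})
        return sorted(target for (rel, target) in edges if rel == relation)


store = GraphStore()
store.put_entity("User", "u1", {"email": "a@example.com"})
store.put_entity("Document", "d1", {"tax_year": 2012})
store.relate("u1", "owns", "d1")
print(store.find("User", "email", "a@example.com"))
print(store.related("u1", "owns"))
```

The sketch does not remove stale index entries when an indexed attribute changes; a real design would have to handle that on update.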
    • Learning in Production
        • Monitor heap usage
            – High and uneven CPU usage
            – Add nodes if you can
            – Reduce Bloom filters
            – Increase heap if you have to, don’t be scared
        [Charts: Before, After]
        • Monitor data per node, most importantly keys per node
        • Monitor disk IO
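
The slide above links heap pressure, Bloom filters and keys per node. One way to see the connection is the textbook sizing formula for an optimally configured Bloom filter, which needs about -ln(p) / (ln 2)^2 bits per key for a false-positive probability p. The sketch below applies that formula; it is an estimate under that assumption, not a measurement of Cassandra's own filter implementation, and the key count used is a made-up example.

```python
import math


def bloom_bits_per_key(false_positive_chance: float) -> float:
    """Bits per key for an optimally sized Bloom filter (textbook formula)."""
    return -math.log(false_positive_chance) / (math.log(2) ** 2)


def bloom_filter_bytes(keys: int, false_positive_chance: float) -> float:
    """Approximate memory, in bytes, needed to cover `keys` keys."""
    return keys * bloom_bits_per_key(false_positive_chance) / 8


KEYS_ON_NODE = 1_000_000_000  # hypothetical key count for one node

for chance in (0.01, 0.1):
    size_mib = bloom_filter_bytes(KEYS_ON_NODE, chance) / (1024 ** 2)
    print(f"fp chance {chance}: {bloom_bits_per_key(chance):.2f} bits/key, "
          f"about {size_mib:.0f} MiB")
```

Filter size grows linearly with the number of keys and shrinks as the accepted false-positive rate rises, which fits the advice to watch keys per node. In Cassandra the false-positive target is a per-table setting (`bloom_filter_fp_chance`); whether the filters count against the JVM heap depends on the Cassandra version.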
    • The End
        We are hiring. Contact: mohit_anchlia@intuit.com