C* Summit 2013: Time for a New Relationship - Intuit's Journey from RDBMS to Cassandra by Mohit Anchlia

This session covers the journey of Intuit's Consumer Financial Platform, which is built to scale to petabytes of data. The original system used a major RDBMS, and from there we redesigned it to use the distributed nature of Cassandra. The talk goes through our transition, including the data model used for the final product. As with any large system transition, many hard lessons were learned; we will discuss those and share our experiences.

Published in: Technology

  1. Intuit Proprietary & Confidential. The Consumer Financial Platform (CFP). Mohit Anchlia, Architect, Intuit
  2. Agenda
     •  Background
     •  Problem statement
     •  Idea of a Platform
     •  Why Cassandra?
     •  CFP Stack
     •  CFP Cassandra Data Model
     •  Learning in Production
     •  Q&A
  3. Background
     •  Intuit is the maker of TurboTax, Quicken, QuickBooks, and many other products for SBUs.
     •  Many services work together to deliver an awesome product experience.
  4. Problem Statement (Service explosion)
     •  Service explosion over the years
        –  Code duplication
        –  Cross-cutting concerns
        –  Data silos (information silos)
        –  Operational challenges: schema design, installs
        –  Added overhead to test, and to repeat tests in production; slow prototyping
  5. Idea of a Platform
     •  Brings information together to avoid data silos
     •  Quick turnaround time
     •  Plug and play service framework
     •  Don't need IT and operations
     •  Highly personalized experience
     •  Security
     •  Share data between products, between users, to plug 'n' play
  6. Data Platform/Tier
     •  Principles – highly available, highly scalable, fast, easy-to-operate, software-only solution for structured and unstructured data (blobs)
     •  Projection – petabyte in 2-3 yrs
     •  Support – critical application with a 99.99% (four nines) SLA
     •  But wait … no stress
  7. Traditional RDBMS?
     •  Challenges with availability and scalability
     •  Sharding works well, but introduces new challenges as well
  8. NoSQL?
     •  Easy?
     •  Core use cases – most of the use cases don't need transactions, and with good design, consistency can be managed properly.
     •  Evaluated HBase, MongoDB, and Cassandra.
  9. Why Cassandra?
     •  Scalable
        –  Easy to scale horizontally
     •  Availability
        –  Highly available, can be designed for no SPOF
        –  Easy to set up clusters and replication between DCs
        –  Fast snapshots
        –  Rolling upgrades
     •  Operations
        –  Easy to install and operate
        –  Easy to make schema changes
     •  Fast
        –  Given the right hardware, Cassandra provides low-latency response times.
  10. High Level CFP Stack
     [Diagram labels: Data Platform; Services Platform; Mule ESB; Queue Service; Cache service; Cassandra; RedHat Storage (DFS); Analytics Platform; Mule ESB (services); Mule ESB; HBase; Hadoop; Search Engine; MPP; Flume]
     •  MuleSoft ESB for business logic orchestration, with frameworks for additional authoring
     •  Cassandra-powered schemaless database wrapped in entity and relationship logic
     •  RHS – a distributed file system for blob storage
     •  Hadoop/HBase/Solr/CEP – to meet batch processing and near-real-time analytics
  11. CFP Active/Active Multi-Data Center
     [Diagram: two data centers, DC-A and DC-B, behind a Global Load Balancer. Each has a Load Balancer, a Services Platform (Mule), a Data Platform (Cassandra, RedHat Storage (DFS)), and an Analytics Platform (Hadoop), with replication between the two data centers.]
     •  30mt session stickiness
     •  Provides HA
     •  Low latency
  12. CFP Schema
     •  Represented as a graph
        –  Entity
        –  Relationships
     •  Additional CF for indexes
        –  Inverted indexes driven by schema
     [Diagram labels: Entity User; Entity Document; Index CF]
  13. Learning in Production
     •  Monitor heap usage
        –  High and uneven CPU usage
        –  Add nodes if you can
        –  Reduce bloom filters
        –  Increase heap if you have to, don't be scared
     [Chart labels: Before; After]
     •  Monitor data per node – most importantly, keys per node
     •  Monitor disk IO
  14. The End
     We are hiring. Contact: mohit_anchlia@intuit.com
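
The graph schema on slide 12 (entities, relationships, and an additional CF holding schema-driven inverted indexes) can be illustrated with a minimal in-memory sketch. This is not Intuit's actual data model and uses no Cassandra driver: plain dictionaries stand in for column families, and every name here (`GraphStore`, `put_entity`, `relate`, `find`, `neighbors`, the `owns` relation, the sample attributes) is hypothetical, chosen only to show how a schema can decide which attributes get an inverted-index row.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class GraphStore:
    """In-memory stand-in for three column families (all names hypothetical)."""

    # Schema: entity type -> attribute names that get an inverted index.
    indexed_attrs: dict = field(default_factory=dict)
    # Entity CF: row key = entity id, columns = attribute name -> value.
    entities: dict = field(default_factory=dict)
    # Relationship CF: row key = source entity id, columns = (relation, target id).
    relationships: dict = field(default_factory=lambda: defaultdict(set))
    # Index CF: row key = (entity type, attribute, value), columns = entity ids.
    index: dict = field(default_factory=lambda: defaultdict(set))

    def put_entity(self, entity_id, entity_type, attrs):
        """Write an entity row, plus index rows for schema-indexed attributes.

        Simplification: overwriting an entity does not remove stale index rows.
        """
        self.entities[entity_id] = {"type": entity_type, **attrs}
        for name in self.indexed_attrs.get(entity_type, ()):
            if name in attrs:
                self.index[(entity_type, name, attrs[name])].add(entity_id)

    def relate(self, source_id, relation, target_id):
        """Record a directed edge from one entity to another."""
        self.relationships[source_id].add((relation, target_id))

    def find(self, entity_type, name, value):
        """Look up entity ids through the inverted index."""
        return sorted(self.index.get((entity_type, name, value), ()))

    def neighbors(self, source_id, relation):
        """Follow edges of one relation type from a source entity."""
        edges = self.relationships.get(source_id, ())
        return sorted(target for rel, target in edges if rel == relation)


if __name__ == "__main__":
    # Only User.email is indexed; Document attributes are not.
    store = GraphStore(indexed_attrs={"User": ["email"]})
    store.put_entity("u1", "User", {"email": "a@example.com"})
    store.put_entity("d1", "Document", {"title": "W-2"})
    store.relate("u1", "owns", "d1")
    print(store.find("User", "email", "a@example.com"))
    print(store.neighbors("u1", "owns"))
```

Keeping the index in its own structure mirrors the slide's "additional CF for indexes": a lookup by attribute value reads one index row instead of scanning every entity, at the cost of an extra write per indexed attribute.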