Cassandra & puppet, scaling data at $15 per month

Constant Contact shares lessons learned from a DevOps approach to implementing Cassandra to manage social media data for over 400k small business customers. Puppet is the critical piece of our tool chain. The single most important factor was the willingness of Development and Operations to stretch beyond their traditional roles and responsibilities.

Published in: Technology

Cassandra & Puppet: Scaling data at $15/month
Constant Contact
March 2011
  • Dave Connors – VP Operations
  • Jim Ancona – Systems Architect
  • Mark Schena – Manager Systems Automation

Constant Contact
2000 – 2010
Market leader for Small Businesses
  • Email, Event & Survey
  • Over 400k paying customers
  • No. 134 on the Deloitte Technology Fast 500 listing
Business model
  • Many customers pay as little as $15 a month
  • ~2 million database transactions per minute

Constant Contact: The business problem

Constant Contact
Small Businesses are looking to us for help with Social Media marketing
  • Social Media 10-100 times more data
  • Challenge with our business model

The Key Challenge
Integrate social media data
  • Solution = NoSQL
  • Cost = Low
  • Time to market = ?

Implementing NoSQL
Ops and Dev both face issues
  • Data model
  • Monitoring
  • Authentication
  • Logging
  • Risk profile
  • Roles & Responsibilities

Ops / Dev

Apache Cassandra
  • Developed at Facebook
  • Open sourced in 2008
  • Incubated at Apache
  • Became an Apache top-level project in 2010
  • http://cassandra.apache.org
  • In use at Digg, Facebook, Twitter, Reddit, Rackspace, Cloudkick, Cisco, …
  • Largest production cluster has over 100 TB of data in over 150 machines

What is Cassandra?
  • Implemented in Java
  • Fault Tolerant
  • Elastic
  • Durable
  • Rich data model
  • Replicated data
  • Consistency options

Replication
How many copies of each piece of data do we want?
N=3

Consistency Level ONE

Consistency Level Quorum

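The replication and consistency slides above name N=3 and the levels ONE and Quorum without showing the arithmetic behind them. The following is a minimal sketch of that arithmetic, assuming the usual definition of a quorum as a strict majority of the replicas. The function names are hypothetical and are not part of any Cassandra client API.

```python
def quorum(replication_factor: int) -> int:
    """Smallest strict majority of the replicas: floor(N / 2) + 1."""
    return replication_factor // 2 + 1


def replicas_required(level: str, replication_factor: int) -> int:
    """Replicas that must acknowledge a request at the given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return quorum(replication_factor)
    raise ValueError(f"level not covered by this sketch: {level}")


def reads_see_latest_write(read_level: str, write_level: str, n: int) -> bool:
    """True when the read and write replica sets must overlap (R + W > N)."""
    r = replicas_required(read_level, n)
    w = replicas_required(write_level, n)
    return r + w > n


N = 3  # replication factor from the Replication slide

print(quorum(N))                                      # 2
print(reads_see_latest_write("QUORUM", "QUORUM", N))  # True
print(reads_see_latest_write("ONE", "ONE", N))        # False
```

With N=3, reading and writing at Quorum touches two replicas each, so any read overlaps the latest write on at least one replica. ONE trades that guarantee for lower latency.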
Risks and Mitigation
  • Moving target
  • Developer unfamiliarity
  • Operational procedures
  • Reliability concerns
  • Deployment automation
  • Community involvement
  • Training/Consulting
  • Application selection
  • Lots of monitoring
  • Phased rollout

Development Challenges
  • Understanding the data model
  • Choosing a client
  • Clients available for Java, Python, .NET, Ruby, PHP
  • Don’t use Thrift
  • Moving target

Open Source
  • Not “one neck to wring”
  • Paid support and training is available: http://datastax.com
  • Community
      • Mailing lists
      • IRC #cassandra at freenode
  • Contribute

Phased Rollout
  • Switchable modes
  • Mirroring
  • Dial-able traffic

Collaboration
  • Big, complex project
  • Close collaboration
  • Flexible roles
  • Ability to iterate

Ops / Dev

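The Phased Rollout slide earlier in this part of the deck lists switchable modes, mirroring, and dial-able traffic without detail. The sketch below is one possible reading of those three ideas, not the presenters' implementation: the class name, mode names, and the use of plain dictionaries as stand-ins for the legacy store and Cassandra are all assumptions made for illustration.

```python
import zlib


class PhasedRolloutRouter:
    """Hypothetical router that moves traffic from a legacy store to a new one."""

    MODES = ("legacy_only", "mirror", "new_only")  # switchable modes

    def __init__(self, legacy: dict, new: dict, mode: str = "legacy_only",
                 read_percent: int = 0):
        if mode not in self.MODES:
            raise ValueError(f"unknown mode: {mode}")
        self.legacy = legacy
        self.new = new
        self.mode = mode
        self.read_percent = read_percent  # dial: share of reads sent to the new store

    def write(self, key: str, value: str) -> None:
        # Mirroring: in "mirror" mode every write goes to both stores.
        if self.mode in ("legacy_only", "mirror"):
            self.legacy[key] = value
        if self.mode in ("mirror", "new_only"):
            self.new[key] = value

    def read(self, key: str):
        if self.mode == "legacy_only":
            return self.legacy.get(key)
        if self.mode == "new_only":
            return self.new.get(key)
        # Dial-able traffic: a stable hash bucket per key decides which store serves it.
        bucket = zlib.crc32(key.encode("utf-8")) % 100
        store = self.new if bucket < self.read_percent else self.legacy
        return store.get(key)


router = PhasedRolloutRouter(legacy={}, new={}, mode="mirror", read_percent=0)
router.write("customer:42", "tweet data")
router.read_percent = 100  # turn the dial fully to the new store
print(router.read("customer:42"))
```

Hashing the key rather than drawing a random number keeps each key on the same store for a given dial setting, which makes problems easier to reproduce while the dial is being raised.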
“Are you sure you really want that?”
  • 3 x 500 GB disks
  • 1 x 250 GB disk
  • No swap
  • RAID Zero Root Partition and Data Storage
  • 32 GB Memory

We will need how many servers?

How many nodes?
  • Quorum = 3
  • Multiple Datacenters = 2
  • Use only half the available disk = 2
  • 12 Servers = ~1 TB of Data Storage
  • ~6 TB of Data Storage
3 x 2 = 6
6 x 2 = 12
12 x 6 = 72

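The node-count arithmetic on this slide can be written out as a short calculation. This is a sketch of one reading of the slide: three copies of the data, two datacenters, and a factor of two for using only half of each disk give 12 servers per ~1 TB, and ~6 TB of data then needs 72 nodes. The function and parameter names are hypothetical.

```python
def nodes_needed(replicas: int, datacenters: int, disk_headroom: int,
                 data_tb: int) -> int:
    """Multiply the sizing factors from the slide into a total node count."""
    servers_per_tb = replicas * datacenters * disk_headroom
    return servers_per_tb * data_tb


# Factors taken from the slide: 3 copies, 2 datacenters, half the disk used.
print(nodes_needed(replicas=3, datacenters=2, disk_headroom=2, data_tb=1))  # 12
print(nodes_needed(replicas=3, datacenters=2, disk_headroom=2, data_tb=6))  # 72
```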
Random Partitioner

Tool Chain

DevOps with Puppet
  • Puppet is the shared framework between Operations and Development
  • Versioning of Puppet code allows for adoption of development best practices
  • Leverage domain-specific knowledge and skill

Always Move Forward

Operational Efficiencies
  • Remote logging is a requirement
  • Cassandra uses log4j natively
  • Resources not available for remote log4j development
  • Scribed with Puppet provides the solution

Development takes the Operational Lead
  • Munin
  • JMX trending
  • Identify critical data points
  • Rapid development of graphs
  • Puppet Definitions are used for rapid deployment

Sample Munin Graph

Example: Munin Puppet Code

    define munin::cassandracolumnfamily ( ) {
      include cassandravirtual
      # Realize the virtual "jmxbin" file resource
      File <| title == "jmxbin" |>
      $confdir   = "/opt/cassandra-munin-plugins"
      $plugindir = "/etc/munin/plugins"
      $target    = "/opt/cassandra-munin-plugins/jmx_"
      # Match 3 strings separated by periods
      $pattern = '^([^.]*)[.]([^.]*)[.]([^.]*)$'
      # Each regsubst keeps one captured group of the resource name
      $keyspace     = regsubst($name, $pattern, '\1')
      $columnfamily = regsubst($name, $pattern, '\2')
      $file         = regsubst($name, $pattern, '\3')
      # Plugin configuration rendered from a per-attribute template
      file { "${keyspace}_${columnfamily}_${file}.conf":
        owner => 'root', ensure => 'file', group => 'root', type => 'file',
        path => "${confdir}/${keyspace}_${columnfamily}_${file}.conf",
        mode => '644',
        content => template("munin/attribute_${file}.conf.erb"),
        require => [ Package['munin-node'], File['/opt/cassandra-munin-plugins'], File['jmxquery'], ],
      }
      # Symlink in the Munin plugin directory pointing at the shared JMX plugin
      file { "${plugindir}/${keyspace}_${columnfamily}_${file}":
        ensure => 'link', owner => 'root', group => 'root', mode => '511', type => 'link',
        target => $target,
        require => [ File['/opt/cassandra-munin-plugins'], File["${keyspace}_${columnfamily}_${file}.conf"], File['jmxquery'], Package['munin-node'], ],
      }
    }

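The pattern in the Puppet define above splits a resource name of the form keyspace.columnfamily.file into three parts and builds two paths from them. The sketch below reproduces that splitting in Python so the behavior can be tried in isolation. The function name and the sample resource name `Keyspace1.Standard1.readlatency` are hypothetical. Unlike Puppet's `regsubst`, which returns the input unchanged when nothing matches, this sketch raises an error on a malformed name.

```python
import re

# Same pattern as in the Puppet define: 3 strings separated by periods.
PATTERN = re.compile(r"^([^.]*)[.]([^.]*)[.]([^.]*)$")


def plugin_paths(name: str,
                 confdir: str = "/opt/cassandra-munin-plugins",
                 plugindir: str = "/etc/munin/plugins") -> dict:
    """Split 'keyspace.columnfamily.file' and build the config and link paths."""
    match = PATTERN.match(name)
    if match is None:
        raise ValueError(f"expected keyspace.columnfamily.file, got: {name!r}")
    keyspace, columnfamily, file = match.groups()
    base = f"{keyspace}_{columnfamily}_{file}"
    return {
        "keyspace": keyspace,
        "columnfamily": columnfamily,
        "file": file,
        "conf": f"{confdir}/{base}.conf",  # path of the first file resource
        "link": f"{plugindir}/{base}",     # title of the symlink resource
    }


paths = plugin_paths("Keyspace1.Standard1.readlatency")  # hypothetical name
print(paths["conf"])
print(paths["link"])
```

One define instance per keyspace, column family, and attribute keeps each Munin graph a single line of Puppet at the call site.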
Conclusion
  • Cassandra as an appliance
  • Development Best Practices with Life Cycle Management
  • Traditional vs. Today

                                        Traditional   Today
    Infrastructure (to build 72 nodes)  4 weeks       4 hours
    Development to Deployment           9 months      3 months
    Cost                                Millions      150k

Q&A
Thank You!
