Cassandra & Puppet: scaling data at $15 per month

1.
Cassandra & Puppet: Scaling data at $15/month
Constant Contact
March 2011
Dave Connors – VP Operations
Jim Ancona – Systems Architect
Mark Schena – Manager Systems Automation

2.
Constant Contact
2000–2010 Market leader for small businesses
Email, Event & Survey

3.
Over 400k paying customers

4.
No. 134 on the Deloitte Technology Fast 500 listing

Business model
Many customers pay as little as $15 a month

5.
~2 million database transactions per minute

Constant Contact
The business problem

6.
Small businesses are looking to us for help with social media marketing
Social media: 10-100 times more data

7.
Challenge with our business model

The Key Challenge
Integrate social media data
Solution = NoSQL

8.
Cost = Low

9.
Time to market = ?

Implementation
Implementing NoSQL
Ops and Dev both face issues
Data model

10.
Monitoring

11.
Authentication

12.
Logging

13.
Risk profile

14.
Roles & Responsibilities
Ops
Dev

15.
Apache Cassandra
Developed at Facebook

16.
Open sourced in 2008

17.
Incubated at Apache

18.
Became an Apache top-level project in 2010

19.
http://cassandra.apache.org

20.
In use at Digg, Facebook, Twitter, Reddit, Rackspace, Cloudkick, Cisco, …

21.
Largest production cluster has over 100 TB of data in over 150 machines

What is Cassandra?
Implemented in Java

22.
Fault Tolerant

23.
Elastic

24.
Durable

25.
Rich data model

26.
Replicated data

27.
Consistency options

Replication
How many copies of each piece of data do we want?
N = 3

28.
Consistency Level ONE

29.
Consistency Level QUORUM
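The replication and consistency-level slides can be made concrete with a little arithmetic. The sketch below is not from the deck: it assumes the usual definition of a quorum as a majority of the N replicas, and the function names are illustrative only. It checks which combinations of write and read levels force a read to touch at least one replica that received the latest write.

```python
def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority of N copies."""
    return replication_factor // 2 + 1


def reads_overlap_writes(n: int, write_replicas: int, read_replicas: int) -> bool:
    """True when any read set must share at least one replica with any write set."""
    return write_replicas + read_replicas > n


N = 3  # replication factor from the slide
LEVELS = {"ONE": 1, "QUORUM": quorum(N)}

if __name__ == "__main__":
    for write_name, w in LEVELS.items():
        for read_name, r in LEVELS.items():
            overlap = reads_overlap_writes(N, w, r)
            print(f"write={write_name:<6} read={read_name:<6} overlap guaranteed: {overlap}")
```

With N = 3 a quorum is 2 replicas, so QUORUM writes paired with QUORUM reads always overlap, while ONE paired with ONE trades that guarantee for lower latency.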

30.
Risks and Mitigation
Moving target

31.
Developer unfamiliarity

32.
Operational procedures

33.
Reliability concerns

34.
Deployment automation

35.
Community involvement

36.
Training/Consulting

37.
Application selection

38.
Lots of monitoring

39.
Phased rollout

Development Challenges
Understanding the data model
Choosing a client
Clients available for Java, Python, .NET, Ruby, PHP
Don’t use Thrift
Moving target

40.
Open Source
Not “one neck to wring”

41.
Paid support and training are available: http://datastax.com

42.
Community
Mailing lists
IRC: #cassandra at freenode
Contribute

Phased Rollout
Switchable modes

43.
Mirroring

44.
Dial-able traffic

Collaboration
Big, complex project

45.
Close collaboration

46.
Flexible roles

47.
Ability to iterate
Ops
Dev
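The Phased Rollout slide lists switchable modes, mirroring, and dial-able traffic without showing how they were built. The following is only a sketch of how those three ideas could fit together. Every name in it (`PhasedRolloutStore`, the mode strings, `new_read_percent`) is hypothetical, and plain dictionaries stand in for the legacy database and the Cassandra cluster.

```python
import random


class PhasedRolloutStore:
    """Hypothetical wrapper that routes traffic between a legacy and a new store."""

    MODES = ("legacy_only", "mirror", "new_only")

    def __init__(self, legacy, new, mode="legacy_only", new_read_percent=0, rng=None):
        if mode not in self.MODES:
            raise ValueError(f"unknown mode: {mode}")
        self.legacy = legacy
        self.new = new
        self.mode = mode                          # switchable at runtime
        self.new_read_percent = new_read_percent  # the "dial", 0 to 100
        self.rng = rng or random.Random()

    def write(self, key, value):
        # In mirror mode every write goes to both stores.
        if self.mode in ("legacy_only", "mirror"):
            self.legacy[key] = value
        if self.mode in ("mirror", "new_only"):
            self.new[key] = value

    def read(self, key):
        if self.mode == "legacy_only":
            return self.legacy.get(key)
        if self.mode == "new_only":
            return self.new.get(key)
        # Mirror mode: send a configurable share of reads to the new store.
        if self.rng.random() * 100 < self.new_read_percent:
            return self.new.get(key)
        return self.legacy.get(key)
```

Under these assumptions a rollout would start in `legacy_only`, switch to `mirror` with the dial at 0, raise the dial as monitoring builds confidence, and finish in `new_only`.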

48.
“Are you sure you really want that?”
3 × 500G disks

49.
1 × 250G disk

50.
No SWAP

51.
RAID Zero root partition and data storage

52.
32G memory

We will need how many servers?

53.
How many nodes?
Quorum = 3

54.
Multiple datacenters = 2

55.
Use only half the available disk = 2

56.
12 servers = ~1 TB of data storage

57.
3 × 2 = 6
× 2 = 12
× 6 = 72
72 servers = ~6 TB of data storage

58.
Random Partitioner
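The server count on this slide is a chain of multipliers, which can be written out as a short calculation. The sketch below uses only the slide's own figures (3, 2, 2, and roughly 1 TB per 12 servers). The constant and function names are illustrative, and the 1 TB figure is taken from the slide as given, not derived from the disk sizes.

```python
import math

REPLICAS = 3       # "Quorum = 3" on the slide
DATACENTERS = 2    # "Multiple datacenters = 2"
DISK_HEADROOM = 2  # "Use only half the available disk = 2"

# 3 x 2 x 2 = 12 servers make up one sizing block.
SERVERS_PER_BLOCK = REPLICAS * DATACENTERS * DISK_HEADROOM
TB_PER_BLOCK = 1.0  # slide: 12 servers hold roughly 1 TB of data


def servers_needed(target_tb: float) -> int:
    """Servers required for target_tb of data, rounded up to whole blocks."""
    blocks = math.ceil(target_tb / TB_PER_BLOCK)
    return blocks * SERVERS_PER_BLOCK


if __name__ == "__main__":
    for tb in (1, 6):
        print(f"{tb} TB of data -> {servers_needed(tb)} servers")
```

Six blocks of 12 servers give the 72 nodes quoted for about 6 TB of data.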

59.
Tool Chain

60.
DevOps with Puppet
Puppet is the shared framework between Operations and Development

61.
Versioning of Puppet code allows for adoption of development best practices

62.
Leverage domain-specific knowledge and skill

63.
Always Move Forward

64.
Operational Efficiencies
Remote logging is a requirement

65.
Cassandra uses log4j natively

66.
Resources not available for remote log4j development

67.
Scribed with Puppet provides the solution

Operational Efficiencies

68.
Development takes the operational lead
Munin

69.
JMX trending

70.
Identify critical data points

71.
Rapid development of graphs

72.
Puppet definitions are used for rapid deployment

Sample Munin Graph

73.
Puppet Code Example: Munin Puppet Code

    # One Munin JMX plugin per title of the form "<keyspace>.<columnfamily>.<attribute>".
    define munin::cassandracolumnfamily ( ) {
      include cassandravirtual
      # Realize the virtual "jmxbin" file resource.
      File <| title == "jmxbin" |>

      $confdir   = "/opt/cassandra-munin-plugins"
      $plugindir = "/etc/munin/plugins"
      $target    = "/opt/cassandra-munin-plugins/jmx_"

      # Match 3 strings separated by periods
      $pattern      = '^([^.]*)[.]([^.]*)[.]([^.]*)$'
      $keyspace     = regsubst($name, $pattern, '\1')
      $columnfamily = regsubst($name, $pattern, '\2')
      $file         = regsubst($name, $pattern, '\3')

      # Per-attribute plugin configuration, rendered from a template.
      file { "${keyspace}_${columnfamily}_${file}.conf":
        owner   => 'root',
        ensure  => 'file',
        group   => 'root',
        type    => 'file',
        path    => "${confdir}/${keyspace}_${columnfamily}_${file}.conf",
        mode    => '644',
        content => template("munin/attribute_${file}.conf.erb"),
        require => [
          Package['munin-node'],
          File['/opt/cassandra-munin-plugins'],
          File['jmxquery'],
        ],
      }

      # Symlink the generic jmx_ plugin into Munin's plugin directory.
      file { "$plugindir/${keyspace}_${columnfamily}_${file}":
        ensure  => 'link',
        owner   => 'root',
        group   => 'root',
        mode    => '511',
        type    => 'link',
        target  => "$target",
        require => [
          File['/opt/cassandra-munin-plugins'],
          File["${keyspace}_${columnfamily}_${file}.conf"],
          File['jmxquery'],
          Package['munin-node'],
        ],
      }
    }
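The define above derives three values from the resource title with a single regular expression and then builds two file paths and a template name from them. The Python sketch below mirrors only that naming logic so it can be tried outside Puppet. The function name `plugin_paths` and the sample title `Keyspace1.Standard1.readlatency` are made up for illustration, and raising an error on a non-matching title is a choice made for this sketch.

```python
import re

# Same pattern as the Puppet code: three period-separated fields.
PATTERN = re.compile(r"^([^.]*)[.]([^.]*)[.]([^.]*)$")

CONFDIR = "/opt/cassandra-munin-plugins"
PLUGINDIR = "/etc/munin/plugins"


def plugin_paths(name: str) -> dict:
    """Split '<keyspace>.<columnfamily>.<attribute>' and build the derived names."""
    match = PATTERN.match(name)
    if match is None:
        raise ValueError(f"expected three period-separated fields, got: {name!r}")
    keyspace, columnfamily, attribute = match.groups()
    base = f"{keyspace}_{columnfamily}_{attribute}"
    return {
        "conf": f"{CONFDIR}/{base}.conf",
        "link": f"{PLUGINDIR}/{base}",
        "template": f"munin/attribute_{attribute}.conf.erb",
    }


if __name__ == "__main__":
    # Hypothetical title, used only to show the shape of the output.
    for key, value in plugin_paths("Keyspace1.Standard1.readlatency").items():
        print(f"{key}: {value}")
```

Encoding the keyspace, column family, and attribute in the title means one declaration per graph is enough, which fits the deck's point about using definitions for rapid deployment.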

74.
Conclusion
Cassandra as an appliance

75.
Development best practices with life cycle management

76.
Traditional vs. Today

77.
                             Traditional    Today
Infrastructure               4 weeks        4 hours to build 72 nodes
Development to deployment    9 months       3 months
Cost                         Millions       150k

78.
Q&A
Thank You!
