Cassandra at Mahalo.com

Noah Silas          John Watson
Backend Developer   Data Systems Architect
noah@mahalo.com     johnw@mahalo.com

twitter: @noah256   twitter: @dctrwatson

About Mahalo

Mahalo.com is one of the top 200 domains on the net [1]

We serve ~12 million unique visitors per month

Served out of two geographically separate data centers

nginx, Apache, Python, Django stack
Primary Datastore - Replicated MySQL Cluster



[1] As reported by quantcast.com

Cassandra at Mahalo

Hardware: 8 Node Cluster
  CPU: 2x Intel Xeon E5530 Quad-Core 2.40GHz,
          8MB Cache, 5.86GT/s QPI

  RAM: 24GB (6 x 4GB) DDR3-1333 ECC Registered DIMM

Software:
  FreeBSD

  Lazyboy Python Client Library

Current Use Case - Activity Log

Near real-time feeds documenting site usage

Appears on user profiles, detailed page change logs

Each action on the site is recorded in between 4 and 4000 feeds
  - requirement: "Stupidly Fast Writes"

Data Model:
  Two Column Families
    ActivityLog
    ActivityLogIndexes

Important Lesson: Pick Unambiguous keys!
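
A minimal sketch of how the two column families could fit together, using plain Python dicts in place of a real Cassandra client. The slide does not spell out the schema, so everything here is an assumption for illustration: the `feed_key` helper, the `kind:id` key format, the column naming, and the idea that ActivityLogIndexes holds one row per feed pointing back at events stored once in ActivityLog. The type prefix is one possible reading of "unambiguous keys": without it, user 42 and page 42 would share the key "42".

```python
import time
import uuid
from collections import defaultdict

# In-memory stand-ins for the two column families named on the slide.
# A real deployment would go through a Cassandra client; dicts keep this runnable.
activity_log = {}                          # event key -> event columns
activity_log_indexes = defaultdict(dict)   # feed key -> {column name: event key}


def feed_key(kind, ident):
    """Build a feed key with a type prefix so different entity types never collide."""
    return "%s:%s" % (kind, ident)


def record_action(actor_id, verb, page_slug, feeds):
    """Store the event once, then write one small index column per feed."""
    event_key = uuid.uuid4().hex
    timestamp = time.time()
    activity_log[event_key] = {
        "actor": actor_id,
        "verb": verb,
        "page": page_slug,
        "ts": timestamp,
    }
    # Fixed-width timestamp so column names sort by time;
    # the event key suffix keeps simultaneous events distinct.
    column = "%017.6f:%s" % (timestamp, event_key)
    for feed in feeds:
        activity_log_indexes[feed][column] = event_key
    return event_key


def read_feed(feed, limit=10):
    """Return the newest events for one feed by following its index row."""
    index_row = activity_log_indexes.get(feed, {})
    newest_first = sorted(index_row, reverse=True)[:limit]
    return [activity_log[index_row[column]] for column in newest_first]
```

Each action costs one event write plus one index write per feed, with no read before the write, which is the shape that makes fanning out to thousands of feeds affordable.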

Current Use Case - Content Pages

Mahalo Content Pages provide comprehensive search results

Search results can be curated by our staff of Guides

Curated results must be stored and ordered
  - This was leading to large MySQL tables, with one table in
    particular exploding to nearly 20 million rows and ~15GB
    of data

Generally only one query is performed against this data: given a
page slug, find the curated results for that page.

When we migrated this table from MySQL into Cassandra we
saw immediate performance gains across our MySQL cluster
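
The access pattern above maps naturally onto one row per page slug. The sketch below models that with a dict; the names `curated_results`, `set_curated`, and `get_curated` are hypothetical, and the layout (slug as row key, position as column name) is an assumption rather than the schema Mahalo actually used.

```python
# Hypothetical in-memory model: one row per page slug, one column per curated result.
curated_results = {}   # page slug -> {position: result URL}


def set_curated(slug, ordered_urls):
    """Replace the curated list for a page, preserving the order chosen by the Guide."""
    curated_results[slug] = {
        position: url for position, url in enumerate(ordered_urls)
    }


def get_curated(slug):
    """The one query from the slide: given a page slug, return its results in order."""
    row = curated_results.get(slug, {})
    return [row[position] for position in sorted(row)]


set_curated("cassandra", ["http://example.com/b", "http://example.com/a"])
print(get_curated("cassandra"))
```

Because the only lookup is by slug, no secondary index or join is needed, which is why this table is a reasonable candidate to move out of MySQL.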

Our Experiences / Boneheaded Mistakes
Plan Ahead!!!

CASSANDRA-16 - Large Rows

Nagios Monitoring for Cassandra -
http://www.mahalo.com/how-to-monitor-cassandra-with-nagios

Cassandra Upgrades solve problems. Usually.

The CommitLog really does belong on a dedicated disk

Storing data encoded in difficult formats is a bad plan
  - example: python pickles
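
On the pickle point: pickled values can only be read back by Python, are awkward to inspect from other tools, and unpickling untrusted bytes can execute arbitrary code. The slide does not say which format replaced them, so the JSON encoding below is an assumption, shown only as one portable alternative. `encode_event` and `decode_event` are hypothetical helper names.

```python
import json


def encode_event(event):
    """Serialize a dict to UTF-8 JSON bytes; sorted keys keep the output stable."""
    return json.dumps(event, sort_keys=True).encode("utf-8")


def decode_event(raw):
    """Parse bytes written by encode_event back into a dict."""
    return json.loads(raw.decode("utf-8"))


raw = encode_event({"actor": 42, "verb": "edited"})
print(raw)                 # readable by any client or command-line tool
print(decode_event(raw))
```

The trade-off is that JSON only covers basic types, so datetimes and custom objects need an explicit conversion before they are stored.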

Our Experiences / Boneheaded Mistakes
Problems can be solved by throwing more memory at the JVM
heap, right?

Cluster Load Balancing - HAProxy is Awesome!
  - but it sometimes obscures which node is experiencing issues.

We have found that we don't need a memcached instance in
front of Cassandra

Onboarding Devs for Cassandra is Hard!
  - Terminology is overloaded from RDBMS world
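
On the load-balancing point above: when clients only ever see the proxy address, one way to find the misbehaving node is to probe each node directly. This is a rough sketch, not the monitoring setup from the talk. `probe_nodes` is a hypothetical helper, the host names in the usage note are placeholders, and a successful TCP connect only shows that the port is open, not that the node is healthy.

```python
import socket


def probe_nodes(nodes, timeout=1.0):
    """Try a direct TCP connect to each (host, port), bypassing the load balancer.

    Returns a dict mapping (host, port) to None on success,
    or to the error message when the connect fails.
    """
    results = {}
    for host, port in nodes:
        try:
            conn = socket.create_connection((host, port), timeout=timeout)
        except OSError as exc:
            results[(host, port)] = str(exc)
        else:
            conn.close()
            results[(host, port)] = None
    return results
```

Usage would look like `probe_nodes([("cass1.example.com", 9160), ("cass2.example.com", 9160)])`, where 9160 is Cassandra's default Thrift port and the host names are made up.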
