REAL WORLD CASSANDRA AT NASA

Christopher Keller
December 13th, 2012
THANKS!
I failed to copy this to iCloud after the DC presentation
WHO AM I?

• a CSC solutions architect working at the advanced
 supercomputing facility (NAS) at NASA Ames in silicon valley

• consulted at various federal agencies during the tech boom of
 the 90’s

  • classified and unclassified

• http://about.me/christopherkeller
WHO I’M NOT



• a cassandra expert

• someone pushing a corporate agenda
ENVIRONMENT

• unix based enterprise (desktops, servers, supercomputers)

• heavy writes around the clock from incoming data, but far
 fewer analytical reads

• we retain the data in a raw format, but it does not need to be
 in a database (however we can easily load old data)

• we need flexibility as technology and our requirements evolve
 over time
THE PROBLEM

• TL;DR - how to use all of our available data to make
 supercomputing more secure for our customers

• replace a COTS security event management system

  • poor query performance

  • difficult to extend and integrate with our custom software

  • pre-defined analytics were a big plus, but overall there were
   more minuses than pluses for our environment
WHY CASSANDRA

• snapshotting for backups was lightning fast

• no single point of failure

• reads are fast, writes are faster

• i did research other solutions (couchbase, hbase, mongo, riak,
 etc), but didn’t find anything compelling enough to trial
WHY CASSANDRA

• simple clustering = win

  • availability + scalability + replication

• built in data expiration was key

• enabling technology that allowed us to ask new
 questions
IN THE BEGINNING...

• set up a virtualized three node cluster on a spare server

• wrote the cassandra equivalent of “hello world” to check

  • replication / availability

  • data expiration

  • rough performance estimates
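
The data expiration check can be pictured with a small in-memory model. This is only a sketch of the semantics (a column written with a time-to-live stops being returned once it has expired), not Cassandra's implementation and not the original test code. The `TTLStore` class and its method names are hypothetical, and the clock is injected so the example runs without waiting.

```python
import time


class TTLStore:
    """Toy key/column store where a write may carry a time-to-live (seconds)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._rows = {}  # row key -> {column name: (value, expires_at or None)}

    def insert(self, key, columns, ttl=None):
        # Columns written without a ttl never expire.
        expires_at = None if ttl is None else self._clock() + ttl
        row = self._rows.setdefault(key, {})
        for name, value in columns.items():
            row[name] = (value, expires_at)

    def get(self, key):
        # Expired columns are simply filtered out at read time.
        now = self._clock()
        row = self._rows.get(key, {})
        return {
            name: value
            for name, (value, expires_at) in row.items()
            if expires_at is None or expires_at > now
        }


# Fake clock so the expiry can be observed immediately.
fake_now = [1000.0]
store = TTLStore(clock=lambda: fake_now[0])
store.insert("hello", {"greeting": "world"}, ttl=60)
store.insert("hello", {"permanent": "yes"})

before = store.get("hello")   # both columns visible
fake_now[0] += 61             # move past the 60 second ttl
after = store.get("hello")    # only the column without a ttl remains
```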
ARE YOU KIDDING ME?

• selling cassandra to management was easier than i thought

• the NAS is very receptive to new technology even though we
 prefer to be system integrators rather than developers

• my testing showed that cassandra works...shocking!!!

• open source resources are good, DataStax being able to
 provide support after i leave is better
TAX DOLLARS AT WORK

• bought five servers for around 22.5k

 • 3 of them for our production cluster

 • 1 for our data parsing and loading

 • 1 for our analytics

• those were our only purchases, the rest has been primarily my
 labor hours
[Figure: write operations. y-axis: Operations (5,000 to 30,000); x-axis: Elapsed Time; series: 6 nodes, 9 nodes (v), 9 nodes (p)]

[Figure: latency. y-axis: Milliseconds (0 to 20); x-axis: Elapsed Time; series: 6 nodes, 9 nodes (v), 9 nodes (p)]




      http://christophernkeller.tumblr.com/post/15242366864/cassandra-benchmarks
TAKEAWAY



• bare metal > virtualized w/ assigned disks > fully virtualized

• match your hardware to your environment, expertise, and
 requirements
CURRENT CLUSTER

• gentoo running xen 4.1.2 & apache cassandra 1.1.3

• three virtual nodes per physical server

  • 7 CPUs, 15 GB RAM, 1.2 TB disk

• eight disks per physical server

  • 2 running the hypervisor + OS in a RAID 1

  • 2 disks per virtual machine in a RAID 0
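
The disk and node counts on this slide can be cross-checked with a few lines of arithmetic. A sketch only: the variable names are illustrative, and the assumption that all three production servers share this layout comes from combining this slide with the purchase list, not from an explicit statement in the deck.

```python
# Figures taken from the slides; names are illustrative.
production_servers = 3     # "3 of them for our production cluster"
vms_per_server = 3         # "three virtual nodes per physical server"
disks_per_server = 8       # "eight disks per physical server"
hypervisor_disks = 2       # RAID 1 for hypervisor + OS
disks_per_vm = 2           # RAID 0 per virtual machine

# Every disk in a server should be accounted for by the layout.
disks_used = hypervisor_disks + vms_per_server * disks_per_vm

# Total cassandra nodes if each production server hosts the same VM count.
cassandra_nodes = production_servers * vms_per_server

print(disks_used, cassandra_nodes)
```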
ELAPSED TIME

• empty rack to benchmarks took about five days over the
 course of christmas/new years 2011

• very helpful to understand our hardware limits and how
 cassandra scaled

• understanding how to model the data and effectively use
 cassandra took a lot longer...i’m still learning
HELPFUL TIPS

• always start with the questions you plan to ask the data

• if you know these, your job just got exponentially easier

• if you never deviate from this, you’re lucky

  • once you realize how powerful cassandra is, you’ll figure out
    new questions that may change things

• don’t use supercolumns
MAINTENANCE

• i haven’t done serious sys-admin in years...had to develop tools
 from scratch

 • cluster start up and shutdown scripts

 • use good CM software (we use puppet)

     • OS, Cassandra & JVM upgrades

     • cassandra-env.sh & cassandra.yaml
TRIAL AND ERROR

• a lot of testing dealing with how to organize the data

  • secondary indexes

  • materialized views

• i’d get failures and errors in cassandra that were solved by
  changing the schema to be more efficient (based on our
  questions)

• try not to think relationally, it wasn’t helping me
THIS WORKED...POORLY

uid  name    age  gender
1    chris   39   male
2    jaeden  2    male

uid  job        hobby
1    architect  jiu-jitsu
2    toddler    gaming

uid  employer  phone       address
1    csc       5555555555  123 Main St
2    mom       4444444444  123 Main St
THIS WORKED WELL

row key    1234       1235
architect  json blob
toddler               json blob

json blob under architect: {"age":"39","name":"chris","gender":"male"...}
json blob under toddler:   {"age":"2","name":"jaeden","gender":"male"...}

row key    4567       7364
male       json blob  json blob

row key    3453       4554
chris      json blob
jaeden                json blob
WHY DID THAT WORK

• we only have to query a single table

 • aren’t you glad you optimized the schemas for the questions
   ahead of time?

• manual joins by reading successive column families resulted in
 timeout errors even though the cluster was idle and
 everything was on the same switch segment
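
The layout in the diagram can be mimicked with plain dictionaries to show why each question needs only one read. This is a sketch under assumptions: the three dictionaries stand in for column families (row key, then column name, then JSON blob), the names `by_job`, `by_gender`, `by_name` and `ask` are hypothetical, and the blobs carry only the fields visible on the slide.

```python
import json

# Two records, stored as complete JSON blobs as on the slide.
chris = json.dumps({"age": "39", "name": "chris", "gender": "male"})
jaeden = json.dumps({"age": "2", "name": "jaeden", "gender": "male"})

# One stand-in column family per question we want to ask.
# Layout: row key -> {column name: JSON blob}
by_job = {"architect": {"1234": chris}, "toddler": {"1235": jaeden}}
by_gender = {"male": {"4567": chris, "7364": jaeden}}
by_name = {"chris": {"3453": chris}, "jaeden": {"4554": jaeden}}


def ask(column_family, row_key):
    """Answer one question with one row read; no joins across families."""
    row = column_family.get(row_key, {})
    return [json.loads(blob) for blob in row.values()]


architects = ask(by_job, "architect")
males = ask(by_gender, "male")
```

Because every blob is complete, the caller never has to follow an id into a second column family, which is the manual join that produced the timeouts described above.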
LESSONS LEARNED


• if your data changes frequently, de-normalization is annoying,
  but can be solved with discipline

• give yourself a lot of experimentation time if you’re new to
  cassandra

  • if you are hitting problems...likely you’re doing it wrong
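
One way to read "discipline" here is a single write path that updates every denormalized copy together, so no view is left stale. The sketch below illustrates that idea with in-memory dictionaries; the `DenormalizedWriter` class, its field names, and the read-before-write bookkeeping are assumptions made for illustration, not the approach used at NAS.

```python
import json


class DenormalizedWriter:
    """Keeps every query-specific view in step by writing them together."""

    def __init__(self, indexed_fields=("job", "gender", "name")):
        # One view per indexed field: field value -> {record id: JSON blob}
        self.views = {field: {} for field in indexed_fields}
        self._current = {}  # record id -> last record written

    def save(self, record_id, record):
        old = self._current.get(record_id, {})
        blob = json.dumps(record)
        for field, view in self.views.items():
            old_value, new_value = old.get(field), record.get(field)
            # If the indexed value changed, remove the stale entry first.
            if old_value is not None and old_value != new_value:
                row = view[old_value]
                row.pop(record_id, None)
                if not row:
                    del view[old_value]
            # Write the full, current blob into this view.
            if new_value is not None:
                view.setdefault(new_value, {})[record_id] = blob
        self._current[record_id] = dict(record)


writer = DenormalizedWriter()
writer.save("1234", {"name": "chris", "age": "39", "gender": "male", "job": "architect"})
# A later change goes through the same path, so all three views are updated.
writer.save("1234", {"name": "chris", "age": "40", "gender": "male", "job": "architect"})
```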
TECHNICAL TIPS

• use ‘-pr’ to repair each node at least every gc_grace_seconds

• use a script which staggers weekly repairs across the nodes

• once you assign a token ID, you can remove it from
 cassandra.yaml and keep the same file across nodes

• you are free to use the Thrift bindings for the language of your
 choice, but save yourself time and use a high level client (eg
 for Java, Python, Scala, PHP, Erlang, etc)
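
The staggered repair tip could be sketched as below. This is not the script from the talk: the host names are hypothetical, the function only builds a schedule of `nodetool -h <host> repair -pr` command strings rather than running them, and the 864000 second value is Cassandra's default gc_grace_seconds, which may differ from what a given column family uses.

```python
from datetime import timedelta

GC_GRACE_SECONDS = 864000  # Cassandra's default (10 days); check your own schema
WEEK = timedelta(days=7)


def stagger_repairs(hosts, period=WEEK):
    """Spread one `repair -pr` per host evenly across the period.

    Returns a list of (offset from period start, command string) pairs.
    """
    # Every node must be repaired within gc_grace_seconds.
    if period.total_seconds() >= GC_GRACE_SECONDS:
        raise ValueError("repair period must be shorter than gc_grace_seconds")
    step = period / len(hosts)
    return [
        (step * i, f"nodetool -h {host} repair -pr")
        for i, host in enumerate(hosts)
    ]


# Hypothetical host names for a nine node cluster.
hosts = [f"cass{n:02d}" for n in range(1, 10)]
schedule = stagger_repairs(hosts)
for offset, command in schedule:
    print(f"+{offset}  {command}")
```

The offsets could then be fed to cron or a similar scheduler; running only one repair at a time keeps the extra load on the cluster predictable.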
HOW I SPENT MY TIME


• i’d spend a few hours writing code to load data into
  cassandra, then another few hours writing code to retrieve it

  • the data browsers aren’t great and are unhelpful with blobs

• then i’d profile the performance, tweak the code, tweak the
  schema, reload the data and repeat until i was happy
ANALYTICS


• all server side analytics are developed in python using
  subprocesses for parallel performance

  • pycassa is our cassandra client library

• our web layer is currently ruby on rails, but we might end up
  going with django to stay language consistent
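
A minimal sketch of the parallel pattern, assuming the work can be split into independent chunks whose partial results are merged afterwards. It uses the standard library `multiprocessing.Pool`; the event records, the `source` field, and the function names are hypothetical, and the pycassa reads that would feed each worker are left out because they need a live cluster.

```python
from collections import Counter
from multiprocessing import Pool


def count_by_source(events):
    """Per-chunk work: tally events by their (hypothetical) source field."""
    return Counter(event["source"] for event in events)


def split(items, parts):
    """Divide items into `parts` roughly equal contiguous chunks."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def merge(counters):
    """Combine the partial tallies returned by each worker."""
    total = Counter()
    for counter in counters:
        total.update(counter)
    return total


if __name__ == "__main__":
    events = [{"source": "ssh"}, {"source": "web"}, {"source": "ssh"}] * 1000
    # One chunk per worker process; results come back in chunk order.
    with Pool(processes=4) as pool:
        partials = pool.map(count_by_source, split(events, 4))
    print(merge(partials))
```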
SHOW STOPPERS


• dealing with an incredibly annoying recurring JMX crash, but it
 doesn’t seem to affect cassandra stability

  • other cassandra sites haven’t seen this, so it may just be a
   consequence of java6 on gentoo

• commitlog_total_space_in_mb was being ignored (FIXED in 1.1.3)
RECENT SHOW STOPPERS

• 1.1.3 accidentally removed the ability to drop column families

  • pick your poison - full disks or data that never goes away

• recent v6 JVM patches required raising per-thread stack sizes to 180k

  • nodes were up individually, zero log errors, gossip was up, but
    the nodes weren’t talking collectively

• cassandra solves a need, but bugs like this make my customers
 wary
ROAD AHEAD
cql
map/reduce
solr
ops center
SHOUT OUT

• the folks at datastax have been very helpful

  • Tyler Hobbs (cassandra developer)

  • Darren Sack (accounts)

  • Michael Shaler (biz dev)

• everyone in #cassandra on irc.freenode.org
QUESTIONS?


• cnkeller@gmail.com

• @cnkeller

• http://www.linkedin.com/in/christopherkeller

Cassandra Silicon Valley

  • 1. REAL WORLD CASSANDRA AT NASA Christopher Keller December 13th, 2012
  • 2. THANKS! I failed to copy this to iCloud after the DC presentation
  • 3. WHO AM I? • a CSC solutions architect working at the advanced supercomputing facility (NAS) at NASA Ames in silicon valley • consulted at various federal agencies during the tech boom of the 90’s • classified and unclassified • http://about.me/christopherkeller
  • 4. WHO I’M NOT • a cassandra expert • someone pushing a corporate agenda
  • 5. ENVIRONMENT • unix based enterprise (desktops, servers, supercomputers) • heavy writes around the clock from incoming data, but far fewer analytical reads • we retain the data in a raw format, but it does not need to be in a database (however we can easily load old data) • we need flexibility as technology and our requirements evolve over time
  • 6. THE PROBLEM • TL;DR- how to use all of our available data to make supercomputing more secure for our customers • replace a COTS security event management system • poor query performance • difficult to extend and integrate with our custom software • pre-defined analytics were a big plus, but more overall minuses for our environment
  • 7.
  • 8. WHY CASSANDRA • snapshotting for backups was lightning fast • no single point of failure • reads are fast, writes are faster • i did research other solutions (couchbase, hbase, mongo, riak, etc), but didn’t find anything compelling enough to trial
  • 9. WHY CASSANDRA • simple clustering = win • availability + scalability + replication • built in data expiration was key • enabling technology that allowed us to ask new questions
  • 10. IN THE BEGINNING... • set up a virtualized three node cluster on a spare server • wrote the cassandra equivalent of “hello world” to check • replication / availability • data expiration • rough performance estimates
  • 11. ARE YOU KIDDING ME? • selling cassandra to management was easier than i thought • the NAS is very receptive to new technology even though we prefer to be system integrators rather than developers • my testing showed that cassandra works...shocking!!! • open source resources are good, DataStax being able to provide support after i leave is better
  • 12. TAX DOLLARS AT WORK • bought five servers for around 22.5k • 3 of them for our production cluster • 1 for our data parsing and loading • 1 for our analytics • those were our only purchases, the rest has been primarily my labor hours
  • 13. [Charts: write operations (Operations vs. Elapsed Time) and latency (Milliseconds vs. Elapsed Time), each comparing 6 nodes, 9 nodes (v), and 9 nodes (p)] http://christophernkeller.tumblr.com/post/15242366864/cassandra-benchmarks
  • 14. TAKEAWAY • bare metal > virtualized w/ assigned disks > fully virtualized • match your hardware to your environment, expertise, and requirements
  • 15. CURRENT CLUSTER • gentoo running xen 4.1.2 & apache cassandra 1.1.3 • three virtual nodes per physical server • 7 cpus, 15gig RAM, 1.2 TB disk • eight disks per physical server • 2 running the hypervisor + OS in a RAID 1 • 2 disks per virtual machine in a RAID 0
  • 16. ELAPSED TIME • empty rack to benchmarks took about five days over the course of christmas/new years 2011 • very helpful to understand our hardware limits and how cassandra scaled • understanding how to model the data and effectively use cassandra took a lot longer...i’m still learning
  • 17. HELPFUL TIPS • always start with the questions you plan to ask the data • if you know these your job just got exponentially easier • if you never deviate from this, you’re lucky • once you realize how powerful cassandra is, you’ll figure out new questions that may change things • don’t use supercolumns
  • 18. MAINTENANCE • i haven’t done serious sys-admin work in years...had to develop tools from scratch • cluster start up and shutdown scripts • use good CM software (we use puppet) • OS, Cassandra & JVM upgrades • cassandra-env.sh & cassandra.yaml
  • 19. TRIAL AND ERROR • a lot of testing dealing with how to organize the data • secondary indexes • materialized views • i’d get failures and errors in cassandra that were solved by changing the schema to be more efficient (based on our questions) • try not to think relationally, it wasn’t helping me
  • 20. THIS WORKED...POORLY
      uid  name    age  gender
      1    chris   39   male
      2    jaeden  2    male

      uid  job        hobby
      1    architect  jiu-jitsu
      2    toddler    gaming

      uid  employer  phone       address
      1    csc       5555555555  123 Main St
      2    mom       4444444444  123 Main St
  • 21. THIS WORKED WELL 1234 1235 {“age”:”39”, “name”:”chris”,”gender”:”male”...} architect json blob {“age”:”2”, toddler json blob “name”:”jaeden”,”gender”:”male”...} 4567 7364 3453 4554 male json blob json blob chris json blob jaeden json blob
  • 22. WHY DID THAT WORK • we only have to query a single table • aren’t you glad you optimized the schemas for the questions ahead of time? • manual joins by reading successive column families resulted in timeout errors even though the cluster was idle and everything was on the same switch segment
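The single-table idea from slides 21 and 22 can be sketched in plain Python. This is only an in-memory illustration, not the schema actually used at NAS: the `people_by_question` dict stands in for one column family, and the `field:value` row-key prefix, `store_person`, and `lookup` are hypothetical names chosen for this example. The point is that each anticipated question gets its own row holding a full JSON blob, so answering it is one row read and no join.

```python
import json

# Hypothetical in-memory stand-in for one column family:
# row key -> {column name (uid) -> JSON blob}. Not a Cassandra client.
people_by_question = {}

# The questions we expect to ask, decided before designing the schema.
QUERY_FIELDS = ("job", "gender", "name")

def store_person(uid, person):
    """Write the same JSON blob under every row we expect to query by."""
    blob = json.dumps(person, sort_keys=True)
    for field in QUERY_FIELDS:
        # Prefixing with the field name is an assumption made here to keep
        # row keys from colliding (e.g. a job named "male").
        row_key = "%s:%s" % (field, person[field])
        people_by_question.setdefault(row_key, {})[uid] = blob

def lookup(row_key):
    """Answer one question with a single row read, no joins."""
    row = people_by_question.get(row_key, {})
    return {uid: json.loads(blob) for uid, blob in row.items()}

store_person("1", {"name": "chris", "age": "39", "gender": "male", "job": "architect"})
store_person("2", {"name": "jaeden", "age": "2", "gender": "male", "job": "toddler"})

print(sorted(lookup("gender:male")))   # uids stored under the "male" row
print(lookup("job:architect"))         # full record, no second read needed
```

The cost is visible in `store_person`: every write touches one row per question, which is the de-normalization overhead the deck mentions.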
  • 23. LESSONS LEARNED • if your data changes frequently, de-normalization is annoying, but can be solved with discipline • give yourself a lot of experimentation time if you’re new to cassandra • if you are hitting problems...likely you’re doing it wrong
  • 24. TECHNICAL TIPS • use ‘-pr’ to repair each node at least every gc_grace_seconds • script which staggers weekly repairs across each node • once you assign a token ID, you can remove it from cassandra.yaml and keep the same file across nodes • you are free to use the Thrift bindings for the language of your choice, but save yourself time and use a high level client (eg Java, Python, Scala, PHP, Erlang, etc)
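The staggered repair script from slide 24 is not shown in the deck, so the following is a guess at its shape rather than the author's tool. It spreads one `nodetool repair -pr` per node evenly across a weekly cycle and refuses a cycle that is not shorter than `gc_grace_seconds`. The host names and the functions `stagger_repairs` and `repair_command` are hypothetical; 864000 is Cassandra's default `gc_grace_seconds`, which may differ from the value set on your column families.

```python
from datetime import datetime, timedelta

GC_GRACE_SECONDS = 864000  # Cassandra default (10 days); check your own schema

def stagger_repairs(hosts, start, cycle=timedelta(days=7)):
    """Return (run_at, host) pairs spread evenly across one repair cycle."""
    if cycle.total_seconds() >= GC_GRACE_SECONDS:
        raise ValueError("repair cycle must be shorter than gc_grace_seconds")
    step = cycle / len(hosts)
    return [(start + i * step, host) for i, host in enumerate(hosts)]

def repair_command(host):
    """Build the primary-range repair command for one node."""
    return ["nodetool", "-h", host, "repair", "-pr"]

# Hypothetical host names for a nine node cluster.
hosts = ["cass%d" % n for n in range(1, 10)]
for run_at, host in stagger_repairs(hosts, datetime(2012, 12, 17)):
    print(run_at.isoformat(), " ".join(repair_command(host)))
```

The schedule could be fed to cron or a puppet-managed job; the sketch only prints the commands and never runs them.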
  • 25. HOW I SPENT MY TIME • i’d spend a few hours writing code to load data into cassandra, then another few hours writing code to retrieve it • the data browsers aren’t great and unhelpful with blobs • then i’d profile the performance, tweak the code, tweak the schema, reload the data and repeat until i was happy
  • 26. ANALYTICS • all server side analytics are developed in python using sub-processes for parallel performance • pycassa is our cassandra client library • our web layer is currently ruby on rails, but we might end up going with django to stay language consistent
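Slide 26 does not show how the sub-processes are organized, so this is one possible arrangement, not the NAS implementation. The parent splits a batch of events into slices, starts one Python child per slice, and sums the partial counts. The `WORKER` script, the `severity` field, and the function names are all invented for illustration; in a real job the child would read from cassandra (for example through pycassa) instead of from stdin.

```python
import json
import subprocess
import sys

# Hypothetical worker: counts events at or above severity 3.
# A real worker would query cassandra rather than read stdin.
WORKER = """
import json, sys
events = json.load(sys.stdin)
print(sum(1 for e in events if e["severity"] >= 3))
"""

def chunked(items, n):
    """Split items into n roughly equal slices."""
    return [items[i::n] for i in range(n)]

def count_high_severity(events, workers=4):
    """Fan the events out to child processes and sum their counts."""
    procs = []
    for chunk in chunked(events, workers):
        proc = subprocess.Popen(
            [sys.executable, "-c", WORKER],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
        )
        proc.stdin.write(json.dumps(chunk))
        proc.stdin.close()          # signals end of input to the child
        procs.append(proc)          # children now run concurrently
    total = 0
    for proc in procs:
        total += int(proc.stdout.read())
        proc.wait()
    return total

sample = [{"severity": s} for s in (1, 3, 5, 2, 4)]
print(count_high_severity(sample, workers=2))
```

Separate processes sidestep Python's global interpreter lock, which is a plausible reason to prefer them over threads for CPU-heavy analytics.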
  • 27. SHOW STOPPERS • dealing with an incredibly annoying JMX recurring crash but it doesn’t seem to affect cassandra stability • other cassandra sites haven’t seen this, so it may just be a consequence of java6 on gentoo • commitlog_total_space_in_mb was being ignored in 1.1.3 (FIXED)
  • 28. RECENT SHOW STOPPERS • 1.1.3 accidentally removed the ability to drop column families • pick your poison - full disks or data that never goes away • recent v6 JVM patches required setting per-thread stack sizes to 180k • nodes were up individually, zero log errors, gossip is up, but the nodes weren’t talking collectively • cassandra solves a need, but bugs like this make my customers wary
  • 30. SHOUT OUT • the folks at datastax have been very helpful • Tyler Hobbs (cassandra developer) • Darren Sack (accounts) • Michael Shaler (biz dev) • everyone in #cassandra on irc.freenode.org
  • 31. QUESTIONS? • cnkeller@gmail.com • @cnkeller • http://www.linkedin.com/in/christopherkeller