This document discusses scaling Cassandra for big data applications. It describes how Ooyala uses Cassandra for fast access to data generated by MapReduce, high availability key-value storage from Storm, and playhead tracking for cross-device resume. It outlines Ooyala's experience migrating to newer Cassandra versions as data doubled yearly, including removing expired tombstones, schema changes, and Linux performance tuning.
3. What does Ooyala use it for?
Fast access to data generated by Map/Reduce
High availability key/value out of Storm
Cross-device resume (playhead tracking)
ML predictions
Time-series data, raw events & application metrics
4. The Beginning
Our data is doubling every year
Cluster size: 18 nodes
Biggest CF: 2TB
Repairs becoming a problem
Expired tombstones
5. First Migration
Upgrade from C* 0.6 to 0.8
Remove expired tombstones
Scrub data and rebuild indexes
Lots of Linux performance tuning
Map/Reduce
6. Second Migration
Upgrade to Cassandra 1.0
Remove expired tombstones
Update schema
More Linux performance tuning
Map/Reduce - this time using DSE Hadoop
7. Tuning Highlights
Bloom filter false-positive chance (schema)
Index density (schema)
LeveledCompaction ssTable size (schema)
XFS filesystem bugs (Linux)
Stick with ext4 if you like to sleep.
NO SWAP!
8. More Information
cassandra-users mailing list
irc.freenode.net #cassandra / #cassandra-ops
http://www.datastax.com/docs/1.1/index
@AlTobey / al@ooyala.com
Contact me about open positions at Ooyala.
9. Rejected Slides follow:
An old version of this deck was a lot more technical. I've added those slides back for online posting since people have asked about the specifics.
10. Linux: General Observations
● use a modern kernel, 2.6.32 is ancient
○ Running 3.4.11 on new production hardware
○ default Ubuntu Lucid / Oneiric kernels in EC2
● I have yet to use XFS bug-free
○ 2.6.38 has an especially fun bug
○ allocsize=64m allocates 64m always & forever
○ echo 1 > /proc/sys/vm/drop_caches
● Put commit log on a different filesystem
● btrfs works fine in production
● Block alignment is hard
○ use GPT disk labels and it's generally not an issue
○ or just skip disk labels and RAID whole disks
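To make the block-alignment point concrete, here is a minimal sketch of an alignment check. The 1 MiB boundary and the 512-byte sector size are assumptions chosen for illustration (they are common defaults, not values from the slides), and `is_aligned` is a hypothetical helper name.

```python
# Sketch: check whether a partition's starting offset is aligned.
# Assumptions (not from the slides): 512-byte sectors, 1 MiB boundary.

MIB = 1024 * 1024


def is_aligned(start_sector: int, sector_size: int = 512, boundary: int = MIB) -> bool:
    """Return True if the partition's byte offset is a multiple of `boundary`."""
    return (start_sector * sector_size) % boundary == 0


if __name__ == "__main__":
    # Sector 63 was the traditional MBR starting sector; 2048 is a 1 MiB offset.
    for sector in (63, 2048):
        print(sector, is_aligned(sector))
```

A whole-disk RAID member starts at offset 0, which is why skipping disk labels sidesteps the problem.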
11. Linux: almost a server OS
/etc/security/limits.conf
* - memlock unlimited
* - nofile 1048576
* - fsize unlimited
* - nproc 999999
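One way to confirm that these limits actually reached a process is to read them back with Python's standard `resource` module. This is only a sketch: `current_limits` is a hypothetical helper, and it reports the limits of the process that runs it, so it would need to run as the same user and from the same kind of session that starts Cassandra.

```python
import resource

# Map the limits.conf item names from the slide to resource-module constants.
LIMITS = {
    "memlock": resource.RLIMIT_MEMLOCK,
    "nofile": resource.RLIMIT_NOFILE,
    "fsize": resource.RLIMIT_FSIZE,
    "nproc": resource.RLIMIT_NPROC,
}


def fmt(value: int) -> str:
    """Render a limit value the way limits.conf spells it."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits() -> dict:
    """Return {name: (soft, hard)} for the current process."""
    return {
        name: tuple(fmt(v) for v in resource.getrlimit(res))
        for name, res in LIMITS.items()
    }


if __name__ == "__main__":
    for name, (soft, hard) in current_limits().items():
        print(f"{name:8} soft={soft} hard={hard}")
```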
13. Ubuntu: FFFFFFFUUUUUUUUUUU
/etc/fstab
/dev/md4 /commit ext4 nobootwait,barrier=1,journal_ioprio=0,rw 0 0
/dev/md7 /srv xfs nobootwait,rw 0 0
● force barriers for journal
● noatime & relatime aren't necessary anymore
○ since ~ 2.6.31
● nobootwait is an upstart option
○ set this or upstart will troll you at 4am
○ mountall hangs on boot for any error without this
○ use on both hardware and EC2 unless you love using OOB consoles
● As noted, XFS is buggy, so consider ext4.
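A small lint along these lines can catch an entry that is missing `nobootwait` before a reboot does. This is a sketch under assumptions: `parse_fstab_line` and `missing_options` are hypothetical helpers, and the parser expects all six fstab fields to be present.

```python
from typing import List, NamedTuple


class FstabEntry(NamedTuple):
    device: str
    mountpoint: str
    fstype: str
    options: List[str]
    dump: int
    passno: int


def parse_fstab_line(line: str) -> FstabEntry:
    """Split one fstab line into its six whitespace-separated fields."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    device, mountpoint, fstype, opts, dump, passno = fields
    return FstabEntry(device, mountpoint, fstype, opts.split(","), int(dump), int(passno))


def missing_options(entry: FstabEntry, required=("nobootwait",)) -> List[str]:
    """Return the required mount options that the entry does not set."""
    return [opt for opt in required if opt not in entry.options]


if __name__ == "__main__":
    line = "/dev/md4 /commit ext4 nobootwait,barrier=1,journal_ioprio=0,rw 0 0"
    entry = parse_fstab_line(line)
    print(entry.mountpoint, missing_options(entry))
```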
14. Linux: Final Adjustments
/etc/rc.local (or whatever you prefer)
● CFQ disk scheduler
○ deadline is still faster, but no cgroup support
○ noop is a popular choice in EC2, SSD, and HW RAID
● Tune readahead
○ don't go crazy, 64k is a decent choice
○ big RA will inflate your bandwidth numbers, but really large values will waste IO on unused data
● If running MD RAID5/6
○ echo 16384 > /sys/block/$md/md/stripe_cache_size
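Two details in these adjustments are easy to get wrong when scripting them, so here is a hedged sketch of both. The kernel marks the active scheduler with square brackets in `/sys/block/<dev>/queue/scheduler`, and `blockdev --setra` counts 512-byte sectors rather than kilobytes. The helper names are hypothetical.

```python
import re


def active_scheduler(text: str) -> str:
    """Extract the bracketed (active) scheduler from the sysfs scheduler file's contents."""
    match = re.search(r"\[([^\]]+)\]", text)
    if match is None:
        raise ValueError(f"no active scheduler marked in {text!r}")
    return match.group(1)


def readahead_kb_to_sectors(kb: int) -> int:
    """Convert a readahead size in KB to the 512-byte sectors `blockdev --setra` expects."""
    return kb * 1024 // 512


if __name__ == "__main__":
    print(active_scheduler("noop deadline [cfq]\n"))
    print(readahead_kb_to_sectors(64))
```

The 64k readahead suggested above therefore corresponds to a `--setra` value of 128.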
15. JVM: ALL THE MEMORY
● Use Oracle JVM 1.6 for Cassandra
○ OpenJDK works, still not recommended
○ Use fpm to create packages if you don't have them
● Default Cassandra GC settings are OK
○ -XX:+UseNUMA
■ works fine in production
■ Apache scripts will use numactl if installed
● DSE does not! (yet)
○ Bigger data will need bigger heaps.
■ 12G seems to work OK
■ 24G works, but approaching limits of JVM
■ too little free memory causes excessive memtable flushing (more on this later)
16. Cassandra.(?:ya|f)ml
● index_interval: 512
○ save some memory on indexes
● compaction_throughput_mb_per_sec: 0
○ this can hurt your read latency, but in my experience leveled compaction falls behind under very high insert loads without this; use a bigger heap to compensate?
● rpc_server_type: hsha
○ saves memory if you have lots & lots of connections, e.g. from Hadoop
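A rough way to see why a larger `index_interval` saves memory: Cassandra keeps roughly one in-memory sample per `index_interval` row keys. The sketch below only counts samples and makes no claim about bytes per sample. The default of 128 is my understanding of the Cassandra 1.x default rather than a figure from the slides, and the one-billion-key count is made up for illustration.

```python
DEFAULT_INDEX_INTERVAL = 128  # assumed Cassandra 1.x default, not stated on the slide


def index_samples(row_keys: int, index_interval: int) -> int:
    """Approximate number of in-memory index samples (ceiling division)."""
    return -(-row_keys // index_interval)


if __name__ == "__main__":
    keys = 1_000_000_000  # hypothetical key count per node
    for interval in (DEFAULT_INDEX_INTERVAL, 512):
        print(interval, index_samples(keys, interval))
```

The trade-off is that each lookup has to scan more of the on-disk index between samples, so reads get somewhat more expensive as the interval grows.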
17. Cassandra: Schema Tuning
● Enable compression
○ compression_options = {'sstable_compression': 'org.apache.cassandra.io.compress.SnappyCompressor'};
● Examine bloom filter false-positives
○ nodetool -h localhost cfstats |grep Bloom
○ bloom_filter_fp_chance = 0.1 # diminishing returns
● Reduce ssTable count
○ memory pressure caused frequent memtable flushes
○ compaction throttling made it worse
○ compaction_strategy_options = {'sstable_size_in_mb': 256}
● Give yourself time to repair
○ gc_grace = 5184000 # 60 days
○ shoot for (node_count * 86400 * 3) to be safe
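The arithmetic behind two of these settings can be sketched as follows. `gc_grace_seconds` applies the rule of thumb above directly. `bloom_bits_per_key` uses the textbook formula for an optimally sized Bloom filter, which is an assumption here: Cassandra's own sizing may differ, so treat the numbers as an illustration of the diminishing returns, not as what Cassandra allocates.

```python
import math

SECONDS_PER_DAY = 86400


def gc_grace_seconds(node_count: int, days_per_node: int = 3) -> int:
    """Rule of thumb from the slide: node_count * 86400 * 3."""
    return node_count * SECONDS_PER_DAY * days_per_node


def bloom_bits_per_key(fp_chance: float) -> float:
    """Textbook bits per key for an optimal Bloom filter: -ln(p) / (ln 2)^2."""
    return -math.log(fp_chance) / (math.log(2) ** 2)


if __name__ == "__main__":
    for nodes in (18, 20):
        secs = gc_grace_seconds(nodes)
        print(f"{nodes} nodes: gc_grace = {secs} ({secs // SECONDS_PER_DAY} days)")
    for p in (0.1, 0.01, 0.001):
        print(f"fp_chance {p}: {bloom_bits_per_key(p):.2f} bits/key")
```

With this rule a 20-node cluster lands exactly on the 5184000-second (60-day) value shown above. Under the textbook model, each tenfold reduction in false-positive chance costs about 4.8 more bits per key.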
18. Future
● Upgrade all clusters to DSE 2.2
● Chef cookbook (likely open)
● Mixing CQL3 and Thrift API access
○ all lower case CF names
○ WITH COMPACT STORAGE
● Cassandra 1.2
○ native protocol
○ JBOD support
○ vnodes
○ compound row key support in CQL3
19. MOAR
● Freenode IRC is a great resource
○ #cassandra, #cassandra-ops
● cassandra-users mailing list
● DataStax Enterprise
○ The Hadoop integration works and is useful
○ Still playing with Solr
○ OpsCenter is really nice
● Me:
○ @AlTobey on Twitter
○ tobert on irc.freenode.net
○ https://gist.github.com/tobert
20. More Information (again)
cassandra-users mailing list
irc.freenode.net #cassandra / #cassandra-ops
http://www.datastax.com/docs/1.1/index
@AlTobey / al@ooyala.com
Contact me about open positions at Ooyala.