CASSANDRA DAY SILICON VALLEY 2014 – APRIL 7TH – MATT JURIK
SCALING VIDEO PROGRESS TRACKING
MATT JURIK
SOFTWARE DEVELOPER
WHAT IS HULU?
HULU’S MISSION
Help people find and enjoy the world’s premium content
when, where and how they want it.
•  Service Oriented Architecture
•  Follow the Unix Philosophy
•  Small services with specialized scopes
•  Small teams focusing on specific areas
•  Right tool for the job
•  Many languages, frameworks, formats
•  Cross team development encouraged
•  If something you depend on needs fixing, feel free to fix it
VIDEO PROGRESS TRACKING
CODENAME: HUGETOP
AGENDA
•  Old architecture
•  New architecture
•  Keyspace design
•  Migrating to Cassandra
•  Operations
OLD ARCHITECTURE (MYSQL)
HUGETOP (PYTHON)
OTHER SERVICES / DEVICES / HULU.COM
64 Redis Shards
(Persistence-enabled)
API (PYTHON)
8 MySQL Shards
NEW ARCHITECTURE (C*)
HUGETOP (PYTHON)
OTHER SERVICES / DEVICES / HULU.COM
64 Redis Shards
(Cache-only)
CRAPI (JAVA)
8 Cassandra Nodes
WHY SWITCH?
The dilemma
•  Unbounded data growth
•  MySQL very stable, but servers running out of space
•  “Manually resharding is fun!” – No one, ever
Why Cassandra?
•  Our data fits Cassandra’s data model well
•  Cassandra promises (and delivers) great scalability
•  Highly available
•  Multi-DC
INTERACTION BETWEEN REDIS + CASSANDRA
HUGETOP
64 Redis Shards
(Cache-only)
CRAPI
8 Cassandra Nodes
Video position updates
1.  Write position info to Cassandra
2.  Update Redis
Video position requests
Check Redis:
    If data is loaded in Redis, return it.
    Else:
        Fetch user’s history from Cassandra,
        Queue job to update Redis,
        Return data fetched from Cassandra.
Redis
•  Maintains complex indices
•  Enriches data by simulating joins with Lua
Cassandra
•  Provides durability
•  Replenishes Redis as necessary
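The update/request flow above is a cache-aside pattern. A minimal self-contained sketch in Python, with plain dicts standing in for the 64 Redis shards and the Cassandra cluster (all names here are hypothetical, not HUGETOP’s actual API):

```python
redis_cache = {}   # user_id -> list of (video_id, position) records
cassandra = {}     # durable store, same shape

def record_position(user_id, video_id, position):
    # Updates: write position info to Cassandra first (durability),
    # then update Redis (simplest form: invalidate the cached entry).
    cassandra.setdefault(user_id, []).append((video_id, position))
    redis_cache.pop(user_id, None)

def get_history(user_id):
    # Requests: check Redis; on a miss, fetch the user's history from
    # Cassandra, warm the cache (a queued job in the real service,
    # done inline here), and return the data fetched from Cassandra.
    cached = redis_cache.get(user_id)
    if cached is not None:
        return cached
    rows = cassandra.get(user_id, [])
    redis_cache[user_id] = rows
    return rows
```

Because every write lands in Cassandra before Redis is touched, the cache can be rebuilt from Cassandra at any time.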
HARDWARE CONSIDERATIONS
Take one
•  Hadoop-class machines
•  Physical boxes (i.e., no VMs)
•  6 standard 7200 RPM drives
•  32 GB RAM
•  Leveled compaction + JBOD
•  Write throughput ☺
•  Read latency ☹
Take two
•  SSD-based machines
•  Physical boxes (C-states disabled)
•  550 GB RAID 5
•  48 GB RAM
•  Leveled compaction
•  Write throughput ☺
•  Read latency ☺
•  16 nodes split between 2 DCs
KEYSPACE DESIGN
•  Query last position for user=X, video=Y
•  Query last position for user=X, video=*
•  Daily log of all views needed by other services
•  Two tables: one for updates; one for deletes
•  Shard data across rows
•  TTL’d
Copy 1
CREATE TABLE views (
    u int,         -- User ID
    v int,         -- Video ID
    c boolean,     -- Is completed?
    p float,       -- Video position
    t timestamp,   -- Last viewed at
    ...,           -- Other fields
    PRIMARY KEY (u, v)
);
Copy 2
CREATE TABLE daily_user_views (
    s int,         -- Partition key
    u int,         -- User ID
    v int,         -- Video ID
    ...,           -- Other fields
    PRIMARY KEY (s, u, v)
);
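The two point-query access patterns map directly onto the views table; a sketch in CQL (the IDs are made up):

```sql
-- Last position for user=X, video=Y
SELECT p, t, c FROM views WHERE u = 12345 AND v = 67890;

-- Last positions for user=X, video=* (the whole partition)
SELECT v, p, t, c FROM views WHERE u = 12345;
```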
SHARDING!?
•  Single row containing one day’s worth of data = too BIG + causes hotspots
•  Fetching a single row cannot be parallelized, so it is slow
•  Solution: shard each day across 128 rows
   => Spreads data across multiple nodes
   => Query multiple nodes in parallel
Partition key
userID % 128 + daysBetween(EPOCH, viewDate) * 128
April 7th, 2014 (daysBetween(EPOCH, “April 7th, 2014”) = 16167):
for (int i = 0; i < 128; i++) {
    int k = i % 128 + 16167 * 128;
    execute("SELECT * FROM daily_user_views WHERE s = " + k);
}
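The loop above scans the 128 shards one at a time; since the shard keys land on different nodes, the queries can also run concurrently, as the slide suggests. A Python sketch with a thread pool, where execute() is a stand-in for the real CQL call:

```python
from concurrent.futures import ThreadPoolExecutor

SHARDS = 128

def shard_keys(day_number):
    # Same formula as the partition key: shard + day * 128
    return [i + day_number * SHARDS for i in range(SHARDS)]

def execute(key):
    # Stand-in for: SELECT * FROM daily_user_views WHERE s = <key>
    return []

def fetch_day(day_number):
    # Issue all 128 shard queries concurrently and flatten the results.
    with ThreadPoolExecutor(max_workers=16) as pool:
        results = list(pool.map(execute, shard_keys(day_number)))
    return [row for rows in results for row in rows]
```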
MIGRATING FROM MYSQL → CASSANDRA
HUGETOP
MySQL          Cassandra
1  Read/write to MySQL
2  Duplicate writes+deletes to Cassandra
   - column timestamps = last_played_at date  ← Critical for next step
   - apply deletions, but also temporarily store them in deletion_ledger
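The column-timestamp trick in step 2 can be expressed directly in CQL with USING TIMESTAMP; a hedged sketch (values made up, timestamp in microseconds since epoch):

```sql
-- Dual write: stamp the write with the row's last_played_at,
-- not "now", so the later backfill cannot clobber newer positions.
INSERT INTO views (u, v, c, p, t)
VALUES (12345, 67890, false, 120.5, '2014-04-07 00:00:00')
USING TIMESTAMP 1396828800000000;
```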
MIGRATING FROM MYSQL → CASSANDRA
HUGETOP
MySQL          Cassandra
3  Backfill old data
   Again, write to Cassandra with column timestamp = last_viewed_at date (prevents old position from overwriting new position)
4  Replay deletions stored in deletion_ledger
   Just like inserts, you can specify a timestamp for deletions.
   column timestamp = time at which original deletion occurred (prevents deleting new data)
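Step 4’s replayed deletions use the same mechanism, since DELETE also accepts USING TIMESTAMP (values made up):

```sql
-- Replay from deletion_ledger: stamp the delete with the time the
-- original deletion happened, so data written since then survives.
DELETE FROM views
USING TIMESTAMP 1396828800000000
WHERE u = 12345 AND v = 67890;
```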
OPERATIONS
•  Use internal tool for automating repairs, backups, etc.
•  Metrics
   •  Dump metrics to Graphite via a custom -javaagent which hooks into Yammer Metrics
   •  Implement a MetricPredicate to filter boring metrics
•  High-level monitoring (something is usually wrong if):
   •  d(hint count)/dt > 0
   •  Large number of old-gen collections
   •  Lots of SSTables in L0 (and not importing data, bootstrapping, etc.)
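The first monitoring rule above, d(hint count)/dt > 0, amounts to checking the slope between two samples of the hinted-handoff counter. A sketch (the sampling source is hypothetical; in this setup the real numbers would come out of the Yammer Metrics -javaagent):

```python
def hints_increasing(samples):
    """samples: list of (unix_seconds, total_hints) pairs, oldest first."""
    (t0, h0), (t1, h1) = samples[-2], samples[-1]
    # A positive slope means the cluster is actively storing hints,
    # i.e. some node is down or not keeping up with writes.
    return (h1 - h0) / (t1 - t0) > 0
```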
OPERATIONS
•  SSTable corruption
   •  nodetool scrub
   •  sstablescrub – if things are really bad
•  Things to watch:
   •  Snapshots are awesome, but can quickly burn disk space
   •  Keep nodes under 50% disk utilization, even if using Leveled Compaction
THANK YOU
QUESTIONS?
