Cassandra Day NY 2014: Utilizing Apache Cassandra at UltraVisual
CASSANDRA @ ULTRAVISUAL
Cassandra Day New York 2014
Skye Book
Lead Systems Architect
ULTRAVISUAL
A visual network for
inspiration, expression,
and collaboration
The Feed
• A user’s first taste of UV
• More than just posts
• Constantly being
tweaked and re-thought
The Old Way
Started Simple
“Show me recent posts in collections I follow”
SELECT DISTINCT _post.*
FROM _post
JOIN _collection_post cp ON _post.uuid = cp.post_uuid
JOIN _collection_follow cf ON cp.c_uuid = cf.collection_uuid
WHERE cf.user_id = ?
ORDER BY _post.created_at DESC
LIMIT 20 OFFSET 0
The Old Way
Added Complexity
“Show me people recently followed by my connections”
SELECT a.*
FROM _user_follow a, _user_follow b
WHERE b.follower = 12345
  AND a.follower = b.followed
ORDER BY a.followed_at DESC
LIMIT 20 OFFSET 0
The Old Way
Every new feature needs another query
Feed requests generate a disproportionate amount of load compared to normal CRUD ops
Reframing the Problem
From This:
A place for posts, new
collections, social activity, and
anything else interesting
nitro404.com/computers/knex.php
Reframing the Problem
To This:
A list of items interesting to
the user
The New Way
Model First
• With an SQL background, this can be
misleading.
• Essential Question: “How do I need to access
this data?”
The New Way
“Try to model data as a log of user intent”
–Rick Branson, Instagram, Cassandra Summit 2013
The New Way
Model for user feeds

user | status | created_at | story                        | json
2    | 0      | 61b97280   | user_follow:3:5              | {"foo":"bar"}
2    | 1      | 5daa04c0   | post:bfbd0a39                | {"foo":"bar"}
2    | 1      | 565752e0   | collection_follow:5:d70961c1 | {"foo":"bar"}
2    | 1      | 4a8189e0   | user_follow:3:5              | {"foo":"bar"}

Column groups, left to right: Primary Key; Cached story JSON
• Fast to fetch user stories
• Cached JSON means almost zero SQL requests
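The feed model above can be sketched in plain Python as a rough illustration. This is an in-memory stand-in, not the UltraVisual schema: the class name `FeedStore`, its methods, and the use of integer timestamps for `created_at` are all assumptions made for the example. It shows the idea of serializing the story once at write time so that a feed read returns ready-made JSON and needs no SQL lookups. Negated entries are not resolved here; the sketch simply returns rows with status 1.

```python
import json
from collections import defaultdict


class FeedStore:
    """Hypothetical in-memory stand-in for a feed table partitioned by user."""

    def __init__(self):
        # user id -> list of (status, created_at, story, cached_json)
        self._rows = defaultdict(list)

    def append(self, user, status, created_at, story, payload):
        # Serialize the payload once, at write time, and store it with the row.
        self._rows[user].append((status, created_at, story, json.dumps(payload)))

    def first_page(self, user, limit=20):
        # Take the rows with status 1 and return the newest ones first.
        live = [row for row in self._rows[user] if row[0] == 1]
        live.sort(key=lambda row: row[1], reverse=True)
        # The cached JSON is returned as-is; no other data source is consulted.
        return [json.loads(row[3]) for row in live[:limit]]
```

A caller would append one row per story and read a page with `first_page(user)`; in a real deployment the partition would live in Cassandra rather than a Python dictionary.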
Fast.
Response times cut from hundreds of milliseconds to the 30 ms range
Launch Week
Featured by Apple!
Cluster Disk Usage (chart): 26% / 74%
Don’t be too cute
cqlsh:ultravisual> ALTER TABLE latest_feed DROP json;
Handling Deletions
• Data is only appended, never deleted from user feeds
• Adapted Instagram’s ‘Anti-Column’ solution
• Avoids missed deletions for nodes down longer than GCGraceSeconds
• Avoids race condition where deletion arrives before write

Sam follows Sandy
user | created_at | status | story
2    | 4a8189e0   | 1      | user_follow:3:5

Sam unfollows Sandy
user | created_at | status | story
2    | 61b97280   | 0      | user_follow:3:5
2    | 4a8189e0   | 1      | user_follow:3:5
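
As a rough sketch of how negated entries might be resolved at read time, the Python below treats a status 0 row as cancelling any status 1 row for the same story that is not newer than it. That cancellation rule, the function name `visible_stories`, and the use of the `created_at` prefixes from the slide as sortable integers are assumptions for illustration; the talk does not spell out the exact rule. Because the result depends only on the set of rows, it is the same whether the negation or the original write arrives first.

```python
def visible_stories(rows):
    """Return live stories, newest first.

    rows: list of (created_at, status, story); status 1 = live, 0 = negation.
    """
    # Remember the newest negation seen for each story.
    latest_negation = {}
    for created_at, status, story in rows:
        if status == 0:
            previous = latest_negation.get(story, created_at)
            latest_negation[story] = max(created_at, previous)

    # Keep a live entry only if no negation for its story is at least as new.
    live = [
        (created_at, story)
        for created_at, status, story in rows
        if status == 1 and created_at > latest_negation.get(story, float("-inf"))
    ]
    live.sort(reverse=True)
    return [story for _, story in live]
```

With the two rows from the "Sam unfollows Sandy" table, the follow entry is hidden; adding an unrelated post row leaves that post visible.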
Negated Entries

Layout 1 (created_at before status):
user | created_at | status | story
2    | 61b97280   | 0      | user_follow:3:5
2    | 4a8189e0   | 1      | user_follow:3:5
• Keeps all entries in a single time series
• First page can usually be populated by a single read

Layout 2 (status before created_at):
user | status | created_at | story
2    | 0      | 61b97280   | user_follow:3:5
2    | 1      | 4a8189e0   | user_follow:3:5
• Splits user’s row into two lists, live and undo
• Will always require at least two reads
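
The difference between the two layouts can be illustrated with a small Python sketch. Both functions are hypothetical and work on in-memory lists that are already sorted newest first, with integer `created_at` values; they stand in for reads against Cassandra. The first scans one combined series and resolves negations as it goes, which is why one read usually fills a page. The second has to fetch the undo list before it can filter the live list, so it needs two reads.

```python
def page_single_series(rows, limit):
    """One combined series, newest first: (created_at, status, story)."""
    negated = set()
    page = []
    for _created_at, status, story in rows:  # a single sequential read
        if status == 0:
            # A negation is newer than the entry it cancels, so it is seen first.
            negated.add(story)
        elif story not in negated:
            page.append(story)
            if len(page) == limit:
                break
    return page


def page_split_lists(live, undo, limit):
    """Two lists, each newest first: (created_at, story)."""
    newest_undo = {}
    for created_at, story in undo:  # read 1: the undo list
        newest_undo.setdefault(story, created_at)
    page = []
    for created_at, story in live:  # read 2: the live list
        if created_at > newest_undo.get(story, float("-inf")):
            page.append(story)
            if len(page) == limit:
                break
    return page
```

The single-series scan can still come up short when many entries on the first page are negated, in which case a further read is needed.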
Further Uses
• User Notifications
• User Onboarding
• Reshare Statistics
• User & Content Reports
• API Statistics
User Onboarding
Sequenced feed entries for users on signup

user | created_at | sequence     | step | content
2    | 61b97280   | onboaring_v2 | 1    | rec_collections_1
3    | 5daa04c0   | onboaring_v2 | 2    | rec_collections_2
5    | 565752e0   | onboaring_v3 | 1    | find_friends
6    | 4a8189e0   | onboaring_v3 | 1    | find_friends
Production Experiences
Drivers
• Java: Started with Astyanax, moved to DataStax v2
• Node.js: node-cassandra-cql
Cryptic message with large batch updates in pre-release versions of the 2.0 driver (DS Driver Issue 229):
com.datastax.driver.core.exceptions.DriverInternalError: An unexpected protocol error occured. This is a bug in this library, please report: Unknown code 256 for a consistency level
As of 2.0, batches with more than 64k statements throw a better exception:
java.lang.IllegalStateException: Batch statement cannot contain more than 65536 statements.
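
One way to stay clear of the batch limit is to split a large list of statements into smaller chunks before sending them. The Python helper below is only a sketch of that idea: `chunked` and its `max_batch_size` parameter are hypothetical names, the statements are treated as opaque values, and actually submitting each chunk with a driver is left out.

```python
def chunked(statements, max_batch_size):
    """Yield consecutive slices of statements, each at most max_batch_size long."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    for start in range(0, len(statements), max_batch_size):
        yield statements[start:start + max_batch_size]
```

Each yielded slice would then be sent as its own batch, with a size chosen well below the driver's limit.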
Compression
Just use LZ4
CASSANDRA-4851
Unfortunate truth in Cassandra 2.0.5
cqlsh:test> SELECT *
            FROM user_feed
            WHERE user = 2
              AND created_at > :some_uuid
              AND status = 0;
Bad Request: PRIMARY KEY part status cannot be restricted (preceding part created_at is either not restricted or by a non-EQ relation)
CASSANDRA-4851
Adds CQL3 support for vector comparison syntax
cqlsh:test> SELECT *
            FROM timeline
            WHERE day = '21 Jun 2014'
              AND (hour, min) >= (3, 50)
              AND (hour, min, sec) <= (4, 37, 30);
Available in 2.0.6
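
The vector comparison in the query above compares the columns as one ordered tuple rather than one column at a time. Python tuples compare the same way (lexicographically), so a short sketch can show the effect. The sample rows and the function name `in_range` are made up for illustration; only the bounds come from the query.

```python
# Hypothetical (hour, min, sec) rows for a single day.
rows = [(3, 49, 59), (3, 50, 0), (4, 0, 0), (4, 37, 30), (4, 37, 31)]


def in_range(hour, minute, sec):
    # Mirrors: (hour, min) >= (3, 50) AND (hour, min, sec) <= (4, 37, 30)
    return (hour, minute) >= (3, 50) and (hour, minute, sec) <= (4, 37, 30)


selected = [row for row in rows if in_range(*row)]

# Comparing each column separately gives a different, wrong lower bound:
# 04:00:00 is after 03:50 but fails the test "min >= 50".
naive_lower = [row for row in rows if row[0] >= 3 and row[1] >= 50]
```

Here `selected` keeps the rows from 03:50:00 through 04:37:30, while `naive_lower` drops 04:00:00, which is why the tuple form is needed to express a time slice.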
Production Experiences
Upgrades
• Manual package installs (dsc20 from DataStax)
• One node at a time
• Upgrade, wait for healthy status &
operations, move on
• OpsCenter provides good overview
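
The upgrade routine described above can be sketched as a simple loop. Everything here is hypothetical scaffolding: `rolling_upgrade`, the `upgrade` and `is_healthy` callables, and the polling and timeout values are placeholders for whatever package install and health check a team actually uses. The sketch only captures the order of operations: one node at a time, and no moving on until the current node is healthy.

```python
import time


def rolling_upgrade(nodes, upgrade, is_healthy,
                    poll_seconds=5.0, timeout_seconds=600.0, sleep=time.sleep):
    """Upgrade nodes one at a time, waiting for each to be healthy before the next."""
    done = []
    for node in nodes:
        upgrade(node)  # e.g. install the new package and restart the service
        waited = 0.0
        while not is_healthy(node):
            if waited >= timeout_seconds:
                # Stop the rollout rather than touching further nodes.
                raise TimeoutError(f"{node} did not become healthy; upgraded so far: {done}")
            sleep(poll_seconds)
            waited += poll_seconds
        done.append(node)
    return done
```

Stopping on the first unhealthy node keeps the rest of the cluster on the old version, which leaves room to investigate before continuing.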
Production Experiences
Speaking of OpsCenter…
• Don’t be alarmed if nodes appear but agent
data does not
• opscenterd often needs a restart after cluster
upgrade to see agents again
Production Experiences
Service Discovery
• Running on AWS using Ec2MultiRegionSnitch
• Using OpsWorks (Amazon’s Chef service) for seed config
Chef Cookbook
github.com/skyebook/cassandra-opsworks-chef-cookbook
• Forked from Michael Klishin’s awesome C* cookbook
• Added integration with OpsWorks’ stack.json
# Start with an empty seed list (not shown on the slide)
seed_array = []

# Add this node as the first seed
# If using the multi-region snitch, we must use the public IP address
if node["cassandra"]["snitch"] == "Ec2MultiRegionSnitch"
  seed_array << node["opsworks"]["instance"]["ip"]
else
  seed_array << node["opsworks"]["instance"]["private_ip"]
end

# Add every instance in the OpsWorks "cassandra" layer as a seed
node["opsworks"]["layers"]["cassandra"]["instances"].each do |instance_name, values|
  if node["cassandra"]["snitch"] == "Ec2MultiRegionSnitch"
    seed_array << values["ip"]
  else
    seed_array << values["private_ip"]
  end
end

set[:cassandra][:seeds] = seed_array
Questions

More Related Content

Similar to Cassandra Day NY 2014: Utilizing Apache Cassandra at UltraVisual

Strategic Autovacuum
Strategic AutovacuumStrategic Autovacuum
Strategic Autovacuum
Scott Mead
 
From Postgres to Cassandra (Rimas Silkaitis, Heroku) | C* Summit 2016
From Postgres to Cassandra (Rimas Silkaitis, Heroku) | C* Summit 2016From Postgres to Cassandra (Rimas Silkaitis, Heroku) | C* Summit 2016
From Postgres to Cassandra (Rimas Silkaitis, Heroku) | C* Summit 2016
DataStax
 
Cassandra - A decentralized storage system
Cassandra - A decentralized storage systemCassandra - A decentralized storage system
Cassandra - A decentralized storage system
Arunit Gupta
 
What We Learned About Cassandra While Building go90 (Christopher Webster & Th...
What We Learned About Cassandra While Building go90 (Christopher Webster & Th...What We Learned About Cassandra While Building go90 (Christopher Webster & Th...
What We Learned About Cassandra While Building go90 (Christopher Webster & Th...
DataStax
 
My sql 5.7-upcoming-changes-v2
My sql 5.7-upcoming-changes-v2My sql 5.7-upcoming-changes-v2
My sql 5.7-upcoming-changes-v2
Morgan Tocker
 
Intro to Databases
Intro to DatabasesIntro to Databases
Intro to Databases
Sargun Dhillon
 
Why MySQL Replication Fails, and How to Get it Back
Why MySQL Replication Fails, and How to Get it BackWhy MySQL Replication Fails, and How to Get it Back
Why MySQL Replication Fails, and How to Get it Back
Sveta Smirnova
 
An introduction to_rac_system_test_planning_methods
An introduction to_rac_system_test_planning_methodsAn introduction to_rac_system_test_planning_methods
An introduction to_rac_system_test_planning_methods
Ajith Narayanan
 
Boot Strapping in Cassandra
Boot Strapping  in CassandraBoot Strapping  in Cassandra
Boot Strapping in Cassandra
Arunit Gupta
 
Optimizing Slow Queries with Indexes and Creativity
Optimizing Slow Queries with Indexes and CreativityOptimizing Slow Queries with Indexes and Creativity
Optimizing Slow Queries with Indexes and Creativity
MongoDB
 
Mysql 57-upcoming-changes
Mysql 57-upcoming-changesMysql 57-upcoming-changes
Mysql 57-upcoming-changes
Morgan Tocker
 
Strategic autovacuum
Strategic autovacuumStrategic autovacuum
Strategic autovacuum
Jim Mlodgenski
 
Training Slides: 202 - Monitoring & Troubleshooting
Training Slides: 202 - Monitoring & TroubleshootingTraining Slides: 202 - Monitoring & Troubleshooting
Training Slides: 202 - Monitoring & Troubleshooting
Continuent
 
Cassandra 3.0
Cassandra 3.0Cassandra 3.0
Cassandra 3.0
Robert Stupp
 
Slide presentation pycassa_upload
Slide presentation pycassa_uploadSlide presentation pycassa_upload
Slide presentation pycassa_upload
Rajini Ramesh
 
Devops kc
Devops kcDevops kc
Devops kc
Philip Thompson
 
2019 Blackhat Booth Presentation - PowerUpSQL
2019 Blackhat Booth Presentation - PowerUpSQL2019 Blackhat Booth Presentation - PowerUpSQL
2019 Blackhat Booth Presentation - PowerUpSQL
Scott Sutherland
 
PowerUpSQL - 2018 Blackhat USA Arsenal Presentation
PowerUpSQL - 2018 Blackhat USA Arsenal PresentationPowerUpSQL - 2018 Blackhat USA Arsenal Presentation
PowerUpSQL - 2018 Blackhat USA Arsenal Presentation
Scott Sutherland
 
Macy's: Changing Engines in Mid-Flight
Macy's: Changing Engines in Mid-FlightMacy's: Changing Engines in Mid-Flight
Macy's: Changing Engines in Mid-Flight
DataStax Academy
 
Ben Coverston - The Apache Cassandra Project
Ben Coverston - The Apache Cassandra ProjectBen Coverston - The Apache Cassandra Project
Ben Coverston - The Apache Cassandra Project
Morningstar Tech Talks
 

Similar to Cassandra Day NY 2014: Utilizing Apache Cassandra at UltraVisual (20)

Strategic Autovacuum
Strategic AutovacuumStrategic Autovacuum
Strategic Autovacuum
 
From Postgres to Cassandra (Rimas Silkaitis, Heroku) | C* Summit 2016
From Postgres to Cassandra (Rimas Silkaitis, Heroku) | C* Summit 2016From Postgres to Cassandra (Rimas Silkaitis, Heroku) | C* Summit 2016
From Postgres to Cassandra (Rimas Silkaitis, Heroku) | C* Summit 2016
 
Cassandra - A decentralized storage system
Cassandra - A decentralized storage systemCassandra - A decentralized storage system
Cassandra - A decentralized storage system
 
What We Learned About Cassandra While Building go90 (Christopher Webster & Th...
What We Learned About Cassandra While Building go90 (Christopher Webster & Th...What We Learned About Cassandra While Building go90 (Christopher Webster & Th...
What We Learned About Cassandra While Building go90 (Christopher Webster & Th...
 
My sql 5.7-upcoming-changes-v2
My sql 5.7-upcoming-changes-v2My sql 5.7-upcoming-changes-v2
My sql 5.7-upcoming-changes-v2
 
Intro to Databases
Intro to DatabasesIntro to Databases
Intro to Databases
 
Why MySQL Replication Fails, and How to Get it Back
Why MySQL Replication Fails, and How to Get it BackWhy MySQL Replication Fails, and How to Get it Back
Why MySQL Replication Fails, and How to Get it Back
 
An introduction to_rac_system_test_planning_methods
An introduction to_rac_system_test_planning_methodsAn introduction to_rac_system_test_planning_methods
An introduction to_rac_system_test_planning_methods
 
Boot Strapping in Cassandra
Boot Strapping  in CassandraBoot Strapping  in Cassandra
Boot Strapping in Cassandra
 
Optimizing Slow Queries with Indexes and Creativity
Optimizing Slow Queries with Indexes and CreativityOptimizing Slow Queries with Indexes and Creativity
Optimizing Slow Queries with Indexes and Creativity
 
Mysql 57-upcoming-changes
Mysql 57-upcoming-changesMysql 57-upcoming-changes
Mysql 57-upcoming-changes
 
Strategic autovacuum
Strategic autovacuumStrategic autovacuum
Strategic autovacuum
 
Training Slides: 202 - Monitoring & Troubleshooting
Training Slides: 202 - Monitoring & TroubleshootingTraining Slides: 202 - Monitoring & Troubleshooting
Training Slides: 202 - Monitoring & Troubleshooting
 
Cassandra 3.0
Cassandra 3.0Cassandra 3.0
Cassandra 3.0
 
Slide presentation pycassa_upload
Slide presentation pycassa_uploadSlide presentation pycassa_upload
Slide presentation pycassa_upload
 
Devops kc
Devops kcDevops kc
Devops kc
 
2019 Blackhat Booth Presentation - PowerUpSQL
2019 Blackhat Booth Presentation - PowerUpSQL2019 Blackhat Booth Presentation - PowerUpSQL
2019 Blackhat Booth Presentation - PowerUpSQL
 
PowerUpSQL - 2018 Blackhat USA Arsenal Presentation
PowerUpSQL - 2018 Blackhat USA Arsenal PresentationPowerUpSQL - 2018 Blackhat USA Arsenal Presentation
PowerUpSQL - 2018 Blackhat USA Arsenal Presentation
 
Macy's: Changing Engines in Mid-Flight
Macy's: Changing Engines in Mid-FlightMacy's: Changing Engines in Mid-Flight
Macy's: Changing Engines in Mid-Flight
 
Ben Coverston - The Apache Cassandra Project
Ben Coverston - The Apache Cassandra ProjectBen Coverston - The Apache Cassandra Project
Ben Coverston - The Apache Cassandra Project
 

More from DataStax Academy

Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craftForrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
DataStax Academy
 
Introduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph DatabaseIntroduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph Database
DataStax Academy
 
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache CassandraIntroduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
DataStax Academy
 
Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart Labs
DataStax Academy
 
Cassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingCassandra 3.0 Data Modeling
Cassandra 3.0 Data Modeling
DataStax Academy
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stack
DataStax Academy
 
Data Modeling for Apache Cassandra
Data Modeling for Apache CassandraData Modeling for Apache Cassandra
Data Modeling for Apache Cassandra
DataStax Academy
 
Coursera Cassandra Driver
Coursera Cassandra DriverCoursera Cassandra Driver
Coursera Cassandra Driver
DataStax Academy
 
Production Ready Cassandra
Production Ready CassandraProduction Ready Cassandra
Production Ready Cassandra
DataStax Academy
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
DataStax Academy
 
Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1
DataStax Academy
 
Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2
DataStax Academy
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First Cluster
DataStax Academy
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with Dse
DataStax Academy
 
Introduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraIntroduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache Cassandra
DataStax Academy
 
Cassandra Core Concepts
Cassandra Core ConceptsCassandra Core Concepts
Cassandra Core Concepts
DataStax Academy
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax Enterprise
DataStax Academy
 
Bad Habits Die Hard
Bad Habits Die Hard Bad Habits Die Hard
Bad Habits Die Hard
DataStax Academy
 
Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache Cassandra
DataStax Academy
 
Apache Cassandra and Drivers
Apache Cassandra and DriversApache Cassandra and Drivers
Apache Cassandra and Drivers
DataStax Academy
 

More from DataStax Academy (20)

Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craftForrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
 
Introduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph DatabaseIntroduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph Database
 
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache CassandraIntroduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
 
Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart Labs
 
Cassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingCassandra 3.0 Data Modeling
Cassandra 3.0 Data Modeling
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stack
 
Data Modeling for Apache Cassandra
Data Modeling for Apache CassandraData Modeling for Apache Cassandra
Data Modeling for Apache Cassandra
 
Coursera Cassandra Driver
Coursera Cassandra DriverCoursera Cassandra Driver
Coursera Cassandra Driver
 
Production Ready Cassandra
Production Ready CassandraProduction Ready Cassandra
Production Ready Cassandra
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
 
Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1
 
Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First Cluster
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with Dse
 
Introduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraIntroduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache Cassandra
 
Cassandra Core Concepts
Cassandra Core ConceptsCassandra Core Concepts
Cassandra Core Concepts
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax Enterprise
 
Bad Habits Die Hard
Bad Habits Die Hard Bad Habits Die Hard
Bad Habits Die Hard
 
Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache Cassandra
 
Apache Cassandra and Drivers
Apache Cassandra and DriversApache Cassandra and Drivers
Apache Cassandra and Drivers
 

Recently uploaded

Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
Kumud Singh
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
IndexBug
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
Neo4j
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
Pixlogix Infotech
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
Tomaz Bratanic
 

Recently uploaded (20)

Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
 

Cassandra Day NY 2014: Utilizing Apache Cassandra at UltraVisual

  • 1. CASSANDRA @ ULTRAVISUAL Cassandra Day New York 2014 Skye Book Lead Systems Architect
  • 2. ULTRAVISUA L A visual network for inspiration, expression, and collaboration
  • 3. The Feed • A user’s first taste of UV • More than just posts • Constantly being tweaked and re-thought
  • 4. SELECT DISTINCT _post.* FROM _post JOIN _collection_post cp ON _post.uuid=cp.post_uuid JOIN _collection_follow cf ON cp.c_uuid=cf.collection_uuid WHERE cf.user_id = ? ORDER BY _post.created_at DESC LIMIT 20 OFFSET 0 The Old Way Started Simple ! “Show me recent posts in collections I follow”
  • 5. SELECT a.* FROM _user_follow a, _user_follow b WHERE b.follower=12345 AND a.follower=b.followed ORDER BY a.followed_at DESC LIMIT 20 OFFSET 0 The Old Way Added Complexity ! “Show me people recently followed by my connections”
  • 6. The Old Way Every new feature needs another query ! Feed requests generate a disproportionate amount of load to normal CRUD ops
  • 7. Reframing the Problem From This: A place for posts, new collections, social activity, and anything else interesting nitro404.com/computers/knex.php
  • 8. Reframing the Problem To This: A list of items interesting to the user
  • 9. The New Way Model First • With an SQL background, this can be misleading. • Essential Question: “How do I need to access this data?”
  • 10. The New Way: “Try to model data as a log of user intent” (Rick Branson, Instagram, Cassandra Summit 2013)
  • 11. The New Way: Model for user feeds • Fast to fetch user stories • Cached JSON means almost zero SQL requests
      user | status | created_at | story                        | json
      2    | 0      | 61b97280   | user_follow:3:5              | {"foo":"bar"}
      2    | 1      | 5daa04c0   | post:bfbd0a39                | {"foo":"bar"}
      2    | 1      | 565752e0   | collection_follow:5:d70961c1 | {"foo":"bar"}
      2    | 1      | 4a8189e0   | user_follow:3:5              | {"foo":"bar"}
      Slide labels: Primary Key (leading columns), Cached story JSON (json column)
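The access pattern behind this model can be sketched in a few lines of Python. This is only an in-memory illustration, not the talk's implementation: the `feeds` dictionary, `append_story`, and `fetch_page` are hypothetical names, `created_at` is simplified to an integer, and the `status` column is left out. The point it shows is that the story JSON is rendered once at write time, so a read is a single per-user lookup with no joins.

```python
import json
from collections import defaultdict

# Hypothetical in-memory stand-in for the feed table: one list of rows per user.
feeds = defaultdict(list)


def append_story(user, created_at, story, payload):
    """Append a feed row; the story's JSON is rendered once, at write time."""
    feeds[user].append({
        "created_at": created_at,
        "story": story,
        "json": json.dumps(payload),
    })


def fetch_page(user, limit=20):
    """Return the newest `limit` cached payloads for one user, newest first."""
    rows = sorted(feeds[user], key=lambda r: r["created_at"], reverse=True)
    return [json.loads(r["json"]) for r in rows[:limit]]


append_story(2, 100, "user_follow:3:5", {"type": "user_follow"})
append_story(2, 200, "post:bfbd0a39", {"type": "post"})
append_story(2, 300, "collection_follow:5:d70961c1", {"type": "collection_follow"})

page = fetch_page(2, limit=2)
```

In a real table the sort would come from the clustering order rather than from `sorted`, so the read stays a sequential scan of one user's row.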
  • 12. Fast. Response times cut from 100s of ms to the 30 ms range
  • 13. Launch Week Featured by Apple! Cluster Disk Usage 26% 74%
  • 14. Don’t be too cute cqlsh:ultravisual> ALTER TABLE latest_feed DROP json;
  • 15. Handling Deletions • Data is only appended, never deleted from user feeds • Adapted Instagram’s ‘Anti-Column’ solution • Avoids missed deletions for nodes down longer than GCGraceSeconds • Avoids race condition where deletion arrives before write
      Sam follows Sandy:
      user | created_at | status | story
      2    | 4a8189e0   | 1      | user_follow:3:5
      Sam unfollows Sandy:
      user | created_at | status | story
      2    | 61b97280   | 0      | user_follow:3:5
      2    | 4a8189e0   | 1      | user_follow:3:5
  • 16. Negated Entries
      user | created_at | status | story
      2    | 61b97280   | 0      | user_follow:3:5
      2    | 4a8189e0   | 1      | user_follow:3:5
      Keeps all entries in a single time series. First page can usually be populated by a single read.
      user | status | created_at | story
      2    | 0      | 61b97280   | user_follow:3:5
      2    | 1      | 4a8189e0   | user_follow:3:5
      Splits user’s row into two lists, live and undo. Will always require at least two reads.
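How a reader might resolve negated entries at fetch time can be sketched as follows. The slides do not show the read path, so this is an assumption-laden illustration: `visible_stories` is a hypothetical helper, status 1 is taken to mean a live entry and status 0 a negation of the same story, and the hexadecimal `created_at` prefixes from the slides are treated as plain integers that sort by time.

```python
def visible_stories(rows):
    """Resolve appended feed rows into the stories that should be shown.

    rows: iterable of (created_at, status, story) tuples. Status 1 is a live
    entry; status 0 negates older entries for the same story (assumed).
    """
    negated = set()
    visible = []
    # Walk newest first so a negation is seen before the entry it cancels.
    for created_at, status, story in sorted(rows, reverse=True):
        if status == 0:
            negated.add(story)
        elif story not in negated:
            visible.append((created_at, story))
    return visible


rows = [
    (0x4A8189E0, 1, "user_follow:3:5"),               # Sam follows Sandy
    (0x565752E0, 1, "collection_follow:5:d70961c1"),
    (0x5DAA04C0, 1, "post:bfbd0a39"),
    (0x61B97280, 0, "user_follow:3:5"),               # Sam unfollows Sandy
]

result = visible_stories(rows)
```

Because the outcome depends only on the stored timestamps, the order in which the follow and the unfollow reach a node does not matter, which is the race the slide mentions. An entry written after the negation (a re-follow, say) would still be shown under this rule.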
  • 17. Further Uses • User Notifications • User Onboarding • Reshare Statistics • User & Content Reports • API Statistics
  • 18. User Onboarding: Sequenced feed entries for users on signup
      user | created_at | sequence     | step | content
      2    | 61b97280   | onboaring_v2 | 1    | rec_collections_1
      3    | 5daa04c0   | onboaring_v2 | 2    | rec_collections_2
      5    | 565752e0   | onboaring_v3 | 1    | find_friends
      6    | 4a8189e0   | onboaring_v3 | 1    | find_friends
  • 19. Production Experiences Drivers • Java: Started with Astyanax, moved to Datastax v2 • Node.js: node-cassandra-cql
  • 20. DS Driver Issue 229: Cryptic message with large batch updates in pre-release versions of the 2.0 driver
      com.datastax.driver.core.exceptions.DriverInternalError: An unexpected protocol error occured. This is a bug in this library, please report: Unknown code 256 for a consistency level
      As of 2.0, batches with more than 64k statements throw a better exception:
      java.lang.IllegalStateException: Batch statement cannot contain more than 65536 statements.
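One way to stay clear of the batch limit is to split a large set of statements into smaller batches before handing them to the driver. The sketch below is generic Python and makes no driver calls; `chunked` and its default size of 1000 are illustrative choices, and the 65536 constant is simply the figure quoted in the exception message above.

```python
from itertools import islice

# Limit quoted in the exception message on the slide.
MAX_BATCH_STATEMENTS = 65536


def chunked(statements, size=1000):
    """Yield successive lists of at most `size` statements."""
    if not 0 < size <= MAX_BATCH_STATEMENTS:
        raise ValueError("size must be between 1 and %d" % MAX_BATCH_STATEMENTS)
    iterator = iter(statements)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# 2500 placeholder statements split into batches of at most 1000.
sizes = [len(chunk) for chunk in chunked(range(2500), size=1000)]
```

Each yielded chunk would then be sent as its own batch. Smaller batches are usually easier on the coordinator as well, though the right size depends on the workload.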
  • 22. Cassandra-4851: Unfortunate truth in Cassandra 2.0.5
      cqlsh:test> SELECT * FROM user_feed WHERE user = 2 AND created_at > :some_uuid AND status=0;
      Bad Request: PRIMARY KEY part status cannot be restricted (preceding part created_at is either not restricted or by a non-EQ relation)
  • 23. Cassandra-4851: Adds CQL3 support for vector comparison syntax. Available in 2.0.6
      cqlsh:test> SELECT * FROM timeline WHERE day = '21 Jun 2014' AND (hour,min) >= (3,50) AND (hour,min,sec) <= (4,37,30);
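The vector comparison syntax compares the listed columns as a single tuple, position by position. Python tuples compare the same way, so the range in the query above can be illustrated without a cluster. The sample rows and the `in_range` helper are made up for this sketch.

```python
# Sample (hour, min, sec) values, already in clustering order.
rows = [(3, 49, 59), (3, 50, 0), (4, 0, 0), (4, 37, 30), (4, 37, 31), (5, 0, 0)]


def in_range(hour, minute, sec):
    """Mirror (hour,min) >= (3,50) AND (hour,min,sec) <= (4,37,30)."""
    return (hour, minute) >= (3, 50) and (hour, minute, sec) <= (4, 37, 30)


matches = [row for row in rows if in_range(*row)]
```

Note that (4, 0, 0) matches even though its minute value is below 50: the hour is compared first, and only ties fall through to the next column.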
  • 24. Production Experiences Upgrades • Manual package installs (dsc20 from Datastax) • One node at a time • Upgrade, wait for healthy status & operations, move on • OpsCenter provides good overview
  • 25. Production Experiences Speaking of OpsCenter… • Don’t be alarmed if nodes appear but agent data does not • opscenterd often needs a restart after cluster upgrade to see agents again
  • 26. Production Experiences Service Discovery • Running on AWS using Ec2MultiRegionSnitch • Using OpsWorks (Amazon’s Chef service) for seed config
  • 27. Chef Cookbook github.com/skyebook/cassandra-opsworks-chef-cookbook • Forked from Michael Klishin’s awesome C* cookbook • Added integration with OpsWorks’ stack.json
      # Add this node as the first seed
      # If using the multi-region snitch, we must use the public IP address
      if node["cassandra"]["snitch"] == "Ec2MultiRegionSnitch"
        seed_array << node["opsworks"]["instance"]["ip"]
      else
        seed_array << node["opsworks"]["instance"]["private_ip"]
      end

      node["opsworks"]["layers"]["cassandra"]["instances"].each do |instance_name, values|
        if node["cassandra"]["snitch"] == "Ec2MultiRegionSnitch"
          seed_array << values["ip"]
        else
          seed_array << values["private_ip"]
        end
      end
      set[:cassandra][:seeds] = seed_array
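The seed selection in the cookbook excerpt can be tried out against sample data with a small Python port. This is an illustration only: `build_seeds` is a hypothetical function, the dictionary merely imitates the attribute paths used in the excerpt, and the instance names and addresses are made-up documentation values.

```python
def build_seeds(node):
    """Port of the excerpt's logic over a plain dict shaped like the node attributes."""
    # The multi-region snitch needs public addresses; otherwise use private ones.
    use_public = node["cassandra"]["snitch"] == "Ec2MultiRegionSnitch"
    key = "ip" if use_public else "private_ip"

    # This node goes first, followed by every instance in the cassandra layer.
    seeds = [node["opsworks"]["instance"][key]]
    for values in node["opsworks"]["layers"]["cassandra"]["instances"].values():
        seeds.append(values[key])
    return seeds


node = {
    "cassandra": {"snitch": "Ec2MultiRegionSnitch"},
    "opsworks": {
        "instance": {"ip": "203.0.113.10", "private_ip": "10.0.0.10"},
        "layers": {"cassandra": {"instances": {
            "cassandra2": {"ip": "203.0.113.11", "private_ip": "10.0.0.11"},
            "cassandra3": {"ip": "203.0.113.12", "private_ip": "10.0.0.12"},
        }}},
    },
}

public_seeds = build_seeds(node)
node["cassandra"]["snitch"] = "Ec2Snitch"
private_seeds = build_seeds(node)
```

Like the excerpt, the port does not remove duplicates, so if the layer's instance list also contains the current node, its address would appear twice.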