SlideShare a Scribd company logo
CASSANDRA
IN E-COMMERCE

Alexander Solovyev
solovyov.a.g@gmail.com
A PILOT PROJECT: ONLINE PRODUCT CATALOG
FOR A E-COMMERCE PLATFORM, MIGRATION TO
CASSANDRA
Previous version was based on In-Memory Data Grid Oracle
Coherence. All data from the primary storage (a relational database)
is cached in the data grid.
Goals of the migration:
•  minimization of time required for system restart
•  at least two copies of the data in different data-centers
•  quick and simple backup
ARCHITECTURE IN A NUTSHELL
•  Application server: all business logic + web-services
•  stateless
•  with local caches
•  Data storage
•  Oracle Coherence, then Cassandra via DataStax Java Driver
•  Batch data loading based on Spring Batch
HOW A PRODUCT SHOULD LOOK LIKE TO MEET
THE REQUIREMENTS?
Some hypotheses:
•  Data is on disk – available immediately after restart
•  OS disk cache brings all the data to memory
•  Key-value storage to simplify migration of the codebase
Nice to have:
•  Simple deployment configuration as a plus
•  Java-based solution as a plus
BASIC REQUIREMENTS / USE-CASES
•  reads: ~5K TPS
•  transactions can include more that one round-trip to the
storage, as well as more than one key in a query (“multi-gets”)
•  ~50K TPS on side of the storage
•  full data reload (once per 24 hours)
•  partial update of values (e.g. of product attributes)
•  availability 24x7
•  millions of products
•  tens of millions of related entities (product attributes etc.)
CANDIDATES
•  MongoDB
•  HBase
•  Oracle Coherence + data persistence (a la Riak)
•  Cassandra
PERFORMANCE TESTING ENVIRONMENT
•  Production-ready implementation
•  4 boxes (16 CPU, 24 GB) x 1 Cassandra instance
•  2 boxes x 2 app servers
•  100 GB of test data - fits in memory
•  Main test is read queries:
•  one hour
•  up to 500 users
•  even distribution of requested keys
WHAT DID HELP
• 

configure your Cassandra cluster
•  “OS swap off”
•  different physical disks for different file-sets - e.g. data vs. commit log
•  choose right (“private”) network interface

• 

async queries for multi-gets + token-aware rouring on the app server side:
+15% TPS and latency

• 

use last Cassandra version
•  a good example: 1.2.6 => 1.2.8 – 15% TPS, latency 2x better
WHAT DID HELP
• 

Use the key of a parent entity as a first component of the children keys:
PRIMARY KEY (parent-ID, child-ID)
•  to minimize number of queries / disk seeks
•  +15% TPS, latency 2x better

• 

use local (“near”) caches on app server side: +15% TPS
•  local EHCache
WHAT DID NOT HELP
• 

Java GC monitoring on Cassandra boxes
•  with recommended settings GC takes 7% maximum from overall time of
the tests

• 

caching == ALL
•  all data in OS disk cache
INTERESTING EXPERIMENTS
• 

another implementation of the token-aware query routing

• 

JSON or any other data format, if partial updates are not needed – a pure
key-value model
•  allows to avoid creation of tombstones in the case of updates, if values
contain Cassandra collections
•  another option is tuning of tombstone GC
SUMMARY
• 

Cassandra is stable and mature enough product

• 

Can compete with in-memory caches and data grids, at least if dataset is
small enough to be placed into memory

• 

Actively developing. Has a large community. Good commercial support from
DataStax
THANK YOU

…and your questions J

More Related Content

What's hot

[Meetup] a successful migration from elastic search to clickhouse
[Meetup] a successful migration from elastic search to clickhouse[Meetup] a successful migration from elastic search to clickhouse
[Meetup] a successful migration from elastic search to clickhouse
Vianney FOUCAULT
 
Elk
Elk Elk
Data pipelines observability: OpenLineage & Marquez
Data pipelines observability:  OpenLineage & MarquezData pipelines observability:  OpenLineage & Marquez
Data pipelines observability: OpenLineage & Marquez
Julien Le Dem
 
Online MySQL Backups with Percona XtraBackup
Online MySQL Backups with Percona XtraBackupOnline MySQL Backups with Percona XtraBackup
Online MySQL Backups with Percona XtraBackup
Kenny Gryp
 
The InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQLThe InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQLMorgan Tocker
 
Mux loves Clickhouse. By Adam Brown, Mux founder
Mux loves Clickhouse. By Adam Brown, Mux founderMux loves Clickhouse. By Adam Brown, Mux founder
Mux loves Clickhouse. By Adam Brown, Mux founder
Altinity Ltd
 
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
Altinity Ltd
 
Indexes in postgres
Indexes in postgresIndexes in postgres
Indexes in postgres
Louise Grandjonc
 
Data Science Across Data Sources with Apache Arrow
Data Science Across Data Sources with Apache ArrowData Science Across Data Sources with Apache Arrow
Data Science Across Data Sources with Apache Arrow
Databricks
 
PostgreSQL Extensions: A deeper look
PostgreSQL Extensions:  A deeper lookPostgreSQL Extensions:  A deeper look
PostgreSQL Extensions: A deeper look
Jignesh Shah
 
MongoDB at Scale
MongoDB at ScaleMongoDB at Scale
MongoDB at Scale
MongoDB
 
RocksDB detail
RocksDB detailRocksDB detail
RocksDB detail
MIJIN AN
 
Your first ClickHouse data warehouse
Your first ClickHouse data warehouseYour first ClickHouse data warehouse
Your first ClickHouse data warehouse
Altinity Ltd
 
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookTech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
The Hive
 
Deep Dive on Amazon Aurora
Deep Dive on Amazon AuroraDeep Dive on Amazon Aurora
Deep Dive on Amazon Aurora
Amazon Web Services
 
Transposition cipher
Transposition cipherTransposition cipher
Transposition cipher
Antony Alex
 
ClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and howClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and how
Altinity Ltd
 
Solving PostgreSQL wicked problems
Solving PostgreSQL wicked problemsSolving PostgreSQL wicked problems
Solving PostgreSQL wicked problems
Alexander Korotkov
 
腾讯大讲堂06 qq邮箱性能优化
腾讯大讲堂06 qq邮箱性能优化腾讯大讲堂06 qq邮箱性能优化
腾讯大讲堂06 qq邮箱性能优化areyouok
 

What's hot (20)

[Meetup] a successful migration from elastic search to clickhouse
[Meetup] a successful migration from elastic search to clickhouse[Meetup] a successful migration from elastic search to clickhouse
[Meetup] a successful migration from elastic search to clickhouse
 
Elk
Elk Elk
Elk
 
Data pipelines observability: OpenLineage & Marquez
Data pipelines observability:  OpenLineage & MarquezData pipelines observability:  OpenLineage & Marquez
Data pipelines observability: OpenLineage & Marquez
 
Online MySQL Backups with Percona XtraBackup
Online MySQL Backups with Percona XtraBackupOnline MySQL Backups with Percona XtraBackup
Online MySQL Backups with Percona XtraBackup
 
The InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQLThe InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQL
 
Mux loves Clickhouse. By Adam Brown, Mux founder
Mux loves Clickhouse. By Adam Brown, Mux founderMux loves Clickhouse. By Adam Brown, Mux founder
Mux loves Clickhouse. By Adam Brown, Mux founder
 
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
 
Indexes in postgres
Indexes in postgresIndexes in postgres
Indexes in postgres
 
Data Science Across Data Sources with Apache Arrow
Data Science Across Data Sources with Apache ArrowData Science Across Data Sources with Apache Arrow
Data Science Across Data Sources with Apache Arrow
 
Mongo DB
Mongo DBMongo DB
Mongo DB
 
PostgreSQL Extensions: A deeper look
PostgreSQL Extensions:  A deeper lookPostgreSQL Extensions:  A deeper look
PostgreSQL Extensions: A deeper look
 
MongoDB at Scale
MongoDB at ScaleMongoDB at Scale
MongoDB at Scale
 
RocksDB detail
RocksDB detailRocksDB detail
RocksDB detail
 
Your first ClickHouse data warehouse
Your first ClickHouse data warehouseYour first ClickHouse data warehouse
Your first ClickHouse data warehouse
 
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookTech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
 
Deep Dive on Amazon Aurora
Deep Dive on Amazon AuroraDeep Dive on Amazon Aurora
Deep Dive on Amazon Aurora
 
Transposition cipher
Transposition cipherTransposition cipher
Transposition cipher
 
ClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and howClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and how
 
Solving PostgreSQL wicked problems
Solving PostgreSQL wicked problemsSolving PostgreSQL wicked problems
Solving PostgreSQL wicked problems
 
腾讯大讲堂06 qq邮箱性能优化
腾讯大讲堂06 qq邮箱性能优化腾讯大讲堂06 qq邮箱性能优化
腾讯大讲堂06 qq邮箱性能优化
 

Viewers also liked

Apache Cassandra at Macys
Apache Cassandra at MacysApache Cassandra at Macys
Apache Cassandra at Macys
DataStax Academy
 
Apache Cassandra Data Modeling with Travis Price
Apache Cassandra Data Modeling with Travis PriceApache Cassandra Data Modeling with Travis Price
Apache Cassandra Data Modeling with Travis Price
DataStax Academy
 
Cassandra and Riak at BestBuy.com
Cassandra and Riak at BestBuy.comCassandra and Riak at BestBuy.com
Cassandra and Riak at BestBuy.com
joelcrabb
 
Datastax Expedia
Datastax ExpediaDatastax Expedia
Datastax Expedia
Eddie Satterly
 
Webinar - Macy’s: Why Your Database Decision Directly Impacts Customer Experi...
Webinar - Macy’s: Why Your Database Decision Directly Impacts Customer Experi...Webinar - Macy’s: Why Your Database Decision Directly Impacts Customer Experi...
Webinar - Macy’s: Why Your Database Decision Directly Impacts Customer Experi...
DataStax
 
Macy's: Changing Engines in Mid-Flight
Macy's: Changing Engines in Mid-FlightMacy's: Changing Engines in Mid-Flight
Macy's: Changing Engines in Mid-Flight
DataStax Academy
 
Advanced search and Top-k queries in Cassandra - Cassandra Summit Europe 2014
Advanced search and Top-k queries in Cassandra - Cassandra Summit Europe 2014Advanced search and Top-k queries in Cassandra - Cassandra Summit Europe 2014
Advanced search and Top-k queries in Cassandra - Cassandra Summit Europe 2014
dhiguero
 
Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...
Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...
Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...
DataStax
 
Performance Monitoring: Understanding Your Scylla Cluster
Performance Monitoring: Understanding Your Scylla ClusterPerformance Monitoring: Understanding Your Scylla Cluster
Performance Monitoring: Understanding Your Scylla Cluster
ScyllaDB
 
The Upstream Game, 2hr version
The Upstream Game, 2hr versionThe Upstream Game, 2hr version
The Upstream Game, 2hr version
Sean Roberts
 
The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...
The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...
The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...
DataStax
 
Web Store with Catalog and Product Management
Web Store with Catalog and Product ManagementWeb Store with Catalog and Product Management
Web Store with Catalog and Product Management
Mike Taylor
 
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
DataStax
 
2016 August POWER Up Your Insights - IBM System Summit Mumbai
2016 August POWER Up Your Insights - IBM System Summit Mumbai2016 August POWER Up Your Insights - IBM System Summit Mumbai
2016 August POWER Up Your Insights - IBM System Summit Mumbai
Anand Haridass
 
2016 Sept 1st - IBM Consultants & System Integrators Interchange - Big Data -...
2016 Sept 1st - IBM Consultants & System Integrators Interchange - Big Data -...2016 Sept 1st - IBM Consultants & System Integrators Interchange - Big Data -...
2016 Sept 1st - IBM Consultants & System Integrators Interchange - Big Data -...
Anand Haridass
 
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax EnterpriseA Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
Patrick McFadin
 
Apache Cassandra in the Real World
Apache Cassandra in the Real WorldApache Cassandra in the Real World
Apache Cassandra in the Real World
Jeremy Hanna
 
Target: Escaping Disco-Era Data Modeling
Target: Escaping Disco-Era Data ModelingTarget: Escaping Disco-Era Data Modeling
Target: Escaping Disco-Era Data Modeling
DataStax Academy
 
When to Use MongoDB...and When You Should Not...
When to Use MongoDB...and When You Should Not...When to Use MongoDB...and When You Should Not...
When to Use MongoDB...and When You Should Not...
MongoDB
 
Building a Distributed Reservation System with Cassandra (Andrew Baker & Jeff...
Building a Distributed Reservation System with Cassandra (Andrew Baker & Jeff...Building a Distributed Reservation System with Cassandra (Andrew Baker & Jeff...
Building a Distributed Reservation System with Cassandra (Andrew Baker & Jeff...
DataStax
 

Viewers also liked (20)

Apache Cassandra at Macys
Apache Cassandra at MacysApache Cassandra at Macys
Apache Cassandra at Macys
 
Apache Cassandra Data Modeling with Travis Price
Apache Cassandra Data Modeling with Travis PriceApache Cassandra Data Modeling with Travis Price
Apache Cassandra Data Modeling with Travis Price
 
Cassandra and Riak at BestBuy.com
Cassandra and Riak at BestBuy.comCassandra and Riak at BestBuy.com
Cassandra and Riak at BestBuy.com
 
Datastax Expedia
Datastax ExpediaDatastax Expedia
Datastax Expedia
 
Webinar - Macy’s: Why Your Database Decision Directly Impacts Customer Experi...
Webinar - Macy’s: Why Your Database Decision Directly Impacts Customer Experi...Webinar - Macy’s: Why Your Database Decision Directly Impacts Customer Experi...
Webinar - Macy’s: Why Your Database Decision Directly Impacts Customer Experi...
 
Macy's: Changing Engines in Mid-Flight
Macy's: Changing Engines in Mid-FlightMacy's: Changing Engines in Mid-Flight
Macy's: Changing Engines in Mid-Flight
 
Advanced search and Top-k queries in Cassandra - Cassandra Summit Europe 2014
Advanced search and Top-k queries in Cassandra - Cassandra Summit Europe 2014Advanced search and Top-k queries in Cassandra - Cassandra Summit Europe 2014
Advanced search and Top-k queries in Cassandra - Cassandra Summit Europe 2014
 
Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...
Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...
Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...
 
Performance Monitoring: Understanding Your Scylla Cluster
Performance Monitoring: Understanding Your Scylla ClusterPerformance Monitoring: Understanding Your Scylla Cluster
Performance Monitoring: Understanding Your Scylla Cluster
 
The Upstream Game, 2hr version
The Upstream Game, 2hr versionThe Upstream Game, 2hr version
The Upstream Game, 2hr version
 
The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...
The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...
The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...
 
Web Store with Catalog and Product Management
Web Store with Catalog and Product ManagementWeb Store with Catalog and Product Management
Web Store with Catalog and Product Management
 
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
 
2016 August POWER Up Your Insights - IBM System Summit Mumbai
2016 August POWER Up Your Insights - IBM System Summit Mumbai2016 August POWER Up Your Insights - IBM System Summit Mumbai
2016 August POWER Up Your Insights - IBM System Summit Mumbai
 
2016 Sept 1st - IBM Consultants & System Integrators Interchange - Big Data -...
2016 Sept 1st - IBM Consultants & System Integrators Interchange - Big Data -...2016 Sept 1st - IBM Consultants & System Integrators Interchange - Big Data -...
2016 Sept 1st - IBM Consultants & System Integrators Interchange - Big Data -...
 
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax EnterpriseA Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
 
Apache Cassandra in the Real World
Apache Cassandra in the Real WorldApache Cassandra in the Real World
Apache Cassandra in the Real World
 
Target: Escaping Disco-Era Data Modeling
Target: Escaping Disco-Era Data ModelingTarget: Escaping Disco-Era Data Modeling
Target: Escaping Disco-Era Data Modeling
 
When to Use MongoDB...and When You Should Not...
When to Use MongoDB...and When You Should Not...When to Use MongoDB...and When You Should Not...
When to Use MongoDB...and When You Should Not...
 
Building a Distributed Reservation System with Cassandra (Andrew Baker & Jeff...
Building a Distributed Reservation System with Cassandra (Andrew Baker & Jeff...Building a Distributed Reservation System with Cassandra (Andrew Baker & Jeff...
Building a Distributed Reservation System with Cassandra (Andrew Baker & Jeff...
 

Similar to Cassandra in e-commerce

Oracle 12c New Features_RAC_slides
Oracle 12c New Features_RAC_slidesOracle 12c New Features_RAC_slides
Oracle 12c New Features_RAC_slidesSaiful
 
Scylla Summit 2016: Compose on Containing the Database
Scylla Summit 2016: Compose on Containing the DatabaseScylla Summit 2016: Compose on Containing the Database
Scylla Summit 2016: Compose on Containing the Database
ScyllaDB
 
VMworld 2013: Virtualizing Databases: Doing IT Right
VMworld 2013: Virtualizing Databases: Doing IT Right VMworld 2013: Virtualizing Databases: Doing IT Right
VMworld 2013: Virtualizing Databases: Doing IT Right
VMworld
 
TechBeats #2
TechBeats #2TechBeats #2
TechBeats #2
applausepoland
 
Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...
Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...
Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...
Data Con LA
 
Database as a Service on the Oracle Database Appliance Platform
Database as a Service on the Oracle Database Appliance PlatformDatabase as a Service on the Oracle Database Appliance Platform
Database as a Service on the Oracle Database Appliance Platform
Maris Elsins
 
Running Oracle EBS in the cloud (DOAG TECH17 edition)
Running Oracle EBS in the cloud (DOAG TECH17 edition)Running Oracle EBS in the cloud (DOAG TECH17 edition)
Running Oracle EBS in the cloud (DOAG TECH17 edition)
Andrejs Prokopjevs
 
NGENSTOR_ODA_P2V_V5
NGENSTOR_ODA_P2V_V5NGENSTOR_ODA_P2V_V5
NGENSTOR_ODA_P2V_V5UniFabric
 
The Design, Implementation and Open Source Way of Apache Pegasus
The Design, Implementation and Open Source Way of Apache PegasusThe Design, Implementation and Open Source Way of Apache Pegasus
The Design, Implementation and Open Source Way of Apache Pegasus
acelyc1112009
 
Owning time series with team apache Strata San Jose 2015
Owning time series with team apache   Strata San Jose 2015Owning time series with team apache   Strata San Jose 2015
Owning time series with team apache Strata San Jose 2015
Patrick McFadin
 
Oracle GoldenGate Architecture Performance
Oracle GoldenGate Architecture PerformanceOracle GoldenGate Architecture Performance
Oracle GoldenGate Architecture PerformanceEnkitec
 
Oracle GoldenGate Presentation from OTN Virtual Technology Summit - 7/9/14 (PDF)
Oracle GoldenGate Presentation from OTN Virtual Technology Summit - 7/9/14 (PDF)Oracle GoldenGate Presentation from OTN Virtual Technology Summit - 7/9/14 (PDF)
Oracle GoldenGate Presentation from OTN Virtual Technology Summit - 7/9/14 (PDF)
Bobby Curtis
 
OGG Architecture Performance
OGG Architecture PerformanceOGG Architecture Performance
OGG Architecture PerformanceEnkitec
 
Intro to Apache Kudu (short) - Big Data Application Meetup
Intro to Apache Kudu (short) - Big Data Application MeetupIntro to Apache Kudu (short) - Big Data Application Meetup
Intro to Apache Kudu (short) - Big Data Application Meetup
Mike Percy
 
Introducing Apache Kudu (Incubating) - Montreal HUG May 2016
Introducing Apache Kudu (Incubating) - Montreal HUG May 2016Introducing Apache Kudu (Incubating) - Montreal HUG May 2016
Introducing Apache Kudu (Incubating) - Montreal HUG May 2016
Mladen Kovacevic
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout Session
Splunk
 
Introduction to Apache Kudu
Introduction to Apache KuduIntroduction to Apache Kudu
Introduction to Apache Kudu
Jeff Holoman
 
Snowflake Datawarehouse Architecturing
Snowflake Datawarehouse ArchitecturingSnowflake Datawarehouse Architecturing
Snowflake Datawarehouse Architecturing
Ishan Bhawantha Hewanayake
 
PostgreSQL High Availability in a Containerized World
PostgreSQL High Availability in a Containerized WorldPostgreSQL High Availability in a Containerized World
PostgreSQL High Availability in a Containerized World
Jignesh Shah
 
Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)
James Serra
 

Similar to Cassandra in e-commerce (20)

Oracle 12c New Features_RAC_slides
Oracle 12c New Features_RAC_slidesOracle 12c New Features_RAC_slides
Oracle 12c New Features_RAC_slides
 
Scylla Summit 2016: Compose on Containing the Database
Scylla Summit 2016: Compose on Containing the DatabaseScylla Summit 2016: Compose on Containing the Database
Scylla Summit 2016: Compose on Containing the Database
 
VMworld 2013: Virtualizing Databases: Doing IT Right
VMworld 2013: Virtualizing Databases: Doing IT Right VMworld 2013: Virtualizing Databases: Doing IT Right
VMworld 2013: Virtualizing Databases: Doing IT Right
 
TechBeats #2
TechBeats #2TechBeats #2
TechBeats #2
 
Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...
Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...
Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...
 
Database as a Service on the Oracle Database Appliance Platform
Database as a Service on the Oracle Database Appliance PlatformDatabase as a Service on the Oracle Database Appliance Platform
Database as a Service on the Oracle Database Appliance Platform
 
Running Oracle EBS in the cloud (DOAG TECH17 edition)
Running Oracle EBS in the cloud (DOAG TECH17 edition)Running Oracle EBS in the cloud (DOAG TECH17 edition)
Running Oracle EBS in the cloud (DOAG TECH17 edition)
 
NGENSTOR_ODA_P2V_V5
NGENSTOR_ODA_P2V_V5NGENSTOR_ODA_P2V_V5
NGENSTOR_ODA_P2V_V5
 
The Design, Implementation and Open Source Way of Apache Pegasus
The Design, Implementation and Open Source Way of Apache PegasusThe Design, Implementation and Open Source Way of Apache Pegasus
The Design, Implementation and Open Source Way of Apache Pegasus
 
Owning time series with team apache Strata San Jose 2015
Owning time series with team apache   Strata San Jose 2015Owning time series with team apache   Strata San Jose 2015
Owning time series with team apache Strata San Jose 2015
 
Oracle GoldenGate Architecture Performance
Oracle GoldenGate Architecture PerformanceOracle GoldenGate Architecture Performance
Oracle GoldenGate Architecture Performance
 
Oracle GoldenGate Presentation from OTN Virtual Technology Summit - 7/9/14 (PDF)
Oracle GoldenGate Presentation from OTN Virtual Technology Summit - 7/9/14 (PDF)Oracle GoldenGate Presentation from OTN Virtual Technology Summit - 7/9/14 (PDF)
Oracle GoldenGate Presentation from OTN Virtual Technology Summit - 7/9/14 (PDF)
 
OGG Architecture Performance
OGG Architecture PerformanceOGG Architecture Performance
OGG Architecture Performance
 
Intro to Apache Kudu (short) - Big Data Application Meetup
Intro to Apache Kudu (short) - Big Data Application MeetupIntro to Apache Kudu (short) - Big Data Application Meetup
Intro to Apache Kudu (short) - Big Data Application Meetup
 
Introducing Apache Kudu (Incubating) - Montreal HUG May 2016
Introducing Apache Kudu (Incubating) - Montreal HUG May 2016Introducing Apache Kudu (Incubating) - Montreal HUG May 2016
Introducing Apache Kudu (Incubating) - Montreal HUG May 2016
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout Session
 
Introduction to Apache Kudu
Introduction to Apache KuduIntroduction to Apache Kudu
Introduction to Apache Kudu
 
Snowflake Datawarehouse Architecturing
Snowflake Datawarehouse ArchitecturingSnowflake Datawarehouse Architecturing
Snowflake Datawarehouse Architecturing
 
PostgreSQL High Availability in a Containerized World
PostgreSQL High Availability in a Containerized WorldPostgreSQL High Availability in a Containerized World
PostgreSQL High Availability in a Containerized World
 
Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)
 

Recently uploaded

FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 

Recently uploaded (20)

FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 

Cassandra in e-commerce

  • 2. A PILOT PROJECT: ONLINE PRODUCT CATALOG FOR A E-COMMERCE PLATFORM, MIGRATION TO CASSANDRA Previous version was based on In-Memory Data Grid Oracle Coherence. All data from the primary storage (a relational database) is cached in the data grid. Goals of the migration: •  minimization of time required for system restart •  at least two copies of the data in different data-centers •  quick and simple backup
  • 3. ARCHITECTURE IN A NUTSHELL •  Application server: all business logic + web-services •  stateless •  with local caches •  Data storage •  Oracle Coherence, then Cassandra via DataStax Java Driver •  Batch data loading based on Spring Batch
  • 4. HOW A PRODUCT SHOULD LOOK LIKE TO MEET THE REQUIREMENTS? Some hypotheses: •  Data is on disk – available immediately after restart •  OS disk cache brings all the data to memory •  Key-value storage to simplify migration of the codebase Nice to have: •  Simple deployment configuration as a plus •  Java-based solution as a plus
  • 5. BASIC REQUIREMENTS / USE-CASES •  reads: ~5K TPS •  transactions can include more that one round-trip to the storage, as well as more than one key in a query (“multi-gets”) •  ~50K TPS on side of the storage •  full data reload (once per 24 hours) •  partial update of values (e.g. of product attributes) •  availability 24x7 •  millions of products •  tens of millions of related entities (product attributes etc.)
  • 6. CANDIDATES •  MongoDB •  HBase •  Oracle Coherence + data persistence (a la Riak) •  Cassandra
  • 7. PERFORMANCE TESTING ENVIRONMENT •  Production-ready implementation •  4 boxes (16 CPU, 24 GB) x 1 Cassandra instance •  2 boxes x 2 app servers •  100 GB of test data - fits in memory •  Main test is read queries: •  one hour •  up to 500 users •  even distribution of requested keys
  • 8. WHAT DID HELP •  configure your Cassandra cluster •  “OS swap off” •  different physical disks for different file-sets - e.g. data vs. commit log •  choose right (“private”) network interface •  async queries for multi-gets + token-aware rouring on the app server side: +15% TPS and latency •  use last Cassandra version •  a good example: 1.2.6 => 1.2.8 – 15% TPS, latency 2x better
  • 9. WHAT DID HELP •  Use the key of a parent entity as a first component of the children keys: PRIMARY KEY (parent-ID, child-ID) •  to minimize number of queries / disk seeks •  +15% TPS, latency 2x better •  use local (“near”) caches on app server side: +15% TPS •  local EHCache
  • 10. WHAT DID NOT HELP •  Java GC monitoring on Cassandra boxes •  with recommended settings GC takes 7% maximum from overall time of the tests •  caching == ALL •  all data in OS disk cache
  • 11. INTERESTING EXPERIMENTS •  another implementation of the token-aware query routing •  JSON or any other data format, if partial updates are not needed – a pure key-value model •  allows to avoid creation of tombstones in the case of updates, if values contain Cassandra collections •  another option is tuning of tombstone GC
  • 12. SUMMARY •  Cassandra is stable and mature enough product •  Can compete with in-memory caches and data grids, at least if dataset is small enough to be placed into memory •  Actively developing. Has a large community. Good commercial support from DataStax
  • 13. THANK YOU …and your questions J