DataStax Enterprise in the Field – 20160920
Daniel Cohen
Presented on 20 September 2016 at "DataStax Data London".
1.
DataStax Enterprise in the Field
Daniel Cohen, Solutions Engineer @ DataStax
2.
© DataStax, All Rights Reserved.
But Enough About Me…
• Solutions Engineer at DataStax
• LA ➜ SF ➜ NYC ➜ SF ➜ London
• Previously at JP Morgan in London
• Finance & digital media
5.
1 Introductions
2 Top Customer Questions
3 Field Lessons: Big Irish Bank
4 Field Lessons: Big British Bank
6.
Top Customer Questions
• What are all the other [banks] doing?
• How many nodes do I need?
• What do you mean SSDs?
• How do I load data from [Oracle]?
• We already have [MongoDB] for NoSQL. What’s the difference?
7.
What are all the other [banks] doing?
“Tell me secrets about my competitors.”
8.
Transform Legacy Infrastructure
[Diagram labels: Global Users; User Interface / Application Services; DataStax Enterprise Cluster (DSE); Legacy Systems: USA Equities, USA FX, UK FX, UK Bonds, …]
9.
Transition Legacy to Microservices
[Diagram labels: Users; µServices, Messages, and Data layers, each spanning DC NY1 and DC LDN1; DSE; Legacy: USA Customers, UK Accounts]
10.
How many nodes do I need?
“How long is a piece of string?”
19.
The Node Count Dance
• “How many nodes do I need?” is a natural question.
  – Large organizations buy hardware months in advance.
• Desires ➔ Storage, Throughput, Latency, SLAs
• Realities
  – Cost
  – Data center capacity (space)
  – Operational capacity (people)
  – Your hardware
  – Your use cases
• Lesson 1 ➔ Computer science is about trade-offs.
• Lesson 2 ➔ Test, iterate, test.
• Lesson 3 ➔ Good news! DSE scales linearly.
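The trade-off between storage and throughput in the Node Count Dance can be sketched as a first-guess calculation. This is a minimal illustration, not DataStax's sizing method: the function name `estimate_node_count`, its parameters, and every input value are assumptions invented for the example, and it leans on the linear scaling claimed in Lesson 3.

```python
import math


def estimate_node_count(
    raw_data_tb: float,
    replication_factor: int,
    usable_tb_per_node: float,
    peak_ops_per_sec: float,
    ops_per_sec_per_node: float,
) -> int:
    """Return a first-guess node count from storage and throughput needs."""
    # Storage: each row is stored replication_factor times across the cluster.
    nodes_for_storage = math.ceil(
        raw_data_tb * replication_factor / usable_tb_per_node
    )
    # Throughput: assumes capacity grows linearly with node count.
    nodes_for_throughput = math.ceil(peak_ops_per_sec / ops_per_sec_per_node)
    # The larger requirement wins, and never fewer nodes than replicas.
    return max(nodes_for_storage, nodes_for_throughput, replication_factor)


if __name__ == "__main__":
    # All inputs are hypothetical illustration values, not recommendations.
    nodes = estimate_node_count(
        raw_data_tb=4.0,
        replication_factor=3,
        usable_tb_per_node=1.0,
        peak_ops_per_sec=20_000,
        ops_per_sec_per_node=2_500,
    )
    print(f"First-guess node count: {nodes}")
```

Per Lesson 2, a number like this is only a starting point: the per-node throughput figure in particular has to come from testing on your own hardware and use cases.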
20.
What do you mean SSDs?
“We have an amazing SAN.”
21.
Storage Matters
SSD (consumer grade)
• 10K – 1M IOPS
• 400 MB/s – 3 GB/s bandwidth
• < 200 µs latency
15K RPM HDD (spinning rust)
• ~ 200 IOPS
• ~ 160 MB/s bandwidth
• > 5 ms latency
✴ Acknowledgements to my colleague Kathryn Erickson.
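To make the IOPS gap concrete, here is a small worked calculation. It is a sketch under simplifying assumptions: the workload size of one million random reads is hypothetical, the helper name `seconds_for_random_reads` is invented for the example, and it treats each device as sustaining a steady IOPS rate with no caching or queueing effects. The IOPS figures are the ones quoted on the slide, using the low end of the SSD range.

```python
def seconds_for_random_reads(read_count: int, iops: float) -> float:
    """Time for one device to serve read_count random reads at a steady IOPS rate."""
    return read_count / iops


READS = 1_000_000  # hypothetical workload size

hdd_seconds = seconds_for_random_reads(READS, 200)      # ~200 IOPS, 15K RPM HDD
ssd_seconds = seconds_for_random_reads(READS, 10_000)   # 10K IOPS, low end of SSD range

print(f"HDD: {hdd_seconds:.0f} s")
print(f"SSD (low end): {ssd_seconds:.0f} s")
print(f"Ratio: {hdd_seconds / ssd_seconds:.0f}x")
```

Even at the bottom of the quoted SSD range, the same random-read workload finishes in a small fraction of the time a single spinning disk would need.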
22.
Storage Interfaces Matter
Interface        Transfer Rate
SATA III         6 Gb/s
SAS II           6 Gb/s
SAS III          12 Gb/s
PCIe Gen 2 x8    32 Gb/s
30.
A Nondeterministic Path to Failure
• What about my incredible SAN?
  – Do not use network attached storage with DSE.
• But our SAN is awesome! We paid a lot of money for it.
  – No! Do not use network attached storage with DSE.
• Fine. What about EBS?
  – Let’s discuss!
31.
Starting Points
Workload             CPU           RAM            Storage
DSE (Read Heavy)     8-24 cores    32-128 GB ✴    Local SSD (0.5-2 TB)
DSE (Write Heavy)    12-32 cores   32-128 GB      Local SSD (1-3 TB)
DSE + Search         16-32 cores   128 GB         Local SSD (1-3 TB)
DSE + Analytics      16-32 cores   128+ GB        Local SSD (1-3 TB)
✴ Got extra RAM? Cache is king.
✴✴ 1 Gb ethernet is fine. 10 Gb is future-proof.
32.
We already have [MongoDB] for NoSQL. What’s the difference?
“Behold the one true NoSQL database.”
38.
NoSQL Fantasy
40.
Proof of Technology @ Big Irish Bank
Initial Goals
• Deploy on AWS
• Ingest ten years of (fake) customer data efficiently
• Fast retrieval & search
Synopsis
• Payment Services Directive (PSD II) and Open Banking
• Customer access to current and historical data via APIs
• Competitive PoT versus other database vendors
47.
Hardware
PoT Recommendation
• 6 x i2.xlarge (AWS)
• 4 vCPU, 30.5 GB RAM
• 1 x 800 GB local SSD
PoT Mark 1
• c4.8xlarge (AWS)
• 36 vCPU, 60 GB RAM
• EBS only
PoT Final
• 6 x i2.xlarge (AWS)
• 4 vCPU, 30.5 GB RAM
• 1 x 800 GB local SSD
Production
• 8 nodes across 2 data centers (4:4)
• HP DL380 Gen9 ➔ 32 cores, 256 GB RAM, 3.2 TB SSDs on SAS III
• 10 Gb ethernet, fiber between DCs
53.
Lessons
1) The Node Count Dance is iterative.
• Initial node count estimates were low.
• Early refusal to modify AWS setup.
• Avoid rigidity. Test, iterate, test.
2) Quis custodiet ipsos custodes?
• Hit performance plateau at 5,000 ops/s.
• Added a second JMeter instance, performance doubled to 10,000 ops/s.
• JMeter was the bottleneck!
• Who will test the testers?
3) EBS is still network attached.
• 99th percentile read latency (milliseconds)
  ▫ 3.311 ➔ local SSD
  ▫ 35.425 ➔ EBS Provisioned SSD
• Competing vendor falsified numbers.
• Lies, damned lies, and statistics.
4) Not all data needs to be hot.
• PoT Mark 1 ➔ 10 years of hot data
  ▫ ~ 20 billion transactions
  ▫ ~ 30 nodes to reach latency targets
• PoT Final ➔ 2 years of hot data
• Do not architect by convenience.
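The effect of shrinking the hot window in Lesson 4 can be worked through with simple arithmetic. This is a back-of-the-envelope sketch, not the calculation used in the PoT: it assumes transactions are spread evenly across the ten years and that node count scales linearly with hot data volume, and the function name `scale_for_hot_window` is invented for the example. The input figures (about 20 billion transactions, about 30 nodes, 10 years versus 2 years) are the ones quoted on the slide.

```python
import math


def scale_for_hot_window(
    total_rows: int, total_nodes: int, full_years: int, hot_years: int
) -> tuple[float, int]:
    """Scale row and node counts linearly with the length of the hot window."""
    # Assumes rows are evenly distributed over time (a simplification).
    hot_rows = total_rows * hot_years / full_years
    # Assumes node count is proportional to hot data volume; round up.
    hot_nodes = math.ceil(total_nodes * hot_years / full_years)
    return hot_rows, hot_nodes


rows, nodes = scale_for_hot_window(
    total_rows=20_000_000_000, total_nodes=30, full_years=10, hot_years=2
)
print(f"Hot rows: {rows:,.0f}")
print(f"Nodes under linear scaling: {nodes}")
```

Under these assumptions the result is about 4 billion hot transactions on 6 nodes, which happens to match the six nodes listed for PoT Final, although the slides do not say the figure was derived this way.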
55.
Production Pilot @ Big British Bank
Initial Goals
• Transition from mothballed trials of OrientDB, Titan
• Ingest enormous quantities of data from legacy DB
• Prove graph at scale
Synopsis
• Customer 360° use case across banking group
• DSE Graph
• Dissatisfied with other graph databases
63.
Hardware
Pilot Mark 1
• “Private Cloud”
• N x Hosted VM
• 8 vCPU, 112 GB RAM
• SAN only (for now)
Pilot Mark 2
• “Hadoop Leftovers”
• 4 x HP DL380s
• 24 cores, 512 GB RAM
• 1 x 2.1 TB SSD
• 14 x 2 TB HDDs
Pilot Final
• 3 x Dell C6220
• 12 cores, 128 GB RAM
• 6 x 1 TB SATA HDDs
  ▫ 2 x OS
  ▫ 1 x commit log
  ▫ 3 x data, caches
Production Target
• 16 nodes across 2 data centers (8:8)
• HP DL380 Gen9 ➔ 24 cores, 528 GB RAM, 3.4 TB SSDs
69.
Lessons
1) DSE essentials are critical.
• Great team but zero DSE experience.
• Ad hoc education introduces risk.
• Walk before you run.
2) Node Count Dance applies to Graph.
• Data size unknown due to privacy.
• Load 5% of data, extrapolate.
• Test, iterate, test.
3) Hardware matters, of course.
• Leftover Hadoop boxes, spinning rust.
• Get creative with configuration & tuning.
• “Under no circumstances should you do load tests on these boxes.”
4) Avoid surprises before deadlines.
• Upgraded from RHEL 6.7 to 7.1.
• CPU spikes made nodes unusably slow.
• Revert!
• Nobody move, nobody gets hurt.
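The "load 5% of data, extrapolate" step in Lesson 2 amounts to a one-line projection, sketched below. The function name `extrapolate_full_size_gb` and the 40 GB sample footprint are hypothetical, and the sketch assumes the loaded sample is representative and that on-disk size grows linearly with row count, which graph data with uneven connectivity may not satisfy.

```python
def extrapolate_full_size_gb(sample_size_gb: float, sample_percent: float) -> float:
    """Project the full on-disk size from the footprint of a partial load."""
    if not 0 < sample_percent <= 100:
        raise ValueError("sample_percent must be in the range (0, 100]")
    # Linear projection: assumes the sample is representative of the whole.
    return sample_size_gb * 100 / sample_percent


# Hypothetical measurement: a 5% load occupied 40 GB on disk.
full_size_gb = extrapolate_full_size_gb(sample_size_gb=40, sample_percent=5)
print(f"Projected full data size: {full_size_gb:.0f} GB")
```

The projection feeds back into the node count estimate, and, as the slide says, it should be re-checked by loading more data and testing again.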
70.
Thank you!
Daniel Cohen
Solutions Engineer @ DataStax
daniel.cohen@datastax.com
@CodaAzzurra