Xanadu Big Data Platform Technology BMT@ Rackspace Cloud

Alex G. Lee, Ph.D. Esq. CLP
Technology Innovation Expert
©2017 Xanadu Big Data, LLC All Rights Reserved www.xanadubigdata.com
May 9, 2017
Alex G. Lee (alexglee@xanadubigdata.com)
Summary
NoSQL databases are designed to deliver faster performance than
traditional Relational Database Management Systems (RDBMS) in many
cases, particularly when big data is involved.
Differences in the performance of NoSQL stores can be understood
using industry-standard benchmarking techniques under different
assumed-scenario workloads.
Using the Yahoo! Cloud Serving Benchmark (YCSB), we show that
Xanadu outperforms the other NoSQL databases tested (Cassandra, Riak,
MongoDB and HBase) while offering strong
consistency, high throughput, low latency and high scalability.
Benchmarking Configuration
The tests were conducted on Rackspace Cloud servers. All instances
were of the “15GB Performance” server instance class,
on the service network. Each server was equipped with 15 GB of RAM,
one 40 GB SSD and four vCPUs, and was provisioned for 1 Gb/s of network
traffic.
For every two instances configured to run a Xanadu or Cassandra server
process, there was one additional instance acting as a client.
Test Environment Parameterization
For a benchmark to show valid relative differences in data-store
performance, the test environment must be well understood,
especially in a cloud environment with shared tenancy.
In particular, input/output (I/O) performance must be stable and
high. We therefore profiled the network and disk I/O
performance of these instances while driving them at the highest
throughput available.
To test network I/O performance, four client instances, each with 16
threads, sent traffic to each of four server instances.
To test disk I/O performance, four instances each wrote to and read from
one 5 GB file on their own file system. For the sequential
tests, the starting position of each read or write within the file increased
monotonically by 64 kB, wrapping from the end of the file to its beginning.
Random read performance was profiled by generating a random
position each time.
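The two access patterns can be sketched as offset generators. This is a minimal illustration, not the tool used for the benchmark: the function names are hypothetical, binary units (1 kB = 1024 bytes) are assumed, and aligning random positions to 64 kB blocks is an assumption the slides do not state.

```python
import random
from itertools import islice

FILE_SIZE = 5 * 1024**3   # 5 GB test file (binary units are an assumption)
BLOCK_SIZE = 64 * 1024    # 64 kB step between successive start positions


def sequential_offsets(file_size=FILE_SIZE, block_size=BLOCK_SIZE):
    """Yield start offsets that advance by one block and wrap at the end of the file."""
    offset = 0
    while True:
        yield offset
        offset = (offset + block_size) % file_size


def random_offsets(file_size=FILE_SIZE, block_size=BLOCK_SIZE, seed=None):
    """Yield a newly drawn start offset for every operation (block alignment is assumed)."""
    rng = random.Random(seed)
    block_count = file_size // block_size
    while True:
        yield rng.randrange(block_count) * block_size


# Tiny file so the wrap-around is visible: 4 blocks of 64 bytes.
print(list(islice(sequential_offsets(file_size=256, block_size=64), 6)))
# prints [0, 64, 128, 192, 0, 64]
```

Each yielded offset would be passed to a seek before the read or write; the generators only model the position sequence.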
The file size was chosen to match the maximum amount of data
written by a data-store product once the full range of YCSB tests had
completed.
The network and disk I/O tests were run simultaneously for 10 minutes,
sampling the I/O rate every 100 ms. The differential I/O rates obtained
are shown in the two plots below.
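One plausible reading of "differential I/O rate" is the rate computed from the difference between successive counter samples. The sketch below assumes that reading and assumes the raw samples are cumulative byte counters; the function names are hypothetical.

```python
def differential_rates(byte_counters, interval_ms=100):
    """Turn cumulative byte counters, sampled every interval_ms, into bytes-per-second rates."""
    return [
        (later - earlier) * 1000 / interval_ms
        for earlier, later in zip(byte_counters, byte_counters[1:])
    ]


def sample_count(duration_minutes=10, interval_ms=100):
    """Number of samples taken over the run at the stated sampling interval."""
    return duration_minutes * 60 * 1000 // interval_ms


print(sample_count())  # prints 6000
print(differential_rates([0, 1000, 3000, 6000]))  # prints [10000.0, 20000.0, 30000.0]
```

A 10-minute run sampled every 100 ms therefore yields 6000 points per instance for each plot.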
Network/Disk I/O Performance
DB Testing Configuration
Xanadu Configuration
Xanadu was deployed to four instances, each running a
Xanadu storage process. Three of these instances also ran a
Xanadu real-time process and the fourth ran a Xanadu Registry
process.
Cassandra Configuration
Cassandra 2.0 was deployed to four instances. All nodes were
configured to listen on the service network address and pointed to the
same two initial seed processes to initiate the gossip protocol.
MongoDB and HBase Configuration
The MongoDB and HBase results are extrapolated from a similar benchmarking paper
released by DataStax that used the same test parameters in a similar
cloud environment.
Riak Configuration
Riak 1.4.7 was deployed to four instances. All nodes were configured to
listen on the service network address. The last three Riak nodes to be
started were joined sequentially to the cluster before starting any YCSB
workload.
Testing Parameters
To compare Xanadu with other NoSQL data-stores, the industry-standard
Yahoo! Cloud Serving Benchmark (YCSB) package was used
under the following conditions:
1. Load: clients exclusively insert new keys and values generated by YCSB.
2. Workload A: an equal ratio of reads and updates of keys and values.
Application example: a session store recording recent actions.
3. Workload B: a read-heavy 95:5 ratio of reads to updates of keys and
values. Application example: photo tagging; adding a tag is an update, but
most operations read tags.
For each store a four-node cluster was configured, and in each test
phase there were additionally four client nodes.
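The two mixed workloads map onto the operation proportions of YCSB's CoreWorkload. The sketch below builds those settings as key=value lines; the property names are YCSB's, while the helper functions are hypothetical and the zero scan and insert proportions are assumed from the workload descriptions. The load phase is not expressed this way, since YCSB runs it as a separate insert-only step.

```python
# Read/update proportions for the two mixed workloads described above.
WORKLOAD_MIX = {
    "a": {"readproportion": 0.5, "updateproportion": 0.5},
    "b": {"readproportion": 0.95, "updateproportion": 0.05},
}


def workload_properties(name):
    """Return YCSB CoreWorkload-style properties for workload 'a' or 'b'."""
    props = dict(WORKLOAD_MIX[name])
    # Assumed: neither workload scans or inserts during the run phase.
    props["scanproportion"] = 0
    props["insertproportion"] = 0
    return props


def render_properties(props):
    """Format the mapping as the key=value lines of a Java properties file."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


print(render_properties(workload_properties("b")))
```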
Each workload was run until the system had performed 4 million
operations. Each record read, written or updated had 20 columns,
each with 100 bytes of data (so each record contained
about 2 kB of data). Keys to be read or updated were chosen randomly
from a Zipfian distribution.
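To illustrate what a Zipfian key choice means in practice, the sketch below draws key indices with probability proportional to 1 / rank^s, so a few keys receive most of the traffic. It is not YCSB's own generator; the function name, the exponent of 0.99 and the key count are assumptions chosen for illustration. The record-size arithmetic uses the figures from the text.

```python
import random
from collections import Counter
from itertools import accumulate

FIELD_COUNT = 20     # columns per record, from the text
FIELD_LENGTH = 100   # bytes per column, from the text
RECORD_BYTES = FIELD_COUNT * FIELD_LENGTH  # 2000 bytes, i.e. about 2 kB


def make_zipfian_sampler(key_count, exponent=0.99, seed=None):
    """Return a function that draws key indices with P(rank r) proportional to 1 / r**exponent."""
    rng = random.Random(seed)
    cumulative = list(accumulate(1.0 / rank**exponent for rank in range(1, key_count + 1)))
    keys = range(key_count)

    def sample():
        return rng.choices(keys, cum_weights=cumulative, k=1)[0]

    return sample


sample = make_zipfian_sampler(key_count=1000, seed=42)
hits = Counter(sample() for _ in range(10_000))
# The lowest-ranked keys should dominate the draw counts.
print(hits.most_common(3))
```

A skewed distribution like this stresses caching and hot-key handling more than a uniform one, which is why it matters when comparing stores.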
Each insert or update required the data to be stored on two store nodes,
with the highest consistency level available in Cassandra and Riak, while
each read was required to be served from only one store node.
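Why a single-node read can still return the latest value follows from the usual quorum overlap rule for replicated stores: with N copies, W write acknowledgements and R replicas consulted per read, a read always overlaps the latest write when R + W > N. The check below assumes N = 2 and W = 2 for the setup described, which the slides imply but do not state; the function name is hypothetical.

```python
def read_sees_latest_write(replicas, write_acks, read_replicas):
    """Quorum overlap rule: every read set intersects every write set when R + W > N."""
    return read_replicas + write_acks > replicas


# Two copies, both acknowledged before a write returns, one replica per read.
print(read_sees_latest_write(replicas=2, write_acks=2, read_replicas=1))  # prints True
# With a third copy that is not awaited, a single-replica read could be stale.
print(read_sees_latest_write(replicas=3, write_acks=2, read_replicas=1))  # prints False
```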
Throughput
Latency
Scalability
To demonstrate the scalability of Xanadu, we also measured the average
insert throughput as a function of the number of instances running Xanadu.
This is shown below for insert sizes of 200 bytes, 2 kB and 16
kB.
The average data-rate is shown below for each of 200 bytes, 2 kB and 16
kB data sizes.
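The two scalability views are related by simple arithmetic: the average data rate is the insert throughput multiplied by the insert size. The sketch below shows that conversion plus one possible measure of scaling efficiency; the function names are hypothetical, decimal kilobytes are assumed, and the throughput values are placeholders rather than results from the benchmark.

```python
INSERT_SIZES_BYTES = {"200 bytes": 200, "2 kB": 2_000, "16 kB": 16_000}  # decimal kB assumed


def data_rate(inserts_per_second, insert_size_bytes):
    """Average data rate in bytes per second for a given insert throughput."""
    return inserts_per_second * insert_size_bytes


def scaling_efficiency(throughput_by_instances):
    """Per-instance throughput relative to the smallest cluster (1.0 means linear scaling)."""
    smallest = min(throughput_by_instances)
    baseline = throughput_by_instances[smallest] / smallest
    return {
        count: (throughput / count) / baseline
        for count, throughput in sorted(throughput_by_instances.items())
    }


# Placeholder throughputs (inserts/s), NOT measurements from the benchmark.
placeholder = {1: 100.0, 2: 200.0, 4: 300.0}
print(scaling_efficiency(placeholder))  # prints {1: 1.0, 2: 1.0, 4: 0.75}
print(data_rate(300.0, INSERT_SIZES_BYTES["2 kB"]))  # prints 600000.0
```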