Azure + DSE Powers O365
Per-User Store
© 2015. All Rights Reserved.
Overview
1 Introduction
2 What We Built
3 What to Pay Close Attention To
4 Deployment
5 Wrap Up
Introduction
Sean Usher
Office 365
Email: seusher@microsoft.com
Twitter: @seanushermsft
Mahesh Thiagarajan
Microsoft Azure
Email: mahthi@microsoft.com
Twitter: @_cloudguy
Ben Lackey
DataStax
Email: ben.lackey@datastax.com
Introduction – Office 365
Email
Collaboration
Document Authoring
Social Networking
Calendaring
File Storage
Business Intelligence
Etc…
Introduction – Azure
Azure is Microsoft’s cloud computing platform, a growing collection of
integrated services—analytics, computing, database, mobile, networking,
storage, and web—for moving faster, achieving more, and saving money.
What We Built - Overview
A way to understand our users and organizations at a deeper level!
Questions:
• Are users happy with the service they are receiving?
• Are users fully utilizing the services they are paying us for?
• Are users hitting issues that we can proactively help them with?
• How has a user’s experience been over their lifetime?
• Can we discover insights that we aren’t even aware of?
This requires ingesting and storing a lot of data. We need to be able to perform fast, scalable analytics on that data, or we will discover issues too late!
What We Built – Why Cassandra
The Good
• Low Latency ✓
• Linear Scale ✓
• Highly Available ✓
• Aggregations (Spark/Spark Streaming) ✓
• Machine Learning (Spark ML) ✓
• No Enforcement of Full Consistency ✓ ✓ ✓
The Not-So-Good
• No Hosted Option in Azure ✗
• Have to Install and Configure it Ourselves ✗
What We Built – DSE Clusters
Cluster 1:
Cassandra: 12 Nodes
Analytics: 12 Nodes
VM Size: G4
Heap Size: 30 GB
GC: G1
Ingestion: 20k – 50k events/sec
Data on ephemeral SSD drives.
RF = 3 in both DCs
Cluster 2:
Cassandra: 30 Nodes
Analytics: 15 Nodes (30 within 1 month)
VM Size: G4
Heap Size: 30 GB
GC: G1
Ingestion: 200k+ events/sec
Data on ephemeral SSD drives.
RF = 3 in both DCs
What We Built - Pipeline Evolution
Cluster 1 (diagram components): REST API, O365, Event Hub, Ingestion Worker (Azure worker role using DataStax C# driver), C*, Analytics
Cluster 2 (diagram components): REST API, O365, Kafka, C*/Spark Streaming, Analytics
VM size labels in the diagram: G4 – Local SSD; Kafka: G4 – Data Disk; ZooKeeper: A7 – Data Disk; PaaS Small; G4 – Local SSD
What to Pay Close Attention To – Azure Disks
VHD Storage: No more than 40 VMs per-storage account
“… and for a Standard Tier VM, it is about 40 (20,000/500 IOPS per disk)…..”
https://azure.microsoft.com/en-us/documentation/articles/azure-subscription-service-limits/
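The figure of about 40 in the quote above is plain division: the 20,000 IOPS budget of a Standard storage account divided by the roughly 500 IOPS each Standard-tier disk can draw. A minimal sketch of that arithmetic, using only the two numbers from the quote (the function name is illustrative, not an Azure API):

```python
def max_disks_per_account(account_iops_limit: int, iops_per_disk: int) -> int:
    """Disks that can run at full IOPS before the account-level limit is hit."""
    return account_iops_limit // iops_per_disk

# Figures taken from the quoted Azure limits: 20,000 IOPS per Standard
# storage account, about 500 IOPS per Standard-tier disk.
print(max_disks_per_account(20_000, 500))  # 40
```

If the limits change, or if VMs carry more than one disk in the same account, rerun the division with your own numbers.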
Disk Choice:
1. Local SSD (Ephemeral) – Fast but allows data loss.
2. Data Disk (Standard Storage) – No data loss; network-attached, which can add latency. 20k IOPS limit per storage account.
3. Data Disk (Premium Storage) – No data loss; network-attached, which can add latency. Per-disk IOPS limit.
https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-linux-how-to-attach-disk/
https://azure.microsoft.com/en-us/documentation/articles/azure-subscription-service-limits/#storage-limits
[Diagram: a VM with a local SSD at /dev/sdb, an OS disk at /dev/sda backed by one storage account, and a data disk backed by another storage account.]
What to Pay Close Attention To – Azure VM Size
VM Size: We chose G4 nodes but are investigating a move to D14 nodes. A larger number of smaller nodes allows faster rebuilds, which can reduce recovery time.
https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-size-specs/
What to Pay Close Attention To – Azure Networking
Networking: Virtual Network (VNet) vs Public IP
1. Public IPs – Default limit of 5 per subscription. Allows geo-redundant replication over the Internet.
2. VNet – Define your own subnets and IP ranges. Allows geo-redundant replication via Gateways/ExpressRoute.
No bandwidth limit within a VNet. Gateway and ExpressRoute limits:
   • Standard Gateway – Max 100 Mbps.
   • High-Performance Gateway – Max 200 Mbps.
   • ExpressRoute – Max 10 Gbps.
https://azure.microsoft.com/en-us/documentation/articles/virtual-networks-instance-level-public-ip/
https://azure.microsoft.com/en-us/documentation/articles/vpn-gateway-vnet-vnet-rm-ps/
https://msdn.microsoft.com/en-us/library/azure/mt586720.aspx
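These ceilings matter most when a remote DC has to be rebuilt or streamed from scratch. A rough sketch of the arithmetic, assuming the link sustains its maximum rate and ignoring protocol overhead (the 1 TB payload and the function name are illustrative, not from the deck):

```python
def transfer_hours(data_gb: float, link_mbps: float) -> float:
    """Hours to move data_gb gigabytes (10**9 bytes each) over a link that
    sustains link_mbps megabits per second, ignoring protocol overhead."""
    megabits = data_gb * 8_000  # 1 GB = 8,000 megabits
    return megabits / link_mbps / 3600

# Maximum rates as listed on the slide.
LINKS_MBPS = {
    "Standard Gateway": 100,
    "High-Performance Gateway": 200,
    "ExpressRoute": 10_000,
}

for name, mbps in LINKS_MBPS.items():
    print(f"{name}: {transfer_hours(1_000, mbps):.1f} h for 1 TB")
```

Under these assumptions 1 TB takes roughly 22 hours over a Standard Gateway and about 13 minutes over ExpressRoute; real throughput will be lower.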
What to Pay Close Attention To – Azure Networking
Test performance of every dependency and see if it meets the expectations of your application.
Network Performance: Iperf (https://iperf.fr/) – Test bandwidth between two VMs within various DCs
[Diagram: two VMs in one VNet. The VM at 10.1.0.10 runs the server (iperf -s); the VM at 10.1.0.11 runs the client (iperf -c 10.1.0.10).]
user@machine:~$ iperf -c 10.1.0.10
------------------------------------------------------------
Client connecting to 10.1.0.10, TCP port 5001
TCP window size: 2.50 MByte (default)
------------------------------------------------------------
[ 3] local 10.1.0.10 port 42892 connected with 10.1.0.10 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 45.7 GBytes 39.2 Gbits/sec
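To turn a run like this into a pass/fail check, the bandwidth can be pulled out of the summary line and compared with what the application needs. A minimal sketch, assuming iperf 2 style output as shown above; the helper names and the 10 Gbit/s requirement are hypothetical:

```python
import re

# Matches the bandwidth field of an iperf (v2) summary line such as:
# [ 3] 0.0-10.0 sec 45.7 GBytes 39.2 Gbits/sec
_BANDWIDTH = re.compile(r"([\d.]+)\s+([KMG])bits/sec")
_SCALE = {"K": 1e3, "M": 1e6, "G": 1e9}

def parse_bandwidth_bps(iperf_output: str) -> float:
    """Return the bandwidth from the last summary line, in bits per second."""
    matches = _BANDWIDTH.findall(iperf_output)
    if not matches:
        raise ValueError("no bandwidth line found in iperf output")
    value, unit = matches[-1]
    return float(value) * _SCALE[unit]

def meets_expectation(iperf_output: str, required_gbps: float) -> bool:
    """True if the measured bandwidth is at least required_gbps."""
    return parse_bandwidth_bps(iperf_output) >= required_gbps * 1e9

sample = "[ 3] 0.0-10.0 sec 45.7 GBytes 39.2 Gbits/sec"
print(meets_expectation(sample, required_gbps=10))  # True
```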
What to Pay Close Attention To – Azure Storage
Test performance of every dependency and see if it meets the expectations of your application.
Disk: SysBench (https://wiki.gentoo.org/wiki/Sysbench) – Test write throughput and IOPS
user@machine:/mnt$ sysbench --test=fileio --file-total-size=1000G --file-test-mode=rndrw --init-rng=on --max-time=300 --max-requests=0 run
sysbench 0.4.12: multi-threaded system evaluation benchmark
<….. Excess Logging Removed….>
Operations performed: 402240 Read, 268160 Write, 858065 Other = 1528465 Total
Read 6.1377Gb Written 4.0918Gb Total transferred 10.229Gb (34.917Mb/sec)
2234.67 Requests/sec executed
Test execution summary:
total time: 300.0002s
total number of events: 670400
total time taken by event execution: 16.1526
per-request statistics:
min: 0.00ms
avg: 0.02ms
max: 2.20ms
approx. 95 percentile: 0.05ms
Threads fairness:
events (avg/stddev): 670400.0000/0.00
execution time (avg/stddev): 16.1526/0.00
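The headline numbers in this log can be cross-checked from the operation counts. The sketch below assumes a 16 KiB block size (sysbench's default for fileio; the command above does not set it) and reads the log's "Gb"/"Mb" as GiB/MiB; with those assumptions the arithmetic lands on the same figures the log reports:

```python
BLOCK_SIZE = 16 * 1024  # assumption: sysbench fileio default block size (16 KiB)

reads, writes = 402_240, 268_160  # from "Operations performed"
total_time_s = 300.0002           # from "total time"

requests = reads + writes
requests_per_sec = requests / total_time_s              # IOPS
total_gib = requests * BLOCK_SIZE / 2**30               # data moved
throughput_mib_s = requests * BLOCK_SIZE / 2**20 / total_time_s

print(f"{requests} requests, {requests_per_sec:.2f} requests/sec")
print(f"{total_gib:.3f} GiB transferred, {throughput_mib_s:.3f} MiB/sec")
```

So this run measured about 2,235 IOPS and about 35 MiB/sec of mixed random read/write on the tested mount.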
What to Pay Close Attention To – Cassandra
Metrics!
Need to tune? Al Tobey can help - https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html
What to Pay Close Attention To – Cassandra
SSTable Count
• Too many SSTables can lead to OOM errors and nodes becoming unavailable.
• Watch count and balance compaction throughput with system limits.
• SSTable count may spike during repairs if data is inconsistent.
Dropped Mutations
• Dropped mutations mean more repairs need to be done.
• Impact of dropped mutations can be controlled by tuning write consistency.
• Check iostat to see if disk queue is building up or write latency is high.
• iostat -x /dev/sdb 1 5
• Do drops only happen when Spark Jobs batch write? Tune Spark write throughput
(https://github.com/datastax/spark-cassandra-connector/blob/v1.2.5/doc/FAQ.md)
See memtables & flushing in Al’s Tuning Guide.
What to Pay Close Attention To – Cassandra
Pending Compactions
• If you aren’t keeping up with compactions, performance will suffer.
• Too many SSTables impact read speed, but also can lead to hitting OS limits. See:
• /etc/sysctl.conf - vm.max_map_count
• /etc/security/limits.d/cassandra.conf – nofile
• /etc/init.d/dse – Certain DSE versions overwrite nofile with: FD_LIMIT=100000
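A quick way to see the limits actually in effect is to read them from a running process. The sketch below is Linux-only and reads the limits of the process that runs it, so it should be run as the Cassandra user; to inspect a running Cassandra JVM, /proc/<pid>/limits is the authoritative source. The nofile threshold reuses the FD_LIMIT value quoted above; the max_map_count threshold is a placeholder assumption, so check the recommended settings for your DSE version.

```python
import resource
from pathlib import Path

def read_limits() -> dict:
    """Read limits for the *current* process on Linux (not Cassandra's own)."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    path = Path("/proc/sys/vm/max_map_count")
    max_map_count = int(path.read_text()) if path.exists() else None
    return {"nofile_soft": soft, "nofile_hard": hard, "max_map_count": max_map_count}

def find_problems(limits: dict, min_nofile: int = 100_000,
                  min_map_count: int = 1_048_575) -> list:
    """Return human-readable warnings for limits below the given thresholds.

    min_map_count is an assumed placeholder, not a value from the slides.
    """
    problems = []
    nofile = limits.get("nofile_soft")
    # resource.RLIM_INFINITY means "unlimited", which always passes.
    if nofile is not None and nofile != resource.RLIM_INFINITY and nofile < min_nofile:
        problems.append(f"nofile soft limit {nofile} < {min_nofile}")
    map_count = limits.get("max_map_count")
    if map_count is not None and map_count < min_map_count:
        problems.append(f"vm.max_map_count {map_count} < {min_map_count}")
    return problems

if __name__ == "__main__":
    for line in find_problems(read_limits()) or ["limits look OK"]:
        print(line)
```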
Heap Used
• Heap usage changes over time. What works in week 1 may not work in week 10.
• We used a 20GB heap until nodes started hitting OOM when they needed 25 GB.
• Use G1 if at all possible to see GC times decrease, and use a large (25 – 30 GB) heap.
• Let G1 tune your young generation heap size.
What to Pay Close Attention To – Spark
We are still learning!
[Screenshots: Scheduler Output (NOT CRON!), Spark UI, Spark Job Logs]
If you don’t enable Spark UI for security reasons, ship your Spark logs off box for analysis.
You may also find that jobs fail to read data because partitions are missing or nodes are timing out. This can indicate you are overwhelming Cassandra.
Deployment
Use the Azure/DataStax Template
Azure will be investing in building more features into the Azure template, and you will get those features more easily if you use the existing template.
https://www.youtube.com/watch?v=vacp267zLBA&noredirect=1
https://github.com/DSPN/azure-resource-manager-dse
We didn’t use the template because it wasn’t ready yet. We had to write our own logic to deploy nodes, and we now need to transition to the template to get these new features. We are scheduling time to do this because it will save us a lot of work!
Consider Security and Compliance: This will influence how you deploy (VNet vs Public IP), what Cassandra configuration
you use (internode encryption, require_client_auth: true), and what OS configuration you use (CIS standards).
C* Hardening: http://thelastpickle.com/blog/2015/09/30/hardening-cassandra-step-by-step-part-1-server-to-server.html
CIS Standards: https://benchmarks.cisecurity.org/downloads/show-single/?file=ubuntu1404.100
Azure Templates can:
• Ensure Idempotency
• Simplify Orchestration
• Simplify Roll-back
• Provide Cross-Resource Configuration and Update Support
Azure Templates are:
• Source file, checked-in
• Specifies resources and dependencies (VMs, WebSites, DBs) and connections (config, LB sets)
• Parameterized input/output
Power of Repeatability
Instantiation of repeatable config: Configuration → Resource Group
[Diagram: a resource group containing SQL-A, a Website, and Virtual Machines (2x). The Website and the VMs are each marked DEPENDS ON SQL and carry a SQL CONFIG label.]
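Declaring dependencies lets a deployment engine work out a safe creation order by itself. The sketch below illustrates the idea with a topological sort over the resources in the diagram; it is not how Azure Resource Manager is implemented, and the resource names are hypothetical.

```python
from graphlib import TopologicalSorter

# Each resource maps to the set of resources it depends on,
# mirroring the "DEPENDS ON SQL" labels in the diagram.
depends_on = {
    "sql-a": set(),
    "website": {"sql-a"},
    "vm-1": {"sql-a"},
    "vm-2": {"sql-a"},
}

def deployment_order(graph: dict) -> list:
    """Return an order in which every resource comes after its dependencies."""
    return list(TopologicalSorter(graph).static_order())

order = deployment_order(depends_on)
print(order)  # "sql-a" is always first; the rest may appear in any order
```

Because the order is derived from the declared dependencies, rerunning the same template gives the same result, which is the repeatability the slide refers to.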
Azure VM Extensions
• Extending the power of your VM
• Enable easier management
• Support partner ecosystem
• Full control still with you!
[Diagram labels: Curated Extensions, Agent]
Thank you
Sean Usher
Office 365
Email: seusher@microsoft.com
Twitter: @seanushermsft
Mahesh Thiagarajan
Microsoft Azure
Email: mahthi@microsoft.com
Twitter: @_cloudguy
Ben Lackey
DataStax
Email: ben.lackey@datastax.com

More Related Content

What's hot

Real-time personal trainer on the SMACK stack
Real-time personal trainer on the SMACK stackReal-time personal trainer on the SMACK stack
Real-time personal trainer on the SMACK stackAnirvan Chakraborty
 
C* Summit 2013: Searching for a Needle in a Big Data Haystack by Jason Ruther...
C* Summit 2013: Searching for a Needle in a Big Data Haystack by Jason Ruther...C* Summit 2013: Searching for a Needle in a Big Data Haystack by Jason Ruther...
C* Summit 2013: Searching for a Needle in a Big Data Haystack by Jason Ruther...DataStax Academy
 
Introducing DataStax Enterprise 4.7
Introducing DataStax Enterprise 4.7Introducing DataStax Enterprise 4.7
Introducing DataStax Enterprise 4.7DataStax
 
Oracle to Cassandra Core Concepts Guide Pt. 2
Oracle to Cassandra Core Concepts Guide Pt. 2Oracle to Cassandra Core Concepts Guide Pt. 2
Oracle to Cassandra Core Concepts Guide Pt. 2DataStax Academy
 
Cassandra Development Nirvana
Cassandra Development Nirvana Cassandra Development Nirvana
Cassandra Development Nirvana DataStax
 
Webinar: DataStax Training - Everything you need to become a Cassandra Rockstar
Webinar: DataStax Training - Everything you need to become a Cassandra RockstarWebinar: DataStax Training - Everything you need to become a Cassandra Rockstar
Webinar: DataStax Training - Everything you need to become a Cassandra RockstarDataStax
 
Cassandra Community Webinar: MySQL to Cassandra - What I Wish I'd Known
Cassandra Community Webinar: MySQL to Cassandra - What I Wish I'd KnownCassandra Community Webinar: MySQL to Cassandra - What I Wish I'd Known
Cassandra Community Webinar: MySQL to Cassandra - What I Wish I'd KnownDataStax
 
Cassandra CLuster Management by Japan Cassandra Community
Cassandra CLuster Management by Japan Cassandra CommunityCassandra CLuster Management by Japan Cassandra Community
Cassandra CLuster Management by Japan Cassandra CommunityHiromitsu Komatsu
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with DseDataStax Academy
 
Nyc summit intro_to_cassandra
Nyc summit intro_to_cassandraNyc summit intro_to_cassandra
Nyc summit intro_to_cassandrazznate
 
From PoCs to Production
From PoCs to ProductionFrom PoCs to Production
From PoCs to ProductionDataStax
 
Data Pipelines with Spark & DataStax Enterprise
Data Pipelines with Spark & DataStax EnterpriseData Pipelines with Spark & DataStax Enterprise
Data Pipelines with Spark & DataStax EnterpriseDataStax
 
C* Summit 2013: Cassandra at eBay Scale by Feng Qu and Anurag Jambhekar
C* Summit 2013: Cassandra at eBay Scale by Feng Qu and Anurag JambhekarC* Summit 2013: Cassandra at eBay Scale by Feng Qu and Anurag Jambhekar
C* Summit 2013: Cassandra at eBay Scale by Feng Qu and Anurag JambhekarDataStax Academy
 
20150716 introduction to apache spark v3
20150716 introduction to apache spark v3 20150716 introduction to apache spark v3
20150716 introduction to apache spark v3 Andrey Vykhodtsev
 
Running Analytics at the Speed of Your Business
Running Analytics at the Speed of Your BusinessRunning Analytics at the Speed of Your Business
Running Analytics at the Speed of Your BusinessRedis Labs
 
Instaclustr webinar 2017 feb 08 japan
Instaclustr webinar 2017 feb 08   japanInstaclustr webinar 2017 feb 08   japan
Instaclustr webinar 2017 feb 08 japanHiromitsu Komatsu
 
How jKool Analyzes Streaming Data in Real Time with DataStax
How jKool Analyzes Streaming Data in Real Time with DataStaxHow jKool Analyzes Streaming Data in Real Time with DataStax
How jKool Analyzes Streaming Data in Real Time with DataStaxDataStax
 
Trend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache BigtopTrend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache BigtopEvans Ye
 
Managing Cassandra Databases with OpenStack Trove
Managing Cassandra Databases with OpenStack TroveManaging Cassandra Databases with OpenStack Trove
Managing Cassandra Databases with OpenStack TroveTesora
 
Cassandra nyc 2011 ilya maykov - ooyala - scaling video analytics with apac...
Cassandra nyc 2011   ilya maykov - ooyala - scaling video analytics with apac...Cassandra nyc 2011   ilya maykov - ooyala - scaling video analytics with apac...
Cassandra nyc 2011 ilya maykov - ooyala - scaling video analytics with apac...ivmaykov
 

What's hot (20)

Real-time personal trainer on the SMACK stack
Real-time personal trainer on the SMACK stackReal-time personal trainer on the SMACK stack
Real-time personal trainer on the SMACK stack
 
C* Summit 2013: Searching for a Needle in a Big Data Haystack by Jason Ruther...
C* Summit 2013: Searching for a Needle in a Big Data Haystack by Jason Ruther...C* Summit 2013: Searching for a Needle in a Big Data Haystack by Jason Ruther...
C* Summit 2013: Searching for a Needle in a Big Data Haystack by Jason Ruther...
 
Introducing DataStax Enterprise 4.7
Introducing DataStax Enterprise 4.7Introducing DataStax Enterprise 4.7
Introducing DataStax Enterprise 4.7
 
Oracle to Cassandra Core Concepts Guide Pt. 2
Oracle to Cassandra Core Concepts Guide Pt. 2Oracle to Cassandra Core Concepts Guide Pt. 2
Oracle to Cassandra Core Concepts Guide Pt. 2
 
Cassandra Development Nirvana
Cassandra Development Nirvana Cassandra Development Nirvana
Cassandra Development Nirvana
 
Webinar: DataStax Training - Everything you need to become a Cassandra Rockstar
Webinar: DataStax Training - Everything you need to become a Cassandra RockstarWebinar: DataStax Training - Everything you need to become a Cassandra Rockstar
Webinar: DataStax Training - Everything you need to become a Cassandra Rockstar
 
Cassandra Community Webinar: MySQL to Cassandra - What I Wish I'd Known
Cassandra Community Webinar: MySQL to Cassandra - What I Wish I'd KnownCassandra Community Webinar: MySQL to Cassandra - What I Wish I'd Known
Cassandra Community Webinar: MySQL to Cassandra - What I Wish I'd Known
 
Cassandra CLuster Management by Japan Cassandra Community
Cassandra CLuster Management by Japan Cassandra CommunityCassandra CLuster Management by Japan Cassandra Community
Cassandra CLuster Management by Japan Cassandra Community
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with Dse
 
Nyc summit intro_to_cassandra
Nyc summit intro_to_cassandraNyc summit intro_to_cassandra
Nyc summit intro_to_cassandra
 
From PoCs to Production
From PoCs to ProductionFrom PoCs to Production
From PoCs to Production
 
Data Pipelines with Spark & DataStax Enterprise
Data Pipelines with Spark & DataStax EnterpriseData Pipelines with Spark & DataStax Enterprise
Data Pipelines with Spark & DataStax Enterprise
 
C* Summit 2013: Cassandra at eBay Scale by Feng Qu and Anurag Jambhekar
C* Summit 2013: Cassandra at eBay Scale by Feng Qu and Anurag JambhekarC* Summit 2013: Cassandra at eBay Scale by Feng Qu and Anurag Jambhekar
C* Summit 2013: Cassandra at eBay Scale by Feng Qu and Anurag Jambhekar
 
20150716 introduction to apache spark v3
20150716 introduction to apache spark v3 20150716 introduction to apache spark v3
20150716 introduction to apache spark v3
 
Running Analytics at the Speed of Your Business
Running Analytics at the Speed of Your BusinessRunning Analytics at the Speed of Your Business
Running Analytics at the Speed of Your Business
 
Instaclustr webinar 2017 feb 08 japan
Instaclustr webinar 2017 feb 08   japanInstaclustr webinar 2017 feb 08   japan
Instaclustr webinar 2017 feb 08 japan
 
How jKool Analyzes Streaming Data in Real Time with DataStax
How jKool Analyzes Streaming Data in Real Time with DataStaxHow jKool Analyzes Streaming Data in Real Time with DataStax
How jKool Analyzes Streaming Data in Real Time with DataStax
 
Trend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache BigtopTrend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache Bigtop
 
Managing Cassandra Databases with OpenStack Trove
Managing Cassandra Databases with OpenStack TroveManaging Cassandra Databases with OpenStack Trove
Managing Cassandra Databases with OpenStack Trove
 
Cassandra nyc 2011 ilya maykov - ooyala - scaling video analytics with apac...
Cassandra nyc 2011   ilya maykov - ooyala - scaling video analytics with apac...Cassandra nyc 2011   ilya maykov - ooyala - scaling video analytics with apac...
Cassandra nyc 2011 ilya maykov - ooyala - scaling video analytics with apac...
 

Viewers also liked

British Gas Connected Homes: Data Engineering
British Gas Connected Homes: Data EngineeringBritish Gas Connected Homes: Data Engineering
British Gas Connected Homes: Data EngineeringDataStax Academy
 
Cassandra Summit 2014: A Train of Thoughts About Growing and Scalability — Bu...
Cassandra Summit 2014: A Train of Thoughts About Growing and Scalability — Bu...Cassandra Summit 2014: A Train of Thoughts About Growing and Scalability — Bu...
Cassandra Summit 2014: A Train of Thoughts About Growing and Scalability — Bu...DataStax Academy
 
Cassandra Summit 2014: Social Media Security Company Nexgate Relies on Cassan...
Cassandra Summit 2014: Social Media Security Company Nexgate Relies on Cassan...Cassandra Summit 2014: Social Media Security Company Nexgate Relies on Cassan...
Cassandra Summit 2014: Social Media Security Company Nexgate Relies on Cassan...DataStax Academy
 
Cassandra Summit 2014: META — An Efficient Distributed Data Hub with Batch an...
Cassandra Summit 2014: META — An Efficient Distributed Data Hub with Batch an...Cassandra Summit 2014: META — An Efficient Distributed Data Hub with Batch an...
Cassandra Summit 2014: META — An Efficient Distributed Data Hub with Batch an...DataStax Academy
 
Apache Cassandra at Narmal 2014
Apache Cassandra at Narmal 2014Apache Cassandra at Narmal 2014
Apache Cassandra at Narmal 2014DataStax Academy
 
Cassandra Summit 2014: Cassandra in Large Scale Enterprise Grade xPatterns De...
Cassandra Summit 2014: Cassandra in Large Scale Enterprise Grade xPatterns De...Cassandra Summit 2014: Cassandra in Large Scale Enterprise Grade xPatterns De...
Cassandra Summit 2014: Cassandra in Large Scale Enterprise Grade xPatterns De...DataStax Academy
 
Introduction to Dating Modeling for Cassandra
Introduction to Dating Modeling for CassandraIntroduction to Dating Modeling for Cassandra
Introduction to Dating Modeling for CassandraDataStax Academy
 
Cassandra Summit 2014: Apache Cassandra at Telefonica CBS
Cassandra Summit 2014: Apache Cassandra at Telefonica CBSCassandra Summit 2014: Apache Cassandra at Telefonica CBS
Cassandra Summit 2014: Apache Cassandra at Telefonica CBSDataStax Academy
 
Cassandra Summit 2014: Monitor Everything!
Cassandra Summit 2014: Monitor Everything!Cassandra Summit 2014: Monitor Everything!
Cassandra Summit 2014: Monitor Everything!DataStax Academy
 
Coursera's Adoption of Cassandra
Coursera's Adoption of CassandraCoursera's Adoption of Cassandra
Coursera's Adoption of CassandraDataStax Academy
 
Production Ready Cassandra (Beginner)
Production Ready Cassandra (Beginner)Production Ready Cassandra (Beginner)
Production Ready Cassandra (Beginner)DataStax Academy
 
DataStax: How to Roll Cassandra into Production Without Losing your Health, M...
DataStax: How to Roll Cassandra into Production Without Losing your Health, M...DataStax: How to Roll Cassandra into Production Without Losing your Health, M...
DataStax: How to Roll Cassandra into Production Without Losing your Health, M...DataStax Academy
 
Battery Ventures: Simulating and Visualizing Large Scale Cassandra Deployments
Battery Ventures: Simulating and Visualizing Large Scale Cassandra DeploymentsBattery Ventures: Simulating and Visualizing Large Scale Cassandra Deployments
Battery Ventures: Simulating and Visualizing Large Scale Cassandra DeploymentsDataStax Academy
 
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetchDataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetchDataStax Academy
 
Cassandra Summit 2014: The Cassandra Experience at Orange — Season 2
Cassandra Summit 2014: The Cassandra Experience at Orange — Season 2Cassandra Summit 2014: The Cassandra Experience at Orange — Season 2
Cassandra Summit 2014: The Cassandra Experience at Orange — Season 2DataStax Academy
 
The Last Pickle: Distributed Tracing from Application to Database
The Last Pickle: Distributed Tracing from Application to DatabaseThe Last Pickle: Distributed Tracing from Application to Database
The Last Pickle: Distributed Tracing from Application to DatabaseDataStax Academy
 
Cassandra 3.0 Awesomeness
Cassandra 3.0 AwesomenessCassandra 3.0 Awesomeness
Cassandra 3.0 AwesomenessJon Haddad
 
Crash course intro to cassandra
Crash course   intro to cassandraCrash course   intro to cassandra
Crash course intro to cassandraJon Haddad
 
Cassandra Core Concepts
Cassandra Core ConceptsCassandra Core Concepts
Cassandra Core ConceptsJon Haddad
 

Viewers also liked (20)

British Gas Connected Homes: Data Engineering
British Gas Connected Homes: Data EngineeringBritish Gas Connected Homes: Data Engineering
British Gas Connected Homes: Data Engineering
 
Cassandra Summit 2014: A Train of Thoughts About Growing and Scalability — Bu...
Cassandra Summit 2014: A Train of Thoughts About Growing and Scalability — Bu...Cassandra Summit 2014: A Train of Thoughts About Growing and Scalability — Bu...
Cassandra Summit 2014: A Train of Thoughts About Growing and Scalability — Bu...
 
Cassandra Summit 2014: Social Media Security Company Nexgate Relies on Cassan...
Cassandra Summit 2014: Social Media Security Company Nexgate Relies on Cassan...Cassandra Summit 2014: Social Media Security Company Nexgate Relies on Cassan...
Cassandra Summit 2014: Social Media Security Company Nexgate Relies on Cassan...
 
Cassandra Summit 2014: META — An Efficient Distributed Data Hub with Batch an...
Cassandra Summit 2014: META — An Efficient Distributed Data Hub with Batch an...Cassandra Summit 2014: META — An Efficient Distributed Data Hub with Batch an...
Cassandra Summit 2014: META — An Efficient Distributed Data Hub with Batch an...
 
Apache Cassandra at Narmal 2014
Apache Cassandra at Narmal 2014Apache Cassandra at Narmal 2014
Apache Cassandra at Narmal 2014
 
Cassandra Summit 2014: Cassandra in Large Scale Enterprise Grade xPatterns De...
Cassandra Summit 2014: Cassandra in Large Scale Enterprise Grade xPatterns De...Cassandra Summit 2014: Cassandra in Large Scale Enterprise Grade xPatterns De...
Cassandra Summit 2014: Cassandra in Large Scale Enterprise Grade xPatterns De...
 
Introduction to Dating Modeling for Cassandra
Introduction to Dating Modeling for CassandraIntroduction to Dating Modeling for Cassandra
Introduction to Dating Modeling for Cassandra
 
Cassandra Summit 2014: Apache Cassandra at Telefonica CBS
Cassandra Summit 2014: Apache Cassandra at Telefonica CBSCassandra Summit 2014: Apache Cassandra at Telefonica CBS
Cassandra Summit 2014: Apache Cassandra at Telefonica CBS
 
Cassandra Summit 2014: Monitor Everything!
Cassandra Summit 2014: Monitor Everything!Cassandra Summit 2014: Monitor Everything!
Cassandra Summit 2014: Monitor Everything!
 
Coursera's Adoption of Cassandra
Coursera's Adoption of CassandraCoursera's Adoption of Cassandra
Coursera's Adoption of Cassandra
 
Production Ready Cassandra (Beginner)
Production Ready Cassandra (Beginner)Production Ready Cassandra (Beginner)
Production Ready Cassandra (Beginner)
 
DataStax: How to Roll Cassandra into Production Without Losing your Health, M...
DataStax: How to Roll Cassandra into Production Without Losing your Health, M...DataStax: How to Roll Cassandra into Production Without Losing your Health, M...
DataStax: How to Roll Cassandra into Production Without Losing your Health, M...
 
Battery Ventures: Simulating and Visualizing Large Scale Cassandra Deployments
Battery Ventures: Simulating and Visualizing Large Scale Cassandra DeploymentsBattery Ventures: Simulating and Visualizing Large Scale Cassandra Deployments
Battery Ventures: Simulating and Visualizing Large Scale Cassandra Deployments
 
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetchDataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
 
New features in 3.0
New features in 3.0New features in 3.0
New features in 3.0
 
Cassandra Summit 2014: The Cassandra Experience at Orange — Season 2
Cassandra Summit 2014: The Cassandra Experience at Orange — Season 2Cassandra Summit 2014: The Cassandra Experience at Orange — Season 2
Cassandra Summit 2014: The Cassandra Experience at Orange — Season 2
 
The Last Pickle: Distributed Tracing from Application to Database
The Last Pickle: Distributed Tracing from Application to DatabaseThe Last Pickle: Distributed Tracing from Application to Database
The Last Pickle: Distributed Tracing from Application to Database
 
Cassandra 3.0 Awesomeness
Cassandra 3.0 AwesomenessCassandra 3.0 Awesomeness
Cassandra 3.0 Awesomeness
 
Crash course intro to cassandra
Crash course   intro to cassandraCrash course   intro to cassandra
Crash course intro to cassandra
 
Cassandra Core Concepts
Cassandra Core ConceptsCassandra Core Concepts
Cassandra Core Concepts
 

Similar to Azure + DataStax Enterprise (DSE) Powers Office365 Per User Store

Azure + DataStax Enterprise Powers Office 365 Per User Store
Azure + DataStax Enterprise Powers Office 365 Per User StoreAzure + DataStax Enterprise Powers Office 365 Per User Store
Azure + DataStax Enterprise Powers Office 365 Per User StoreDataStax Academy
 
Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Cloudera, Inc.
 
Intro to Azure SQL database
Intro to Azure SQL databaseIntro to Azure SQL database
Intro to Azure SQL databaseSteve Knutson
 
Openstack Summit Tokyo 2015 - Building a private cloud to efficiently handle ...
Openstack Summit Tokyo 2015 - Building a private cloud to efficiently handle ...Openstack Summit Tokyo 2015 - Building a private cloud to efficiently handle ...
Openstack Summit Tokyo 2015 - Building a private cloud to efficiently handle ...Pierre GRANDIN
 
Microsoft Azure News - 2019 April
Microsoft Azure News - 2019 AprilMicrosoft Azure News - 2019 April
Microsoft Azure News - 2019 AprilDaniel Toomey
 
Migrating Existing Open Source Machine Learning to Azure
Migrating Existing Open Source Machine Learning to AzureMigrating Existing Open Source Machine Learning to Azure
Migrating Existing Open Source Machine Learning to AzureRevolution Analytics
 
Migrating Customers to Microsoft Azure: Lessons Learned From the Field
Migrating Customers to Microsoft Azure: Lessons Learned From the FieldMigrating Customers to Microsoft Azure: Lessons Learned From the Field
Migrating Customers to Microsoft Azure: Lessons Learned From the FieldIdo Flatow
 
analytic engine - a common big data computation service on the aws
analytic engine - a common big data computation service on the awsanalytic engine - a common big data computation service on the aws
analytic engine - a common big data computation service on the awsScott Miao
 
Customer Sharing: Trend Micro - Analytic Engine - A common Big Data computati...
Customer Sharing: Trend Micro - Analytic Engine - A common Big Data computati...Customer Sharing: Trend Micro - Analytic Engine - A common Big Data computati...
Customer Sharing: Trend Micro - Analytic Engine - A common Big Data computati...Amazon Web Services
 
Spark Streaming with Azure Databricks
Spark Streaming with Azure DatabricksSpark Streaming with Azure Databricks
Spark Streaming with Azure DatabricksDustin Vannoy
 
Web Speed And Scalability
Web Speed And ScalabilityWeb Speed And Scalability
Web Speed And ScalabilityJason Ragsdale
 
Patterns and Pains of Migrating Legacy Applications to Kubernetes
Patterns and Pains of Migrating Legacy Applications to KubernetesPatterns and Pains of Migrating Legacy Applications to Kubernetes
Patterns and Pains of Migrating Legacy Applications to KubernetesQAware GmbH
 
Patterns and Pains of Migrating Legacy Applications to Kubernetes
Patterns and Pains of Migrating Legacy Applications to KubernetesPatterns and Pains of Migrating Legacy Applications to Kubernetes
Patterns and Pains of Migrating Legacy Applications to KubernetesJosef Adersberger
 
20150704 benchmark and user experience in sahara weiting
20150704 benchmark and user experience in sahara weiting20150704 benchmark and user experience in sahara weiting
20150704 benchmark and user experience in sahara weitingWei Ting Chen
 
Azure Nights August2017
Azure Nights August2017Azure Nights August2017
Azure Nights August2017Michael Frank
 
20171122 aws usergrp_coretech-spn-cicd-aws-v01
20171122 aws usergrp_coretech-spn-cicd-aws-v0120171122 aws usergrp_coretech-spn-cicd-aws-v01
20171122 aws usergrp_coretech-spn-cicd-aws-v01Scott Miao
 
KoprowskiT_SQLRelay2014#6_Leeds_WADBForBeginners
KoprowskiT_SQLRelay2014#6_Leeds_WADBForBeginnersKoprowskiT_SQLRelay2014#6_Leeds_WADBForBeginners
KoprowskiT_SQLRelay2014#6_Leeds_WADBForBeginnersTobias Koprowski
 

Azure + DataStax Enterprise (DSE) Powers Office365 Per User Store

  • 1. Azure + DSE Powers O365 Per-User Store © 2015. All Rights Reserved.
  • 2. Overview
    1 Introduction
    2 What We Built
    3 What to Pay Close Attention To
    4 Deployment
    5 Wrap Up
  • 3. Introduction
    Sean Usher, Office 365. Email: seusher@microsoft.com Twitter: @seanushermsft
    Mahesh Thiagarajan, Microsoft Azure. Email: mahthi@microsoft.com Twitter: @_cloudguy
    Ben Lackey, DataStax. Email: ben.lackey@datastax.com
  • 4. Introduction – Office 365
    Email, Collaboration, Document Authoring, Social Networking, Calendaring, File Storage, Business Intelligence, Etc…
  • 5. Introduction – Azure
    Azure is Microsoft’s cloud computing platform, a growing collection of integrated services (analytics, computing, database, mobile, networking, storage, and web) for moving faster, achieving more, and saving money.
  • 6. What We Built - Overview
    A way to understand our users and organizations at a deeper level!
    Questions:
    • Are users happy with the service they are receiving?
    • Are users fully utilizing the services they are paying us for?
    • Are users hitting issues that we can proactively help them with?
    • How has a user’s experience been over their lifetime?
    • Can we discover insights that we aren’t even aware of?
    This requires ingesting and storing a lot of data. We need to be able to perform fast, scalable analytics on that data, or we will discover issues too late!
  • 7. What We Built – Why Cassandra
    The Good
    • Low Latency ✓
    • Linear Scale ✓
    • Highly Available ✓
    • Aggregations (Spark/Spark Streaming) ✓
    • Machine Learning (Spark ML) ✓
    • No Enforcement of Full Consistency ✓ ✓ ✓
    The Not-So-Good
    • No Hosted Option in Azure ✗
    • Have to Install and Configure it Ourselves ✗
  • 8. What We Built – DSE Clusters
    Cluster 1:
    • Cassandra: 12 Nodes
    • Analytics: 12 Nodes
    • VM Size: G4
    • Heap Size: 30 GB
    • GC: G1
    • Ingestion: 20k – 50k events/sec
    • Data on ephemeral SSD drives.
    • RF = 3 in both DCs
    Cluster 2:
    • Cassandra: 30 Nodes
    • Analytics: 15 Nodes (30 within 1 month)
    • VM Size: G4
    • Heap Size: 30 GB
    • GC: G1
    • Ingestion: 200k+ events/sec
    • Data on ephemeral SSD drives.
    • RF = 3 in both DCs
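The cluster figures above can be turned into a rough per-node write load. The sketch below is only back-of-the-envelope arithmetic, and it rests on assumptions that are not stated on the slide: each event becomes exactly one Cassandra write, "both DCs" means the Cassandra and Analytics node groups each hold RF = 3 copies, and token ownership is perfectly even. The function name is made up for illustration.

```python
def replica_writes_per_node(events_per_sec: int, rf: int, nodes: int) -> float:
    """Average replica writes per second absorbed by each node of one DC.

    Assumes one write per event and perfectly even token ownership.
    """
    return events_per_sec * rf / nodes


# (peak events/sec, replication factor, node count) taken from the slide.
datacenters = {
    "12-node cluster, Cassandra DC": (50_000, 3, 12),
    "12-node cluster, Analytics DC": (50_000, 3, 12),
    "30-node cluster, Cassandra DC": (200_000, 3, 30),
    "30-node cluster, Analytics DC (15 nodes)": (200_000, 3, 15),
}

for name, (events, rf, nodes) in datacenters.items():
    load = replica_writes_per_node(events, rf, nodes)
    print(f"{name}: {load:,.0f} replica writes/sec per node")
```

Under these assumptions the 15-node Analytics group carries twice the per-node write load of the 30-node Cassandra group, which may be one reason the slide mentions growing it to 30 nodes.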
  • 9. What We Built - Pipeline Evolution
    Cluster 1: O365, REST API, Event Hub, Ingestion Worker (Azure worker role using DataStax C# driver), C*, Analytics
    Cluster 2: O365, REST API, Kafka, C*/Spark Streaming, Analytics
    VM labels on the diagram: G4 – Local SSD; Kafka: G4 – Data Disk; ZooKeeper: A7 – Data Disk; PaaS Small; G4 – Local SSD
  • 10. What to Pay Close Attention To – Azure Disks
    VHD Storage: No more than 40 VMs per storage account.
    “… and for a Standard Tier VM, it is about 40 (20,000/500 IOPS per disk)…..”
    https://azure.microsoft.com/en-us/documentation/articles/azure-subscription-service-limits/
    Disk Choice:
    1. Local SSD (Ephemeral) – Fast but allows data loss.
    2. Data Disk (Standard Storage) – No data loss, network-attached which can add latency. 20k IOPS account limit.
    3. Data Disk (Premium Storage) – No data loss, network-attached which can add latency. Per-disk IOPS limit.
    https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-linux-how-to-attach-disk/
    https://azure.microsoft.com/en-us/documentation/articles/azure-subscription-service-limits/#storage-limits
    [Diagram: a VM whose OS disk (/dev/sda) and data disks live in storage accounts, and whose local SSD is /dev/sdb.]
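The "about 40" figure quoted above is simply the account-level IOPS limit divided by the per-disk IOPS. A minimal sketch of that arithmetic follows; the two limits come from the slide, while the function names and the 60-disk example are hypothetical.

```python
STANDARD_ACCOUNT_IOPS_LIMIT = 20_000  # per storage account, from the slide
STANDARD_DISK_IOPS = 500              # per Standard Tier disk, from the slide


def max_disks_per_account(account_iops: int = STANDARD_ACCOUNT_IOPS_LIMIT,
                          disk_iops: int = STANDARD_DISK_IOPS) -> int:
    """How many fully busy disks fit under one account's IOPS limit."""
    return account_iops // disk_iops


def storage_accounts_needed(disk_count: int) -> int:
    """Smallest number of accounts that keeps every account under the limit."""
    per_account = max_disks_per_account()
    return -(-disk_count // per_account)  # ceiling division


print(max_disks_per_account())      # disks per account
print(storage_accounts_needed(60))  # hypothetical: 60 data disks in total
```

Spreading VHDs across accounts this way only guards against the account-level throttle; it says nothing about per-disk latency.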
  • 11. What to Pay Close Attention To – Azure VM Size
    VM Size: We chose G4 nodes, but are investigating moving to D14 nodes. With a larger number of smaller nodes, each node holds less data, so a failed node can be rebuilt faster, which can reduce recovery time.
    https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-size-specs/
  • 12. What to Pay Close Attention To – Azure Networking
    Networking: Virtual Network (VNet) vs Public IP
    1. Public IPs – Default limit of 5 per subscription. Allows geo-redundant replication over the Internet.
    2. VNet – Define your own subnets and IP ranges. Allows geo-redundant replication via Gateways/ExpressRoute. No bandwidth limit within a VNet.
       1. Standard Gateway – Max 100 Mbps.
       2. High-Performance Gateway – Max 200 Mbps.
       3. ExpressRoute – Max 10 Gbps.
    https://azure.microsoft.com/en-us/documentation/articles/virtual-networks-instance-level-public-ip/
    https://azure.microsoft.com/en-us/documentation/articles/vpn-gateway-vnet-vnet-rm-ps/
    https://msdn.microsoft.com/en-us/library/azure/mt586720.aspx
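To see what the gateway limits above mean for cross-region replication or a node rebuild, it helps to estimate how long a given amount of data takes to cross each link. This is a rough sketch: the link speeds are the maximums from the slide, the 1,000 GB payload is a made-up example, and protocol overhead and contention are ignored, so real transfers would be slower.

```python
# Maximum link speeds from the slide, in megabits per second.
LINK_MBPS = {
    "Standard Gateway": 100,
    "High-Performance Gateway": 200,
    "ExpressRoute": 10_000,
}


def transfer_hours(data_gb: float, link_mbps: float) -> float:
    """Best-case hours to move data_gb (decimal gigabytes) over the link."""
    bits = data_gb * 8 * 1000 ** 3
    seconds = bits / (link_mbps * 1_000_000)
    return seconds / 3600


for name, mbps in LINK_MBPS.items():
    hours = transfer_hours(1000, mbps)  # hypothetical 1,000 GB to stream
    print(f"{name}: {hours:.2f} hours")
```

The gap between roughly a day and a few minutes for the same payload is why the gateway choice matters as much as the VM size when planning multi-DC clusters.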
  • 13. What to Pay Close Attention To – Azure Networking
    Test the performance of every dependency and see if it meets the expectations of your application.
    Network Performance: iperf (https://iperf.fr/) – Test bandwidth between two VMs within various DCs.
    [Diagram: a VNet with two VMs. VM 10.1.0.10 runs iperf -s; VM 10.1.0.11 runs iperf -c 10.1.0.10.]
    user@machine:~$ iperf -c 10.1.0.10
    ------------------------------------------------------------
    Client connecting to 10.1.0.10, TCP port 5001
    TCP window size: 2.50 MByte (default)
    ------------------------------------------------------------
    [ 3] local 10.1.0.10 port 42892 connected with 10.1.0.10 port 5001
    [ ID] Interval Transfer Bandwidth
    [ 3] 0.0-10.0 sec 45.7 GBytes 39.2 Gbits/sec
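If iperf is run regularly between many VM pairs, the summary line can be parsed so results are compared automatically against what the application needs. The sketch below is one possible way to do that, assuming the classic iperf (version 2 style) summary format shown on the slide; the regular expression and function name are illustrative, not part of iperf.

```python
import re

# Matches a summary line such as:
# [ 3] 0.0-10.0 sec 45.7 GBytes 39.2 Gbits/sec
SUMMARY_RE = re.compile(
    r"\[\s*\d+\]\s+[\d.]+-\s*[\d.]+\s+sec\s+"
    r"[\d.]+\s+[KMG]?Bytes\s+"
    r"(?P<bandwidth>[\d.]+)\s+(?P<unit>[KMG]?bits/sec)"
)

# Conversion factors from the reported unit prefix to Gbits/sec.
TO_GBITS = {"": 1e-9, "K": 1e-6, "M": 1e-3, "G": 1.0}


def parse_bandwidth_gbits(output: str):
    """Return the first reported bandwidth in Gbits/sec, or None if absent."""
    match = SUMMARY_RE.search(output)
    if match is None:
        return None
    unit = match.group("unit")
    prefix = unit[0] if unit[0] in "KMG" else ""
    return float(match.group("bandwidth")) * TO_GBITS[prefix]


sample = "[ 3] 0.0-10.0 sec 45.7 GBytes 39.2 Gbits/sec"
print(parse_bandwidth_gbits(sample))
```

A caller could then compare the returned value with a threshold chosen for the workload and flag VM pairs that fall below it.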
  • 14. What to Pay Close Attention To – Azure Storage
    Test the performance of every dependency and see if it meets the expectations of your application.
    Disk: SysBench (https://wiki.gentoo.org/wiki/Sysbench) – Test write throughput and IOPS
    user@machine:/mnt$ sysbench --test=fileio --file-total-size=1000G --file-test-mode=rndrw --init-rng=on --max-time=300 --max-requests=0 run
    sysbench 0.4.12: multi-threaded system evaluation benchmark
    <….. Excess Logging Removed….>
    Operations performed: 402240 Read, 268160 Write, 858065 Other = 1528465 Total
    Read 6.1377Gb Written 4.0918Gb Total transferred 10.229Gb (34.917Mb/sec)
    2234.67 Requests/sec executed
    Test execution summary:
        total time: 300.0002s
        total number of events: 670400
        total time taken by event execution: 16.1526
        per-request statistics:
            min: 0.00ms
            avg: 0.02ms
            max: 2.20ms
            approx. 95 percentile: 0.05ms
    Threads fairness:
        events (avg/stddev): 670400.0000/0.00
        execution time (avg/stddev): 16.1526/0.00
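The sysbench summary above can be cross-checked from the operation counts alone, which is a useful habit before trusting a benchmark log. The sketch assumes a 16 KiB block size (believed to be the sysbench fileio default, and not shown in the log) and treats the "Gb" and "Mb" labels as binary gigabytes and megabytes, since that reading reproduces the reported numbers.

```python
BLOCK_SIZE_BYTES = 16 * 1024  # assumption: sysbench fileio default block size
MIB = 1024 ** 2
GIB = 1024 ** 3

# Values copied from the log on the slide.
reads = 402_240
writes = 268_160
total_time_s = 300.0002

events = reads + writes  # read + write operations
read_gib = reads * BLOCK_SIZE_BYTES / GIB
written_gib = writes * BLOCK_SIZE_BYTES / GIB
total_gib = events * BLOCK_SIZE_BYTES / GIB
throughput_mib_s = events * BLOCK_SIZE_BYTES / MIB / total_time_s
requests_per_s = events / total_time_s  # the IOPS figure

print(f"events:      {events}")
print(f"read:        {read_gib:.4f} GiB")
print(f"written:     {written_gib:.4f} GiB")
print(f"transferred: {total_gib:.3f} GiB at {throughput_mib_s:.3f} MiB/s")
print(f"requests/s:  {requests_per_s:.2f}")
```

With these assumptions the computed values agree with the log, so the headline result is about 2,235 random read/write IOPS at roughly 35 MiB/s.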
  • 15. What to Pay Close Attention To – Cassandra
    Metrics!
    Need to tune? Al Tobey can help - https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html
  • 16. What to Pay Close Attention To – Cassandra
    SSTable Count
    • Too many SSTables can lead to OOM errors and nodes becoming unavailable.
    • Watch count and balance compaction throughput with system limits.
    • SSTable count may spike during repairs if data is inconsistent.
    Dropped Mutations
    • Dropped mutations mean more repairs need to be done.
    • Impact of dropped mutations can be controlled by tuning write consistency.
    • Check iostat to see if disk queue is building up or write latency is high.
      • iostat -x /dev/sdb 1 5
    • Do drops only happen when Spark jobs batch write? Tune Spark write throughput (https://github.com/datastax/spark-cassandra-connector/blob/v1.2.5/doc/FAQ.md)
    See memtables & flushing in Al’s Tuning Guide.
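The point about write consistency can be made concrete with the replica arithmetic. With RF = 3, a QUORUM write needs 2 acknowledgements, so one replica can drop the mutation without the client seeing a failure, and that replica then depends on repair to catch up. The sketch below only does this arithmetic; the helper names are illustrative and it does not talk to a cluster.

```python
def quorum(rf: int) -> int:
    """Acknowledgements required for QUORUM: a majority of the replicas."""
    return rf // 2 + 1


def required_acks(level: str, rf: int) -> int:
    """Acknowledgements needed for a few common write consistency levels."""
    levels = {"ONE": 1, "QUORUM": quorum(rf), "ALL": rf}
    return levels[level]


def replicas_that_may_miss_write(level: str, rf: int) -> int:
    """Replicas that can drop a mutation while the write still succeeds."""
    return rf - required_acks(level, rf)


rf = 3  # replication factor from the cluster slide
for level in ("ONE", "QUORUM", "ALL"):
    acks = required_acks(level, rf)
    slack = replicas_that_may_miss_write(level, rf)
    print(f"{level}: {acks} acks required, {slack} replica(s) may miss the write")
```

A lower level hides drops from clients but leaves more work for repair, while a higher level surfaces them as write failures; which trade-off fits depends on the workload.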
  • 17. What to Pay Close Attention To – Cassandra
    Pending Compactions
    • If you aren’t keeping up with compactions, performance will suffer.
    • Too many SSTables impact read speed, but also can lead to hitting OS limits. See:
      • /etc/sysctl.conf – vm.max_map_count
      • /etc/security/limits.d/cassandra.conf – nofile
      • /etc/init.d/dse – Certain DSE versions overwrite nofile with: FD_LIMIT=100000
    Heap Used
    • Heap usage changes over time. What works in week 1 may not work in week 10.
    • We used a 20 GB heap until nodes started hitting OOM when they needed 25 GB.
    • Use G1 if at all possible to see GC times decrease, and use a large (25 – 30 GB) heap.
    • Let G1 tune your young generation heap size.
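The OS limits listed above are easy to check from a script before SSTable counts grow. The following is a minimal sketch for Linux only. The 100000 file-descriptor floor mirrors the FD_LIMIT value on the slide, while the 1048575 floor for vm.max_map_count is an assumption based on commonly recommended Cassandra settings, and the function names are hypothetical. It inspects the limits of the process that runs it, so it would need to run as the same user and through the same init path as Cassandra to be meaningful.

```python
import resource
from pathlib import Path


def check_limits(nofile_soft: int, max_map_count: int,
                 min_nofile: int = 100_000,
                 min_map_count: int = 1_048_575) -> list:
    """Return human-readable warnings for limits below the chosen floors."""
    warnings = []
    unlimited = nofile_soft == resource.RLIM_INFINITY
    if not unlimited and nofile_soft < min_nofile:
        warnings.append(f"nofile is {nofile_soft}, expected at least {min_nofile}")
    if max_map_count < min_map_count:
        warnings.append(
            f"vm.max_map_count is {max_map_count}, expected at least {min_map_count}"
        )
    return warnings


def read_current_limits() -> tuple:
    """Read the soft nofile limit and vm.max_map_count for this process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    max_map_count = int(Path("/proc/sys/vm/max_map_count").read_text())
    return soft, max_map_count


if __name__ == "__main__":
    for warning in check_limits(*read_current_limits()):
        print("WARNING:", warning)
```

Keeping the comparison separate from the reading of system values makes the check easy to reuse with limits collected from other nodes.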
  • 18. What to Pay Close Attention To – Spark
    We are still learning!
    [Screenshots: Scheduler Output (NOT CRON!), Spark UI, Spark Job Logs]
    If you don’t enable Spark UI for security reasons, ship your Spark logs off box for analysis.
    You may also find that jobs fail to read data because partitions are missing or nodes are timing out. This can indicate you are overwhelming Cassandra.
  • 19. Deployment
    Use the Azure/DataStax Template
    Azure will be investing in building more features into the Azure template, and you will get those more easily if you use the existing template.
    https://www.youtube.com/watch?v=vacp267zLBA&noredirect=1
    https://github.com/DSPN/azure-resource-manager-dse
    We Didn’t Use the Template
    It wasn’t ready yet. We had to write our own logic to deploy nodes, and we need to transition to the template so we can get all of these new features. We are scheduling time to do this because it will save us a lot of work!
    Consider Security and Compliance
    This will influence how you deploy (VNet vs Public IP), what Cassandra configuration you use (internode encryption, require_client_auth: true), and what OS configuration you use (CIS standards).
    C* Hardening: http://thelastpickle.com/blog/2015/09/30/hardening-cassandra-step-by-step-part-1-server-to-server.html
    CIS Standards: https://benchmarks.cisecurity.org/downloads/show-single/?file=ubuntu1404.100
  • 20. Power of Repeatability
    Azure Templates can:
    • Ensure Idempotency
    • Simplify Orchestration
    • Simplify Roll-back
    • Provide Cross-Resource Configuration and Update Support
    Azure Templates are:
    • Source file, checked-in
    • Specifies resources and dependencies (VMs, WebSites, DBs) and connections (config, LB sets)
    • Parameterized input/output
    Instantiation of repeatable config.
    [Diagram: a configuration deployed into a Resource Group containing SQL-A, a Website and Virtual Machines (2x); the Website and the VMs depend on SQL and its SQL config.]
  • 21. Azure VM Extensions
    • Extending the power of your VM
    • Enable easier management
    • Support partner ecosystem
    • Full control still with you!
    [Diagram: Curated Extensions, Agent]
  • 22. Thank you
    Sean Usher, Office 365. Email: seusher@microsoft.com Twitter: @seanushermsft
    Mahesh Thiagarajan, Microsoft Azure. Email: mahthi@microsoft.com Twitter: @_cloudguy
    Ben Lackey, DataStax. Email: ben.lackey@datastax.com

Editor's Notes

  1. Premium – p10, p20, p30