Full scan frenzy at Amadeus
Unpredictable & interactive
analysis of terabytes of data
MongoDB World, June 1 2015
Laurent Dollé
Attila Tozser
Nicolas Motte
Amadeus
today
1
Amadeus
In a few words
Amadeus is a technology company dedicated to the
global travel industry.
We are present in 195 countries
with a worldwide team of more than 11,000 people.
Our solutions help improve the
business performance
of travel agencies, corporations, airlines,
airports, hotels, railways and more.
© 2015 Amadeus IT Group SA
Connecting
The travel industry
Cruiselines
Hotels
Car rental
Ground handlers
Ferry operators
Ground transportation
Airports
Travel agencies
Insurance companies
Airlines
Supporting
The traveler life cycle
Inspire -> Search -> Buy/Purchase -> Pre-trip -> On trip -> Post-trip
Robust
Global operations
We designed & own our Data Processing Centres
_ Central DC @ Erding, Germany
_ Remote DCs all over the globe
_ Recovery DC on standby in case of natural disasters
_ 1.6+ billion transactions processed per day
_ 526+ million travel agency bookings processed in 2014
_ 695+ million passengers boarded in 2014
_ 95% of the world’s scheduled network airline seats
Close
To our customers
Our commitment
To innovation
_ Amadeus has invested €3.5bn in
Research & Development
since 2004.
_ Nominated within “top 3” software
companies in 2014 European Union
Industrial R&D Investment Scorecard.
Amadeus
Revenue Accounting Search
2
Revenue of a flight ticket
is shared
_ Travel agent
_ Governments
_ Airlines: many can be involved
(marketing & operating)
What for?
Passenger Revenue Accounting
Amadeus
Revenue Accounting
handles cash flows
on behalf of airlines
_ Tracking
_ Error handling & optimisation
_ Reporting: analysis & audit
One of our launch partners is a
large European airline
_ transporting 35m+
passengers a year
_ key player in the
revenue accounting industry
Business needs
Gathered from a Revenue Accounting launch partner
They requested a user-friendly way to query any data
in our main operational database
_ Unpredictable ad-hoc search
_ Many advanced reporting requirements
Migrating
_ from their
in-house data warehouse
_ to our
cloud-based solution
_ Graphical
user interface
_ based on the SQL paradigm
_ to edit, import, save & share
queries
Revenue Accounting Search
The main promises
_ Data warehouse
fed in real time
4 years history (1.5bn documents, versioned)
_ Interactive response times
Revenue Accounting Search
The main promises
Expecting fast answer
to unpredictable queries
No index, no hint (almost)
_ Fields to be scanned unknown
_ In-memory full scans to decrease response time
Need to use all the available hardware power
& scale out for sustainable performance
Support mainstream SQL DML statements
_ Aggregation
_ Cross-column comparison, Boolean logic
_ Sort
Technical
architecture
3
6 physical data servers
_ Server
HP ProLiant DL580 Gen8
4 sockets, x86, rack
_ 4x CPU
Intel Xeon E7-4850 v2
2.30 GHz, 12 physical cores
_ RAM 512GB
40GB/s scanning speed
_ 2x flash cards
Fusion-io ioScale 3.2TB
1.5GB/s read
3 virtual config servers
_ RAM 8GB
Production cluster setup
Facts & figures
Overall cluster
_ 288 cores, 3TB RAM, 38.4TB flash card
storage
Currently 1 year of production data (4 expected)
_ 310m+ docs (1bn)
_ Data size 3.6TB (11TB)
_ Average object size 12.5KB
_ File size 4.8TB (16TB)
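The cluster totals follow directly from the per-server figures. A minimal arithmetic sketch: the constants are copied from the slide, while the last figure is only an idealised lower bound that assumes the 40GB/s scanning speed applies per server and that all data sits in RAM (which the current 3.6TB does not quite do).

```python
# Per-server figures copied from the slide.
SERVERS = 6
SOCKETS_PER_SERVER = 4
CORES_PER_SOCKET = 12
RAM_GB_PER_SERVER = 512
FLASH_CARDS_PER_SERVER = 2
FLASH_TB_PER_CARD = 3.2
SCAN_GB_PER_S_PER_SERVER = 40   # assumption: the 40GB/s figure is per server
DATA_TB_NOW = 3.6               # current data size

total_cores = SERVERS * SOCKETS_PER_SERVER * CORES_PER_SOCKET
total_ram_tb = SERVERS * RAM_GB_PER_SERVER / 1024
total_flash_tb = SERVERS * FLASH_CARDS_PER_SERVER * FLASH_TB_PER_CARD

# Idealised lower bound: every server scans an equal share at full memory speed.
ideal_scan_seconds = DATA_TB_NOW * 1024 / (SERVERS * SCAN_GB_PER_S_PER_SERVER)

print(total_cores, total_ram_tb, round(total_flash_tb, 1), round(ideal_scan_seconds, 2))
```

This reproduces the 288 cores, 3TB of RAM and 38.4TB of flash quoted above, and puts a perfectly parallel in-memory scan of the current data at roughly 15 seconds.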
We have many cores but only 6 boxes. If we followed all the recommendations,
we would end up with:
Microsharding coming from Microservices?
Enforce parallel processing
A MongoDB daemon (mongod) processes
each incoming query on a single thread.
_ It is not recommended to:
• Collocate many mongod processes on a single
box
Our online analytical processing
use-case implies:
_ full scans (ad-hoc queries)
_ limited concurrency for
queries (requests are from a
queue)
[Diagram: SHARD1 and SHARD2 spread over Node 1 to Node 6, each shard with one primary and two secondaries]
_ 2 cores running, 286 idling
_ 2/3 of the memory idling
_ 4 flash cards working at around 6% each and 8 idling
We need to go against some of the recommendations!
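The idle figures can be rechecked with a few lines of arithmetic. This is a sketch under stated assumptions: `idle_report` is a hypothetical helper, it assumes one busy core per primary (single-threaded query processing) and at most one primary per node, as in the two-shard layout.

```python
from fractions import Fraction


def idle_report(primaries, nodes=6, cores_per_node=48, cards_per_node=2):
    """Idle resources when only the nodes hosting a primary do any work.

    Assumes one busy core per primary and at most one primary per node.
    """
    busy_nodes = min(primaries, nodes)
    total_cores = nodes * cores_per_node
    return {
        "cores_idle": total_cores - primaries,
        "memory_idle": Fraction(nodes - busy_nodes, nodes),
        "flash_cards_idle": (nodes - busy_nodes) * cards_per_node,
    }


report = idle_report(primaries=2)
print(report)
```

With 2 primaries this gives 286 idle cores, 2/3 of the memory idle and 8 idle flash cards, matching the slide.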
_ Queries are bound by either CPU or memory scanning speed
_ With a fixed number of shards, response time scales linearly with the data size
Benchmarking
[Figure: two plots of time against data size, "Full scan" and "Full scan with aggregation"]
Behaviour reproduced for 2 shard distributions
24 & 48 shards on 6 physical servers, 100% in-memory
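A toy model makes the linear relation concrete. This is only a sketch, not the benchmark itself: `full_scan_seconds` and the per-thread rate are hypothetical, and the model simply takes the slower of a CPU-bound and a memory-bound estimate. It illustrates the dependence on data size at a fixed shard count and does not try to reproduce how the measured times change with the number of shards.

```python
def full_scan_seconds(data_gb, shards, per_thread_gb_s,
                      servers=6, server_mem_gb_s=40.0):
    """Toy estimate of a full scan with one scanning thread per shard.

    per_thread_gb_s is a hypothetical single-thread processing rate;
    server_mem_gb_s is the per-server memory scanning speed.
    """
    cpu_bound = (data_gb / shards) / per_thread_gb_s
    memory_bound = data_gb / (servers * server_mem_gb_s)
    return max(cpu_bound, memory_bound)


small = full_scan_seconds(data_gb=200, shards=24, per_thread_gb_s=1.0)
large = full_scan_seconds(data_gb=400, shards=24, per_thread_gb_s=1.0)
print(small, large, large / small)
```

Doubling the data at a fixed shard count doubles the estimate, whichever of the two bounds dominates.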
Microsharding coming from Microservices?
Enforce parallel processing
Problem: 2 cores running, 286 idling
Reason: 2 primaries processing the requests
Solution: We need more primaries processing the requests (to use all the 288 CPUs)

Problem: 2/3 of the memory idling
Reason: Primaries only on 2 nodes
Solution: We need to run primaries on all the available nodes

Problem: 4 flash cards working at around 6% each and 8 idling
Reason: Only 2 threads used, on 2 nodes
Solution: We need many threads working on the cards (ideally 64 per box)
Validation, from 6 to 48 shards on 6 physical servers
for 2 selected fairly complex queries
The behavior is logarithmic as the assigned proportion of the data per shard changes
[Figure: two plots of time against number of shards, "Full scan" and "Full scan with aggregation"]
Microsharding
Measure the benefit
Microsharding (how to align the services?)
Shard, replicate & stripe
[Diagram: primaries (1st), secondaries (2nd) and arbiters (arb) of the shards striped across Node 1 to Node 6]
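One way to picture the striping is a round-robin placement. The sketch below is an assumption, not the layout from the slide: `striped_layout` is a hypothetical helper that puts the primary, secondary and arbiter of each shard on three consecutive nodes.

```python
from collections import Counter


def striped_layout(shards, nodes=6):
    """Round-robin placement of shard members on nodes numbered from 1.

    The +1/+2 offsets for secondary and arbiter are illustrative only.
    """
    if nodes < 3:
        raise ValueError("need at least 3 nodes to separate the three members")
    layout = []
    for shard in range(shards):
        primary = shard % nodes
        layout.append({
            "shard": shard,
            "primary": primary + 1,
            "secondary": (primary + 1) % nodes + 1,
            "arbiter": (primary + 2) % nodes + 1,
        })
    return layout


layout = striped_layout(shards=48)
primaries_per_node = Counter(member["primary"] for member in layout)
print(dict(primaries_per_node))
```

With 48 shards on 6 nodes every node hosts 8 primaries, so query processing is spread evenly and no shard keeps two of its members on the same box.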
Interleaving has a serious performance penalty:
never do this unless you do not care about
performance…
_ Depends on the HW but can be up to 50-60%
NUMA: Non-uniform memory access
Fit your workload to modern HW
How does modern hardware handle memory?
_ Local memory access:
local memory access from a local thread
_ Remote memory access:
memory of a different socket from a local thread
_ Interleaving
force the HW to mimic UMA
_ Binding
force the tasks to use only given resources
[Diagram: a server with four sockets (Socket 0 to Socket 3) linked by QPI; each socket has cores 1 to n with their own L1 and L2 caches, a shared L3 cache and its own main memory]
NUMA: Non-uniform memory access
The recommendation is to interleave, but:
Use node & memory binding!
numactl --physcpubind xx --localalloc mongod -f …
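A small generator shows how such command lines could be produced for one daemon per socket. This is a sketch under stated assumptions: `numactl_commands` and the config file pattern are hypothetical, and it assumes the cores of each socket are numbered contiguously, which has to be checked against the real topology (for example with `numactl --hardware`).

```python
def numactl_commands(sockets=4, cores_per_socket=12, daemons_per_socket=1,
                     config_pattern="/etc/mongod/shard{index}.conf"):
    """Build one numactl command line per mongod, bound to one socket's cores.

    Assumes socket s owns cores s*cores_per_socket .. (s+1)*cores_per_socket-1.
    The config path pattern is a placeholder.
    """
    commands = []
    index = 0
    for socket in range(sockets):
        first = socket * cores_per_socket
        last = first + cores_per_socket - 1
        for _ in range(daemons_per_socket):
            commands.append([
                "numactl", f"--physcpubind={first}-{last}", "--localalloc",
                "mongod", "-f", config_pattern.format(index=index),
            ])
            index += 1
    return commands


commands = numactl_commands()
for command in commands:
    print(" ".join(command))
```

Combining `--physcpubind` with `--localalloc` keeps each daemon's threads and its memory allocations on the same socket, which is the binding the slide recommends over interleaving.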
[Figure: memory latency (ns) against dataset size (MB)]
[Figure: memory bandwidth in MB/s (STREAM Triad), NUMA vs. UMA, for 1, 2 and 3 DIMMs per channel; plotted values: 186,943 / 229,000 / 191,303 and 49,378 / 61,919 / 43,124]
Tuning for better CPU utilization
Can be achieved with a couple of small changes using sysctl:
kernel.sched_min_granularity_ns set 2-10 times bigger
kernel.sched_migration_cost set 2-10 times bigger
Tip: Look for guidelines from your HW vendor on how to tune your BIOS settings for
latency
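The scaling rule can be turned into candidate `sysctl` commands to try during experiments. This is a minimal sketch: `scaled_sysctl` is a hypothetical helper, the current values shown are made-up examples and must be read from the target machine, and the parameter names are taken from the slide and may differ between kernel versions.

```python
def scaled_sysctl(current, factor):
    """Return 'sysctl -w' commands with every value multiplied by factor."""
    if not 2 <= factor <= 10:
        raise ValueError("the slide suggests a factor between 2 and 10")
    return [f"sysctl -w {name}={value * factor}"
            for name, value in current.items()]


# Hypothetical current values; read the real ones with 'sysctl <name>'.
current = {
    "kernel.sched_min_granularity_ns": 3_000_000,
    "kernel.sched_migration_cost": 500_000,
}

candidates = scaled_sysctl(current, factor=4)
for line in candidates:
    print(line)
```

Trying several factors and comparing the scheduler statistics for each is the kind of experimenting this tuning calls for.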
Kernel tuning
How Linux schedules the CPU workload
IO-intensive workload scheduling
_ Default in Linux
_ Small slices on the cores
_ Frequent migrations between cores
CPU-intensive workload scheduling (MongoDB)
_ Needs tuning/experimenting
_ Longer slices on the cores
_ Rare migrations between cores
Use /proc/sched_debug or Intel PCM or any
similar tool to find the optimal settings:
Cgroups
Lightweight resource management
Mongod processes running on the
same hardware compete for resources
_ Memory
One big pool -> competition for free
pages
_ CPU
• Aggregation is really CPU intensive in
our case
• Frequent context switching
Above a certain size of memory
we had serious issues
Resource management for the services
_ Memory
Fine grained memory allocation limits
_ CPUset
CPU binding like in NUMA
_ CPU
Resource sharing between tasks (reserve
some resources for the operating system)
Cgroups
Tiered storage concept with resource management
_ MongoDB uses mmap to cache
data in memory (<3.0)
• Little influence on what gets cached
• During full scans, LRU eviction behaves like a FIFO queue
_ Example:
• 1. We have 200GB of data and 100GB of
memory
• or
• 2. 200GB of data and 1GB of memory
• The scanning speed is the same in both cases
_ With cgroups the first case could
be 40-50% faster.
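The effect is easy to reproduce with a small simulation. This is a sketch, not MongoDB's or the kernel's actual page cache: `scan_hit_ratio` is a hypothetical helper that models an LRU cache of pages and measures how much of a repeated sequential scan is served from it (one page standing for 1GB).

```python
from collections import OrderedDict


def scan_hit_ratio(pages, cache_pages, scans=2):
    """Hit ratio of the last of several sequential full scans through an LRU cache."""
    cache = OrderedDict()
    hits = 0
    for _ in range(scans):
        hits = 0
        for page in range(pages):
            if page in cache:
                cache.move_to_end(page)        # mark as most recently used
                hits += 1
            else:
                cache[page] = True
                if len(cache) > cache_pages:
                    cache.popitem(last=False)  # evict the least recently used page
    return hits / pages


one_pool_large = scan_hit_ratio(pages=200, cache_pages=100)  # case 1
one_pool_small = scan_hit_ratio(pages=200, cache_pages=1)    # case 2
dedicated_pool = scan_hit_ratio(pages=100, cache_pages=100)  # half the data, own memory
print(one_pool_large, one_pool_small, dedicated_pool)
```

In a single pool the second scan gets no hits in either case, because each page is evicted before it is read again. Giving half of the data its own 100-page budget lets that half be served entirely from memory.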
[Diagram in 3 steps: progress of Query 1 and Query 2 shown against the "In cache", "Paging" and "GAP" regions of the data]
_ 50% memory, 2 subsequent queries
_ 100% paged in and out
_ Using many shards instead of one divides
the work into smaller chunks
_ Define a high memory and a low memory
cgroup and assign the shards to them
_ 40% served from memory, 60% from disk
_ The analogy can be applied to many tiers
• Memory -> SSD -> spinning disk
[Diagram: Query 1 (progress at 100%) spread over several shards, some of them in cache]
• High memory cgroup
All served from memory
• Low memory cgroup
All served from disk
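The two groups could be described as a plan before anything is written to the system. This is a sketch under stated assumptions: it targets the cgroup v1 memory controller (`memory.limit_in_bytes`) mounted at a typical location, and the group names, shard names and sizes are hypothetical.

```python
GIB = 1024 ** 3


def cgroup_plan(shards_high, shards_low, high_gib, low_gib,
                root="/sys/fs/cgroup/memory"):
    """Return (files to write, shard -> group) for a two-tier memory setup.

    Nothing is written here; group names and the mount point are placeholders.
    """
    plan = {}
    for name, limit_gib in (("mongo_high", high_gib), ("mongo_low", low_gib)):
        plan[f"{root}/{name}/memory.limit_in_bytes"] = str(limit_gib * GIB)
    assignment = {shard: "mongo_high" for shard in shards_high}
    assignment.update({shard: "mongo_low" for shard in shards_low})
    return plan, assignment


plan, assignment = cgroup_plan(
    shards_high=["shard0", "shard1"],
    shards_low=["shard2", "shard3", "shard4"],
    high_gib=100,
    low_gib=1,
)
for path, value in plan.items():
    print(path, value)
```

Each mongod process would then be placed in its group by writing its PID to that group's `tasks` file, so the shards in the high memory group keep their pages while the others are served from disk.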
Cgroups
Tiered storage concept with resource management
Microsharding is a powerful way to improve response times. What else can bring value?
Database customization
And its results
[Diagram: NUMA, kernel tuning, striped replica set and cgroups as the four customization areas]
Cgroups
Prevent shards from competing for memory when data
does not fit into RAM – especially with microsharding.
Low-memory Cgroups may be compressed with zRAM/WiredTiger.
Kernel tuning
Optimize Linux in case of CPU-bound effort (vs. IO-bound):
small readahead, THP off, longer task scheduler time slices.
NUMA
Restrict access to CPU & memory for secondary daemons.
Striped replica set
Span shards on all the available hardware, with secondary
daemons replicated on different nodes for smooth failover.
Production
benchmarks
4
Full scan aggregation is CPU-bound,
with a fixed entry cost for unwinds.
_ no unwind: 3s
_ 1, 2 or 3 unwinds: 70s
_ additional cost if more unwinds
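What an unwind does to the data can be imitated in a few lines, which helps explain why it adds work to a full-scan aggregation. This is a pure-Python sketch, not MongoDB's implementation, and the `ticket`, `coupons` and `fare` field names are hypothetical.

```python
def unwind(documents, field):
    """Emit one copy of each document per element of its array field.

    Documents whose array is missing or empty produce no output.
    """
    for doc in documents:
        for element in doc.get(field) or []:
            out = dict(doc)
            out[field] = element
            yield out


tickets = [
    {"ticket": "T1", "coupons": [{"fare": 100}, {"fare": 50}]},
    {"ticket": "T2", "coupons": [{"fare": 80}]},
    {"ticket": "T3", "coupons": []},
]

rows = list(unwind(tickets, "coupons"))
total_fare = sum(row["coupons"]["fare"] for row in rows)
print(len(rows), total_fare)
```

Every array element becomes its own document, so the stages that follow process more documents than were scanned.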
The promise of interactive response times is met
for basic use cases.
In the absence of concurrency, response times are
consistent across all tests.
Production response times
And their lessons learnt
Operability &
Monitoring Tools
5
Operability & Monitoring
Tooling Architecture
Software Upgrade
Topology
Operability
Orchestrator
Alerting
Monitoring Data Store
Internal Tools
Orchestrator
1. Mount Servers
2.1 Network Setup
2.2 Create VM
2.3 Puppet Setup
3. Assign DNS names
4. Install OS and NoSQL store
5. Monitoring Setup
6. Ticket Tracker Setup
7. Tools Validation
8. Dev Validation
9. Handover to Ops
Phases: System Setup, Application Setup
Legend: Only for Physical Node / Only for VM / Common for all Data Stores
Monitoring
Monitoring
Architecture
[Diagram components: MMS (REST API), Parser, Merger, Monitoring; labels: CSV, CGI, Python, C++]
Monitoring
Demo
Alerting
Alerting
Architecture
[Diagram components: MMS (REST API), Alerting, Ticket, Pinger, Configuration; labels: shell, TCP]
Operability
Operability
Architecture
[Diagram components: Operability Server, MMS, MongoDB, MCollective, Status, Manual Action; labels: Java, Python, REST API, Active MQ, SSH]
Operability
Demo
You can follow us on:
AmadeusITGroup
amadeus.com/blog
amadeus.com
Thank you
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
 
IMCSummit 2015 - Day 1 Developer Session - The Science and Engineering Behind...
IMCSummit 2015 - Day 1 Developer Session - The Science and Engineering Behind...IMCSummit 2015 - Day 1 Developer Session - The Science and Engineering Behind...
IMCSummit 2015 - Day 1 Developer Session - The Science and Engineering Behind...
 
Are your ready for in memory applications?
Are your ready for in memory applications?Are your ready for in memory applications?
Are your ready for in memory applications?
 
Blades for HPTC
Blades for HPTCBlades for HPTC
Blades for HPTC
 
Oracle SPARC T7 a M7 servery
Oracle SPARC T7 a M7 serveryOracle SPARC T7 a M7 servery
Oracle SPARC T7 a M7 servery
 
Numascale Product IBM
Numascale Product IBMNumascale Product IBM
Numascale Product IBM
 
cachegrand: A Take on High Performance Caching
cachegrand: A Take on High Performance Cachingcachegrand: A Take on High Performance Caching
cachegrand: A Take on High Performance Caching
 
How AI and ML are driving Memory Architecture changes
How AI and ML are driving Memory Architecture changesHow AI and ML are driving Memory Architecture changes
How AI and ML are driving Memory Architecture changes
 
QCon London.pdf
QCon London.pdfQCon London.pdf
QCon London.pdf
 
Nimble Storage Series A presentation 2007
Nimble Storage Series A presentation 2007Nimble Storage Series A presentation 2007
Nimble Storage Series A presentation 2007
 
Explore big data at speed of thought with Spark 2.0 and Snappydata
Explore big data at speed of thought with Spark 2.0 and SnappydataExplore big data at speed of thought with Spark 2.0 and Snappydata
Explore big data at speed of thought with Spark 2.0 and Snappydata
 
Q1 Memory Fabric Forum: Advantages of Optical CXL​ for Disaggregated Compute ...
Q1 Memory Fabric Forum: Advantages of Optical CXL​ for Disaggregated Compute ...Q1 Memory Fabric Forum: Advantages of Optical CXL​ for Disaggregated Compute ...
Q1 Memory Fabric Forum: Advantages of Optical CXL​ for Disaggregated Compute ...
 
Troubleshooting SQL Server
Troubleshooting SQL ServerTroubleshooting SQL Server
Troubleshooting SQL Server
 
Sql server 2016 it just runs faster sql bits 2017 edition
Sql server 2016 it just runs faster   sql bits 2017 editionSql server 2016 it just runs faster   sql bits 2017 edition
Sql server 2016 it just runs faster sql bits 2017 edition
 
Dataswft Intel benchmark 2013
Dataswft Intel benchmark 2013Dataswft Intel benchmark 2013
Dataswft Intel benchmark 2013
 
Kudu austin oct 2015.pptx
Kudu austin oct 2015.pptxKudu austin oct 2015.pptx
Kudu austin oct 2015.pptx
 
Konsolidace Oracle DB na systémech s procesory M7
Konsolidace Oracle DB na systémech s procesory M7Konsolidace Oracle DB na systémech s procesory M7
Konsolidace Oracle DB na systémech s procesory M7
 
Security a SPARC M7 CPU
Security a SPARC M7 CPUSecurity a SPARC M7 CPU
Security a SPARC M7 CPU
 
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast DataDatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
 



Full scan frenzy at Amadeus

  • 1. Full scan frenzy at Amadeus: unpredictable & interactive analysis of terabytes of data. MongoDB World, June 1 2015. Laurent Dollé, Attila Tozser, Nicolas Motte.
  • 3. Amadeus in a few words. Amadeus is a technology company dedicated to the global travel industry. We are present in 195 countries with a worldwide team of more than 11,000 people. Our solutions help improve the business performance of travel agencies, corporations, airlines, airports, hotels, railways and more. © 2015 Amadeus IT Group SA
  • 4. Connecting the travel industry: cruise lines, hotels, car rental, ground handlers, ferry operators, ground transportation, airports, travel agencies, insurance companies, airlines.
  • 5. Supporting the traveler life cycle: inspire, search, buy/purchase, pre-trip, on trip, post-trip.
  • 6. Robust global operations. We designed & own our Data Processing Centres: a central DC at Erding, Germany; remote DCs all over the globe; a recovery DC on standby in case of natural disasters. 1.6+ billion transactions processed per day; 526+ million travel agency bookings processed in 2014; 695+ million passengers boarded in 2014; 95% of the world’s scheduled network airline seats.
  • 8. Our commitment to innovation. Amadeus has invested €3.5bn in Research & Development since 2004. Nominated within the “top 3” software companies in the 2014 European Union Industrial R&D Investment Scorecard.
  • 10. Passenger Revenue Accounting: what for? The revenue of a flight ticket is shared between the travel agent, governments and airlines (many can be involved: marketing & operating). Amadeus Revenue Accounting handles cash flows on behalf of airlines: tracking; error handling & optimisation; reporting (analysis & audit).
  • 11. Business needs, gathered from a Revenue Accounting launch partner. One of our launch partners is a large European airline, transporting 35m+ passengers a year and a key player in the revenue accounting industry. They requested a user-friendly way to query any data in our main operational database: unpredictable ad-hoc search and many advanced reporting requirements. They are migrating from their in-house data warehouse to our cloud-based solution.
  • 12. Revenue Accounting Search: the main promises. A graphical user interface, based on the SQL paradigm, to edit, import, save & share queries.
  • 13. Revenue Accounting Search: the main promises. A data warehouse fed in real time, with 4 years of history (1.5bn documents, versioned), and interactive response times.
  • 14. Expecting fast answers to unpredictable queries. No index, no hint (almost): the fields to be scanned are unknown, so in-memory full scans are used to decrease response time. We need to use all the available hardware power & scale out for sustainable performance. Support for mainstream SQL DML statements: aggregation; cross-column comparison and Boolean logic; sort.
  • 16. Production cluster setup: facts & figures.
      6 physical data servers: HP ProLiant DL580 Gen8 (4 sockets, x86, rack); 4x Intel Xeon E7-4850 v2 CPUs (2.30 GHz, 12 physical cores each); 512 GB RAM (40 GB/s scanning speed); 2x Fusion-io ioScale flash cards (3.2 TB, 1.5 GB/s read).
      3 virtual config servers: 8 GB RAM.
      Overall cluster: 288 cores, 3 TB RAM, 38.4 TB flash card storage.
      Currently 1 year of production data (4 expected): 310m+ docs (1bn expected); data size 3.6 TB (11 TB); average object size 12.5 KB; file size 4.8 TB (16 TB).
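The overall cluster figures on this slide follow directly from the per-server figures. The sketch below recomputes them and adds a rough lower bound for an in-memory full scan. The constant and function names are illustrative, and the lower bound rests on assumptions that are not in the slides: the 40 GB/s scanning speed is taken as a per-server figure, the data is assumed to be spread evenly, and replication and CPU cost are ignored.

```python
# Per-server figures quoted on the slide; names are illustrative only.
SERVERS = 6
SOCKETS_PER_SERVER = 4
CORES_PER_SOCKET = 12
RAM_GB_PER_SERVER = 512
FLASH_CARDS_PER_SERVER = 2
FLASH_TB_PER_CARD = 3.2
SCAN_GB_PER_S_PER_SERVER = 40  # assumption: the quoted speed is per server


def cluster_totals():
    """Recompute the 'overall cluster' line from the per-server figures."""
    cores = SERVERS * SOCKETS_PER_SERVER * CORES_PER_SOCKET
    ram_tb = SERVERS * RAM_GB_PER_SERVER / 1024  # binary units for RAM
    flash_tb = round(SERVERS * FLASH_CARDS_PER_SERVER * FLASH_TB_PER_CARD, 1)
    return {"cores": cores, "ram_tb": ram_tb, "flash_tb": flash_tb}


def ideal_scan_seconds(data_tb, servers=SERVERS):
    """Rough lower bound for a full in-memory scan.

    Assumes (not stated in the slides) that the data is spread evenly,
    every server scans at its full memory speed in parallel, and
    replication and CPU cost are ignored.
    """
    data_gb = data_tb * 1000  # decimal units for data size
    return data_gb / (servers * SCAN_GB_PER_S_PER_SERVER)


if __name__ == "__main__":
    print(cluster_totals())
    print(ideal_scan_seconds(3.6))   # current data size
    print(ideal_scan_seconds(11.0))  # expected data size
```

Under these assumptions the bound is only reachable if every server takes part in the scan, which is the motivation for the microsharding slides that follow.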
  • 17. Microsharding, coming from microservices? Enforce parallel processing. A MongoDB daemon (mongod) processes each incoming query on a single thread, and it is not recommended to collocate many mongod processes on a single box. Our online analytical processing use case implies full scans (ad-hoc queries) and limited concurrency for queries (requests come from a queue). We have many cores but only 6 boxes. If we followed all the recommendations, we would end up with 2 shards (SHARD1 and SHARD2, each with one primary and two secondaries, spread over Nodes 1 to 6), which means: 2 cores running and 286 idling; 2/3 of the memory idling; 4 flash cards working at around 6% each and 8 idling. We need to go against some of the recommendations!
  • 18. Benchmarking
    _ Queries are bound either by CPU or by memory scanning speed
    _ On a fixed number of shards, the scan time scales linearly with the data size
    [Charts: time vs. data size, for "Full scan" and "Full scan with aggregation"]
    Behaviour reproduced for 2 shard distributions: 24 & 48 shards on 6 physical servers, 100% in-memory
  • 19. Microsharding coming from Microservices? Enforce parallel processing
    Problem | Reason | Solution
    2 cores running, 286 idling | 2 primaries processing the requests | We need more primaries processing the requests (to use all 288 cores)
    2/3 of the memory idling | Primaries only on 2 nodes | We need to run primaries on all the available nodes
    4 flash cards working at around 6% each and 8 idling | Only 2 threads used, on 2 nodes | We need many threads working on the cards (ideally 64 per box)
  • 20. Microsharding: measure the benefit
    Validation, from 6 to 48 shards on 6 physical servers, for 2 selected fairly complex queries
    The behaviour is logarithmic as the proportion of the data assigned to each shard changes
    [Charts: time vs. number of shards, for "Full scan" and "Full scan with aggregation"]
  • 21. Microsharding (how to align the services?): shard, replicate & stripe
    [Diagram: primaries (1st), secondaries (2nd) and arbiters (arb) striped across Node 1 to Node 6]
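The exact placement in the diagram cannot be recovered from the transcript, so the following is only a minimal sketch of one possible striping scheme. It assumes 24 shards on 6 nodes (one of the benchmarked distributions) and a simple round-robin rule in which each shard's secondary and arbiter sit on the next two nodes. The function name, shard names and placement rule are all hypothetical.

```python
from collections import Counter


def striped_layout(n_nodes=6, shards_per_node=4):
    """Place each shard's primary, secondary and arbiter on distinct nodes.

    Hypothetical round-robin rule: primaries rotate over all nodes, the
    secondary goes to the next node and the arbiter to the one after.
    """
    layout = {}
    for shard in range(n_nodes * shards_per_node):
        primary = shard % n_nodes
        layout[f"shard{shard:02d}"] = {
            "primary": primary + 1,                      # nodes numbered from 1
            "secondary": (primary + 1) % n_nodes + 1,
            "arbiter": (primary + 2) % n_nodes + 1,
        }
    return layout


if __name__ == "__main__":
    layout = striped_layout()
    primaries = Counter(roles["primary"] for roles in layout.values())
    print(dict(primaries))  # how many primaries each node hosts
```

With this rule every node hosts the same number of primaries, so a full scan keeps cores busy on all boxes, and no shard keeps two of its members on the same node.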
  • 22. NUMA: Non-uniform memory access. Fit your workload to modern HW
    How does modern hardware handle memory?
    _ Local memory access: a thread accesses the memory attached to its own socket
    _ Remote memory access: a thread accesses the memory attached to a different socket
    _ Interleaving forces the HW to mimic UMA
    _ Binding forces the tasks to use only given resources
    Interleaving has serious penalties on performance: never do this unless you do not care about performance…
    _ Depends on the HW, but the penalty can be up to 50-60%
    [Diagram: server with 4 sockets (Socket 0 to Socket 3) linked by QPI; each socket has cores 1 to n with L1 and L2 caches, an L3 cache and its main memory]
  • 23. NUMA: Non-uniform memory access
    The recommendation is to interleave, but: use node & memory binding!
    numactl --physcpubind xx --localalloc mongod -f …
    [Chart: memory latency (ns) vs. dataset size (MB), NUMA vs. UMA]
    [Chart: memory bandwidth (STREAM Triad) in MB/s for 1, 2 and 3 DIMMs per channel; series in legend order NUMA, UMA: 186,943 / 229,000 / 191,303 and 49,378 / 61,919 / 43,124]
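As a rough illustration of the binding idea, the sketch below builds one numactl command line per socket, using the same `--physcpubind` and `--localalloc` options as the slide. The helper name, the config file paths and the assumption that cores are numbered contiguously per socket are hypothetical; real core numbering varies by machine and should be checked with `numactl --hardware`.

```python
import shlex


def numactl_command(socket, cores_per_socket, config_path):
    """Build a command that pins one mongod to the cores of one socket.

    Assumes cores are numbered contiguously per socket, which is not
    true on every machine.
    """
    first = socket * cores_per_socket
    last = first + cores_per_socket - 1
    args = [
        "numactl",
        f"--physcpubind={first}-{last}",  # run only on this socket's cores
        "--localalloc",                   # allocate memory on the local node
        "mongod",
        "-f",
        config_path,                      # hypothetical config file path
    ]
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    for socket in range(4):
        print(numactl_command(socket, 12, f"/etc/mongod-socket{socket}.conf"))
```

With several mongod processes per socket, the same core range could be split further, one slice per process.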
  • 24. Kernel tuning: how Linux schedules the CPU workload
    IO-intensive workload scheduling
    _ Default in Linux
    _ Small slices on the cores
    _ Frequent migrations between cores
    CPU-intensive workload scheduling (MongoDB)
    _ Needs tuning/experimenting
    _ Longer slices on the cores
    _ Rare migrations between cores
    Better CPU utilization can be achieved with a couple of small changes using sysctl:
    _ kernel.sched_min_granularity_ns: set 2-10 times bigger
    _ kernel.sched_migration_cost: set 2-10 times bigger
    Use /proc/sched_debug, Intel PCM or any similar tool to find the optimal settings
    Tip: look for guidelines from your HW vendor on how to tune your BIOS settings for latency
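The "2-10 times bigger" rule can be turned into candidate settings to experiment with. This is a minimal sketch: the baseline values below are placeholders rather than recommendations (read the real ones on your system with `sysctl -n`), the helper name is hypothetical, and on many newer kernels the second key is named `kernel.sched_migration_cost_ns`.

```python
def scaled_sysctl(current, factor):
    """Return `sysctl -w` lines with every value multiplied by `factor`.

    `current` maps sysctl keys to their present integer values.
    The slide suggests factors between 2 and 10.
    """
    if not 2 <= factor <= 10:
        raise ValueError("the slide suggests a factor between 2 and 10")
    return [f"sysctl -w {key}={value * factor}" for key, value in current.items()]


if __name__ == "__main__":
    # Placeholder baseline values, for illustration only.
    baseline = {
        "kernel.sched_min_granularity_ns": 3_000_000,
        "kernel.sched_migration_cost": 500_000,
    }
    for line in scaled_sysctl(baseline, 4):
        print(line)
```

Each candidate factor would then be benchmarked against the real workload while watching /proc/sched_debug or a similar tool.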
  • 25. Cgroups: lightweight resource management
    Mongod processes running on the same hardware compete for resources
    _ Memory: one big pool, hence competition for free pages
    _ CPU
      • Aggregation is really CPU-intensive in our case
      • Frequent context switching
    Above a certain size of memory we had serious issues
    Resource management for the services
    _ Memory: fine-grained memory allocation limits
    _ CPUset: CPU binding, like in NUMA
    _ CPU: resource sharing between tasks (reserve some resources for the operating system)
  • 26. Cgroups: tiered storage concept with resource management
    _ MongoDB uses mmap to cache data in memory (<3.0)
      • Little control over what is cached
      • Because of LRU eviction, the cache works as a FIFO queue in this case
    _ Example:
      • 1. We have 200GB of data and 100GB of memory
      • or
      • 2. 200GB of data and 1GB of memory
      • The scanning speed is the same
    _ With cgroups, the first case could be 40-50% faster.
    [Diagram in 3 steps: cache content ("In cache", "Paging", "GAP") once Query 1 has reached 100%, while Query 2 progresses from 0% to 20% to 70%]
    _ 50% memory, 2 subsequent queries
    _ 100% paged in and out
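The claim that a cache of half the data size helps a repeated full scan no more than a tiny cache can be checked with a toy model. The sketch below simulates a plain LRU page cache under repeated sequential scans. Page counts stand in for gigabytes and the function name is illustrative; this is a simplified model, not the kernel's actual page cache.

```python
from collections import OrderedDict


def lru_hit_rate(n_pages, cache_pages, scans):
    """Hit rate of an LRU cache when pages 0..n_pages-1 are scanned repeatedly."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = accesses = 0
    for _ in range(scans):
        for page in range(n_pages):
            accesses += 1
            if page in cache:
                hits += 1
                cache.move_to_end(page)        # mark as most recently used
            else:
                if len(cache) >= cache_pages:
                    cache.popitem(last=False)  # evict least recently used
                cache[page] = True
    return hits / accesses


if __name__ == "__main__":
    print(lru_hit_rate(200, 100, 2))  # cache holds half of the data
    print(lru_hit_rate(200, 1, 2))    # cache holds almost nothing
    print(lru_hit_rate(200, 200, 2))  # cache holds everything
```

In this model each page is evicted just before the next scan reaches it whenever the cache is smaller than the data, so both the half-sized and the tiny cache end up with no hits at all.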
  • 27. Cgroups: tiered storage concept with resource management
    _ Using many shards instead of one divides the work into smaller chunks
    _ Define a high-memory and a low-memory cgroup and assign the shards to them
      • High-memory cgroup: all served from memory
      • Low-memory cgroup: all served from disk
    _ 40% served from memory, 60% from disk
    _ The analogy can be applied to many tiers
      • Memory -> SSD -> spinning disk
    [Diagram: Query 1 at 100% progress, split across shards (Q 1 …); the shards of the high-memory cgroup stay in cache]
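One way to arrive at a split like 40% from memory and 60% from disk is sketched below. The numbers are hypothetical: 200GB of data in 10 equal shards, with 80GB of RAM reserved for the high-memory cgroup and the rest left to the low-memory cgroup and the operating system. The function name and the assumption of equally sized shards are illustrative, not taken from the talk.

```python
def tier_split(data_gb, n_shards, high_budget_gb):
    """Split equally sized shards between a high- and a low-memory cgroup.

    Returns (shards in the high-memory cgroup, shards in the low-memory
    cgroup, fraction of the data served from memory).
    """
    shard_gb = data_gb / n_shards
    # Only shards that fit entirely in the budget stay fully cached.
    high = min(n_shards, int(high_budget_gb // shard_gb))
    return high, n_shards - high, high / n_shards


if __name__ == "__main__":
    print(tier_split(200, 10, 80))
```

Unlike a single shared LRU pool, the shards pinned in the high-memory cgroup are never evicted by the scan of the others, which is where the speed-up comes from.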
  • 28. Database customization and its results
    Microsharding is a powerful way to improve response times; what else can bring value?
    _ Cgroups: prevent shards from competing for memory when data does not fit into RAM, especially with microsharding. Low-memory cgroups may be compressed with zRAM/WiredTiger.
    _ Kernel tuning: optimize Linux for CPU-bound (vs. IO-bound) work: small readahead, THP off, larger task scheduler granularity and migration cost.
    _ NUMA: restrict access to CPU & memory for secondary daemons.
    _ Striped replica set: span shards across all the available hardware, with secondary daemons replicated on different nodes for smooth failover.
  • 30. Production response times and lessons learnt
    Full scan aggregation is CPU-bound, with a fixed entry cost for unwinds.
    _ no unwind: 3s
    _ 1, 2 or 3 unwinds: 70s
    _ additional cost if more unwinds
    The promise of interactive response times is kept on basic use-cases.
    In the absence of concurrency, response times are consistent across all tests.
  • 32. Operability & Monitoring: tooling architecture
    [Diagram components: Software Upgrade, Topology, Operability, Orchestrator, Alerting, Monitoring, Data Store, Internal Tools]
  • 33. Orchestrator
    1. Mount Servers
    2.1 Network Setup
    2.2 Create VM
    2.3 Puppet Setup
    3. Assign DNS names
    4. Install OS and NoSQL store
    5. Monitoring Setup
    6. Ticket Tracker Setup
    7. Tools Validation
    8. Dev Validation
    9. Handover to Ops
    Legend: Only for Physical Node; Only for VM; Common for all Data Stores
    Phases: System Setup; Application Setup
  • 42. Thank you
    You can follow us on: AmadeusITGroup, amadeus.com/blog, amadeus.com