SlideShare a Scribd company logo
1 of 30
Download to read offline
CREATING REDUNDANCY FOR BIG HADOOP CLUSTERS IS
HARD
Stuart Pook (s.pook@criteo.com @StuartPook)
BROUGHT TO YOU BY LAKE
Anna Savarin, Anthony Rabier, Meriam Lachkar, Nicolas Fraison, Rémy Gayet,
Rémy Saissy, Stuart Pook, Thierry Lefort & Yohan Bismuth
HADOOP AT CRITEO BACK IN 2014
39 PB raw capacity
> 100 000 jobs/day
10 000 CPUs
105 TB RAM
40 TB imported/day
PAID
COMPUTE ESSENTIAL
Extract, Transform & Load logs (72 hour buffers)
bidding models
billing
business analysis
HADOOP PROVIDES LOCAL REDUNDANCY
failing datanodes (1 or 2)
failing racks (1)
failing namenodes (1)
failing resourcemanager (1)
INFRASTRUCTURE PROVIDES
multiple routers
but single top of rack switch
multiple power sources
but one power supply per rack
NO PROTECTION AGAINST
data centre disaster
multiple datanode failures in a short time
multiple rack failures in a day
operator error
DATA BACKUP IN THE CLOUD
backup long
import 100 TB/day
create 50 TB/day
connection 50 Gb/s
restore 2 PB too long at 50 Gb/s
what about compute?
COMPUTE IN THE CLOUD
20 000 CPUs requires reservation
reservation expensive
no need for elasticity
Criteo has data centres
Criteo likes bare metal
cloud > 8 times more expensive
in-house get us exactly the network & hardware we need
BUILD A NEW DATA CENTRE IN 9 MONTHS
space for 5000 machines
non-blocking network
10 Gb/s endpoints
level 3 routing
Clos topology
power 1 megawatt + option for 1 MW
it's impossible
NEW HARDWARE
had one supplier
need competition to keep prices down
3 replies to our call for tenders
3 similar machines
16 (or 12) 6 TB SATA disks
2 Xeon E5-2650L v3, 24 cores, 48 threads
256 GB RAM
Mellanox 10 Gb/s network card
TEST THE HARDWARE
three 10 node clusters
disk bandwidth
network bandwidth
teragen
zero replication (disks)
high replication (network)
terasort
HARDWARE IS SIMILAR
eliminate the constructor with
4 DOA disks
other failed disks
20% higher power consumption
choose the cheapest and most dense → Huawei
MIX THE HARDWARE
operations are more difficult
multiple configurations needed
some clusters have both hardware types
we have more choice at each order
TWO MISTAKES (AT LEAST)
made master and datanodes the same
just one type of machine
easy migration (except kerberos)
moving master nodes to bigger machines
bad RAID card
change 700 cards on a running cluster
all blocks lost on each card change
no downtime allowed
HAVE THE DC, BUILD HADOOP
configure using Chef
infrastructure is code in git
automate to scale (because we don't)
test Hadoop with 10 hour petasorts
tune Hadoop for this scale
namenode machine crashes
upgrade kernel
rolling restart on all machines
rack by rack, not node by node
NEW CLUSTER ONLINE
more capacity as have 2 clusters
users love it
users find new uses
soon using more capacity than the new cluster provides
impossible to stop the old cluster
SITUATION NOW WORSE
two critical clusters
two Hadoop versions to support
not one but two SPOFs
WILL TRY (AGAIN) FOR REDUNDANCY
add hundreds of new nodes
soon the new cluster will be sufficient
move all jobs and data to the new cluster
stop the old cluster
upgrade the old cluster
run both in parallel
both clusters with capacity for essential jobs
both clusters have all essential data
procedure to handle a cluster loss
STAYING ALIVE
automate operations
disk changes
ramp ups
restarts (machines or services)
infrastructure as code
tests: kitchen, ChefSpec & preprod
merge requests with reviews
STAYING ALIVE — TUNE HADOOP
increase bandwidth and time limit for checkpoint
332 GB heap for namenode
180 GB reserved for native code & OS
tune GC for namenode
serial → not efficient on multi-thread
parallel → long pauses + high throughput
Concurrent Mark Sweep → short pauses + lower throughput
G1 → in prod
G1NewSizePercent=0
G1RSetRegionEntries=4096
+ParallelRefProcEnabled & +PerfDisableSharedMem
Azul not required for 1000 DN but might be for 3000
STAYING ALIVE — FIX BUGS
The cluster crashes, find the bug, if fixed, backport it, else fix
fix
HDFS-10220 expired leases make namenode unresponsive and failover
backport
YARN-4041 Slow delegation token renewal prolongs RM recovery
HDFS-9305 Delayed heartbeat processing causes storm of heartbeats
YARN-4546 ResourceManager crash due to scheduling opportunity overflow
HDFS-9906 Remove spammy log spew when a datanode is restarted
STAYING ALIVE — MONITORING
HDFS
namenode: missing blocks, GC, checkpoints, safemode, QPS, live datanode
datanodes: disks, read/write throughput, space
YARN
queue length, memory & CPU usage, job duration (scheduling + run time)
ResourceManager: QPS
bad nodes
probes to emulate client behavior with witness jobs
zookeeper: availability, probes
STAYING ALIVE — FASTER MASTER NODES
namenode does many sequential operations
long locks
failovers too slow
heartbeats lost
→ fast CPU
two big namenodes
512 GB RAM
2 × Intel Xeon CPU E5-2643 v4 @ 3.40GHz
3 × RAID 1 of 2 SSD
HANDLING SCALE
things that work at small scale then fail
download packages from Internet
DNS as database
CLUSTER COLLAPSE (FEBRUARY 2016)
lots of blocks → 132 GB namenode (NN) heap full
user creates 20 million files & 20 PB data on a Friday a ernoon
NN gets stuck doing GC → no throughput
increase standby heap size to 85% RAM via restart
too many requests during restart (iptables)
failover crashed
fsimage on active corrupt as too big for transfer
copy missing NN edits from journal node
restart 1200 datanodes in batches
36 hours to recover the cluster
NEED SERVICE-LEVEL AGREEMENT (SLA)
define
time for operations
job duration
request handling
measure
monitoring
respect
some services are “best effort”
OPERATOR ERROR
same operators on both clusters
one chef server for both clusters
single mistake → both clusters
WE HAVE
4 prod clusters
3 pre-prod clusters
2 running CDH4
5 running CDH5
2347 datanodes
40 986 cores
130 PB disk space
660 TB RAM
> 300 000 jobs/day
100 TB imported daily
6 PB created or read per day
UPCOMING CHALLENGES
Optimize and fix Hadoop
Add hundreds more datanodes
Create a new bare-metal data-centre
Make 2 big clusters work together
We are hiring
Come and join us
s.pook@criteo.com @StuartPook
Questions?

More Related Content

What's hot

Hands on MapR -- Viadea
Hands on MapR -- ViadeaHands on MapR -- Viadea
Hands on MapR -- Viadeaviadea
 
Building Apache Cassandra clusters for massive scale
Building Apache Cassandra clusters for massive scaleBuilding Apache Cassandra clusters for massive scale
Building Apache Cassandra clusters for massive scaleAlex Thompson
 
Cassandra Troubleshooting 3.0
Cassandra Troubleshooting 3.0Cassandra Troubleshooting 3.0
Cassandra Troubleshooting 3.0J.B. Langston
 
Ceph Performance and Sizing Guide
Ceph Performance and Sizing GuideCeph Performance and Sizing Guide
Ceph Performance and Sizing GuideJose De La Rosa
 
MapR Tutorial Series
MapR Tutorial SeriesMapR Tutorial Series
MapR Tutorial Seriesselvaraaju
 
Cassandra Backups and Restorations Using Ansible (Joshua Wickman, Knewton) | ...
Cassandra Backups and Restorations Using Ansible (Joshua Wickman, Knewton) | ...Cassandra Backups and Restorations Using Ansible (Joshua Wickman, Knewton) | ...
Cassandra Backups and Restorations Using Ansible (Joshua Wickman, Knewton) | ...DataStax
 
Quantcast File System (QFS) - Alternative to HDFS
Quantcast File System (QFS) - Alternative to HDFSQuantcast File System (QFS) - Alternative to HDFS
Quantcast File System (QFS) - Alternative to HDFSbigdatagurus_meetup
 
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...DataStax
 
ELK: Moose-ively scaling your log system
ELK: Moose-ively scaling your log systemELK: Moose-ively scaling your log system
ELK: Moose-ively scaling your log systemAvleen Vig
 
Cassandra in Operation
Cassandra in OperationCassandra in Operation
Cassandra in Operationniallmilton
 
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...DataStax
 
C* Summit 2013: Netflix Open Source Tools and Benchmarks for Cassandra by Adr...
C* Summit 2013: Netflix Open Source Tools and Benchmarks for Cassandra by Adr...C* Summit 2013: Netflix Open Source Tools and Benchmarks for Cassandra by Adr...
C* Summit 2013: Netflix Open Source Tools and Benchmarks for Cassandra by Adr...DataStax Academy
 
The Automation Factory
The Automation FactoryThe Automation Factory
The Automation FactoryNathan Milford
 
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...DataStax
 
Managing Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al TobeyManaging Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al TobeyDataStax Academy
 
Building the Right Platform Architecture for Hadoop
Building the Right Platform Architecture for HadoopBuilding the Right Platform Architecture for Hadoop
Building the Right Platform Architecture for HadoopAll Things Open
 
Cassandra Troubleshooting (for 2.0 and earlier)
Cassandra Troubleshooting (for 2.0 and earlier)Cassandra Troubleshooting (for 2.0 and earlier)
Cassandra Troubleshooting (for 2.0 and earlier)J.B. Langston
 
Upgrading from HDP 2.1 to HDP 2.2
Upgrading from HDP 2.1 to HDP 2.2Upgrading from HDP 2.1 to HDP 2.2
Upgrading from HDP 2.1 to HDP 2.2SATOSHI TAGOMORI
 

What's hot (20)

Apache Hadoop 0.22 and Other Versions
Apache Hadoop 0.22 and Other VersionsApache Hadoop 0.22 and Other Versions
Apache Hadoop 0.22 and Other Versions
 
Hands on MapR -- Viadea
Hands on MapR -- ViadeaHands on MapR -- Viadea
Hands on MapR -- Viadea
 
Building Apache Cassandra clusters for massive scale
Building Apache Cassandra clusters for massive scaleBuilding Apache Cassandra clusters for massive scale
Building Apache Cassandra clusters for massive scale
 
Cassandra Troubleshooting 3.0
Cassandra Troubleshooting 3.0Cassandra Troubleshooting 3.0
Cassandra Troubleshooting 3.0
 
Ceph Performance and Sizing Guide
Ceph Performance and Sizing GuideCeph Performance and Sizing Guide
Ceph Performance and Sizing Guide
 
MapR Tutorial Series
MapR Tutorial SeriesMapR Tutorial Series
MapR Tutorial Series
 
Cassandra Backups and Restorations Using Ansible (Joshua Wickman, Knewton) | ...
Cassandra Backups and Restorations Using Ansible (Joshua Wickman, Knewton) | ...Cassandra Backups and Restorations Using Ansible (Joshua Wickman, Knewton) | ...
Cassandra Backups and Restorations Using Ansible (Joshua Wickman, Knewton) | ...
 
Cassandra at teads
Cassandra at teadsCassandra at teads
Cassandra at teads
 
Quantcast File System (QFS) - Alternative to HDFS
Quantcast File System (QFS) - Alternative to HDFSQuantcast File System (QFS) - Alternative to HDFS
Quantcast File System (QFS) - Alternative to HDFS
 
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
 
ELK: Moose-ively scaling your log system
ELK: Moose-ively scaling your log systemELK: Moose-ively scaling your log system
ELK: Moose-ively scaling your log system
 
Cassandra in Operation
Cassandra in OperationCassandra in Operation
Cassandra in Operation
 
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
 
C* Summit 2013: Netflix Open Source Tools and Benchmarks for Cassandra by Adr...
C* Summit 2013: Netflix Open Source Tools and Benchmarks for Cassandra by Adr...C* Summit 2013: Netflix Open Source Tools and Benchmarks for Cassandra by Adr...
C* Summit 2013: Netflix Open Source Tools and Benchmarks for Cassandra by Adr...
 
The Automation Factory
The Automation FactoryThe Automation Factory
The Automation Factory
 
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...
 
Managing Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al TobeyManaging Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al Tobey
 
Building the Right Platform Architecture for Hadoop
Building the Right Platform Architecture for HadoopBuilding the Right Platform Architecture for Hadoop
Building the Right Platform Architecture for Hadoop
 
Cassandra Troubleshooting (for 2.0 and earlier)
Cassandra Troubleshooting (for 2.0 and earlier)Cassandra Troubleshooting (for 2.0 and earlier)
Cassandra Troubleshooting (for 2.0 and earlier)
 
Upgrading from HDP 2.1 to HDP 2.2
Upgrading from HDP 2.1 to HDP 2.2Upgrading from HDP 2.1 to HDP 2.2
Upgrading from HDP 2.1 to HDP 2.2
 

Similar to Redundancy for Big Hadoop Clusters is hard - Stuart Pook

MapReduce Improvements in MapR Hadoop
MapReduce Improvements in MapR HadoopMapReduce Improvements in MapR Hadoop
MapReduce Improvements in MapR Hadoopabord
 
Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?Uwe Printz
 
Scaling Hadoop at LinkedIn
Scaling Hadoop at LinkedInScaling Hadoop at LinkedIn
Scaling Hadoop at LinkedInDataWorks Summit
 
Shift into High Gear: Dramatically Improve Hadoop & NoSQL Performance
Shift into High Gear: Dramatically Improve Hadoop & NoSQL PerformanceShift into High Gear: Dramatically Improve Hadoop & NoSQL Performance
Shift into High Gear: Dramatically Improve Hadoop & NoSQL PerformanceMapR Technologies
 
Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?Uwe Printz
 
Inside MapR's M7
Inside MapR's M7Inside MapR's M7
Inside MapR's M7Ted Dunning
 
The Anatomy Of The Google Architecture Fina Lv1.1
The Anatomy Of The Google Architecture Fina Lv1.1The Anatomy Of The Google Architecture Fina Lv1.1
The Anatomy Of The Google Architecture Fina Lv1.1Hassy Veldstra
 
Optimizing the Upstreaming Workflow: Flexibly Scale Storage for Seismic Proce...
Optimizing the Upstreaming Workflow: Flexibly Scale Storage for Seismic Proce...Optimizing the Upstreaming Workflow: Flexibly Scale Storage for Seismic Proce...
Optimizing the Upstreaming Workflow: Flexibly Scale Storage for Seismic Proce...Avere Systems
 
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast DataDatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast DataHakka Labs
 
MapR M7: Providing an enterprise quality Apache HBase API
MapR M7: Providing an enterprise quality Apache HBase APIMapR M7: Providing an enterprise quality Apache HBase API
MapR M7: Providing an enterprise quality Apache HBase APImcsrivas
 
Storage and performance- Batch processing, Whiptail
Storage and performance- Batch processing, WhiptailStorage and performance- Batch processing, Whiptail
Storage and performance- Batch processing, WhiptailInternet World
 
Hw09 Production Deep Dive With High Availability
Hw09   Production Deep Dive With High AvailabilityHw09   Production Deep Dive With High Availability
Hw09 Production Deep Dive With High AvailabilityCloudera, Inc.
 
SD Big Data Monthly Meetup #4 - Session 2 - WANDisco
SD Big Data Monthly Meetup #4 - Session 2 - WANDiscoSD Big Data Monthly Meetup #4 - Session 2 - WANDisco
SD Big Data Monthly Meetup #4 - Session 2 - WANDiscoBig Data Joe™ Rossi
 
Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...
Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...
Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...Cloudera, Inc.
 
Big data with hadoop Setup on Ubuntu 12.04
Big data with hadoop Setup on Ubuntu 12.04Big data with hadoop Setup on Ubuntu 12.04
Big data with hadoop Setup on Ubuntu 12.04Mandakini Kumari
 
Hadoop Hardware @Twitter: Size does matter!
Hadoop Hardware @Twitter: Size does matter!Hadoop Hardware @Twitter: Size does matter!
Hadoop Hardware @Twitter: Size does matter!DataWorks Summit
 

Similar to Redundancy for Big Hadoop Clusters is hard - Stuart Pook (20)

MapReduce Improvements in MapR Hadoop
MapReduce Improvements in MapR HadoopMapReduce Improvements in MapR Hadoop
MapReduce Improvements in MapR Hadoop
 
Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?
 
Scaling Hadoop at LinkedIn
Scaling Hadoop at LinkedInScaling Hadoop at LinkedIn
Scaling Hadoop at LinkedIn
 
Inside MapR's M7
Inside MapR's M7Inside MapR's M7
Inside MapR's M7
 
Shift into High Gear: Dramatically Improve Hadoop & NoSQL Performance
Shift into High Gear: Dramatically Improve Hadoop & NoSQL PerformanceShift into High Gear: Dramatically Improve Hadoop & NoSQL Performance
Shift into High Gear: Dramatically Improve Hadoop & NoSQL Performance
 
Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?
 
Inside MapR's M7
Inside MapR's M7Inside MapR's M7
Inside MapR's M7
 
The Anatomy Of The Google Architecture Fina Lv1.1
The Anatomy Of The Google Architecture Fina Lv1.1The Anatomy Of The Google Architecture Fina Lv1.1
The Anatomy Of The Google Architecture Fina Lv1.1
 
Optimizing the Upstreaming Workflow: Flexibly Scale Storage for Seismic Proce...
Optimizing the Upstreaming Workflow: Flexibly Scale Storage for Seismic Proce...Optimizing the Upstreaming Workflow: Flexibly Scale Storage for Seismic Proce...
Optimizing the Upstreaming Workflow: Flexibly Scale Storage for Seismic Proce...
 
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast DataDatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
 
MapR M7: Providing an enterprise quality Apache HBase API
MapR M7: Providing an enterprise quality Apache HBase APIMapR M7: Providing an enterprise quality Apache HBase API
MapR M7: Providing an enterprise quality Apache HBase API
 
Storage and performance- Batch processing, Whiptail
Storage and performance- Batch processing, WhiptailStorage and performance- Batch processing, Whiptail
Storage and performance- Batch processing, Whiptail
 
Hw09 Production Deep Dive With High Availability
Hw09   Production Deep Dive With High AvailabilityHw09   Production Deep Dive With High Availability
Hw09 Production Deep Dive With High Availability
 
SD Big Data Monthly Meetup #4 - Session 2 - WANDisco
SD Big Data Monthly Meetup #4 - Session 2 - WANDiscoSD Big Data Monthly Meetup #4 - Session 2 - WANDisco
SD Big Data Monthly Meetup #4 - Session 2 - WANDisco
 
Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...
Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...
Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...
 
LUG 2014
LUG 2014LUG 2014
LUG 2014
 
Kudu austin oct 2015.pptx
Kudu austin oct 2015.pptxKudu austin oct 2015.pptx
Kudu austin oct 2015.pptx
 
SFHUG Kudu Talk
SFHUG Kudu TalkSFHUG Kudu Talk
SFHUG Kudu Talk
 
Big data with hadoop Setup on Ubuntu 12.04
Big data with hadoop Setup on Ubuntu 12.04Big data with hadoop Setup on Ubuntu 12.04
Big data with hadoop Setup on Ubuntu 12.04
 
Hadoop Hardware @Twitter: Size does matter!
Hadoop Hardware @Twitter: Size does matter!Hadoop Hardware @Twitter: Size does matter!
Hadoop Hardware @Twitter: Size does matter!
 

More from Evention

The Factorization Machines algorithm for building recommendation system - Paw...
The Factorization Machines algorithm for building recommendation system - Paw...The Factorization Machines algorithm for building recommendation system - Paw...
The Factorization Machines algorithm for building recommendation system - Paw...Evention
 
A/B testing powered by Big data - Saurabh Goyal, Booking.com
A/B testing powered by Big data - Saurabh Goyal, Booking.comA/B testing powered by Big data - Saurabh Goyal, Booking.com
A/B testing powered by Big data - Saurabh Goyal, Booking.comEvention
 
Near Real-Time Fraud Detection in Telecommunication Industry - Burak Işıklı, ...
Near Real-Time Fraud Detection in Telecommunication Industry - Burak Işıklı, ...Near Real-Time Fraud Detection in Telecommunication Industry - Burak Işıklı, ...
Near Real-Time Fraud Detection in Telecommunication Industry - Burak Işıklı, ...Evention
 
Assisting millions of active users in real-time - Alexey Brodovshuk, Kcell; K...
Assisting millions of active users in real-time - Alexey Brodovshuk, Kcell; K...Assisting millions of active users in real-time - Alexey Brodovshuk, Kcell; K...
Assisting millions of active users in real-time - Alexey Brodovshuk, Kcell; K...Evention
 
Machine learning security - Pawel Zawistowski, Warsaw University of Technolog...
Machine learning security - Pawel Zawistowski, Warsaw University of Technolog...Machine learning security - Pawel Zawistowski, Warsaw University of Technolog...
Machine learning security - Pawel Zawistowski, Warsaw University of Technolog...Evention
 
Building a Modern Data Pipeline: Lessons Learned - Saulius Valatka, Adform
Building a Modern Data Pipeline: Lessons Learned - Saulius Valatka, AdformBuilding a Modern Data Pipeline: Lessons Learned - Saulius Valatka, Adform
Building a Modern Data Pipeline: Lessons Learned - Saulius Valatka, AdformEvention
 
Apache Flink: Better, Faster & Uncut - Piotr Nowojski, data Artisans
Apache Flink: Better, Faster & Uncut - Piotr Nowojski, data ArtisansApache Flink: Better, Faster & Uncut - Piotr Nowojski, data Artisans
Apache Flink: Better, Faster & Uncut - Piotr Nowojski, data ArtisansEvention
 
Privacy by Design - Lars Albertsson, Mapflat
Privacy by Design - Lars Albertsson, MapflatPrivacy by Design - Lars Albertsson, Mapflat
Privacy by Design - Lars Albertsson, MapflatEvention
 
Elephants in the cloud or how to become cloud ready - Krzysztof Adamski, GetI...
Elephants in the cloud or how to become cloud ready - Krzysztof Adamski, GetI...Elephants in the cloud or how to become cloud ready - Krzysztof Adamski, GetI...
Elephants in the cloud or how to become cloud ready - Krzysztof Adamski, GetI...Evention
 
Deriving Actionable Insights from High Volume Media Streams - Jörn Kottmann, ...
Deriving Actionable Insights from High Volume Media Streams - Jörn Kottmann, ...Deriving Actionable Insights from High Volume Media Streams - Jörn Kottmann, ...
Deriving Actionable Insights from High Volume Media Streams - Jörn Kottmann, ...Evention
 
Enhancing Spark - increase streaming capabilities of your applications - Kami...
Enhancing Spark - increase streaming capabilities of your applications - Kami...Enhancing Spark - increase streaming capabilities of your applications - Kami...
Enhancing Spark - increase streaming capabilities of your applications - Kami...Evention
 
7 Days of Playing Minesweeper, or How to Shut Down Whistleblower Defense with...
7 Days of Playing Minesweeper, or How to Shut Down Whistleblower Defense with...7 Days of Playing Minesweeper, or How to Shut Down Whistleblower Defense with...
7 Days of Playing Minesweeper, or How to Shut Down Whistleblower Defense with...Evention
 
Big Data Journey at a Big Corp - Tomasz Burzyński, Maciej Czyżowicz, Orange P...
Big Data Journey at a Big Corp - Tomasz Burzyński, Maciej Czyżowicz, Orange P...Big Data Journey at a Big Corp - Tomasz Burzyński, Maciej Czyżowicz, Orange P...
Big Data Journey at a Big Corp - Tomasz Burzyński, Maciej Czyżowicz, Orange P...Evention
 
Stream processing with Apache Flink - Maximilian Michels Data Artisans
Stream processing with Apache Flink - Maximilian Michels Data ArtisansStream processing with Apache Flink - Maximilian Michels Data Artisans
Stream processing with Apache Flink - Maximilian Michels Data ArtisansEvention
 
Scaling Cassandra in all directions - Jimmy Mardell Spotify
Scaling Cassandra in all directions - Jimmy Mardell SpotifyScaling Cassandra in all directions - Jimmy Mardell Spotify
Scaling Cassandra in all directions - Jimmy Mardell SpotifyEvention
 
Big Data for unstructured data Dariusz Śliwa
Big Data for unstructured data Dariusz ŚliwaBig Data for unstructured data Dariusz Śliwa
Big Data for unstructured data Dariusz ŚliwaEvention
 
Elastic development. Implementing Big Data search Grzegorz Kołpuć
Elastic development. Implementing Big Data search Grzegorz KołpućElastic development. Implementing Big Data search Grzegorz Kołpuć
Elastic development. Implementing Big Data search Grzegorz KołpućEvention
 
H2 o deep water making deep learning accessible to everyone -jo-fai chow
H2 o deep water   making deep learning accessible to everyone -jo-fai chowH2 o deep water   making deep learning accessible to everyone -jo-fai chow
H2 o deep water making deep learning accessible to everyone -jo-fai chowEvention
 
That won’t fit into RAM - Michał Brzezicki
That won’t fit into RAM -  Michał  BrzezickiThat won’t fit into RAM -  Michał  Brzezicki
That won’t fit into RAM - Michał BrzezickiEvention
 
Stream Analytics with SQL on Apache Flink - Fabian Hueske
Stream Analytics with SQL on Apache Flink - Fabian HueskeStream Analytics with SQL on Apache Flink - Fabian Hueske
Stream Analytics with SQL on Apache Flink - Fabian HueskeEvention
 

More from Evention (20)

The Factorization Machines algorithm for building recommendation system - Paw...
The Factorization Machines algorithm for building recommendation system - Paw...The Factorization Machines algorithm for building recommendation system - Paw...
The Factorization Machines algorithm for building recommendation system - Paw...
 
A/B testing powered by Big data - Saurabh Goyal, Booking.com
A/B testing powered by Big data - Saurabh Goyal, Booking.comA/B testing powered by Big data - Saurabh Goyal, Booking.com
A/B testing powered by Big data - Saurabh Goyal, Booking.com
 
Near Real-Time Fraud Detection in Telecommunication Industry - Burak Işıklı, ...
Near Real-Time Fraud Detection in Telecommunication Industry - Burak Işıklı, ...Near Real-Time Fraud Detection in Telecommunication Industry - Burak Işıklı, ...
Near Real-Time Fraud Detection in Telecommunication Industry - Burak Işıklı, ...
 
Assisting millions of active users in real-time - Alexey Brodovshuk, Kcell; K...
Assisting millions of active users in real-time - Alexey Brodovshuk, Kcell; K...Assisting millions of active users in real-time - Alexey Brodovshuk, Kcell; K...
Assisting millions of active users in real-time - Alexey Brodovshuk, Kcell; K...
 
Machine learning security - Pawel Zawistowski, Warsaw University of Technolog...
Machine learning security - Pawel Zawistowski, Warsaw University of Technolog...Machine learning security - Pawel Zawistowski, Warsaw University of Technolog...
Machine learning security - Pawel Zawistowski, Warsaw University of Technolog...
 
Building a Modern Data Pipeline: Lessons Learned - Saulius Valatka, Adform
Building a Modern Data Pipeline: Lessons Learned - Saulius Valatka, AdformBuilding a Modern Data Pipeline: Lessons Learned - Saulius Valatka, Adform
Building a Modern Data Pipeline: Lessons Learned - Saulius Valatka, Adform
 
Apache Flink: Better, Faster & Uncut - Piotr Nowojski, data Artisans
Apache Flink: Better, Faster & Uncut - Piotr Nowojski, data ArtisansApache Flink: Better, Faster & Uncut - Piotr Nowojski, data Artisans
Apache Flink: Better, Faster & Uncut - Piotr Nowojski, data Artisans
 
Privacy by Design - Lars Albertsson, Mapflat
Privacy by Design - Lars Albertsson, MapflatPrivacy by Design - Lars Albertsson, Mapflat
Privacy by Design - Lars Albertsson, Mapflat
 
Elephants in the cloud or how to become cloud ready - Krzysztof Adamski, GetI...
Elephants in the cloud or how to become cloud ready - Krzysztof Adamski, GetI...Elephants in the cloud or how to become cloud ready - Krzysztof Adamski, GetI...
Elephants in the cloud or how to become cloud ready - Krzysztof Adamski, GetI...
 
Deriving Actionable Insights from High Volume Media Streams - Jörn Kottmann, ...
Deriving Actionable Insights from High Volume Media Streams - Jörn Kottmann, ...Deriving Actionable Insights from High Volume Media Streams - Jörn Kottmann, ...
Deriving Actionable Insights from High Volume Media Streams - Jörn Kottmann, ...
 
Enhancing Spark - increase streaming capabilities of your applications - Kami...
Enhancing Spark - increase streaming capabilities of your applications - Kami...Enhancing Spark - increase streaming capabilities of your applications - Kami...
Enhancing Spark - increase streaming capabilities of your applications - Kami...
 
7 Days of Playing Minesweeper, or How to Shut Down Whistleblower Defense with...
7 Days of Playing Minesweeper, or How to Shut Down Whistleblower Defense with...7 Days of Playing Minesweeper, or How to Shut Down Whistleblower Defense with...
7 Days of Playing Minesweeper, or How to Shut Down Whistleblower Defense with...
 
Big Data Journey at a Big Corp - Tomasz Burzyński, Maciej Czyżowicz, Orange P...
Big Data Journey at a Big Corp - Tomasz Burzyński, Maciej Czyżowicz, Orange P...Big Data Journey at a Big Corp - Tomasz Burzyński, Maciej Czyżowicz, Orange P...
Big Data Journey at a Big Corp - Tomasz Burzyński, Maciej Czyżowicz, Orange P...
 
Stream processing with Apache Flink - Maximilian Michels Data Artisans
Stream processing with Apache Flink - Maximilian Michels Data ArtisansStream processing with Apache Flink - Maximilian Michels Data Artisans
Stream processing with Apache Flink - Maximilian Michels Data Artisans
 
Scaling Cassandra in all directions - Jimmy Mardell Spotify
Scaling Cassandra in all directions - Jimmy Mardell SpotifyScaling Cassandra in all directions - Jimmy Mardell Spotify
Scaling Cassandra in all directions - Jimmy Mardell Spotify
 
Big Data for unstructured data Dariusz Śliwa
Big Data for unstructured data Dariusz ŚliwaBig Data for unstructured data Dariusz Śliwa
Big Data for unstructured data Dariusz Śliwa
 
Elastic development. Implementing Big Data search Grzegorz Kołpuć
Elastic development. Implementing Big Data search Grzegorz KołpućElastic development. Implementing Big Data search Grzegorz Kołpuć
Elastic development. Implementing Big Data search Grzegorz Kołpuć
 
H2 o deep water making deep learning accessible to everyone -jo-fai chow
H2 o deep water   making deep learning accessible to everyone -jo-fai chowH2 o deep water   making deep learning accessible to everyone -jo-fai chow
H2 o deep water making deep learning accessible to everyone -jo-fai chow
 
That won’t fit into RAM - Michał Brzezicki
That won’t fit into RAM -  Michał  BrzezickiThat won’t fit into RAM -  Michał  Brzezicki
That won’t fit into RAM - Michał Brzezicki
 
Stream Analytics with SQL on Apache Flink - Fabian Hueske
Stream Analytics with SQL on Apache Flink - Fabian HueskeStream Analytics with SQL on Apache Flink - Fabian Hueske
Stream Analytics with SQL on Apache Flink - Fabian Hueske
 

Recently uploaded

Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...shivangimorya083
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Delhi Call girls
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Onlineanilsa9823
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...shambhavirathore45
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Delhi Call girls
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramMoniSankarHazra
 

Recently uploaded (20)

Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 

Redundancy for Big Hadoop Clusters is hard - Stuart Pook

  • 1. CREATING REDUNDANCY FOR BIG HADOOP CLUSTERS IS HARD Stuart Pook (s.pook@criteo.com @StuartPook)
  • 2. BROUGHT TO YOU BY LAKE Anna Savarin, Anthony Rabier, Meriam Lachkar, Nicolas Fraison, Rémy Gayet, Rémy Saissy, Stuart Pook, Thierry Lefort & Yohan Bismuth
  • 3. HADOOP AT CRITEO BACK IN 2014 39 PB raw capacity > 100 000 jobs/day 10 000 CPUs 105 TB RAM 40 TB imported/day
  • 4. PAID COMPUTE ESSENTIAL Extract, Transform & Load logs (72 hour buffers) bidding models billing business analysis
  • 5. HADOOP PROVIDES LOCAL REDUNDANCY failing datanodes (1 or 2) failing racks (1) failing namenodes (1) failing resourcemanager (1)
  • 6. INFRASTRUCTURE PROVIDES multiple routers but single top of rack switch multiple power sources but one power supply per rack
  • 7. NO PROTECTION AGAINST data centre disaster multiple datanode failures in a short time multiple rack failures in a day operator error
  • 8. DATA BACKUP IN THE CLOUD backup long import 100 TB/day create 50 TB/day connection 50 Gb/s restore 2 PB too long at 50 Gb/s what about compute?
  • 9. COMPUTE IN THE CLOUD 20 000 CPUs requires reservation reservation expensive no need for elasticity Criteo has data centres Criteo likes bare metal cloud > 8 times more expensive in-house get us exactly the network & hardware we need
  • 10. BUILD A NEW DATA CENTRE IN 9 MONTHS space for 5000 machines non-blocking network 10 Gb/s endpoints level 3 routing Clos topology power 1 megawatt + option for 1 MW it's impossible
  • 11. NEW HARDWARE had one supplier need competition to keep prices down 3 replies to our call for tenders 3 similar machines 16 (or 12) 6 TB SATA disks 2 Xeon E5-2650L v3, 24 cores, 48 threads 256 GB RAM Mellanox 10 Gb/s network card
  • 12. TEST THE HARDWARE three 10 node clusters disk bandwidth network bandwidth teragen zero replication (disks) high replication (network) terasort
  • 13. HARDWARE IS SIMILAR eliminate the constructor with 4 DOA disks other failed disks 20% higher power consumption choose the cheapest and most dense → Huawei
  • 14. MIX THE HARDWARE operations are more difficult multiple configurations needed some clusters have both hardware types we have more choice at each order
  • 15. TWO MISTAKES (AT LEAST) made master and datanodes the same just one type of machine easy migration (except kerberos) moving master nodes to bigger machines bad RAID card change 700 cards on a running cluster all blocks lost on each card change no downtime allowed
  • 16. HAVE THE DC, BUILD HADOOP configure using Chef infrastructure is code in git automate to scale (because we don't) test Hadoop with 10 hour petasorts tune Hadoop for this scale namenode machine crashes upgrade kernel rolling restart on all machines rack by rack, not node by node
  • 17. NEW CLUSTER ONLINE more capacity as have 2 clusters users love it users find new uses soon using more capacity than the new cluster provides impossible to stop the old cluster
  • 18. SITUATION NOW WORSE two critical clusters two Hadoop versions to support not one but two SPOFs
  • 19. WILL TRY (AGAIN) FOR REDUNDANCY add hundreds of new nodes soon the new cluster will be sufficient move all jobs and data to the new cluster stop the old cluster upgrade the old cluster run both in parallel both clusters with capacity for essential jobs both clusters have all essential data procedure to handle a cluster loss
  • 20. STAYING ALIVE automate operations disk changes ramp ups restarts (machines or services) infrastructure as code tests: kitchen, ChefSpec & preprod merge requests with reviews
  • 21. STAYING ALIVE — TUNE HADOOP increase bandwidth and time limit for checkpoint 332 GB heap for namenode 180 GB reserved for native code & OS tune GC for namenode serial → not efficient on multi-thread parallel → long pauses + high throughput Concurrent Mark Sweep → short pauses + lower throughput G1 → in prod G1NewSizePercent=0 G1RSetRegionEntries=4096 +ParallelRefProcEnabled & +PerfDisableSharedMem Azul not required for 1000 DN but might be for 3000
  • 22. STAYING ALIVE — FIX BUGS The cluster crashes, find the bug, if fixed, backport it, else fix fix HDFS-10220 expired leases make namenode unresponsive and failover backport YARN-4041 Slow delegation token renewal prolongs RM recovery HDFS-9305 Delayed heartbeat processing causes storm of heartbeats YARN-4546 ResourceManager crash due to scheduling opportunity overflow HDFS-9906 Remove spammy log spew when a datanode is restarted
  • 23. STAYING ALIVE — MONITORING HDFS namenode: missing blocks, GC, checkpoints, safemode, QPS, live datanode datanodes: disks, read/write throughput, space YARN queue length, memory & CPU usage, job duration (scheduling + run time) ResourceManager: QPS bad nodes probes to emulate client behavior with witness jobs zookeeper: availability, probes
  • 24. STAYING ALIVE — FASTER MASTER NODES namenode does many sequential operations long locks failovers too slow heartbeats lost → fast CPU two big namenodes 512 GB RAM 2 × Intel Xeon CPU E5-2643 v4 @ 3.40GHz 3 × RAID 1 of 2 SSD
  • 25. HANDLING SCALE things that work at small scale then fail download packages from Internet DNS as database
  • 26. CLUSTER COLLAPSE (FEBRUARY 2016) lots of blocks → 132 GB namenode (NN) heap full user creates 20 million files & 20 PB data on a Friday a ernoon NN gets stuck doing GC → no throughput increase standby heap size to 85% RAM via restart too many requests during restart (iptables) failover crashed fsimage on active corrupt as too big for transfer copy missing NN edits from journal node restart 1200 datanodes in batches 36 hours to recover the cluster
  • 27. NEED SERVICE-LEVEL AGREEMENT (SLA) define time for operations job duration request handling measure monitoring respect some services are “best effort”
  • 28. OPERATOR ERROR same operators on both clusters one chef server for both clusters single mistake → both clusters
  • 29. WE HAVE 4 prod clusters 3 pre-prod clusters 2 running CDH4 5 running CDH5 2347 datanodes 40 986 cores 130 PB disk space 660 TB RAM > 300 000 jobs/day 100 TB imported daily 6 PB created or read per day
  • 30. UPCOMING CHALLENGES Optimize and fix Hadoop Add hundreds more datanodes Create a new bare-metal data-centre Make 2 big clusters work together We are hiring Come and join us s.pook@criteo.com @StuartPook Questions?