Apache Hadoop
The elephant in the room
C. Aaron Cois, Ph.D.
Me
@aaroncois
www.codehenge.net
Love to chat!
The Problem
Large-Scale Computation
• Traditionally, large computation was
focused on
– Complex, CPU-intensive calculations
– On relatively small data sets
• Examples:
– Calculate complex differential equations
– Calculate digits of Pi
Parallel Processing
• Distributed systems allow scalable
computation (more
processors, working simultaneously)
INPUT OUTPUT
Data Storage
• Data is often stored on a SAN
• Data is copied to each compute node
at compute time
• This works well for small amounts of
data, but requires significant copy
time for large data sets
[Diagram: SAN holding the data, connected to compute nodes; caption "Calculating…"]
You must first distribute data
each time you run a
computation…
How much data?
over 25 PB of data
over 100 PB of data
The internet
IDC estimates[2] the internet contains at
least:
1 Zettabyte
or
1,000 Exabytes
or
1,000,000 Petabytes
2 http://www.emc.com/collateral/analyst-reports/expanding-digital-idc-white-paper.pdf (2007)
How much time?
Disk Transfer Rates:
• Standard 7200 RPM drive
128.75 MB/s
=> 7.7 secs/GB
=> 13 mins/100 GB
=> > 2 hours/TB
=> 90 days/PB
1 http://en.wikipedia.org/wiki/Hard_disk_drive#Data_transfer_rate
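The arithmetic behind these figures can be sketched in a few lines of Python. This is a minimal sketch assuming decimal units (1 GB = 1,000 MB) and a sustained 128.75 MB/s; the name `transfer_seconds` is illustrative and does not come from the slides.

```python
# Sketch: time to read a data set sequentially from one disk.
# Assumes decimal units (1 GB = 1,000 MB) and a sustained transfer rate.
DISK_MB_PER_SEC = 128.75  # rate quoted on the slide

SIZES_MB = {
    "1 GB": 1_000,
    "100 GB": 100_000,
    "1 TB": 1_000_000,
    "1 PB": 1_000_000_000,
}

def transfer_seconds(size_mb: float, rate_mb_per_sec: float = DISK_MB_PER_SEC) -> float:
    """Return the seconds needed to move size_mb at rate_mb_per_sec."""
    return size_mb / rate_mb_per_sec

if __name__ == "__main__":
    for label, size_mb in SIZES_MB.items():
        secs = transfer_seconds(size_mb)
        print(f"{label}: {secs:,.0f} s = {secs / 3600:,.1f} h = {secs / 86400:,.2f} days")
```

Changing `DISK_MB_PER_SEC` shows how little a faster single disk helps at petabyte scale.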
How much time?
Fastest network transfer rate:
• iSCSI over 100 Gb Ethernet (theor.)
– 12.5 GB/s => 80 sec/TB, 1333 min/PB
Ok, ignore network bottleneck:
• HyperTransport Bus
– 51.2 GB/s => 19 sec/TB, 325 min/PB
1 http://en.wikipedia.org/wiki/List_of_device_bit_rates
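Link rates are usually quoted in bits per second, so they need converting to bytes before estimating copy times. A minimal sketch, assuming the Ethernet figure refers to a 100 Gbit/s link (12.5 GB/s) and the HyperTransport figure to 51.2 GB/s, with decimal units and no protocol overhead; the helper names are illustrative.

```python
# Sketch: convert link rates to bytes/s and estimate copy times.
# Assumes decimal units and theoretical peak rates with no protocol overhead.
TB = 10**12  # bytes
PB = 10**15  # bytes

def gbit_to_gbyte(rate_gbit_per_sec: float) -> float:
    """Convert gigabits per second to gigabytes per second (8 bits per byte)."""
    return rate_gbit_per_sec / 8

def copy_seconds(size_bytes: int, rate_gbyte_per_sec: float) -> float:
    """Seconds to move size_bytes at rate_gbyte_per_sec."""
    return size_bytes / (rate_gbyte_per_sec * 10**9)

links = {
    "iSCSI over 100 Gbit/s Ethernet": gbit_to_gbyte(100),  # 12.5 GB/s
    "HyperTransport bus": 51.2,                            # GB/s, as assumed above
}

for name, rate in links.items():
    print(f"{name}: {copy_seconds(TB, rate):.1f} s/TB, "
          f"{copy_seconds(PB, rate) / 60:.1f} min/PB")
```

Even at these peak rates a petabyte takes hours to move, which is the motivation for the next slide.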
We need a better plan
• Sending data to distributed processors is
the bottleneck
• So what if we sent the processors to the
data?
Core concept:
Pre-distribute and store the data.
Assign compute nodes to operate on local
data.
The Solution
Distributed Data Servers
Distribute the Data
Send computation code to servers
containing relevant data
Hadoop Origin
• Hadoop was modeled after innovative
systems created by Google
• Designed to handle massive (web-
scale) amounts of data
Fun Fact: Hadoop’s creator
named it after his son’s stuffed
elephant
Hadoop Goals
• Store massive data sets
• Enable distributed computation
• Heavy focus on
– Fault tolerance
– Data integrity
– Commodity hardware
Hadoop System
Google system → Hadoop counterpart
• GFS → HDFS
• MapReduce → Hadoop MapReduce
• BigTable → HBase
Hadoop
Components
HDFS
• “Hadoop Distributed File System”
• Sits on top of native filesystem
– ext3, etc
• Stores data in files, replicated and
distributed across data nodes
• Files are “write once”
• Performs best with millions of ~100MB+
files
HDFS
Files are split into blocks for storage
Datanodes
– Data blocks are distributed/replicated
across datanodes
Namenode
– The master node
– Keeps track of location of data blocks
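To make the block and namenode ideas concrete, here is a toy in-memory model. It is a sketch, not HDFS code: the block size, replication factor, round-robin placement, and the names `split_into_blocks` and `place_blocks` are all assumptions chosen for illustration. Real HDFS uses far larger blocks and its own placement policy.

```python
# Toy model of block splitting and replica placement (not the HDFS API).
from itertools import cycle

def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut a file's bytes into fixed-size blocks; the last may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_blocks(num_blocks: int, datanodes: list[str], replication: int) -> dict[int, list[str]]:
    """Return a namenode-style map: block id -> datanodes holding a replica.

    Placement is simple round-robin, an assumption made for this sketch.
    """
    if replication > len(datanodes):
        raise ValueError("replication factor exceeds number of datanodes")
    ring = cycle(datanodes)
    return {block_id: [next(ring) for _ in range(replication)]
            for block_id in range(num_blocks)}

blocks = split_into_blocks(b"x" * 10, block_size=4)  # 3 blocks: 4 + 4 + 2 bytes
block_map = place_blocks(len(blocks), ["dn1", "dn2", "dn3", "dn4"], replication=2)
print(block_map)
```

The dictionary plays the namenode's role: it stores no file data, only where each block's replicas live.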
HDFS
[Diagram: multi-node cluster with Master and Slave machines, showing a NameNode and two DataNodes]
MapReduce
A programming model
– Designed to make programming parallel
computation over large distributed data
sets easy
– Each node processes data already
residing on it (when possible)
– Inspired by functional programming map
and reduce functions
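For readers unfamiliar with the functional roots mentioned above, Python's built-in `map` and `functools.reduce` show the idea on a single machine. This is only an analogy: Hadoop's map and reduce steps operate on key/value pairs spread over many nodes.

```python
# Single-machine illustration of the functional map and reduce operations.
from functools import reduce

words = ["hadoop", "hdfs", "hbase"]

# map: apply a function to every element independently (easy to parallelize)
lengths = list(map(len, words))

# reduce: fold the mapped values into a single result
total = reduce(lambda acc, n: acc + n, lengths, 0)

print(lengths, total)
```

Because each `map` call depends only on its own input, the work can be split across machines without coordination; only the reduce step needs to bring results together.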
MapReduce
JobTracker
– Runs on a master node
– Clients submit jobs to the JobTracker
– Assigns Map and Reduce tasks to slave
nodes
TaskTracker
– Runs on every slave node
– Daemon that instantiates Map or Reduce
tasks and reports results to JobTracker
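The division of labor above, combined with the goal of processing data where it already resides, can be sketched as a toy scheduler. Everything here is hypothetical (the `assign_map_tasks` function, the node names, the slot counts) and it is not how Hadoop's JobTracker is implemented; it only illustrates preferring a node that holds the block and falling back to any free node otherwise.

```python
# Toy locality-aware task assignment (illustrative; not Hadoop's scheduler).

def assign_map_tasks(block_locations: dict[str, list[str]],
                     free_slots: dict[str, int]) -> dict[str, tuple[str, bool]]:
    """Assign one map task per block.

    block_locations: block id -> nodes that store a replica
    free_slots: node -> number of tasks it can still accept
    Returns block id -> (chosen node, True if the data is local).
    """
    slots = dict(free_slots)  # copy so the caller's dict is untouched
    assignment = {}
    for block, holders in block_locations.items():
        # Prefer a node that already stores the block.
        local = next((n for n in holders if slots.get(n, 0) > 0), None)
        # Otherwise fall back to any node with a free slot (remote read).
        node = local or next((n for n, s in slots.items() if s > 0), None)
        if node is None:
            raise RuntimeError(f"no free slot for {block}")
        slots[node] -= 1
        assignment[block] = (node, local is not None)
    return assignment

plan = assign_map_tasks(
    {"b0": ["n1", "n2"], "b1": ["n1"], "b2": ["n1"]},
    {"n1": 2, "n2": 1, "n3": 1},
)
print(plan)
```

In this run the third block cannot be processed locally because `n1` has no slots left, which is the "when possible" caveat from the earlier slide.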
MapReduce
[Diagram: multi-node cluster with Master and Slave machines, showing a JobTracker and two TaskTrackers]
[Diagram: multi-node cluster with Master and Slave machines. MapReduce layer: a JobTracker and two TaskTrackers. HDFS layer: a NameNode and two DataNodes.]
HBase
• Hadoop’s Database
• Sits on top of HDFS
• Provides random read/write access to
Very Large™ tables
– Billions of rows, billions of columns
• Access via
Java, Jython, Groovy, Scala, or REST
web service
A Typical Hadoop Cluster
• Consists entirely of commodity ~$5k
servers
• 1 master, 1 -> 1000+ slaves
• Scales linearly as more processing
nodes are added
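As a rough worked example of what linear scaling buys, the single-disk figure of about 90 days per PB shrinks quickly when every node scans only its local share. The sketch assumes ideal linear scaling, one 128.75 MB/s disk per node, evenly spread data, and no coordination overhead; real clusters will fall short of this.

```python
# Sketch: ideal parallel scan time when data is spread evenly across nodes.
DISK_MB_PER_SEC = 128.75     # single-disk rate used earlier in the deck
PETABYTE_MB = 1_000_000_000  # decimal units

def scan_hours(total_mb: float, nodes: int, rate: float = DISK_MB_PER_SEC) -> float:
    """Hours for `nodes` machines to each read an equal share in parallel."""
    return (total_mb / nodes) / rate / 3600

for n in (1, 10, 100, 1000):
    print(f"{n:>4} nodes: {scan_hours(PETABYTE_MB, n):,.1f} hours")
```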
How it works
http://en.wikipedia.org/wiki/MapReduce
Traditional MapReduce
Hadoop MapReduce
Image Credit: http://www.drdobbs.com/database/hadoop-the-lay-of-the-land/240150854
MapReduce Example
function map(Str name, Str document):
    for each word w in document:
        increment_count(w, 1)
function reduce(Str word, Iter partialCounts):
    sum = 0
    for each pc in partialCounts:
        sum += ParseInt(pc)
    return (word, sum)
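A runnable single-process version of the pseudocode above might look like the following Python sketch. The `run_job` driver and its in-memory grouping step are stand-ins for what the framework does across the cluster; they are assumptions for illustration, not Hadoop APIs. Lowercasing the words is also an added choice.

```python
# Single-process word count in the MapReduce style (not the Hadoop API).
from collections import defaultdict
from typing import Iterable, Iterator

def map_fn(name: str, document: str) -> Iterator[tuple[str, int]]:
    """Emit (word, 1) for every whitespace-separated word in the document."""
    for word in document.split():
        yield word.lower(), 1  # lowercasing is an assumption of this sketch

def reduce_fn(word: str, partial_counts: Iterable[int]) -> tuple[str, int]:
    """Sum the partial counts collected for one word."""
    return word, sum(partial_counts)

def run_job(documents: dict[str, str]) -> dict[str, int]:
    """Hypothetical driver: map, group by key (shuffle), then reduce."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for name, text in documents.items():
        for word, count in map_fn(name, text):
            grouped[word].append(count)
    return dict(reduce_fn(w, counts) for w, counts in grouped.items())

docs = {"a.txt": "the elephant in the room", "b.txt": "The elephant"}
print(run_job(docs))
```

Only `map_fn` and `reduce_fn` correspond to code a job author writes; the grouping inside `run_job` is the part the framework handles.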
What didn’t I worry about?
• Data distribution
• Node management
• Concurrency
• Error handling
• Node failure
• Load balancing
• Data replication/integrity
Demo
Try the demo yourself!
Go to:
https://github.com/cacois/vagrant-hadoop-cluster
Follow the instructions in the README

Editor's Notes

  1. Note: This study was from 2007. I don’t know if there’s a Moore’s Law for the growth of data on the internet, but I expect this is a much larger number now.
  2. This is not a supercomputer, and it’s not intended to be. Google’s approach was always to use a lot of cheap, expendable commodity servers, rather than be beholden to expensive, custom hardware and vendors. What they knew was software, so they leaned on that expertise to produce a solution.