Apache Hadoop
The elephant in the room
C. Aaron Cois, Ph.D.
Me
@aaroncois
www.codehenge.net
Love to chat!
The Problem
Large-Scale Computation
• Traditionally, large computation was
focused on
– Complex, CPU-intensive calculations
– On relatively small data sets
• Examples:
– Calculate complex differential equations
– Calculate digits of Pi
Parallel Processing
• Distributed systems allow scalable
computation (more
processors, working simultaneously)
INPUT OUTPUT
Data Storage
• Data is often stored on a SAN
• Data is copied to each compute node
at compute time
• This works well for small amounts of
data, but requires significant copy
time for large data sets
SAN
Compute Nodes
Data
SAN
Calculating…
You must first distribute data
each time you run a
computation…
How much data?
over 25 PB of data
over 100 PB of data
The internet
IDC estimates[2] the internet contains at
least:
1 Zettabyte
or
1,000 Exabytes
or
1,000,000 Petabytes
2 http://www.emc.com/collateral/analyst-reports/expanding-digital-idc-white-paper.pdf (2007)
How much time?
Disk Transfer Rates:
• Standard 7200 RPM drive
128.75 MB/s
=> 7.7 secs/GB
=> 13 mins/100 GB
=> > 2 hours/TB
=> 90 days/PB
1 http://en.wikipedia.org/wiki/Hard_disk_drive#Data_transfer_rate
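The figures above can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming decimal units (1 GB = 1000 MB) and a constant sustained rate; the function and variable names are illustrative, not from the talk.

```python
# Time to stream data off one disk at a fixed sustained rate.
# Assumption: decimal units (1 GB = 1000 MB) and no seek or protocol overhead.
RATE_MB_PER_S = 128.75  # transfer rate quoted on the slide

SIZES_MB = {
    "1 GB": 1_000,
    "100 GB": 100_000,
    "1 TB": 1_000_000,
    "1 PB": 1_000_000_000,
}


def transfer_seconds(size_mb: float, rate_mb_per_s: float = RATE_MB_PER_S) -> float:
    """Seconds needed to read size_mb megabytes at rate_mb_per_s."""
    return size_mb / rate_mb_per_s


for label, size_mb in SIZES_MB.items():
    secs = transfer_seconds(size_mb)
    print(f"{label:>6}: {secs:,.0f} s | {secs / 60:,.1f} min | "
          f"{secs / 3600:,.1f} h | {secs / 86400:,.1f} days")
```

Running it gives roughly 7.8 seconds per GB and about 90 days per PB, in line with the slide.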
How much time?
Fastest network transfer rate:
• iSCSI over 100 Gb Ethernet (theoretical)
– 12.5 GB/s => 80 sec/TB, 1333 min/PB
Ok, ignore network bottleneck:
• Hypertransport Bus
– 51.2 GB/s => 19 sec/TB, 325 min/PB
1 http://en.wikipedia.org/wiki/List_of_device_bit_rates
We need a better plan
• Sending data to distributed processors is
the bottleneck
• So what if we sent the processors to the
data?
Core concept:
Pre-distribute and store the data.
Assign compute nodes to operate on local
data.
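The core concept can be sketched as a tiny scheduler that only hands a task to a node that already stores the data it needs. This is an illustrative toy, not Hadoop's scheduler: the block names, node names, and the "least-loaded holder" rule are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical layout: which nodes already hold a copy of each data block.
block_locations = {
    "block-1": ["node-a", "node-b"],
    "block-2": ["node-b", "node-c"],
    "block-3": ["node-a", "node-c"],
    "block-4": ["node-c"],
}


def assign_tasks(locations):
    """Assign each block's task to the least-loaded node that stores that block."""
    load = defaultdict(int)   # node name -> number of tasks assigned so far
    assignment = {}           # block id -> node chosen to process it
    for block, nodes in locations.items():
        chosen = min(nodes, key=lambda n: load[n])
        assignment[block] = chosen
        load[chosen] += 1
    return assignment


assignment = assign_tasks(block_locations)
for block, node in assignment.items():
    print(f"{block} is processed on {node}")
```

No block is copied anywhere at compute time; only the (small) task description moves to the node.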
The Solution
Distributed Data Servers
Distribute the Data
Send computation code to servers
containing relevant data
Hadoop Origin
• Hadoop was modeled after innovative
systems created by Google
• Designed to handle massive (web-scale) amounts of data
Fun Fact: Hadoop’s creator
named it after his son’s stuffed
elephant
Hadoop Goals
• Store massive data sets
• Enable distributed computation
• Heavy focus on
– Fault tolerance
– Data integrity
– Commodity hardware
Hadoop System
Google system → Hadoop counterpart
• GFS → HDFS
• MapReduce → Hadoop MapReduce
• BigTable → HBase
Hadoop
Components
HDFS
• “Hadoop Distributed File System”
• Sits on top of native filesystem
– ext3, etc
• Stores data in files, replicated and
distributed across data nodes
• Files are “write once”
• Performs best with millions of ~100MB+
files
HDFS
Files are split into blocks for storage
Datanodes
– Data blocks are distributed/replicated
across datanodes
Namenode
– The master node
– Keeps track of location of data blocks
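The division of labor between the namenode and the datanodes can be sketched in a few lines. This is a toy model under stated assumptions: the 4-byte block size, the replication factor of 2, the round-robin placement, and every function name here are illustrative choices, not HDFS's actual policy or API.

```python
BLOCK_SIZE = 4     # bytes; deliberately tiny (real HDFS blocks are far larger)
REPLICATION = 2    # copies kept of each block; an assumed value for this sketch


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Cut a file's bytes into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def store_file(name, data, datanodes, replication=REPLICATION):
    """Place each block on `replication` datanodes, round-robin.

    Returns namenode-style metadata: block id -> names of datanodes holding it.
    Assumes replication <= number of datanodes so the copies land on distinct nodes.
    """
    node_names = sorted(datanodes)
    block_map = {}
    for i, block in enumerate(split_into_blocks(data)):
        block_id = f"{name}#{i}"
        targets = [node_names[(i + r) % len(node_names)] for r in range(replication)]
        for node in targets:
            datanodes[node][block_id] = block
        block_map[block_id] = targets
    return block_map


def read_file(block_map, datanodes, failed=frozenset()):
    """Reassemble a file, reading each block from any replica on a live node."""
    parts = []
    for block_id, nodes in block_map.items():
        alive = next(n for n in nodes if n not in failed)
        parts.append(datanodes[alive][block_id])
    return b"".join(parts)


datanodes = {"dn1": {}, "dn2": {}, "dn3": {}}   # datanode name -> stored blocks
block_map = store_file("log.txt", b"hello hadoop!", datanodes)

print(block_map)
print(read_file(block_map, datanodes))
print(read_file(block_map, datanodes, failed={"dn1"}))  # still readable with dn1 down
```

The metadata (`block_map`) holds only locations, never file contents, which is why losing one datanode in this sketch does not lose the file.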
HDFS
Multi-Node Cluster
Master Slave
Name Node
Data Node   Data Node
MapReduce
A programming model
– Designed to make programming parallel
computation over large distributed data
sets easy
– Each node processes data already
residing on it (when possible)
– Inspired by functional programming map
and reduce functions
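For readers who have not met the functional originals, here is what plain map and reduce look like in Python on a single machine. The word list is made up for illustration; nothing here is Hadoop-specific.

```python
from functools import reduce

words = ["hadoop", "hdfs", "mapreduce", "hbase"]

# map: apply a function to every element independently (easy to run in parallel)
lengths = list(map(len, words))

# reduce: fold the mapped values into a single result
total = reduce(lambda acc, n: acc + n, lengths, 0)

print(lengths)
print(total)
```

Because each `map` call depends only on its own input element, the work can be spread over many nodes; only the `reduce` step needs to bring results together.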
MapReduce
JobTracker
– Runs on a master node
– Clients submit jobs to the JobTracker
– Assigns Map and Reduce tasks to slave
nodes
TaskTracker
– Runs on every slave node
– Daemon that instantiates Map or Reduce
tasks and reports results to JobTracker
MapReduce
Multi-Node Cluster
Master Slave
JobTracker
TaskTracker   TaskTracker
MapReduce Layer
HDFS Layer
Multi-Node Cluster
Master Slave
NameNode
DataNode   DataNode
JobTracker
TaskTracker TaskTracker
HBase
• Hadoop’s Database
• Sits on top of HDFS
• Provides random read/write access to
Very Large™ tables
– Billions of rows, billions of columns
• Access via
Java, Jython, Groovy, Scala, or REST
web service
A Typical Hadoop Cluster
• Consists entirely of commodity ~$5k
servers
• 1 master, 1 -> 1000+ slaves
• Scales linearly as more processing
nodes are added
How it works
http://en.wikipedia.org/wiki/MapReduce
Traditional MapReduce
Hadoop MapReduce
Image Credit: http://www.drdobbs.com/database/hadoop-the-lay-of-the-land/240150854
MapReduce Example
function map(Str name, Str document):
    for each word w in document:
        increment_count(w, 1)
function reduce(Str word, Iter partialCounts):
    sum = 0
    for each pc in partialCounts:
        sum += ParseInt(pc)
    return (word, sum)
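The pseudocode above can be tried out on one machine with a runnable Python version. This is a sketch only: `run_job` is a hypothetical single-process stand-in for the framework's map, group-by-key, and reduce phases, and the two sample documents are invented for the example.

```python
from collections import defaultdict


def map_fn(name, document):
    """Emit a (word, 1) pair for every word in the document."""
    for word in document.split():
        yield word.lower(), 1


def reduce_fn(word, partial_counts):
    """Sum all partial counts emitted for one word."""
    return word, sum(partial_counts)


def run_job(documents):
    """Single-process stand-in for the framework: map, group by key, reduce."""
    grouped = defaultdict(list)
    for name, text in documents.items():
        for word, count in map_fn(name, text):
            grouped[word].append(count)
    return dict(reduce_fn(word, counts) for word, counts in grouped.items())


docs = {"a.txt": "the elephant in the room", "b.txt": "The room"}
counts = run_job(docs)
print(counts)
```

Only `map_fn` and `reduce_fn` are the programmer's job; on a real cluster the role played here by `run_job` is handled by the framework.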
What didn’t I worry about?
• Data distribution
• Node management
• Concurrency
• Error handling
• Node failure
• Load balancing
• Data replication/integrity
Demo
Try the demo yourself!
Go to:
https://github.com/cacois/vagrant-hadoop-cluster
Follow the instructions in the README


Editor's Notes

  1. Note: This study was from 2007. I don’t know if there’s a Moore’s Law of growth of data on the internet, but I expect this is a much larger number now.
  2. This is not a supercomputer, and it's not intended to be. Google’s approach was always to use a lot of cheap, expendable commodity servers, rather than be beholden to expensive, custom hardware and vendors. What they knew was software, so they leaned on that expertise to produce a solution.