© Copyright 2013 Pivotal. All rights reserved.
Hadoop: A Foundation for Change
Milind Bhandarkar
Chief Scientist, Pivotal
Twitter: @techmilind
About Me
• http://www.linkedin.com/in/milindb
• Founding member of Hadoop team at Yahoo! [2005-2010]
• Contributor to Apache Hadoop since v0.1
• Built and led Grid Solutions Team at Yahoo! [2007-2010]
• Parallel Programming Paradigms [1989-today] (PhD cs.illinois.edu)
• Center for Development of Advanced Computing (C-DAC), National Center for Supercomputing Applications (NCSA), Center for Simulation of Advanced Rockets, Siebel Systems, Pathscale Inc. (acquired by QLogic), Yahoo!, LinkedIn, and Pivotal (formerly EMC-Greenplum)
First, technology is good. Then it gets bad. Then it gets stable.
- Alistair Croll
(http://strata.oreilly.com/2013/01/data-warefare.html)
History (2003-2010)
Google Papers
Yahoo! Search
+
=
W-1-W
• WebMap: Graph processing for WWW
• Dreadnaught: Infrastructure for WebMap
• Juggernaut: Infrastructure for W-1-W
• JFS, JMR, Condor: Abandoned for Hadoop
Lucene, Nutch
Kryptonite
Lessons Learned
• Multi-Tenancy from ground-up
• Agility in lieu of Performance
• Provisioning vs Procurement
• “Weird” use cases as learning experience
• Academic collaboration
(From Hadoop Summit 2010)
Who Uses Hadoop ?
http://www.forbes.com/sites/davefeinleib/2012/06/19/the-big-data-landscape/
Big Data Landscape (June 2012)
http://www.datameer.com/blog/perspectives/hadoop-ecosystem-as-of-january-2013-now-an-app.html
Hadoop Ecosystem (January 2013)
Hadoop Economics is Game Changer
[Chart: Big Data Platform Price/TB, 2008-2013; y-axis $0 to $80,000; series: Big Data DB, Hadoop]
“Typical” Hadoop Use-Case
• “User” Modeling
• Objective: Determine User-Interests by mining user activities
• Large dimensionality of possible user activities
• Typical user has sparse activity vector
• Event attributes change over time
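A minimal sketch of the sparse activity vector mentioned above, not code from the talk. The (user_id, activity) event shape, the function names `build_sparse_vectors` and `density`, and the sample activity labels are all assumptions made for illustration.

```python
from collections import defaultdict


def build_sparse_vectors(events):
    """Turn (user_id, activity) pairs into {user_id: {activity: count}}.

    Only activities a user actually performed are stored, so a user who
    touches a handful of activities out of millions stays small in memory.
    """
    vectors = defaultdict(dict)
    for user_id, activity in events:
        vectors[user_id][activity] = vectors[user_id].get(activity, 0) + 1
    return dict(vectors)


def density(vector, dimensionality):
    """Fraction of the possible activity dimensions that are non-zero."""
    return len(vector) / dimensionality


# Hypothetical sample events (labels are illustrative only).
events = [
    ("u1", "click:ad42"),
    ("u1", "click:ad42"),
    ("u1", "purchase:sku7"),
    ("u2", "search:hadoop"),
]
vectors = build_sparse_vectors(events)
```

Storing only the non-zero entries is one common way to cope with a very large activity space; other encodings (for example hashed feature indices) would serve the same purpose.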
Domain: Retail
• User = Customer
• Activities
– Online: Purchase, Ad click, FB Likes
– Offline: Brick-and-mortar purchases, returns, coupon clipping, gift cards
• Personalized Product Recommendation
Domain: IT Infrastructure
• “User” = HW & SW Components
• Activities
– Log messages, Metrics, connectivity, communication events
• Goal: Proactive alerting of imminent failures
Domain: Healthcare
• User = Patient
• Activities
– Doctor Visits, Medicine refills, Medical History
– 3G/WiFi-enabled Pillbox...
• Goal: Prevent Hospital Readmissions
Domain: Telecom
• User = Subscriber
• Activities
– Calls made, duration, calls dropped, locations, ...
– “social” graph, status updates
• Goal: Reduce customer churn
Domain: Ad-Supported Web
• User = User :-)
• Activities
– Clicks on content, Likes, Repost
– Search Queries, Comments, Participation
• Goal: Increase Engagement, Increase Clicks on revenue-generating content (ads/premium content)
User-Modeling Pipeline
• Sessionization
• Feature and Target Generation
• Model Training
• Offline Scoring & Evaluation
• Batch Scoring & Upload to serving
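As a rough sketch of the first pipeline stage, sessionization can be viewed as ordering each user's events by time and starting a new session after a period of inactivity. The 30-minute gap, the (user_id, timestamp, activity) event shape, and the function name `sessionize` are assumptions for illustration, not details from the talk.

```python
from itertools import groupby
from operator import itemgetter

# Assumed inactivity threshold; the talk does not specify one.
SESSION_GAP_SECONDS = 30 * 60


def sessionize(events, gap=SESSION_GAP_SECONDS):
    """Group (user_id, timestamp_seconds, activity) events into sessions.

    Returns {user_id: [[(timestamp, activity), ...], ...]}, where a new
    session starts whenever two consecutive events of the same user are
    more than `gap` seconds apart.
    """
    sessions = {}
    # Sort by user, then by time, so groupby sees each user contiguously.
    ordered = sorted(events, key=itemgetter(0, 1))
    for user_id, user_events in groupby(ordered, key=itemgetter(0)):
        user_sessions = []
        last_ts = None
        for _, ts, activity in user_events:
            if last_ts is None or ts - last_ts > gap:
                user_sessions.append([])  # open a new session
            user_sessions[-1].append((ts, activity))
            last_ts = ts
        sessions[user_id] = user_sessions
    return sessions


# Hypothetical sample events.
events = [
    ("u1", 0, "view"),
    ("u1", 100, "click"),
    ("u1", 5000, "purchase"),
    ("u2", 50, "search"),
]
result = sessionize(events)
```

The sort-then-group shape mirrors how a MapReduce job would key events by user and order them by time before the reduce step; at Hadoop scale the same logic would typically run as a distributed job rather than in-memory Python.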
What’s Next ?
Trough of Disillusionment ?
Or, Hadoop Everywhere ?
Storage Wars
• HDFS
• KosmosFS, LocalFS, Quantcast FS, S3
• MapR
• GPFS, Isilon, Atmos, Swift, NetApp
• Lustre, Gluster, Ceph, PanFS, PVFS
• EMC ViPR
NoSQL = Not Yet SQL ?
• Pivotal HAWQ
• Cloudera Impala
• Apache Drill, Spire (Drawn to Scale)
• Cascading Lingual, Optiq
• Hortonworks Stinger
• More to come...
Prepare for Convergence
• HPC: Cache Coherence, Prefetching, Zero-copy, Low-contention locks
• “Big Data”: Caching, Mirroring, Sharding (various flavors), relaxed consistency
• Databases: Indexing, MVCC, Columnar storage/processing, Cost-based optimization
Convergence
• Resource Allocation, Scheduling, Lifecycle Management
• Compute, Storage, and Communication isolation, Multi-tenancy, Performance SLAs
• Authentication & Authorization, Data/System Provisioning and Management, Monitoring, Metadata Management, Metering
Hadoop As A Service
• Hadoop Platform-As-A-Service
– EMR competitor proliferation
– OpenStack, CloudStack, Joyent...
• Application-As-A-Service (Hadoop Inside)
– Cetas, Continuuity, Causata, Claritics, Tresata, Wibidata, …
• Pivotal One
– CloudFoundry, Hadoop, HAWQ, Analytics
– Spring, Redis, RabbitMQ
New Hardware Platforms
• Mellanox - Hadoop Acceleration through Network Levitated Merge
• RoCE - Brocade, Cisco, Extreme, Arista...
• ARM - Low power Hadoop servers
• SSD - Velobit, Violin, FusionIO, Samsung...
• Niche - Compression, Encryption…
IAAS as the new Hardware
• AWS, GCE, Azure
• vSphere, OpenStack
• Easy Provisioning
• Scalable
• Elastic
• Ubiquitous
• Needs bundling with Data & Analytics as Services
Big Data Platform of Future ?
[Diagram: deploy to Public Cloud, Private Cloud, On Premise]
Questions ?
A NEW PLATFORM FOR A NEW ERA

BigDataCloud meetup - July 8th - Cost effective big-data processing using Ama...
 

Hadoop: A Foundation for Change - Milind Bhandarkar, Chief Scientist, Pivotal

  • 1. © Copyright 2013 Pivotal. All rights reserved. Hadoop: A Foundation for Change. Milind Bhandarkar, Chief Scientist, Pivotal. Twitter: @techmilind
  • 2. About Me
      • http://www.linkedin.com/in/milindb
      • Founding member of Hadoop team at Yahoo! [2005-2010]
      • Contributor to Apache Hadoop since v0.1
      • Built and led Grid Solutions Team at Yahoo! [2007-2010]
      • Parallel Programming Paradigms [1989-today] (PhD cs.illinois.edu)
      • Center for Development of Advanced Computing (C-DAC), National Center for Supercomputing Applications (NCSA), Center for Simulation of Advanced Rockets, Siebel Systems, Pathscale Inc. (acquired by QLogic), Yahoo!, LinkedIn, and Pivotal (formerly EMC-Greenplum)
  • 3. "First, technology is good. Then it gets bad. Then it gets stable." - Alistair Croll (http://strata.oreilly.com/2013/01/data-warefare.html)
  • 4. History (2003-2010)
  • 5. Google Papers
  • 6. Yahoo! Search + =
  • 7. W-1-W
      • WebMap: Graph processing for WWW
      • Dreadnaught: Infrastructure for WebMap
      • Juggernaut: Infrastructure for W-1-W
      • JFS, JMR, Condor: Abandoned for Hadoop
  • 8. Lucene, Nutch
  • 9. Kryptonite
  • 10. Lessons Learned
      • Multi-Tenancy from ground-up
      • Agility in lieu of Performance
      • Provisioning vs Procurement
      • “Weird” use cases as learning experience
      • Academic collaboration
  • 11. Who Uses Hadoop? (From Hadoop Summit 2010)
  • 12. Big Data Landscape (June 2012). http://www.forbes.com/sites/davefeinleib/2012/06/19/the-big-data-landscape/
  • 13. Hadoop Ecosystem (January 2013). http://www.datameer.com/blog/perspectives/hadoop-ecosystem-as-of-january-2013-now-an-app.html
  • 17. Hadoop Economics Is a Game Changer. [Chart: Big Data Platform Price/TB, 2008-2013; vertical axis $0 to $80,000; series: Big Data DB, Hadoop]
  • 18. “Typical” Hadoop Use-Case
      • “User” Modeling
      • Objective: Determine User-Interests by mining user activities
      • Large dimensionality of possible user activities
      • Typical user has sparse activity vector
      • Event attributes change over time
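The "sparse activity vector" idea can be illustrated with a minimal sketch. It assumes each event is a hypothetical `(user_id, activity)` pair and stores only the activities a user actually performed, so the large space of possible activities never has to be materialized. The function name and the sample activity labels are illustrative, not taken from the slides.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def activity_vectors(events: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Build one sparse count vector per user from (user_id, activity) pairs."""
    vectors: Dict[str, Counter] = {}
    for user_id, activity in events:
        # Only observed activities get an entry; everything else is implicitly 0.
        vectors.setdefault(user_id, Counter())[activity] += 1
    return vectors


# Hypothetical sample events for illustration only.
sample = [("u1", "click:ad42"), ("u1", "click:ad42"), ("u2", "search:hadoop")]
vectors = activity_vectors(sample)
```

Looking up an activity a user never performed returns 0 without adding an entry, which keeps each vector as small as that user's observed behavior.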
  • 19. Domain: Retail
      • User = Customer
      • Activities
          – Online: Purchase, Ad click, FB Likes
          – Offline: Brick-and-mortar purchases, returns, coupon clipping, gift cards
      • Personalized Product Recommendation
  • 20. Domain: IT Infrastructure
      • “User” = HW & SW Components
      • Activities
          – Log messages, Metrics, connectivity, communication events
      • Goal: Proactive alerting of imminent failures
  • 21. Domain: Healthcare
      • User = Patient
      • Activities
          – Doctor Visits, Medicine refills, Medical History
          – 3G/WiFi-enabled Pillbox...
      • Goal: Prevent Hospital Readmissions
  • 22. Domain: Telecom
      • User: Subscriber
      • Activities
          – Calls made, duration, calls dropped, locations, ...
          – “social” graph, status updates
      • Goal: Reduce customer churn
  • 23. Domain: Ad-Supported Web
      • User = User :-)
      • Activities
          – Clicks on content, Likes, Repost
          – Search Queries, Comments, Participation
      • Goal: Increase Engagement, Increase Clicks on revenue-generating content (ads/premium content)
  • 24. User-Modeling Pipeline
      • Sessionization
      • Feature and Target Generation
      • Model Training
      • Offline Scoring & Evaluation
      • Batch Scoring & Upload to serving
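The first pipeline stage, sessionization, can be sketched in a few lines. This is a single-machine illustration under stated assumptions, not the speaker's implementation: events are hypothetical `(user_id, unix_timestamp, activity)` tuples, and a session ends after 30 minutes of inactivity (a threshold the slides do not specify). On Hadoop the same grouping would typically be done by keying on the user in the shuffle and splitting each user's sorted events in the reducer.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed inactivity threshold in seconds; the slides do not give a value.
SESSION_GAP_SECONDS = 30 * 60

Event = Tuple[str, int, str]  # (user_id, unix_timestamp, activity)


def sessionize(events: List[Event],
               gap: int = SESSION_GAP_SECONDS) -> Dict[str, List[List[Event]]]:
    """Group events per user, then split each user's timeline into sessions."""
    by_user: Dict[str, List[Event]] = defaultdict(list)
    for event in events:
        by_user[event[0]].append(event)

    sessions: Dict[str, List[List[Event]]] = {}
    for user, user_events in by_user.items():
        user_events.sort(key=lambda e: e[1])  # order by timestamp
        user_sessions: List[List[Event]] = []
        for event in user_events:
            # Start a new session on the first event or after a long gap.
            if not user_sessions or event[1] - user_sessions[-1][-1][1] > gap:
                user_sessions.append([])
            user_sessions[-1].append(event)
        sessions[user] = user_sessions
    return sessions
```

Each resulting session is the unit from which the later stages (feature and target generation, model training, scoring) would draw their inputs.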
  • 25. What’s Next?
  • 26. Trough of Disillusionment?
  • 27. Or, Hadoop Everywhere?
  • 28. Storage Wars
      • HDFS
      • KosmosFS, LocalFS, Quantcast FS, S3
      • MapR
      • GPFS, Isilon, Atmos, Swift, NetApp
      • Lustre, Gluster, Ceph, PanFS, PVFS
      • EMC ViPR
  • 29. NoSQL = Not Yet SQL?
      • Pivotal HAWQ
      • Cloudera Impala
      • Apache Drill, Spire (Drawn to Scale)
      • Cascading Lingual, Optiq
      • Hortonworks Stinger
      • More to come....
  • 30. Prepare for Convergence
      • HPC: Cache Coherence, Prefetching, Zero-copy, Low-contention locks
      • “Big Data”: Caching, Mirroring, Sharding (various flavors), relaxed consistency
      • Databases: Indexing, MVCC, Columnar storage/processing, Cost-based optimization
  • 31. Convergence
      • Resource Allocation, Scheduling, Lifecycle Management
      • Compute, Storage, and Communication isolation, Multi-tenancy, Performance SLAs
      • Authentication & Authorization, Data/System Provisioning and Management, Monitoring, Metadata Management, Metering
  • 32. Hadoop As A Service
      • Hadoop Platform-As-A-Service
          – EMR competitor proliferation
          – OpenStack, CloudStack, Joyent...
      • Application-As-A-Service (Hadoop Inside)
          – Cetas, Continuuity, Causata, Claritics, Tresata, Wibidata,…
      • Pivotal One
          – CloudFoundry, Hadoop, HAWQ, Analytics
          – Spring, Redis, RabbitMQ
  • 33. New Hardware Platforms
      • Mellanox - Hadoop Acceleration through Network Levitated Merge
      • RoCE - Brocade, Cisco, Extreme, Arista...
      • ARM - Low power Hadoop servers
      • SSD - Velobit, Violin, FusionIO, Samsung...
      • Niche - Compression, Encryption…
  • 34. IAAS as the new Hardware
      • AWS, GCE, Azure
      • vSphere, OpenStack
      • Easy Provisioning
      • Scalable
      • Elastic
      • Ubiquitous
      • Needs bundling with Data & Analytics as Services
  • 35. Big Data Platform of Future? [Diagram: deploy to Public Cloud, Private Cloud, On Premise]
  • 36. Questions?
  • 37. A NEW PLATFORM FOR A NEW ERA