• Share
  • Email
  • Embed
  • Like
  • Save
  • Private Content
Hadoop : A Foundation for Change - Milind Bhandarkar Chief Scientist, Pivotal
 

Hadoop : A Foundation for Change - Milind Bhandarkar Chief Scientist, Pivotal

on

  • 662 views

As adoption of Hadoop across enterprises has skyrocketed, a variety of business use cases have emerged. In this talk, Milind would highlight a few use cases, and talk about emerging use cases that are ...

As adoption of Hadoop across enterprises has skyrocketed, a variety of business use cases have emerged. In this talk, Milind would highlight a few use cases, and talk about emerging use cases that are shaping the future of the Hadoop platform.

Statistics

Views

Total Views
662
Views on SlideShare
645
Embed Views
17

Actions

Likes
8
Downloads
0
Comments
0

3 Embeds 17

http://www.linkedin.com 11
https://www.linkedin.com 5
https://twitter.com 1

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

    Hadoop : A Foundation for Change - Milind Bhandarkar Chief Scientist, Pivotal Hadoop : A Foundation for Change - Milind Bhandarkar Chief Scientist, Pivotal Presentation Transcript

    • 1© Copyright 2013 Pivotal. All rights reserved. 1© Copyright 2013 Pivotal. All rights reserved.Hadoop: AFoundation forChangeMilind BhandarkarChief Scientist, PivotalTwitter: @techmilind
    • 2© Copyright 2013 Pivotal. All rights reserved.About Me http://www.linkedin.com/in/milindb Founding member of Hadoop team at Yahoo! [2005-2010] Contributor to Apache Hadoop since v0.1 Built and led Grid Solutions Team at Yahoo! [2007-2010] Parallel Programming Paradigms [1989-today] (PhD cs.illinois.edu) Center for Development of Advanced Computing (C-DAC), NationalCenter for Supercomputing Applications (NCSA), Center for Simulation ofAdvanced Rockets, Siebel Systems, Pathscale Inc. (acquired by QLogic),Yahoo!, LinkedIn, and Pivotal (formerly EMC-Greenplum)
    • 3© Copyright 2013 Pivotal. All rights reserved.First, technology is good. Then it getsbad. Then it gets stable.- Alistair Croll(http://strata.oreilly.com/2013/01/data-warefare.html)
    • 4© Copyright 2013 Pivotal. All rights reserved.History (2003-2010)
    • 5© Copyright 2013 Pivotal. All rights reserved.Google Papers
    • 6© Copyright 2013 Pivotal. All rights reserved.Yahoo! Search+=
    • 7© Copyright 2013 Pivotal. All rights reserved.W-1-W WebMap : Graph processing for WWW Dreadnaught: Infrastructure for WebMap Juggernaut: Infrastructure for W-1-W JFS, JMR, Condor: Abandoned for Hadoop
    • 8© Copyright 2013 Pivotal. All rights reserved.Lucene, Nutch
    • 9© Copyright 2013 Pivotal. All rights reserved.Kryptonite
    • 10© Copyright 2013 Pivotal. All rights reserved.Lessons Learned Multi-Tenancy from ground-up Agility in lieu of Performance Provisioning vs Procurement “Weird” use cases as learning experience Academic collaboration
    • 11© Copyright 2013 Pivotal. All rights reserved.(From Hadoop Summit 2010)Who Uses Hadoop ?
    • 12© Copyright 2013 Pivotal. All rights reserved.http://www.forbes.com/sites/davefeinleib/2012/06/19/the-big-data-landscape/Big Data Landscape (June 2012)
    • 13© Copyright 2013 Pivotal. All rights reserved.http://www.datameer.com/blog/perspectives/hadoop-ecosystem-as-of-january-2013-now-an-app.htmlHadoop Ecosystem (January 2013)
    • 14© Copyright 2013 Pivotal. All rights reserved.
    • 15© Copyright 2013 Pivotal. All rights reserved.
    • 16© Copyright 2013 Pivotal. All rights reserved.
    • 17© Copyright 2013 Pivotal. All rights reserved.Hadoop Economics is Game Changer$-$20,000$40,000$60,000$80,0002008 2009 2010 2011 2012 2013Big Data Platform Price/TBBig Data DB Hadoop
    • 18© Copyright 2013 Pivotal. All rights reserved.“Typical” Hadoop Use-Case “User” Modeling Objective: Determine User-Interests by mining user-activities Large dimensionality of possible user activities Typical user has sparse activity vector Event attributes change over time
    • 19© Copyright 2013 Pivotal. All rights reserved.Domain: Retail User = Customer Activities– Online: Purchase, Ad click, FB Likes– Offline : Brick-and-mortar purchases, returns, coupon clipping,gift cards Personalized Product Recommendation
    • 20© Copyright 2013 Pivotal. All rights reserved.Domain: IT Infrastructure “User” = HW & SW Components Activities– Log messages, Metrics, connectivity, communication events Goal: Proactive alerting of imminent failures
    • 21© Copyright 2013 Pivotal. All rights reserved.Domain: Healthcare User = Patient Activities– Doctor Visits, Medicine refills, Medical History– 3G/WiFi-enabled Pillbox... Goal: Prevent Hospital Readmissions
    • 22© Copyright 2013 Pivotal. All rights reserved.Domain: Telecom User: Subscriber Activities– Calls made, duration, calls dropped, locations, ...– “social” graph, status updates Goal: Reduce customer churn
    • 23© Copyright 2013 Pivotal. All rights reserved.Domain: Ad-Supported Web User = User :-) Activities– Clicks on content, Likes, Repost– Search Queries, Comments, Participation Goal: Increase Engagement, Increase Clicks onrevenue-generating content (ads/premium content)
    • 24© Copyright 2013 Pivotal. All rights reserved.User-Modeling Pipeline Sessionization Feature and Target Generation Model Training Offline Scoring & Evaluation Batch Scoring & Upload to serving
    • 25© Copyright 2013 Pivotal. All rights reserved.What’s Next ?
    • 26© Copyright 2013 Pivotal. All rights reserved.Trough of Disillusionment ?
    • 27© Copyright 2013 Pivotal. All rights reserved.Or, Hadoop Everywhere ?
    • 28© Copyright 2013 Pivotal. All rights reserved.Storage Wars HDFS KosmosFS, LocalFS, Quantcast FS, S3 MapR GPFS, Isilon, Atmos, Swift, NetApp Lustre, Gluster, Ceph, PanFS, PVFS EMC ViPR
    • 29© Copyright 2013 Pivotal. All rights reserved.NoSQL = Not Yet SQL ? Pivotal HAWQ Cloudera Impala Apache Drill, Spire (Drawn to Scale) Cascading Lingual, Optiq Hortonworks Stinger More to come....
    • 30© Copyright 2013 Pivotal. All rights reserved.Prepare for Convergence HPC: Cache Coherence, Prefetching, Zero-copy, Low-contention locks “Big Data”: Caching, Mirroring, Sharding (variousflavors), relaxed consistency Databases: Indexing, MVCC, Columnarstorage/processing, Cost-based optimization
    • 31© Copyright 2013 Pivotal. All rights reserved.Convergence Resource Allocation, Scheduling, LifecycleManagement Compute, Storage, and Communication isolation, Multi-tenancy, Performance SLAs Auth & Auth, Data/System Provisioning andManagement, Monitoring, Metadata Management,Metering
    • 32© Copyright 2013 Pivotal. All rights reserved.Hadoop As A Service Hadoop Platform-As-A-Service– EMR competitor proliferation– OpenStack, CloudStack, Joyent... Application-As-A-Service (Hadoop Inside)– Cetas, Continuuity, Causata, Claritics, Tresata, Wibidata,… Pivotal One– CloudFoundry, Hadoop, HAWQ, Analytics– Spring, Redis, RabbitMQ
    • 33© Copyright 2013 Pivotal. All rights reserved.New Hardware Platforms Mellanox - Hadoop Acceleration through NetworkLevitated Merge RoCE - Brocade, Cisco, Extreme, Arista... ARM - Low power Hadoop servers SSD - Velobit, Violin, FusionIO, Samsung.. Niche - Compression, Encryption…
    • 34© Copyright 2013 Pivotal. All rights reserved.IAAS as the new Hardware AWS, GCE, Azure vSphere, OpenStack Easy Provisioning Scalable Elastic Ubiquitous Needs bundling with Data & Analytics as Services
    • 35© Copyright 2013 Pivotal. All rights reserved.Big Data Platform of Future ?deployPublic CloudPrivate CloudOn Premise
    • 36© Copyright 2013 Pivotal. All rights reserved.Questions ?
    • A NEW PLATFORM FOR A NEW ERA