2013 Dec 9 Data Marketing 2013 - Hadoop
 

Data Marketing 2013 Presentation of Hadoop. The paradigm shift in 45 minutes or less. No, really.

    Presentation Transcript

    • Adam Muise – Solution Architect, Hortonworks. ELEPHANT AT THE DOOR: HADOOP AND NEXT GENERATION DATA
    • Who am I?
    • Who is Hortonworks?
    • We do Hadoop. 100% Open Source – Democratized Access to Data; The leaders of Hadoop’s development; Drive Innovation in the platform – We lead the roadmap; Community driven, Enterprise Focused
    • We do Hadoop successfully. Support, Training, Professional Services
    • We do Hadoop successfully everywhere.
    • We do Hadoop successfully, everywhere, with partners.
    • What is Hadoop? What is everyone talking about?
    • Data
    • “Big Data” is the marketing term of the decade in IT
    • What lurks behind the hype is the democratization of Data.
    • You need data.
    • But what do you do with your data now?
    • We are obsessive compulsive about collecting and structuring our data.
    • Put it away, delete it, tweet it, compress it, shred it, wikileak-it, put it in a database, put it in SAN/NAS, put it in the cloud, hide it in tape…
    • You need data. Your customers expect you to know what they want before they do.
    • Let’s talk challenges…
    • Volume (four slides; the word repeats more and more times until it fills the frame)
    • Storage, Management, Processing all become challenges with Data at Volume
    • Traditional technologies adopt a divide, drop, and conquer approach
    • The solution? [Diagram: Data blocks divided among OLTP, EDW, Analytical DB, Another EDW, and Yet Another EDW]
    • Ummm…you dropped something [Diagram: the same OLTP, EDW, Analytical DB, Another EDW, and Yet Another EDW, with many more Data blocks]
    • Analyzing the data usually raises more interesting questions…
    • …which leads to more data
    • Wait, you’ve seen this before. [Diagram: “Analytics Sausage Factory” among Data blocks]
    • Data begets Data.
    • What keeps us from our Data?
    • “Prices, Stupid passwords, and Boring Statistics.” - Hans Rosling http://www.youtube.com/watch?v=hVimVzgtD6w
    • Your data silos are lonely places. [Diagram: EDW, Accounts, Customers, Web Properties, each with its own Data blocks]
    • … Data likes to be together. [Diagram: EDW, Accounts, Customers, Web Properties]
    • Data likes to socialize too. [Diagram: EDW, Accounts, Customers, Web Properties, plus CDR, Machine Data, Facebook, Weather Data, Twitter]
    • New types of data don’t quite fit into your pristine view of the world. [Diagram: Logs and Machine Data, with question marks, next to “My Little Data Empire”]
    • To resolve this, some people take hints from Lord Of The Rings...
    • …and create One-Schema-To-Rule-Them-All… [Diagram: EDW with a single Schema]
    • …but that has its problems too. [Diagram: many ETL jobs around the EDW Schema]
    • Fragile workflows make supporting the analytical models you want expensive and time-consuming. [Diagram: many ETL jobs around the EDW Schema]
    • What do you want to do with data?
    • Marketing Analytics needs data. Work with the population, not just a sample.
    • Your segmentation today. Town/City; Middle Income Band; Female; Male; Age: 25-30; Product Category Preferences
    • Your segmentation with better data. GPS coordinates; Looking to start a business; Walking into Starbucks right now…; Spent 25 minutes looking at tea cozies; Unhappy with his cell phone plan; $65-68k per year; Pregnant; Tea Party; Hippie; A depressed Toronto Maple Leafs fan; Gene Expression for Risk Taker; Male; Female; Age: 27 but feels old; Product recommendations; Thinking about a new house; Products left in basket indicate drunk Amazon shopper
    • Pick up all of that data that was prohibitively expensive to store and use.
    • Why do viewer surveys…
    • …when raw data can tell you what button on the remote was pressed during what commercial for the entire viewer population?
    • To approach these use cases you need an affordable platform that stores, processes, and analyzes the data.
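The remote-control example above can be made concrete with a small sketch. It assumes a hypothetical event format of (viewer, commercial, button) tuples; the field names and sample values are invented for illustration and do not come from the presentation. The point is only that, with the raw events in hand, the question becomes a simple count over every viewer rather than a survey estimate.

```python
from collections import Counter

# Hypothetical raw set-top-box events: (viewer_id, commercial_id, button).
# These records are made up for illustration.
events = [
    ("v1", "ad_cola", "channel_up"),
    ("v2", "ad_cola", "mute"),
    ("v3", "ad_cola", "channel_up"),
    ("v1", "ad_truck", "volume_up"),
    ("v2", "ad_truck", "channel_up"),
]


def presses_by_commercial(events):
    """Count button presses per (commercial, button) pair across all viewers."""
    return Counter((commercial, button) for _, commercial, button in events)


counts = presses_by_commercial(events)
for (commercial, button), n in sorted(counts.items()):
    print(commercial, button, n)
```

On a real cluster the same count would run over billions of events spread across many machines, which is where the platform discussed in the following slides comes in.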
    • So what is the answer?
    • Enter the Hadoop. ……… http://www.fabulouslybroke.com/2011/05/ninja-elephants-and-other-awesome-stories/
    • Hadoop was created because traditional technologies never cut it for the Internet properties like Google, Yahoo, Facebook, Twitter, and LinkedIn
    • Traditional architecture didn’t scale enough… [Diagram: groups of App servers, each in front of DB servers and a SAN]
    • Databases can become bloated and useless
    • Traditional architectures cost too much at that volume… [Labels: $upercomputing, $/TB, $pecial Hardware]
    • So what is the answer?
    • If you could design a system that would handle this, what would it look like?
    • It would probably need a highly resilient, self-healing, cost-efficient, distributed file system… [Diagram: grid of Storage nodes]
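One way to picture "resilient" and "self-healing" storage is block replication: every block lives on several nodes, and when a node dies the missing copies are recreated elsewhere. The sketch below is a toy model under that assumption, not HDFS's actual placement or recovery logic; the function names, node names, and the replication factor of 3 are all chosen for illustration.

```python
import random

REPLICATION = 3  # assumed replication factor for this sketch


def place_blocks(blocks, nodes, replication=REPLICATION, seed=0):
    """Assign each block to `replication` distinct nodes (toy placement)."""
    rng = random.Random(seed)
    return {block: set(rng.sample(nodes, replication)) for block in blocks}


def heal(placement, live_nodes, replication=REPLICATION):
    """Drop replicas on dead nodes and re-replicate onto live ones."""
    healed = {}
    for block, holders in placement.items():
        alive = holders & set(live_nodes)
        if not alive:
            raise RuntimeError(f"block {block} lost all replicas")
        # Candidate targets: live nodes that do not already hold the block.
        spare = [node for node in live_nodes if node not in alive]
        needed = min(replication - len(alive), len(spare))
        healed[block] = alive | set(spare[:needed])
    return healed


nodes = [f"node{i}" for i in range(6)]
placement = place_blocks(["blk_a", "blk_b", "blk_c"], nodes)

live = [node for node in nodes if node != "node2"]  # simulate one node failing
placement = heal(placement, live)
```

After `heal` runs, every block is back to three replicas and none of them sits on the failed node, with no operator involved.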
    • It would probably need a completely parallel processing framework that took tasks to the data… [Diagram: Processing placed on top of each Storage node]
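"Taking tasks to the data" can be sketched as a map step that runs against each node's local shard and a reduce step that merges the small per-node results. The example below is a single-process illustration in the spirit of MapReduce word counting; the shard contents and node names are invented, and a thread pool merely stands in for the cluster's workers. It is not Hadoop's API.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical cluster: each node holds its own shard of the input.
shards = {
    "node0": ["hadoop stores data", "data begets data"],
    "node1": ["yarn runs apps", "hadoop runs yarn"],
}


def map_task(lines):
    """Runs 'on the node': counts words in the local shard only."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def reduce_task(partials):
    """Merges the small per-node results into a global count."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# One map task per shard, run in parallel; only the compact partial
# counts travel to the reducer, never the raw lines.
with ThreadPoolExecutor() as pool:
    partials = list(pool.map(map_task, shards.values()))

word_counts = reduce_task(partials)
print(word_counts["data"], word_counts["hadoop"])
```

The design choice worth noticing is that the raw lines never leave their shard; adding nodes adds both storage and processing capacity at the same time.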
    • It would probably run on commodity hardware, virtualized machines, and common OS platforms [Diagram: Processing placed on top of each Storage node]
    • It would probably be open source so innovation could happen as quickly as possible
    • It would need a critical mass of users
    • Hadoop 2 just hit the ground: Introducing YARN
    • YARN lets you run more data apps than ever before [Diagram: MapReduce V2, MapReduce V?, STORM, Giraph, Tez, MPI, HBase, … and more, on YARN and HDFS2]
    • YARN turns Hadoop into a smart phone: An App Ecosystem. hortonworks.com/yarn/
    • YARN: Yeah, we did that too. hortonworks.com/yarn/
    • Apache Hadoop: Storm, HDFS, YARN, Pig, MapReduce, HCatalog, Hive, HBase, Ambari, Sqoop, Falcon, Flume
    • Hortonworks Data Platform: Storm, Pig, HDFS, YARN, MapReduce, HCatalog, Hive, HBase, Ambari, Sqoop, Falcon, Flume
    • What else are we working on? hortonworks.com/labs/
    • Hadoop is the new Data Operating System for the Enterprise
    • There is NO second place. Hortonworks …the Bull Elephant of Hadoop Innovation. © Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATION