Hadoop Case Studies

Published in: Technology, Business

  1. 2010/5/20, OSS Laboratories Inc. http://www.ossl.co.jp Mail: funai@ossl.co.jp Twitter: http://twitter.com/satoruf LinkedIn: http://jp.linkedin.com/in/satorufunai/ja Copyright 2010(C) OSS Laboratories Inc. All Rights Reserved
  2. •  OSS •  Apache •  Google: GFS (Google File System) : HDFS (Hadoop Distributed File System); Google MapReduce : Hadoop MapReduce; Google Chubby : Hadoop ZooKeeper; DSL Google Sawzall : Hadoop Pig; Google BigTable : Hadoop HBase; Google ? : Hadoop Hive •  Yahoo!, Facebook, Amazon, China Mobile, VISA, JP Morgan Chase •  UFJ, NTT •  ACID (Atomic, Consistent, Isolated, Durable); BASE (Basically Available, Soft-State, Eventual Consistency)
  3. Apache Hadoop (diagram): ETL Tools, BI Reporting, RDBMS; Pig (Data Flow), Hive (SQL), Sqoop; Avro (Serialization); ZooKeeper (Coordination); MapReduce (Job Scheduling/Execution System; Streaming/Pipes APIs); HBase (key-value store); HDFS (Hadoop Distributed File System)
  4. HDFS: Hadoop Distributed File System. HDFS, 64MB blocks
  5. MapReduce: Distributed Processing / Map, Reduce
  6. Hadoop (diagram): Business Intelligence, Interactive Application, OLAP, Data Mart, OLTP, Data Store, Engineers; Hadoop: Storage and Batch Processing; ETL/Sqoop
  7. Hadoop •  2x Quad Core Nehalems •  24GB •  12 * 1TB SATA (JBOD , RAID ) •  1 Gigabit Ethernet •  HDFS: reserved for temp shuffle space, which leaves 9TB/node; 3-way replication leads to 3TB effective HDFS space/node; but assuming 7x compression that becomes ~20TB/node •  TB :2 5 /TB
  8. Yahoo! •  Hadoop •  25,000 82PB Hadoop •  4,000 64TB 16PB 32,000 •  500 •  Search Assist™ 26 20 •  1,500 1TB 62 •  3,700 1PB 16
  9. Facebook •  200 Hive DWH Hadoop •  +12TB/ •  135TB/ •  1,050 32TB 12.5PB 4,800 •  (Cluster diagram: Nodes with Disks; Node = DataNode + Map-Reduce; 1 Gigabit, 4 Gigabit)
  10. VISA •  2 Hadoop 340TB •  Hadoop #1 ~40TB / 42 nodes •  Hadoop #2 ~300TB / 28 nodes •  Hadoop ( ) •  ( ) Hadoop IP •  2 7 3000 36TB •  1 Hadoop 13
  11. •  5 •  CMCC CDR 1 5TB~9TB 2,000 1 300GB •  BC-PDM (Big Cloud based Parallel Data Mining) •  Hadoop HDFS Hyper-DFS Hadoop •  16 •  ETL 12 16 •  10 50 •  3 7 •  Hadoop 256 Hadoop
  12. JP Morgan Chase •  Hadoop •  PC (RDBMS) •  RDBMS SAN/NAS
  13. •  Hadoop •  4,000 2,000 GB •  GB x •  150%
  14. •  COOKPAD: 3.9 816 64 30 4 1 •  Amazon EC2 50 Hadoop •  http://business.nikkeibp.co.jp/article/tech/20100416/214016/
  15. http://www.cloudera.com/ •  Hadoop •  Cloudera: Mike Olson (Oracle, Sleepycat Software CEO), Christophe Bisciglia (Google), Dr. Amr Awadallah (Yahoo!, VivaSmart), Jeff Hammerbacher (Facebook) •  Cloudera: Diane Greene (VMware CEO), Mike Abbott (Palm), Caterina Fake (Flickr), Dr. Qi Lu (Microsoft, Yahoo!), Marten Mickos (MySQL CEO), Jeff Weiner (LinkedIn, Yahoo!), Gideon Yu (Facebook CFO, YouTube CFO) •  Yahoo! Facebook OpenPDC Codeplex Hadoop
  16. Pentaho + Hadoop •  2010/7 •  Hadoop BI Hive Hadoop DFS
  17. IBM InfoSphere BigInsights •  Apache Hadoop BigInsights Core Web BigSheets 2 •  BigSheets BigSheets BigInsights Core •  BigSheet
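
The storage figures on slides 4 and 7 can be checked with a little arithmetic. The sketch below is only an illustration: it reads the 64MB figure on slide 4 as the HDFS block size (an assumption, since the slide text around it is lost), and it derives the reserved shuffle space as the difference between the 12 x 1TB raw disks and the 9TB/node the slide says remains. The function names and the 1GB example file are hypothetical.

```python
import math

BLOCK_SIZE_MB = 64   # figure from slide 4, assumed here to be the HDFS block size
REPLICATION = 3      # 3-way replication, as stated on slide 7


def hdfs_blocks(file_size_mb: float, block_size_mb: int = BLOCK_SIZE_MB) -> int:
    """Number of blocks needed to store a file (the last block may be partial)."""
    return math.ceil(file_size_mb / block_size_mb)


def node_capacity_tb(raw_tb: float, usable_tb: float,
                     replication: int = REPLICATION,
                     compression: float = 7.0) -> dict:
    """Reproduce the per-node capacity estimate from slide 7.

    raw_tb      -- raw disk per node (12 x 1TB SATA on the slide)
    usable_tb   -- what is left after reserving temp shuffle space (9TB on the slide)
    compression -- assumed compression ratio (7x on the slide)
    """
    reserved = raw_tb - usable_tb          # space set aside for shuffle (derived)
    effective = usable_tb / replication    # unique data per node after replication
    logical = effective * compression      # uncompressed data that fits per node
    return {"reserved_tb": reserved,
            "effective_tb": effective,
            "logical_tb": logical}


if __name__ == "__main__":
    # Hypothetical 1GB file: 1024MB / 64MB per block
    print(hdfs_blocks(1024))               # 16
    print(node_capacity_tb(12, 9))
```

With the slide's inputs the last figure comes out at 21TB, which agrees with the "~20TB/node" quoted on slide 7.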

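
Slide 5 names the Map and Reduce steps without showing them. As a rough, single-process illustration of the programming model (not Hadoop's actual API, and not taken from the deck), the classic word-count example can be sketched as follows. All function names here are hypothetical, and the explicit `shuffle` step stands in for the grouping that a real framework performs between the two phases.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["hadoop hdfs", "Hadoop mapreduce"]))
```

In a real cluster the map calls run in parallel on the nodes holding the input blocks and the reduce calls run in parallel per key; this sketch only shows the data flow.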