HDFS: Principles and Implementation, 刘景龙
 
Why choose Hadoop?
Hadoop history:
Who uses Hadoop?
Current state of Baidu's Hadoop clusters
How Baidu uses Hadoop
What can HDFS do?
What is HDFS not suited for?
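HDFS: architecture. [Diagram: the Namenode holds the namespace metadata and journal (namespace, block map); Datanodes store the block replicas (b1 to b6) and scale I/O and storage horizontally; Datanodes send heartbeats and block reports to the Namenode.]
HDFS: Namenode data structures
HDFS: read/write flow. [Diagram: to read, a client (1) opens the file at the Namenode and (2) reads from the Datanodes; to write, a client (1) creates the file at the Namenode and (2) writes to a Datanode, and the write is passed on between Datanodes. The Namenode holds the namespace state and the block map; checksums are end-to-end. Datanodes store block replicas b1 to b6.]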
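The read/write flow in the diagram can be sketched as a toy in-memory model. This is only an illustration under stated assumptions, not the HDFS implementation: the class names, the round-robin choice of Datanodes, the tiny block size, and the use of one CRC32 per block are all hypothetical simplifications.

```python
import zlib


class Datanode:
    """Stores block bytes together with the checksum computed by the writer."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block_id -> (data, checksum)

    def write(self, block_id, data, checksum, pipeline):
        self.blocks[block_id] = (data, checksum)
        if pipeline:  # pass the write on to the next Datanode
            pipeline[0].write(block_id, data, checksum, pipeline[1:])

    def read(self, block_id):
        return self.blocks[block_id]


class Namenode:
    """Holds namespace state (file -> blocks) and the block map (block -> Datanodes)."""

    def __init__(self, datanodes, replication=3):
        # Assumption: replication <= len(datanodes), so targets are distinct.
        self.datanodes = datanodes
        self.replication = replication
        self.namespace = {}
        self.block_map = {}

    def create(self, path):
        self.namespace[path] = []

    def add_block(self, path):
        block_id = f"b{len(self.block_map) + 1}"
        start = len(self.block_map) % len(self.datanodes)
        targets = [self.datanodes[(start + i) % len(self.datanodes)]
                   for i in range(self.replication)]
        self.namespace[path].append(block_id)
        self.block_map[block_id] = targets
        return block_id, targets

    def open(self, path):
        return [(b, self.block_map[b]) for b in self.namespace[path]]


class Client:
    def __init__(self, namenode, block_size=4):
        self.namenode = namenode
        self.block_size = block_size

    def write_file(self, path, data):
        self.namenode.create(path)  # 1. create
        for off in range(0, len(data), self.block_size):
            chunk = data[off:off + self.block_size]
            block_id, targets = self.namenode.add_block(path)
            # 2. write: the checksum travels with the data through the pipeline
            targets[0].write(block_id, chunk, zlib.crc32(chunk), targets[1:])

    def read_file(self, path):
        out = b""
        for block_id, locations in self.namenode.open(path):  # 1. open
            data, checksum = locations[0].read(block_id)      # 2. read
            if zlib.crc32(data) != checksum:  # end-to-end verification
                raise IOError(f"checksum mismatch in {block_id}")
            out += data
        return out
```

The checksum is computed by the writing client and verified by the reading client, which is what makes it end-to-end: corruption on any Datanode in between is caught at read time.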
HDFS: replica placement
HDFS: fault tolerance. [Diagram: Datanodes periodically check block checksums; for a bad or lost block replica, (1) the Namenode issues replicate, (2) the block is copied between Datanodes, and (3) blockReceived is reported. The Namenode holds the namespace state and the block map; Datanodes store block replicas b1 to b6.]
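The three numbered steps in the diagram (replicate, copy, blockReceived) can be sketched as follows. This is a minimal model under assumptions: the class, the method names other than the `blockReceived` idea taken from the slide, and the rule for picking a source and target are hypothetical, and the actual byte copy is omitted.

```python
class Namenode:
    def __init__(self, replication=3):
        self.replication = replication
        self.block_map = {}  # block_id -> set of Datanode names with a good replica

    def block_received(self, datanode, block_id):
        """Step 3: a Datanode confirms it now stores the block."""
        self.block_map.setdefault(block_id, set()).add(datanode)

    def report_bad_replica(self, datanode, block_id):
        """A replica failed its checksum check or its Datanode was lost."""
        self.block_map.get(block_id, set()).discard(datanode)

    def replication_commands(self, live_datanodes):
        """Step 1: build (block_id, source, target) commands for under-replicated blocks."""
        commands = []
        for block_id, holders in sorted(self.block_map.items()):
            missing = self.replication - len(holders)
            if missing <= 0 or not holders:
                continue  # fully replicated, or no good copy left to copy from
            candidates = sorted(set(live_datanodes) - holders)
            source = sorted(holders)[0]
            for target in candidates[:missing]:
                commands.append((block_id, source, target))
        return commands


def heal(namenode, live_datanodes):
    for block_id, source, target in namenode.replication_commands(live_datanodes):
        # Step 2 (copy) would stream the block from source to target here.
        namenode.block_received(target, block_id)
```

For example, if `b1` lives on `dn1`, `dn2`, and `dn3` and `dn2` is lost, `replication_commands(["dn1", "dn3", "dn4"])` yields one command copying `b1` from `dn1` to `dn4`.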
HDFS: data locality. [Diagram: a Hadoop cluster whose input data is split into Block 1, Block 2, and Block 3, with three copies of each spread across the nodes; MAP tasks run next to the blocks they read, and a Reduce task produces the results.]
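Data locality means running each map task on a node that already stores the block it reads. The sketch below illustrates that idea only; the function name, the slot model, and the tie-breaking rule are assumptions and do not describe the Hadoop scheduler.

```python
def assign_map_tasks(block_locations, free_slots):
    """Assign one map task per block, preferring nodes that hold a replica.

    block_locations: dict of block -> list of nodes storing a replica
    free_slots: dict of node -> number of free map slots (not mutated)
    Returns a dict of block -> (node, is_local).
    """
    slots = dict(free_slots)
    assignment = {}
    for block, replicas in block_locations.items():
        local = [n for n in replicas if slots.get(n, 0) > 0]
        if local:
            # Prefer the replica holder with the most free slots.
            node, is_local = max(local, key=lambda n: slots[n]), True
        else:
            remote = [n for n, s in slots.items() if s > 0]
            if not remote:
                raise RuntimeError("no free map slots")
            # Fall back to a remote node; the block must cross the network.
            node, is_local = max(remote, key=lambda n: slots[n]), False
        slots[node] -= 1
        assignment[block] = (node, is_local)
    return assignment
```

With three replicas per block, as in the diagram, there are usually several candidate nodes for a local task, which is why replication helps locality as well as fault tolerance.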
HDFS: interfaces
HDFS on the road: HDFS, Peta1.0, Peta2.0
Scalability of the Namenode. Horizontal scaling: add machines to cope with the growing number of files. Vertical scaling: keep hot data in memory and cold data on disk.
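The vertical-scaling idea (hot metadata in memory, cold metadata on disk) can be illustrated with a small two-tier store. This is a sketch under assumptions: the slide does not say how hot and cold data are told apart, so least-recently-used eviction is used here as a stand-in, the class name is hypothetical, and a plain dict plays the role of the on-disk store.

```python
from collections import OrderedDict


class TieredMetadataStore:
    def __init__(self, hot_capacity):
        self.hot_capacity = hot_capacity
        self.hot = OrderedDict()  # in memory, most recently used last
        self.cold = {}            # stand-in for an on-disk store

    def put(self, path, meta):
        self.cold.pop(path, None)  # keep a single copy of each entry
        self.hot[path] = meta
        self.hot.move_to_end(path)
        self._evict()

    def get(self, path):
        if path in self.hot:
            self.hot.move_to_end(path)
            return self.hot[path]
        meta = self.cold.pop(path)  # raises KeyError for unknown paths
        self.hot[path] = meta       # promote on access
        self._evict()
        return meta

    def _evict(self):
        # Demote least recently used entries until the hot tier fits.
        while len(self.hot) > self.hot_capacity:
            old_path, old_meta = self.hot.popitem(last=False)
            self.cold[old_path] = old_meta
```

The memory footprint is then bounded by `hot_capacity` rather than by the total number of files, at the cost of a disk access for cold entries.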
Scalability: horizontal scaling
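The content of this slide did not survive extraction, so the following is not the design it showed. It is only one common way to spread a namespace over several Namenodes, given as a hedged illustration: hash the parent directory of a path and use the hash to pick a machine. The function name and the choice of MD5 are assumptions.

```python
import hashlib
import posixpath


def namenode_for(path, namenodes):
    """Pick the Namenode responsible for a path by hashing its parent directory.

    Files in the same directory land on the same Namenode, so listing a
    directory touches a single machine.
    """
    parent = posixpath.dirname(posixpath.normpath(path))
    digest = hashlib.md5(parent.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(namenodes)
    return namenodes[index]
```

A plain modulo like this remaps most directories whenever a machine is added, so a real system would need a more stable mapping or a migration step.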
Scalability: object storage
Scalability: data structures
Scalability:
Availability: metadata structure
Availability
Availability:
Future work
Help hotline:
Q & A Thanks

Editor's Notes

  1. According to the figures the various companies have published so far, Baidu's daily processing volume ranks second among the world's major internet companies, behind only Google's roughly 30 PB of input data processed per day.
  2. - Chooses new DataNodes for new replicas
     - Balances disk usage
     - Balances communication traffic to DataNodes
  3. Block (Object) Storage Subsystem: shared storage provided as pools of blocks. Namespaces (HDFS, others) use one or more block pools. Note: HDFS has 2 layers today; we are generalizing/extending it.
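The second note says that new replicas go to newly chosen DataNodes in a way that balances disk usage. A minimal sketch of such a choice is given below, assuming only that each DataNode reports its used and total capacity; the function name and the data layout are hypothetical, and balancing of communication traffic is not modeled.

```python
def choose_targets(datanodes, current_holders, needed):
    """Pick DataNodes for new replicas, least-full disks first.

    datanodes: dict of name -> (used_bytes, capacity_bytes)
    current_holders: names that already store the block (never chosen again)
    needed: number of new replicas to place
    """
    candidates = [n for n in datanodes if n not in current_holders]
    # Sort by fraction of disk used; the name breaks ties deterministically.
    candidates.sort(key=lambda n: (datanodes[n][0] / datanodes[n][1], n))
    return candidates[:needed]
```

For instance, with `dn1` at 90% full, `dn3` at 50%, and `dn4` at 20%, and the block already on `dn2`, two new replicas would go to `dn4` and then `dn3`.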