Deconstructing Big Data Architecture
Server and Network Resource Planning for Big Data Systems
“How to eat an elephant – one byte at a time”
CP Li 李俊邦
Enterprise Technologist
Enterprise Solutions & Alliances, Greater China
Dell
Agenda
1.  Server roles
1.  Manager
2.  Name Nodes
3.  Edge Nodes
4.  Data Nodes
2.  Hadoop cluster design
3.  Etu+Dell
4.  Futures / Roadmap
5.  Questions?
Server Roles - Manager
•  Graphical installation interface / management console
•  Usually installed on an Edge Node
•  Common options
–  Cloudera Manager
–  Apache Ambari
Server Roles – Name Nodes
•  Stores the HDFS metadata
•  Job Manager for YARN data-processing framework
•  Primary
–  Heartbeats from data nodes
–  10th heartbeat is a block report from which it generates metadata
•  Standby
–  Checks in every hour to mirror metadata / block map
–  Not a hot-spare – requires manual fail-over
•  High Availability (HA) can be added in some distributions
–  Results in a dedicated HA node that acts as a witness to the Name Node cluster
Server Roles - Edge Nodes
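•  Primary gateway for data entering and leaving the Hadoop cluster
•  Scalable
•  The only nodes in the Hadoop cluster attached to more than one network segment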
[Diagram: PowerEdge R730 – Name Node; PowerEdge R730 – Standby Name Node; PowerEdge R730 – Edge Node(s); PowerEdge R730 – HA Node; PowerEdge R730XD – Data Nodes. Network labels: Corporate Network, Data Network.]
Server Roles - Data Node
•  Primary storage location for HDFS data
•  Runs the data processing assigned by YARN resource management
•  Key attributes
–  Memory
›  64GB as the standard configuration
›  Additional services (Impala/Spark) need more memory
–  Many local disks (JBOD / non-RAID mode)
›  SFF (2.5”) for performance-based workloads
›  LFF (3.5”) for capacity-centric workloads
–  CPUs – legacy recommendation of 1:1 core:spindle ratio
›  SSDs, faster HDDs (10K+), and in-memory workloads make this less of an issue
›  10- and 12-core CPUs are the best-practice default
Hadoop Cluster Design
Hadoop Cluster Design – Hardware Considerations
Hadoop Cluster Deployment – Installation Best Practices
•  Use pre-built, assembled & cabled racks from the vendor
•  Automated deployment tools (e.g., Open Crowbar)
•  Purchase nodes in standard size groups for easy capacity growth and ordering, not in single-node increments
–  Common increments are ½ or full rack for easy deployment and sizing
•  For each type of hardware, purchase spare components to keep on site for easy, rapid repair
Core Hadoop Use Cases
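•  Archiving
–  High disk-to-CPU ratio
–  Low memory usage
–  Regulatory requirements
–  Long-term archiving
•  Data processing
–  High disk-to-CPU ratio
–  Medium memory usage
–  DW offload
–  ETL offload
–  EDH
–  Quality analysis
–  IT log analysis
•  Analytics
–  High core count
–  High memory usage
–  Market analysis
–  Fraud prevention
–  Network analysis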
Common Hadoop Use Case to Ecosystem Tool Mapping
Hadoop Use Case to Ratio Mapping
CPU (cores) : Memory (GB) : Disk (count), per Data Node
•  Archiving: 1:2:1
•  Data processing: 1:4:1
•  Analytics: 2:8:1
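One way to read the ratios above is to scale them by the number of data disks in a node. The sketch below does this in Python. It assumes the ratio scales linearly with disk count, which the slide does not state; the function name `size_data_node` and the 12-disk example are hypothetical.

```python
# CPU cores : memory (GB) : disk count, per data node, as listed on the slide.
RATIOS = {
    "archiving": (1, 2, 1),
    "data_processing": (1, 4, 1),
    "analytics": (2, 8, 1),
}


def size_data_node(use_case: str, disks: int) -> dict:
    """Scale a use-case ratio by the number of data disks in one node.

    Assumption: the ratio scales linearly with the disk count.
    """
    cores, memory_gb, disk_unit = RATIOS[use_case]
    factor = disks // disk_unit
    return {
        "cores": cores * factor,
        "memory_gb": memory_gb * factor,
        "disks": disks,
    }


if __name__ == "__main__":
    # Hypothetical node with 12 data disks.
    for name in RATIOS:
        print(name, size_data_node(name, disks=12))
```

Under this reading, a 12-disk analytics node would call for 24 cores and 96GB of memory, while a 12-disk archiving node would need 12 cores and 24GB.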
Node Considerations
[Figure: Dell PowerEdge R730 (shown three times) and Dell PowerEdge R730xd]
Node Considerations
HDFS Capacity
•  HDFS protects data by replicating it across nodes. The default Replication Factor is 3, and it is configurable.
•  HDFS Raw Capacity = Number of Data Nodes x Number of Drives per Node x Capacity per Drive
•  HDFS Usable Capacity = HDFS Raw Capacity / Replication Factor
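The two capacity formulas can be sketched in Python as follows. The function names and the example cluster (10 nodes, 12 drives of 4TB each) are hypothetical. The sketch also ignores space reserved for the OS, temporary data, and other non-HDFS use, so real usable capacity would be lower.

```python
def hdfs_raw_capacity_tb(nodes: int, drives_per_node: int, drive_tb: float) -> float:
    """Raw capacity = number of nodes x drives per node x capacity per drive."""
    return nodes * drives_per_node * drive_tb


def hdfs_usable_capacity_tb(raw_tb: float, replication_factor: int = 3) -> float:
    """Usable capacity = raw capacity / replication factor (default 3)."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return raw_tb / replication_factor


if __name__ == "__main__":
    # Hypothetical cluster: 10 data nodes, 12 drives each, 4 TB per drive.
    raw = hdfs_raw_capacity_tb(nodes=10, drives_per_node=12, drive_tb=4)
    print(f"raw: {raw} TB, usable: {hdfs_usable_capacity_tb(raw)} TB")
```

With the default Replication Factor of 3, only one third of the raw disk space holds unique data.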
Big Data Networking Best Practices
•  Traditional Ethernet is used since it’s affordable and already prevalent.
•  1GbE networking was used in early drafts of the solution, but now that 10GbE costs have fallen, 10GbE is the more efficient choice.
•  Multiple ports are teamed for both redundancy and throughput. LACP and software bonding are the most common methods.
•  IPv4 is most widely used. IPv6 has limited support at the OS and Hadoop level.
Attributes of a Good Switch for Big Data
•  Non-blocking backplane
•  Deep per-port packet buffers (shared buffers do not work well). During the sort/shuffle phases of map/reduce operations, network traffic is so chaotic that it can saturate any and all shared buffers, degrading network performance for multiple hosts.
•  Good choices:
–  1GbE
›  S55
›  S60
–  10GbE
›  S4810
›  S5000
–  40GbE
›  Z9000
›  Z9500
›  S6000
Dell Hadoop Solution Logical Diagram
Scale-out Aggregation Layer
Dell Points of Integration
•  VLT / VRRP is a very affordable way to team switches both at the ToR and the aggregation tiers.
This makes the Dell Networking Force10 switches a great choice.
•  Active Fabric Manager
–  Speeds up the creation and administration of the required VLT / VRRP configuration on the switches.
–  Helps with capacity planning as customers scale
Big Data Networking Futures
•  40GbE onboard LOMs will begin to be used for high-volume clusters. For now, the benefit does not yet justify the cost.
•  As HPC and Big Data converge, we’ll start to see the use of InfiniBand (IB) for node-to-node connectivity.
•  In-memory (Spark / Impala) workloads are moving the bottleneck away from the disk and toward the processor and network. Expect customers to look to increase core counts and network speed to overcome this.
Etu+Dell = complete Hadoop/Big Data solution provider
•  Best of breed Cloudera partners - Etu
•  Analytic software solutions for Big Data
•  Dell Professional Services for Big Data
•  Dell PowerEdge 13G servers
•  Dell Networking solutions
•  Installation and configuration service
•  Complete end-to-end implementation
•  Service phases: Discover, Plan, Implement, Investigate
Solution architecture
[Diagram: four stages: 1. Integrate, 2. Store, 3. Analyze, 4. Act, ending in analytical output.
Tools: Toad Data Point (desktop – integrate, cleanse); Dell Boomi (cloud – integrate, correlate); Toad Intelligence Central (data aggregation and virtualization); Dell STATISTICA; Advanced Analytics; Dell Statistica Big Data (desktop – crawl, save).
Data sources: customer data, order data, events, stock market data, marketing campaigns, social media.]
Futures
•  Speed Improvements in Map / Reduce
•  More in-memory workloads
–  Possible move to Spark to replace Map/Reduce
•  Virtualized Hadoop
–  VMware Big Data Extensions
–  OpenStack Sahara
–  Microsoft HDInsight (Hortonworks)
Dell In-Memory Appliance for Cloudera Enterprise
Configurations at a glance
•  Mid-Size Configuration: 16 Node Cluster
–  PowerEdge R720: 4 Infrastructure Nodes with ProSupport
–  PowerEdge R720XD: 12 Data Nodes with ProSupport
–  Cloudera Enterprise
–  Force10 S4810P
–  Force10 S55
–  Dell Rack 42U
–  ~528TB (disk raw space)
•  Starter Configuration: 8 Node Cluster
–  PowerEdge R720: 4 Infrastructure Nodes with ProSupport
–  PowerEdge R720XD: 4 Data Nodes with ProSupport
–  Cloudera Enterprise
–  Force10 S4810P
–  Force10 S55
–  Dell Rack 42U
–  ~176TB (disk raw space)
•  Small Enterprise Configuration: 24 Node Cluster
–  PowerEdge R720: 4 Infrastructure Nodes with ProSupport
–  PowerEdge R720XD: 20 Data Nodes with ProSupport
–  Cloudera Enterprise
–  Force10 S4810P
–  Force10 S55
–  Dell Rack 42U
–  ~880TB (disk raw space)
•  Expansion Unit
–  PowerEdge R720XD: 4 Data Nodes with ProSupport
–  Cloudera Enterprise
–  Scales in blocks
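The raw-capacity figures listed for the three configurations all work out to roughly 44TB per data node (176 / 4 = 528 / 12 = 880 / 20 = 44). The Python sketch below uses that derived figure, together with the 4-data-node expansion block, to estimate how many data nodes a target raw capacity would need. The per-node value is inferred from the listed totals rather than stated on the slide, and the function name `data_nodes_for_raw_tb` is hypothetical.

```python
import math

# Inferred from the listed totals: 176/4 = 528/12 = 880/20 = 44 TB per data node.
RAW_TB_PER_DATA_NODE = 44
# Expansion units add data nodes in blocks of four.
NODES_PER_BLOCK = 4


def data_nodes_for_raw_tb(target_raw_tb: float) -> int:
    """Return the data-node count, rounded up to whole 4-node blocks."""
    block_tb = RAW_TB_PER_DATA_NODE * NODES_PER_BLOCK
    blocks = max(1, math.ceil(target_raw_tb / block_tb))
    return blocks * NODES_PER_BLOCK


if __name__ == "__main__":
    for target in (176, 528, 600, 880):
        print(f"{target} TB raw -> {data_nodes_for_raw_tb(target)} data nodes")
```

A target that falls between two listed configurations, such as 600TB raw, rounds up to the next whole block (16 data nodes in this sketch).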
Track B-3 Deconstructing Big Data Architecture - Server and Network Resource Planning for Big Data Systems

More Related Content

What's hot

Red Hat Storage Day New York - Intel Unlocking Big Data Infrastructure Effici...
Red Hat Storage Day New York - Intel Unlocking Big Data Infrastructure Effici...Red Hat Storage Day New York - Intel Unlocking Big Data Infrastructure Effici...
Red Hat Storage Day New York - Intel Unlocking Big Data Infrastructure Effici...
Red_Hat_Storage
 
Migration DB2 to EDB - Project Experience
 Migration DB2 to EDB - Project Experience Migration DB2 to EDB - Project Experience
Migration DB2 to EDB - Project Experience
EDB
 
Red Hat Storage Day Atlanta - Why Software Defined Storage Matters
Red Hat Storage Day Atlanta - Why Software Defined Storage MattersRed Hat Storage Day Atlanta - Why Software Defined Storage Matters
Red Hat Storage Day Atlanta - Why Software Defined Storage Matters
Red_Hat_Storage
 
10/ EnterpriseDB @ OPEN'16
10/ EnterpriseDB @ OPEN'16 10/ EnterpriseDB @ OPEN'16
10/ EnterpriseDB @ OPEN'16
Kangaroot
 
Red Hat Storage Day Boston - Red Hat Gluster Storage vs. Traditional Storage ...
Red Hat Storage Day Boston - Red Hat Gluster Storage vs. Traditional Storage ...Red Hat Storage Day Boston - Red Hat Gluster Storage vs. Traditional Storage ...
Red Hat Storage Day Boston - Red Hat Gluster Storage vs. Traditional Storage ...
Red_Hat_Storage
 
[db tech showcase Tokyo 2015] D25:The difference between logical and physical...
[db tech showcase Tokyo 2015] D25:The difference between logical and physical...[db tech showcase Tokyo 2015] D25:The difference between logical and physical...
[db tech showcase Tokyo 2015] D25:The difference between logical and physical...Insight Technology, Inc.
 
9/ IBM POWER @ OPEN'16
9/ IBM POWER @ OPEN'169/ IBM POWER @ OPEN'16
9/ IBM POWER @ OPEN'16
Kangaroot
 
IBM Power9 Features and Specifications
IBM Power9 Features and SpecificationsIBM Power9 Features and Specifications
IBM Power9 Features and Specifications
inside-BigData.com
 
Red Hat Storage Day Atlanta - Persistent Storage for Linux Containers
Red Hat Storage Day Atlanta - Persistent Storage for Linux Containers Red Hat Storage Day Atlanta - Persistent Storage for Linux Containers
Red Hat Storage Day Atlanta - Persistent Storage for Linux Containers
Red_Hat_Storage
 
2017-02-21 AFCEA West Building Continuous Integration & Deployment (CI/CD) Pi...
2017-02-21 AFCEA West Building Continuous Integration & Deployment (CI/CD) Pi...2017-02-21 AFCEA West Building Continuous Integration & Deployment (CI/CD) Pi...
2017-02-21 AFCEA West Building Continuous Integration & Deployment (CI/CD) Pi...
Shawn Wells
 
Red hat on_power-ibm _lop_day_2015
Red hat on_power-ibm _lop_day_2015Red hat on_power-ibm _lop_day_2015
Red hat on_power-ibm _lop_day_2015
cmilsted
 
Red Hat Storage Day LA - Persistent Storage for Linux Containers
Red Hat Storage Day LA - Persistent Storage for Linux Containers Red Hat Storage Day LA - Persistent Storage for Linux Containers
Red Hat Storage Day LA - Persistent Storage for Linux Containers
Red_Hat_Storage
 
Red Hat Storage Day LA - Performance and Sizing Software Defined Storage
Red Hat Storage Day LA - Performance and Sizing Software Defined Storage Red Hat Storage Day LA - Performance and Sizing Software Defined Storage
Red Hat Storage Day LA - Performance and Sizing Software Defined Storage
Red_Hat_Storage
 
Not all open source is the same
Not all open source is the sameNot all open source is the same
Not all open source is the same
EDB
 
Best Practices & Lessons Learned from Deployment of PostgreSQL
 Best Practices & Lessons Learned from Deployment of PostgreSQL Best Practices & Lessons Learned from Deployment of PostgreSQL
Best Practices & Lessons Learned from Deployment of PostgreSQL
EDB
 
Backup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
Backup management with Ceph Storage - Camilo Echevarne, Félix BarbeiraBackup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
Backup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
Ceph Community
 
OpenPOWER Foundation Overview
OpenPOWER Foundation OverviewOpenPOWER Foundation Overview
OpenPOWER Foundation Overview
NVIDIA Taiwan
 
Bare-metal performance for Big Data workloads on Docker containers
Bare-metal performance for Big Data workloads on Docker containersBare-metal performance for Big Data workloads on Docker containers
Bare-metal performance for Big Data workloads on Docker containers
BlueData, Inc.
 
Why Software-Defined Storage Matters
Why Software-Defined Storage MattersWhy Software-Defined Storage Matters
Why Software-Defined Storage Matters
Red_Hat_Storage
 
How to use postgresql.conf to configure and tune the PostgreSQL server
How to use postgresql.conf to configure and tune the PostgreSQL serverHow to use postgresql.conf to configure and tune the PostgreSQL server
How to use postgresql.conf to configure and tune the PostgreSQL server
EDB
 

What's hot (20)

Red Hat Storage Day New York - Intel Unlocking Big Data Infrastructure Effici...
Red Hat Storage Day New York - Intel Unlocking Big Data Infrastructure Effici...Red Hat Storage Day New York - Intel Unlocking Big Data Infrastructure Effici...
Red Hat Storage Day New York - Intel Unlocking Big Data Infrastructure Effici...
 
Migration DB2 to EDB - Project Experience
 Migration DB2 to EDB - Project Experience Migration DB2 to EDB - Project Experience
Migration DB2 to EDB - Project Experience
 
Red Hat Storage Day Atlanta - Why Software Defined Storage Matters
Red Hat Storage Day Atlanta - Why Software Defined Storage MattersRed Hat Storage Day Atlanta - Why Software Defined Storage Matters
Red Hat Storage Day Atlanta - Why Software Defined Storage Matters
 
10/ EnterpriseDB @ OPEN'16
10/ EnterpriseDB @ OPEN'16 10/ EnterpriseDB @ OPEN'16
10/ EnterpriseDB @ OPEN'16
 
Red Hat Storage Day Boston - Red Hat Gluster Storage vs. Traditional Storage ...
Red Hat Storage Day Boston - Red Hat Gluster Storage vs. Traditional Storage ...Red Hat Storage Day Boston - Red Hat Gluster Storage vs. Traditional Storage ...
Red Hat Storage Day Boston - Red Hat Gluster Storage vs. Traditional Storage ...
 
[db tech showcase Tokyo 2015] D25:The difference between logical and physical...
[db tech showcase Tokyo 2015] D25:The difference between logical and physical...[db tech showcase Tokyo 2015] D25:The difference between logical and physical...
[db tech showcase Tokyo 2015] D25:The difference between logical and physical...
 
9/ IBM POWER @ OPEN'16
9/ IBM POWER @ OPEN'169/ IBM POWER @ OPEN'16
9/ IBM POWER @ OPEN'16
 
IBM Power9 Features and Specifications
IBM Power9 Features and SpecificationsIBM Power9 Features and Specifications
IBM Power9 Features and Specifications
 
Red Hat Storage Day Atlanta - Persistent Storage for Linux Containers
Red Hat Storage Day Atlanta - Persistent Storage for Linux Containers Red Hat Storage Day Atlanta - Persistent Storage for Linux Containers
Red Hat Storage Day Atlanta - Persistent Storage for Linux Containers
 
2017-02-21 AFCEA West Building Continuous Integration & Deployment (CI/CD) Pi...
2017-02-21 AFCEA West Building Continuous Integration & Deployment (CI/CD) Pi...2017-02-21 AFCEA West Building Continuous Integration & Deployment (CI/CD) Pi...
2017-02-21 AFCEA West Building Continuous Integration & Deployment (CI/CD) Pi...
 
Red hat on_power-ibm _lop_day_2015
Red hat on_power-ibm _lop_day_2015Red hat on_power-ibm _lop_day_2015
Red hat on_power-ibm _lop_day_2015
 
Red Hat Storage Day LA - Persistent Storage for Linux Containers
Red Hat Storage Day LA - Persistent Storage for Linux Containers Red Hat Storage Day LA - Persistent Storage for Linux Containers
Red Hat Storage Day LA - Persistent Storage for Linux Containers
 
Red Hat Storage Day LA - Performance and Sizing Software Defined Storage
Red Hat Storage Day LA - Performance and Sizing Software Defined Storage Red Hat Storage Day LA - Performance and Sizing Software Defined Storage
Red Hat Storage Day LA - Performance and Sizing Software Defined Storage
 
Not all open source is the same
Not all open source is the sameNot all open source is the same
Not all open source is the same
 
Best Practices & Lessons Learned from Deployment of PostgreSQL
 Best Practices & Lessons Learned from Deployment of PostgreSQL Best Practices & Lessons Learned from Deployment of PostgreSQL
Best Practices & Lessons Learned from Deployment of PostgreSQL
 
Backup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
Backup management with Ceph Storage - Camilo Echevarne, Félix BarbeiraBackup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
Backup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
 
OpenPOWER Foundation Overview
OpenPOWER Foundation OverviewOpenPOWER Foundation Overview
OpenPOWER Foundation Overview
 
Bare-metal performance for Big Data workloads on Docker containers
Bare-metal performance for Big Data workloads on Docker containersBare-metal performance for Big Data workloads on Docker containers
Bare-metal performance for Big Data workloads on Docker containers
 
Why Software-Defined Storage Matters
Why Software-Defined Storage MattersWhy Software-Defined Storage Matters
Why Software-Defined Storage Matters
 
How to use postgresql.conf to configure and tune the PostgreSQL server
How to use postgresql.conf to configure and tune the PostgreSQL serverHow to use postgresql.conf to configure and tune the PostgreSQL server
How to use postgresql.conf to configure and tune the PostgreSQL server
 

Viewers also liked

豆瓣数据架构实践
豆瓣数据架构实践豆瓣数据架构实践
豆瓣数据架构实践
Xupeng Yun
 
Track C-2 洞見未來 - Tableau 創造大數據新價值
Track C-2 洞見未來 - Tableau 創造大數據新價值Track C-2 洞見未來 - Tableau 創造大數據新價值
Track C-2 洞見未來 - Tableau 創造大數據新價值
Etu Solution
 
Qualitative Research in Segmentation
Qualitative Research in SegmentationQualitative Research in Segmentation
Qualitative Research in Segmentation
Susan Abbott
 
Track B-1 建構新世代的智慧數據平台
Track B-1 建構新世代的智慧數據平台Track B-1 建構新世代的智慧數據平台
Track B-1 建構新世代的智慧數據平台
Etu Solution
 
Track A-3 Enterprise Data Lake in Action - 搭建「活」的企業 Big Data 生態架構
Track A-3 Enterprise Data Lake in Action - 搭建「活」的企業 Big Data 生態架構Track A-3 Enterprise Data Lake in Action - 搭建「活」的企業 Big Data 生態架構
Track A-3 Enterprise Data Lake in Action - 搭建「活」的企業 Big Data 生態架構
Etu Solution
 
The Women's March Conversation
The Women's March Conversation The Women's March Conversation
The Women's March Conversation
Susan Abbott
 
Trinity BDM - 橋接傳統與未來
Trinity BDM - 橋接傳統與未來Trinity BDM - 橋接傳統與未來
Trinity BDM - 橋接傳統與未來
Etu Solution
 
Track C-1 大數據時代的產品 ─ 創新與洞察決策
Track C-1 大數據時代的產品 ─ 創新與洞察決策Track C-1 大數據時代的產品 ─ 創新與洞察決策
Track C-1 大數據時代的產品 ─ 創新與洞察決策
Etu Solution
 
Data Science Thailand Meetup#11
Data Science Thailand Meetup#11Data Science Thailand Meetup#11
Data Science Thailand Meetup#11
Data Science Thailand
 
Track C-3 Let's Play Marketing - 瘋創意 玩推薦 就該這樣搞行銷
Track C-3 Let's Play Marketing - 瘋創意 玩推薦 就該這樣搞行銷Track C-3 Let's Play Marketing - 瘋創意 玩推薦 就該這樣搞行銷
Track C-3 Let's Play Marketing - 瘋創意 玩推薦 就該這樣搞行銷
Etu Solution
 
CUSTOMER ANALYTICS & SEGMENTATION FOR CUSTOMER CENTRIC ORGANIZATION & MARKETI...
CUSTOMER ANALYTICS & SEGMENTATION FOR CUSTOMER CENTRIC ORGANIZATION & MARKETI...CUSTOMER ANALYTICS & SEGMENTATION FOR CUSTOMER CENTRIC ORGANIZATION & MARKETI...
CUSTOMER ANALYTICS & SEGMENTATION FOR CUSTOMER CENTRIC ORGANIZATION & MARKETI...
Data Science Thailand
 
歡迎回來:全面圖譜,金融 3.0 顧客行銷新視界
歡迎回來:全面圖譜,金融 3.0 顧客行銷新視界歡迎回來:全面圖譜,金融 3.0 顧客行銷新視界
歡迎回來:全面圖譜,金融 3.0 顧客行銷新視界
Etu Solution
 
投客所好:互聯內外,啟動投信藍海數據戰
投客所好:互聯內外,啟動投信藍海數據戰投客所好:互聯內外,啟動投信藍海數據戰
投客所好:互聯內外,啟動投信藍海數據戰
Etu Solution
 
Presentation Churn Management
Presentation Churn ManagementPresentation Churn Management
Presentation Churn Managementfarhanmajeed
 
猜你喜歡:虛實並進,贏在全通路
猜你喜歡:虛實並進,贏在全通路猜你喜歡:虛實並進,贏在全通路
猜你喜歡:虛實並進,贏在全通路
Etu Solution
 
終歸:分群消費者x多元商機的實現
終歸:分群消費者x多元商機的實現終歸:分群消費者x多元商機的實現
終歸:分群消費者x多元商機的實現
Etu Solution
 
Implementing a Segmentation Strategy
Implementing a Segmentation StrategyImplementing a Segmentation Strategy
Implementing a Segmentation Strategy
Susan Abbott
 
Data without Boundaries - 圍繞第一方數據,找到商業驅動力
Data without Boundaries - 圍繞第一方數據,找到商業驅動力Data without Boundaries - 圍繞第一方數據,找到商業驅動力
Data without Boundaries - 圍繞第一方數據,找到商業驅動力
Etu Solution
 
致詞歡迎:Big Data 無所不在,Data Technology 無 C 不歡
致詞歡迎:Big Data 無所不在,Data Technology 無 C 不歡致詞歡迎:Big Data 無所不在,Data Technology 無 C 不歡
致詞歡迎:Big Data 無所不在,Data Technology 無 C 不歡
Etu Solution
 
唯品会大数据实践 Sacc pub
唯品会大数据实践 Sacc pub唯品会大数据实践 Sacc pub
唯品会大数据实践 Sacc pub
Chao Zhu
 

Viewers also liked (20)

豆瓣数据架构实践
豆瓣数据架构实践豆瓣数据架构实践
豆瓣数据架构实践
 
Track C-2 洞見未來 - Tableau 創造大數據新價值
Track C-2 洞見未來 - Tableau 創造大數據新價值Track C-2 洞見未來 - Tableau 創造大數據新價值
Track C-2 洞見未來 - Tableau 創造大數據新價值
 
Qualitative Research in Segmentation
Qualitative Research in SegmentationQualitative Research in Segmentation
Qualitative Research in Segmentation
 
Track B-1 建構新世代的智慧數據平台
Track B-1 建構新世代的智慧數據平台Track B-1 建構新世代的智慧數據平台
Track B-1 建構新世代的智慧數據平台
 
Track A-3 Enterprise Data Lake in Action - 搭建「活」的企業 Big Data 生態架構
Track A-3 Enterprise Data Lake in Action - 搭建「活」的企業 Big Data 生態架構Track A-3 Enterprise Data Lake in Action - 搭建「活」的企業 Big Data 生態架構
Track A-3 Enterprise Data Lake in Action - 搭建「活」的企業 Big Data 生態架構
 
The Women's March Conversation
The Women's March Conversation The Women's March Conversation
The Women's March Conversation
 
Trinity BDM - 橋接傳統與未來
Trinity BDM - 橋接傳統與未來Trinity BDM - 橋接傳統與未來
Trinity BDM - 橋接傳統與未來
 
Track C-1 大數據時代的產品 ─ 創新與洞察決策
Track C-1 大數據時代的產品 ─ 創新與洞察決策Track C-1 大數據時代的產品 ─ 創新與洞察決策
Track C-1 大數據時代的產品 ─ 創新與洞察決策
 
Data Science Thailand Meetup#11
Data Science Thailand Meetup#11Data Science Thailand Meetup#11
Data Science Thailand Meetup#11
 
Track C-3 Let's Play Marketing - 瘋創意 玩推薦 就該這樣搞行銷
Track C-3 Let's Play Marketing - 瘋創意 玩推薦 就該這樣搞行銷Track C-3 Let's Play Marketing - 瘋創意 玩推薦 就該這樣搞行銷
Track C-3 Let's Play Marketing - 瘋創意 玩推薦 就該這樣搞行銷
 
CUSTOMER ANALYTICS & SEGMENTATION FOR CUSTOMER CENTRIC ORGANIZATION & MARKETI...
CUSTOMER ANALYTICS & SEGMENTATION FOR CUSTOMER CENTRIC ORGANIZATION & MARKETI...CUSTOMER ANALYTICS & SEGMENTATION FOR CUSTOMER CENTRIC ORGANIZATION & MARKETI...
CUSTOMER ANALYTICS & SEGMENTATION FOR CUSTOMER CENTRIC ORGANIZATION & MARKETI...
 
歡迎回來:全面圖譜,金融 3.0 顧客行銷新視界
歡迎回來:全面圖譜,金融 3.0 顧客行銷新視界歡迎回來:全面圖譜,金融 3.0 顧客行銷新視界
歡迎回來:全面圖譜,金融 3.0 顧客行銷新視界
 
投客所好:互聯內外,啟動投信藍海數據戰
投客所好:互聯內外,啟動投信藍海數據戰投客所好:互聯內外,啟動投信藍海數據戰
投客所好:互聯內外,啟動投信藍海數據戰
 
Presentation Churn Management
Presentation Churn ManagementPresentation Churn Management
Presentation Churn Management
 
猜你喜歡:虛實並進,贏在全通路
猜你喜歡:虛實並進,贏在全通路猜你喜歡:虛實並進,贏在全通路
猜你喜歡:虛實並進,贏在全通路
 
終歸:分群消費者x多元商機的實現
終歸:分群消費者x多元商機的實現終歸:分群消費者x多元商機的實現
終歸:分群消費者x多元商機的實現
 
Implementing a Segmentation Strategy
Implementing a Segmentation StrategyImplementing a Segmentation Strategy
Implementing a Segmentation Strategy
 
Data without Boundaries - 圍繞第一方數據,找到商業驅動力
Data without Boundaries - 圍繞第一方數據,找到商業驅動力Data without Boundaries - 圍繞第一方數據,找到商業驅動力
Data without Boundaries - 圍繞第一方數據,找到商業驅動力
 
致詞歡迎:Big Data 無所不在,Data Technology 無 C 不歡
致詞歡迎:Big Data 無所不在,Data Technology 無 C 不歡致詞歡迎:Big Data 無所不在,Data Technology 無 C 不歡
致詞歡迎:Big Data 無所不在,Data Technology 無 C 不歡
 
唯品会大数据实践 Sacc pub
唯品会大数据实践 Sacc pub唯品会大数据实践 Sacc pub
唯品会大数据实践 Sacc pub
 

Similar to Track B-3 解構大數據架構 - 大數據系統的伺服器與網路資源規劃

Optimizing Dell PowerEdge Configurations for Hadoop
Optimizing Dell PowerEdge Configurations for HadoopOptimizing Dell PowerEdge Configurations for Hadoop
Optimizing Dell PowerEdge Configurations for Hadoop
Mike Pittaro
 
Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...
Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...
Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...
Ceph Community
 
Hp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar SlidesHp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar Slides
Hortonworks
 
Oracle big data appliance and solutions
Oracle big data appliance and solutionsOracle big data appliance and solutions
Oracle big data appliance and solutions
solarisyougood
 
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Community
 
20150704 benchmark and user experience in sahara weiting
20150704 benchmark and user experience in sahara weiting20150704 benchmark and user experience in sahara weiting
20150704 benchmark and user experience in sahara weitingWei Ting Chen
 
Accelerate and Scale Big Data Analytics with Disaggregated Compute and Storage
Accelerate and Scale Big Data Analytics with Disaggregated Compute and StorageAccelerate and Scale Big Data Analytics with Disaggregated Compute and Storage
Accelerate and Scale Big Data Analytics with Disaggregated Compute and Storage
Alluxio, Inc.
 
New Ceph capabilities and Reference Architectures
New Ceph capabilities and Reference ArchitecturesNew Ceph capabilities and Reference Architectures
New Ceph capabilities and Reference Architectures
Kamesh Pemmaraju
 
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
Red_Hat_Storage
 
Whd master deck_final
Whd master deck_final Whd master deck_final
Whd master deck_final Juergen Domnik
 
Improving Apache Spark by Taking Advantage of Disaggregated Architecture
 Improving Apache Spark by Taking Advantage of Disaggregated Architecture Improving Apache Spark by Taking Advantage of Disaggregated Architecture
Improving Apache Spark by Taking Advantage of Disaggregated Architecture
Databricks
 
Nicholas:hdfs what is new in hadoop 2
Nicholas:hdfs what is new in hadoop 2Nicholas:hdfs what is new in hadoop 2
Nicholas:hdfs what is new in hadoop 2
hdhappy001
 
VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...
VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...
VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...
VMworld
 
EMC Isilon Database Converged deck
EMC Isilon Database Converged deckEMC Isilon Database Converged deck
EMC Isilon Database Converged deck
KeithETD_CTO
 
Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community
 
Presentation big dataappliance-overview_oow_v3
Presentation   big dataappliance-overview_oow_v3Presentation   big dataappliance-overview_oow_v3
Presentation big dataappliance-overview_oow_v3
xKinAnx
 
Operate your hadoop cluster like a high eff goldmine
Operate your hadoop cluster like a high eff goldmineOperate your hadoop cluster like a high eff goldmine
Operate your hadoop cluster like a high eff goldmineDataWorks Summit
 
Hadoop on Azure, Blue elephants
Hadoop on Azure,  Blue elephantsHadoop on Azure,  Blue elephants
Hadoop on Azure, Blue elephants
Ovidiu Dimulescu
 
VMworld 2013: Virtualizing Databases: Doing IT Right
VMworld 2013: Virtualizing Databases: Doing IT Right VMworld 2013: Virtualizing Databases: Doing IT Right
VMworld 2013: Virtualizing Databases: Doing IT Right
VMworld
 
Delivering Apache Hadoop for the Modern Data Architecture
Delivering Apache Hadoop for the Modern Data Architecture Delivering Apache Hadoop for the Modern Data Architecture
Delivering Apache Hadoop for the Modern Data Architecture
Hortonworks
 

Similar to Track B-3 解構大數據架構 - 大數據系統的伺服器與網路資源規劃 (20)

Optimizing Dell PowerEdge Configurations for Hadoop
Optimizing Dell PowerEdge Configurations for HadoopOptimizing Dell PowerEdge Configurations for Hadoop
Optimizing Dell PowerEdge Configurations for Hadoop
 
Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...
Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...
Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...
 
Hp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar SlidesHp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar Slides
 
Oracle big data appliance and solutions
Oracle big data appliance and solutionsOracle big data appliance and solutions
Oracle big data appliance and solutions
 
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
 
20150704 benchmark and user experience in sahara weiting
20150704 benchmark and user experience in sahara weiting20150704 benchmark and user experience in sahara weiting
20150704 benchmark and user experience in sahara weiting
 
Accelerate and Scale Big Data Analytics with Disaggregated Compute and Storage
Accelerate and Scale Big Data Analytics with Disaggregated Compute and StorageAccelerate and Scale Big Data Analytics with Disaggregated Compute and Storage
Accelerate and Scale Big Data Analytics with Disaggregated Compute and Storage
 
New Ceph capabilities and Reference Architectures
New Ceph capabilities and Reference ArchitecturesNew Ceph capabilities and Reference Architectures
New Ceph capabilities and Reference Architectures
 
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
 
Whd master deck_final
Whd master deck_final Whd master deck_final
Whd master deck_final
 
Improving Apache Spark by Taking Advantage of Disaggregated Architecture
 Improving Apache Spark by Taking Advantage of Disaggregated Architecture Improving Apache Spark by Taking Advantage of Disaggregated Architecture
Improving Apache Spark by Taking Advantage of Disaggregated Architecture
 
Nicholas:hdfs what is new in hadoop 2
Nicholas:hdfs what is new in hadoop 2Nicholas:hdfs what is new in hadoop 2
Nicholas:hdfs what is new in hadoop 2
 
VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...
VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...
VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...
 
EMC Isilon Database Converged deck
EMC Isilon Database Converged deckEMC Isilon Database Converged deck
EMC Isilon Database Converged deck
 
Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph
 
Presentation big dataappliance-overview_oow_v3
Presentation   big dataappliance-overview_oow_v3Presentation   big dataappliance-overview_oow_v3
Presentation big dataappliance-overview_oow_v3
 
Operate your hadoop cluster like a high eff goldmine
Operate your hadoop cluster like a high eff goldmineOperate your hadoop cluster like a high eff goldmine
Operate your hadoop cluster like a high eff goldmine
 
Hadoop on Azure, Blue elephants
Hadoop on Azure,  Blue elephantsHadoop on Azure,  Blue elephants
Hadoop on Azure, Blue elephants
 
VMworld 2013: Virtualizing Databases: Doing IT Right
VMworld 2013: Virtualizing Databases: Doing IT Right VMworld 2013: Virtualizing Databases: Doing IT Right
VMworld 2013: Virtualizing Databases: Doing IT Right
 
Delivering Apache Hadoop for the Modern Data Architecture
Delivering Apache Hadoop for the Modern Data Architecture Delivering Apache Hadoop for the Modern Data Architecture
Delivering Apache Hadoop for the Modern Data Architecture
 

Track B-3 Deconstructing Big Data Architecture - Server and Network Resource Planning for Big Data Systems

  • 1. Deconstructing Big Data Architecture: Server and Network Resource Planning for Big Data Systems. “How to eat an elephant – one byte at a time”. CP Li 李俊邦, Enterprise Technologist, Enterprise Solutions & Alliances, Greater China, Dell
  • 2. Agenda
      1.  Server roles
          1.  Manager
          2.  Name Nodes
          3.  Edge Nodes
          4.  Data Nodes
      2.  Hadoop cluster design
      3.  Etu+Dell
      4.  Futures / Roadmap
      5.  Questions?
  • 3. Server Roles - Manager
      •  Graphical interface / console for system installation
      •  Usually installed on the Edge Node
      •  Common versions
         –  Cloudera Manager
         –  Apache Ambari
  • 4. Server Roles – Name Nodes
      •  Stores the HDFS metadata
      •  Job Manager for YARN data-processing framework
      •  Primary
         –  Receives heartbeats from data nodes
         –  Every 10th heartbeat is a block report from which it generates metadata
      •  Standby
         –  Checks in every hour to mirror metadata / block map
         –  Not a hot spare; requires manual fail-over
      •  High Availability (HA) can be added in some distributions
         –  Results in a dedicated HA node that acts as a witness to the Name Node cluster
  • 5. Server Roles - Edge Nodes
      •  Main gateway for data entering and leaving the Hadoop cluster
      •  Scalable
      •  The only nodes in the Hadoop cluster attached to more than one network segment
      [Diagram: PowerEdge R730 Name Node, Standby Name Node, HA Node and Edge Node(s), plus PowerEdge R730XD Data Nodes, all on the Data Network; the Edge Node(s) also connect to the Corporate Network]
  • 6. Server Roles - Data Node
      •  Primary storage location for HDFS
      •  Runs the data processing assigned by YARN resource management
      •  Key attributes
         –  Memory
            ›  64GB as standard
            ›  Additional services (Impala/Spark) need more memory
         –  Many local disks (JBOD / Non-RAID mode)
            ›  SFF (2.5”) for performance-based workloads
            ›  LFF (3.5”) for capacity-centric workloads
         –  CPUs – legacy recommendation of 1:1 core:spindle ratio
            ›  SSDs, faster HDDs (10K+), and in-memory workloads make this less of an issue
            ›  10 and 12 core are the best practice default
  • 8. Hadoop Cluster Design – Hardware Considerations
  • 9. Hadoop Cluster Deployment – Installation Best Practices
      •  Use pre-built, assembled & cabled racks from the vendor
      •  Use automated deployment tools (e.g. Open Crowbar)
      •  Purchase nodes in standard size groups for easy capacity growth and ordering, not in single-node increments
         –  Common increments are ½ or full rack for easy deployment and sizing
      •  For each type of hardware, purchase spare components to keep on site for easy, rapid repair
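  • 10. Core Hadoop Use Cases
      •  Archive
         –  High disk-to-CPU ratio
         –  Low memory usage
         –  Regulatory requirements
         –  Long-term archiving
      •  Data processing
         –  High disk-to-CPU ratio
         –  Medium memory usage
         –  DW offload
         –  ETL offload
         –  EDH
         –  Quality analysis
         –  IT log analysis
      •  Analytics
         –  High core count
         –  High memory usage
         –  Market analysis
         –  Fraud prevention
         –  Network analysis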
  • 11. Common Hadoop Use Case to Ecosystem Tool Mapping
  • 12. Hadoop Use Case to Ratio Mapping
      Ratio is CPU (Cores) : Memory (GB) : Disk (count), per Data Node
      •  Archive: 1:2:1
      •  Data processing: 1:4:1
      •  Analytics: 2:8:1
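The ratios on slide 12 can be read as a rough per-node sizing rule. The Python sketch below is illustrative only: the function name `size_data_node`, the dictionary keys, the choice to scale from the disk count, and the 12-disk example are assumptions made for this example, not part of the talk.

```python
# CPU (cores) : Memory (GB) : Disk (count) per Data Node, as listed on slide 12.
RATIOS = {
    "archive": (1, 2, 1),
    "data_processing": (1, 4, 1),
    "analytics": (2, 8, 1),
}


def size_data_node(use_case: str, disks: int) -> dict:
    """Scale the slide's ratio to a node with `disks` local drives.

    Hypothetical helper: it simply multiplies the ratio by the disk count.
    """
    if disks < 1:
        raise ValueError("disks must be at least 1")
    cores, memory_gb, disk_unit = RATIOS[use_case]
    return {
        "cores": cores * disks // disk_unit,
        "memory_gb": memory_gb * disks // disk_unit,
        "disks": disks,
    }


if __name__ == "__main__":
    # 12 disks per node is an assumed example value.
    for name in RATIOS:
        print(name, size_data_node(name, disks=12))
```

The result is a starting point for comparing use cases, not a hardware recommendation; the memory and core guidance on the Data Node slide still applies.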
  • 13. Node Considerations [table with columns: Dell PowerEdge R730, Dell PowerEdge R730, Dell PowerEdge R730, Dell PowerEdge R730xd]
  • 15. HDFS Capacity
      •  HDFS protects information by replicating the data between nodes. The default Replication Factor is 3, but it is configurable.
      •  HDFS Raw Capacity = Number of Compute Nodes x Number of Drives x Capacity of Drives
      •  HDFS Usable Capacity = HDFS Raw Capacity / Replication Factor
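The two capacity formulas on slide 15 can be sketched in Python as follows. The function names and the example cluster (10 nodes, 12 drives of 4 TB each) are assumptions chosen for illustration; only the formulas and the default Replication Factor of 3 come from the slide.

```python
def hdfs_raw_capacity_tb(nodes: int, drives_per_node: int, drive_tb: float) -> float:
    """HDFS Raw Capacity = nodes x drives per node x capacity per drive."""
    return nodes * drives_per_node * drive_tb


def hdfs_usable_capacity_tb(raw_tb: float, replication_factor: int = 3) -> float:
    """HDFS Usable Capacity = raw capacity / replication factor."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return raw_tb / replication_factor


if __name__ == "__main__":
    # Hypothetical cluster: 10 data nodes, 12 drives each, 4 TB per drive.
    raw = hdfs_raw_capacity_tb(nodes=10, drives_per_node=12, drive_tb=4)
    usable = hdfs_usable_capacity_tb(raw)
    print(f"raw: {raw} TB, usable at RF=3: {usable} TB")
```

The formula does not subtract space for the operating system or intermediate job data, so the usable figure is best treated as an upper bound.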
  • 16. Big Data Networking Best Practices
      •  Traditional Ethernet is used because it is affordable and already prevalent.
      •  1GbE networking was used in early drafts of the solution, but as costs have fallen it is much more efficient to go with 10GbE.
      •  Multiple ports are teamed for both redundancy and throughput. LACP and software bonding are the most common methods.
      •  IPv4 is most widely used. IPv6 has limited support at the OS and Hadoop level.
  • 17. Attributes of a Good Switch for Big Data
      •  Non-blocking backplane
      •  Deep per-port packet buffers (shared buffers do not work well). During the sort/shuffle phases of map/reduce operations, network traffic is so chaotic that it can saturate any and all shared buffers, impacting the network performance of multiple hosts.
      •  Good choices:
         –  1GbE
            ›  S55
            ›  S60
         –  10GbE
            ›  S4810
            ›  S5000
         –  40GbE
            ›  Z9000
            ›  Z9500
            ›  S6000
  • 18. Dell Hadoop Solution Logical Diagram
  • 20. Dell Points of Integration
      •  VLT / VRRP is a very affordable way to team switches at both the ToR and the aggregation tiers. This makes the Dell Networking Force10 switches a great choice.
      •  Active Fabric Manager
         –  Speeds up the creation and administration of the required VLT / VRRP configuration on the switches.
         –  Helps with capacity planning as customers scale
  • 21. Big Data Networking Futures
      •  40GbE onboard LOMs will begin to be used for high-volume clusters. Right now the cost:benefit ratio isn’t there yet.
      •  As HPC and Big Data converge, we’ll start to see the use of InfiniBand (IB) for node-to-node connectivity.
      •  In-memory (Spark / Impala) workloads are reducing the bottlenecks that used to exist at the disk, moving them to the processor and network. Expect customers to look to increase core counts and network speed to overcome this.
  • 22. Etu+Dell = complete Hadoop/Big Data solution provider (@Dell_Enterprise, Enterprise Solutions)
      •  Best of breed Cloudera partners - Etu
      •  Analytic software solutions for Big Data
      •  Dell Professional Services for Big Data
      •  Dell PowerEdge 13G servers
      •  Dell Networking solutions
      •  Installation and configuration service
      •  Complete end-to-end implementation
      •  Discover / Plan / Implement / Investigate
  • 23. Solution architecture [diagram]
      •  Stages: 1. Integrate, 2. Store, 3. Analyze, 4. Act
      •  Tools: Toad Data Point (Desktop – integrate, cleanse); Dell Boomi (Cloud – integrate, correlate); Toad Intelligence Central (data aggregation and virtualization); Dell STATISTICA; Dell Statistica Big Data (Desktop – crawl, save)
      •  Data labels: Customer data, Order data, Events, Stock market data, Marketing campaigns, Social Media
      •  Other labels: Advanced Analytics, Analytical output
  • 24. Futures
      •  Speed improvements in Map/Reduce
      •  More in-memory workloads
         –  Possible move to Spark to replace Map/Reduce
      •  Virtualized Hadoop
         –  VMware Big Data Extensions
         –  OpenStack Sahara
         –  Microsoft HDInsight (Hortonworks)
  • 25. Dell In-Memory Appliance for Cloudera Enterprise: Configurations at a glance
      •  Starter Configuration: 8 Node Cluster
         –  PowerEdge R720: 4 Infrastructure Nodes with ProSupport
         –  PowerEdge R720XD: 4 Data Nodes with ProSupport
         –  Cloudera Enterprise
         –  Force10 S4810P, Force10 S55
         –  Dell Rack 42U
         –  ~176TB (disk raw space)
      •  Mid-Size Configuration: 16 Node Cluster
         –  PowerEdge R720: 4 Infrastructure Nodes with ProSupport
         –  PowerEdge R720XD: 12 Data Nodes with ProSupport
         –  Cloudera Enterprise
         –  Force10 S4810P, Force10 S55
         –  Dell Rack 42U
         –  ~528TB (disk raw space)
      •  Small Enterprise Configuration: 24 Node Cluster
         –  PowerEdge R720: 4 Infrastructure Nodes with ProSupport
         –  PowerEdge R720XD: 20 Data Nodes with ProSupport
         –  Cloudera Enterprise
         –  Force10 S4810P, Force10 S55
         –  Dell Rack 42U
         –  ~880TB (disk raw space)
      •  Expansion Unit: PowerEdge R720XD, 4 Data Nodes with ProSupport, Cloudera Enterprise; scales in blocks
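The raw-space figures on slide 25 scale linearly with the number of data nodes: 176 TB for 4 nodes works out to about 44 TB per node, which matches the 12-node and 20-node figures. The Python sketch below uses that derived value to estimate how many data nodes a target usable capacity would need when growing in 4-node expansion units. The function names are hypothetical, the per-node figure is inferred from the slide rather than stated on it, and the Replication Factor of 3 is the HDFS default mentioned earlier in the deck.

```python
import math

# Derived from the Starter configuration: ~176 TB raw across 4 data nodes.
RAW_TB_PER_DATA_NODE = 176 / 4
# The expansion unit on the slide adds data nodes in blocks of 4.
EXPANSION_BLOCK_NODES = 4


def raw_tb(data_nodes: int) -> float:
    """Approximate raw disk space for a given number of data nodes."""
    return data_nodes * RAW_TB_PER_DATA_NODE


def data_nodes_for_usable_tb(target_usable_tb: float, replication_factor: int = 3) -> int:
    """Data nodes needed for a usable-capacity target, rounded up to whole blocks."""
    raw_needed = target_usable_tb * replication_factor
    nodes = math.ceil(raw_needed / RAW_TB_PER_DATA_NODE)
    blocks = math.ceil(nodes / EXPANSION_BLOCK_NODES)
    return blocks * EXPANSION_BLOCK_NODES


if __name__ == "__main__":
    print(raw_tb(12), raw_tb(20))          # compare with the slide's ~528 TB and ~880 TB
    print(data_nodes_for_usable_tb(100))   # nodes for a hypothetical 100 TB usable target
```

Rounding up to whole blocks mirrors the deck's advice to buy nodes in standard size groups rather than one at a time.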