Big Data & Hadoop by Skidmarkii

Big Data & Hadoop
Taewoo Kim
fb.com/taewoo.kim.3910829
taewook1124@gmail.com

Definition of Big Data (1)
• From Wikipedia
>> A large set of structured or unstructured data that cannot be managed with a conventional DBMS
>> The technology for extracting value from such data and analyzing the results

Definition of Big Data (2)
• From Udacity
>> It's data that's too big to be processed on a single machine.
• The 3 Vs
>> Volume : the size of the data
>> Variety : the diversity of the data
>> Velocity : the speed at which data is generated and processed

Definition of Hadoop
• From Wikipedia
>> An open-source framework that supports distributed applications running on large computer clusters capable of processing large amounts of data

Core Hadoop
• Store in HDFS
• Process with Map Reduce (Map, then Reduce)

Hadoop Distributed File System
• File
>> Split into chunks (blocks): BLK_1, BLK_2, BLK_3
• Cluster
>> NameNode
>> DataNode
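
The chunking idea on this slide can be sketched in a few lines of Python. This is only an illustration, not how HDFS is implemented: the block size, the DataNode names, and the round-robin placement are all assumptions made for the example, and a real cluster uses much larger blocks and keeps several replicas of each one. The `block_map` dictionary stands in for the kind of bookkeeping a NameNode does (which block lives on which DataNode).

```python
from typing import Dict, List


def split_into_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Cut the file contents into fixed-size chunks; the last one may be shorter."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(blocks: List[bytes], datanodes: List[str]) -> Dict[str, str]:
    """Assign each block to a DataNode in round-robin order (an assumed policy)."""
    if not datanodes:
        raise ValueError("at least one DataNode is required")
    block_map = {}
    for index in range(len(blocks)):
        block_id = f"BLK_{index + 1}"
        block_map[block_id] = datanodes[index % len(datanodes)]
    return block_map


# Hypothetical 25-byte "file" and a toy block size of 10 bytes.
file_data = b"x" * 25
blocks = split_into_blocks(file_data, block_size=10)
block_map = place_blocks(blocks, ["datanode-a", "datanode-b"])

for block_id, node in block_map.items():
    print(block_id, "->", node)
```

Joining the blocks back together in order gives the original file, which is why the mapping from block to node is enough to read the file back.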

Map Reduce
• Mappers
>> Use the index to generate intermediate records in key-value form
• Shuffle and Sort
>> Passes the intermediate records to the Reducers
• Reducers
>> Each receives a key and all of the values for that key
• Result
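
The flow on this slide can be imitated on a single machine with plain Python. The sketch below is a word count, a common teaching example, and it is an assumption that this matches the job the slide had in mind. The function names (`mapper`, `shuffle_and_sort`, `reducer`, `run_job`) are hypothetical and are not Hadoop APIs; on a real cluster the framework performs the shuffle and sort step between machines.

```python
from collections import defaultdict


def mapper(line):
    """Emit one (key, value) intermediate record per word."""
    for word in line.lower().split():
        yield word, 1


def shuffle_and_sort(records):
    """Group values by key and sort by key, as happens before the reducers run."""
    grouped = defaultdict(list)
    for key, value in records:
        grouped[key].append(value)
    return sorted(grouped.items())


def reducer(key, values):
    """Receive a key with all of its values and produce one result record."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle and sort, then reduce, all in one process."""
    intermediate = [record for line in lines for record in mapper(line)]
    return [reducer(key, values) for key, values in shuffle_and_sort(intermediate)]


result = run_job(["big data", "big cluster"])
print(result)  # [('big', 2), ('cluster', 1), ('data', 1)]
```

Each reducer call sees one key together with every value emitted for it, which is the property the Reducers step above describes.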

What I’ll do
• Work through a tutorial
>> Set up
>> Run examples
• And more...
>> Udacity.com
>> GitHub
