2. 2
Agenda
§ What am I doing?
§ Big Data Computing History
– Supercomputer
– Parallel Computing
– Linux Cluster
– Big Data Computing
§ Google File System (GFS)
§ Hadoop Map and Reduce
§ Spark Stream Processing
§ References
5. 5
Healing Platform
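Mobile platform
Open API
Medical data
Providers 1..N
50 million people x 17 records / 365 days = ~2 million records/day
Life records
Providers 1..N
50 million people x 5 times = ~300 million/day
Personal healing records
Repositories 1..N
50 million people/day
Request
Transfer
Store
Service
Analysis engine
Mobile services 1..N
RESTful API
Within 3 seconds
Load request
Standard conversion
Targeted data / healing knowledge base (NoSQL DB)
TD TD
TD KB
Conversion/filtering
Stream computing (update management)
DB for high-speed computation
DW construction
DC DC
DC DC
Big Data
Personal Data
Control
Service
Analysis platform
Data relay
Request
Transfer
Public clinical case big data
Personal healing record case big data
Raw big data (HDFS)
Similar-case search
Trend
Planning
TD construction
Knowledge base construction engine
Cluster, CBR, …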
9. 9
Architecture of HyperCube
John P. Hayes et al., “Architecture of a Hypercube Supercomputer,” International Conference on Parallel Processing, 1986.
http://web.eecs.umich.edu/~tnm/trev_test/papersPDF/1986.08.Architecture%20Of%20A%20Hypercube%20Supercomputer_Conf_Parallel_Processing.pdf
25. 25
Limitations of achieving this goal
§ Visible Human Project
– Data Size : 40GB (~100GB)
§ Linux File System (ext2)
– Maximum file size : 16GB
– IDE bandwidth : 33MB/s (66MB/s)
– Ethernet bandwidth : 100Mbps (below 30Mbps)
– RAM : not enough
§ Myrinet network interface
– Too difficult to use
– Kernel hooking required!!!
§ Programming Model
– PVM or MPI – Too Slow & Difficult!!!
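To make the bandwidth limits above concrete, here is a rough back-of-the-envelope sketch in Python of how long a single pass over the Visible Human data would take at the Ethernet rates listed. It assumes decimal units (1 GB = 10^9 bytes, 1 Mbps = 10^6 bits/s) and ignores protocol overhead; the function name `transfer_seconds` is only illustrative.

```python
def transfer_seconds(size_gb: float, link_mbps: float) -> float:
    """Seconds to move size_gb gigabytes over a link of link_mbps megabits/s.

    Assumes decimal units (1 GB = 1e9 bytes, 1 Mbps = 1e6 bits/s), no overhead.
    """
    bits = size_gb * 1e9 * 8
    return bits / (link_mbps * 1e6)


if __name__ == "__main__":
    # Data sizes and link rates taken from the slide: 40GB (~100GB), 100Mbps / 30Mbps.
    for size_gb in (40, 100):
        for mbps in (100, 30):
            minutes = transfer_seconds(size_gb, mbps) / 60
            print(f"{size_gb} GB at {mbps} Mbps: about {minutes:.0f} minutes")
```

Under these assumptions, moving 40GB once takes close to an hour at the nominal 100Mbps and close to three hours at 30Mbps.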
27. 27
Google File System (GFS, 2003)
Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung, “The Google File System,” SOSP 2003.
http://static.googleusercontent.com/media/research.google.com/en//archive/gfs-sosp2003.pdf
32. 32
How about this example?
§ Counting phone call logs?
– Total call time per user
– KT’s case : 40TB / month
– No exceptions allowed
§ Oracle Database
– HW cost : ?
– SW cost : Over 400,000,000 Korean Won
– Time cost : about 1 day.
33. 33
Solution?
§ Simple is best
– Log Merge
// add each log entry's call time to its user's running total
for (int i = 0; i < max_log; i++)
    user[log[i].id].usage_time += log[i].usage_time;
But it takes too much time!!!
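The same log merge can be split into independent pieces, which is the idea the Hadoop Map and Reduce item on the agenda builds on. The following is a minimal single-process sketch in Python, not Hadoop code: each chunk of the log is summed locally (a map step with local combining), then the partial sums are merged (a reduce step). The record layout `(user_id, usage_time)` and the names `map_chunk`, `reduce_totals`, and `split` are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple

LogRecord = Tuple[str, int]  # (user id, call duration in seconds); assumed layout


def map_chunk(chunk: Iterable[LogRecord]) -> Counter:
    """Map step: sum usage time per user within one chunk of the log."""
    partial = Counter()
    for user_id, usage_time in chunk:
        partial[user_id] += usage_time
    return partial


def reduce_totals(partials: Iterable[Counter]) -> Counter:
    """Reduce step: merge the per-chunk partial sums into final totals."""
    totals = Counter()
    for partial in partials:
        totals.update(partial)  # Counter.update adds counts, it does not overwrite
    return totals


def split(log: List[LogRecord], n_chunks: int) -> List[List[LogRecord]]:
    """Split the log into at most n_chunks roughly equal slices (n_chunks >= 1)."""
    size = max(1, -(-len(log) // n_chunks))  # ceiling division
    return [log[i:i + size] for i in range(0, len(log), size)]


if __name__ == "__main__":
    log = [("alice", 120), ("bob", 30), ("alice", 45), ("carol", 300), ("bob", 15)]
    # Each chunk could be handled by a different machine; here they run in sequence.
    partials = [map_chunk(chunk) for chunk in split(log, 2)]
    print(dict(reduce_totals(partials)))
```

Because each chunk is summed without looking at the others, the map step can run on many machines at once, and only the small per-user partial sums need to be moved for the final merge.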
43. 43
Conclusion
§ Big Data Computing?
– Of course it is needed!! But is it for us?
§ We did a lot.
– Do we need to broaden our perspective?
§ What’s next?
– Trends repeat!!!
– Your major might come back again?