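Korean Society of Health Informatics and Statistics, Fall Conference 2013

Analytic Visualization of “Big” Data
Analytic Data Visualization
Myung-Hoe Huh (許明會)

2013.11.29

Department of Statistics, Korea University, stat420@korea.ac.kr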

Health Info & Stat
Data Visualization
- Descriptive vs Analytic ...
- Small vs Big ...

[Figure: science, technology, art]

Contents
- Scatterplot
- Biplot
- Regression Biplot
- Kernel PCA
- SVM Biplot

Scatterplot
- “Lego” for analytic data visualization
- Reflecting the third variable

quakes data: longitude (= x), latitude (= y), depth (= z)

Scatterplot
- For large $n$, over-plotting can seriously obscure the structure of the data.

Skin Segmentation Data: red vs. green
      

Scatterplot
- For large $n$, the alpha channel (transparency) can be utilized.

Skin Segmentation Data: red vs. green
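
A rough sketch of why the alpha channel helps, assuming standard "over" compositing of identically colored points: k overlapping points, each drawn with opacity alpha, give a combined opacity of 1 - (1 - alpha)^k. Ink density then grows gradually with the local point count instead of saturating at the first point. The function name and the alpha values are illustrative assumptions, not taken from the slides.

```python
def stacked_opacity(alpha: float, k: int) -> float:
    """Combined opacity of k overlapping points, each drawn with opacity alpha.

    Assumption: standard "over" compositing of identically colored marks.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if k < 0:
        raise ValueError("k must be non-negative")
    # Each extra point lets through a fraction (1 - alpha) of what is behind it.
    return 1.0 - (1.0 - alpha) ** k
```

With alpha = 0.05, for example, ten overlapping points reach an opacity of about 0.40, while a hundred reach roughly 0.99, so dense and sparse regions remain distinguishable.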
      

Scatterplot
- lowess: A nonparametric regression for bivariate data

cars data: distance vs. speed
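
The idea behind lowess can be sketched as a locally weighted straight-line fit with tricube weights. This is a simplified single-pass version: it omits the robustness iterations of the full procedure, and the function name, the `frac` default, and the NumPy dependency are assumptions made for illustration.

```python
import numpy as np


def lowess(x, y, frac=0.5):
    """Locally weighted linear regression (single pass, no robustness steps).

    For each x[i], fit a weighted straight line to the nearest
    ceil(frac * n) points using tricube weights and evaluate it at x[i].
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    k = max(2, int(np.ceil(frac * n)))        # neighbourhood size
    fitted = np.empty(n)
    for i in range(n):
        dist = np.abs(x - x[i])
        h = np.sort(dist)[k - 1]              # bandwidth: k-th smallest distance
        if h == 0.0:
            h = 1.0                           # neighbours coincide with x[i]
        w = np.clip(1.0 - (dist / h) ** 3, 0.0, None) ** 3   # tricube weights
        sw = np.sqrt(w)
        # Weighted least squares for the local intercept and slope.
        A = np.column_stack([np.ones(n), x - x[i]]) * sw[:, None]
        coef, *_ = np.linalg.lstsq(A, y * sw, rcond=None)
        fitted[i] = coef[0]                   # value of the local line at x[i]
    return fitted
```

Plotting `fitted` against `x` on top of the scatterplot gives the smooth curve; a smaller `frac` follows the data more closely, a larger one smooths more.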

Scatterplot
- 3D Rotation for three variables

Skin Segmentation Data: red, green, blue

- ggobi: 3D rotation for four or more variables

Biplot of Observations and Variables, Gabriel (1971)

- The biplot is a graph that shows $n$ observations and $p$ variables.

Protein data (rows: 25 nations; columns: 9 protein sources)

Biplot of Observations and Variables, Gabriel (1971)

- Idea: Linear projection

Protein data: variable cereal
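
The linear-projection idea can be sketched with a singular value decomposition: row markers G and column markers H are chosen so that G H' is the best rank-2 approximation of the centered data, which means projecting an observation's marker onto a variable's arrow approximates that observation's (centered) value on the variable. The function name and the particular scaling (all singular values assigned to the rows) are assumptions for illustration; other biplot scalings exist.

```python
import numpy as np


def biplot_coordinates(X, scale=True):
    """Rank-2 biplot markers from the SVD of the column-centered matrix.

    Returns (G, H): G is n x 2 (observations), H is p x 2 (variables),
    with G @ H.T equal to the best rank-2 approximation of the centered
    (and optionally standardized) data.
    """
    X = np.asarray(X, dtype=float)
    Z = X - X.mean(axis=0)
    if scale:
        # Assumes no column is constant; otherwise the divisor is zero.
        Z = Z / Z.std(axis=0, ddof=1)
    U, d, Vt = np.linalg.svd(Z, full_matrices=False)
    G = U[:, :2] * d[:2]     # observation markers: principal component scores
    H = Vt[:2].T             # variable markers: end points of the arrows
    return G, H
```

For a data matrix such as the protein table (25 rows, 9 columns), `G` would give the 25 points and `H` the 9 arrows drawn from the origin.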

Regression Biplot, Huh and Lee (2013)

- The regression biplot is a graph of $n$ observations on $x_1, \dots, x_p$, arranged by predicted $y$.
- Assume that the model fit is determined by a function of a linear combination of $x_1, \dots, x_p$. For instance,
  $\hat{y} = b_0 + b_1 x_1 + \cdots + b_p x_p$,
  or
  $\log\{\hat{\pi} / (1 - \hat{\pi})\} = b_0 + b_1 x_1 + \cdots + b_p x_p$.
- Set the vertical dimension by the direction of the regression coefficients $b = (b_1, \dots, b_p)'$, or $u = b / \|b\|$.
- Set the horizontal dimension by the direction of the principal axis of $x_1^{\perp}, \dots, x_n^{\perp}$, where $x_i^{\perp}$ denotes the orthogonal component generated from the projection of $x_i$ on $b$.
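
The two axes described above can be sketched for the linear regression case as follows. This is an illustrative reading of the slide, not the authors' code: the function name, the use of centered predictors, and the ordinary least-squares fit via NumPy are assumptions.

```python
import numpy as np


def regression_biplot_axes(X, y):
    """Axes for a regression-guided biplot (linear regression case).

    Vertical axis u:   unit vector along the least-squares slope vector b.
    Horizontal axis v: first principal axis of the centered observations
                       after removing their component along u.
    Returns (coords, u, v); coords[:, 0] is horizontal, coords[:, 1] vertical.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)
    # Slopes b_1..b_p; centering X and y absorbs the intercept b_0.
    b, *_ = np.linalg.lstsq(Xc, y - y.mean(), rcond=None)
    u = b / np.linalg.norm(b)
    X_perp = Xc - np.outer(Xc @ u, u)       # orthogonal component of each x_i
    _, _, Vt = np.linalg.svd(X_perp, full_matrices=False)
    v = Vt[0]                               # principal axis of the residuals
    coords = np.column_stack([Xc @ v, Xc @ u])
    return coords, u, v
```

Because the vertical coordinate equals the centered fitted value divided by $\|b\|$, observations are ordered from bottom to top by predicted $y$.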

Regression Biplot, Huh and Lee (2013)

Example 1. Stack Loss Data ($y$ = loss of ammonia)

Regression Biplot, Huh and Lee (2013)

Example 2. Magazine Data ($y$ = subscription (0, 1))

Kernel PCA, Schölkopf et al. (1998)

- For $n$ observations $x_1, \dots, x_n$ ($p \times 1$), consider the nonlinear mapping
  $x_1 \to \phi(x_1), \dots, x_n \to \phi(x_n)$
  to a Hilbert space, in which $\langle \phi(x_i), \phi(x_j) \rangle = k(x_i, x_j)$.
- Denoting $K = \{k(x_i, x_j)\}$, kernel PCA is obtained from eigen-decomposing
  $\tilde{K} = (I_n - 1_n 1_n'/n)\, K\, (I_n - 1_n 1_n'/n)$.
- Kernel PCA yields a plot of observations by projecting $\phi(x_1), \dots, \phi(x_n)$ on
  $v = \tilde{\Phi}' a / \sqrt{a' \tilde{K} a}$,
  where $\tilde{\Phi} = (I_n - 1_n 1_n'/n)\, \Phi$, $\Phi = (\phi(x_1), \dots, \phi(x_n))'$, and $a$ is an eigenvector of $\tilde{K}$.
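
A minimal sketch of the kernel PCA steps on this slide: build the kernel matrix, double-center it, eigen-decompose, and read off the scores. The RBF parameterization $k(a, b) = \exp(-\sigma \|a - b\|^2)$, the default `sigma`, and the function names are assumptions for illustration.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """k(a, b) = exp(-sigma * ||a - b||^2) for all rows of A and B (assumed form)."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sigma * np.clip(sq, 0.0, None))


def kernel_pca_scores(X, sigma=0.1, n_components=2):
    """Scores of the observations on the leading kernel principal axes."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    H = np.eye(n) - np.ones((n, n)) / n         # centering matrix I - 11'/n
    K_tilde = H @ K @ H
    eigval, eigvec = np.linalg.eigh(K_tilde)    # eigenvalues in ascending order
    idx = np.argsort(eigval)[::-1][:n_components]
    lam, A = eigval[idx], eigvec[:, idx]
    # Projecting the centered phi(x_i) on Phi~' a / sqrt(a' K~ a) gives
    # K~ a / sqrt(a' K~ a) = sqrt(lambda) * a for a unit-length eigenvector a.
    return A * np.sqrt(lam)
```

The two columns of the returned array are the horizontal and vertical coordinates of the kernel PC plot of observations.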

Kernel PCA Diagram (or Kernel Biplot), Huh (2013)

- Aim: Representation of $p$ variables in the kernel PC plot of observations.
- Proposed Procedure:
  1) For each $i = 1, \dots, n$, map $\phi(x_i + c\,e_j)$ on the plane, $j = 1, \dots, p$, where $c$ is a constant and $e_j = (0, \dots, 1, \dots, 0)'$ has 1 in the $j$-th position. With $z = x_i + c\,e_j$ and $k_z = (k(x_1, z), \dots, k(x_n, z))'$, the projection on a kernel principal axis is given by
     $a' (I_n - 1_n 1_n'/n)(k_z - K 1_n/n) / \sqrt{a' \tilde{K} a}$,
     where $a$ is the corresponding eigenvector of $\tilde{K} = (I_n - 1_n 1_n'/n)\, K\, (I_n - 1_n 1_n'/n)$.
  2) For each $i$, link the projection points of $\phi(x_i)$ and $\phi(x_i + c\,e_j)$ by an arrow.
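
The arrow procedure can be sketched as follows: project each observation and its perturbed copy onto the first two kernel principal axes, then draw an arrow from the first point to the second. The RBF form $k(a, b) = \exp(-\sigma \|a - b\|^2)$, the defaults for `c` and `sigma`, and the function names are assumptions for illustration, not the author's implementation.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """k(a, b) = exp(-sigma * ||a - b||^2) for all rows of A and B (assumed form)."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sigma * np.clip(sq, 0.0, None))


def kpca_arrows(X, j, c=1.0, sigma=0.1):
    """Start and end points of the arrows for variable j (0-based index).

    Start: kernel PC projection of phi(x_i).
    End:   kernel PC projection of phi(x_i + c * e_j).
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    eigval, eigvec = np.linalg.eigh(H @ K @ H)
    idx = np.argsort(eigval)[::-1][:2]           # two leading axes
    A = eigvec[:, idx] / np.sqrt(eigval[idx])    # a / sqrt(a' K~ a), unit a

    def project(Z):
        Kz = rbf_kernel(Z, X, sigma)             # row i holds k(z_i, x_1..x_n)
        centered = (Kz - K.mean(axis=0)) @ H     # (k_z - K 1/n)' (I - 11'/n)
        return centered @ A

    X_shift = X.copy()
    X_shift[:, j] += c                           # x_i + c * e_j
    return project(X), project(X_shift)
```

Calling `kpca_arrows(X, j)` for each variable index `j` and drawing a segment from each row of the first array to the matching row of the second gives one arrow diagram per variable.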

Example 1. Arrow diagrams for kernel PCA of the iris data with RBF kernel

Example 2. Arrow diagrams for kernel PCA of the spam data

SVM-Guided Biplot as an extension of Regression Biplot
- Idea: Combine the linear/logistic regression biplot and kernel PCA.
- Classification/Regression Part: SVM classifier
  Observations are classified as $y_i = -1$ or $1$ for $i = 1, \dots, n$, by the sign of
  $f(x) = w' \phi(x) + b_0$,
  where $w = \sum_{i=1}^{n} \alpha_i y_i \phi(x_i)$, $\alpha_i \geq 0$.
  The vertical dimension is set to $u = w / \|w\|$.

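SVM-Guided Biplot: Classification
- Kernel PCA Part:
  Remove from each centered $\tilde{\phi}(x_i)$ its component along the vertical direction $u$:
  $\tilde{\phi}^{\perp}(x_i) = \tilde{\phi}(x_i) - \{u' \tilde{\phi}(x_i)\}\, u$, $i = 1, \dots, n$.
  With $\beta = (\alpha_1 y_1, \dots, \alpha_n y_n)'$ and $H = I_n - 1_n 1_n'/n$,
  $\tilde{\phi}^{\perp}(x_i)' \tilde{\phi}^{\perp}(x_{i'}) = \tilde{k}(x_i, x_{i'}) - (H K \beta)_i (H K \beta)_{i'} / (\beta' K \beta)$, $i, i' = 1, \dots, n$.
  Hence
  $\tilde{K} \to \tilde{K}^{\perp} = \tilde{K} - H K \beta \beta' K H / (\beta' K \beta)$.
  The horizontal dimension is determined by eigen-decomposing $\tilde{K}^{\perp}$.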

- Perturbation Scheme for Arrow Diagrams.
  Define $x_i + c\,e_j$ ($p \times 1$), where $e_j$ represents a perturbation of the $j$-th variable whose magnitude is controlled by $c$. Then, project $\phi(x_i + c\,e_j)$ on the first (vertical) and the second (horizontal) dimension.

Example 1. Iris Data: Versicolor vs. Virginica [sigma = 0.1, C = 1]

Importance of Variables (in the case of large $p$)

- It is necessary to select a small number of variables in determining the first and second dimensions.
- Measure of importance (definition): length of arrows
  1) in the vertical direction,
  2) in the horizontal direction.
- Plot arrow diagrams for important variables only.
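
One way to turn arrow lengths into a number per variable is sketched below. The slide defines importance as the length of the arrows in each direction; averaging the absolute lengths over observations, the column convention (column 0 horizontal, column 1 vertical), and the function name are assumptions made for this illustration.

```python
import numpy as np


def arrow_importance(start, end):
    """Mean absolute arrow length in each plot direction for one variable.

    start, end: arrays of shape (n, 2) holding the arrow start and end points;
    column 0 is taken as horizontal and column 1 as vertical (assumed layout).
    """
    d = np.abs(np.asarray(end, dtype=float) - np.asarray(start, dtype=float))
    return {"horizontal": float(d[:, 0].mean()), "vertical": float(d[:, 1].mean())}
```

Computing this for every variable and keeping only those with the largest values in either direction gives a short list of variables whose arrow diagrams are worth plotting.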

Example 2. Spam Data [sigma = 0.1, C = 10]

SVM-Guided Biplot: Regression
- The same method can be applied to SVM regression.
- Example 3. Aerobic Fitness data for oxygen uptake (= $y$) with RBF kernel ($\sigma$ = 0.1, C = 10, $\epsilon$ = 0.1)

Concluding Remarks
- The biplot method can be extended to suit linear regression or classification (logistic regression).
- The biplot method can be extended to allow nonlinear mapping of observations and variables, by fully utilizing the kernel trick.

http://blog.naver.com/huh4200

Goldfish bowl (on the iPad)

References
Gabriel, K.R. (1971). “The biplot graphic display of matrices with application to principal component analysis”. Biometrika, 58, 453-467.
Huh, M.H. (2013). “Arrow diagrams for kernel principal component analysis”. Communications for Statistical Applications and Methods, 20, 175-184.
Huh, M.H. (2013). “SVM-guided biplot of observations and variables”. Communications for Statistical Applications and Methods. (to appear)
Huh, M.H. and Lee, Y.G. (2013). “Biplots of multivariate data guided by linear and/or logistic regression”. Communications for Statistical Applications and Methods, 20, 129-136.
Schölkopf, B., Smola, A. and Müller, K.R. (1998). “Nonlinear component analysis as a kernel eigenvalue problem”. Neural Computation, 10, 1299-1319.

2013.11.29

26

Health Info & Stat

More Related Content

What's hot

Parallel Processing Technique for Time Efficient Matrix Multiplication
Parallel Processing Technique for Time Efficient Matrix MultiplicationParallel Processing Technique for Time Efficient Matrix Multiplication
Parallel Processing Technique for Time Efficient Matrix MultiplicationIJERA Editor
 
Supporting Flight Test And Flight Matching
Supporting Flight Test And Flight MatchingSupporting Flight Test And Flight Matching
Supporting Flight Test And Flight Matchingj2aircraft
 
A Novel Approach to Analyze Satellite Images for Severe Weather Events
A Novel Approach to Analyze Satellite Images for Severe Weather EventsA Novel Approach to Analyze Satellite Images for Severe Weather Events
A Novel Approach to Analyze Satellite Images for Severe Weather EventsIJERA Editor
 
[ICLR/ICML2019読み会] A Wrapped Normal Distribution on Hyperbolic Space for Grad...
[ICLR/ICML2019読み会] A Wrapped Normal Distribution on Hyperbolic Space for Grad...[ICLR/ICML2019読み会] A Wrapped Normal Distribution on Hyperbolic Space for Grad...
[ICLR/ICML2019読み会] A Wrapped Normal Distribution on Hyperbolic Space for Grad...Yoshihiro Nagano
 

What's hot (6)

Parallel Processing Technique for Time Efficient Matrix Multiplication
Parallel Processing Technique for Time Efficient Matrix MultiplicationParallel Processing Technique for Time Efficient Matrix Multiplication
Parallel Processing Technique for Time Efficient Matrix Multiplication
 
Supporting Flight Test And Flight Matching
Supporting Flight Test And Flight MatchingSupporting Flight Test And Flight Matching
Supporting Flight Test And Flight Matching
 
A Novel Approach to Analyze Satellite Images for Severe Weather Events
A Novel Approach to Analyze Satellite Images for Severe Weather EventsA Novel Approach to Analyze Satellite Images for Severe Weather Events
A Novel Approach to Analyze Satellite Images for Severe Weather Events
 
Vldb14
Vldb14Vldb14
Vldb14
 
Four data models in GIS
Four data models in GISFour data models in GIS
Four data models in GIS
 
[ICLR/ICML2019読み会] A Wrapped Normal Distribution on Hyperbolic Space for Grad...
[ICLR/ICML2019読み会] A Wrapped Normal Distribution on Hyperbolic Space for Grad...[ICLR/ICML2019読み会] A Wrapped Normal Distribution on Hyperbolic Space for Grad...
[ICLR/ICML2019読み会] A Wrapped Normal Distribution on Hyperbolic Space for Grad...
 

Viewers also liked

데이터 분석 실무 2강 (실습 1차)
데이터 분석 실무 2강 (실습 1차)데이터 분석 실무 2강 (실습 1차)
데이터 분석 실무 2강 (실습 1차)YongGeun Song
 
대화형지도 Carto를 활용한 데이터 분석 및 통찰력
대화형지도 Carto를 활용한 데이터  분석 및 통찰력대화형지도 Carto를 활용한 데이터  분석 및 통찰력
대화형지도 Carto를 활용한 데이터 분석 및 통찰력선경 김선경
 
장가(시집) 갈 수 있을까? 스피드 데이팅 데이터 분석
장가(시집) 갈 수 있을까? 스피드 데이팅 데이터 분석장가(시집) 갈 수 있을까? 스피드 데이팅 데이터 분석
장가(시집) 갈 수 있을까? 스피드 데이팅 데이터 분석YBIGTA
 
인프라 성능 데이터 분석 시작하기 (김아령)
인프라 성능 데이터 분석 시작하기 (김아령)인프라 성능 데이터 분석 시작하기 (김아령)
인프라 성능 데이터 분석 시작하기 (김아령)삵 (sarc.io)
 
데이터 분석 실무 1강
데이터 분석 실무 1강데이터 분석 실무 1강
데이터 분석 실무 1강YongGeun Song
 
음성인식 및 웹 기반 어플리케이션을 통한 유비쿼터스 스마트홈 제어
음성인식 및 웹 기반 어플리케이션을 통한 유비쿼터스 스마트홈 제어음성인식 및 웹 기반 어플리케이션을 통한 유비쿼터스 스마트홈 제어
음성인식 및 웹 기반 어플리케이션을 통한 유비쿼터스 스마트홈 제어Dongsam Byun
 
판매정보 빅데이터 분석을 통한 판매 예측 시스템
판매정보 빅데이터 분석을 통한 판매 예측 시스템판매정보 빅데이터 분석을 통한 판매 예측 시스템
판매정보 빅데이터 분석을 통한 판매 예측 시스템Dongsam Byun
 
데이터 분석을 통해 웹페이지 Ui 개선, 소비자의 니즈를 캐치하는 마케터 바쁘남 김두영
데이터 분석을 통해 웹페이지 Ui 개선, 소비자의 니즈를 캐치하는 마케터 바쁘남 김두영데이터 분석을 통해 웹페이지 Ui 개선, 소비자의 니즈를 캐치하는 마케터 바쁘남 김두영
데이터 분석을 통해 웹페이지 Ui 개선, 소비자의 니즈를 캐치하는 마케터 바쁘남 김두영FAST CAMPUS
 
비즈니스 데이터 분석의 현재와 미래
비즈니스 데이터 분석의 현재와 미래비즈니스 데이터 분석의 현재와 미래
비즈니스 데이터 분석의 현재와 미래HT Kim
 
예측 분석 산업별 사례 147
예측 분석 산업별 사례 147예측 분석 산업별 사례 147
예측 분석 산업별 사례 147eungjin cho
 
검색로그시스템 with Python
검색로그시스템 with Python검색로그시스템 with Python
검색로그시스템 with Pythonitproman35
 
파이썬 데이터 분석 3종세트
파이썬 데이터 분석 3종세트파이썬 데이터 분석 3종세트
파이썬 데이터 분석 3종세트itproman35
 
20141214 빅데이터실전기술 - 유사도 및 군집화 방법 (Similarity&Clustering)
20141214 빅데이터실전기술 - 유사도 및 군집화 방법 (Similarity&Clustering) 20141214 빅데이터실전기술 - 유사도 및 군집화 방법 (Similarity&Clustering)
20141214 빅데이터실전기술 - 유사도 및 군집화 방법 (Similarity&Clustering) Tae Young Lee
 
데이터분석의 길 4: “고수는 통계학습의 달인이다”
데이터분석의 길 4:  “고수는 통계학습의 달인이다”데이터분석의 길 4:  “고수는 통계학습의 달인이다”
데이터분석의 길 4: “고수는 통계학습의 달인이다”Jaimie Kwon (권재명)
 
[아꿈사] 게임 기초 수학 물리 1,2장
[아꿈사] 게임 기초 수학 물리 1,2장[아꿈사] 게임 기초 수학 물리 1,2장
[아꿈사] 게임 기초 수학 물리 1,2장sung ki choi
 
빅데이터 시각화 기술 특허 동향 분석
빅데이터 시각화 기술 특허 동향 분석빅데이터 시각화 기술 특허 동향 분석
빅데이터 시각화 기술 특허 동향 분석Newsjelly
 
빅데이터 분석 시각화 분석 : 4장 빅데이터와 시각화 디자인
빅데이터 분석 시각화 분석 : 4장 빅데이터와 시각화 디자인빅데이터 분석 시각화 분석 : 4장 빅데이터와 시각화 디자인
빅데이터 분석 시각화 분석 : 4장 빅데이터와 시각화 디자인Ji Lee
 
빅데이터 기술 현황과 시장 전망(2014)
빅데이터 기술 현황과 시장 전망(2014)빅데이터 기술 현황과 시장 전망(2014)
빅데이터 기술 현황과 시장 전망(2014)Channy Yun
 
빅데이터 분석 시각화 분석 : 3장 시각화 방법
빅데이터 분석 시각화 분석 : 3장 시각화 방법빅데이터 분석 시각화 분석 : 3장 시각화 방법
빅데이터 분석 시각화 분석 : 3장 시각화 방법Ji Lee
 
빅데이터 분석 시각화 분석 : 1장 시각화정의 2장 프로세스
빅데이터 분석 시각화 분석 : 1장 시각화정의 2장 프로세스빅데이터 분석 시각화 분석 : 1장 시각화정의 2장 프로세스
빅데이터 분석 시각화 분석 : 1장 시각화정의 2장 프로세스Ji Lee
 

Viewers also liked (20)

데이터 분석 실무 2강 (실습 1차)
데이터 분석 실무 2강 (실습 1차)데이터 분석 실무 2강 (실습 1차)
데이터 분석 실무 2강 (실습 1차)
 
대화형지도 Carto를 활용한 데이터 분석 및 통찰력
대화형지도 Carto를 활용한 데이터  분석 및 통찰력대화형지도 Carto를 활용한 데이터  분석 및 통찰력
대화형지도 Carto를 활용한 데이터 분석 및 통찰력
 
장가(시집) 갈 수 있을까? 스피드 데이팅 데이터 분석
장가(시집) 갈 수 있을까? 스피드 데이팅 데이터 분석장가(시집) 갈 수 있을까? 스피드 데이팅 데이터 분석
장가(시집) 갈 수 있을까? 스피드 데이팅 데이터 분석
 
인프라 성능 데이터 분석 시작하기 (김아령)
인프라 성능 데이터 분석 시작하기 (김아령)인프라 성능 데이터 분석 시작하기 (김아령)
인프라 성능 데이터 분석 시작하기 (김아령)
 
데이터 분석 실무 1강
데이터 분석 실무 1강데이터 분석 실무 1강
데이터 분석 실무 1강
 
음성인식 및 웹 기반 어플리케이션을 통한 유비쿼터스 스마트홈 제어
음성인식 및 웹 기반 어플리케이션을 통한 유비쿼터스 스마트홈 제어음성인식 및 웹 기반 어플리케이션을 통한 유비쿼터스 스마트홈 제어
음성인식 및 웹 기반 어플리케이션을 통한 유비쿼터스 스마트홈 제어
 
판매정보 빅데이터 분석을 통한 판매 예측 시스템
판매정보 빅데이터 분석을 통한 판매 예측 시스템판매정보 빅데이터 분석을 통한 판매 예측 시스템
판매정보 빅데이터 분석을 통한 판매 예측 시스템
 
데이터 분석을 통해 웹페이지 Ui 개선, 소비자의 니즈를 캐치하는 마케터 바쁘남 김두영
데이터 분석을 통해 웹페이지 Ui 개선, 소비자의 니즈를 캐치하는 마케터 바쁘남 김두영데이터 분석을 통해 웹페이지 Ui 개선, 소비자의 니즈를 캐치하는 마케터 바쁘남 김두영
데이터 분석을 통해 웹페이지 Ui 개선, 소비자의 니즈를 캐치하는 마케터 바쁘남 김두영
 
비즈니스 데이터 분석의 현재와 미래
비즈니스 데이터 분석의 현재와 미래비즈니스 데이터 분석의 현재와 미래
비즈니스 데이터 분석의 현재와 미래
 
예측 분석 산업별 사례 147
예측 분석 산업별 사례 147예측 분석 산업별 사례 147
예측 분석 산업별 사례 147
 
검색로그시스템 with Python
검색로그시스템 with Python검색로그시스템 with Python
검색로그시스템 with Python
 
파이썬 데이터 분석 3종세트
파이썬 데이터 분석 3종세트파이썬 데이터 분석 3종세트
파이썬 데이터 분석 3종세트
 
20141214 빅데이터실전기술 - 유사도 및 군집화 방법 (Similarity&Clustering)
20141214 빅데이터실전기술 - 유사도 및 군집화 방법 (Similarity&Clustering) 20141214 빅데이터실전기술 - 유사도 및 군집화 방법 (Similarity&Clustering)
20141214 빅데이터실전기술 - 유사도 및 군집화 방법 (Similarity&Clustering)
 
데이터분석의 길 4: “고수는 통계학습의 달인이다”
데이터분석의 길 4:  “고수는 통계학습의 달인이다”데이터분석의 길 4:  “고수는 통계학습의 달인이다”
데이터분석의 길 4: “고수는 통계학습의 달인이다”
 
[아꿈사] 게임 기초 수학 물리 1,2장
[아꿈사] 게임 기초 수학 물리 1,2장[아꿈사] 게임 기초 수학 물리 1,2장
[아꿈사] 게임 기초 수학 물리 1,2장
 
빅데이터 시각화 기술 특허 동향 분석
빅데이터 시각화 기술 특허 동향 분석빅데이터 시각화 기술 특허 동향 분석
빅데이터 시각화 기술 특허 동향 분석
 
빅데이터 분석 시각화 분석 : 4장 빅데이터와 시각화 디자인
빅데이터 분석 시각화 분석 : 4장 빅데이터와 시각화 디자인빅데이터 분석 시각화 분석 : 4장 빅데이터와 시각화 디자인
빅데이터 분석 시각화 분석 : 4장 빅데이터와 시각화 디자인
 
빅데이터 기술 현황과 시장 전망(2014)
빅데이터 기술 현황과 시장 전망(2014)빅데이터 기술 현황과 시장 전망(2014)
빅데이터 기술 현황과 시장 전망(2014)
 
빅데이터 분석 시각화 분석 : 3장 시각화 방법
빅데이터 분석 시각화 분석 : 3장 시각화 방법빅데이터 분석 시각화 분석 : 3장 시각화 방법
빅데이터 분석 시각화 분석 : 3장 시각화 방법
 
빅데이터 분석 시각화 분석 : 1장 시각화정의 2장 프로세스
빅데이터 분석 시각화 분석 : 1장 시각화정의 2장 프로세스빅데이터 분석 시각화 분석 : 1장 시각화정의 2장 프로세스
빅데이터 분석 시각화 분석 : 1장 시각화정의 2장 프로세스
 

Similar to "빅" 데이터의 분석적 시각화

MFBLP Method Forecast for Regional Load Demand System
MFBLP Method Forecast for Regional Load Demand SystemMFBLP Method Forecast for Regional Load Demand System
MFBLP Method Forecast for Regional Load Demand SystemCSCJournals
 
Computational Dual-system Imaging Processing Methods for Lumbar Spine Specime...
Computational Dual-system Imaging Processing Methods for Lumbar Spine Specime...Computational Dual-system Imaging Processing Methods for Lumbar Spine Specime...
Computational Dual-system Imaging Processing Methods for Lumbar Spine Specime...BRNSSPublicationHubI
 
tadejko2007.pdf
tadejko2007.pdftadejko2007.pdf
tadejko2007.pdfMhartono
 
iPlan BOLD MRI Mapping Clinical White Paper
iPlan BOLD MRI Mapping Clinical White PaperiPlan BOLD MRI Mapping Clinical White Paper
iPlan BOLD MRI Mapping Clinical White PaperBrainlab
 
Adaptive lifting based image compression scheme using interactive artificial ...
Adaptive lifting based image compression scheme using interactive artificial ...Adaptive lifting based image compression scheme using interactive artificial ...
Adaptive lifting based image compression scheme using interactive artificial ...csandit
 
IRJET- Diabetic Haemorrhage Detection using DWT and Elliptical LBP
IRJET- Diabetic Haemorrhage Detection using DWT and Elliptical LBPIRJET- Diabetic Haemorrhage Detection using DWT and Elliptical LBP
IRJET- Diabetic Haemorrhage Detection using DWT and Elliptical LBPIRJET Journal
 
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral Images
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral ImagesBand Clustering for the Lossless Compression of AVIRIS Hyperspectral Images
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral ImagesIDES Editor
 
Hybrid medical image compression method using quincunx wavelet and geometric ...
Hybrid medical image compression method using quincunx wavelet and geometric ...Hybrid medical image compression method using quincunx wavelet and geometric ...
Hybrid medical image compression method using quincunx wavelet and geometric ...journalBEEI
 
Mr image compression based on selection of mother wavelet and lifting based w...
Mr image compression based on selection of mother wavelet and lifting based w...Mr image compression based on selection of mother wavelet and lifting based w...
Mr image compression based on selection of mother wavelet and lifting based w...ijma
 
A Joint QRS Detection and Data Compression Scheme for Wearable Sensors
A Joint QRS Detection and Data Compression Scheme for Wearable SensorsA Joint QRS Detection and Data Compression Scheme for Wearable Sensors
A Joint QRS Detection and Data Compression Scheme for Wearable Sensorsecgpapers
 
Application of Artificial Neural Network (Ann) In Operation of Reservoirs
Application of Artificial Neural Network (Ann) In Operation of ReservoirsApplication of Artificial Neural Network (Ann) In Operation of Reservoirs
Application of Artificial Neural Network (Ann) In Operation of ReservoirsIOSR Journals
 
MR Image Compression Based on Selection of Mother Wavelet and Lifting Based W...
MR Image Compression Based on Selection of Mother Wavelet and Lifting Based W...MR Image Compression Based on Selection of Mother Wavelet and Lifting Based W...
MR Image Compression Based on Selection of Mother Wavelet and Lifting Based W...ijma
 
Reduction of Active Power Loss byUsing Adaptive Cat Swarm Optimization
Reduction of Active Power Loss byUsing Adaptive Cat Swarm OptimizationReduction of Active Power Loss byUsing Adaptive Cat Swarm Optimization
Reduction of Active Power Loss byUsing Adaptive Cat Swarm Optimizationijeei-iaes
 
Human Activity Recognition Using AccelerometerData
Human Activity Recognition Using AccelerometerDataHuman Activity Recognition Using AccelerometerData
Human Activity Recognition Using AccelerometerDataIRJET Journal
 
Time Series Analysis - Modeling and Forecasting
Time Series Analysis - Modeling and ForecastingTime Series Analysis - Modeling and Forecasting
Time Series Analysis - Modeling and ForecastingMaruthi Nataraj K
 
Statistical Data Analysis on a Data Set (Diabetes 130-US hospitals for years ...
Statistical Data Analysis on a Data Set (Diabetes 130-US hospitals for years ...Statistical Data Analysis on a Data Set (Diabetes 130-US hospitals for years ...
Statistical Data Analysis on a Data Set (Diabetes 130-US hospitals for years ...Seval Çapraz
 

Similar to "빅" 데이터의 분석적 시각화 (20)

MFBLP Method Forecast for Regional Load Demand System
MFBLP Method Forecast for Regional Load Demand SystemMFBLP Method Forecast for Regional Load Demand System
MFBLP Method Forecast for Regional Load Demand System
 
Computational Dual-system Imaging Processing Methods for Lumbar Spine Specime...
Computational Dual-system Imaging Processing Methods for Lumbar Spine Specime...Computational Dual-system Imaging Processing Methods for Lumbar Spine Specime...
Computational Dual-system Imaging Processing Methods for Lumbar Spine Specime...
 
tadejko2007.pdf
tadejko2007.pdftadejko2007.pdf
tadejko2007.pdf
 
iPlan BOLD MRI Mapping Clinical White Paper
iPlan BOLD MRI Mapping Clinical White PaperiPlan BOLD MRI Mapping Clinical White Paper
iPlan BOLD MRI Mapping Clinical White Paper
 
Adaptive lifting based image compression scheme using interactive artificial ...
Adaptive lifting based image compression scheme using interactive artificial ...Adaptive lifting based image compression scheme using interactive artificial ...
Adaptive lifting based image compression scheme using interactive artificial ...
 
IRJET- Diabetic Haemorrhage Detection using DWT and Elliptical LBP
IRJET- Diabetic Haemorrhage Detection using DWT and Elliptical LBPIRJET- Diabetic Haemorrhage Detection using DWT and Elliptical LBP
IRJET- Diabetic Haemorrhage Detection using DWT and Elliptical LBP
 
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral Images
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral ImagesBand Clustering for the Lossless Compression of AVIRIS Hyperspectral Images
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral Images
 
L14.pdf
L14.pdfL14.pdf
L14.pdf
 
PCA and SVD in brief
PCA and SVD in briefPCA and SVD in brief
PCA and SVD in brief
 
Hybrid medical image compression method using quincunx wavelet and geometric ...
Hybrid medical image compression method using quincunx wavelet and geometric ...Hybrid medical image compression method using quincunx wavelet and geometric ...
Hybrid medical image compression method using quincunx wavelet and geometric ...
 
Lec-3 DIP.pptx
Lec-3 DIP.pptxLec-3 DIP.pptx
Lec-3 DIP.pptx
 
Mr image compression based on selection of mother wavelet and lifting based w...
Mr image compression based on selection of mother wavelet and lifting based w...Mr image compression based on selection of mother wavelet and lifting based w...
Mr image compression based on selection of mother wavelet and lifting based w...
 
A Joint QRS Detection and Data Compression Scheme for Wearable Sensors
A Joint QRS Detection and Data Compression Scheme for Wearable SensorsA Joint QRS Detection and Data Compression Scheme for Wearable Sensors
A Joint QRS Detection and Data Compression Scheme for Wearable Sensors
 
Application of Artificial Neural Network (Ann) In Operation of Reservoirs
Application of Artificial Neural Network (Ann) In Operation of ReservoirsApplication of Artificial Neural Network (Ann) In Operation of Reservoirs
Application of Artificial Neural Network (Ann) In Operation of Reservoirs
 
MR Image Compression Based on Selection of Mother Wavelet and Lifting Based W...
MR Image Compression Based on Selection of Mother Wavelet and Lifting Based W...MR Image Compression Based on Selection of Mother Wavelet and Lifting Based W...
MR Image Compression Based on Selection of Mother Wavelet and Lifting Based W...
 
Reduction of Active Power Loss byUsing Adaptive Cat Swarm Optimization
Reduction of Active Power Loss byUsing Adaptive Cat Swarm OptimizationReduction of Active Power Loss byUsing Adaptive Cat Swarm Optimization
Reduction of Active Power Loss byUsing Adaptive Cat Swarm Optimization
 
H235055
H235055H235055
H235055
 
Human Activity Recognition Using AccelerometerData
Human Activity Recognition Using AccelerometerDataHuman Activity Recognition Using AccelerometerData
Human Activity Recognition Using AccelerometerData
 
Time Series Analysis - Modeling and Forecasting
Time Series Analysis - Modeling and ForecastingTime Series Analysis - Modeling and Forecasting
Time Series Analysis - Modeling and Forecasting
 
Statistical Data Analysis on a Data Set (Diabetes 130-US hospitals for years ...
Statistical Data Analysis on a Data Set (Diabetes 130-US hospitals for years ...Statistical Data Analysis on a Data Set (Diabetes 130-US hospitals for years ...
Statistical Data Analysis on a Data Set (Diabetes 130-US hospitals for years ...
 

More from Myung-Hoe Huh

법에서의 통계학 대학원 바이오정보통계학과 워크숍 20150117
법에서의 통계학 대학원 바이오정보통계학과 워크숍 20150117법에서의 통계학 대학원 바이오정보통계학과 워크숍 20150117
법에서의 통계학 대학원 바이오정보통계학과 워크숍 20150117Myung-Hoe Huh
 
데이터 사이언티스트 키노트 Pt 20141008
데이터 사이언티스트 키노트 Pt 20141008데이터 사이언티스트 키노트 Pt 20141008
데이터 사이언티스트 키노트 Pt 20141008Myung-Hoe Huh
 
22 r data manipulation 2 pt 20140404
22 r data manipulation 2 pt 2014040422 r data manipulation 2 pt 20140404
22 r data manipulation 2 pt 20140404Myung-Hoe Huh
 
21 r data manipulation 1 pt 20140325
21 r data manipulation 1 pt 2014032521 r data manipulation 1 pt 20140325
21 r data manipulation 1 pt 20140325Myung-Hoe Huh
 
Data visualization using r pt 20140316
Data visualization using r pt 20140316Data visualization using r pt 20140316
Data visualization using r pt 20140316Myung-Hoe Huh
 
통계학의 유래와 전망 20130413
통계학의 유래와 전망 20130413통계학의 유래와 전망 20130413
통계학의 유래와 전망 20130413Myung-Hoe Huh
 
통계적 시각화 Pt 20130119 knou
통계적 시각화 Pt 20130119 knou통계적 시각화 Pt 20130119 knou
통계적 시각화 Pt 20130119 knouMyung-Hoe Huh
 

More from Myung-Hoe Huh (7)

법에서의 통계학 대학원 바이오정보통계학과 워크숍 20150117
법에서의 통계학 대학원 바이오정보통계학과 워크숍 20150117법에서의 통계학 대학원 바이오정보통계학과 워크숍 20150117
법에서의 통계학 대학원 바이오정보통계학과 워크숍 20150117
 
데이터 사이언티스트 키노트 Pt 20141008
데이터 사이언티스트 키노트 Pt 20141008데이터 사이언티스트 키노트 Pt 20141008
데이터 사이언티스트 키노트 Pt 20141008
 
22 r data manipulation 2 pt 20140404
22 r data manipulation 2 pt 2014040422 r data manipulation 2 pt 20140404
22 r data manipulation 2 pt 20140404
 
21 r data manipulation 1 pt 20140325
21 r data manipulation 1 pt 2014032521 r data manipulation 1 pt 20140325
21 r data manipulation 1 pt 20140325
 
Data visualization using r pt 20140316
Data visualization using r pt 20140316Data visualization using r pt 20140316
Data visualization using r pt 20140316
 
통계학의 유래와 전망 20130413
통계학의 유래와 전망 20130413통계학의 유래와 전망 20130413
통계학의 유래와 전망 20130413
 
통계적 시각화 Pt 20130119 knou
통계적 시각화 Pt 20130119 knou통계적 시각화 Pt 20130119 knou
통계적 시각화 Pt 20130119 knou
 

Recently uploaded

Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and ModificationsMJDuyan
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...pradhanghanshyam7136
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptxMaritesTamaniVerdade
 
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxSKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxAmanpreet Kaur
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jisc
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxVishalSingh1417
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxCeline George
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxJisc
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfPoh-Sun Goh
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...Poonam Aher Patil
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseAnaAcapella
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfagholdier
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxheathfieldcps1
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxDr. Sarita Anand
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxDr. Ravikiran H M Gowda
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Jisc
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024Elizabeth Walsh
 

Recently uploaded (20)

Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxSKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx

Analytic Visualization of “Big” Data

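  • 1. Korean Society of Health Informatics and Statistics, Fall Conference 2013. Analytic Visualization of “Big” Data (Analytic Data Visualization). Myung-Hoe Huh (許明會), Department of Statistics, Korea University, stat420@korea.ac.kr. 2013.11.29. Health Info & Stat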
  • 2. Data Visualization - Descriptive vs Analytic ... - Small vs Big ... (diagram labels: science, technology, art)
  • 3. Contents - Scatterplot - Biplot - Regression Biplot - Kernel PCA - SVM Biplot
  • 4. Scatterplot - The “Lego” of analytic data visualization - Reflecting the third variable. quakes: longitude (=x), latitude (=y), depth (=z)
  • 5. Scatterplot - For large $n$ (≧ …), over-plotting can seriously obscure the structure of the data. Skin Segmentation Data: red vs. green
  • 6. Scatterplot - For large $n$ (≧ …), an alpha channel (point transparency) can be used. Skin Segmentation Data: red vs. green
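To see why an alpha channel helps with over-plotting, it can help to work out how opacity builds up when semi-transparent points stack. The sketch below assumes standard "over" compositing of identical points on a blank background, so that k overlapping points with opacity alpha give a combined opacity of 1 - (1 - alpha)^k. The function names and the alpha values are illustrative and do not come from the slides.

```python
def stacked_opacity(alpha: float, k: int) -> float:
    """Combined opacity of k identical points drawn on top of each other.

    Assumes "over" compositing: each point lets a fraction (1 - alpha)
    of whatever lies beneath it show through.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if k < 0:
        raise ValueError("k must be non-negative")
    return 1.0 - (1.0 - alpha) ** k


def points_to_saturate(alpha: float, target: float = 0.95) -> int:
    """Smallest number of overlapping points whose combined opacity reaches target."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie in (0, 1)")
    k = 0
    while stacked_opacity(alpha, k) < target:
        k += 1
    return k


# Illustrative values only: a smaller alpha leaves more room to show density.
for alpha in (1.0, 0.1, 0.01):
    print(alpha, points_to_saturate(alpha))
```

With opaque points a single point already saturates a pixel, so dense and sparse regions look alike; with a small alpha, darkness keeps increasing with the local point count over a much wider range.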
  • 7. Scatterplot - lowess: a nonparametric regression smoother for bivariate data. cars data: distance vs. speed
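The idea behind a lowess-type smoother can be sketched as a local straight-line fit around each point, with tricube weights over the nearest fraction of the data. This is a simplified illustration only: it omits the robustness iterations of the full lowess procedure, so its output will generally differ from a statistical package's `lowess`. The function name `lowess_sketch` and the parameter `frac` are hypothetical.

```python
def lowess_sketch(x, y, frac=2.0 / 3.0):
    """Locally weighted linear fit at every x value (no robustness iterations)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length sequences with at least 3 points")
    k = max(3, min(n, round(frac * n)))      # number of points in each local window
    fitted = []
    for x0 in x:
        dist = [abs(xi - x0) for xi in x]
        h = sorted(dist)[k - 1]              # bandwidth: distance to the k-th nearest point
        if h > 0:
            # Tricube weights: 1 at x0, falling smoothly to 0 at distance h.
            w = [(1.0 - (d / h) ** 3) ** 3 if d < h else 0.0 for d in dist]
        else:
            w = [1.0 if d == 0 else 0.0 for d in dist]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))  # value of the local line at x0
    return fitted


# Synthetic illustration (not the cars data): a noiseless curve.
xs = [float(i) for i in range(12)]
ys = [0.5 * v * v for v in xs]
smooth = lowess_sketch(xs, ys, frac=0.5)
```

A smaller `frac` follows the data more closely; a larger one gives a smoother curve.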
  • 8. Scatterplot - 3D rotation for three variables. Skin Segmentation Data: red, green, blue. - ggobi: 3D rotation for four or more variables
  • 9. Biplot of Observations and Variables, Gabriel (1971) - The biplot is a graph that shows $n$ observations and $p$ variables. Protein data (rows: 25 nations, columns: 9 protein sources)
  • 10. Biplot of Observations and Variables, Gabriel (1971) - Idea: Linear projection. Protein data: variable cereal
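The linear-projection idea can be illustrated with a small principal-component biplot. The sketch below assumes one particular scaling (observations plotted at their principal-component scores, variables at their loadings), so that the inner product of an observation marker and a variable marker approximates the centered data value. The helper names are hypothetical, and the eigenvectors are found by plain power iteration to keep the example dependency-free.

```python
import math


def center_columns(X):
    """Subtract each column mean so that every variable has mean zero."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[x - m for x, m in zip(row, means)] for row in X]


def top_eigenvectors(S, k=2, iters=500):
    """Leading k eigenvectors of a symmetric PSD matrix (power iteration + deflation)."""
    p = len(S)
    A = [row[:] for row in S]
    vectors = []
    for idx in range(k):
        v = [1.0 / (i + 1 + idx) for i in range(p)]   # fixed start, for reproducibility
        for _ in range(iters):
            w = [sum(a * b for a, b in zip(row, v)) for row in A]
            norm = math.sqrt(sum(t * t for t in w))
            if norm == 0.0:
                break
            v = [t / norm for t in w]
        lam = sum(vi * sum(a * b for a, b in zip(row, v)) for vi, row in zip(v, A))
        vectors.append(v)
        for i in range(p):                            # deflate the component just found
            for j in range(p):
                A[i][j] -= lam * v[i] * v[j]
    return vectors


def biplot_coordinates(X):
    """Return centered data, observation markers (scores) and variable markers (loadings)."""
    Xc = center_columns(X)
    p = len(Xc[0])
    S = [[sum(r[i] * r[j] for r in Xc) for j in range(p)] for i in range(p)]
    v1, v2 = top_eigenvectors(S, k=2)
    observations = [(sum(a * b for a, b in zip(r, v1)),
                     sum(a * b for a, b in zip(r, v2))) for r in Xc]
    variables = [(v1[j], v2[j]) for j in range(p)]
    return Xc, observations, variables


# Made-up data whose third column is the sum of the first two (rank 2 after centering),
# so the two-dimensional biplot reproduces the centered values exactly.
X = [[1, 0, 1], [0, 1, 1], [2, 1, 3], [1, 3, 4], [4, 2, 6]]
Xc, obs, var = biplot_coordinates(X)
```

Projecting an observation marker onto a variable's arrow recovers that variable's centered value, exactly for rank-two data and approximately otherwise.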
  • 11. Regression Biplot, Huh and Lee (2013) - A regression biplot is a graph of $n$ observations on $x_1, \cdots, x_p$, arranged by predicted $y$. - Assume that the model fit is determined by a function of a linear combination of $x_1, \cdots, x_p$. For instance, $\hat{y} = b_0 + b_1 x_1 + \cdots + b_p x_p$, or $\log\{\hat{\pi} / (1 - \hat{\pi})\} = b_0 + b_1 x_1 + \cdots + b_p x_p$. - Set the vertical dimension by the direction of the regression coefficients $b = (b_1, \cdots, b_p)'$, or $u = b / \|b\|$. - Set the horizontal dimension by the direction of the principal axis of $\mathbf{x}_1^{\perp}, \cdots, \mathbf{x}_n^{\perp}$, where $\mathbf{x}_i^{\perp}$ denotes the orthogonal component generated from the projection of the $i$-th observation vector $\mathbf{x}_i$ on $b$.
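The two-step construction of the axes can be sketched as follows. This is a minimal illustration, assuming the predictors are column-centered and the slope coefficients `b` have already been estimated elsewhere; the data and coefficients below are made up, and the function name `regression_biplot` is hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def regression_biplot(X, b, iters=500):
    """Coordinates for observations and variables; b = slopes of an already-fitted model."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    Xc = [[x - m for x, m in zip(row, means)] for row in X]

    norm_b = math.sqrt(dot(b, b))
    if norm_b == 0.0:
        raise ValueError("coefficient vector must be non-zero")
    u = [t / norm_b for t in b]                       # vertical direction
    vertical = [dot(r, u) for r in Xc]

    # Orthogonal components: what is left of each observation after projecting on u.
    resid = [[x - s * uj for x, uj in zip(r, u)] for r, s in zip(Xc, vertical)]

    # Horizontal direction: first principal axis of the orthogonal components.
    p = len(b)
    S = [[sum(r[i] * r[j] for r in resid) for j in range(p)] for i in range(p)]
    v = [1.0 / (i + 1) for i in range(p)]
    for _ in range(iters):
        w = [dot(row, v) for row in S]
        nw = math.sqrt(dot(w, w))
        if nw == 0.0:
            break
        v = [t / nw for t in w]
    horizontal = [dot(r, v) for r in resid]

    observations = list(zip(horizontal, vertical))
    arrows = list(zip(v, u))                          # variable j is drawn as (v_j, u_j)
    return observations, arrows, u, v


# Made-up predictors and coefficients, for illustration only.
X = [[1, 0, 2], [0, 1, 1], [2, 1, 1], [1, 3, 4], [4, 2, 3]]
b = [2.0, -1.0, 0.5]
observations, arrows, u, v = regression_biplot(X, b)
```

Because the vertical coordinate is the linear predictor up to a shift and a positive scale factor, observations appear from bottom to top in the order of their predicted values.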
  • 12. Regression Biplot, Huh and Lee (2013) Example 1. Stack Loss Data (…; $y$ = loss of ammonia, …)
  • 13. Regression Biplot, Huh and Lee (2013) Example 2. Magazine Data (…; $y$ = Subscription (0,1), …)
  • 14. Kernel PCA, Schölkopf et al. (1998) - For $n$ observations $x_1, \cdots, x_n$ ($p \times 1$), consider the nonlinear mapping $\phi(x_1), \cdots, \phi(x_n)$ to a Hilbert space, in which $\langle \phi(x_i), \phi(x_{i'}) \rangle = k(x_i, x_{i'})$. - Denoting $K = \{k(x_i, x_{i'})\}$, kernel PCA is obtained from eigen-decomposing $\tilde{K} = (I - \tfrac{1}{n} 1 1') K (I - \tfrac{1}{n} 1 1')$. - Kernel PCA yields a plot of observations by projecting $\tilde{\phi}(x_1), \cdots, \tilde{\phi}(x_n)$ on $v = \sum_{i'} a_{i'} \tilde{\phi}(x_{i'})$, where $\tilde{\phi}(x_{i'}) = \phi(x_{i'}) - \tfrac{1}{n} \sum_{i''} \phi(x_{i''})$ and $a = (a_1, \cdots, a_n)'$ is an eigenvector of $\tilde{K}$.
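The link between an eigenvector of the centered kernel matrix and the plotted coordinates can be spelled out in a few lines. The derivation below is a sketch under stated assumptions: the eigenvector $a$ is scaled to unit length, its eigenvalue $\lambda$ is positive, and the axis $v$ is rescaled to unit length in the feature space. The symbols $H$, $a$ and $\lambda$ are notation chosen here for illustration.

```latex
\begin{aligned}
H &= I - \tfrac{1}{n}\mathbf{1}\mathbf{1}', \qquad
\tilde{K} = H K H, \qquad
\tilde{K}_{ii'} = \langle \tilde{\phi}(x_i), \tilde{\phi}(x_{i'}) \rangle, \\
\tilde{K} a &= \lambda a, \qquad a'a = 1, \qquad
v = \lambda^{-1/2} \sum_{i'=1}^{n} a_{i'} \tilde{\phi}(x_{i'}), \\
\|v\|^2 &= \lambda^{-1} a' \tilde{K} a = 1, \\
\langle \tilde{\phi}(x_i), v \rangle
 &= \lambda^{-1/2} \sum_{i'=1}^{n} a_{i'} \tilde{K}_{ii'}
  = \lambda^{-1/2} (\tilde{K} a)_i
  = \sqrt{\lambda}\, a_i .
\end{aligned}
```

So the coordinate of observation $i$ on an axis is obtained from the kernel matrix alone, without ever forming $\phi(x_i)$ explicitly.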
  • 15. Kernel PCA Diagram (or Kernel Biplot), Huh (2013) - Aim: Representation of $p$ variables in the kernel PC plot of observations. - Proposed Procedure: 1) For each $x_i$ ($i = 1, \cdots, n$), map $x_i + c\, e_j$ on the plane, $j = 1, \cdots, p$, where $c$ is a constant and $e_j = (0, \cdots, 1, \cdots, 0)'$ has its 1 in the $j$-th position. The projection of a point $x$ is given by $\sum_{i'} a_{i'} \{ k(x, x_{i'}) - \tfrac{1}{n} \sum_{i''} k(x, x_{i''}) - \tfrac{1}{n} \sum_{i''} k(x_{i''}, x_{i'}) + \tfrac{1}{n^2} \sum_{i''} \sum_{i'''} k(x_{i''}, x_{i'''}) \}$, where $a = (a_1, \cdots, a_n)'$ is the eigenvector of the centered kernel matrix that defines the axis. 2) For each $j$, link the projection points of $x_i$ and $x_i + c\, e_j$ by an arrow.
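The arrow construction can be sketched in code as a kernel PCA fit followed by the projection of shifted points. This is a minimal illustration under several assumptions: the RBF kernel is parameterised as exp(-sigma * ||x - z||^2), each axis is scaled to unit length in the feature space, and the eigenvectors are found by plain power iteration. The class and function names, the data and the parameter values are all hypothetical.

```python
import math


def rbf(x, z, sigma):
    """RBF kernel exp(-sigma * ||x - z||^2); this parameterisation is an assumption."""
    return math.exp(-sigma * sum((a - b) ** 2 for a, b in zip(x, z)))


def leading_axes(A, k=2, iters=2000):
    """Top-k (eigenvalue, unit eigenvector) pairs via power iteration with deflation."""
    n = len(A)
    A = [row[:] for row in A]
    pairs = []
    for idx in range(k):
        v = [1.0 / (i + 1 + idx) for i in range(n)]
        for _ in range(iters):
            w = [sum(a * b for a, b in zip(row, v)) for row in A]
            norm = math.sqrt(sum(t * t for t in w))
            if norm == 0.0:
                break
            v = [t / norm for t in w]
        lam = sum(vi * sum(a * b for a, b in zip(row, v)) for vi, row in zip(v, A))
        pairs.append((lam, v))
        for i in range(n):
            for j in range(n):
                A[i][j] -= lam * v[i] * v[j]
    return pairs


class KernelPCAPlane:
    """First two kernel principal axes, with projection of arbitrary points."""

    def __init__(self, X, sigma):
        self.X, self.sigma = X, sigma
        n = len(X)
        K = [[rbf(a, b, sigma) for b in X] for a in X]
        self.col_mean = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
        self.grand = sum(self.col_mean) / n
        Kc = [[K[i][j] - self.col_mean[i] - self.col_mean[j] + self.grand
               for j in range(n)] for i in range(n)]      # double-centered kernel matrix
        self.axes = leading_axes(Kc, k=2)

    def project(self, x):
        """Coordinates of a (possibly new) point x on the two axes."""
        n = len(self.X)
        kx = [rbf(x, xi, self.sigma) for xi in self.X]
        mean_kx = sum(kx) / n
        kc = [kx[i] - mean_kx - self.col_mean[i] + self.grand for i in range(n)]
        return tuple(sum(a * t for a, t in zip(vec, kc)) / math.sqrt(lam)
                     for lam, vec in self.axes)

    def arrow(self, i, j, c):
        """Start and end of the arrow for observation i when variable j is shifted by c."""
        end = list(self.X[i])
        end[j] += c
        return self.project(self.X[i]), self.project(end)


# Made-up two-variable data, for illustration only.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.5], [5.0, 5.0], [5.5, 6.5], [6.0, 5.0]]
model = KernelPCAPlane(X, sigma=0.1)
start, end = model.arrow(0, 1, 1.0)
```

Unlike in a linear biplot, the arrow for a given variable can change direction and length from one observation to the next, because the mapping is nonlinear.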
  • 16. Example 1. Arrow diagrams […] for kernel PCA of the iris data with RBF kernel, …
  • 17. Example 1. Arrow diagrams […] for kernel PCA of the iris data with RBF kernel, …
  • 18. Example 2. Arrow diagrams […] for kernel PCA of the spam data […]
  • 19. SVM-Guided Biplot as an extension of Regression Biplot - Idea: Combine the linear/logistic regression biplot and kernel PCA. - Classification/Regression Part: Observations are classified by the SVM classifier as $-1$ or $1$ for $i = 1, \cdots, n$, through $f(x) = \sum_{i=1}^{n} \alpha_i y_i k(x_i, x) + b = \langle w, \phi(x) \rangle + b$, where $w = \sum_{i=1}^{n} \alpha_i y_i \phi(x_i)$ and $\alpha_i \geqq 0$. The vertical dimension is set to $u = w / \|w\|$ ($\|w\|^2 = \sum_{i} \sum_{i'} \alpha_i \alpha_{i'} y_i y_{i'} k(x_i, x_{i'})$).
  • 20. SVM-Guided Biplot: Classification - Kernel PCA Part: Remove from each $\phi(x_i)$ its component along the vertical dimension $u = w / \|w\|$, where $w$ is the normal vector of the SVM classifier: $\phi^{\perp}(x_i) = \phi(x_i) - \langle \phi(x_i), u \rangle u$. $\therefore$ $k^{\perp}(x_i, x_{i'}) = \langle \phi^{\perp}(x_i), \phi^{\perp}(x_{i'}) \rangle = k(x_i, x_{i'}) - \langle \phi(x_i), u \rangle \langle \phi(x_{i'}), u \rangle$, $i, i' = 1, \cdots, n$. Hence $K \to K^{\perp} = \{k^{\perp}(x_i, x_{i'})\}$. The horizontal dimension is determined by eigen-decomposing $K^{\perp}$. - Perturbation Scheme for Arrow Diagrams: Define $x_i^{*} = x_i + c\, \delta$, $\delta$: $p \times 1$, where $\delta$ represents a perturbation whose magnitude is controlled by $c$. Then project $x_i^{*}$ on the first (vertical) and the second (horizontal) dimensions.
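One plausible reading of the two parts can be sketched numerically: the vertical coordinate of each observation is its feature-space projection on the SVM normal vector, and the horizontal dimension is built from the kernel of what remains after that projection is removed. This is an illustration under assumptions, not the published algorithm: the dual coefficients `alpha` below are made up rather than obtained from a fitted SVM, the RBF kernel is parameterised as exp(-sigma * ||x - z||^2), and the function name is hypothetical.

```python
import math


def rbf(x, z, sigma):
    """RBF kernel exp(-sigma * ||x - z||^2); this parameterisation is an assumption."""
    return math.exp(-sigma * sum((a - b) ** 2 for a, b in zip(x, z)))


def svm_guided_parts(X, y, alpha, sigma):
    """Vertical coordinates and the kernel matrix of the orthogonal components.

    alpha: dual coefficients, assumed to come from an already-fitted SVM.
    """
    n = len(X)
    K = [[rbf(a, b, sigma) for b in X] for a in X]
    c = [a * t for a, t in zip(alpha, y)]                 # w = sum_i c_i phi(x_i)
    Kc = [sum(K[i][j] * c[j] for j in range(n)) for i in range(n)]   # <phi(x_i), w>
    w_norm = math.sqrt(sum(ci * t for ci, t in zip(c, Kc)))          # ||w||
    if w_norm == 0.0:
        raise ValueError("all dual coefficients are zero")
    vertical = [t / w_norm for t in Kc]                   # <phi(x_i), w / ||w||>
    # Kernel of the components orthogonal to w.
    K_perp = [[K[i][j] - vertical[i] * vertical[j] for j in range(n)]
              for i in range(n)]
    return vertical, K_perp, c


# Made-up data, labels and dual coefficients, for illustration only.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.5], [5.0, 5.0], [5.5, 6.5], [6.0, 5.0]]
y = [1, 1, 1, -1, -1, -1]
alpha = [0.5, 0.0, 0.5, 0.7, 0.3, 0.0]
vertical, K_perp, c = svm_guided_parts(X, y, alpha, sigma=0.1)
```

The horizontal coordinates would then come from an eigen-decomposition of `K_perp`, in the same way that kernel PCA uses the ordinary kernel matrix.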
  • 21. Example 1. Iris Data: Versicolor vs. Virginica [sigma=0.1, C=1, …]
  • 22. Importance of Variables (in the case of large $p$) - It is necessary to select a small number of variables in determining the first and second dimensions. - Measures of importance (definition): length of arrows 1) in the vertical direction, 2) in the horizontal direction. - Plot arrow diagrams for important variables only.
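Ranking variables by arrow length can be sketched as below. The slides define importance as the length of arrows in each direction but the aggregation over observations is not recoverable here, so averaging absolute lengths is an assumption. The arrow displacements, variable names and function names are made up for illustration.

```python
def variable_importance(arrows):
    """Importance of each variable from its arrows.

    arrows: {variable name: [(dx, dy), ...]}, one displacement per observation.
    Returns {variable name: (vertical importance, horizontal importance)},
    taken here (as an assumption) to be mean absolute arrow lengths.
    """
    result = {}
    for name, moves in arrows.items():
        if not moves:
            raise ValueError(f"no arrows for variable {name!r}")
        n = len(moves)
        vertical = sum(abs(dy) for _, dy in moves) / n
        horizontal = sum(abs(dx) for dx, _ in moves) / n
        result[name] = (vertical, horizontal)
    return result


def top_variables(importance, k, direction=0):
    """Names of the k most important variables (direction 0 = vertical, 1 = horizontal)."""
    return sorted(importance, key=lambda name: importance[name][direction],
                  reverse=True)[:k]


# Hypothetical arrows for three variables and two observations.
arrows = {
    "x1": [(0.10, 0.50), (0.00, -0.70)],
    "x2": [(0.40, 0.10), (-0.60, 0.00)],
    "x3": [(0.05, 0.02), (0.01, -0.03)],
}
importance = variable_importance(arrows)
selected = set(top_variables(importance, 1, 0)) | set(top_variables(importance, 1, 1))
```

Only the variables in `selected` would then be drawn, which keeps the diagram readable when there are many variables.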
  • 23. Example 2. Spam Data [sigma=0.1, C=10, …], …
  • 24. SVM-Guided Biplot: Regression - The same method can be applied to SVM regression. - Example 3. Aerobic Fitness […] for oxygen uptake (= $y$) with RBF kernel (sigma=0.1, C=10, epsilon=0.1, …)
  • 25. Concluding Remarks - The biplot method can be extended to suit linear regression or classification (logistic regression). - The biplot method can be extended to allow nonlinear mapping of observations and variables by making full use of the kernel trick. http://blog.naver.com/huh4200 Goldfish Bowl (on the iPad)
  • 26. References
      Gabriel, K.R. (1971). “The biplot graphic display of matrices with application to principal component analysis”. Biometrika, 58, 453-467.
      Huh, M.H. (2013). “Arrow diagrams for kernel principal component analysis”. Communications for Statistical Applications and Methods, 20, 175-184.
      Huh, M.H. (2013). “SVM-guided biplot of observations and variables”. Communications for Statistical Applications and Methods. (to appear)
      Huh, M.H. and Lee, Y.G. (2013). “Biplots of multivariate data guided by linear and/or logistic regression”. Communications for Statistical Applications and Methods, 20, 129-136.
      Schölkopf, B., Smola, A. and Müller, K.R. (1998). “Nonlinear component analysis as a kernel eigenvalue problem”. Neural Computation, 10, 1299-1319.