SVD Filtered Temporal Usage
Pattern Analysis & Clustering
Liang Xie
SCSUG Educational Forum 2009
San Antonio, TX
Business Objective
• Provide a robust algorithm to cluster customers based on their temporal transactional data
• Issues:
  • Data
    • High dimensionality: 360 features, multi-million records
    • Capture amplitude at different resolutions
    • High volatility due to noise
    • Possible outliers
  • Algorithm
    • Robustness
    • Efficiency
    • Easy to implement in SAS!
• We choose an SVD-based algorithm
  • Successful application to gene-expression analysis by Alter et al. (PNAS, 2000)
SVD as a Filter
• SVD definition:
  • Singular Value Decomposition (SVD) is a mathematical tool for decomposing a rectangular matrix
  • The left Eigenvector matrix U can be regarded as an input rotation matrix, Sigma as the scaling matrix, and the right Eigenvector matrix V as the output rotation matrix
  • SVD is similar to Fourier analysis
• Filter:
  • Each row of X is a linear combination of right Eigenvectors
  • Each column of X is a linear combination of left Eigenvectors
$X = U \Sigma V'$
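The filter idea can be written out as a worked equation. This is a sketch in standard SVD notation; the symbols $\sigma_k$, $u_k$, $v_k$ (the k-th singular value and the k-th columns of U and V) and $r$ (the rank of X) are not on the slide and are introduced here only for illustration.

```latex
X = U \Sigma V' = \sum_{k=1}^{r} \sigma_k \, u_k v_k'
\qquad\Longrightarrow\qquad
X_{\mathrm{filtered}} = X - \sigma_1 u_1 v_1' = \sum_{k=2}^{r} \sigma_k \, u_k v_k'
```

Row i of X equals $\sum_k (\sigma_k u_{ik}) \, v_k'$, a linear combination of the right Eigenvectors, so dropping one term of the sum removes that pattern from every row at once.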
Relationship Between PCA and SVD
• SAS/STAT doesn’t explicitly support SVD
• We can tweak SAS/STAT to do SVD by linking one computation method of SVD to PCA
• SVD and PCA are essentially the same: SVD of the covariance matrix of the original data X is equivalent to PCA of X
• PCA on the non-centered covariance matrix of X is equivalent to SVD of X, with proper scaling
$\mathrm{SVD}(X'X) = V S V'$
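The link between the two decompositions follows in one line from the orthonormality of the left Eigenvectors. The derivation below is a sketch that assumes $X = U \Sigma V'$ with $U'U = I$, and reads the S on the slide as the diagonal matrix of squared singular values.

```latex
X'X = (U \Sigma V')'(U \Sigma V') = V \Sigma' U' U \Sigma V' = V (\Sigma' \Sigma) V' = V S V',
\qquad S = \Sigma' \Sigma = \operatorname{diag}(\sigma_1^2, \dots, \sigma_r^2)
```

So an eigen-decomposition of $X'X$ yields the right Eigenvectors V directly, the singular values are the square roots of its eigenvalues, and for nonzero singular values the left Eigenvectors can be recovered as $U = X V \Sigma^{-1}$.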
SVD in SAS/STAT
• We call PROC PRINCOMP to conduct SVD in SAS/STAT
• The uncorrected covariance matrix in PROC PRINCOMP is X’X/n, not X’X, therefore the singular value matrix should be scaled by $\sqrt{n}$
• PROC PRINCOMP NOINT COV SING=
  • ‘COV’ computes the principal components from the covariance matrix
  • ‘NOINT’ omits the intercept from the model
  • ‘SING=’ specifies the singularity criterion to ensure accuracy
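The scaling can be checked numerically on a small example. The sketch below uses only the Python standard library and is not the SAS implementation: it assumes a random two-column matrix, takes the largest eigenvalue of the uncorrected covariance matrix X'X/n in closed form, and compares its square root times the square root of n with the largest singular value of X obtained by power iteration. The function name `top_singular_value` is hypothetical.

```python
import math
import random

def top_singular_value(X, iters=500):
    """Largest singular value of X (a list of rows), by power iteration."""
    p = len(X[0])
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        u = [sum(a * b for a, b in zip(row, v)) for row in X]               # u = X v
        w = [sum(X[i][j] * u[i] for i in range(len(X))) for j in range(p)]  # w = X' u
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    xv = [sum(a * b for a, b in zip(row, v)) for row in X]
    return math.sqrt(sum(c * c for c in xv))                                # ||X v||

rng = random.Random(0)
n = 50
X = [[rng.random(), rng.random()] for _ in range(n)]

# Uncorrected (non-centered) covariance matrix C = X'X / n, here 2 x 2 and symmetric.
a = sum(r[0] * r[0] for r in X) / n
b = sum(r[0] * r[1] for r in X) / n
d = sum(r[1] * r[1] for r in X) / n

# Largest eigenvalue of [[a, b], [b, d]] in closed form.
lam1 = (a + d) / 2 + math.sqrt(((a - d) / 2) ** 2 + b * b)

sigma1 = top_singular_value(X)
rescaled = math.sqrt(n) * math.sqrt(lam1)   # eigenvalue of X'X/n rescaled by sqrt(n)
print(sigma1, rescaled)
```

The two printed numbers should agree to rounding error, which is the "proper scaling" between the PCA output and the singular values of X.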
Performance
• Accuracy
  • Test the code on a Hilbert matrix
  • With ‘SING=1e-16’ specified, our results are comparable to those obtained from R and MATLAB
• Efficiency
  • Test the code on an arbitrary rectangular matrix with 1.7 million rows and 400 columns
  • On a Core2Duo 1.86 GHz PC, it takes SAS 7 min 56 sec to finish all data processing and computations; user CPU time is 5 min 52 sec
  • Note that the 32-bit Windows version of R is not able to handle data this big:
> X<-matrix(runif(1.7E6*400), ncol=400)
Error in runif(1700000 * 400) :
cannot allocate vector of length 680000000
• A multi-threaded/parallel SVD algorithm from SAS is highly desired!
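A quick back-of-the-envelope calculation shows why the R allocation fails. The sketch below assumes 8 bytes per value (R stores numeric vectors as double-precision floats) and uses the 4 GiB address-space ceiling of any 32-bit process as the comparison point.

```python
rows, cols = 1_700_000, 400
bytes_per_double = 8                       # R numeric vectors are 8-byte doubles

n_values = rows * cols                     # same length as reported in the R error
total_bytes = n_values * bytes_per_double
address_space_32bit = 2 ** 32              # upper bound for a 32-bit process

print(f"{n_values:,} values, {total_bytes / 2**30:.2f} GiB needed")
print("fits in a 32-bit address space:", total_bytes <= address_space_32bit)
```

The matrix alone needs more than 5 GiB, so it cannot fit regardless of how much physical memory the machine has.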
Temporal Usage Pattern Analysis
• Time series usage data from customers for one year at 60-minute intervals
• Hourly usage data is normalized to:
  • Year total
  • Monthly total
• We want to identify segments with distinct usage patterns over one year, so that the marketing department can design customized messages for them
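The two normalizations can be sketched as follows. The slides do not spell out the formula, so this assumes that "normalized to" means dividing each value by the customer's total over the year, or by the total of the month the value belongs to. The function names and the toy numbers are hypothetical.

```python
def normalize_to_year_total(values):
    """Express each value as a share of the customer's total over the whole series."""
    total = sum(values)
    return [v / total if total else 0.0 for v in values]

def normalize_to_month_total(values, months):
    """Express each value as a share of the total of the month it belongs to."""
    totals = {}
    for v, m in zip(values, months):
        totals[m] = totals.get(m, 0.0) + v
    return [v / totals[m] if totals[m] else 0.0 for v, m in zip(values, months)]

# Toy series: four readings, two in month 1 and two in month 2.
usage = [2.0, 4.0, 6.0, 8.0]
months = [1, 1, 2, 2]

year_share = normalize_to_year_total(usage)
month_share = normalize_to_month_total(usage, months)
print(year_share)
print(month_share)
```

Either way, customers with very different volumes become comparable, so the clustering sees the shape of the usage profile and not its overall level.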
Traditional Approach
• Direct k-means clustering using PROC FASTCLUS on all features
• Problems:
  • Not robust: susceptible to outliers
  • Ambiguity in choosing the optimal number of clusters a priori
  • High dimensionality affects the distance measure between each pair of observations:
    • In high-dimensional spaces, distances between points become relatively uniform
  • Combining the lack of robustness with high dimensionality, we could get segments that are occupied by only a few observations, which is usually not desired
  • The k-means clustering algorithm doesn’t take the time series nature into consideration; all features are considered independent
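The point about distances becoming uniform can be illustrated with a small experiment. The sketch below is not taken from the slides: it assumes points drawn uniformly from the unit cube and measures the relative contrast (farthest minus nearest distance, divided by the nearest) from one query point, once in 2 dimensions and once in 360. The function name `relative_contrast` is hypothetical.

```python
import math
import random

def relative_contrast(dim, n_points=200, seed=0):
    """(max distance - min distance) / min distance from one point to all others."""
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    query = points[0]
    dists = [math.dist(query, p) for p in points[1:]]
    return (max(dists) - min(dists)) / min(dists)

contrast_low = relative_contrast(2)      # a handful of features
contrast_high = relative_contrast(360)   # as many features as the usage data
print(contrast_low, contrast_high)
```

In 360 dimensions the nearest and farthest neighbours are almost equally far away, so a distance-based method such as k-means has little to separate clusters with.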
Our Approach
• Apply SVD to the original data to obtain Eigenvectors and singular values
• Remove the components associated with the first singular value (Low Pass Filtering)
• Apply SVD again to the SVD-filtered matrix
• Calculate the Pearson correlation of each observation with the right Eigenvectors obtained in the previous step
• Apply the k-means clustering algorithm to this matrix of correlation elements
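The first four steps can be sketched in plain Python. This is an illustration under stated assumptions, not the SAS code behind the slides: power iteration with deflation stands in for a proper SVD routine, the correlations are computed on the rows of the filtered matrix (the slides do not say whether the original or the filtered rows are used), and all function names and the toy data are hypothetical. The final k-means step is left to a clustering routine such as PROC FASTCLUS.

```python
import math
import random

def matvec(X, v):
    """X v for X given as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in X]

def rmatvec(X, u):
    """X' u."""
    return [sum(X[i][j] * u[i] for i in range(len(X))) for j in range(len(X[0]))]

def normalize(v):
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def top_triplet(X, iters=300, seed=0):
    """Leading singular triplet (sigma, u, v) of X by power iteration."""
    rng = random.Random(seed)
    v = normalize([rng.gauss(0.0, 1.0) for _ in X[0]])
    for _ in range(iters):
        v = normalize(rmatvec(X, matvec(X, v)))
    xv = matvec(X, v)
    sigma = math.sqrt(sum(c * c for c in xv))
    return sigma, [c / sigma for c in xv], v

def remove_component(X, sigma, u, v):
    """X - sigma * u v': X with one SVD component filtered out."""
    return [[X[i][j] - sigma * u[i] * v[j] for j in range(len(v))]
            for i in range(len(u))]

def right_vectors(X, k):
    """First k right singular vectors of X, by repeated deflation."""
    vectors, residual = [], X
    for _ in range(k):
        sigma, u, v = top_triplet(residual)
        vectors.append(v)
        residual = remove_component(residual, sigma, u, v)
    return vectors

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

def correlation_features(X, k):
    filtered = remove_component(X, *top_triplet(X))     # steps 1-2: SVD, drop first component
    vectors = right_vectors(filtered, k)                # step 3: SVD of the filtered matrix
    return [[pearson(row, v) for v in vectors] for row in filtered]  # step 4

# Toy data: a shared baseline plus a pattern that is high in the first half of the
# year for three customers and high in the second half for the other three.
rng = random.Random(1)
pattern = [1.0] * 6 + [-1.0] * 6
amplitudes = [2.0, 2.5, 3.0, -2.0, -2.5, -3.0]
X = [[10.0 + a * p + rng.gauss(0.0, 0.05) for p in pattern] for a in amplitudes]

features = correlation_features(X, k=2)
for row in features:
    print([round(c, 3) for c in row])
```

On this toy data the first correlation element is close to +1 for one group and close to -1 for the other (which group gets which sign is arbitrary), so even a very simple clustering of the correlation matrix separates them. The shared baseline, which dominates the raw values, has been filtered out.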
Some Notes
• For a data matrix containing 360 days’ profile, we only need to use a few of the correlation elements. We keep correlation elements until 85% of the variation in the data is accounted for
• To determine the optimal number of clusters, we applied the Bayesian Information Criterion (BIC). This measurement is very robust and simple to calculate:
  • BIC = Distortion + (Num of Var) * log(Num of Obs) * K
  • Distortion = sum of total variance of each cluster = sum of Distance from PROC FASTCLUS output
• With hourly data, we separate the analysis into two steps:
  • Daily level
  • Hourly level for a ‘typical day’ in a month
• Apply the SVD Filtered Clustering algorithm in each step
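Both selection rules are simple enough to sketch in a few lines. The code below follows the BIC formula as written on the slide and makes two assumptions that the slide does not state: "variation accounted for" is measured by squared singular values, and "log" is the natural logarithm. The function names and the example numbers are hypothetical, not results from the presentation.

```python
import math

def components_for_variance(singular_values, threshold=0.85):
    """Smallest number of leading components whose squared singular values
    reach `threshold` of the total."""
    total = sum(s * s for s in singular_values)
    running = 0.0
    for count, s in enumerate(singular_values, start=1):
        running += s * s
        if running / total >= threshold:
            return count
    return len(singular_values)

def bic(distortion, num_var, num_obs, k):
    """BIC = Distortion + (Num of Var) * log(Num of Obs) * K, as on the slide."""
    return distortion + num_var * math.log(num_obs) * k

def best_k(distortions, num_var, num_obs):
    """Pick the K with the lowest BIC; `distortions` maps K to its distortion."""
    return min(distortions, key=lambda k: bic(distortions[k], num_var, num_obs, k))

# Made-up inputs for illustration only.
n_components = components_for_variance([3.0, 2.0, 1.0, 1.0])
chosen_k = best_k({2: 500.0, 3: 300.0, 4: 290.0}, num_var=3, num_obs=100)
print(n_components, chosen_k)
```

The penalty term grows linearly with K, so a further split is accepted only if it lowers the distortion by more than (Num of Var) * log(Num of Obs).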
Simulated Data
• We simulate data using the Heterogeneous Mixed Model of Verbeke
• High usage in Months B-D and Month H
• Some outliers were deliberately generated by adding abnormal ad hoc error terms
Clustering Result on Filtered Data
THANK YOU
• You can reach me at:
  • xie1978@yahoo.com
  • www.linkedin.com/liangxie
• My blog:
  • http://sas-programming.blogspot.com

 
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdfSample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 

SVD Filtered Temporal Usage Clustering

  • 1. SVD Filtered Temporal Usage Pattern Analysis & Clustering
       Liang Xie
       SCSUG Educational Forum 2009
       San Antonio, TX
  • 2. Business Objective
       • Provide a robust algorithm to cluster customers based on their temporal transactional data
       • Issues:
           • Data
               • High dimensionality: 360 features, multi-million records
               • Capture amplitude at different resolutions
               • High volatility due to noise
               • Possible outliers
           • Algorithm
               • Robustness
               • Efficiency
               • Easy to implement in SAS!
       • We chose an SVD-based algorithm
           • Successfully applied to gene-expression analysis by Alter et al. (PNAS, 2000)
  • 3. SVD as a Filter
       • SVD definition:
           • Singular Value Decomposition is a mathematical tool for decomposing a rectangular matrix
           • The left eigenvector (singular vector) matrix U can be regarded as an input rotation matrix; Sigma is the scaling matrix, and the right eigenvector (singular vector) matrix V is the output matrix
           • SVD is similar to Fourier analysis
       • Filter:
           • Each row of X is a linear combination of the right eigenvectors
           • Each column of X is a linear combination of the left eigenvectors
       $X = U \Sigma V'$
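To make the filter reading concrete, the decomposition can be expanded one component at a time. This is a standard identity written in the slide's notation; the rank r and the indices i, j, k are names introduced here for illustration only.

```latex
X = U \Sigma V' = \sum_{k=1}^{r} \sigma_k \, u_k v_k' ,
\qquad
x_{i\cdot} = \sum_{k=1}^{r} (\sigma_k u_{ik}) \, v_k' ,
\qquad
x_{\cdot j} = \sum_{k=1}^{r} (\sigma_k v_{jk}) \, u_k .
```

Here u_k and v_k are the k-th columns of U and V. Dropping one term k from the sum removes that component from every row and every column at once, which is the sense in which SVD can act as a filter.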
  • 4. Relationship Between PCA and SVD
       • SAS/STAT doesn't explicitly support SVD
       • We can get SAS/STAT to perform SVD by linking one computational method for SVD to PCA
       • SVD and PCA are essentially the same: SVD on the covariance matrix of the original data X is equivalent to PCA of X
       • PCA on the non-centered covariance matrix of X is equivalent to SVD of X, with proper scaling
       $\mathrm{SVD}(X'X) = V S V'$
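The link between the two decompositions follows in one line from the definition. The sketch below assumes the thin SVD, so that U'U = I and Sigma is a square diagonal matrix; that assumption is not stated on the slide.

```latex
X'X = (U \Sigma V')' (U \Sigma V')
    = V \Sigma \, U'U \, \Sigma V'
    = V \Sigma^{2} V'
\quad\Longrightarrow\quad
S = \Sigma^{2} .
```

Under that assumption, the eigenvectors of X'X are the right eigenvectors V of X, and the singular values of X are the square roots of the diagonal of S.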
  • 5. SVD in SAS/STAT
       • We call PROC PRINCOMP to conduct SVD in SAS/STAT
       • The uncorrected covariance matrix in PROC PRINCOMP is X'X/n, not X'X, so the reported eigenvalues are those of X'X divided by n; the singular value matrix must therefore be rescaled to account for n (each singular value of X equals $\sqrt{n \lambda}$ for a reported eigenvalue $\lambda$)
       • PROC PRINCOMP NOINT COV SING=
           • 'COV' computes the principal components from the covariance matrix
           • 'NOINT' omits the intercept from the model
           • 'SING=' specifies the singularity criterion to ensure accuracy
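The rescaling can be checked numerically outside SAS. The following is a minimal, dependency-free Python sketch, not the SAS code from the talk: it builds a small X from known factors (the matrices U, SIGMA and V are made up for illustration), forms the uncorrected covariance X'X/n, and recovers the singular values as the square root of n times each eigenvalue.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose a nested-list matrix."""
    return [list(col) for col in zip(*a)]


def sym2x2_eigenvalues(m):
    """Eigenvalues of a symmetric 2x2 matrix, largest first (closed form)."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    center = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return [center + radius, center - radius]


# Illustrative factors: U has orthonormal columns, V is a rotation,
# and the true singular values are 3 and 1.
U = [[0.5, 0.5], [0.5, -0.5], [0.5, 0.5], [0.5, -0.5]]
SIGMA = [[3.0, 0.0], [0.0, 1.0]]
V = [[0.6, -0.8], [0.8, 0.6]]

X = matmul(matmul(U, SIGMA), transpose(V))
n = len(X)

# Uncorrected covariance matrix X'X/n, as described on the slide.
xtx = matmul(transpose(X), X)
uncorrected_cov = [[value / n for value in row] for row in xtx]

# Eigenvalues of X'X/n are sigma^2 / n, so sigma = sqrt(n * eigenvalue).
eigenvalues = sym2x2_eigenvalues(uncorrected_cov)
singular_values = [math.sqrt(n * lam) for lam in eigenvalues]
print(singular_values)  # close to [3.0, 1.0], up to floating-point rounding
```

The closed-form 2x2 eigenvalue routine only keeps the sketch free of third-party libraries; any eigenvalue solver would serve the same purpose.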
  • 6. Performance
       • Accuracy
           • Test the code on a Hilbert matrix
           • With 'SING=1e-16' specified, our results are comparable to those obtained from R and MATLAB
       • Efficiency
           • Test the code on an arbitrary rectangular matrix with 1.7 million rows and 400 columns
           • On a Core 2 Duo 1.86 GHz PC, SAS takes 7 min 56 sec to finish all data processing and computations; user CPU time is 5 min 52 sec
           • Note that the 32-bit Windows version of R is not able to handle data this big:
               > X<-matrix(runif(1.7E6*400), ncol=400)
               Error in runif(1700000 * 400) : cannot allocate vector of length 680000000
       • A multi-threaded/parallel SVD algorithm from SAS is highly desirable!
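Two points on this slide can be illustrated with a short Python sketch (standard library only; the function names are introduced here and are not from the talk). The first part builds a Hilbert matrix exactly and computes its determinant, which shrinks very quickly with size; that near-singularity is why it is a common accuracy test. The second part is back-of-the-envelope arithmetic for the 1.7 million by 400 matrix, assuming 8-byte doubles.

```python
from fractions import Fraction


def hilbert(n):
    """n x n Hilbert matrix with exact entries H[i][j] = 1 / (i + j + 1)."""
    return [[Fraction(1, i + j + 1) for j in range(n)] for i in range(n)]


def determinant(matrix):
    """Exact determinant by Gaussian elimination over fractions."""
    m = [row[:] for row in matrix]
    size = len(m)
    det = Fraction(1)
    for col in range(size):
        pivot_row = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot_row is None:
            return Fraction(0)
        if pivot_row != col:
            m[col], m[pivot_row] = m[pivot_row], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return det


# The determinant collapses toward zero as the matrix grows.
for size in (2, 3, 4, 5):
    print(size, determinant(hilbert(size)))

# Memory needed for the efficiency test, assuming 8 bytes per double.
ROWS, COLS, BYTES_PER_DOUBLE = 1_700_000, 400, 8
matrix_bytes = ROWS * COLS * BYTES_PER_DOUBLE
ADDRESS_SPACE_32BIT = 2 ** 32
print(matrix_bytes, matrix_bytes > ADDRESS_SPACE_32BIT)
```

The matrix alone needs about 5.44 billion bytes, more than a 32-bit process can address, which is consistent with the allocation error shown on the slide.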
  • 7. Temporal Usage Pattern Analysis
       • Time series usage data from customers for one year at 60-minute intervals
       • Hourly usage data is normalized to:
           • Year total
           • Monthly total
       • We want to identify segments with distinct usage patterns over one year, so that the marketing department can design customized messages for them
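One plausible reading of the normalization step is that each usage value is expressed as a share of the customer's yearly or monthly total. The Python sketch below shows that reading; the function names and the toy data are hypothetical, and the talk does not specify the exact formula.

```python
def normalize_to_total(values):
    """Express each value as a share of the overall (e.g. yearly) total."""
    total = sum(values)
    if total == 0:
        return [0.0 for _ in values]
    return [value / total for value in values]


def normalize_by_month(values, months):
    """Express each value as a share of the total of its own month."""
    totals = {}
    for value, month in zip(values, months):
        totals[month] = totals.get(month, 0.0) + value
    return [value / totals[month] if totals[month] else 0.0
            for value, month in zip(values, months)]


# Toy example: four readings for one customer, two per month.
usage = [2.0, 6.0, 1.0, 1.0]
months = [1, 1, 2, 2]

yearly_share = normalize_to_total(usage)
monthly_share = normalize_by_month(usage, months)
print(yearly_share)
print(monthly_share)
```

Normalizing to the yearly total preserves differences in level between months, while normalizing to monthly totals keeps only the shape of usage within each month.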
  • 8. Traditional Approach
       • Direct k-means clustering using PROC FASTCLUS on all features
       • Problems:
           • Not robust: susceptible to outliers
           • Ambiguity in choosing the optimal number of clusters a priori
           • High dimensionality affects the distance measure between each pair of observations:
               • In high-dimensional spaces, distances between points become relatively uniform
           • Combining the lack of robustness with high dimensionality, we can get segments occupied by only a few observations, which is usually undesirable
           • The k-means clustering algorithm doesn't take the time series nature of the data into consideration; all features are treated as independent
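The claim that distances become relatively uniform in high dimensions can be seen in a small simulation. This Python sketch (standard library only) is an illustration, not part of the original analysis: it draws uniform random points, measures their Euclidean distance to one query point, and reports the relative contrast (max - min) / min. The sample size, seed and dimensions are arbitrary choices.

```python
import math
import random


def relative_contrast(dim, n_points=100, seed=0):
    """(max - min) / min of distances from one random query point."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        distances.append(math.dist(query, point))
    return (max(distances) - min(distances)) / min(distances)


# The contrast between nearest and farthest neighbor shrinks with dimension.
for dim in (2, 20, 360):
    print(dim, round(relative_contrast(dim), 3))
```

When the contrast is small, the nearest and farthest cluster centers look almost equally far away, which weakens any clustering that relies directly on raw distances across all 360 features.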
  • 9. Our Approach
       • Apply SVD to the original data to obtain the eigenvectors and singular values
       • Remove the components associated with the first singular value (Low Pass Filtering)
       • Apply SVD again to the SVD-filtered matrix
       • Calculate the Pearson correlation of each observation with the right eigenvectors obtained in the previous step
       • Apply the k-means clustering algorithm to this matrix of correlation elements
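The first four steps can be sketched in plain Python on a toy matrix. This is an illustration under stated assumptions, not the SAS implementation from the talk: power iteration stands in for a full SVD and extracts only the leading component, the helper names are hypothetical, and the final k-means step is omitted.

```python
import math


def first_singular_triplet(x, iterations=200):
    """Leading (sigma, u, v) of x via power iteration on x'x."""
    n_rows, n_cols = len(x), len(x[0])
    # Non-uniform start so it is unlikely to be orthogonal to the answer.
    v = [float(j + 1) for j in range(n_cols)]
    for _ in range(iterations):
        xv = [sum(x[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        w = [sum(x[i][j] * xv[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            raise ValueError("no component left to extract")
        v = [c / norm for c in w]
    xv = [sum(x[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
    sigma = math.sqrt(sum(c * c for c in xv))
    u = [c / sigma for c in xv]
    return sigma, u, v


def remove_first_component(x):
    """Subtract sigma_1 * u_1 * v_1' from x (the filtering step)."""
    sigma, u, v = first_singular_triplet(x)
    return [[x[i][j] - sigma * u[i] * v[j] for j in range(len(v))]
            for i in range(len(u))]


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


# Toy data: two customers with the same level but opposite patterns.
usage = [[4.0, 6.0, 4.0, 6.0],
         [6.0, 4.0, 6.0, 4.0]]

sigma1, _, _ = first_singular_triplet(usage)       # step 1: SVD of the data
filtered = remove_first_component(usage)           # step 2: drop component 1
_, _, pattern = first_singular_triplet(filtered)   # step 3: SVD again
correlations = [pearson(row, pattern) for row in filtered]  # step 4
print(correlations)
```

In this toy case the first component is the shared usage level; once it is removed, the two customers correlate with the remaining right eigenvector with opposite signs, so a clustering step on the correlations would separate them.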
  • 10. Some Notes
       • For a data matrix containing a 360-day profile, we only need a few of the correlation elements. We keep correlation elements until 85% of the variation in the data is accounted for
       • To determine the optimal number of clusters, we applied the Bayesian Information Criterion (BIC). This measure is very robust and simple to calculate:
           • BIC = Distortion + (Num of Var) * log(Num of Obs) * K
           • Distortion = sum of the total variance of each cluster = sum of Distance from the PROC FASTCLUS output
       • With hourly data, we separate the analysis into two steps:
           • Daily level
           • Hourly level for a 'typical day' in a month
       • Apply the SVD Filtered Clustering algorithm in each step
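The BIC rule on this slide is simple enough to evaluate by hand for each candidate K. The Python sketch below applies the formula as written, assuming the natural logarithm (the slide does not state the base); the distortion values are made up for illustration and do not come from PROC FASTCLUS.

```python
import math


def bic(distortion, n_vars, n_obs, k):
    """BIC = Distortion + (Num of Var) * log(Num of Obs) * K."""
    return distortion + n_vars * math.log(n_obs) * k


def best_k(distortion_by_k, n_vars, n_obs):
    """Return the K with the smallest BIC, plus all scores."""
    scores = {k: bic(d, n_vars, n_obs, k) for k, d in distortion_by_k.items()}
    return min(scores, key=scores.get), scores


# Hypothetical distortions for K = 2..5 clusters.
distortion_by_k = {2: 900.0, 3: 600.0, 4: 580.0, 5: 570.0}
chosen_k, scores = best_k(distortion_by_k, n_vars=5, n_obs=1000)
print(chosen_k)
for k in sorted(scores):
    print(k, round(scores[k], 2))
```

Distortion always falls as K grows, so the penalty term is what stops the criterion from choosing ever more clusters: here the small gains after K = 3 no longer outweigh the added penalty.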
  • 11. Simulated Data
       • We simulate data using the Heterogeneous Mixed Model of Verbeke
       • High usage in Months B-D and Month H
       • Some outliers were deliberately generated by adding abnormal ad hoc error terms
  • 12. Clustering Result on Filtered Data
  • 13. THANK YOU
       • You can reach me at:
           • xie1978@yahoo.com
           • www.linkedin.com/liangxie
       • My blog:
           • http://sas-programming.blogspot.com