Electricity consumption optimization using K-means
clustering algorithm
Hyun Wong Choi, Nawab Muhammad Faseeh Qureshi and
Dong Ryeol Shin
Sungkyunkwan University, South Korea.
Outline
• Introduction
• Related work
• Proposed Approach
• Evaluation
• Conclusion
• References
Introduction
• Electricity consumption
• Power grid
• In the power grid, we measure the consumption through sensors
• Industrial consumption
• Housing consumption
• Factory consumption
• Housing Consumption
• Front end (Consumer End)
• Back end (electrical company end)
Introduction (Cont.)
• Back end (Company End)
• Dataset for consumption: UC Irvine (UCI)
• Many techniques address the electricity optimization problem, but none of them focus on housing electricity optimization:
• Reducing the cost
• Factors of overcharge
• Prediction
are not available.
Introduction (Cont.)
• Solution
• K-means algorithm
• Why choose K-means clustering?
• To predict the answer from the dataset
• No answer is yet available in terms of K-means
• Why predict the answers?
• No clear result exists
• In this paper, home electricity usage is analyzed through the K-means clustering algorithm to obtain the optimal clustering of home usage.
• 3A analyzes the home electricity data points with K-means; the Calinski-Harabasz index, Davies-Bouldin index, and silhouette score determine the optimal number of clusters and illustrate an application scenario of the machine learning algorithm.
• 3B reduces the dataset to 1/8 and obtains the same result.
• The proposed approach delivers efficient and meaningful prediction results not obtained before.
Related work
Proposed Approach
Evaluation
• Declaration & Resources
• System: PC configuration and software
• 3A: 1st paper execution snapshots
• 3B: 2nd paper execution snapshots
Analysis
With K-means, choosing the proper number of clusters is very important. Estimating the silhouette score from the data, the result is that at K = 7 the silhouette score over the cluster centroids and data points is 0.799, the optimal value. Likewise, the Calinski-Harabasz index result of 560.3999 is optimal. The resulting K-means clustering is shown in Figure 11.
K = 7
Software
PC performance:
• Software: Anaconda3 + PyCharm3
• OS: Windows 10 Professional
• RAM: 16.0 GB
• Processor: i7-6600U CPU @ 2.60 GHz
• Hard disk: 420 GB SSD
System PC configuration Software
• Dataset: UC Irvine Machine Learning Repository
https://archive.ics.uci.edu/ml/index.php
• scikit-learn and Anaconda 3 are open source, so anyone can easily follow along, and their BSD licensing poses no difficulty for real-world use.
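As a sketch of how such a dataset can be prepared for clustering, the snippet below parses a couple of rows in the format of the UCI "Individual household electric power consumption" file (semicolon-separated, with '?' marking missing values) and keeps four numeric features. The column names follow the UCI description and are assumptions to check against the actual file.

```python
import io
import pandas as pd

# A few inline rows mimicking the UCI household power consumption format
# (semicolon-separated, '?' marks missing values). Column names are taken
# from the UCI dataset description and should be verified against the file.
sample = io.StringIO(
    "Date;Time;Global_active_power;Global_reactive_power;Voltage;"
    "Global_intensity;Sub_metering_1;Sub_metering_2;Sub_metering_3\n"
    "16/12/2006;17:24:00;4.216;0.418;234.840;18.400;0.000;1.000;17.000\n"
    "16/12/2006;17:25:00;?;?;?;?;?;?;?\n"
)

df = pd.read_csv(sample, sep=";", na_values="?")
df = df.dropna()  # drop rows with missing sensor readings
features = df[["Global_active_power", "Global_reactive_power",
               "Voltage", "Global_intensity"]].astype(float)
print(features.shape)  # (1, 4): one clean row, four features
```

The same two `read_csv` options (`sep=";"`, `na_values="?"`) apply when reading the full file from disk.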
3A execution snapshots
K = 2, K = 3, K = 4, K = 5, K = 6, K = 7, K = 8, K = 9, K = 10
3B second paper execution (1)
1/8 dataset: K = 1 through K = 11
Conclusion
• Technique
• How we solved the 1st point: using the K-means clustering algorithm
• Association with multiple answers
• Electricity
• Bright side of the conclusion
Published paper
1. Hyun Wong Choi, Nawab Muhammad Faseeh Qureshi and Dong Ryeol Shin, "Comparative Analysis of Electricity Consumption at Home through a Silhouette-score prospective", ICACT 2019, South Korea, 2019. Sungkyunkwan University, Korea.
2. Hyun Wong Choi, Nawab Muhammad Faseeh Qureshi and Dong Ryeol Shin, "Analysis of Electricity Consumption at Home Using K-means Clustering Algorithm", ICACT 2019, South Korea, 2019. Sungkyunkwan University, Korea.
Association
• Bigger dataset
• Reducing dataset
K-means clustering
• K-means is a partitioning clustering method: it divides the data among K partitions. Given n data objects, the algorithm partitions them into K (≤ n) groups, each forming a cluster. K-means chooses the clusters S = {S_1, ..., S_k} that minimize the following cost function [11]:

$$\operatorname*{arg\,min}_{S} \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2$$

where μ_i is the centroid (mean) of cluster S_i.
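A minimal sketch of this objective, assuming scikit-learn's `KMeans` on toy 2-D data rather than the power readings: the per-cluster sum of squared distances to the centroids is exactly what scikit-learn reports as `inertia_`.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two well-separated 2-D blobs as a toy stand-in for the power readings.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (50, 2)),
               rng.normal(5.0, 0.1, (50, 2))])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# The cost above: sum over clusters of squared distances to the centroid mu_i.
cost = sum(np.sum((X[km.labels_ == i] - mu) ** 2)
           for i, mu in enumerate(km.cluster_centers_))

assert np.isclose(cost, km.inertia_)  # scikit-learn calls this value inertia
```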
Calinski-Harabasz Index
• Internal measures of clustering quality include the Calinski-Harabasz index, Davies-Bouldin index, Dunn index, and silhouette score. This paper evaluates clusterings via the Calinski-Harabasz index and the silhouette score.
• For a clustering into k clusters of N points, the Calinski-Harabasz index s(k) is the ratio of the between-cluster dispersion to the within-cluster dispersion:

$$ s(k) = \frac{\mathrm{Tr}(B_k)}{\mathrm{Tr}(W_k)} \times \frac{N - k}{k - 1} $$

• Here W_k is the within-cluster dispersion matrix and B_k is the between-cluster dispersion matrix:

$$ W_k = \sum_{q=1}^{k} \sum_{x \in C_q} (x - c_q)(x - c_q)^T, \qquad B_k = \sum_{q} n_q (c_q - c)(c_q - c)^T $$

where C_q is the set of points in cluster q, c_q its centroid, n_q its size, and c the centroid of all the data.
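The index can be computed directly from these definitions and checked against scikit-learn's `calinski_harabasz_score`; the toy blobs below are illustrative, not the paper's data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import calinski_harabasz_score

# Three tight blobs so K-means recovers the obvious clusters.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.2, (40, 2)) for c in (0.0, 4.0, 8.0)])
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Tr(W_k): within-cluster scatter; Tr(B_k): between-cluster scatter.
c = X.mean(axis=0)
tr_W = sum(np.sum((X[labels == q] - X[labels == q].mean(axis=0)) ** 2)
           for q in range(3))
tr_B = sum(len(X[labels == q]) *
           np.sum((X[labels == q].mean(axis=0) - c) ** 2) for q in range(3))
N, k = len(X), 3
s_k = (tr_B / tr_W) * ((N - k) / (k - 1))

assert np.isclose(s_k, calinski_harabasz_score(X, labels))
```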
Silhouette score
• The silhouette score is a simple per-point measure: for each data point i, a(i) is the mean distance from i to the other points in its own cluster, and b(i) is the mean distance from i to the points of the nearest other cluster. The silhouette score s(i) is then

$$ s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}, \qquad -1 \le s(i) \le 1 $$

• A value of s(i) close to 1 means point i is assigned to the correct cluster; a value close to −1 means it is poorly clustered. This paper uses the machine learning library scikit-learn for the household power consumption clustering [7].
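As in the analysis in this deck, one can sweep K and pick the value with the highest mean silhouette score; a sketch on synthetic blobs (not the household data):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Three clear blobs: the silhouette score should peak at K = 3.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(c, 0.2, (40, 2)) for c in (0.0, 4.0, 8.0)])

scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)  # in [-1, 1], higher is better

best_k = max(scores, key=scores.get)
print(best_k)  # 3 for this toy data
```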
Davies-Bouldin index
• When ground-truth labels are not known, the Davies-Bouldin index can be used. For clusters i and j, with s_i the average distance of cluster i's points to its centroid and d_ij the distance between the two centroids, define

$$ R_{ij} = \frac{s_i + s_j}{d_{ij}} $$

• The Davies-Bouldin index is then defined as

$$ DB = \frac{1}{k} \sum_{i=1}^{k} \max_{i \ne j} R_{ij} $$

• Zero is the lowest possible score, and values closer to zero indicate a better partition. At the time of this work, however, the index was only described in the scikit-learn documentation and not implemented in the library, so it could not easily be used in experiments.
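In current scikit-learn releases the metric is available as `sklearn.metrics.davies_bouldin_score` (added in version 0.20), so the experiment is now straightforward; a sketch on toy data, where the natural three-cluster split scores lower (better) than a forced two-cluster split:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score

# Three tight, well-separated blobs.
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(c, 0.2, (40, 2)) for c in (0.0, 4.0, 8.0)])

# Lower is better; compact, well-separated clusters give values near zero.
labels3 = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
labels2 = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(davies_bouldin_score(X, labels3) < davies_bouldin_score(X, labels2))
```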
Conclusion
• This work clusters household power consumption via K-means. The libraries used, scikit-learn and Anaconda 3, are open source, so anyone can easily follow along, and their BSD licensing poses no difficulty for real-world use.
• Besides the K-means algorithm, other machine learning algorithms such as PCA and SVM could also be applied to this clustering task.
• The results support diverse analytics of real-life household power consumption: estimating maintenance periods for electricity transformers and transmission equipment, and per-household consumption analysis for progressive taxation, region-to-region demand forecasting, and maintenance of power plants and facilities.
• A gas company could likewise estimate gas consumption rates via K-means clustering and the same indices.
Q & A
Comparative Analysis of Electricity Consumption at
Home through a Silhouette-score prospective
Hyun Wong Choi, Nawab Muhammad Faseeh Qureshi and Dong Ryeol Shin
SungKyunKwan University, South Korea
Abstract
• Machine learning is a state-of-the-art subfield of artificial intelligence that has evolved to provide large-scale intelligent analytics in distributed computing environments. In this paper, we perform a comparative analysis of a home electricity-usage dataset using the K-means clustering algorithm, comparing the silhouette scores obtained on the full dataset and on a 1/8 subset. The performance evaluation shows that the silhouette scores remain similar even though the dataset is smaller than before.
K-means clustering
• K-means clustering
K-means is a partitioning clustering method: it divides the data among K partitions. Given n data objects, the algorithm partitions them into K (≤ n) groups, each forming a cluster. K-means chooses the clusters S = {S_1, ..., S_k} that minimize the following cost function [11]:

$$\operatorname*{arg\,min}_{S} \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2$$
Silhouette score
• The silhouette score is a simple per-point measure: for each data point i, a(i) is the mean distance from i to the other points in its own cluster, and b(i) is the mean distance from i to the points of the nearest other cluster. The silhouette score s(i) is then

$$ s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}, \qquad -1 \le s(i) \le 1 $$

• A value of s(i) close to 1 means point i is assigned to the correct cluster; a value close to −1 means it is poorly clustered. This paper uses the machine learning library scikit-learn for the household power consumption clustering.
Analysis
• With K-means, choosing the proper number of clusters is very important. Estimating the silhouette score from the data, the result on the full dataset is that K = 7 gives a silhouette score of 0.799, the optimal value. Even on the much smaller 1/8 dataset, K = 7 gives a silhouette score of 0.810, again the optimal value. In both cases (full dataset and 1/8 dataset) the seven-cluster solution, with its group centroids and centroid distances, is optimal. Thus, even though the dataset is reduced, the optimal cluster count in the K-means class vector space is the same as with the original household power consumption dataset.
Figure 12. Silhouette score according to the number of clusters. Figure 13. 1/8-dataset silhouette score according to the number of clusters.
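The comparison can be sketched on synthetic data: sub-sampling 1/8 of the points and re-running the silhouette sweep yields the same optimal K, mirroring the observation above (toy blobs, not the UCI data):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Full synthetic dataset and a random 1/8 subsample, as in the comparison.
rng = np.random.default_rng(4)
X = np.vstack([rng.normal(c, 0.3, (400, 2)) for c in (0.0, 5.0, 10.0)])
sub = X[rng.choice(len(X), size=len(X) // 8, replace=False)]

def best_k(data, k_range=range(2, 6)):
    """Return the K with the highest mean silhouette score."""
    scores = {k: silhouette_score(
        data, KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(data))
        for k in k_range}
    return max(scores, key=scores.get)

print(best_k(X), best_k(sub))  # the optimal K is the same on the subsample
```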
Conclusion
This work clusters household power consumption via K-means using scikit-learn and Anaconda 3, which are open source; anyone can easily follow along, and their BSD licensing poses no difficulty for real-world use. The results show that even after reducing the dataset to 1/8, the silhouette scores and the overall clustering results match those of the full dataset. As the data population grows, the classification and vector space should become even clearer. Going from a large dataset to a small one clearly preserves the silhouette-score result, but the opposite direction is not guaranteed, because the dataset is a four-dimensional vector space. In practice, this allows the analysis to reduce its estimated processing time when a huge dataset is received.
Q & A

More Related Content

What's hot

What's hot (19)

05 k-means clustering
05 k-means clustering05 k-means clustering
05 k-means clustering
 
Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)
 
K-MEDOIDS CLUSTERING USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R...
K-MEDOIDS CLUSTERING  USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R...K-MEDOIDS CLUSTERING  USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R...
K-MEDOIDS CLUSTERING USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R...
 
Cost Efficient PageRank Computation using GPU : NOTES
Cost Efficient PageRank Computation using GPU : NOTESCost Efficient PageRank Computation using GPU : NOTES
Cost Efficient PageRank Computation using GPU : NOTES
 
Unsupervised Learning
Unsupervised LearningUnsupervised Learning
Unsupervised Learning
 
Clustering on database systems rkm
Clustering on database systems rkmClustering on database systems rkm
Clustering on database systems rkm
 
K mean-clustering
K mean-clusteringK mean-clustering
K mean-clustering
 
Clustering: A Survey
Clustering: A SurveyClustering: A Survey
Clustering: A Survey
 
Compare "Urdhva Tiryakbhyam Multiplier" and "Hierarchical Array of Array Mul...
 Compare "Urdhva Tiryakbhyam Multiplier" and "Hierarchical Array of Array Mul... Compare "Urdhva Tiryakbhyam Multiplier" and "Hierarchical Array of Array Mul...
Compare "Urdhva Tiryakbhyam Multiplier" and "Hierarchical Array of Array Mul...
 
IRJET- Review of Existing Methods in K-Means Clustering Algorithm
IRJET- Review of Existing Methods in K-Means Clustering AlgorithmIRJET- Review of Existing Methods in K-Means Clustering Algorithm
IRJET- Review of Existing Methods in K-Means Clustering Algorithm
 
K-means clustering algorithm
K-means clustering algorithmK-means clustering algorithm
K-means clustering algorithm
 
Principal component analysis
Principal component analysisPrincipal component analysis
Principal component analysis
 
The improved k means with particle swarm optimization
The improved k means with particle swarm optimizationThe improved k means with particle swarm optimization
The improved k means with particle swarm optimization
 
Clustering: Large Databases in data mining
Clustering: Large Databases in data miningClustering: Large Databases in data mining
Clustering: Large Databases in data mining
 
Grid based method & model based clustering method
Grid based method & model based clustering methodGrid based method & model based clustering method
Grid based method & model based clustering method
 
Clustering, k-means clustering
Clustering, k-means clusteringClustering, k-means clustering
Clustering, k-means clustering
 
Optimising Data Using K-Means Clustering Algorithm
Optimising Data Using K-Means Clustering AlgorithmOptimising Data Using K-Means Clustering Algorithm
Optimising Data Using K-Means Clustering Algorithm
 
A Novel Optimization of Cloud Instances with Inventory Theory Applied on Real...
A Novel Optimization of Cloud Instances with Inventory Theory Applied on Real...A Novel Optimization of Cloud Instances with Inventory Theory Applied on Real...
A Novel Optimization of Cloud Instances with Inventory Theory Applied on Real...
 
JAVA BASED VISUALIZATION AND ANIMATION FOR TEACHING THE DIJKSTRA SHORTEST PAT...
JAVA BASED VISUALIZATION AND ANIMATION FOR TEACHING THE DIJKSTRA SHORTEST PAT...JAVA BASED VISUALIZATION AND ANIMATION FOR TEACHING THE DIJKSTRA SHORTEST PAT...
JAVA BASED VISUALIZATION AND ANIMATION FOR TEACHING THE DIJKSTRA SHORTEST PAT...
 

Similar to Master defense presentation 2019 04_18_rev2

Improving K-NN Internet Traffic Classification Using Clustering and Principle...
Improving K-NN Internet Traffic Classification Using Clustering and Principle...Improving K-NN Internet Traffic Classification Using Clustering and Principle...
Improving K-NN Internet Traffic Classification Using Clustering and Principle...
journalBEEI
 

Similar to Master defense presentation 2019 04_18_rev2 (20)

Final edited master defense-hyun_wong choi_2019_05_23_rev21
Final edited master defense-hyun_wong choi_2019_05_23_rev21Final edited master defense-hyun_wong choi_2019_05_23_rev21
Final edited master defense-hyun_wong choi_2019_05_23_rev21
 
master defense hyun-wong choi_2019_05_14_rev19
master defense hyun-wong choi_2019_05_14_rev19master defense hyun-wong choi_2019_05_14_rev19
master defense hyun-wong choi_2019_05_14_rev19
 
defense hyun-wong choi_2019_05_14_rev18
defense hyun-wong choi_2019_05_14_rev18defense hyun-wong choi_2019_05_14_rev18
defense hyun-wong choi_2019_05_14_rev18
 
master defense hyun-wong choi_2019_05_14_rev19
master defense hyun-wong choi_2019_05_14_rev19master defense hyun-wong choi_2019_05_14_rev19
master defense hyun-wong choi_2019_05_14_rev19
 
master defense hyun-wong choi_2019_05_14_rev19
master defense hyun-wong choi_2019_05_14_rev19master defense hyun-wong choi_2019_05_14_rev19
master defense hyun-wong choi_2019_05_14_rev19
 
MachineLearning.pptx
MachineLearning.pptxMachineLearning.pptx
MachineLearning.pptx
 
New Approach for K-mean and K-medoids Algorithm
New Approach for K-mean and K-medoids AlgorithmNew Approach for K-mean and K-medoids Algorithm
New Approach for K-mean and K-medoids Algorithm
 
Experimental study of Data clustering using k- Means and modified algorithms
Experimental study of Data clustering using k- Means and modified algorithmsExperimental study of Data clustering using k- Means and modified algorithms
Experimental study of Data clustering using k- Means and modified algorithms
 
Presentation Template__TY_AIML_IE2_Project (1).pptx
Presentation Template__TY_AIML_IE2_Project (1).pptxPresentation Template__TY_AIML_IE2_Project (1).pptx
Presentation Template__TY_AIML_IE2_Project (1).pptx
 
002
002002
002
 
FAST ALGORITHMS FOR UNSUPERVISED LEARNING IN LARGE DATA SETS
FAST ALGORITHMS FOR UNSUPERVISED LEARNING IN LARGE DATA SETSFAST ALGORITHMS FOR UNSUPERVISED LEARNING IN LARGE DATA SETS
FAST ALGORITHMS FOR UNSUPERVISED LEARNING IN LARGE DATA SETS
 
Improving K-NN Internet Traffic Classification Using Clustering and Principle...
Improving K-NN Internet Traffic Classification Using Clustering and Principle...Improving K-NN Internet Traffic Classification Using Clustering and Principle...
Improving K-NN Internet Traffic Classification Using Clustering and Principle...
 
Clustering
ClusteringClustering
Clustering
 
K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...
K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...
K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...
 
Unsupervised learning clustering
Unsupervised learning clusteringUnsupervised learning clustering
Unsupervised learning clustering
 
001
001001
001
 
Lecture 11 - KNN and Clustering, a lecture in subject module Statistical & Ma...
Lecture 11 - KNN and Clustering, a lecture in subject module Statistical & Ma...Lecture 11 - KNN and Clustering, a lecture in subject module Statistical & Ma...
Lecture 11 - KNN and Clustering, a lecture in subject module Statistical & Ma...
 
A Study of Efficiency Improvements Technique for K-Means Algorithm
A Study of Efficiency Improvements Technique for K-Means AlgorithmA Study of Efficiency Improvements Technique for K-Means Algorithm
A Study of Efficiency Improvements Technique for K-Means Algorithm
 
ML basic & clustering
ML basic & clusteringML basic & clustering
ML basic & clustering
 
Fa18_P2.pptx
Fa18_P2.pptxFa18_P2.pptx
Fa18_P2.pptx
 

More from Hyun Wong Choi

More from Hyun Wong Choi (20)

Airport security ver1
Airport security ver1Airport security ver1
Airport security ver1
 
Final
FinalFinal
Final
 
Chapter8 touch 6 10 group11
Chapter8 touch 6 10 group11Chapter8 touch 6 10 group11
Chapter8 touch 6 10 group11
 
Chapter6 power management ic group11
Chapter6 power management ic group11Chapter6 power management ic group11
Chapter6 power management ic group11
 
Chapter5 embedded storage
Chapter5 embedded storage Chapter5 embedded storage
Chapter5 embedded storage
 
Chapter5 embedded storage
Chapter5 embedded storage Chapter5 embedded storage
Chapter5 embedded storage
 
Chapter4 wireless connectivity group11
Chapter4 wireless connectivity group11Chapter4 wireless connectivity group11
Chapter4 wireless connectivity group11
 
Chapter2 ap group11
Chapter2 ap group11Chapter2 ap group11
Chapter2 ap group11
 
Chapter1
Chapter1Chapter1
Chapter1
 
003
003003
003
 
Hyun wong thesis 2019 06_22_rev40_final_grammerly
Hyun wong thesis 2019 06_22_rev40_final_grammerlyHyun wong thesis 2019 06_22_rev40_final_grammerly
Hyun wong thesis 2019 06_22_rev40_final_grammerly
 
Hyun wong thesis 2019 06_22_rev40_final_Submitted_online
Hyun wong thesis 2019 06_22_rev40_final_Submitted_onlineHyun wong thesis 2019 06_22_rev40_final_Submitted_online
Hyun wong thesis 2019 06_22_rev40_final_Submitted_online
 
Hyun wong thesis 2019 06_22_rev40_final_printed
Hyun wong thesis 2019 06_22_rev40_final_printedHyun wong thesis 2019 06_22_rev40_final_printed
Hyun wong thesis 2019 06_22_rev40_final_printed
 
Hyun wong thesis 2019 06_22_rev40_final
Hyun wong thesis 2019 06_22_rev40_finalHyun wong thesis 2019 06_22_rev40_final
Hyun wong thesis 2019 06_22_rev40_final
 
Hyun wong thesis 2019 06_22_rev39_final
Hyun wong thesis 2019 06_22_rev39_finalHyun wong thesis 2019 06_22_rev39_final
Hyun wong thesis 2019 06_22_rev39_final
 
Hyun wong thesis 2019 06_22_rev41_final
Hyun wong thesis 2019 06_22_rev41_finalHyun wong thesis 2019 06_22_rev41_final
Hyun wong thesis 2019 06_22_rev41_final
 
Hyun wong thesis 2019 06_22_rev40_final
Hyun wong thesis 2019 06_22_rev40_finalHyun wong thesis 2019 06_22_rev40_final
Hyun wong thesis 2019 06_22_rev40_final
 
Hyun wong thesis 2019 06_22_rev39_final
Hyun wong thesis 2019 06_22_rev39_finalHyun wong thesis 2019 06_22_rev39_final
Hyun wong thesis 2019 06_22_rev39_final
 
Hyun wong thesis 2019 06_19_rev38_final
Hyun wong thesis 2019 06_19_rev38_finalHyun wong thesis 2019 06_19_rev38_final
Hyun wong thesis 2019 06_19_rev38_final
 
Hyun wong thesis 2019 06_19_rev35_final
Hyun wong thesis 2019 06_19_rev35_finalHyun wong thesis 2019 06_19_rev35_final
Hyun wong thesis 2019 06_19_rev35_final
 

Recently uploaded

Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
KarakKing
 

Recently uploaded (20)

Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptx
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the Classroom
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - English
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptx
 

Master defense presentation 2019 04_18_rev2

  • 1. Electricity consumption optimization using K-means clustering algorithm Hyun Wong Choi, Nawab Muhammad Faseeh Qureshi and Dong Ryeol Shin Sungkyunkwan University, South Korea.
  • 2. Outline • Introduction • Related work • Proposed Approach • Evaluation • Conclusion • References
  • 3. Introduction • Electricity consumption • Power grid • In the power grid, we measure the consumption through sensors • Industrial consumption • Housing consumption • Factories consumption • Housing Consumption • Front end (Consumer End) • Back end (electrical company end)
  • 4. Introduction (Cont.) • Back end (Company End) • Dataset for consumption UCIRVINE • So many techniques that solves the optimization problem of electricity but, none of them focus on housing electricity optimization, • Reducing the cost • Factors of overcharge • Prediction Are not available
  • 5. Introduction (Cont.) • Solution • K-mean algorithm • Why chose k-mean cluster • Predict the answer from the dataset • No any answer is available in terms of k-mean • Why predicting the answers • No clear result • In this paper electricity usage of home is analyzed through k-means clustering algorithm for obtaining the optimal home usage electricity usage of home is • 3A is analyzed through K-means clustering algorithm for obtaining the optimal home usage electricity data points The Calinski-Harabasz Index, davis-boulden index and silhouette_score find detailed optimal number of clusters in the K-means algorithm and present the application scenario of the machine learning algorithm. • 3B is reducing the 1/8 dataset and result the same result • The proposed approach delivers us efficient and meaning prediction results never obtained before.
  • 8. Evaluation • Declaration & Resources System PC configuration Software 3A 1st paper execution snapshot 3B 2nd paper execution snapshot
  • 9. Analysis From K-means algorithms calculate proper cluster things is very important, from the data, estimate Silhouette_score, the result is K – 7 each cluster centroid and data prices silhouette score are 0.799 is the optimal score. From the formal Caliski- Harabasz Index results are 560.3999 is the optimal result. Using this k-means algorithm the fact is figure 11. K = 7
  • 10. Software PC Perfomance Software OS Software Ram Processor Harddisk Anadconda3 + Pycham3 Window 10 Professional 16.0GB i7-6600U CPU @2.60GHz 420GB SSD
  • 11. System PC configuration Software • Dataset UC Irvine Machine learning https://archive.ics.uci.edu/ml/index.php • sci-kit learn, Anaconda 3 open-source personally can easily follow it and because using BSD License to real works don’t have difficulties to that.
  • 12. 3A execution snapshot K = 2 K = 3 K = 4 K = 5 K = 6 K = 7 K = 8 K = 9 K = 10
  • 13. 3B Second paper execution(1) 1/8 dataset K =1 1/8 dataset K = 2 1/8 dataset K =3 1/8 dataset K =4 1/8 dataset K =5 1/8 dataset K =6 1/8 dataset K = 7 1/8 dataset K = 8 1/8 dataset K = 9 1/8 dataset K = 10 1/8 dataset K = 11
  • 14. Conclusion • Technique • How we solve it 1st point Using the k-means clustering algorithm and • Association with multiple answers • Electricity • Blight side of conclusion
  • 15. Published paper 1. Hyun Wong Choi, Nawab Muhammad Faseeh Qureshi and Dong Ryeol Shin “Comparative Analysis of Electricity Consumption at Home through a Silhouette-score prospective” , ICACT 2019 , South Korea , 2019 Sungkyunkwan University, Korea 2. Hyun Wong Choi, Nawab Muhammad Faseeh Qureshi and Dong Ryeol Shin “Analysis of Electricity Consumption at Home Using K-means Clustering Algorithm”, ICACT 2019 , South Korea , 2019 Sungkyunkwan University, Korea
  • 17. K-means clustering • K-means algorithm is one of the clustering methods for divided, divided is giving the data among the many partitions. For example, receive data object n, divided data is input data divided K (≤ n) data, each group consisting of cluster below equation is the at K-means algorithm when cluster consists of algorithms using cost function use it [11] argmin 𝑖 =1 𝑘 𝑥 ∈ 𝑆 𝑖 𝑥 − 𝜇𝑖 2
  • 18. Calinski-Harabasz Index • How to be well to be clustering inner way is Caliski-Harabasz Index, Davies-Bouldin index, Dunn index, Silhouette score. In this paper. Evaluate via Clainiski-Harabasz Index and silhouette score evaluate it. • From the Cluster Calinski-Harabasz Index s I the clusters distributed average and cluster distributed ratio will give it to you. 𝑠 𝑘 = 𝑇𝑟(𝐵 𝑘) 𝑇𝑟(𝑊𝑘) × 𝑁 − 𝑘 𝑘 − 1 • For this Bk is the distributed matrix from each group Wk is the cluster distributed defined. 𝑊𝑘 = 𝑞=1 𝑘 𝑥∈𝐶 𝑞 (𝑥 − 𝑐 𝑞)(𝑥 − 𝑐 𝑞) 𝑇 𝐵 𝑘 = 𝑞 𝑛 𝑞(𝐶 𝑞 − 𝑐)(𝐶 𝑞 − 𝑐) 𝑇
  • 19. Silhouette score • Silhouette score is the easy way to in data I each data cluster in data’s definition an (i) each data is not clustered inner and data’s definition b(i) silhouette score s(i) is equal to calculate that s i = 𝑏 𝑖 − 𝑎(𝑖) max { 𝑎 𝑖 , 𝑏 𝑖 } • From this calculate s(i) is equal to that function −1 ≤ s i ≤ 1 • S(i) is the close to 1 is the data I is the correct cluster to each thing, close to -1 cannot distribute cluster is distributed, from this paper machine Using the machine learning library scikit-learn in the house hold power consumption clustering [7].
  • 20. Davies-Bouldin index • If the ground truth labels are not known, the Davies-Bouldin index (sklearn. Metrixdavis Boulden) 𝑅𝑖𝑗 = 𝑠 𝑖+𝑠 𝑗 𝑑 𝑖𝑗 • Then the Davis-Bouldin Index is defined as DB = 1 𝑘 𝑖 = 1 𝑘 max 𝑖≠𝑗 𝑅𝑖𝑗 • The zero is the lowest score a possible. Score. Values closer to zero indicate a better partition. But the problem is this algorithm do not attach it in the Scikit-learn library and only explain it in the document page but cannot experiment easily.
  • 21. Conclusion • From the paper, Household power consumption via k-means clustering, Used library which is sci-kit learn, Anaconda 3 open-source personally can easily follow it and because using BSD License to real works don’t have difficulties to that. • Not only the K-means algorithm, PCA Algorithms, but also SVM algorithm etc other machine learning algorithms clustering can also do it. • From this result, in real life household power consumptions diverse analytics. And electricity transformer, Transmission power can management period can estimate it. And each data using electricity consumption. It can be used for progressive taxation, regional to regional demand forecasting, maintenance of power plants and facilities. Can do it. • In the Gas company can estimate via k-means algorithms and also can estimate about the gas consumption rate to via K-means clustering and index.
  • 22. Q & A
  • 23. Comparative Analysis of Electricity Consumption at Home through a Silhouette-score prospective Hyun Wong Choi, Nawab Muhammad Faseeh Qureshi and Dong Ryeol Shin SungKyunKwan University, South Korea
  • 24. Abstract • Machine learning is a state-of-the-art sub-project of artificial intelligence, that is been evolved for finding large-scale intelligent analytics in the distributed computing environment. In this paper, we perform comparative analytics onto dataset collected for the electricity usage of home based on the K-mean clustering algorithm using comparison to silhouette score with a ratio of 1/8 dataset. The performance evaluation shows that, the comparison index is similar in numbers of silhouette score even if dataset is smaller than before.
  • 25. K-means clustering • K-means is a partitioning clustering method: given n data objects, it divides them into K (≤ n) groups, each group forming a cluster Sᵢ. The algorithm minimizes the following cost function, the sum of squared distances from each point to its cluster centroid 𝜇ᵢ [11]: arg min Σᵢ₌₁ᵏ Σ_{𝑥 ∈ Sᵢ} ‖𝑥 − 𝜇ᵢ‖²
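The cost function above is exactly what scikit-learn exposes as `inertia_`. A minimal sketch, again on synthetic data rather than the paper's dataset, verifying that the fitted model's inertia matches the slide's formula computed by hand:

```python
# Sketch: K-means cost  sum_i sum_{x in S_i} ||x - mu_i||^2  via scikit-learn.
# Synthetic blobs stand in for the household power consumption data.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=500, centers=3, random_state=0)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# km.inertia_ is the minimized cost; recompute it from the formula:
manual = sum(np.sum((X[km.labels_ == i] - c) ** 2)
             for i, c in enumerate(km.cluster_centers_))
print(km.inertia_, manual)
```

The two printed values agree, which confirms that `inertia_` is the objective from the equation above.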
  • 26. Silhouette score • For each data point 𝑖, let a(i) be the mean distance from 𝑖 to the other points in its own cluster, and b(i) the mean distance from 𝑖 to the points of the nearest other cluster. The silhouette score is s(i) = (b(i) − a(i)) / max{a(i), b(i)} • From this definition, −1 ≤ s(i) ≤ 1 • A value of s(i) close to 1 means point 𝑖 is assigned to the correct cluster; a value close to −1 means it is poorly clustered. This paper uses the scikit-learn machine-learning library to cluster the household power consumption data.
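Both the per-point s(i) and its mean are available in scikit-learn as `silhouette_samples` and `silhouette_score`. A short sketch on synthetic data (not the paper's dataset):

```python
# Sketch: silhouette score s(i) = (b(i) - a(i)) / max(a(i), b(i))
# via sklearn.metrics; synthetic blobs stand in for the real data.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_samples, silhouette_score

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)

s_i = silhouette_samples(X, labels)  # per-point s(i), each in [-1, 1]
s = silhouette_score(X, labels)      # mean of s_i over all points
print(s)
```

`silhouette_score` is simply the mean of the per-point values, which is the single number the paper reports per choice of K.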
  • 27. Analysis • Choosing the proper number of clusters is very important in K-means. Estimating the silhouette score over the data gives K = 7 as the optimum: with the full dataset, the silhouette score at K = 7 is 0.799, and even with the much smaller 1/8 dataset, the score at K = 7 is 0.810. In both cases (full dataset and 1/8 dataset), the 7-cluster partition, with its centroids and inter-centroid distances, is the optimal value. Thus, although the dataset is reduced, the optimal cluster count in the K-means class vector space is the same as for the original household power consumption dataset. Figure 12. Silhouette score according to change of cluster number. Figure 13. 1/8 dataset silhouette score according to change of cluster number.
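The model-selection step behind Figures 12 and 13 can be sketched as a sweep over the cluster count, keeping the k with the highest silhouette score. Synthetic blobs stand in for the household power data, so the chosen k here is illustrative, not the paper's K = 7:

```python
# Sketch: sweep k and select the cluster count with the best silhouette
# score, as in Figures 12/13. Data is synthetic, not the paper's dataset.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=400, centers=5, random_state=0)

scores = {}
for k in range(2, 10):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print(best_k, scores[best_k])
```

Running the same sweep on the full dataset and on a 1/8 sample, as the paper does, would produce the two curves being compared.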
  • 28. Conclusion • The household power consumption data was clustered with K-means using scikit-learn on Anaconda 3; both are open source under the BSD license, so the work is easy to reproduce and reuse. Even after reducing the dataset to 1/8, the silhouette score and the overall clustering result are the same as before. As the population grows, the classification and vector space should become even clearer. Reducing a large dataset to a small one clearly preserves the silhouette result, but the reverse does not clearly hold, because the data forms a 4-dimensional vector dataset. The experiment therefore shows that the estimated analysis time can be reduced when a huge dataset is received.
  • 29. Q & A