This master's dissertation analyzes household electricity consumption with the K-means clustering algorithm, evaluated from a silhouette-score perspective. The dissertation contains two papers. Paper 1 applies K-means clustering to a full home electricity usage dataset to obtain optimal clusters of data points, evaluating candidate cluster counts using indices such as the Davies-Bouldin index and the silhouette score. Paper 2 reduces the dataset to 1/8 of its size and finds similar silhouette scores, suggesting the approach also works on smaller datasets. The dissertation applies machine learning clustering techniques to optimize home electricity usage and costs and to predict the factors driving overcharges.
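As a rough illustration of the silhouette score named in this summary, the following minimal Python sketch computes it from scratch for a labelled point set. The function name `silhouette_score`, the Euclidean metric, and the four toy points are assumptions made for illustration; the summary does not describe the dissertation's actual implementation.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient of a labelled point set (Euclidean distance)."""
    clusters = set(labels)
    if len(clusters) < 2:
        raise ValueError("the silhouette score needs at least two clusters")
    total = 0.0
    for i, p in enumerate(points):
        # Distances from p to every other point, grouped by cluster label.
        dists = {c: [] for c in clusters}
        for j, q in enumerate(points):
            if j != i:
                dists[labels[j]].append(math.dist(p, q))
        own = dists.pop(labels[i])
        if not own:
            continue  # singleton cluster: its coefficient is taken as 0
        a = sum(own) / len(own)                                 # cohesion
        b = min(sum(d) / len(d) for d in dists.values() if d)   # separation
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

# Toy data (hypothetical): two tight pairs of points, far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(points, [0, 0, 1, 1])  # labels follow the pairs
bad = silhouette_score(points, [0, 1, 0, 1])   # labels cut across the pairs
print(good, bad)
```

A score near 1 indicates compact, well-separated clusters, while a negative score indicates that points sit closer to another cluster than to their own.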
This paper analyzes electricity consumption in homes through k-means clustering. It introduces machine learning and clustering algorithms. The paper uses the Calinski-Harabasz index and the silhouette score to determine the optimal number of clusters in k-means for a household electricity consumption dataset from UC Irvine. Results from k-means clustering on the full dataset and on the dataset reduced to 1/8 of its size are presented through figures showing the clustering at different values of k. The indices are calculated to evaluate the clustering results and identify the best k for each dataset.
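The Calinski-Harabasz index mentioned here is the ratio of between-cluster to within-cluster dispersion, each scaled by its degrees of freedom. The sketch below is a plain-Python illustration under assumed names (`calinski_harabasz`, `centroid`, `sq_dist`) and toy data; it is not taken from the paper.

```python
def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def sq_dist(p, q):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def calinski_harabasz(points, labels):
    """(between / (k - 1)) / (within / (n - k)); higher is better."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    n, k = len(points), len(groups)
    if not 1 < k < n:
        raise ValueError("need more than one cluster and fewer clusters than points")
    overall = centroid(points)
    between = within = 0.0
    for members in groups.values():
        c = centroid(members)
        between += len(members) * sq_dist(c, overall)
        within += sum(sq_dist(p, c) for p in members)
    if within == 0:
        return float("inf")  # assumption: perfectly tight clusters score as infinite
    return (between / (k - 1)) / (within / (n - k))

# Toy data (hypothetical): two tight pairs of points, far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
ch = calinski_harabasz(points, [0, 0, 1, 1])
print(ch)
```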
This paper analyzes electricity consumption in homes through k-means clustering. It introduces machine learning and clustering algorithms. The paper uses the Calinski-Harabasz index and the silhouette score to determine the optimal number of clusters in k-means for a household electricity consumption dataset from UC Irvine. It performs k-means clustering with varying values of k and calculates both indices to evaluate the results. The key finding is that the indices show similar performance even when the dataset is reduced to 1/8 of its original size. This suggests that machine learning algorithms can analyze the dataset efficiently at different scales.
This paper analyzes electricity consumption in homes through k-means clustering. It introduces machine learning and clustering techniques. The paper uses the Calinski-Harabasz index and the silhouette score to determine the optimal number of clusters in k-means for a household electricity consumption dataset from UC Irvine. It performs k-means clustering with varying values of k and calculates both indices to evaluate the results. The key finding is that the indices show similar performance even when the dataset is reduced to 1/8 of its original size. This suggests that machine learning algorithms can analyze the dataset efficiently at different scales.
This document appears to be a master's dissertation that analyzes electricity consumption in homes using machine learning clustering algorithms. It contains two papers. Paper 1 uses K-means clustering on a home electricity usage dataset to obtain optimal clusters of usage data points. It evaluates the optimal number of clusters using silhouette scores, the Calinski-Harabasz index, and the Davies-Bouldin index. Paper 2 reduces the dataset to 1/8 of its size and finds that the comparison indices remain similar, showing the approach is effective even on smaller datasets. The dissertation applies machine learning to optimize home electricity usage, reducing costs and predicting factors that influence overcharging.
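The Davies-Bouldin index listed above averages, over all clusters, the worst-case ratio of combined within-cluster scatter to centroid separation, so lower values indicate better clustering. The following is a minimal sketch with an assumed function name and toy data, not the dissertation's own code.

```python
import math

def davies_bouldin(points, labels):
    """Davies-Bouldin index (Euclidean distance); lower is better.

    Assumes no two clusters share the same centroid.
    """
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    if len(groups) < 2:
        raise ValueError("the index needs at least two clusters")
    centers, scatter = [], []
    for members in groups.values():
        c = tuple(sum(x) / len(members) for x in zip(*members))
        centers.append(c)
        # Scatter: mean distance from the members to their centroid.
        scatter.append(sum(math.dist(p, c) for p in members) / len(members))
    k = len(centers)
    worst = []
    for i in range(k):
        # Worst (largest) similarity ratio between cluster i and any other cluster.
        worst.append(max((scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
                         for j in range(k) if j != i))
    return sum(worst) / k

# Toy data (hypothetical): two tight pairs of points, far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
db = davies_bouldin(points, [0, 0, 1, 1])
print(db)
```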
This master's dissertation analyzes electricity consumption in homes through k-means clustering and silhouette scoring. Chapter 3 discusses paper 1, which introduces machine learning and k-means clustering. It applies k-means to household electricity consumption data to obtain optimal data points. The Calinski-Harabasz index and silhouette score determine the optimal number of clusters. Chapter 4 discusses paper 2, which also uses k-means clustering on a reduced 1/8 dataset and finds similar silhouette scores. The conclusion is that machine learning can efficiently predict electricity consumption through clustering algorithms even with smaller datasets.
This paper analyzes electricity consumption in homes through k-means clustering. It introduces machine learning and clustering algorithms. The paper uses the Calinski-Harabasz index and the silhouette score to determine the optimal number of clusters in k-means for a household electricity consumption dataset from UC Irvine. It performs k-means clustering with varying values of k and calculates both indices to evaluate the results. The key finding is that the indices remain similar even when the dataset is reduced to 1/8 of its original size, showing the approach is effective with smaller datasets.
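The full-versus-1/8 comparison described here can be sketched on synthetic data. Everything in the block below is an assumption for illustration: the three Gaussian blobs stand in for the household dataset, the farthest-point initialisation replaces whatever initialisation the paper used, and the resulting numbers say nothing about the paper's own results.

```python
import math
import random

def kmeans(points, k, iters=50):
    """Lloyd's algorithm with a deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    for _ in range(iters):
        # Assignment step: nearest centre for every point.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: move each centre to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            new_centers.append(tuple(sum(x) / len(members) for x in zip(*members))
                               if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient (Euclidean distance)."""
    total = 0.0
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(math.dist(p, q))
        own = by_cluster.pop(labels[i], [])
        if own and by_cluster:
            a = sum(own) / len(own)
            b = min(sum(d) / len(d) for d in by_cluster.values())
            if max(a, b) > 0:
                total += (b - a) / max(a, b)
    return total / len(points)

# Synthetic stand-in data: three well-separated 2-D blobs, 80 points each.
rng = random.Random(0)
blob_centers = [(0.0, 0.0), (6.0, 0.0), (3.0, 6.0)]
full = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
        for cx, cy in blob_centers for _ in range(80)]
reduced = rng.sample(full, len(full) // 8)  # random 1/8 subsample

full_score = silhouette(full, kmeans(full, 3))
reduced_score = silhouette(reduced, kmeans(reduced, 3))
print(full_score, reduced_score)
```

Random subsampling is only one way to reduce a dataset; the summary does not say how the 1/8 subset was drawn.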
This master's dissertation analyzes electricity consumption in homes through k-means clustering and silhouette scoring. Chapter 3 discusses paper 1, which introduces machine learning and k-means clustering. It proposes applying k-means to household electricity usage data to obtain optimal usage data points. The Calinski-Harabasz index and silhouette score are used to determine the optimal number of clusters. Experimental results on 1/8 of the dataset show k-means clustering at different values of k. Paper 2, discussed in Chapter 4, introduces related work and the methodology for a second analysis using k-means clustering and evaluation metrics. The conclusion discusses obtaining efficient and novel prediction results for household electricity optimization.
Hyun wong sample thesis 2019 06_19_rev22_final (Hyun Wong Choi)
This master's dissertation analyzes electricity consumption in homes through k-means clustering and silhouette scoring. Chapter 3 discusses paper 1, which introduces machine learning and k-means clustering. It applies k-means to household electricity consumption data to obtain optimal data points. The Calinski-Harabasz index and silhouette score determine the optimal number of clusters. Chapter 4 discusses paper 2, which also uses k-means clustering on a reduced 1/8 dataset and finds similar silhouette scores. The conclusion is that machine learning can efficiently predict electricity consumption through clustering algorithms like k-means.
This master's dissertation analyzes electricity consumption in homes through k-means clustering and silhouette scoring. Paper 1 introduces machine learning and k-means clustering. It applies k-means to electricity usage data from the UC Irvine repository to obtain optimal cluster groupings. The Calinski-Harabasz index and silhouette score are used to determine the optimal number of clusters. Paper 1 results show k-means clustering of the electricity data from 1 to 10 clusters. Paper 2 further analyzes a reduced 1/8th dataset and finds similar silhouette scoring, indicating the analysis is consistent even with less data. The dissertation applies machine learning techniques to provide new insights into home electricity optimization and usage patterns.
Hyun wong sample thesis 2019 06_19_rev21_final (Hyun Wong Choi)
This master's dissertation analyzes electricity consumption in homes through k-means clustering and silhouette scoring. Chapter 3 discusses paper 1, which introduces machine learning and k-means clustering. It applies k-means to household electricity consumption data to obtain optimal data points. The Calinski-Harabasz index and silhouette score determine the optimal number of clusters. Paper 2, discussed in Chapter 4, also uses k-means clustering on a reduced 1/8 dataset and finds similar silhouette scores, showing the approach is effective even with smaller data. The dissertation evaluates different clustering techniques and indices to analyze household electricity usage through unsupervised machine learning.
This master's dissertation analyzes electricity consumption in homes through k-means clustering and silhouette scoring. Chapter 3 discusses paper 1, which introduces machine learning and k-means clustering. It applies k-means to household electricity consumption data to obtain optimal data points. The Calinski-Harabasz index and silhouette score determine the optimal number of clusters. Paper 2, discussed in Chapter 4, also uses k-means clustering on a reduced 1/8 dataset and finds similar silhouette scores, showing the approach is effective even with smaller data. The dissertation evaluates different clustering validity indices and applies machine learning techniques to optimize home electricity usage.
This document surveys common MPPT (maximum power point tracking) methods used in photovoltaic systems, including conventional and advanced algorithms. It analyzes the Perturbation and Observation (P&O), Incremental Conductance (IncCond), and fuzzy logic-based MPPT controllers through MATLAB/Simulink simulations. The simulations show the fuzzy logic controller has better static and dynamic performance than the conventional P&O and IncCond techniques under varying weather conditions such as changes in solar radiation and temperature.
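The Perturbation and Observation idea can be sketched in a few lines: nudge the operating voltage, check whether power rose, and reverse direction when it fell. The PV curve below (`pv_power` and its parameters) is a made-up toy model, and the step size and starting voltage are assumptions; the surveyed document used MATLAB/Simulink models that are not reproduced here.

```python
import math

def pv_power(v, i_sc=5.0, v_oc=40.0, a=3.0):
    """Toy PV power curve (hypothetical parameters, single-peak shape)."""
    current = i_sc * (1.0 - math.exp((v - v_oc) / a))
    return v * max(current, 0.0)

def perturb_and_observe(power_fn, v0=20.0, step=0.5, iterations=100):
    """Basic P&O hill climbing on the operating voltage."""
    v, p_prev, direction = v0, power_fn(v0), 1.0
    for _ in range(iterations):
        v += direction * step          # perturb
        p = power_fn(v)                # observe
        if p < p_prev:
            direction = -direction     # power dropped: reverse the perturbation
        p_prev = p
    return v

v_final = perturb_and_observe(pv_power)
# Reference maximum power point found by a fine grid search, for comparison only.
v_mpp = max((i * 0.01 for i in range(4001)), key=pv_power)
print(v_final, v_mpp)
```

With a fixed step the operating point keeps oscillating around the maximum power point, which is a commonly cited drawback of the basic method.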
This document summarizes a master's dissertation that analyzes electricity consumption at home through K-means clustering. The dissertation contains two papers:
1. The first paper analyzes household electricity usage data through K-means clustering to obtain optimal data points. It uses the Calinski-Harabasz index and the silhouette score to determine the optimal number of clusters.
2. The second paper performs a comparative analysis on a dataset that is 1/8 the size of the original. It finds that the silhouette score on the smaller dataset is half that of the original dataset.
The dissertation applies unsupervised machine learning clustering techniques to analyze household electricity consumption data, in order to optimize costs and identify factors
Hyun wong sample thesis 2019 06_19_rev20_final (Hyun Wong Choi)
This master's dissertation analyzes electricity consumption in homes through K-means clustering and silhouette scoring of a dataset. The document contains two papers. Paper 1 introduces machine learning and K-means clustering. It applies K-means to a household electricity consumption dataset from UC Irvine, testing clusters from 1 to 10. The optimal number of clusters is identified as 7 based on maximizing the Calinski-Harabasz index and silhouette score. Paper 2 applies the same methodology to a reduced 1/8 size version of the dataset, finding similar silhouette scores indicating the clustering remains effective with less data.
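Choosing the number of clusters by maximizing an index amounts to a simple scan over k. The sketch below runs that scan with the Calinski-Harabasz index on synthetic data; the data, the helper names, and the farthest-point initialisation are assumptions, and the scan starts at k = 2 because the index is undefined for a single cluster. The best k printed here reflects the toy data only, not the value of 7 reported in the summary.

```python
import math
import random

def kmeans(points, k, iters=50):
    """Lloyd's algorithm with a deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            new_centers.append(tuple(sum(x) / len(members) for x in zip(*members))
                               if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def calinski_harabasz(points, labels):
    """Between-cluster over within-cluster dispersion, scaled by degrees of freedom."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    n, k = len(points), len(groups)
    overall = tuple(sum(x) / n for x in zip(*points))
    between = within = 0.0
    for members in groups.values():
        c = tuple(sum(x) / len(members) for x in zip(*members))
        between += len(members) * math.dist(c, overall) ** 2
        within += sum(math.dist(p, c) ** 2 for p in members)
    return (between / (k - 1)) / (within / (n - k))

# Synthetic stand-in data: three well-separated 2-D blobs, 40 points each.
rng = random.Random(1)
blob_centers = [(0.0, 0.0), (6.0, 0.0), (3.0, 6.0)]
data = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
        for cx, cy in blob_centers for _ in range(40)]

# Score every candidate k and keep the one with the highest index.
scores = {k: calinski_harabasz(data, kmeans(data, k)) for k in range(2, 7)}
best_k = max(scores, key=scores.get)
print(best_k, scores)
```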
This master's dissertation analyzes electricity consumption at home through a K-means clustering algorithm and silhouette score. The document contains two papers that analyze a household electricity consumption dataset from the University of California, Irvine using K-means clustering. Paper 1 uses the Calinski-Harabasz index, the Davies-Bouldin index, and the silhouette score to determine the optimal number of clusters. Paper 2 performs a comparative analysis using a 1/8 subset of the full dataset and finds that the silhouette scores are similar even when using a smaller dataset. The dissertation aims to optimize household electricity usage and costs through machine learning clustering techniques.
The document analyzes electricity consumption at home through K-means clustering and evaluates different cluster validity indices, including the Silhouette score, to determine the optimal number of clusters in the dataset. It performs K-means clustering on a household electricity consumption dataset and compares the results of the Silhouette score and other indices at different values of K to identify the best number of clusters. The analysis aims to help optimize home electricity usage through machine learning clustering techniques.
Hyun wong thesis 2019 06_22_rev40_final_grammerly (Hyun Wong Choi)
The document analyzes electricity consumption at home through K-means clustering and evaluates different cluster validity indices, including the Silhouette score, to determine the optimal number of clusters in the dataset. It performs K-means clustering on a household electricity consumption dataset and compares the results of the Silhouette score and other indices at different values of K to identify the best clustering. The analysis aims to optimize home electricity usage through unsupervised machine learning clustering techniques.
Hyun wong thesis 2019 06_22_rev40_final_printed (Hyun Wong Choi)
This document summarizes a master's dissertation that analyzes electricity consumption at home through k-means clustering. The dissertation contains two papers:
1. The first paper analyzes electricity usage data from homes using k-means clustering to identify optimal clusters of usage patterns. It evaluates different metrics like silhouette score and clustering indices to determine the optimal number of clusters in the data.
2. The second paper performs a comparative analysis using a reduced 1/8th dataset to validate that the silhouette score and the optimal number of clusters are similar even with less data.
The dissertation applies machine learning clustering techniques to analyze electricity consumption data from homes with the goal of optimizing costs and identifying factors for overcharging.
The document analyzes electricity consumption data from homes using K-means clustering to determine optimal clusters in the data. It evaluates cluster validity indices such as the Calinski-Harabasz index, the Davies-Bouldin index, and the silhouette score to find the optimal number of clusters. The analysis is also performed on a reduced 1/8th dataset to see whether the results are similar when using less data.
Hyun wong thesis 2019 06_22_rev40_final_Submitted_online (Hyun Wong Choi)
The document summarizes a master's dissertation that analyzes electricity consumption at home through K-means clustering and silhouette scoring. It introduces machine learning and clustering techniques. It then describes the experimental environment, the dataset used, previous work on related topics, and the proposed approach of applying K-means clustering to analyze the electricity consumption dataset. The key aspect analyzed is the optimal number of clusters as determined by the Calinski-Harabasz index, the Davies-Bouldin index, and the silhouette score. Results are compared between the full dataset and the dataset reduced to 1/8 of its size.
Hyun wong sample thesis 2019 06_01_rev17_final (Hyun Wong Choi)
This master's dissertation analyzes electricity consumption at home through a comparative analysis from a silhouette-score perspective. The dissertation contains two papers that apply k-means clustering to household electricity usage data. Paper 1 uses k-means clustering and evaluates the optimal number of clusters using the Davies-Bouldin index and the silhouette score. Paper 2 performs a comparative analysis on a 1/8 size dataset using the silhouette score. The evaluation shows that the comparison index results are similar even when using smaller datasets. The dissertation applies machine learning techniques to analyze electricity consumption and optimize cluster analysis for effective load forecasting and management.
This master's thesis explores optimal control of energy and thermal management systems in fuel cell hybrid electric vehicles (FCHEVs) to minimize hydrogen consumption. A model of an FCHEV powertrain is developed for optimal control using dynamic programming. Control strategies are found that optimally operate the energy and thermal systems during driving missions. The results provide insight into how to control the powertrain to efficiently use hydrogen. It is concluded that integrated energy and thermal strategies can increase fuel efficiency, with the optimal strategy dependent on fuel cell characteristics.
This master's thesis examines dynamic programming control for energy management in smart homes with photovoltaic systems and battery storage. It first provides background on photovoltaic generation and feed-in tariffs in Germany. It then formulates the energy management problem as a Markov decision process and explores various control approaches including rule-based control, linear programming, dynamic programming, and approximate dynamic programming. The thesis evaluates these methods using real solar generation and electricity price data. The goal is to optimize battery charging and discharging to minimize energy costs while satisfying household demand.
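The dynamic programming approach to battery scheduling can be illustrated with a small deterministic backward induction. This is a simplified sketch, not the thesis's Markov decision process: prices, demand, and PV output are assumed known in advance, energy is discretised to whole units, and the function name `plan_battery` and all parameters are hypothetical.

```python
def plan_battery(prices, demand, pv, capacity=3, max_rate=1, feed_in=0.0):
    """Minimise grid cost by backward induction over (hour, state of charge).

    Actions are integer energy units: positive charges, negative discharges.
    Returns the minimal cost starting from an empty battery and the action list.
    """
    horizon = len(prices)
    cost_to_go = [[0.0] * (capacity + 1) for _ in range(horizon + 1)]
    best_action = [[0] * (capacity + 1) for _ in range(horizon)]
    for t in range(horizon - 1, -1, -1):
        for soc in range(capacity + 1):
            best = None
            for action in range(-max_rate, max_rate + 1):
                nxt = soc + action
                if not 0 <= nxt <= capacity:
                    continue  # would overfill or over-drain the battery
                grid = demand[t] - pv[t] + action  # > 0 import, < 0 export
                stage = prices[t] * grid if grid > 0 else feed_in * grid
                total = stage + cost_to_go[t + 1][nxt]
                if best is None or total < best:
                    best, best_action[t][soc] = total, action
            cost_to_go[t][soc] = best
    # Roll the optimal policy forward from an empty battery.
    soc, actions = 0, []
    for t in range(horizon):
        actions.append(best_action[t][soc])
        soc += actions[-1]
    return cost_to_go[0][0], actions

# Hypothetical two-hour example: cheap first hour, expensive second hour, no PV.
cost, actions = plan_battery(prices=[0.1, 0.4], demand=[1, 1], pv=[0, 0])
print(cost, actions)
```

In this toy case the plan charges during the cheap hour and discharges during the expensive one, which is the kind of price arbitrage the summary describes.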
This document summarizes a master's dissertation that analyzes electricity consumption at home through K-means clustering and silhouette scoring. The dissertation contains two papers. Paper 1 analyzes household electricity consumption data from UC Irvine using K-means clustering to determine the optimal number of clusters based on silhouette scoring and other indices. The analysis finds seven clusters to be optimal. Paper 2 performs a comparative analysis using a 1/8 subset of the full dataset, finding that silhouette scores are approximately half of the full dataset but the optimal number of clusters is similar. The dissertation concludes that machine learning clustering can effectively analyze electricity consumption patterns and predict optimal clustering even with smaller datasets.
This project describes integrating wind power into a DC microgrid that stores and transforms power. A microgrid consists of distributed energy sources like wind turbines and solar PV systems connected to electrical loads. The project simulates connecting a wind turbine to an asynchronous machine, rectifier, and DC bus using Simulink. Operational optimization of the microgrid is analyzed to minimize costs and emissions while maintaining supply-demand balance and battery state of charge. Integration of the DC microgrid is proposed and simulation results are presented.
This document is the thesis presented by Joanie Michellene Claudette Geldenhuys to Stellenbosch University for the degree of Master of Science in Electrical and Electronic Engineering. The thesis investigates the use of model predictive control for current control of a three-phase grid-tied voltage source converter with an LCL filter. A cost function is formulated to minimize reference tracking error and switching frequency. The grid voltage is incorporated into the system model as an additional input. Simulation results show the controller provides fast transient response and good reference tracking at high switching frequencies but is unable to meet harmonic limits at low switching frequencies as required by the South African grid code.
This document summarizes a master's dissertation that analyzes electricity consumption at home through K-means clustering and silhouette scoring. It contains two papers. Paper 1 analyzes a household electricity usage dataset using K-means clustering to identify the optimal number of clusters, as determined by the Calinski-Harabasz index, the Davies-Bouldin index, and the silhouette score. Paper 2 performs a similar analysis on a dataset reduced to 1/8 of its size to compare results. The dissertation concludes that both analyses produce similar silhouette scores even with a smaller dataset.
This document is the final report for an industrial project that aims to optimize electrical energy use through an automated lighting control system using a programmable logic controller (PLC). The report includes sections on background theory, detailed design, experimental evaluation, results analysis, discussion, and conclusions. It describes designing a system to intelligently control lighting in buildings by using PIR sensors, a day/night sensor, and a PLC to automatically turn lights on and off based on occupancy and daylight levels, in order to conserve energy.
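The occupancy-and-daylight rule described here reduces to a small piece of boolean logic. The sketch below expresses it in Python rather than in a PLC language; the function name, the argument names, and the hold-off timer that keeps lights on briefly after the last detected motion are assumptions, since the summary does not give the report's actual control program.

```python
def light_should_be_on(pir_triggered, is_dark, seconds_since_motion, hold_seconds=300):
    """Decide the lamp output for one control scan.

    pir_triggered: iterable of booleans, one per PIR sensor in the zone.
    is_dark: reading of the day/night sensor (True when daylight is insufficient).
    seconds_since_motion: time since any PIR sensor last fired.
    hold_seconds: assumed hold-off period before an empty zone goes dark.
    """
    occupied = any(pir_triggered) or seconds_since_motion < hold_seconds
    return is_dark and occupied

# Example scans (hypothetical readings).
print(light_should_be_on([False, True], is_dark=True, seconds_since_motion=999))
print(light_should_be_on([False, False], is_dark=True, seconds_since_motion=999))
print(light_should_be_on([True], is_dark=False, seconds_since_motion=0))
```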
This master's dissertation analyzes electricity consumption in homes through k-means clustering and silhouette scoring. Paper 1 introduces machine learning and k-means clustering. It applies k-means to electricity usage data from the UC Irvine repository to obtain optimal cluster groupings. The Calinski-Harabasz index and silhouette score are used to determine the optimal number of clusters. Paper 1 results show k-means clustering of the electricity data from 1 to 10 clusters. Paper 2 further analyzes a reduced 1/8th dataset and finds similar silhouette scoring, indicating the analysis is consistent even with less data. The dissertation applies machine learning techniques to provide new insights into home electricity optimization and usage patterns.
This master's dissertation analyzes electricity consumption in homes through k-means clustering and silhouette scoring. Paper 1 introduces machine learning and k-means clustering. It applies k-means to electricity usage data from the UC Irvine repository to obtain optimal cluster groupings. The Calinski-Harabasz index and silhouette score are used to determine the optimal number of clusters. Paper 1 results show k-means clustering of the electricity data from 1 to 10 clusters. Paper 2 further analyzes a reduced 1/8th dataset and finds similar silhouette scoring, indicating the analysis is consistent even with less data. The dissertation applies machine learning techniques to provide new insights into home electricity optimization and usage patterns.
Hyun wong sample thesis 2019 06_19_rev21_finalHyun Wong Choi
This master's dissertation analyzes electricity consumption in homes through k-means clustering and silhouette scoring. Chapter 3 discusses paper 1, which introduces machine learning and k-means clustering. It applies k-means to household electricity consumption data to obtain optimal data points. The Calinski-Harabasz index and silhouette score determine the optimal number of clusters. Paper 2, discussed in Chapter 4, also uses k-means clustering on a reduced 1/8 dataset and finds similar silhouette scores, showing the approach is effective even with smaller data. The dissertation evaluates different clustering techniques and indices to analyze household electricity usage through unsupervised machine learning.
This master's dissertation analyzes electricity consumption in homes through k-means clustering and silhouette scoring. Chapter 3 discusses paper 1, which introduces machine learning and k-means clustering. It applies k-means to household electricity consumption data to obtain optimal data points. The Calinski-Harabasz index and silhouette score determine the optimal number of clusters. Paper 2, discussed in Chapter 4, also uses k-means clustering on a reduced 1/8 dataset and finds similar silhouette scores, showing the approach is effective even with smaller data. The dissertation evaluates different clustering validity indices and applies machine learning techniques to optimize home electricity usage.
This paper analyzes electricity consumption in homes through k-means clustering. It introduces machine learning and clustering techniques. The paper uses the Calinski-Harabasz Index and Silhouette_score to determine the optimal number of clusters in k-means for a household electricity consumption dataset from UC Irvine. It performs k-means clustering with varying values of k and calculates the indexes to evaluate the results. The key findings are that the indexes show similar performance even when the dataset is reduced to 1/8 of its original size. This demonstrates the ability of machine learning algorithms to analyze datasets efficiently at different scales.
This document surveys common MPPT (maximum power point tracking) methods used in photovoltaic systems, including conventional and advanced algorithms. It analyzes the Perturbation and Observation (P&O), Incremental Conductance (IncCond), and fuzzy logic-based MPPT controllers through MATLAB/Simulink simulations. The simulations show the fuzzy logic controller has better static and dynamic performance than the conventional P&O and IncCond techniques under varying weather conditions such as changes in solar radiation and temperature.
This document summarizes a master's dissertation that analyzes electricity consumption at home through K-means clustering. The dissertation contains two papers:
1. The first paper analyzes household electricity usage data through K-means clustering to obtain optimal data points. It uses the Calinski-Harabasz Index and Silhouette_score to determine the optimal number of clusters.
2. The second paper performs a comparative analysis on a dataset that is 1/8 the size of the original. It finds that the Silhouette score is half of the original dataset, even with the smaller data.
The dissertation applies unsupervised machine learning clustering techniques to analyze household electricity consumption data, in order to optimize costs and identify factors
Hyun wong thesis 2019 06_19_rev32_final
1. Master’s Dissertation
Comparative Analysis of Electricity
Consumption at Home through a
Silhouette-score prospective
Hyun Wong Choi
Department of Electrical and Computer Engineering
The Graduate School
Sungkyunkwan University
3. A Dissertation Submitted to the Department of
Electrical and Computer Engineering and
the Graduate School of Sungkyunkwan University
in partial fulfillment of the requirements
for the degree of Master of Science in Engineering
April 2019
Approved by
Professor Dr. Dong Ryeol Shin
4. This certifies that the dissertation of
Hyun Wong Choi is approved.
Dr. MUHAMMAD MANNAN SAEED
Committee Chair: Prof.
Dr. Eung Mo Kim
Committee Member : Prof.
Dr. Dong Ryeol Shin
Major Advisor: Prof.
Dr. Nawab Muhammad Faseeh Querish
Co-Advisor: Prof.
The Graduate School
Sungkyunkwan University
June 2019
8. Abstract
Machine learning has emerged as a new tool for data analytics in distributed
computing environments, improving both processing capacity and the
effectiveness of analysis. In the first paper, the electricity usage of the
home is analyzed through the K-means clustering algorithm to obtain the optimal
home electricity usage data points. The Davies-Bouldin index and
Silhouette_score are used to find the optimal number of clusters for the
K-means algorithm, and an application scenario of machine learning clustering
analytics is presented.
Machine learning is a state-of-the-art subfield of artificial intelligence that
has evolved to support large-scale intelligent analytics in distributed
computing environments. In the second paper, we perform a comparative analysis
of a dataset collected on home electricity usage, based on the K-means
clustering algorithm, comparing the silhouette score of the full dataset with
that of a 1/8-ratio dataset. The performance evaluation shows that the
silhouette scores are similar even when the dataset is smaller than before.
Keywords: Machine Learning, K-means clustering
Big data analytics has simplified the complexity of large-scale dataset
processing in a parallel distributed environment.
9. Chapter 1
Introduction
Electricity consumption from the power grid
In the power grid, consumption is measured through sensors:
- Industrial consumption
- Housing consumption
- Factory consumption
Housing consumption
- Front end (consumer end)
- Back end (electrical company end)
- Dataset for consumption: UC Irvine
Many techniques address the optimization problem of electricity, but none of
them focus on housing electricity optimization. In particular,
- Reducing the cost
- Factors of overcharge
- Prediction
are not available.
Solution
K-means algorithm
Why choose k-means clustering?
- To predict the answer from the dataset
- No answer is available in terms of k-means
Why predict the answers?
- No clear result
10. 3A: The electricity usage of the home is analyzed through the k-means
clustering algorithm to obtain the optimal home electricity usage data points.
The Calinski-Harabasz index, Davies-Bouldin index, and silhouette_score find
the optimal number of clusters for the K-means algorithm, and an application
scenario of the machine learning algorithm is presented.
3B: The dataset is reduced to 1/8 of its size, and the same result is obtained.
The proposed approach delivers efficient and meaningful prediction results
that have not been obtained before.
Machine learning is an analysis mechanism that fetches and identifies matching
patterns in existing datasets to form new results. This paper discusses
comparative analytics for unsupervised learning algorithms, in which we compare
the K-means clustering result for a reduced-ratio dataset with the
silhouette_score result. We found that the Davies-Bouldin index did not work
smoothly in the scikit-learn library, so we performed a check analysis of the
Calinski-Harabasz index and silhouette score along with the Davies-Bouldin
index and compared the results with each other. We learned that when the
dataset is reduced to the mentioned proportion, the resulting dataset shows
half the score of the original dataset.
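The comparison described above can be sketched as follows. This is a minimal illustration, not the dissertation's code: it assumes scikit-learn and NumPy, uses synthetic two-dimensional data in place of the household dataset, and the helper name `silhouette_for`, the choice of k = 3, and the random subsampling scheme are all hypothetical. It does not reproduce the scores reported in this work.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def silhouette_for(X, k, seed=0):
    """Fit k-means with k clusters and return the silhouette score."""
    labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
    return silhouette_score(X, labels)


rng = np.random.default_rng(0)

# Hypothetical stand-in for the full dataset: three groups of 800 points each.
centers = np.array([[0.0, 0.0], [8.0, 0.0], [0.0, 8.0]])
X_full = np.vstack([c + rng.normal(scale=0.7, size=(800, 2)) for c in centers])

# Reduced dataset: a random sample of 1/8 of the rows, without replacement.
idx = rng.choice(len(X_full), size=len(X_full) // 8, replace=False)
X_eighth = X_full[idx]

full_score = silhouette_for(X_full, k=3)
eighth_score = silhouette_for(X_eighth, k=3)
print(f"full: {full_score:.3f}  1/8: {eighth_score:.3f}")
```

How close the two scores are depends on the data and on how the subset is drawn, so the sketch only shows the mechanics of the comparison.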
11. Chapter 2
Overview & Motivation
In practice, analytics on household power consumption can help estimate the
management periods of electricity transformers and transmission equipment.
Electricity consumption data can also be used for progressive taxation,
region-by-region demand forecasting, and maintenance of power plants and
facilities. Similarly, a gas company or car company can estimate consumption,
such as the gas consumption rate, using k-means clustering and validity
indices.
Motivated by Google AI, TensorFlow Conference 2017
12. Chapter 3
Paper-1 Content
3.1. Introduction
Machine learning is a subfield of artificial intelligence that develops
algorithms and techniques enabling computers to learn [1]. It is used to
train computers for tasks such as (i) distinguishing whether received e-mails
are spam or not, (ii) data classification, (iii) association rule
identification, and (iv) character recognition.
Machine learning includes a series of processes in which a computer
(i) looks up similar patterns, (ii) generates a novel classification system,
(iii) performs data analytics, and (iv) produces meaningful results. It is a
kind of artificial intelligence that can make predictions based on results
when supported only by analytics algorithms. Machine learning is a
step-by-step evolution that leads from big data analytics to predicting future
actions and making decisions on its own from past learned results. The key
issues in building a successful prediction model remain increasing the
probability and reducing the error, and these problems are resolved through
numerous learning iterations [2].
13. At the heart of machine learning are representation and generalization,
where representation is the evaluation of data and generalization is the
processing of future data. Unsupervised learning is a type of machine learning
that is used primarily to determine how data is organized. Unlike supervised
learning or reinforcement learning, this method is not given target values for
the input values [3].
Unsupervised learning is closely related to density estimation in statistics.
It can summarize and describe the main characteristics of the data. An example
of unsupervised learning is clustering. In this paper, we use the K-means
algorithm to measure the optimal number of clusters based on the
Calinski-Harabasz index, Silhouette_score, and Davies-Bouldin index, and then
apply it to household electricity consumption analysis.
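The procedure of sweeping over candidate cluster counts and scoring each one can be sketched as below. This is a hedged illustration rather than the code used in this work: it assumes scikit-learn, replaces the household dataset with synthetic blobs whose centers are chosen by hand, and the function name `score_cluster_counts` and the range of k from 2 to 8 are hypothetical.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)


def score_cluster_counts(X, k_values, seed=0):
    """Fit k-means for each k and collect the three validity indices."""
    results = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
        results[k] = {
            "calinski_harabasz": calinski_harabasz_score(X, labels),  # higher is better
            "davies_bouldin": davies_bouldin_score(X, labels),        # lower is better
            "silhouette": silhouette_score(X, labels),                # higher is better
        }
    return results


# Synthetic stand-in data (assumption): four well-separated groups.
X, _ = make_blobs(
    n_samples=600,
    centers=[[0, 0], [10, 0], [0, 10], [10, 10]],
    cluster_std=0.6,
    random_state=42,
)

# All three indices need at least two clusters, so the sweep starts at k = 2.
scores = score_cluster_counts(X, range(2, 9))
best_k = max(scores, key=lambda k: scores[k]["silhouette"])
print("k with the highest silhouette score:", best_k)
```

Note that the indices point in different directions: a higher Calinski-Harabasz or silhouette value indicates a better clustering, while a lower Davies-Bouldin value does.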
14. Paper-1 Methodology
3.1.1.1. Sub-topics
3.4 Paper-1 EVALUATION
3.4.1. Experimental Environment
Software: Anaconda3 + PyCharm3
OS: Windows 10 Professional
RAM: 16.0 GB
Processor: i7-6600U CPU @ 2.60 GHz
Hard disk: 420 GB SSD
3.4.2. Experimental Dataset
3.2. Previous work
Machine Learning
Machine learning is similar to data mining, but it differs in that it predicts
data based on attributes learned mainly from training data. In addition to the
three main techniques (unsupervised learning, supervised learning, and
reinforcement learning), various other machine learning techniques such as
semi-supervised learning and deep learning algorithms have been developed and
used.
Clustering
Clustering is a data mining method that defines clusters by considering
the characteristics of the given data and finds a representative point for
each data group. A cluster is a group of data with similar characteristics;
data with different characteristics must belong to different clusters.
Clustering is the main task of exploratory data mining and a common
technique for statistical data analysis, used in many fields, including
pattern recognition, information retrieval, machine learning, and computer
graphics [3]. A good clustering pursues two goals:
(1) Maximizing the inter-cluster variance
(2) Minimizing the intra-cluster variance
Note, however, that clustering should be distinguished from
classification. Clustering is unsupervised learning without correct answers;
in other words, we group similar objects without knowing the group of each
object. Classification, on the other hand, is supervised learning: when you
carry out a classification task, you learn to predict the dependent variable
(Y) from the independent variable (X) of the data [4].
Cluster Validity Assessment
Since clustering tasks have no correct answers, they cannot be evaluated
with simple indicators such as accuracy, as in a typical supervised machine
learning algorithm. It is therefore not easy to find the optimal number of
clusters. Cluster analysis itself is not one specific algorithm, but the
general task to be solved. It can be achieved by various algorithms that
differ significantly in their understanding of what constitutes a cluster
and how to find clusters efficiently. Popular notions of clusters include
groups with small distances between cluster members and dense areas of the
data space.
Scikit-learn
In general, a learning problem considers a set of n samples of data and
then tries to predict properties of unknown data. If each sample is more
than a single number, for instance a multi-dimensional entry, it is said to
have several attributes or features.
In supervised learning, the data comes with additional attributes that we
want to predict. One such problem is classification: samples belong to two
or more classes and we want to learn from already labeled data how to
predict the class of unlabeled data. An example of a classification problem
is handwritten digit recognition, in which the aim is to assign each input
vector to one of a finite number of discrete categories. Another way to
think of classification is as a discrete (as opposed to continuous) form of
supervised learning, where one has a limited number of categories and, for
each of the n samples provided, tries to label it with the correct category
or class.
Scikit-learn is a Python module that integrates a wide range of machine
learning algorithms for medium-scale problems. The package makes machine
learning accessible through a general-purpose high-level language, with an
emphasis on ease of use, thorough documentation, and a consistent API. It is
distributed under the BSD license, which permits both academic and
commercial use. The source code and documentation can be downloaded from the
project website [10]
Scikit-learn covers many supervised and unsupervised learning problems. It
includes generalized linear models, linear and quadratic discriminant
analysis, kernel ridge regression, support vector machines, and stochastic
gradient descent.
3.3. Proposed Approach
The K-means algorithm is one of the partitioning clustering methods;
partitioning means dividing the data into several partitions. For example,
given n data objects, the input data is divided into K (<= n) groups, each
of which forms a cluster. When forming the clusters, the K-means algorithm
uses the following cost function [11]:
$$\underset{S}{\arg\min} \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2$$
Here S_i is the i-th group and μ_i is its centroid. In other words, each
data object is assigned to one of the K groups, and the partitioning
proceeds by reducing this cost function, which is based on dissimilarity
(distance). As a result, the similarity among objects in the same group
increases and the similarity between different groups decreases [12]. The
cost is the sum, over all groups, of the squared distances between each
centroid and the data objects in its group; clustering progresses by
updating the group of each data object according to this value [5].
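
The assign-and-update loop behind this cost function can be sketched in a
few lines of NumPy. This is a minimal illustration and not the code used
for the experiments: the helper names kmeans_cost and lloyd_kmeans, the toy
points, and the choice of initial centroids are all assumptions made here.

```python
import numpy as np


def kmeans_cost(X, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    return float(np.sum((X - centroids[labels]) ** 2))


def lloyd_kmeans(X, init_centroids, max_iter=100):
    """Plain Lloyd iteration: assign points, then recompute centroids."""
    centroids = np.asarray(init_centroids, dtype=float).copy()
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid moves to the mean of its members.
        new_centroids = centroids.copy()
        for i in range(len(centroids)):
            members = X[labels == i]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_centroids[i] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


# Toy data: two obvious groups (illustrative values, not the household data).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
labels, centroids = lloyd_kmeans(X, init_centroids=X[[0, 3]])
cost = kmeans_cost(X, labels, centroids)
```

In practice a library implementation such as scikit-learn's KMeans would
normally be preferred, since it also handles initialization and restarts.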
Internal measures of how well the data has been clustered include the
Calinski-Harabasz index, the Davies-Bouldin index, the Dunn index, and the
silhouette score. In this paper, we evaluate the clustering with the
Calinski-Harabasz index and the silhouette score.
For k clusters, the Calinski-Harabasz index s(k) is given by the ratio of
the between-cluster dispersion to the within-cluster dispersion:
$$s(k) = \frac{\mathrm{Tr}(B_k)}{\mathrm{Tr}(W_k)} \times \frac{N - k}{k - 1}$$
Here B_k is the between-group dispersion matrix and W_k is the
within-cluster dispersion matrix, defined as
$$W_k = \sum_{q=1}^{k} \sum_{x \in C_q} (x - c_q)(x - c_q)^T$$
$$B_k = \sum_{q} n_q (c_q - c)(c_q - c)^T$$
where N is the number of data points, C_q is the set of points in cluster q,
c_q is the centroid of cluster q, c is the center of the whole dataset, and
n_q is the number of points in cluster q.
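
As a worked check of these definitions, the index can be computed directly
from the two traces. The sketch below is an illustration only: the function
name calinski_harabasz and the four toy points are assumptions made here.
Current scikit-learn releases expose the same quantity as
sklearn.metrics.calinski_harabasz_score (older releases spelled it
calinski_harabaz_score).

```python
import numpy as np


def calinski_harabasz(X, labels):
    """s(k) = [Tr(B_k) / Tr(W_k)] * [(N - k) / (k - 1)]."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n_samples = X.shape[0]
    cluster_ids = np.unique(labels)
    k = len(cluster_ids)
    overall_center = X.mean(axis=0)  # c, the center of the whole dataset

    trace_w = 0.0  # within-cluster dispersion, Tr(W_k)
    trace_b = 0.0  # between-cluster dispersion, Tr(B_k)
    for q in cluster_ids:
        members = X[labels == q]
        centroid = members.mean(axis=0)  # c_q
        trace_w += np.sum((members - centroid) ** 2)
        trace_b += len(members) * np.sum((centroid - overall_center) ** 2)

    return float(trace_b / trace_w * (n_samples - k) / (k - 1))


# Four toy points in two clusters (illustrative values only).
X = np.array([[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]])
labels = np.array([0, 0, 1, 1])
score = calinski_harabasz(X, labels)
```

For this toy input Tr(W_k) = 4, Tr(B_k) = 16, and (N - k)/(k - 1) = 2, so
the index is 8.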
The silhouette score is a simple measure defined for each data point i.
Let a(i) be the mean distance between i and the other points in its own
cluster, and let b(i) be the mean distance between i and the points of the
nearest cluster to which i does not belong. The silhouette score s(i) is
then calculated as
$$s(i) = \frac{b(i) - a(i)}{\max\{a(i), b(i)\}}$$
The value of s(i) calculated this way satisfies
$$-1 \le s(i) \le 1$$
A value of s(i) close to 1 means that data point i is assigned to the
correct cluster, and a value close to -1 means that it is assigned to the
wrong cluster. In this paper, we use the machine learning library
scikit-learn to cluster the household power consumption data [7].
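
The per-point silhouette values can be computed directly from a(i) and
b(i). The following is a small sketch under stated assumptions: the helper
name silhouette_values and the one-dimensional toy points are introduced
here for illustration, Euclidean distance is assumed, and points in
single-member clusters are given s(i) = 0 by convention.

```python
import numpy as np


def silhouette_values(X, labels):
    """Per-point s(i) = (b(i) - a(i)) / max(a(i), b(i)); needs >= 2 clusters."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances between all points.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        same = labels == labels[i]
        if same.sum() == 1:
            continue  # singleton cluster: s(i) stays 0
        # a(i): mean distance to the other members of the own cluster
        # (the distance of the point to itself is 0, so divide by size - 1).
        a = dist[i, same].sum() / (same.sum() - 1)
        # b(i): smallest mean distance to the members of any other cluster.
        b = min(dist[i, labels == other].mean()
                for other in np.unique(labels) if other != labels[i])
        scores[i] = (b - a) / max(a, b)
    return scores


# Toy points on a line (illustrative values only).
X = np.array([[0.0], [1.0], [10.0], [11.0]])
good = silhouette_values(X, [0, 0, 1, 1])  # natural grouping
bad = silhouette_values(X, [0, 1, 0, 1])   # deliberately poor grouping
```

With the natural grouping every value is close to 1, while the poor
grouping yields negative values, matching the interpretation of s(i) above.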
The household power consumption dataset was downloaded from the
University of California, Irvine Machine Learning Repository [8]. The fields
of this dataset are separated by a delimiter and include
Global_active_power, Global_reactive_power, Voltage, and Global_intensity.
In the experiment, Global_active_power and Global_reactive_power are used
as the X and Y axes. The Python libraries come from Anaconda3. The key point
of the K-means algorithm is to keep K clusters in the data and to reduce the
distances within each cluster; the algorithm puts a label on each input data
point. Figure 1 shows the result of running the K-means algorithm before
checking the Calinski-Harabasz index and the silhouette score. Figure 1 to
Figure 11 are K-means clustering results for the household power
consumption data from the UC Irvine repository, with the dataset reduced to
1/8 of the original UCI machine learning repository data.
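
Loading the two features and reducing the data can be sketched with pandas
as follows. Several details are assumptions rather than facts from this
paper: that the fields are separated by ';', that missing readings are
written as '?', that the column names match the header of the UCI text
file, and that the reduction keeps every step-th row. The helper name
load_power_features and the sample rows are hypothetical.

```python
import io

import pandas as pd

# Column names follow the header of the UCI text file (assumed here).
FEATURES = ["Global_active_power", "Global_reactive_power"]


def load_power_features(source, step=1):
    """Read a ';'-delimited file and keep every `step`-th complete row."""
    # Assumption: fields are separated by ';' and missing readings are '?'.
    df = pd.read_csv(source, sep=";", na_values=["?"], low_memory=False)
    df = df.dropna(subset=FEATURES)
    return df[FEATURES].astype(float).iloc[::step].reset_index(drop=True)


# Tiny in-memory stand-in for the real file (values are made up).
sample = "\n".join([
    "Date;Time;Global_active_power;Global_reactive_power;Voltage;Global_intensity",
    "01/01/2007;00:00:00;1.0;0.10;240.0;4.0",
    "01/01/2007;00:01:00;2.0;0.20;241.0;8.0",
    "01/01/2007;00:02:00;?;?;?;?",
    "01/01/2007;00:03:00;3.0;0.30;239.0;12.0",
    "01/01/2007;00:04:00;4.0;0.40;238.0;16.0",
])
full = load_power_features(io.StringIO(sample))
reduced = load_power_features(io.StringIO(sample), step=2)
```

For the real data, the same function could be called with the path of the
downloaded file and step=8 to obtain a 1/8 subset.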
2.4.3. Experimental Results
Figure 1. Clustering result at K=1
Figure 2. Clustering result at K=2
Figure 3. Clustering result at K=3
Figure 4. Clustering result at K=4
Figure 5. Clustering result at K=5
Figure 6. Clustering result at K=6
Figure 7. Clustering result at K=7
Figure 8. Clustering result at K=8
Figure 9. Clustering result at K=9
Figure 10. Clustering result at K=10
After reducing the distances within each cluster, we calculate the
Calinski-Harabasz index for each number of clusters. As the number of
clusters increases, the Calinski-Harabasz index decreases, and a K for
which the index is too low is a poor estimate. Deciding whether the
electricity consumption data should be partitioned into one more cluster or
not is very important; this is the most important point.
Figure 11. Silhouette score according to change of cluster number.
As with the Calinski-Harabasz index, we calculate the silhouette score.
As the number of clusters increases, the silhouette score tends to decrease,
so a low score indicates a K that is not optimal.
Choosing a proper number of clusters is very important for the K-means
algorithm. Estimating the silhouette score from the data gives K = 7 as the
result: the silhouette score between the cluster centroids and the data
points is 0.799, which is the optimal score. The Calinski-Harabasz index
gives 560.3999 as its optimal result. Figure 11 shows these results for the
K-means algorithm.
With seven clusters, the centroid of each group and the distances to each
centroid take their optimal values. Based on this result, household power
consumption can be divided into groups around each centroid through
clustering.
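
The procedure of scoring several values of K and keeping the best one can
be sketched with scikit-learn. This is an illustration, not the code used
for the experiments: the synthetic three-group data, the helper name
score_cluster_counts, and the settings n_init=10 and random_state=0 are
assumptions. Both scores are defined only for two or more clusters, so the
loop starts at K = 2.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import calinski_harabasz_score, silhouette_score


def score_cluster_counts(X, k_values, seed=0):
    """Return {k: (silhouette, calinski_harabasz)} for each candidate k."""
    results = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed)
        labels = model.fit_predict(X)
        results[k] = (silhouette_score(X, labels),
                      calinski_harabasz_score(X, labels))
    return results


# Synthetic stand-in data: three tight groups (not the household dataset).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + 0.3 * rng.standard_normal((50, 2)) for c in centers])

scores = score_cluster_counts(X, range(2, 7))
best_k = max(scores, key=lambda k: scores[k][0])  # highest silhouette score
```

On this synthetic input the silhouette score peaks at the true number of
groups; on real consumption data the peak is usually less pronounced.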
Figure 12: Clustering result at K=7
Davies-Bouldin index
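If the ground truth labels are not known, the Davies-Bouldin index
(documented as sklearn.metrics.davies_bouldin_score) can be used to
evaluate the clustering. For clusters i and j, let s_i be the average
distance between each point of cluster i and the centroid of that cluster,
and let d_ij be the distance between the centroids of clusters i and j. The
similarity between the two clusters is
$$R_{ij} = \frac{s_i + s_j}{d_{ij}}$$
Then the Davies-Bouldin index is defined as
$$DB = \frac{1}{k} \sum_{i=1}^{k} \max_{i \neq j} R_{ij}$$
Zero is the lowest possible score, and values closer to zero indicate a
better partition. However, this index was not included in the scikit-learn
library used here and was only explained on the documentation page, so it
could not be tested easily.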
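
When the library function is not available, the index can be computed by
hand from the definition. The sketch below is an illustration under stated
assumptions: the function name davies_bouldin and the toy points are
introduced here, s_i is taken as the mean Euclidean distance of the members
of cluster i to its centroid, and d_ij as the Euclidean distance between
centroids. scikit-learn releases from 0.20 onward also ship
sklearn.metrics.davies_bouldin_score.

```python
import numpy as np


def davies_bouldin(X, labels):
    """DB = (1/k) * sum_i max_{j != i} (s_i + s_j) / d_ij."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    cluster_ids = np.unique(labels)
    k = len(cluster_ids)
    centroids = np.array([X[labels == q].mean(axis=0) for q in cluster_ids])
    # s_i: mean distance from the members of cluster i to its centroid.
    spreads = np.array([
        np.linalg.norm(X[labels == q] - centroids[idx], axis=1).mean()
        for idx, q in enumerate(cluster_ids)
    ])
    worst_ratios = []
    for i in range(k):
        # R_ij for every other cluster j; keep the largest (worst) one.
        ratios = [
            (spreads[i] + spreads[j])
            / np.linalg.norm(centroids[i] - centroids[j])
            for j in range(k) if j != i
        ]
        worst_ratios.append(max(ratios))
    return float(np.mean(worst_ratios))


# Four toy points in two clusters (illustrative values only).
X = np.array([[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]])
db = davies_bouldin(X, [0, 0, 1, 1])
```

Here s_1 = s_2 = 1 and d_12 = 4, so R_12 = 0.5 and the index is 0.5.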
3.4. Related work
Machine learning is a subfield of artificial intelligence that is used
to develop algorithms and techniques that enable computers to learn [1].
It is used to train the computer for various tasks such as (i)
distinguishing whether received e-mails are spam or not, (ii) data
classification, (iii) association rule identification, and (iv) character
recognition.
3.1 Summary
This paper analyzed household power consumption through K-means
clustering. The tools used, scikit-learn and Anaconda3, are open source, so
anyone can easily follow the experiment, and because they use the BSD
license there is no difficulty in applying them to real work. Besides the
K-means algorithm, other machine learning algorithms such as PCA and SVM
can also be used in the same environment. Based on this result, household
power consumption in real life can be analyzed in diverse ways. The
management periods of electricity transformers and transmission facilities
can be estimated, and the electricity consumption of each data group can be
used for progressive taxation, region-by-region demand forecasting, and the
maintenance of power plants and facilities. A gas company can likewise
estimate gas consumption through K-means clustering and these indices.
Chapter 4
Paper-2 Comparative Analysis of Electricity Consumption at Home
through a Silhouette-score perspective
4.1 Introduction
Machine learning is an analysis mechanism that finds matching patterns
in existing datasets in order to form new results. This paper presents a
comparative analysis of unsupervised learning algorithms, in which we
compare the K-means clustering result of a dataset reduced by a fixed ratio
with the silhouette score results. We found that the Davies-Bouldin index
did not work smoothly in scikit-learn, so we checked the Calinski-Harabasz
index and the silhouette score along with the Davies-Bouldin index and
compared the results, in order to learn that, when the dataset is reduced to
the mentioned proportion, the resulting dataset shows half the score of the
original dataset.
4.2 Related work
Machine learning is a field of artificial intelligence that is used to
develop algorithms and techniques that enable computers to learn [1]. It is
used, for example, to train a computer to distinguish whether received
e-mails are spam or not, and it has various other applications such as data
classification, association rule identification, and character recognition.
It includes a series of processes in which a computer finds patterns on
its own, creates a new classification system, analyzes the data, and
produces meaningful results. A successful prediction requires increasing
the probability of a correct result and decreasing the error, and machine
learning addresses both through many iterations of learning [2]. Among the
learning methods, unsupervised learning, unlike supervised learning or
reinforcement learning, is used to summarize how the data is organized [3].
Clustering is a process of mining the dataset by defining clusters that
reflect the characteristics of the input and finding a representative point
for each data group. A cluster is thus a group of related data elements
with similar characteristics. If the characteristics are not the same, the
elements belong to different clusters [3]. Clustering is unsupervised
learning without correct answers: objects carrying similar information are
grouped together. Classification, in contrast, is supervised learning. When
you perform a classification task, the system learns to predict the
dependent variable (Y) from the independent variable (X) of the data [4].
Scikit-learn is a Python module that integrates a wide range of machine
learning algorithms for medium-scale problems. The package makes machine
learning accessible through a general-purpose high-level language, with an
emphasis on ease of use, thorough documentation, and a consistent API. It is
distributed under the BSD license, which permits both academic and
commercial use. The source code and documentation can be downloaded from the
project website [10]. Scikit-learn covers many supervised and unsupervised
learning problems. It includes generalized linear models, linear and
quadratic discriminant analysis, kernel ridge regression, support vector
machines, and stochastic gradient descent.
3.5. Paper-2 Methodology
The K-means algorithm is one of the partitioning clustering methods;
partitioning means dividing the data into several partitions. For example,
given n data objects, the input data is divided into K (≤ n) groups, each
of which forms a cluster. When forming the clusters, the K-means algorithm
uses the following cost function [11]:
$$\underset{S}{\arg\min} \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2$$
Here S_i is the i-th group and μ_i is its centroid. In other words, each
data object is assigned to one of the K groups, and the partitioning
proceeds by reducing this cost function, which is based on dissimilarity
(distance). As a result, the similarity among objects in the same group
increases and the similarity between different groups decreases [12]. The
cost is the sum, over all groups, of the squared distances between each
centroid and the data objects in its group; clustering progresses by
updating the group of each data object according to this value [5].
The silhouette score is a simple measure defined for each data point i.
Let a(i) be the mean distance between i and the other points in its own
cluster, and let b(i) be the mean distance between i and the points of the
nearest cluster to which i does not belong. The silhouette score s(i) is
then calculated as
$$s(i) = \frac{b(i) - a(i)}{\max\{a(i), b(i)\}}$$
The value of s(i) calculated this way satisfies
$$-1 \le s(i) \le 1$$
A value of s(i) close to 1 means that data point i is assigned to the
correct cluster, and a value close to -1 means that it is assigned to the
wrong cluster. In this paper, we use the machine learning library
scikit-learn to cluster the household power consumption data [7]. The
household power consumption dataset was downloaded from the University of
California, Irvine Machine Learning Repository [8]. The fields of this
dataset are separated by a delimiter and include Global_active_power,
Global_reactive_power, Voltage, and Global_intensity. In the experiment,
Global_active_power and Global_reactive_power are used as the X and Y axes.
The Python libraries come from Anaconda3. The key point of the K-means
algorithm is to keep K clusters in the data and to reduce the distances
within each cluster; the algorithm puts a label on each input data point.
Figure 1 shows the result of running the K-means algorithm before checking
the Calinski-Harabasz index and the silhouette score. Figure 1 to Figure 11
are K-means clustering results for the household power consumption data
from the UC Irvine repository, with the dataset reduced to 1/8 of the
original UCI machine learning repository data.
1.1.1. Experimental Environment
Software: Anaconda3 + Pycharm3
OS: Windows 10 Professional
RAM: 16.0 GB
Processor: i7-6600U CPU @ 2.60 GHz
Hard disk: 420 GB SSD
1.1.2. Experimental Dataset
1. date: date in format dd/mm/yyyy
2. time: time in format hh:mm:ss
3. global_active_power: household global minute-averaged active power
(in kilowatt)
4. global_reactive_power: household global minute-averaged reactive power
(in kilowatt)
5. voltage: minute-averaged voltage (in volt)
6. global_intensity: household global minute-averaged current intensity
(in ampere)
7. sub_metering_1: energy sub-metering No. 1 (in watt-hour of active
energy). It corresponds to the kitchen, containing mainly a dishwasher, an
oven and a microwave (hot plates are not electric but gas powered).
8. sub_metering_2: energy sub-metering No. 2 (in watt-hour of active
energy). It corresponds to the laundry room, containing a washing-machine,
a tumble-drier, a refrigerator and a light.
9. sub_metering_3: energy sub-metering No. 3 (in watt-hour of active
energy). It corresponds to an electric water-heater and an air-conditioner.
1.1.3. Experimental Results
Figure 12. Silhouette score according to change of cluster number.
Figure 13. 1/8 dataset Silhouette score according to change of cluster number.
Choosing a proper number of clusters is very important for the K-means
algorithm. Estimating the silhouette score from the full dataset gives
K = 7, where the silhouette score between the cluster centroids and the data
points is 0.799, the optimal score. Although the 1/8 dataset is much
smaller, it also gives K = 7, where the silhouette score of 0.810 is the
optimal score. With seven clusters, for both the full dataset and the 1/8
dataset, the centroid of each group and the distances to each centroid take
their optimal values. This result shows that, although the dataset is
reduced, the vector space of the K-means clusters is preserved: the optimal
number of clusters for household power consumption is the same as for the
original dataset.
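
The comparison between a full dataset and its 1/8 subset can be sketched as
follows. This is an illustration on synthetic data, not a reproduction of
the experiment: the four-group data, the helper name best_k_by_silhouette,
the use of every eighth row as the reduction, and the KMeans settings are
all assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def best_k_by_silhouette(X, k_values, seed=0):
    """Return (best k, its silhouette score) over the candidate k values."""
    scores = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed)
        scores[k] = silhouette_score(X, model.fit_predict(X))
    best = max(scores, key=scores.get)
    return best, scores[best]


# Synthetic stand-in data: four tight groups of 200 points each.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [8.0, 0.0], [0.0, 8.0], [8.0, 8.0]])
full = np.vstack([c + 0.4 * rng.standard_normal((200, 2)) for c in centers])
reduced = full[::8]  # every eighth row, i.e. 1/8 of the data

k_full, s_full = best_k_by_silhouette(full, range(2, 8))
k_reduced, s_reduced = best_k_by_silhouette(reduced, range(2, 8))
```

Because the silhouette score compares every pair of points, its cost grows
quickly with the number of rows; silhouette_score also accepts a sample_size
argument for scoring a random subset of a large dataset.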
Summary
This paper analyzed household power consumption through K-means
clustering. The tools used, scikit-learn and Anaconda3, are open source, so
anyone can easily follow the experiment, and because they use the BSD
license there is no difficulty in applying them to real work. Even when the
dataset is reduced to 1/8, the silhouette score and the clustering results
are the same as before. A larger population, however, shows the
classification and the vector space more clearly. Going from a large dataset
to a small one gives a clear, specific result for the silhouette score, but
the opposite direction is not clearly supported, because the data is a
4-dimensional vector dataset. Based on the experiment, reducing the dataset
can shorten the estimated analysis time when a huge dataset is received.
Chapter 5
Conclusion
This dissertation approaches diverse aspects of K-means clustering
applications. At first I tried to reduce the time consumption of the K-means
algorithm itself; later I changed my perspective to how to reduce the time
needed for a large dataset, so the approach changed. These days, machine
learning algorithms can be used to estimate when a part should be replaced
(its life span). All of the experiments used the scikit-learn library and
Anaconda3, which are open source and, because of the BSD license, can easily
be implemented in any environment. The first experiment shows that diverse
indices can be analyzed. The second experiment shows that, when a huge
dataset needs a long time to analyze in order to find how many centroids are
proper for K-means clustering, the time can be reduced by using a 1/8
dataset, at the cost of a more limited classification and vector space.
Based on the experiment, the estimated analysis time for a huge dataset can
be reduced.
Acknowledgement
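During my master's studies I attended a total of 114 conferences and gave
7 presentations. IEEE Globecom 2017 was the most memorable among them, and
the experiments in this thesis were motivated by the Google AI TensorFlow
Conference 2017. I would like to thank my advisor, Professor Dong-Ryul Shin,
President of Sungkyunkwan University, for his guidance, and my co-advisor,
Nawab Muhammad Fasheeh Queshi, who always guided me with a smile in our
joint work despite my shortcomings.
I also thank SKKU Fellow Professor Hee Yong Yoon of the Mobile Computing
Lab, who helped me first join Sungkyunkwan University, Dr. Chun-Sung Nam,
who made sure that life in the open lab was free of inconvenience, and
Dr. Ki-Hyun Choi, Muhammad Hamza, Janaid, Woohyun Kim, and So Cheong, who
shared the lab with me.
I thank my mother, Professor Bong-Soon Lee of Dongnam Health University,
and my father, Han-chung Choi, a founding member and general manager of LG
Electronics Pyeongtaek Campus (currently a director at Onnuri ENG), who
supported me to the very end of my degree.
I also thank my older brother, Hyeon-suk Choi, a manager at POSCO E&C, who
often drove me home during my degree, Yeo-ul Ahn, a nurse at Seoul National
University Bundang Hospital, and my lovely nephew Yeon-woo.
Finally, I offer my thanks to my older cousins, who were guideposts for me
throughout my degree.
June 19, 2019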
Acknowledgment
I participated in a total of 114 conferences and gave 7 presentations
during my graduate school life. IEEE Globecom 2017 was especially
impressive, and the experiments in this paper were motivated by the Google
AI TensorFlow Conference 2017. I would like to express my gratitude to my
advisor, Professor Dr. Dong-Ryul Shin, who is the President of
Sungkyunkwan University, and to my co-advisor, Assistant Professor Dr. Nawab
Muhammad Fasheeh Queshi, for their guidance and joint work.
I thank SKKU Fellow Professor Hee Yong Yoon, Director of the Mobile
Computing Lab, for helping me join Sungkyunkwan University for the first
time. I am also thankful to Dr. Min Ki Hyun, Muhammad Hamza, Janaid, and
Kim Woohyun.
I would like to extend my sincere thanks to my mother, Bong-Soon Lee,
Professor at Dongnam Health University, who supported me for the duration
of my degree, and to my father, Han-chung Choi, who is a founding member of
LG Electronics Pyeongtaek Campus.
I am also grateful to my older brother, Hyeon-suk Choi, who often drove me
home during my degree, to Ye-ul Ahn, a nurse at Seoul National University
Bundang Hospital, and to my cute nephew Youn-Woo.
I give my thanks to my older cousins, who were milestones for me during my
degree.
June 19, 2019
References
[1] https://en.wikipedia.org/wiki/K-means_clustering
[2] https://en.wikipedia.org/wiki/Cluster_analysis
[3] https://en.wikipedia.org/wiki/Silhouette_(clustering)
[4] https://github.com/sarguido.
[5] http://archive.ics.uci.edu/ml/datasets.html.
[6] http://scikit-learn.org/stable/modules/clustering.html#calinski-harabaz-index
[7] http://scikit-learn.org/stable/.
[8] T. Calinski and J. Harabasz, 1974. “A dendrite method for cluster analysis”.
Communications in Statistics
[9] Kanungo, Tapas et al. “An Efficient k-Means Clustering Algorithm: Analysis and
Implementation.” IEEE Trans. Pattern Anal. Mach. Intell. 24 (2002): 881-892.
[10]Arthur, David, and Sergei Vassilvitskii. "k-means++: The advantages of careful
seeding." Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete
algorithms. Society for Industrial and Applied Mathematics (2007): 1027-1035.
[11]Wagstaff, K., Cardie, C., Rogers, S., & Schrödl, S. (2001, June). Constrained k-
means clustering with background knowledge. In ICML (Vol. 1, pp. 577-584).
[12]Hartigan, J. A., & Wong, M. A. (1979). Algorithm AS 136: A k-means clustering
algorithm. Journal of the Royal Statistical Society. Series C (Applied
Statistics), 28(1), 100-108.
[13]Kanungo, T., Mount, D. M., Netanyahu, N. S., Piatko, C. D., Silverman, R., & Wu,
A. Y. (2002). An efficient k-means clustering algorithm: Analysis and
implementation. IEEE Transactions on Pattern Analysis & Machine Intelligence, (7),
881-892.
[14]Alsabti, K., Ranka, S., & Singh, V. (1997). An efficient k-means clustering algorithm.
[15]Likas, A., Vlassis, N., & Verbeek, J. J. (2003). The global k-means clustering
algorithm. Pattern recognition, 36(2), 451-461.
[16]Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., ... &
Vanderplas, J. (2011). Scikit-learn: Machine learning in Python. Journal of machine
learning research, 12(Oct), 2825-2830.
[17]Buitinck, L., Louppe, G., Blondel, M., Pedregosa, F., Mueller, A., Grisel, O., ... &
Layton, R. (2013). API design for machine learning software: experiences from the
scikit-learn project. arXiv preprint arXiv:1309.0238.
[18]Abraham, A., Pedregosa, F., Eickenberg, M., Gervais, P., Mueller, A., Kossaifi, J., ...
& Varoquaux, G. (2014). Machine learning for neuroimaging with scikit-
learn. Frontiers in neuroinformatics, 8, 14.
[19]Fabian, P., Gaël, V., Alexandre, G., Vincent, M., Bertrand, T., Olivier, G., ... &
Alexandre, P. (2011). Scikit-learn: Machine learning in Python. Journal of Machine
Learning Research, 12, 2825-2830.
[20]Hearst, M. A., Dumais, S. T., Osuna, E., Platt, J., & Scholkopf, B. (1998). Support
vector machines. IEEE Intelligent Systems and their applications, 13(4), 18-28.
[21]Pedregosa, Fabian, et al. "Scikit-learn: Machine learning in Python." Journal of
machine learning research 12.Oct (2011): 2825-2830.
[22]Alsabti, Khaled, Sanjay Ranka, and Vineet Singh. "An efficient k-means clustering
algorithm." (1997).
[23]Ding, Chris, and Xiaofeng He. "K-means clustering via principal component
analysis." Proceedings of the twenty-first international conference on Machine
learning. ACM, 2004.
[24]Paneque-Gálvez, Jaime, et al. "Small drones for community-based forest monitoring:
An assessment of their feasibility and potential in tropical areas." Forests 5.6 (2014):
1481-1507.
[25]Pedregosa, Fabian, et al. "Scikit-learn: Machine learning in Python." Journal of
machine learning research 12.Oct (2011): 2825-2830.
[26]Bishop, Christopher M. Pattern recognition and machine learning. springer, 2006.
[27]Rasmussen, Carl Edward. "Gaussian processes in machine learning." Summer
School on Machine Learning. Springer, Berlin, Heidelberg, 2003.
[28]Hartigan, John A., and Manchek A. Wong. "Algorithm AS 136: A k-means clustering
algorithm." Journal of the Royal Statistical Society. Series C (Applied Statistics) 28.1
(1979): 100-108.
[29]Paneque-Gálvez, Jaime, et al. "Small drones for community-based forest monitoring:
An assessment of their feasibility and potential in tropical areas." Forests 5.6 (2014):
1481-1507.
[30]Sass, Ron, et al. "Reconfigurable computing cluster (RCC) project: Investigating the
feasibility of FPGA-based petascale computing." 15th Annual IEEE Symposium on
Field-Programmable Custom Computing Machines (FCCM 2007). IEEE, 2007.
[31] Duda, Richard O., Peter E. Hart, and David G. Stork. Pattern classification. John
Wiley & Sons, 2012.
[32]Cover, Thomas M., and Peter E. Hart. "Nearest neighbor pattern
classification." IEEE transactions on information theory13.1 (1967): 21-27.
[33]Breiman, Leo. Classification and regression trees. Routledge, 2017.
[34]Haralick, Robert M., and Karthikeyan Shanmugam. "Textural features for image
classification." IEEE Transactions on systems, man, and cybernetics 6 (1973): 610-
621.
[35]Chapelle, Olivier, Bernhard Scholkopf, and Alexander Zien. "Semi-supervised
learning (chapelle, o. et al., eds.; 2006)[book reviews]." IEEE Transactions on
Neural Networks 20.3 (2009): 542-542.
[36]Zhu, Xiaojin, Zoubin Ghahramani, and John D. Lafferty. "Semi-supervised learning
using gaussian fields and harmonic functions." Proceedings of the 20th International
conference on Machine learning (ICML-03). 2003.
[37]Caruana, Rich, and Alexandru Niculescu-Mizil. "An empirical comparison of
supervised learning algorithms." Proceedings of the 23rd international conference
on Machine learning. ACM, 2006.
[38]Jain, Anil K. "Data clustering: 50 years beyond K-means." Pattern recognition
letters 31.8 (2010): 651-666.
[39]Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation
learning with deep convolutional generative adversarial networks." arXiv preprint
arXiv:1511.06434 (2015).
[40]Figueiredo, Mario A. T., and Anil K. Jain. "Unsupervised learning of finite mixture
models." IEEE Transactions on Pattern Analysis & Machine Intelligence 3 (2002):
381-396.
[41]Lovmar, Lovisa, et al. "Silhouette scores for assessment of SNP genotype clusters."
BMC genomics 6.1 (2005): 35.
[42]Collins, Robert T., Ralph Gross, and Jianbo Shi. "Silhouette-based human
identification from body shape and gait." Proceedings of fifth IEEE international
conference on automatic face gesture recognition. IEEE, 2002.
[43]Gat-Viks, Irit, Roded Sharan, and Ron Shamir. "Scoring clustering solutions by their
biological relevance." Bioinformatics 19.18 (2003): 2381-2389.
[44]Maulik, Ujjwal, and Sanghamitra Bandyopadhyay. "Performance evaluation of some
clustering algorithms and validity indices." IEEE Transactions on pattern analysis
and machine intelligence 24.12 (2002): 1650-1654.
[45]Łukasik, Szymon, et al. "Clustering using flower pollination algorithm and calinski-
harabasz index." 2016 IEEE Congress on Evolutionary Computation (CEC). IEEE,
2016.
[46]Desgraupes, Bernard. "Clustering indices." University of Paris Ouest-Lab Modal’X
1 (2013): 34.
[47]Petrovic, Slobodan. "A comparison between the silhouette index and the davies-
bouldin index in labelling ids clusters." Proceedings of the 11th Nordic Workshop of
Secure IT Systems. sn, 2006.
[48]Maulik, Ujjwal, and Sanghamitra Bandyopadhyay. "Performance evaluation of some
clustering algorithms and validity indices." IEEE Transactions on pattern analysis
and machine intelligence 24.12 (2002): 1650-1654.
[49]Petrovic, Slobodan. "A comparison between the silhouette index and the davies-
bouldin index in labelling ids clusters." Proceedings of the 11th Nordic Workshop of
Secure IT Systems. sn, 2006.
[50] https://scikit-learn.org/stable/
[51] https://www.anaconda.com/
[52] https://www.jetbrains.com/pycharm/
[53] Petrovic, Slobodan. "A comparison between the silhouette index and the davies-
bouldin index in labelling ids clusters." Proceedings of the 11th Nordic Workshop of
Secure IT Systems. sn, 2006.
[54] Bandyopadhyay, Sanghamitra, and Ujjwal Maulik. "Nonparametric genetic
clustering: comparison of validity indices." IEEE Transactions on Systems, Man, and
Cybernetics, Part C (Applications and Reviews) 31.1 (2001): 120-125.
[55] https://archive.ics.uci.edu/ml/datasets/individual+household+electric+power+consumption
[56] https://github.com/sarguido