This master's dissertation analyzes electricity consumption in homes through k-means clustering and silhouette scoring. Chapter 3 discusses paper 1, which introduces machine learning and k-means clustering. It applies k-means to household electricity consumption data to obtain optimal clusters of data points. The Calinski-Harabasz index and silhouette score determine the optimal number of clusters. Chapter 4 discusses paper 2, which also uses k-means clustering on a dataset reduced to 1/8 of its original size and finds similar silhouette scores. The conclusion is that machine learning can efficiently predict electricity consumption through clustering algorithms like k-means.
This master's dissertation analyzes electricity consumption in homes through k-means clustering and silhouette scoring. Chapter 3 discusses paper 1, which introduces machine learning and k-means clustering. It applies k-means to household electricity consumption data to obtain optimal clusters of data points. The Calinski-Harabasz index and silhouette score determine the optimal number of clusters. Chapter 4 discusses paper 2, which also uses k-means clustering on a dataset reduced to 1/8 of its original size and finds similar silhouette scores. The conclusion is that machine learning can efficiently predict electricity consumption through clustering algorithms even with smaller datasets.
This paper analyzes electricity consumption in homes through k-means clustering. It introduces machine learning and clustering algorithms. The paper uses the Calinski-Harabasz Index and the silhouette score to determine the optimal number of clusters in k-means for a household electricity consumption dataset from UC Irvine. It performs k-means clustering with varying values of k and calculates the indices to evaluate the results. The key finding is that the indices show similar performance even when the dataset is reduced to 1/8 of its original size. This demonstrates the ability of machine learning algorithms to analyze datasets efficiently at different scales.
This paper analyzes electricity consumption in homes through k-means clustering. It introduces machine learning and clustering algorithms. The paper uses the Calinski-Harabasz Index and the silhouette score to determine the optimal number of clusters in k-means for a household electricity consumption dataset from UC Irvine. Results from k-means clustering on the full dataset and on the dataset reduced to 1/8 are presented through figures showing clustering at different values of k. The indices are calculated to evaluate the clustering results and identify the best k for the datasets.
This paper analyzes electricity consumption in homes through k-means clustering. It introduces machine learning and clustering techniques. The paper uses the Calinski-Harabasz Index and the silhouette score to determine the optimal number of clusters in k-means for a household electricity consumption dataset from UC Irvine. It performs k-means clustering with varying values of k and calculates the indices to evaluate the results. The key finding is that the indices show similar performance even when the dataset is reduced to 1/8 of its original size. This demonstrates the ability of machine learning algorithms to analyze datasets efficiently at different scales.
This document appears to be a master's dissertation that analyzes electricity consumption in homes using machine learning clustering algorithms. It contains two papers. Paper 1 uses K-means clustering on a home electricity usage dataset to obtain optimal clusters of usage data points. It evaluates the optimal number of clusters using silhouette scores, the Calinski-Harabasz Index, and the Davies-Bouldin Index. Paper 2 reduces the dataset to 1/8 size and finds that the comparison indices remain similar, showing the approach is effective even on smaller datasets. The dissertation applies machine learning to optimize home electricity usage, reducing costs and predicting factors that influence overcharging.
This master's dissertation analyzes electricity consumption at home through a K-means clustering algorithm using a silhouette score perspective. The dissertation contains two papers. Paper 1 analyzes a full home electricity usage dataset through K-means clustering to obtain optimal clusters, evaluating cluster numbers using indices like the Davies-Bouldin Index and the silhouette score. Paper 2 reduces the dataset to 1/8 size and finds similar results for the silhouette score, showing the approach works on smaller datasets. The dissertation applies machine learning clustering techniques to optimize home electricity usage and costs and to predict factors driving overcharges.
This master's dissertation analyzes electricity consumption at home through a K-means clustering algorithm using a silhouette score perspective. The dissertation contains two papers. Paper 1 uses K-means clustering on a full home electricity usage dataset to obtain optimal clusters, evaluated using the Calinski-Harabasz Index, the Davies-Bouldin Index, and the silhouette score. Paper 2 reduces the dataset to 1/8 size and finds that the silhouette score results are similar, showing the approach is effective even on smaller datasets. The dissertation applies machine learning clustering techniques to optimize home electricity usage and costs.
This master's dissertation analyzes electricity consumption at home through K-means clustering and compares results using different dataset sizes and evaluation metrics. The dissertation contains two papers: the first analyzes a full home electricity usage dataset using K-means clustering and evaluates optimal cluster numbers with the Calinski-Harabasz Index, the Davies-Bouldin Index, and the silhouette score. The second analyzes a dataset reduced to 1/8 size with K-means clustering and finds similar optimal cluster numbers based on the silhouette score, demonstrating that machine learning can produce consistent results even with smaller datasets. The dissertation applies machine learning algorithms to optimize home electricity usage and costs.
This document summarizes a master's dissertation that analyzes electricity consumption at home through K-means clustering. The dissertation contains two papers:
1. The first paper analyzes household electricity usage data through K-means clustering to obtain optimal clusters of data points. It uses the Calinski-Harabasz Index and the silhouette score to determine the optimal number of clusters.
2. The second paper performs a comparative analysis on a dataset that is 1/8 the size of the original. It finds that the silhouette score on the smaller dataset is about half of the score on the original dataset.
The dissertation applies unsupervised machine learning clustering techniques to analyze household electricity consumption data, in order to optimize costs and identify factors
Hyun wong sample thesis 2019 06_19_rev20_final, by Hyun Wong Choi
This master's dissertation analyzes electricity consumption in homes through K-means clustering and silhouette scoring of a dataset. The document contains two papers. Paper 1 introduces machine learning and K-means clustering. It applies K-means to a household electricity consumption dataset from UC Irvine, testing clusters from 1 to 10. The optimal number of clusters is identified as 7 based on maximizing the Calinski-Harabasz index and silhouette score. Paper 2 applies the same methodology to a reduced 1/8 size version of the dataset, finding similar silhouette scores indicating the clustering remains effective with less data.
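The selection procedure summarized above (run K-means for a range of k, score each result, keep the best k, then repeat on a 1/8 subsample) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library and synthetic two-dimensional points, not the UC Irvine dataset or the dissertation's own code. The farthest-point seeding, the helper names, and the three synthetic blobs are all assumptions made here for the example.

```python
import math
import random


def kmeans(points, k, iters=100):
    """Lloyd's algorithm; seeding by farthest point is an assumption of this sketch."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old centroid if a cluster happens to become empty.
            updated.append(tuple(sum(a) / len(members) for a in zip(*members)) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return labels


def group_by_label(points, labels):
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    return clusters


def silhouette(points, labels):
    """Mean of (b - a) / max(a, b) over all points; singleton clusters score 0."""
    clusters = group_by_label(points, labels)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        total += (b - a) / max(a, b)
    return total / len(points)


def calinski_harabasz(points, labels):
    """Ratio of between-cluster to within-cluster dispersion, each scaled by its degrees of freedom."""
    n = len(points)
    overall = tuple(sum(a) / n for a in zip(*points))
    clusters = group_by_label(points, labels)
    k = len(clusters)
    between = within = 0.0
    for members in clusters.values():
        centre = tuple(sum(a) / len(members) for a in zip(*members))
        between += len(members) * math.dist(centre, overall) ** 2
        within += sum(math.dist(p, centre) ** 2 for p in members)
    return (between / (k - 1)) / (within / (n - k))


def score_range(points, ks):
    """Return {k: (silhouette, calinski_harabasz)} for each k in ks."""
    scores = {}
    for k in ks:
        labels = kmeans(points, k)
        scores[k] = (silhouette(points, labels), calinski_harabasz(points, labels))
    return scores


# Synthetic stand-in data: three well-separated blobs (hypothetical, for illustration only).
rng = random.Random(0)
centres = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
full = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5)) for cx, cy in centres for _ in range(40)]
subset = full[::8]  # every eighth point, i.e. a 1/8 subsample

# The silhouette score needs at least two clusters, so the scan starts at k = 2.
scores_full = score_range(full, range(2, 7))
scores_subset = score_range(subset, range(2, 7))
best_full = max(scores_full, key=lambda k: scores_full[k][0])
best_full_ch = max(scores_full, key=lambda k: scores_full[k][1])
best_subset = max(scores_subset, key=lambda k: scores_subset[k][0])
print("best k (full, silhouette):", best_full)
print("best k (full, Calinski-Harabasz):", best_full_ch)
print("best k (1/8 subset, silhouette):", best_subset)
```

Both indices are maximized at the preferred k. In practice a library implementation of K-means and of these scores would normally replace the hand-written helpers above.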
Hyun wong thesis 2019 06_22_rev40_final_printed, by Hyun Wong Choi
This document summarizes a master's dissertation that analyzes electricity consumption at home through k-means clustering. The dissertation contains two papers:
1. The first paper analyzes electricity usage data from homes using k-means clustering to identify optimal clusters of usage patterns. It evaluates different metrics like silhouette score and clustering indices to determine the optimal number of clusters in the data.
2. The second paper performs a comparative analysis using a dataset reduced to 1/8 of its original size to validate that the silhouette score and the optimal number of clusters are similar even with less data.
The dissertation applies machine learning clustering techniques to analyze electricity consumption data from homes with the goal of optimizing costs and identifying factors for overcharging.
This master's dissertation analyzes electricity consumption at home through a K-means clustering algorithm and silhouette score. The document contains two papers that analyze a household electricity consumption dataset from the University of California, Irvine using K-means clustering. Paper 1 uses the Calinski-Harabasz Index, the Davies-Bouldin Index, and the silhouette score to determine the optimal number of clusters. Paper 2 performs a comparative analysis using a 1/8 subset of the full dataset and finds that the silhouette scores are similar even when using a smaller dataset. The dissertation aims to optimize household electricity usage and costs through machine learning clustering techniques.
The document analyzes electricity consumption at home through K-means clustering and evaluates different cluster validity indices, including the Silhouette score, to determine the optimal number of clusters in the dataset. It performs K-means clustering on a household electricity consumption dataset and compares the results of the Silhouette score and other indices at different values of K to identify the best number of clusters. The analysis aims to help optimize home electricity usage through machine learning clustering techniques.
Hyun wong thesis 2019 06_22_rev40_final_grammerly, by Hyun Wong Choi
The document analyzes electricity consumption at home through K-means clustering and evaluates different cluster validity indices, including the Silhouette score, to determine the optimal number of clusters in the dataset. It performs K-means clustering on a household electricity consumption dataset and compares the results of the Silhouette score and other indices at different values of K to identify the best clustering. The analysis aims to optimize home electricity usage through unsupervised machine learning clustering techniques.
The document analyzes electricity consumption data from homes using K-means clustering to determine optimal clusters in the data. It evaluates different cluster validity indices like the Calinski-Harabasz Index, the Davies-Bouldin Index, and the silhouette score to find the optimal number of clusters. The analysis is also performed on a dataset reduced to 1/8 of its original size to see whether the results are similar when using less data.
Hyun wong thesis 2019 06_22_rev40_final_Submitted_online, by Hyun Wong Choi
The document summarizes a master's dissertation that analyzes electricity consumption at home through K-means clustering and silhouette scoring. It introduces machine learning and clustering techniques. It then describes the experimental environment, dataset used, previous work on related topics, and the proposed approach of applying K-means clustering to analyze the electricity consumption dataset. The key aspect analyzed is the optimal number of clusters as determined by indices like the Calinski-Harabasz Index, the Davies-Bouldin Index, and the silhouette score. Results are compared between the full dataset and the dataset reduced to 1/8.
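For reference, the three cluster validity indices named in these summaries are commonly defined as below. This is a sketch using standard textbook definitions. The notation (n points, k clusters C_1, ..., C_k with centroids c_q, overall mean c, distance d) is chosen here for illustration and is not taken from the dissertation.

```latex
% Silhouette of point i belonging to cluster C_I (set s(i) = 0 when |C_I| = 1);
% the silhouette score is the mean of s(i) over all n points. Higher is better.
\[
a(i) = \frac{1}{|C_I| - 1} \sum_{j \in C_I,\ j \neq i} d(i, j), \qquad
b(i) = \min_{J \neq I} \frac{1}{|C_J|} \sum_{j \in C_J} d(i, j), \qquad
s(i) = \frac{b(i) - a(i)}{\max\{a(i),\ b(i)\}}
\]

% Calinski-Harabasz index. Higher is better.
\[
\mathrm{CH} = \frac{\sum_{q=1}^{k} |C_q| \, \lVert c_q - c \rVert^2 \,/\, (k - 1)}
                   {\sum_{q=1}^{k} \sum_{x \in C_q} \lVert x - c_q \rVert^2 \,/\, (n - k)}
\]

% Davies-Bouldin index, with \sigma_q the mean distance from the points of C_q to c_q.
% Lower is better.
\[
\mathrm{DB} = \frac{1}{k} \sum_{q=1}^{k} \max_{r \neq q} \frac{\sigma_q + \sigma_r}{d(c_q, c_r)}
\]
```

The silhouette score lies in [-1, 1], and both it and the Calinski-Harabasz index are maximized at the preferred number of clusters, whereas the Davies-Bouldin index is minimized.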
This master's thesis examines dynamic programming control for energy management in smart homes with photovoltaic systems and battery storage. It first provides background on photovoltaic generation and feed-in tariffs in Germany. It then formulates the energy management problem as a Markov decision process and explores various control approaches including rule-based control, linear programming, dynamic programming, and approximate dynamic programming. The thesis evaluates these methods using real solar generation and electricity price data. The goal is to optimize battery charging and discharging to minimize energy costs while satisfying household demand.
Semester Project 3: Security of Power Supply, by Søren Aagaard
The project is about the security of power supply, both now and in the future. Renewable energy's share of total electricity production will continue to grow in the coming years; this development is described and analyzed.
The applicable legislation is presented and explained to clarify the legal aspects of the security of power supply.
The economically optimal level of power supply is calculated to help evaluate whether it is profitable to uphold Denmark's high security of power supply.
To provide a more practical view, a model of the power grid has been built to analyze how the grid reacts to the strain caused by faults, which helps clarify the criteria by which the grid is constructed.
This document is the final report for an industrial project that aims to optimize electrical energy use through an automated lighting control system using a programmable logic controller (PLC). The report includes sections on background theory, detailed design, experimental evaluation, results analysis, discussion, and conclusions. It describes designing a system to intelligently control lighting in buildings by using PIR sensors, a day/night sensor, and a PLC to automatically turn lights on and off based on occupancy and daylight levels, in order to conserve energy.
Hyun wong sample thesis 2019 06_01_rev17_final, by Hyun Wong Choi
This master's dissertation analyzes electricity consumption at home through a comparative analysis from a silhouette score perspective. The dissertation contains two papers that apply k-means clustering to household electricity usage data. Paper 1 uses k-means clustering and evaluates the optimal number of clusters using the Davies-Bouldin Index and the silhouette score. Paper 2 performs a comparative analysis on a dataset 1/8 the size of the original using the silhouette score. The evaluation shows that the comparison index results are similar even when using the smaller dataset. The dissertation applies machine learning techniques to analyze electricity consumption and optimize cluster analysis for effective load forecasting and management.
This document summarizes a master's dissertation that analyzes electricity consumption at home through K-means clustering and silhouette scoring. The dissertation contains two papers. Paper 1 analyzes household electricity consumption data from UC Irvine using K-means clustering to determine the optimal number of clusters based on silhouette scoring and other indices. The analysis finds seven clusters to be optimal. Paper 2 performs a comparative analysis using a 1/8 subset of the full dataset, finding that the silhouette scores are approximately half of those for the full dataset but that the optimal number of clusters is similar. The dissertation concludes that machine learning clustering can effectively analyze electricity consumption patterns and predict optimal clustering even with smaller datasets.
This master's thesis explores optimal control of energy and thermal management systems in fuel cell hybrid electric vehicles (FCHEVs) to minimize hydrogen consumption. A model of an FCHEV powertrain is developed for optimal control using dynamic programming. Control strategies are found that optimally operate the energy and thermal systems during driving missions. The results provide insight into how to control the powertrain to efficiently use hydrogen. It is concluded that integrated energy and thermal strategies can increase fuel efficiency, with the optimal strategy dependent on fuel cell characteristics.
This document summarizes a master's thesis that proposes a novel hybrid model approach for appliance load disaggregation. The thesis combines convolutional neural networks and hidden semi-Markov models to model appliances. As a proof of concept, the hybrid model is evaluated on power data from six households to predict washing machine usage. The hybrid model is shown to perform considerably better than a CNN alone, and including transitional features in the HSMM improves performance significantly.
This thesis extends the electromagnetic field calculation capabilities of the open-source CFD software OpenFOAM. It develops new solvers within OpenFOAM to solve magnetostatic problems for materials like copper, steel, and permanent magnets. Two formulations (A-V and A-J) are derived from Maxwell's equations and implemented as OpenFOAM solvers through custom C++ code. Force calculation methods are also implemented to calculate the Lorentz force and the Maxwell stress. Simple test cases are modeled and solved to validate the new solvers. Results are compared to COMSOL Multiphysics and good agreement is found. The developed solvers could be applied to the design of electromagnetic devices like electric machines.
This document is a lecture note on hydropower engineering from Arbaminch University's Department of Hydraulic Engineering. It covers topics related to hydropower development including an introduction to energy sources and hydropower status in Ethiopia. It also discusses the development and layout of hydropower plants, including hydrological analysis, estimation of power potential, and load predictions. Finally, it addresses water passages within hydropower systems such as intake structures, head races, tunnels and forebays. The overall document provides an overview of key concepts and components involved in hydropower engineering.
1) The document discusses the past, current, and future of smartphone technology.
2) In the past, "Pen on Projection" technology allowed writing on any surface using a Bluetooth pen and projected screen.
3) Currently, Qualcomm uses fingerprint sensor technology for authentication and security.
4) In the future, Qualcomm will introduce ultrasonic fingerprint sensors that can scan fingerprints through OLED displays of various thicknesses.
This master's dissertation analyzes electricity consumption at home through a comparative analysis using a silhouette-score prospective. The dissertation contains two papers that apply k-means clustering to household electricity usage data. Paper 1 uses k-means clustering and evaluates the optimal number of clusters using Davis-Bouldin Index and Silhouette_score. Paper 2 performs a comparative analysis on a 1/8 size dataset using silhouette score. The evaluation shows that the comparison index results are similar even when using smaller datasets. The dissertation applies machine learning techniques to analyze electricity consumption and optimize cluster analysis for effective load forecasting and management.
This document summarizes a master's dissertation that analyzes electricity consumption at home through K-means clustering and silhouette scoring. The dissertation contains two papers. Paper 1 analyzes household electricity consumption data from UC Irvine using K-means clustering to determine the optimal number of clusters based on silhouette scoring and other indices. The analysis finds seven clusters to be optimal. Paper 2 performs a comparative analysis using a 1/8 subset of the full dataset, finding that silhouette scores are approximately half of the full dataset but the optimal number of clusters is similar. The dissertation concludes that machine learning clustering can effectively analyze electricity consumption patterns and predict optimal clustering even with smaller datasets.
This master's thesis explores optimal control of energy and thermal management systems in fuel cell hybrid electric vehicles (FCHEVs) to minimize hydrogen consumption. A model of an FCHEV powertrain is developed for optimal control using dynamic programming. Control strategies are found that optimally operate the energy and thermal systems during driving missions. The results provide insight into how to control the powertrain to efficiently use hydrogen. It is concluded that integrated energy and thermal strategies can increase fuel efficiency, with the optimal strategy dependent on fuel cell characteristics.
This document summarizes a master's thesis that proposes a novel hybrid model approach for appliance load disaggregation. The thesis combines convolutional neural networks and hidden semi-Markov models to model appliances. As a proof of concept, the hybrid model is evaluated on power data from six households to predict washing machine usage. The hybrid model is shown to perform considerably better than a CNN alone, and including transitional features in the HSMM improves performance significantly.
This thesis extends the electromagnetic field calculation capabilities of the open-source CFD software OpenFOAM. It develops new solvers within OpenFOAM to solve magnetostatic problems for materials like copper, steel, and permanent magnets. Two formulations (A-V and A-J) are derived from Maxwell's equations and implemented as OpenFOAM solvers through custom C++ code. Force calculation methods are also implemented to calculate Lorenz force and Maxwell stress. Simple test cases are modeled and solved to validate the new solvers. Results are compared to COMSOL Multiphysics and good agreement is found. The developed solvers could be applied to the design of electromagnetic devices like electric machines.
This document is a lecture note on hydropower engineering from Arbaminch University's Department of Hydraulic Engineering. It covers topics related to hydropower development including an introduction to energy sources and hydropower status in Ethiopia. It also discusses the development and layout of hydropower plants, including hydrological analysis, estimation of power potential, and load predictions. Finally, it addresses water passages within hydropower systems such as intake structures, head races, tunnels and forebays. The overall document provides an overview of key concepts and components involved in hydropower engineering.
Hyun wong sample thesis 2019 06_19_rev22_final
Master’s Dissertation
Comparative Analysis of Electricity
Consumption at Home through a
Silhouette-score prospective
Hyun Wong Choi
Department of Electrical and Computer Engineering
The Graduate School
Sungkyunkwan University
A Dissertation Submitted to the Department of
Electrical and Computer Engineering and
the Graduate School of Sungkyunkwan University
in partial fulfillment of the requirements
for the degree of Master of Science in Engineering
April 2019
Approved by
Professor Dr. Dong Ryeol Shin
This certifies that the dissertation of
Hyun Wong Choi is approved.
Committee Chair: Prof. Dr. MUHAMMAD MANNAN SAEED
Committee Member: Prof. Dr. Eung Mo Kim
Major Advisor: Prof. Dr. Dong Ryeol Shin
Co-Advisor: Prof. Dr. Nawab Muhammad Faseeh Querish
The Graduate School
Sungkyunkwan University
June 2019
List of Figures
Fig.1. Figure-1 Description
Fig.2. Figure-2 Description
Fig.3. Figure-3 Description
Abstract
Machine learning is a modern field that has emerged as a new tool for data analytics in distributed computing environments. In several respects, machine learning has improved both processing capacity and the effectiveness of analysis. In this paper, home electricity usage is analyzed with the K-means clustering algorithm to obtain the optimal home electricity usage data points. The Davies-Bouldin index and the silhouette score are used to find the optimal number of clusters for the K-means algorithm, and we present an application scenario for machine learning clustering analytics.
Machine learning is a state-of-the-art subfield of artificial intelligence that has evolved to provide large-scale intelligent analytics in distributed computing environments. In this paper, we perform comparative analytics on a dataset of home electricity usage based on the K-means clustering algorithm, comparing the silhouette score on the full dataset with that on a 1/8-ratio dataset. The performance evaluation shows that the comparison index, the silhouette score, gives similar values even when the dataset is smaller than before.
Keywords: Machine Learning, K-means clustering
Big data analytics has simplified the complexity of large-scale dataset processing in parallel distributed environments.
Chapter 1
Introduction
Electricity consumption from the power grid
In the power grid, consumption is measured through sensors:
- Industrial consumption
- Housing consumption
- Factory consumption
Housing consumption
- Front end (consumer end)
- Back end (electrical company end)
- Dataset for consumption: UC Irvine
Many techniques solve the electricity optimization problem, but none of them focuses on housing electricity optimization. In particular,
- Reducing the cost
- Factors of overcharge
- Prediction
are not available.
Solution
K-means algorithm
Why choose K-means clustering?
- To predict the answer from the dataset
- No answer is available in terms of K-means
Why predict the answers?
- No clear result
In 3A, home electricity usage is analyzed with the K-means clustering algorithm to obtain the optimal home electricity usage data points. The Calinski-Harabasz index, the Davies-Bouldin index, and the silhouette score are used to find the optimal number of clusters for the K-means algorithm, and we present an application scenario for the machine learning algorithm.
In 3B, the dataset is reduced to 1/8 of its size and the analysis yields the same result.
The proposed approach delivers efficient and meaningful prediction results that have not been obtained before.
Machine learning is an analysis mechanism that fetches and identifies matching patterns in existing datasets to form new results. This paper discusses comparative analytics related to unsupervised learning algorithms, in which we compare the K-means clustering result on a dataset reduced to half ratio by its silhouette score. We concluded from our analysis that the Davies-Bouldin index did not work smoothly in the scikit-learn library, so we performed a check analysis with the Calinski-Harabasz index and the silhouette score along with the Davies-Bouldin index and compared their results. We learned that when the dataset is reduced to the mentioned proportion, the resulting dataset shows half the score of the original dataset.
Chapter 2
Overview & Motivation
In real life, household power consumption can be analyzed in diverse ways, and the management period of electricity transformers and transmission power can be estimated from it. The electricity consumption data can also be used for progressive taxation, region-by-region demand forecasting, and maintenance of power plants and facilities. Gas companies or car companies can likewise estimate consumption, for example the gas consumption rate, via K-means clustering and the indices.
Motivated by Google AI, TensorFlow Conference 2017
Chapter 3
Paper-1 Content
3.1. Introduction
Machine learning is a subfield of artificial intelligence that is used to develop algorithms and techniques that enable computers to learn [1]. It is used to train computers for various tasks, such as (i) distinguishing whether received e-mails are spam or not, (ii) data classification, (iii) association rule identification, and (iv) character recognition.
Machine learning includes a series of processes in which a computer (i) looks for similar patterns, (ii) generates a novel classification system, (iii) performs data analytics, and (iv) produces meaningful results. It is a kind of artificial intelligence that can make predictions based on results when it is supported only by analytics algorithms. Machine learning is a step-by-step evolution that leads from big data analytics, to predicting future actions, to making decisions on its own from past learned results. The key issues in building a successful prediction model remain increasing the probability of a correct prediction and reducing the error, and these problems are addressed through numerous learning iterations [2].
At the heart of machine learning are representation and generalization, where representation is the evaluation of data and generalization is the processing of future data. Unsupervised learning is a type of machine learning that is used primarily to determine how data is organized. Unlike supervised learning or reinforcement learning, this method is not given a target value for the input values [3].
Unsupervised learning is closely related to density estimation in statistics. It can summarize and describe the main characteristics of the data. An example of unsupervised learning is clustering. In this paper, we use the K-means algorithm and measure the optimal number of clusters based on the Calinski-Harabasz index, the silhouette score, and the Davies-Bouldin index, and then apply it to household electricity consumption analysis.
Paper-1 Methodology
3.1.1.1. Sub-topics
3.4 Paper-1 Evaluation
2.4.1. Experimental Environment
2.4.2. Experimental Dataset
3.2. Previous work
Machine Learning
Machine learning is similar to data mining, but it differs in that it predicts data based on learned attributes, mainly obtained from training data. In addition to the three techniques of unsupervised learning, supervised learning, and reinforcement learning, various other machine learning techniques, such as semi-supervised learning and deep learning algorithms, have been developed and used.
Clustering
Clustering is a data mining method that defines clusters of data by considering the characteristics of the given data and finds a representative point for each data group. A cluster is a group of data with similar characteristics. If the characteristics of the data are different, they must belong to different clusters. Clustering is a main task of exploratory data mining and a common technique for statistical data analysis, used in many fields, including pattern recognition, information retrieval, machine learning, and computer graphics [3]. A good clustering pursues two goals:
(1) Maximizing the inter-cluster variance
(2) Minimizing the intra-cluster variance
Note, however, that clustering should be distinguished from classification. Clustering is unsupervised learning without correct answers. In other words, we group similar objects without knowing the group of each object. Classification, on the other hand, is supervised learning. In a classification task, the model learns to predict the dependent variable (Y) from the independent variable (X) of the data [4].
Cluster Validity Assessment
Since clustering tasks have no correct answers, they cannot be evaluated with simple indicators such as accuracy, as in typical machine learning algorithms. It is not easy to find the optimal number of clusters without the correct answers. Cluster analysis itself is not one specific algorithm, but the general task to be solved. It can be achieved by various algorithms that differ significantly in their understanding of what constitutes a cluster and how to efficiently find them. Popular notions of clusters include groups with small distances between cluster members and dense areas of the data space.
Scikit-learn
In general, a learning problem considers a set of n samples of data and then tries to predict properties of unknown data. If each sample is more than a single number, for instance a multi-dimensional entry, it is said to have several attributes or features.
In supervised learning, the data comes with additional attributes that we want to predict. This problem can be, for example:
Classification: samples belong to two or more classes, and we want to learn from already labeled data how to predict the class of unlabeled data. An example of a classification problem is handwritten digit recognition, in which the aim is to assign each input vector to one of a finite number of discrete categories. Another way to think of classification is as a discrete (as opposed to continuous) form of supervised learning, where one has a limited number of categories and, for each of the n samples provided, one tries to label it with the correct category or class.
Scikit-learn is a Python module that integrates a wide range of machine learning algorithms for medium-scale problems. The package is built on a high-level language, so it is easy to use, and it provides thorough documentation and a consistent API. It is distributed under the BSD license, so it can be used both academically and commercially. The source code and documentation can be downloaded from its website [10].
Many supervised and unsupervised learning methods are included in scikit-learn, among them generalized linear models, linear and quadratic discriminant analysis, kernel ridge regression, support vector machines, and stochastic gradient descent.
3.3. Proposed Approach
The K-means algorithm is a partitioning clustering method. Partitioning divides the data into several partitions: given n data objects as input, it divides them into K (<= n) groups, each of which forms a cluster. The K-means algorithm forms the clusters by minimizing the following cost function [11]:
$$\underset{S}{\arg\min} \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2$$
where S_i is the i-th cluster and μ_i is its centroid.
In other words, the data objects are divided into K groups. The partition is found by reducing a cost function based on dissimilarity, so that the similarity among objects in the same group increases and the similarity between different groups decreases [12]. The cost function of the K-means algorithm is the sum of squared distances between each group's centroid and the data objects in that group; clustering proceeds by updating the group of each data object according to the value of this function [5].
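The alternation between assigning objects to groups and updating centroids can be sketched in a few lines of NumPy. This is a minimal illustration under our own assumptions, not the implementation used in the experiments: the function names `kmeans_cost` and `lloyd_kmeans`, the random initialization, and the synthetic two-blob data are all hypothetical.

```python
import numpy as np

def kmeans_cost(X, labels, centroids):
    """Cost function: sum of squared distances from each point to its centroid."""
    diffs = X - centroids[labels]
    return float(np.sum(diffs ** 2))

def lloyd_kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd iteration (assumed variant): assign, then update centroids."""
    rng = np.random.default_rng(seed)
    # Hypothetical initialization: pick k distinct data points as centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    costs = []
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Update step: move each centroid to the mean of its members.
        for i in range(k):
            members = X[labels == i]
            if len(members) > 0:
                centroids[i] = members.mean(axis=0)
        costs.append(kmeans_cost(X, labels, centroids))
    return labels, centroids, costs

# Synthetic data for illustration only (not the household dataset).
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)), rng.normal(3.0, 0.3, (50, 2))])
labels, centroids, costs = lloyd_kmeans(X, k=2)
```

Each assignment step and each update step can only keep the cost the same or lower it, which is why the recorded `costs` never increase from one iteration to the next.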
Internal measures of how well the data are clustered include the Calinski-Harabasz index, the Davies-Bouldin index, the Dunn index, and the silhouette score. In this paper, we evaluate the clustering with the Calinski-Harabasz index and the silhouette score.
For k clusters, the Calinski-Harabasz index s(k) is given by the ratio of the between-cluster dispersion to the within-cluster dispersion:
$$s(k) = \frac{\mathrm{Tr}(B_k)}{\mathrm{Tr}(W_k)} \times \frac{N - k}{k - 1}$$
Here B_k is the between-group dispersion matrix and W_k is the within-cluster dispersion matrix, defined as
$$W_k = \sum_{q=1}^{k} \sum_{x \in C_q} (x - c_q)(x - c_q)^T$$
$$B_k = \sum_{q} n_q (c_q - c)(c_q - c)^T$$
where N is the number of data points, C_q is the set of points in cluster q, c_q is the centroid of cluster q, c is the centroid of the whole dataset E, and n_q is the number of points in cluster q.
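To make the formula concrete, the index can be computed directly from the traces, using the fact that the trace of (x - c_q)(x - c_q)^T is the squared distance between x and c_q. The sketch below is our own illustration: the function name `calinski_harabasz` and the four-point toy dataset are assumptions. In practice scikit-learn offers `sklearn.metrics.calinski_harabasz_score` for the same quantity.

```python
import numpy as np

def calinski_harabasz(X, labels):
    """s(k) = Tr(B_k) / Tr(W_k) * (N - k) / (k - 1), computed from the traces."""
    N = X.shape[0]
    clusters = np.unique(labels)
    k = len(clusters)
    c = X.mean(axis=0)  # centroid of the whole dataset E
    tr_W = 0.0
    tr_B = 0.0
    for q in clusters:
        members = X[labels == q]
        c_q = members.mean(axis=0)  # centroid of cluster q
        # Tr of the within-cluster term: sum of squared distances to c_q.
        tr_W += np.sum((members - c_q) ** 2)
        # Tr of the between-cluster term: n_q * squared distance from c_q to c.
        tr_B += len(members) * np.sum((c_q - c) ** 2)
    return (tr_B / tr_W) * (N - k) / (k - 1)

# Toy data (hypothetical): two clusters of two points each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [4.0, 0.0], [4.0, 1.0]])
labels = np.array([0, 0, 1, 1])
score = calinski_harabasz(X, labels)
```

For this toy data Tr(W_k) = 1 and Tr(B_k) = 16, so with N = 4 and k = 2 the index is 16 × 2 = 32. A larger value means the clusters are compact relative to how far apart they lie.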
The silhouette score is a simple measure defined for each data point i. Let a(i) be the mean distance between i and the other data in its own cluster, and let b(i) be the mean distance between i and the data in the nearest cluster to which i does not belong. The silhouette score s(i) is then calculated as
$$s(i) = \frac{b(i) - a(i)}{\max\{a(i), b(i)\}}$$
From this definition, s(i) satisfies
$$-1 \le s(i) \le 1$$
A value of s(i) close to 1 means that data point i is assigned to the correct cluster, and a value close to -1 means that it is assigned to the wrong cluster. In this paper, we use the machine learning library scikit-learn to cluster household power consumption [7].
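The definition of s(i) can be checked on a small example. The following sketch is our own illustration of the formula (the function name `silhouette_values` and the one-dimensional toy data are assumptions); scikit-learn provides `sklearn.metrics.silhouette_samples` and `sklearn.metrics.silhouette_score` for the per-point and averaged values.

```python
import numpy as np

def silhouette_values(X, labels):
    """Return s(i) = (b(i) - a(i)) / max(a(i), b(i)) for every point i.

    Assumes at least two clusters are present in `labels`.
    """
    # Pairwise Euclidean distances between all points.
    dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    clusters = np.unique(labels)
    s = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        own[i] = False  # exclude the point itself from its own cluster
        if not own.any():
            continue  # singleton cluster: s(i) is left at 0
        a = dist[i, own].mean()  # mean distance inside the own cluster
        # Mean distance to the nearest other cluster.
        b = min(dist[i, labels == q].mean() for q in clusters if q != labels[i])
        s[i] = (b - a) / max(a, b)
    return s

# Toy data (hypothetical): two well-separated groups on a line.
X = np.array([[0.0], [1.0], [10.0], [11.0]])
labels = np.array([0, 0, 1, 1])
s = silhouette_values(X, labels)
mean_score = s.mean()
```

For the first point, a = 1 and b = 10.5, so s = 9.5 / 10.5, which is close to 1 as expected for a clearly separated cluster. The score reported for a whole clustering is the mean of s(i) over all points.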
The household power consumption dataset was downloaded from the University of California, Irvine Machine Learning Repository [8]. The dataset is delimiter-separated, with fields including Global_active_power, Global_reactive_power, Voltage, and Global_intensity. In the experiment, Global_active_power and Global_reactive_power are used as the X and Y axes. The Python environment is Anaconda3. The key point of the K-means algorithm is that it keeps K clusters of the data, reduces the distances within each cluster, and assigns a label to each input data point. Figure 1 shows the result of running the K-means algorithm before checking the Calinski-Harabasz index and the silhouette score. Figures 1 to 11 show the K-means clustering results for the 1/8 dataset, obtained by reducing the original household power consumption dataset from the UCI Machine Learning Repository to 1/8 of its size.
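A possible way to load the two fields and build a 1/8 dataset is sketched below. Several points are assumptions on our part and are not stated in the text: that the file is semicolon-separated with "?" marking missing values, that the reduction keeps every eighth row, and the helper name `load_power_data`. The in-memory sample uses made-up values so that the sketch runs without the real file.

```python
import io
import pandas as pd

def load_power_data(source, step=8):
    """Load the two clustering fields and return (full, reduced) frames.

    Assumptions: ';' delimiter, '?' for missing values, and a 1/step
    reduction done by keeping every step-th row.
    """
    df = pd.read_csv(source, sep=";", na_values="?")
    cols = ["Global_active_power", "Global_reactive_power"]
    full = df[cols].dropna().astype(float).reset_index(drop=True)
    reduced = full.iloc[::step].reset_index(drop=True)
    return full, reduced

# Made-up sample in the assumed file layout (not real measurements).
header = "Date;Time;Global_active_power;Global_reactive_power;Voltage;Global_intensity"
rows = [
    f"16/12/2006;17:{m:02d}:00;{1.0 + 0.1 * m:.3f};{0.1 + 0.01 * m:.3f};234.0;5.0"
    for m in range(16)
]
rows.append("16/12/2006;17:16:00;?;?;?;?")  # a row with missing values
sample = io.StringIO("\n".join([header] + rows))
full, reduced = load_power_data(sample)
```

With the real file, `source` would be the path to the downloaded text file. Other reduction schemes, such as taking a random 1/8 sample or the first 1/8 of the rows, are equally plausible readings of the text and may give different scores.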
2.4.3. Experimental Results
Figure 1. Clustering result at K=1
Figure 2. Clustering result at K=2
Figure 3. Clustering result at K=3
Figure 4. Clustering result at K=4
Figure 5. Clustering result at K=5
Figure 6. Clustering result at K=6
Figure 7. Clustering result at K=7
Figure 8. Clustering result at K=8
Figure 9. Clustering result at K=9
Figure 10. Clustering result at K=10
After reducing the distances within each cluster, we calculate the Calinski-Harabasz index for each number of clusters. As the number of clusters increases, the Calinski-Harabasz index decreases, and the K at which the rate of change becomes low is taken as the estimate of K. Whether or not the electricity consumption data should be partitioned into one more cluster is the most important question.
Figure 11. Silhouette score according to change of cluster number.
As with the Calinski-Harabasz index estimation, we calculate the silhouette score. As the number of clusters increases, the silhouette score decreases, and the K at which the rate of decrease becomes low represents the optimal K.
Choosing the proper number of clusters is very important for the K-means algorithm. Estimating the silhouette score from the data gives K = 7: the silhouette score computed from each cluster centroid and the data is 0.799, which is the optimal score. The Calinski-Harabasz index described earlier gives 560.3999 as the optimal result. The result of this K-means analysis is shown in Figure 11.
With K = 7 clusters, each group's centroid and the distances to each centroid take their optimal values. From this result, the household power consumption rate can be divided by centroid through clustering.
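The selection of K described here amounts to running K-means for a range of K values and recording both indices for each. The sketch below shows one way to do this with scikit-learn. It is an illustration under our own assumptions: the helper name `score_k_range`, the choice of the highest silhouette score as the selection rule, and the synthetic three-blob data are hypothetical, and the numbers it produces are unrelated to the reported values 0.799 and 560.3999.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import calinski_harabasz_score, silhouette_score

def score_k_range(X, k_values, seed=0):
    """Return {k: (calinski_harabasz, silhouette)} for each k in k_values."""
    results = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed)
        labels = model.fit_predict(X)
        results[k] = (
            calinski_harabasz_score(X, labels),
            silhouette_score(X, labels),
        )
    return results

# Synthetic stand-in for the (active power, reactive power) points.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
X = np.vstack([rng.normal(c, 0.3, (40, 2)) for c in centers])

results = score_k_range(X, range(2, 7))
# Assumed selection rule: the K with the highest mean silhouette score.
best_k = max(results, key=lambda k: results[k][1])
```

Both indices are only defined for at least two clusters, so the loop starts at K = 2 even though a clustering at K = 1 can still be plotted. On large datasets, `silhouette_score` accepts a `sample_size` argument to limit the cost of the pairwise distance computation.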
Figure 12: Clustering result at K=7
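Davies-Bouldin index
If the ground truth labels are not known, the Davies-Bouldin index (sklearn.metrics.davies_bouldin_score) can be used to evaluate the model. The similarity R_ij between clusters i and j is defined as
$$R_{ij} = \frac{s_i + s_j}{d_{ij}}$$
where s_i is the average distance between each point of cluster i and the centroid of that cluster, and d_ij is the distance between the centroids of clusters i and j. Then the Davies-Bouldin index is defined as
$$DB = \frac{1}{k} \sum_{i=1}^{k} \max_{i \neq j} R_{ij}$$
Zero is the lowest possible score, and values closer to zero indicate a better partition. In our experiments, however, this index was not available in the scikit-learn library we used; it was only explained on the documentation page, so we could not easily experiment with it.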
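For readers who want to try the index, the definition above is short enough to compute by hand, and, to our knowledge, scikit-learn releases from version 0.20 onward expose it as `sklearn.metrics.davies_bouldin_score`. The sketch below compares the two on a toy example; the function name `davies_bouldin` and the four-point dataset are our own assumptions.

```python
import numpy as np
from sklearn.metrics import davies_bouldin_score

def davies_bouldin(X, labels):
    """DB = (1/k) * sum_i max_{j != i} (s_i + s_j) / d_ij."""
    clusters = np.unique(labels)
    k = len(clusters)
    centroids = np.array([X[labels == q].mean(axis=0) for q in clusters])
    # s_i: mean distance from the points of cluster i to its centroid.
    s = np.array([
        np.linalg.norm(X[labels == q] - centroids[i], axis=1).mean()
        for i, q in enumerate(clusters)
    ])
    worst = []
    for i in range(k):
        ratios = [
            (s[i] + s[j]) / np.linalg.norm(centroids[i] - centroids[j])
            for j in range(k) if j != i
        ]
        worst.append(max(ratios))  # most similar other cluster
    return float(np.mean(worst))

# Toy data (hypothetical): two clusters of two points each.
X = np.array([[0.0, 0.0], [0.0, 2.0], [6.0, 0.0], [6.0, 2.0]])
labels = np.array([0, 0, 1, 1])
manual = davies_bouldin(X, labels)
library = davies_bouldin_score(X, labels)
```

Here s_1 = s_2 = 1 and d_12 = 6, so R_12 = 2/6 and the index is 1/3 for both the manual and the library computation. If the installed scikit-learn is older than the release that added this function, the manual version can be used on its own.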
3.4. Related work
28. - 26 -
Machine learning is a sub-project of artificial intelligence, that is used
to develop algorithms and techniques for enabling the computers to learn [1].
It is used to train the computer for various aspects such as (i) distinguish
whether e-mails received are spam or not, (ii) data classification application,
(iii) association rule identification, and (iv) character recognition.
Machine learning includes a series of processes, in which a computer
lookup for (i) similar patterns, (ii) generate a novel classification system, (iii)
data analytics, and (iv) producing meaningful results. It is a kind of artificial
intelligence, that can be predicted based on the result if it is supported only
by analytics algorithms. Machine learning is a step-by-step evolution process
that leads from big data analytics to predict future actions towards making
decisions on its own through past learned results. The key issues for
processing a successful prediction model remains to be within increasing the
probability and reducing the error and the said problems are resolved through
enabling numerous iterative learnings [2].
At the heart of machine learning are representation and generalization,
where representation is the evaluation of data and generalization is the
processing of future data. Unsupervised learning is a type of machine
learning that is used primarily to determine how data is organized. Unlike
supervised learning or reinforcement learning, this method is not given a
target value for the input values [3].
Unsupervised learning is closely related to density estimation in
statistics. It can summarize and describe the main characteristics of the
data. An example of unsupervised learning is clustering. In this paper, we use
the K-means algorithm and measure the optimal number of clusters based on the
Calinski-Harabasz Index, the Silhouette_score, and the Davies-Bouldin index,
and then apply it to household electricity consumption analysis.
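As an illustration of choosing the number of clusters with an index, the sketch below runs K-means for several values of K and keeps the K with the highest Calinski-Harabasz index. It is a hedged example under stated assumptions: the four hand-placed blob centers, the range of K, and all variable names are hypothetical and stand in for the household data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import calinski_harabasz_score

# Assumption: four well-separated synthetic groups instead of real consumption data
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
X, _ = make_blobs(n_samples=400, centers=centers, cluster_std=0.5, random_state=0)

ch_by_k = {}
for k in range(2, 9):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    # A higher Calinski-Harabasz index means denser, better separated clusters
    ch_by_k[k] = calinski_harabasz_score(X, labels)

best_k = max(ch_by_k, key=ch_by_k.get)
print("Calinski-Harabasz index per K:", {k: round(v, 1) for k, v in ch_by_k.items()})
print("K with the highest index:", best_k)
```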
3.5. Summary
In this paper on household power consumption via k-means clustering, the
library used was scikit-learn with Anaconda 3. Both are open source, so anyone
can easily follow the work, and because scikit-learn uses the BSD License
there is no difficulty in applying it to real-world work. Besides the K-means
algorithm, other machine learning algorithms such as PCA and SVM can also be
applied in the same way. From this result, household power consumption in
real life can be analyzed in diverse ways. The management period of
electricity transformers and power transmission equipment can also be
estimated. The electricity consumption data can be used for progressive
taxation, region-to-region demand forecasting, and maintenance of power
plants and facilities. Gas companies can likewise estimate the gas
consumption rate via K-means clustering and the indices.
Chapter 4
Paper-2 Content
4.1 Introduction
Machine learning is an analysis mechanism that fetches and identifies
matching patterns in existing datasets to form new results. This paper
presents a comparative analysis of unsupervised learning algorithms, in which
we compare the K-means clustering results and Silhouette_score results on a
dataset reduced by a ratio of one half. Our analysis found that the
Davies-Bouldin index does not work smoothly in scikit-learn, so we checked the
Calinski-Harabasz Index and the Silhouette score along with the
Davies-Bouldin index and compared the results with each other. From this we
learned that when the dataset is reduced to the mentioned proportion, the
resulting dataset shows half the score of the original dataset.
4.2 Related work
Machine learning is a field of artificial intelligence that is used to
develop algorithms and techniques that enable computers to learn [1]. It is
used to train a computer to distinguish whether received e-mails are spam or
not, and it has various other applications such as data classification,
association rule identification, and character recognition, all of which
follow the standard machine learning perspective.
It includes a series of processes in which a computer finds patterns on
its own, creates a new classification system, analyzes the data, and produces
meaningful results. Successful prediction comes from increasing the
probability of a correct result and decreasing the error. Machine learning
resolves these issues through repeated iterative learning [2]. Among the
learning methods, supervised learning is closely related to reinforcement
mechanisms [3].
Clustering is a process of mining a dataset by defining clusters of data:
it considers the characteristics of the input and finds a representative
point for each data group. In this way, a cluster is a group of related data
elements with similar characteristics. If their characteristics are not the
same, the elements belong to different clusters [3]. Clustering is
unsupervised learning, in which no correct answers are given. Objects that
carry similar information are grouped together. Classification, in contrast,
belongs to supervised learning. When a classification task is performed, the
system learns to predict the dependent variable (Y) from the independent
variable (X) of the data [4].
Scikit-learn is a Python module that integrates a wide range of machine
learning algorithms for medium-scale problems. The package uses a high-level
language and offers easy-to-use, high-quality documentation and a consistent
API. It is distributed under the BSD license, so it can be used in both
academic and commercial settings. The source code and documentation can be
downloaded from its website [10]. Scikit-learn covers many supervised and
unsupervised learning problems, and it includes Generalized Linear Models,
Linear and Quadratic Discriminant Analysis, Kernel ridge regression, Support
Vector Machines, and Stochastic Gradient Descent.
4.3 Paper-2 Methodology
The K-means algorithm is one of the partitioning methods of clustering;
partitioning means distributing the data among several partitions. For
example, given n data objects, the input data is divided into K (≤ n) groups,
and each group forms a cluster. When forming the clusters, the K-means
algorithm uses the cost function below [11].
$$\underset{S}{\arg\min} \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2$$
In other words, each data object is assigned to one of the K groups. The
partition is found by reducing the cost function, which measures the
dissimilarity within the groups. In this way the similarity of objects within
each group increases, and the similarity between different groups decreases
[12]. The K-means algorithm sums the squared distances between each group's
centroid and the data objects in that group, and clustering progresses by
updating the group of each data object based on the result of this function
[5].
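To make the cost function concrete, the sketch below evaluates the sum of squared distances between every point and the centroid of its assigned cluster, and compares it with the `inertia_` attribute that scikit-learn's `KMeans` reports after fitting. The helper `kmeans_cost` and the synthetic data are assumptions for illustration, not part of the original experiment.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs


def kmeans_cost(X, labels, centroids):
    """Sum over all points of the squared distance to the assigned centroid."""
    diffs = X - centroids[labels]  # centroids[labels] picks mu_i for each x
    return float(np.sum(diffs ** 2))


# Synthetic stand-in data (assumption)
X, _ = make_blobs(n_samples=300, centers=3, random_state=1)
km = KMeans(n_clusters=3, n_init=10, random_state=1).fit(X)

cost = kmeans_cost(X, km.labels_, km.cluster_centers_)
print(f"manual cost: {cost:.3f}  KMeans.inertia_: {km.inertia_:.3f}")
```

The two numbers should agree up to floating-point error, since `inertia_` is the same within-cluster sum of squares.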
The silhouette score is a simple measure defined for each data point i.
Let a(i) be the dissimilarity between i and the other data in its own
cluster, and b(i) the dissimilarity between i and the data in the nearest
cluster to which i does not belong. The silhouette score s(i) is calculated
as
$$s(i) = \frac{b(i) - a(i)}{\max\{a(i), b(i)\}}$$
From this calculation, s(i) satisfies
$$-1 \leq s(i) \leq 1$$
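The per-point formula can be reproduced in a few lines. In this sketch a(i) is taken as the mean Euclidean distance from point i to the other members of its own cluster, and b(i) as the smallest mean distance to the members of any other cluster, which is the usual reading of the definition. The helper `silhouette_of_point` and the synthetic data are assumptions for illustration; the result is compared with `sklearn.metrics.silhouette_samples`.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_samples


def silhouette_of_point(X, labels, i):
    """Return s(i) = (b(i) - a(i)) / max(a(i), b(i)) for sample i."""
    dists = np.linalg.norm(X - X[i], axis=1)
    own = labels == labels[i]
    if own.sum() == 1:
        return 0.0  # common convention for a cluster with a single point
    # a(i): mean distance to the other members of i's own cluster
    a = dists[own].sum() / (own.sum() - 1)
    # b(i): smallest mean distance to the members of any other cluster
    b = min(dists[labels == c].mean() for c in np.unique(labels) if c != labels[i])
    return (b - a) / max(a, b)


# Synthetic stand-in data (assumption)
X, _ = make_blobs(n_samples=200, centers=3, random_state=2)
labels = KMeans(n_clusters=3, n_init=10, random_state=2).fit_predict(X)

manual = np.array([silhouette_of_point(X, labels, i) for i in range(len(X))])
reference = silhouette_samples(X, labels)
print("largest difference from scikit-learn:", np.abs(manual - reference).max())
```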
When s(i) is close to 1, data point i is assigned to the correct cluster;
when it is close to -1, the point is assigned to the wrong cluster. In this
paper, the machine learning library scikit-learn is used to cluster household
power consumption [7]. The household power consumption dataset is downloaded
from the University of California Irvine Machine Learning Repository [8]. The
dataset is separated by a delimiter into the fields Global_active_power,
Global_reactive_power, Voltage, and Global_intensity. The experiment uses
Global_active_power and Global_reactive_power as the X and Y axes.
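A possible way to read the delimited file is sketched below with pandas. It assumes the semicolon-delimited text layout with `?` marking missing values, as in the UCI distribution of this dataset; the function name `load_power_data` is hypothetical, and the three in-memory rows are made-up values that only mimic the file layout so the example runs without a download.

```python
import io

import pandas as pd

FEATURES = ["Global_active_power", "Global_reactive_power", "Voltage", "Global_intensity"]


def load_power_data(source):
    """Read the delimited file and keep the four numeric fields, dropping missing rows."""
    frame = pd.read_csv(source, sep=";", na_values="?", usecols=FEATURES)
    return frame.dropna().astype(float).reset_index(drop=True)


# Made-up rows in the assumed file layout (illustrative values, not real measurements)
sample = io.StringIO(
    "Date;Time;Global_active_power;Global_reactive_power;Voltage;Global_intensity;"
    "Sub_metering_1;Sub_metering_2;Sub_metering_3\n"
    "01/01/2007;00:00:00;2.500;0.100;240.000;10.400;0.000;1.000;17.000\n"
    "01/01/2007;00:01:00;1.200;0.050;241.500;5.000;0.000;0.000;0.000\n"
    "01/01/2007;00:02:00;?;?;?;?;?;?;\n"
)
data = load_power_data(sample)

# X and Y axes of the experiment: active power versus reactive power
xy = data[["Global_active_power", "Global_reactive_power"]].to_numpy()
print(data.shape, xy.shape)
```

Passing a file path instead of the in-memory buffer reads a downloaded copy in the same way.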
The Python distribution is Anaconda3. The key point of the K-means
algorithm is to keep the data in K clusters while reducing the distances
within each cluster; the algorithm assigns a label to each input data point.
Figure 1 shows the result of running the K-means algorithm before checking
the Calinski-Harabasz Index and the Silhouette_score. Figures 1 to 11 show the
K-means clustering results for the 1/8 dataset, which is the household power
consumption dataset from the UCI Machine Learning Repository reduced to 1/8 of
its original size.
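The reduction to a 1/8 dataset and the labeling step can be sketched as follows. How the dataset was reduced is not stated here, so uniform random sampling without replacement is an assumption, as are the synthetic data and all variable names.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for the (active power, reactive power) pairs (assumption)
X_full, _ = make_blobs(n_samples=1600, centers=7, random_state=3)

# Assumption: the 1/8 dataset is a uniform random sample without replacement
rng = np.random.default_rng(3)
keep = rng.choice(len(X_full), size=len(X_full) // 8, replace=False)
X_eighth = X_full[keep]

km_full = KMeans(n_clusters=7, n_init=10, random_state=3).fit(X_full)
km_eighth = KMeans(n_clusters=7, n_init=10, random_state=3).fit(X_eighth)

# Every input row receives a cluster label between 0 and K-1
print("full dataset labels:", km_full.labels_[:10])
print("1/8 dataset labels: ", km_eighth.labels_[:10])
```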
Figure 12. Silhouette score according to change of cluster number.
Figure 13. 1/8 dataset Silhouette score according to change of cluster number.
Choosing the proper number of clusters is very important for the K-means
algorithm. Estimating the Silhouette_score from the data shows that at K = 7
the silhouette score between each cluster centroid and its data is 0.799,
which is the optimal score. Even though the 1/8 dataset is much smaller, at
K = 7 its silhouette score is 0.810, which is again the optimal score. With
7 clusters, for both the full dataset and the 1/8 dataset, the distance
between each group's centroid and its data is therefore optimal. This result
shows that although the dataset is reduced, the optimal number of clusters in
the K-means vector space is the same as for the original household power
consumption dataset.
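The comparison between the full and the 1/8 dataset can be outlined as a sweep over K that records the mean silhouette score for each dataset. This is a sketch on synthetic data with four hand-placed groups, so it does not reproduce the scores 0.799 and 0.810 reported above; the helper `best_k_by_silhouette`, the random 1/8 sample, and the range of K are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score


def best_k_by_silhouette(X, k_values):
    """Return (best K, {K: mean silhouette score}) over the candidate values."""
    scores = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
        scores[k] = silhouette_score(X, labels)
    return max(scores, key=scores.get), scores


# Assumption: four well-separated synthetic groups instead of the household data
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
X_full, _ = make_blobs(n_samples=800, centers=centers, cluster_std=0.5, random_state=0)

# Assumption: the 1/8 dataset is a uniform random sample without replacement
rng = np.random.default_rng(0)
X_eighth = X_full[rng.choice(len(X_full), size=len(X_full) // 8, replace=False)]

k_full, scores_full = best_k_by_silhouette(X_full, range(2, 9))
k_eighth, scores_eighth = best_k_by_silhouette(X_eighth, range(2, 9))
print("best K (full, 1/8):", k_full, k_eighth)
print("score at best K (full, 1/8):", scores_full[k_full], scores_eighth[k_eighth])
```

On data like this, both datasets are expected to select the same K with similar scores, which mirrors the behavior described for the household dataset.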
4.4 Paper-2 Evaluation
4.4.1 Experimental Environment
4.4.2 Experimental Dataset
4.4.3 Experimental Results
Summary
In this paper on household power consumption via k-means clustering, the
library used was scikit-learn with Anaconda 3. Both are open source, so anyone
can easily follow the work, and because scikit-learn uses the BSD License
there is no difficulty in applying it to real-world work. The result shows
that even when the dataset is reduced to 1/8, the silhouette score and the
overall clustering result stay the same as before. A larger population,
however, gives a clearer result for the classification and the vector space.
Going from a large dataset to a small one still shows a clear, specific
Silhouette score, but the opposite direction is not clearly supported,
because the dataset is a 4-dimensional vector dataset. According to the
experiment, reducing the dataset can cut the estimated analysis time when a
huge dataset is received.
Chapter 5
Conclusion
This dissertation approaches diverse aspects of k-means clustering
applications. At first I tried to reduce the time consumption of the k-means
algorithm itself, but then I changed my approach to how to reduce the time
needed for a large dataset. These days, machine learning algorithms can
estimate when a part should be replaced (its life span). All of the
experiments used the open-source scikit-learn library with Anaconda3, which
can be implemented easily in any environment because of the BSD License. The
first experiment shows that diverse indices can be analyzed. The second
experiment shows that when a huge dataset needs a long time to analyze, the
time to find the proper number of centroids for k-means clustering can be
reduced by using a 1/8 dataset, although the classification and vector space
are then limited. According to the experiment, reducing the dataset can cut
the estimated analysis time when a huge dataset is received.
Acknowledgement
During my master's studies I attended a total of 114 conferences and gave
7 presentations. IEEE Globcom 2017 was the most impressive among them, and
the experiments in this dissertation were motivated by the Google AI,
Tensor-flow Conference 2017. I would like to express my gratitude to my
advisor, Professor Dr. Dong-Ryul Shin, who is the president of Sungkyunkwan
University, for his guidance, and to my Co-Advisor, Assistant Professor Dr.
Nawab Muhammad Fasheeh Queshi, who always guided me with a smile during our
co-work despite my shortcomings.
I also thank SKKU Fellow Professor Hee Yong Yoon of the Mobile Computing
Lab for helping me to join Sungkyunkwan University for the first time. I am
thankful to Dr. Chun-sung Nam, who helped me so that I had no inconvenience
in the open lab, and to Dr. Ki-hyun Choi, Muhammad Hamza, Janaid, Kim
Woohyun, and So Cheong, who shared the lab with me.
I would like to extend my sincere thanks to my mother, Professor Bong-Soon
Lee of Dongnam Health University, who supported me to the very end of my
degree, and to my father, General Manager Han-chung Choi, who is a founding
member of the LG Electronics Pyeongtaek Campus (currently a director at
Onnuri ENG).
I am also grateful to my older brother Hyeon-seok Choi, a manager at POSCO
E&C, who often drove me home during my degree, to Ye-ul Ahn, a nurse at Seoul
National University Bundang Hospital, and to my cute nephew Youn-Woo.
Finally, I give my thanks to my older cousins, who were a milestone for me
during my degree.
June 19, 2019
References
[1] https://en.wikipedia.org/wiki/K-means_clustering
[2] https://en.wikipedia.org/wiki/Cluster_analysis
[3] https://en.wikipedia.org/wiki/Silhouette_(clustering)
[4] https://github.com/sarguido.
[5] http://archive.ics.uci.edu/ml/datasets.html.
[6] http://scikit-learn.org/stable/modules/clustering.html#calinski-harabaz-index
[7] http://scikit-learn.org/stable/.
[8] T. Calinski and J. Harabasz, 1974. “A dendrite method for cluster analysis”.
Communications in Statistics
[9] Kanungo, Tapas et al. “An Efficient k-Means Clustering Algorithm: Analysis and
Implementation.” IEEE Trans. Pattern Anal. Mach. Intell. 24 (2002): 881-892.
[10]Arthur, David, and Sergei Vassilvitskii. "k-means++: The advantages of careful
seeding." Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete
algorithms. Society for Industrial and Applied Mathematics (2007): 1027-1035.
[11]Wagstaff, K., Cardie, C., Rogers, S., & Schrödl, S. (2001, June). Constrained k-
means clustering with background knowledge. In ICML (Vol. 1, pp. 577-584).
[12]Hartigan, J. A., & Wong, M. A. (1979). Algorithm AS 136: A k-means clustering
algorithm. Journal of the Royal Statistical Society. Series C (Applied
Statistics), 28(1), 100-108.
[13]Kanungo, T., Mount, D. M., Netanyahu, N. S., Piatko, C. D., Silverman, R., & Wu,
A. Y. (2002). An efficient k-means clustering algorithm: Analysis and
implementation. IEEE Transactions on Pattern Analysis & Machine Intelligence, (7),
881-892.
[14]Alsabti, K., Ranka, S., & Singh, V. (1997). An efficient k-means clustering algorithm.
[15]Likas, A., Vlassis, N., & Verbeek, J. J. (2003). The global k-means clustering
algorithm. Pattern recognition, 36(2), 451-461.
[16]Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., ... &
Vanderplas, J. (2011). Scikit-learn: Machine learning in Python. Journal of machine
learning research, 12(Oct), 2825-2830.
[17]Buitinck, L., Louppe, G., Blondel, M., Pedregosa, F., Mueller, A., Grisel, O., ... &
Layton, R. (2013). API design for machine learning software: experiences from the
scikit-learn project. arXiv preprint arXiv:1309.0238.
[18]Abraham, A., Pedregosa, F., Eickenberg, M., Gervais, P., Mueller, A., Kossaifi, J., ...
& Varoquaux, G. (2014). Machine learning for neuroimaging with scikit-
learn. Frontiers in neuroinformatics, 8, 14.
[19]Fabian, P., Gaël, V., Alexandre, G., Vincent, M., Bertrand, T., Olivier, G., ... &
Alexandre, P. (2011). Scikit-learn: Machine learning in Python. Journal of Machine
Learning Research, 12, 2825-2830.
[20]Hearst, M. A., Dumais, S. T., Osuna, E., Platt, J., & Scholkopf, B. (1998). Support
vector machines. IEEE Intelligent Systems and their applications, 13(4), 18-28.
[21]Pedregosa, Fabian, et al. "Scikit-learn: Machine learning in Python." Journal of
machine learning research 12.Oct (2011): 2825-2830.
[22]Alsabti, Khaled, Sanjay Ranka, and Vineet Singh. "An efficient k-means clustering
algorithm." (1997).
[23]Ding, Chris, and Xiaofeng He. "K-means clustering via principal component
analysis." Proceedings of the twenty-first international conference on Machine
learning. ACM, 2004.
[24]Paneque-Gálvez, Jaime, et al. "Small drones for community-based forest monitoring:
An assessment of their feasibility and potential in tropical areas." Forests 5.6 (2014):
1481-1507.
[25]Pedregosa, Fabian, et al. "Scikit-learn: Machine learning in Python." Journal of
machine learning research 12.Oct (2011): 2825-2830.
[26]Bishop, Christopher M. Pattern recognition and machine learning. springer, 2006.
[27]Rasmussen, Carl Edward. "Gaussian processes in machine learning." Summer
School on Machine Learning. Springer, Berlin, Heidelberg, 2003.
[28]Hartigan, John A., and Manchek A. Wong. "Algorithm AS 136: A k-means clustering
algorithm." Journal of the Royal Statistical Society. Series C (Applied Statistics) 28.1
(1979): 100-108.
[29]Paneque-Gálvez, Jaime, et al. "Small drones for community-based forest monitoring:
An assessment of their feasibility and potential in tropical areas." Forests 5.6 (2014):
1481-1507.
[30]Sass, Ron, et al. "Reconfigurable computing cluster (RCC) project: Investigating the
feasibility of FPGA-based petascale computing." 15th Annual IEEE Symposium on
Field-Programmable Custom Computing Machines (FCCM 2007). IEEE, 2007.
[31] Duda, Richard O., Peter E. Hart, and David G. Stork. Pattern classification. John
Wiley & Sons, 2012.
[32]Cover, Thomas M., and Peter E. Hart. "Nearest neighbor pattern
classification." IEEE transactions on information theory 13.1 (1967): 21-27.
[33]Breiman, Leo. Classification and regression trees. Routledge, 2017.
[34]Haralick, Robert M., and Karthikeyan Shanmugam. "Textural features for image
classification." IEEE Transactions on systems, man, and cybernetics 6 (1973): 610-
621.
[35]Chapelle, Olivier, Bernhard Scholkopf, and Alexander Zien. "Semi-supervised
learning (chapelle, o. et al., eds.; 2006)[book reviews]." IEEE Transactions on
Neural Networks 20.3 (2009): 542-542.
[36]Zhu, Xiaojin, Zoubin Ghahramani, and John D. Lafferty. "Semi-supervised learning
using gaussian fields and harmonic functions." Proceedings of the 20th International
conference on Machine learning (ICML-03). 2003.
[37]Caruana, Rich, and Alexandru Niculescu-Mizil. "An empirical comparison of
supervised learning algorithms." Proceedings of the 23rd international conference
on Machine learning. ACM, 2006.
[38]Jain, Anil K. "Data clustering: 50 years beyond K-means." Pattern recognition
letters 31.8 (2010): 651-666.
[39]Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation
learning with deep convolutional generative adversarial networks." arXiv preprint
arXiv:1511.06434 (2015).
[40]Figueiredo, Mario A. T., and Anil K. Jain. "Unsupervised learning of finite mixture
models." IEEE Transactions on Pattern Analysis & Machine Intelligence 3 (2002):
381-396.
[41]Lovmar, Lovisa, et al. "Silhouette scores for assessment of SNP genotype clusters."
BMC genomics 6.1 (2005): 35.
[42]Collins, Robert T., Ralph Gross, and Jianbo Shi. "Silhouette-based human
identification from body shape and gait." Proceedings of fifth IEEE international
conference on automatic face gesture recognition. IEEE, 2002.
[43]Gat-Viks, Irit, Roded Sharan, and Ron Shamir. "Scoring clustering solutions by their
biological relevance." Bioinformatics 19.18 (2003): 2381-2389.
[44]Maulik, Ujjwal, and Sanghamitra Bandyopadhyay. "Performance evaluation of some
clustering algorithms and validity indices." IEEE Transactions on pattern analysis
and machine intelligence 24.12 (2002): 1650-1654.
[45]Łukasik, Szymon, et al. "Clustering using flower pollination algorithm and calinski-
harabasz index." 2016 IEEE Congress on Evolutionary Computation (CEC). IEEE,
2016.
[46]Desgraupes, Bernard. "Clustering indices." University of Paris Ouest-Lab Modal’X
1 (2013): 34.
[47]Petrovic, Slobodan. "A comparison between the silhouette index and the davies-
bouldin index in labelling ids clusters." Proceedings of the 11th Nordic Workshop of
Secure IT Systems. sn, 2006.
[48]Maulik, Ujjwal, and Sanghamitra Bandyopadhyay. "Performance evaluation of some
clustering algorithms and validity indices." IEEE Transactions on pattern analysis
and machine intelligence 24.12 (2002): 1650-1654.
[49]Petrovic, Slobodan. "A comparison between the silhouette index and the davies-
bouldin index in labelling ids clusters." Proceedings of the 11th Nordic Workshop of
Secure IT Systems. sn, 2006.
[50] https://scikit-learn.org/stable/
[51] https://www.anaconda.com/
[52] https://www.jetbrains.com/pycharm/
[53] Petrovic, Slobodan. "A comparison between the silhouette index and the davies-
bouldin index in labelling ids clusters." Proceedings of the 11th Nordic Workshop of
Secure IT Systems. sn, 2006.
[54] Bandyopadhyay, Sanghamitra, and Ujjwal Maulik. "Nonparametric genetic
clustering: comparison of validity indices." IEEE Transactions on Systems, Man, and
Cybernetics, Part C (Applications and Reviews) 31.1 (2001): 120-125.
[55] https://archive.ics.uci.edu/ml/datasets/individual+household+electric+power+consumption
[56] https://github.com/sarguido