K-MEANS CLUSTERING METHOD BASED NETWORK SHARED RESOURCES MINING
A SHORT STORY PRESENTED BY KANCHETI SAI PRAGNA
SJSU_ID: 016698552
WHY MINING NETWORK SHARED RESOURCES?
• The demand for sharing data resources over the internet keeps growing, which has prompted many optimization techniques for using those resources more efficiently.
• At present, there are at least 15 trillion files available on the internet. With so many resources available, retrieving the relevant data resources efficiently is a complex task.
• The need for data mining of network shared data resources arose from two problems: the large amount of redundant information and the difficulty of finding relevant data resources.
EXISTING METHODS OF NETWORK SHARED RESOURCES MINING
• Significant research has been done on data mining methods for finding relevant data resources, and several techniques have emerged.
• One method is based on a clustering analysis algorithm: it uses the algorithm to process the resource data, construct the data preprocessing set, and calculate the data feature vector.
• Another method is based on multi-dimensional resource coordination and aggregation: it uses an analysis of the data center's network resource sharing process as the basis for building a multidimensional resource aggregation data model.
• Fuzzy logic is used to build multidimensional collaborative fitness functions, and data mining is used to optimize decision-making, in order to increase the execution efficiency of the data mining process.
• Although these methods produced some excellent results, they lack run-time efficiency and precision, and they are usually complex to apply in practice.
• To overcome these drawbacks, a new method based on the K-means clustering algorithm has been proposed.
CLUSTERING
WHAT IS CLUSTERING?
• Clustering assembles bulky data into clusters, or groups, which helps us visualize the internal structure of the data. Basically, it groups items so that items in the same group are similar to one another and distinct from items in other groups.
• For example, an online shopping site offers a variety of items: electronics, clothing, books, groceries, cosmetics, and accessories. Figure 2 shows how these items look after clustering.
STAGES OF CLUSTERING
• Raw Data: Raw data (not yet processed) are collected from the various sources to which we want to apply a clustering algorithm.
• Clustering Algorithm: A specific algorithm is selected according to our requirements and then applied to the collected raw data.
• Clusters: After applying the selected clustering algorithm to the raw data, we obtain our clusters.
TYPES OF CLUSTERING
• Partitioning method
• Density-based method
• Hierarchical method
• Grid-based method
• Model-based clustering method
• Constraint-based method
PARTITIONING METHOD
• In the partitioning clustering method, the objects of the dataset are divided into a number of subsets.
• Examples of partitioning algorithms are K-means and PAM (Partitioning Around Medoids).
• The figure shows how clusters are formed after applying the partitioning clustering technique.
DENSITY-BASED METHOD
• The density-based clustering method identifies distinctive clusters in the data, based on the idea that a cluster in a data space is a contiguous region of high point density, separated from other clusters by sparse regions.
• In other words, clusters are formed by partitioning the data space according to the density of the data points in each region.
• The figure shows how clusters are formed after applying the density-based method of clustering.
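The idea of growing a cluster through contiguous dense regions can be sketched in a few lines of Python. This is a minimal DBSCAN-style illustration, not a method taken from the slides: the function names and the parameters `eps` (neighbourhood radius) and `min_pts` (minimum neighbours for a region to count as dense) are assumptions introduced here.

```python
import math


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (itself included)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def density_cluster(points, eps, min_pts):
    """Label points by dense region; -1 marks points left as noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = -1  # sparse region: provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a dense one
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = region_query(points, j, eps)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is dense too: keep expanding


    return labels


points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = density_cluster(points, eps=1.5, min_pts=3)
print(labels)
```

The two dense squares become two clusters, and the isolated point at (50, 50) sits in a sparse region and is labelled as noise.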
HIERARCHICAL METHOD
• In the hierarchical clustering method, the objects of the dataset are organized into a hierarchy of clusters or groups.
• Examples: the agglomerative hierarchical clustering algorithm (AGNES) and the divisive hierarchical clustering algorithm (DIANA).
• The figure shows how clusters are formed after applying the hierarchical method of clustering.
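A bottom-up (agglomerative) hierarchy can be sketched as follows. This is only an illustration in the spirit of AGNES: the single-linkage distance, the function names, and the `target` parameter are assumptions made here, not details given in the slides.

```python
import math


def single_link(a, b):
    """Smallest pairwise distance between two clusters of points."""
    return min(math.dist(p, q) for p in a for q in b)


def agglomerate(points, target):
    """Merge the two closest clusters until `target` clusters remain."""
    clusters = [[p] for p in points]  # start with one cluster per object
    merges = []                       # record of the hierarchy, in merge order
    while len(clusters) > target:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs,
                   key=lambda ij: single_link(clusters[ij[0]], clusters[ij[1]]))
        merges.append((clusters[i], clusters[j]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters, merges


points = [(1.0,), (2.0,), (9.0,), (10.0,), (25.0,)]
clusters, merges = agglomerate(points, target=2)
print(clusters)
```

The `merges` list records which groups were joined at each level, which is the information a dendrogram would display.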
GRID-BASED METHOD
• In the grid-based clustering method, the object space is divided into a fixed number of cells that form a grid-like structure. An example of such an algorithm is STING (Statistical Information Grid).
• The figure shows how clusters are formed after applying the grid-based clustering method.
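The cell-division step can be illustrated with a short Python sketch. It is not STING itself, only a simplified assumption-laden example: the names `grid_cells` and `dense_cells` and the parameters `cell_size` and `min_count` are introduced here for illustration.

```python
from collections import defaultdict


def grid_cells(points, cell_size):
    """Group 2-D points by the grid cell they fall into."""
    cells = defaultdict(list)
    for x, y in points:
        key = (int(x // cell_size), int(y // cell_size))
        cells[key].append((x, y))
    return cells


def dense_cells(points, cell_size, min_count):
    """Return only the cells holding at least `min_count` points."""
    return {key: members
            for key, members in grid_cells(points, cell_size).items()
            if len(members) >= min_count}


points = [(0.2, 0.3), (0.8, 0.1), (0.5, 0.9),
          (5.1, 5.2), (5.7, 5.9),
          (9.5, 0.5)]
dense = dense_cells(points, cell_size=1.0, min_count=2)
print(sorted(dense))
```

Because the work is done per cell rather than per pair of points, the cost depends mainly on the number of cells, which is the usual motivation for grid-based methods.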
MODEL-BASED CLUSTERING METHOD
• Model-based clustering works on the concept of a probability model, which is a mathematical representation of the random process assumed to generate the dataset. Each group that forms has its own probability model.
• The figure shows how clusters are formed after applying the model-based clustering method.
CONSTRAINT-BASED METHOD
• The constraint-based clustering method is a semi-supervised learning technique that combines a small proportion of labeled data with a large proportion of unlabeled data.
• Constrained K-means (COP-K-Means) is one of the common algorithms using this method.
• The figure illustrates clustering using the constraint-based method.
K-MEANS CLUSTERING
K-MEANS CLUSTERING ALGORITHM
• The K-means algorithm is a partition-based clustering approach that belongs to the unsupervised learning techniques. It divides a large set of data into K smaller groups. The two distinct phases of this method are described below.
• a. First phase: K centroids (centers) are selected at random. K has a fixed value and cannot be changed during the procedure.
• b. Second phase: Each data point is assigned to its closest centroid. Euclidean distance is used to measure the separation between the cluster centroids and the data points.
• The Euclidean distance is the straight-line distance between any two points, say x and y. It is symmetric: the separation between x and y is equal to the separation between y and x. Equation (1) gives the Euclidean distance between any two arbitrary points, x and y:
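The equation itself did not survive in this text. Assuming x = (x_1, ..., x_m) and y = (y_1, ..., y_m) are points with m numeric attributes (the symbol m is introduced here for illustration), the standard Euclidean distance that Equation (1) refers to can be written as:

```latex
d(x, y) = \sqrt{\sum_{i=1}^{m} (x_i - y_i)^2} \tag{1}
```

Swapping x and y leaves every squared term unchanged, which is why the distance is symmetric.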
• Algorithm for K-means
• 1. Input: Choose a database and select the value of K, the number of clusters we want at the end. Let the database be D with n data objects, D = {d1, d2, d3, …, dn}.
• 2. Output: An arrangement of the data objects into K clusters.
• 3. Algorithm
• (i) Select the number of clusters, K.
• (ii) Choose the centres (centroids) of the K clusters. The initial values of the centres are selected arbitrarily.
• (iii) Assign every data object to the closest cluster, as determined by the Euclidean distance.
• (iv) Recalculate the centre of each cluster by taking the mean of the data objects in that cluster. If a cluster contains n objects x1, x2, x3, …, xn, the mean is given in equation (2): (x1 + x2 + … + xn) / n.
• (v) Repeat steps (iii) and (iv) until convergence, that is, until the cluster assignments stop changing. K-means is thus an iterative technique.
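Steps (ii) to (v) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the presenter's implementation: the function names, the `max_iter` and `seed` parameters, and the choice of picking initial centres from the data objects themselves are all introduced here.

```python
import math
import random


def euclidean(x, y):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def mean_point(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(data, k, max_iter=100, seed=0):
    """Cluster `data` into k groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Step (ii): pick k distinct data objects as the initial centres.
    centroids = [tuple(p) for p in rng.sample(data, k)]
    labels = None
    for _ in range(max_iter):
        # Step (iii): assign each object to its closest centre.
        new_labels = [
            min(range(k), key=lambda j: euclidean(p, centroids[j]))
            for p in data
        ]
        # Step (v): stop once the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Step (iv): recompute each centre as the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(data, labels) if lab == j]
            if members:  # keep the old centre if a cluster is empty
                centroids[j] = mean_point(members)
    return centroids, labels


points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (8.5, 9.0), (9.0, 8.0)]
centroids, labels = k_means(points, k=2)
print(labels)
```

On this small example the first three points end up in one cluster and the last three in the other; which cluster receives index 0 depends on the random initial centres.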
APPLICATION OF K-MEANS CLUSTERING ALGORITHM IN MINING OF NETWORK SHARED RESOURCES
K-MEANS-BASED DATA CLUSTERING OF NETWORK SHARED RESOURCES
• The K-means algorithm has emerged as the most well-known and widely used algorithm in the process of data clustering, owing to its high data processing efficiency, low computational complexity, and strong scalability.
• The data of network shared resources is clustered into different classes using K-means clustering, in the manner shown in the image.
• Compared with the existing methods mentioned above, the K-means clustering algorithm has the following advantages:
• The K-means clustering technique is robust when handling data sets. In particular, when the classes in the data set are separated by large gaps, the classification results improve.
• When numerical data sets are classified with the K-means clustering algorithm, the input order of the data objects has almost no impact on the classification outcome. The reason is that the clustering process applies the distance formula to determine the distance from each data object to the center point, and that distance does not depend on where the object appears in the input.
• This was not the case for the methods mentioned above, whose classification outcomes are heavily affected by the order of the input objects.
• The algorithm is capable of handling big data sets. The outcomes of data clustering are not affected if there is data overlap between different data sets, so the approach has good practical use.
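The claim about input order can be checked with a small experiment. The sketch below is an illustration only, and it assumes batch K-means with the same fixed initial centres for both runs (if the initial centres were drawn from the data by position, the order could matter). The function name `lloyd` and the sample data are hypothetical.

```python
import math
import random


def lloyd(points, centroids, iterations=10):
    """Run a fixed number of batch K-means iterations from given centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Each point's label depends only on the point and the centroids,
        # never on the point's position in the input list.
        labels = {p: min(range(len(centroids)),
                         key=lambda j: math.dist(p, centroids[j]))
                  for p in points}
        for j in range(len(centroids)):
            members = [p for p in points if labels[p] == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members)
                                     for c in zip(*members))
    return labels


data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
start = [(0.0, 0.0), (10.0, 10.0)]

shuffled = data[:]
random.Random(42).shuffle(shuffled)

same_result = lloyd(data, start) == lloyd(shuffled, start)
print(same_result)
```

Both runs give every point the same label, because the assignment step looks only at distances to the current centres and the mean of a cluster does not depend on the order in which its members are listed.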
COMPARISONS WITH EXISTING METHODS
ACCURACY COMPARISON
• The accuracy of the K-means-based method stays close to 97% as the number of experiments increases, while the other methods do not exceed 80%.
DATA MINING TIME COMPARISON
• The average data mining time of the K-means-clustering-based method is only 0.6 s, whereas the average times of the other methods are about 4.2 s and 2.9 s.
CONCLUSION
• The K-means clustering method was applied to improve the quality of network shared resource data mining. With this method, the accuracy of in-depth data mining of network shared resources is always over 94%, and the average time of in-depth data mining is only 0.6 s.
• This suggests that the method can achieve fast and accurate in-depth data mining of network shared resources.
• Yet a number of challenges remain to be resolved, including the deep mining of language and cross-cultural resource sharing, as well as the security, personalization, and intelligence of resource data mining.
THANK YOU

K- means clustering method based Data Mining of Network Shared Resources .pptx

  • 1. K-MEANS CLUSTERING METHOD BASED NETWORK SHARED RESOURCES MINING A SHORT STORY PRESENTED BY KANCHETI SAI PRAGNA SJSU_ID: 016698552
  • 2. WHY MINING NETWORK SHARED RESOURCES?  The demand for data resource sharing in internet has been growing and this brought up many optimization techniques in utilizing efficiency of resources.  At present, there are at least 15 Trillion files available on the internet, The vast availability of resources makes a complex task in retrieving the relevant data resources efficiently  In order to solve problems of large redundant information and relevant data resources research the need for data mining in network shared data resources arose.
  • 3. Existing Methods of network shared resources mining • There has been a significant research done in data mining methods in relevant data resources research and various techniques came into picture. • clustering analysis algorithm based Method where it uses clustering analysis algorithm to process resource data, construct the data preprocessing set, and calculate the data feature vector. • Another method based on multi-dimensional resource coordination and aggregation where this technique focuses on using the data center's network resource sharing process analysis as the basis for building a multidimensional resource aggregation data model. • using fuzzy logic to build multidimensional collaborative fitness functions, and using data mining to optimize decision-making in order to increase the execution efficiency of the data mining process. • However, Although these methods produced some excellent results they lack in run time efficiency, precision and they are usually complex to apply practically. • In order to overcome above drawbacks a new method based on k means clustering algorithm has come into picture.
  • 5. WHAT IS CLUSTERING?  Clustering is used in assembling bulky data into clusters or groups that helps us to visualize the internal structure of the data. Basically, it is a grouping of items based on how similar and distinct they are to one another  For example, there is some online shopping site where we can find variety of stuffs from electronics, clothing, books, grocery items, cosmetic items, accessories. Here in figure 2 describes how it looks after clustering is done.
  • 6. STAGES OF CLUSTERING  Raw Data  Clustering Algorithm  Clusters
  • 7. STAGES OF CLUSTERING  Raw Data: Raw data (which are not being processed yet) are collected from various sources on which we want to solicit various clustering algorithm  Clustering Algorithm: A specific algorithm is selected according to our requirements and then that very algorithm is applied on the raw data that were being selected.  Clusters: After soliciting the selected clustering algorithm on the raw data, we acquire our clusters.
  • 8. TYPES OF CLUSTERING  Partitioning Method  Density-based Method  Hierarchical Method  Grid-based method  Model-based clustering method  Constraint-based method
  • 9. PARTITIONING METHOD  In the case of partitioning clustering method, the objects of the datasets are segregated into numerous subsets.  Given some examples of the partitioning algorithms are K-means, PAM (Partitioning AroundMedoids).  The figure shows how clusters are formed after applying partitioning clustering technique
  • 10. DENSITY-BASED METHOD  Density-Based Clustering method identify distinctive clusters in the data, based on the idea that a cluster/group in a data space is a contiguous region of high point density, separated from other clusters by sparse regions.  Basically, in this method clusters are formed or the data spaces are partitioned by the density of the data point in a particular region  The figure shows how clusters are formed after applying Density-Based Method of clustering
  • 11. HIERARCHICAL METHOD  In the case of hierarchical clustering method, the objects of the datasets are segregated in the hierarchical fashion of clusters or groups.  Examples: Agglomerative Hierarchical clustering algorithm (AGNES), Divisive Hierarchical clustering algorithm (DIANA) etc.,  The figure shows how clusters are formed after applying Hierarchical Method of clustering
  • 12. GRID-BASED METHOD  In grid-based clustering method, the object space is divided into fixed number of cells that forms the shape of a grid like structure. Clustering algorithm is STING (Statistical Information Grid).  The figure shows how clusters are formed after applying grid-based clustering methodrid- based method
  • 13. MODEL-BASED CLUSTERING METHOD  Model-based clustering works on the concept of Probability Model which is a mathematical representation of any random occurrence of dataset. Each of the groups that would form will have different Probability Model.  The figure shows how clusters are formed after applying Model-based clustering method
  • 14. CONSTRAINT-BASED METHOD  Constrained-based clustering method is a semi-supervised learning technique where amalgamation of small proportion of labeled data with a large proportion of unlabeled data occurs.  Constrained K-means (COP-K-Means) algorithm is one of the common algorithms using this method  The figure illustrates clustering using Constraint-based method.
  • 16. K-MEANS CLUSTERING ALGORITHM  The K-Means algorithm is a sort of partition-based clustering approach that belongs to the unsupervised learning techniques. It divides a huge set of data into K number of smaller groups. The two distinct steps of this method are described below.  a. First phase: K centroids or centers are selected haphazardly in this phase. K should have a permanent value. During the procedure, it cannot be changed.  b. Second phase: Each data point is given its closest center or centroids during this phase. Euclidean distance is used to calculate the separation between cluster centroids or centers and all data points.  The distance between any two points, let's say point x and point y, is known as the Euclidean distance. The separation between x and y is equal to the separation between x and y. Equation (1) states the following for the Euclidean distance between any two randomly chosen points, x and y:
  • 17. K-MEANS CLUSTERING ALGORITHM  Algorithm for K-Means  1. Input: A database and the value of K, the number of clusters wanted at the end. Let the database be D with n data objects, D = {d1, d2, d3, …, dn}.  2. Output: A set of K clusters.  3. Algorithm  (i) Fix the number of clusters, K.  (ii) Choose the centres (centroids) of the K clusters. The initial values of the centres are selected arbitrarily.
  • 18. K-MEANS CLUSTERING ALGORITHM  (iii) Assign every data object to the closest cluster, as determined by Euclidean distance.  (iv) Recalculate the centre of each cluster by taking the mean of the data objects in that cluster. If a cluster contains n objects x1, x2, x3, …, xn, the mean is given by equation (2):  $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$  (2)  (v) Repeat steps (iii) and (iv) until convergence, that is, until the assignments no longer change. K-means is therefore an iterative technique.
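Steps (i) to (v) can be sketched in plain Python as below. This is a minimal illustration, not the implementation used in the presented work. The function name `k_means` and the parameters `max_iter` and `seed` are hypothetical, and leaving an empty cluster's centre unchanged is an assumption of this sketch.

```python
import math
import random

def k_means(points, k, max_iter=100, seed=0):
    """Cluster `points` (a list of equal-length tuples) into k groups."""
    rng = random.Random(seed)
    # Step (ii): pick k distinct data objects as the initial centres.
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Step (iii): assign each object to its closest centre.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Step (v): stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step (iv): move each centre to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # assumption: an empty cluster keeps its old centre
                centers[j] = tuple(sum(c) / len(members)
                                   for c in zip(*members))
    return centers, labels
```

Calling `k_means(points, 2)` on two well-separated groups of points returns two centres and one label per point. The seed only fixes the arbitrary initial choice in step (ii).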
  • 19. APPLICATION OF K-MEANS CLUSTERING ALGORITHM IN MINING OF NETWORK SHARED RESOURCES
  • 20. K-MEANS-BASED DATA CLUSTERING OF NETWORK SHARED RESOURCES  The K-means algorithm has emerged as the most well-known and widely used algorithm for data clustering due to its high data processing efficiency, low computational complexity, and strong scalability.  The data of network shared resources is clustered into different classes using K-means clustering in the manner shown in the image.
  • 21. K-MEANS-BASED DATA CLUSTERING OF NETWORK SHARED RESOURCES  Compared with the existing methods mentioned above, the K-means clustering algorithm has the following advantages:  The K-means clustering technique is robust when managing data sets. In particular, classification results improve when there is a large gap between the classes in the data set.  When numerical data sets are classified using the K-means clustering algorithm, the input order of data objects has almost no impact on the classification outcomes.
  • 22. K-MEANS-BASED DATA CLUSTERING OF NETWORK SHARED RESOURCES  The reason is that, during clustering, the distance formula is applied to determine the distance from each data object to the center point in order to classify the data set.  This was not the case for the methods mentioned above, where the classification outcomes are strongly affected by the order of the input objects.  The algorithm is capable of handling large data sets. The clustering outcomes are not affected if there is data overlap between different data sets, so the approach has good practical use.
  • 24. ACCURACY COMPARISON  The accuracy of the K-means-based method stays close to 97%, while the other methods do not exceed 80% as the number of experiments increases.
  • 25. DATA MINING TIME COMPARISON  The average data mining time of the K-means-based method is only 0.6 s, whereas the average times of the other methods are about 4.2 s and 2.9 s.
  • 26. CONCLUSION  To improve the quality of network shared resource data mining, a K-means clustering based network data mining technique was presented. The accuracy of in-depth data mining of network shared resources with this method is always over 94%, and the average time of in-depth data mining is only 0.6 s, suggesting that this method can achieve fast and accurate in-depth data mining of network shared resources.  Yet a number of challenges remain to be resolved, including the deep mining of language and cross-cultural resource sharing, as well as the security, personalization, and intelligence of resource data mining.