SlideShare a Scribd company logo
Department of CT III-B.Sc-CT VI Semester: 2019-20
16ED – Data Mining
Department of CT III-B.Sc-CT VI Semester: 2019-20
Course: Data Mining Sub Code: 6ED
Google Classroom: q7b4gv Programme: B.Sc-CT
Unit: I Hour : 4
Faculty: Ms. A.SATHIYA PRIYA
clustering
Unit I Basic Data Mining Tasks
Department of CT III-B.Sc-CT VI Semester: 2019-20
2
Department of Computer Technology III BSC CT SEM V Year:
2019- 20
UNIT I Basic Data Mining Tasks6ED – Data Mining
SNAP TALK
2
Department of CT III-B.Sc-CT VI Semester: 2019-20
3
Department of Computer Technology III BSC CT SEM V Year:
2019- 20
UNIT I Basic Data Mining Tasks6ED – Data Mining
ATTENDANCE
3
Department of CT III-B.Sc-CT VI Semester: 2019-20
Unit-I
Basic Data Mining Tasks - Data Mining Versus
Knowledge Discovery in Databases - Data Mining Issues
- Data Mining Matrices - Social Implications of Data
Mining - Data Mining from Data Base Perspective.
4
Department of Computer Science III-B.Sc-CT VI Semester: 2019-20
Unit I Basic Data Mining Tasks6ED – Data Mining
Department of CT III-B.Sc-CT VI Semester: 2019-20
CLUSTERING
• What is Clustering?
• Clustering is the process of making a group of
abstract objects into classes of similar objects.
• A cluster of data objects can be treated as one group.
• While doing cluster analysis, we first partition the set
of data into groups based on data similarity and then
assign the labels to the groups.
5
Department of Computer Science III-B.Sc-CT VI Semester: 2019-20
Unit I Basic Data Mining Tasks6ED – Data Mining
Department of CT III-B.Sc-CT VI Semester: 2019-20
Cont.,
The main advantage of clustering over classification is
that, it is adaptable to changes and helps single out
useful features that distinguish different groups.
6
Department of Computer Science III-B.Sc-CT VI Semester: 2019-20
Unit I Basic Data Mining Tasks6ED – Data Mining
Department of CT III-B.Sc-CT VI Semester: 2019-20
APPLICATIONS
• Applications of Cluster Analysis
• Clustering analysis is broadly used in many
applications such as market research, pattern
recognition, data analysis, and image processing.
• Clustering can also help marketers discover distinct
groups in their customer base. And they can
characterize their customer groups based on the
purchasing patterns.
7
Department of Computer Science III-B.Sc-CT VI Semester: 2019-20
Unit I Basic Data Mining Tasks6ED – Data Mining
Department of CT III-B.Sc-CT VI Semester: 2019-20
Cont.,
• In the field of biology, it can be used to derive plant
and animal taxonomies, categorize genes with similar
functionalities and gain insight into structures
inherent to populations.
• Clustering also helps in identification of areas of
similar land use in an earth observation database. It
also helps in the identification of groups of houses in
a city according to house type, value, and geographic
location.
8
Department of Computer Science III-B.Sc-CT VI Semester: 2019-20
Unit I Basic Data Mining Tasks6ED – Data Mining
Department of CT III-B.Sc-CT VI Semester: 2019-20
Cont.,
• Clustering also helps in classifying documents on the
web for information discovery.
• Clustering is also used in outlier detection
applications such as detection of credit card fraud.
• As a data mining function, cluster analysis serves as a
tool to gain insight into the distribution of data to
observe characteristics of each cluster.
9
Department of Computer Science III-B.Sc-CT VI Semester: 2019-20
Unit I Basic Data Mining Tasks6ED – Data Mining
Department of CT III-B.Sc-CT VI Semester: 2019-20
Requirements of Clustering in Data Mining
• He following points throw light on why clustering is
required in data mining −
• Scalability − We need highly scalable clustering
algorithms to deal with large databases.
• Ability to deal with different kinds of attributes −
Algorithms should be capable to be applied on any
kind of data such as interval-based (numerical) data,
categorical, and binary data.
10
Department of Computer Science III-B.Sc-CT VI Semester: 2019-20
Unit I Basic Data Mining Tasks6ED – Data Mining
Department of CT III-B.Sc-CT VI Semester: 2019-20
Cont.,
• Discovery of clusters with attribute shape − The
clustering algorithm should be capable of detecting
clusters of arbitrary shape. They should not be
bounded to only distance measures that tend to find
spherical cluster of small sizes.
• High dimensionality − The clustering algorithm
should not only be able to handle low-dimensional
data but also the high dimensional space.
11
Department of Computer Science III-B.Sc-CT VI Semester: 2019-20
Unit I Basic Data Mining Tasks6ED – Data Mining
Department of CT III-B.Sc-CT VI Semester: 2019-20
Cont.,
• Ability to deal with noisy data − Databases contain
noisy, missing or erroneous data. Some algorithms
are sensitive to such data and may lead to poor
quality clusters.
• Interpretability − The clustering results should be
interpretable, comprehensible, and usable.
12
Department of Computer Science III-B.Sc-CT VI Semester: 2019-20
Unit I Basic Data Mining Tasks6ED – Data Mining
Department of CT III-B.Sc-CT VI Semester: 2019-20
Methods
• Clustering Methods
• Clustering methods can be classified into the
following categories −
• Partitioning Method
• Hierarchical Method
• Density-based Method
• Grid-Based Method
• Model-Based Method
• Constraint-based MethodPerspective.
13
Department of Computer Science III-B.Sc-CT VI Semester: 2019-20
Unit I Basic Data Mining Tasks6ED – Data Mining
Department of CT III-B.Sc-CT VI Semester: 2019-20
MCQ’s
1. ______is the process of making a group of abstract
objects into classes of similar objects
2. A cluster of data objects can be treated as _______
3. In the field of ______, it can be used to derive plant
It have Ability to deal with _____data
4. The clustering result should be
interpretable,comprehensible and __________.
5. The clustering algorithm should be capable of
detecting clusters of _______shape.
14
Department of Computer Science III-B.Sc-CT VI Semester: 2019-20
Unit I Basic Data Mining Tasks6ED – Data Mining
Department of CT III-B.Sc-CT VI Semester: 2019-20
MCQ’s
1. cluster
2. One group
3. Biology
4. uasble
5. arbitrary.
15
Department of Computer Science III-B.Sc-CT VI Semester: 2019-20
Unit I Basic Data Mining Tasks6ED – Data Mining
Department of CT III-B.Sc-CT VI Semester: 2019-20
THANK U
16
Department of Computer Technology III BSC CT SEM V year: 2019-
20
6ED – Data Mining UNIT I Basic Data Mining Tasks

More Related Content

Similar to Clustering, application, methods u 1

Data Mining @ Information Age
Data Mining @ Information AgeData Mining @ Information Age
Data Mining @ Information Age
IIRindia
 
How Partitioning Clustering Technique For Implementing...
How Partitioning Clustering Technique For Implementing...How Partitioning Clustering Technique For Implementing...
How Partitioning Clustering Technique For Implementing...
Nicolle Dammann
 
IJSRED-V2I3P53
IJSRED-V2I3P53IJSRED-V2I3P53
IJSRED-V2I3P53
IJSRED
 
Web Based Fuzzy Clustering Analysis
Web Based Fuzzy Clustering AnalysisWeb Based Fuzzy Clustering Analysis
Web Based Fuzzy Clustering Analysis
inventy
 
DWDM syllabus.doc
DWDM syllabus.docDWDM syllabus.doc
DWDM syllabus.doc
RitCse
 
IRJET - Exploring Agglomerative Spectral Clustering Technique Employed for...
IRJET - 	  Exploring Agglomerative Spectral Clustering Technique Employed for...IRJET - 	  Exploring Agglomerative Spectral Clustering Technique Employed for...
IRJET - Exploring Agglomerative Spectral Clustering Technique Employed for...
IRJET Journal
 
Improved K-mean Clustering Algorithm for Prediction Analysis using Classifica...
Improved K-mean Clustering Algorithm for Prediction Analysis using Classifica...Improved K-mean Clustering Algorithm for Prediction Analysis using Classifica...
Improved K-mean Clustering Algorithm for Prediction Analysis using Classifica...
IJCSIS Research Publications
 
Future Skills Needs for Data and Analytics
Future Skills Needs for Data and Analytics Future Skills Needs for Data and Analytics
Future Skills Needs for Data and Analytics
Dublinked .
 
New approaches of Data Mining for the Internet of things with systems: Litera...
New approaches of Data Mining for the Internet of things with systems: Litera...New approaches of Data Mining for the Internet of things with systems: Litera...
New approaches of Data Mining for the Internet of things with systems: Litera...
IRJET Journal
 
Applications, Techniques and Trends of Data Mining and Knowledge Discovery Da...
Applications, Techniques and Trends of Data Mining and Knowledge Discovery Da...Applications, Techniques and Trends of Data Mining and Knowledge Discovery Da...
Applications, Techniques and Trends of Data Mining and Knowledge Discovery Da...
ijtsrd
 
BIG DATA IN SMART CITIES: A SYSTEMATIC MAPPING REVIEW
BIG DATA IN SMART CITIES: A SYSTEMATIC MAPPING REVIEWBIG DATA IN SMART CITIES: A SYSTEMATIC MAPPING REVIEW
BIG DATA IN SMART CITIES: A SYSTEMATIC MAPPING REVIEW
sarfraznawaz
 
1. Web Mining – Web mining is an application of data mining for di.docx
1. Web Mining – Web mining is an application of data mining for di.docx1. Web Mining – Web mining is an application of data mining for di.docx
1. Web Mining – Web mining is an application of data mining for di.docx
braycarissa250
 
R18B.Tech.CSE(DataScience)IIIIVYearTentativeSyllabus.pdf
R18B.Tech.CSE(DataScience)IIIIVYearTentativeSyllabus.pdfR18B.Tech.CSE(DataScience)IIIIVYearTentativeSyllabus.pdf
R18B.Tech.CSE(DataScience)IIIIVYearTentativeSyllabus.pdf
Naveen Kumar
 
The Data Science Process: From Mining Raw Data to Story Visualization
The Data Science Process: From Mining Raw Data to Story VisualizationThe Data Science Process: From Mining Raw Data to Story Visualization
The Data Science Process: From Mining Raw Data to Story Visualization
Demetris Trihinas
 
Knowledge Acquisition Based on Repertory Grid Analysis System
Knowledge Acquisition Based on Repertory Grid Analysis SystemKnowledge Acquisition Based on Repertory Grid Analysis System
Knowledge Acquisition Based on Repertory Grid Analysis System
ijtsrd
 
From data mining to knowledge discovery in
From data mining to knowledge discovery inFrom data mining to knowledge discovery in
From data mining to knowledge discovery in
Raj Kumar Ranabhat
 
Data Mining Primitives, Languages & Systems
Data Mining Primitives, Languages & SystemsData Mining Primitives, Languages & Systems
Data Mining Primitives, Languages & Systems
Niloy Sikder
 
Clustering
ClusteringClustering
Clustering
Kiran Bhowmick
 
Data Mining Framework for Network Intrusion Detection using Efficient Techniques
Data Mining Framework for Network Intrusion Detection using Efficient TechniquesData Mining Framework for Network Intrusion Detection using Efficient Techniques
Data Mining Framework for Network Intrusion Detection using Efficient Techniques
IJAEMSJORNAL
 
PREDICTION OF STORM DISASTER USING CLOUD MAP-REDUCE METHOD
PREDICTION OF STORM DISASTER USING CLOUD MAP-REDUCE METHODPREDICTION OF STORM DISASTER USING CLOUD MAP-REDUCE METHOD
PREDICTION OF STORM DISASTER USING CLOUD MAP-REDUCE METHOD
AM Publications
 

Similar to Clustering, application, methods u 1 (20)

Data Mining @ Information Age
Data Mining @ Information AgeData Mining @ Information Age
Data Mining @ Information Age
 
How Partitioning Clustering Technique For Implementing...
How Partitioning Clustering Technique For Implementing...How Partitioning Clustering Technique For Implementing...
How Partitioning Clustering Technique For Implementing...
 
IJSRED-V2I3P53
IJSRED-V2I3P53IJSRED-V2I3P53
IJSRED-V2I3P53
 
Web Based Fuzzy Clustering Analysis
Web Based Fuzzy Clustering AnalysisWeb Based Fuzzy Clustering Analysis
Web Based Fuzzy Clustering Analysis
 
DWDM syllabus.doc
DWDM syllabus.docDWDM syllabus.doc
DWDM syllabus.doc
 
IRJET - Exploring Agglomerative Spectral Clustering Technique Employed for...
IRJET - 	  Exploring Agglomerative Spectral Clustering Technique Employed for...IRJET - 	  Exploring Agglomerative Spectral Clustering Technique Employed for...
IRJET - Exploring Agglomerative Spectral Clustering Technique Employed for...
 
Improved K-mean Clustering Algorithm for Prediction Analysis using Classifica...
Improved K-mean Clustering Algorithm for Prediction Analysis using Classifica...Improved K-mean Clustering Algorithm for Prediction Analysis using Classifica...
Improved K-mean Clustering Algorithm for Prediction Analysis using Classifica...
 
Future Skills Needs for Data and Analytics
Future Skills Needs for Data and Analytics Future Skills Needs for Data and Analytics
Future Skills Needs for Data and Analytics
 
New approaches of Data Mining for the Internet of things with systems: Litera...
New approaches of Data Mining for the Internet of things with systems: Litera...New approaches of Data Mining for the Internet of things with systems: Litera...
New approaches of Data Mining for the Internet of things with systems: Litera...
 
Applications, Techniques and Trends of Data Mining and Knowledge Discovery Da...
Applications, Techniques and Trends of Data Mining and Knowledge Discovery Da...Applications, Techniques and Trends of Data Mining and Knowledge Discovery Da...
Applications, Techniques and Trends of Data Mining and Knowledge Discovery Da...
 
BIG DATA IN SMART CITIES: A SYSTEMATIC MAPPING REVIEW
BIG DATA IN SMART CITIES: A SYSTEMATIC MAPPING REVIEWBIG DATA IN SMART CITIES: A SYSTEMATIC MAPPING REVIEW
BIG DATA IN SMART CITIES: A SYSTEMATIC MAPPING REVIEW
 
1. Web Mining – Web mining is an application of data mining for di.docx
1. Web Mining – Web mining is an application of data mining for di.docx1. Web Mining – Web mining is an application of data mining for di.docx
1. Web Mining – Web mining is an application of data mining for di.docx
 
R18B.Tech.CSE(DataScience)IIIIVYearTentativeSyllabus.pdf
R18B.Tech.CSE(DataScience)IIIIVYearTentativeSyllabus.pdfR18B.Tech.CSE(DataScience)IIIIVYearTentativeSyllabus.pdf
R18B.Tech.CSE(DataScience)IIIIVYearTentativeSyllabus.pdf
 
The Data Science Process: From Mining Raw Data to Story Visualization
The Data Science Process: From Mining Raw Data to Story VisualizationThe Data Science Process: From Mining Raw Data to Story Visualization
The Data Science Process: From Mining Raw Data to Story Visualization
 
Knowledge Acquisition Based on Repertory Grid Analysis System
Knowledge Acquisition Based on Repertory Grid Analysis SystemKnowledge Acquisition Based on Repertory Grid Analysis System
Knowledge Acquisition Based on Repertory Grid Analysis System
 
From data mining to knowledge discovery in
From data mining to knowledge discovery inFrom data mining to knowledge discovery in
From data mining to knowledge discovery in
 
Data Mining Primitives, Languages & Systems
Data Mining Primitives, Languages & SystemsData Mining Primitives, Languages & Systems
Data Mining Primitives, Languages & Systems
 
Clustering
ClusteringClustering
Clustering
 
Data Mining Framework for Network Intrusion Detection using Efficient Techniques
Data Mining Framework for Network Intrusion Detection using Efficient TechniquesData Mining Framework for Network Intrusion Detection using Efficient Techniques
Data Mining Framework for Network Intrusion Detection using Efficient Techniques
 
PREDICTION OF STORM DISASTER USING CLOUD MAP-REDUCE METHOD
PREDICTION OF STORM DISASTER USING CLOUD MAP-REDUCE METHODPREDICTION OF STORM DISASTER USING CLOUD MAP-REDUCE METHOD
PREDICTION OF STORM DISASTER USING CLOUD MAP-REDUCE METHOD
 

Recently uploaded

Thornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdfThornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdf
European Sustainable Phosphorus Platform
 
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
Advanced-Concepts-Team
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
Sérgio Sacani
 
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
University of Maribor
 
The binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defectsThe binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defects
Sérgio Sacani
 
Bob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdfBob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdf
Texas Alliance of Groundwater Districts
 
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Leonel Morgado
 
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdfwaterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
LengamoLAppostilic
 
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero WaterSharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Texas Alliance of Groundwater Districts
 
Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...
Leonel Morgado
 
Applied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdfApplied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdf
University of Hertfordshire
 
Shallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptxShallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptx
Gokturk Mehmet Dilci
 
Eukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptxEukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptx
RitabrataSarkar3
 
molar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptxmolar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptx
Anagha Prasad
 
20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx
Sharon Liu
 
The cost of acquiring information by natural selection
The cost of acquiring information by natural selectionThe cost of acquiring information by natural selection
The cost of acquiring information by natural selection
Carl Bergstrom
 
Medical Orthopedic PowerPoint Templates.pptx
Medical Orthopedic PowerPoint Templates.pptxMedical Orthopedic PowerPoint Templates.pptx
Medical Orthopedic PowerPoint Templates.pptx
terusbelajar5
 
Direct Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart AgricultureDirect Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart Agriculture
International Food Policy Research Institute- South Asia Office
 
Oedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptxOedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptx
muralinath2
 
GBSN - Biochemistry (Unit 6) Chemistry of Proteins
GBSN - Biochemistry (Unit 6) Chemistry of ProteinsGBSN - Biochemistry (Unit 6) Chemistry of Proteins
GBSN - Biochemistry (Unit 6) Chemistry of Proteins
Areesha Ahmad
 

Recently uploaded (20)

Thornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdfThornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdf
 
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
 
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
 
The binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defectsThe binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defects
 
Bob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdfBob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdf
 
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
 
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdfwaterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
 
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero WaterSharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
 
Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...
 
Applied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdfApplied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdf
 
Shallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptxShallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptx
 
Eukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptxEukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptx
 
molar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptxmolar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptx
 
20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx
 
The cost of acquiring information by natural selection
The cost of acquiring information by natural selectionThe cost of acquiring information by natural selection
The cost of acquiring information by natural selection
 
Medical Orthopedic PowerPoint Templates.pptx
Medical Orthopedic PowerPoint Templates.pptxMedical Orthopedic PowerPoint Templates.pptx
Medical Orthopedic PowerPoint Templates.pptx
 
Direct Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart AgricultureDirect Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart Agriculture
 
Oedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptxOedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptx
 
GBSN - Biochemistry (Unit 6) Chemistry of Proteins
GBSN - Biochemistry (Unit 6) Chemistry of ProteinsGBSN - Biochemistry (Unit 6) Chemistry of Proteins
GBSN - Biochemistry (Unit 6) Chemistry of Proteins
 

Clustering, application, methods u 1

  • 1. Department of CT III-B.Sc-CT VI Semester: 2019-20 16ED – Data Mining Department of CT III-B.Sc-CT VI Semester: 2019-20 Course: Data Mining Sub Code: 6ED Google Classroom: q7b4gv Programme: B.Sc-CT Unit: I Hour : 4 Faculty: Ms. A.SATHIYA PRIYA clustering Unit I Basic Data Mining Tasks
  • 2. Department of CT III-B.Sc-CT VI Semester: 2019-20 2 Department of Computer Technology III BSC CT SEM V Year: 2019- 20 UNIT I Basic Data Mining Tasks6ED – Data Mining SNAP TALK 2
  • 3. Department of CT III-B.Sc-CT VI Semester: 2019-20 3 Department of Computer Technology III BSC CT SEM V Year: 2019- 20 UNIT I Basic Data Mining Tasks6ED – Data Mining ATTENDANCE 3
  • 4. Department of CT III-B.Sc-CT VI Semester: 2019-20 Unit-I Basic Data Mining Tasks - Data Mining Versus Knowledge Discovery in Databases - Data Mining Issues - Data Mining Matrices - Social Implications of Data Mining - Data Mining from Data Base Perspective. 4 Department of Computer Science III-B.Sc-CT VI Semester: 2019-20 Unit I Basic Data Mining Tasks6ED – Data Mining
  • 5. Department of CT III-B.Sc-CT VI Semester: 2019-20 CLUSTERING • What is Clustering? • Clustering is the process of making a group of abstract objects into classes of similar objects. • A cluster of data objects can be treated as one group. • While doing cluster analysis, we first partition the set of data into groups based on data similarity and then assign the labels to the groups. 5 Department of Computer Science III-B.Sc-CT VI Semester: 2019-20 Unit I Basic Data Mining Tasks6ED – Data Mining
  • 6. Department of CT III-B.Sc-CT VI Semester: 2019-20 Cont., The main advantage of clustering over classification is that, it is adaptable to changes and helps single out useful features that distinguish different groups. 6 Department of Computer Science III-B.Sc-CT VI Semester: 2019-20 Unit I Basic Data Mining Tasks6ED – Data Mining
  • 7. Department of CT III-B.Sc-CT VI Semester: 2019-20 APPLICATIONS • Applications of Cluster Analysis • Clustering analysis is broadly used in many applications such as market research, pattern recognition, data analysis, and image processing. • Clustering can also help marketers discover distinct groups in their customer base. And they can characterize their customer groups based on the purchasing patterns. 7 Department of Computer Science III-B.Sc-CT VI Semester: 2019-20 Unit I Basic Data Mining Tasks6ED – Data Mining
  • 8. Department of CT III-B.Sc-CT VI Semester: 2019-20 Cont., • In the field of biology, it can be used to derive plant and animal taxonomies, categorize genes with similar functionalities and gain insight into structures inherent to populations. • Clustering also helps in identification of areas of similar land use in an earth observation database. It also helps in the identification of groups of houses in a city according to house type, value, and geographic location. 8 Department of Computer Science III-B.Sc-CT VI Semester: 2019-20 Unit I Basic Data Mining Tasks6ED – Data Mining
  • 9. Department of CT III-B.Sc-CT VI Semester: 2019-20 Cont., • Clustering also helps in classifying documents on the web for information discovery. • Clustering is also used in outlier detection applications such as detection of credit card fraud. • As a data mining function, cluster analysis serves as a tool to gain insight into the distribution of data to observe characteristics of each cluster. 9 Department of Computer Science III-B.Sc-CT VI Semester: 2019-20 Unit I Basic Data Mining Tasks6ED – Data Mining
  • 10. Department of CT III-B.Sc-CT VI Semester: 2019-20 Requirements of Clustering in Data Mining • He following points throw light on why clustering is required in data mining − • Scalability − We need highly scalable clustering algorithms to deal with large databases. • Ability to deal with different kinds of attributes − Algorithms should be capable to be applied on any kind of data such as interval-based (numerical) data, categorical, and binary data. 10 Department of Computer Science III-B.Sc-CT VI Semester: 2019-20 Unit I Basic Data Mining Tasks6ED – Data Mining
  • 11. Department of CT III-B.Sc-CT VI Semester: 2019-20 Cont., • Discovery of clusters with attribute shape − The clustering algorithm should be capable of detecting clusters of arbitrary shape. They should not be bounded to only distance measures that tend to find spherical cluster of small sizes. • High dimensionality − The clustering algorithm should not only be able to handle low-dimensional data but also the high dimensional space. 11 Department of Computer Science III-B.Sc-CT VI Semester: 2019-20 Unit I Basic Data Mining Tasks6ED – Data Mining
  • 12. Department of CT III-B.Sc-CT VI Semester: 2019-20 Cont., • Ability to deal with noisy data − Databases contain noisy, missing or erroneous data. Some algorithms are sensitive to such data and may lead to poor quality clusters. • Interpretability − The clustering results should be interpretable, comprehensible, and usable. 12 Department of Computer Science III-B.Sc-CT VI Semester: 2019-20 Unit I Basic Data Mining Tasks6ED – Data Mining
  • 13. Department of CT III-B.Sc-CT VI Semester: 2019-20 Methods • Clustering Methods • Clustering methods can be classified into the following categories − • Partitioning Method • Hierarchical Method • Density-based Method • Grid-Based Method • Model-Based Method • Constraint-based MethodPerspective. 13 Department of Computer Science III-B.Sc-CT VI Semester: 2019-20 Unit I Basic Data Mining Tasks6ED – Data Mining
  • 14. Department of CT III-B.Sc-CT VI Semester: 2019-20 MCQ’s 1. ______is the process of making a group of abstract objects into classes of similar objects 2. A cluster of data objects can be treated as _______ 3. In the field of ______, it can be used to derive plant It have Ability to deal with _____data 4. The clustering result should be interpretable,comprehensible and __________. 5. The clustering algorithm should be capable of detecting clusters of _______shape. 14 Department of Computer Science III-B.Sc-CT VI Semester: 2019-20 Unit I Basic Data Mining Tasks6ED – Data Mining
  • 15. Department of CT III-B.Sc-CT VI Semester: 2019-20 MCQ’s 1. cluster 2. One group 3. Biology 4. uasble 5. arbitrary. 15 Department of Computer Science III-B.Sc-CT VI Semester: 2019-20 Unit I Basic Data Mining Tasks6ED – Data Mining
  • 16. Department of CT III-B.Sc-CT VI Semester: 2019-20 THANK U 16 Department of Computer Technology III BSC CT SEM V year: 2019- 20 6ED – Data Mining UNIT I Basic Data Mining Tasks