1 I NAME OF PRESENTER
Data Mining
Ashis Kumar Chanda
Department of Computer Science and Engineering
University of Dhaka
Key concepts
- What is data mining
- Why learn data mining
- Data type
- Warehouse & OLAP
- Data cleaning, integration
- Associations, item sets, support, confidence
Data Mining
- Data mining refers to mining knowledge from large amounts of data
- Also known as "Knowledge Discovery from Data" (KDD)
- The goal is to find hidden patterns
Why learn data mining
- Queries alone cannot give us every type of information
- Queries offer limited support for statistical analysis
- With data mining we can also apply artificial intelligence and find new patterns or structures
Queries return values, but data mining provides insights that help in making (business) decisions.
Ex: Women who live in "Dhanmondi" and are older than 40 years most frequently buy "Jamdani Shari" at "Arong"
Data type
- Tabular (transaction data): most commonly used
- Spatial data (remote sensing data / encoded data)
- Tree data (XML)
- Graphs (WWW, bio-molecular)
- Sequences (DNA, activity logs)
- Text, multimedia data
Warehouse & OLAP
Fig: Warehouse and data sources
A warehouse is an archive of information gathered from multiple sources.
Consider a banking database where each area has a data source that stores all transactions of that area. Every data source provides a clean/safe copy to the warehouse.
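
The banking scenario can be sketched in a few lines. This is a minimal illustration rather than a real warehouse design: the area names, the record fields (`txn_id`, `amount`), and the `load_into_warehouse` helper are all hypothetical, and "cleaning" is reduced here to dropping duplicate and incomplete records.

```python
# Hypothetical per-area data sources; names and fields are illustrative only.
sources = {
    "area_a": [
        {"txn_id": 1, "amount": 500.0},
        {"txn_id": 1, "amount": 500.0},   # duplicate record
        {"txn_id": 2, "amount": None},    # missing amount
    ],
    "area_b": [
        {"txn_id": 1, "amount": 120.0},
        {"txn_id": 2, "amount": 75.5},
    ],
}


def load_into_warehouse(sources):
    """Gather a cleaned copy of every source into one list of records."""
    warehouse = []
    seen = set()
    for area, records in sources.items():
        for record in records:
            key = (area, record["txn_id"])
            if record["amount"] is None or key in seen:
                continue  # skip incomplete or duplicate records
            seen.add(key)
            # Tag each record with its origin so the sources stay distinguishable.
            warehouse.append({"area": area, **record})
    return warehouse


warehouse = load_into_warehouse(sources)
print(len(warehouse))
```

Each source keeps its own raw data; only the cleaned copy is gathered centrally, which is the property the slide describes.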
Warehouse & OLAP
There are several issues concerning a warehouse:
- When and how to gather data
- What schema/pattern to use
- Data transformation & cleaning
- How to update
"A warehouse is a collection of data marts",
where a data mart is a store of data in a specialized pattern.
Warehouse & OLAP
OLAP: Online Analytical Processing
OLAP tools support interactive analysis of summary information.
OLAP permits an analyst to view different summaries of multidimensional data.
Fig: Data Cube (labels visible in the figure: Item name, Dress)
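
Viewing "different summaries" of the same multidimensional data can be illustrated with a tiny in-memory cube. This is only a sketch: the three dimensions (item, city, quarter), the sales figures, and the `roll_up` helper are made up for illustration, and real OLAP tools work against a warehouse rather than a Python dictionary.

```python
from collections import defaultdict

# Hypothetical sales facts keyed by (item, city, quarter); values are units sold.
DIMENSIONS = ("item", "city", "quarter")
cube = {
    ("Dress", "Dhaka", "Q1"): 10,
    ("Dress", "Dhaka", "Q2"): 15,
    ("Dress", "Khulna", "Q1"): 4,
    ("Shoes", "Dhaka", "Q1"): 7,
    ("Shoes", "Khulna", "Q2"): 3,
}


def roll_up(cube, keep):
    """Summarise the cube over every dimension not listed in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    summary = defaultdict(int)
    for cell, units in cube.items():
        summary[tuple(cell[p] for p in positions)] += units
    return dict(summary)


by_item = roll_up(cube, ["item"])               # totals per item
by_item_city = roll_up(cube, ["item", "city"])  # totals per item and city
total = roll_up(cube, [])                       # grand total over all cells
print(by_item)
```

The same facts yield a different summary for each choice of dimensions to keep, which is what lets an analyst move between coarse and fine views.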
Data cleaning
The data may contain missing, duplicate, or dirty values, so we need data cleaning.
Some methods:
- Ignore the tuple (not effective unless the tuple contains many missing attributes)
- Fill in missing values manually (time consuming)
- Fill with a global constant (like: unknown)
- Use the attribute mean
- Use the most probable value
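
The automatic methods in this list can be sketched as small functions. The sample rows and helper names below are hypothetical. The last helper uses the most frequent observed value as a simple stand-in for "most probable value", which is an assumption of this sketch; a real system might instead predict the value from the other attributes.

```python
from collections import Counter
from statistics import mean

# Hypothetical customer records; None marks a missing value.
rows = [
    {"age": 25, "city": "Dhaka"},
    {"age": None, "city": "Dhaka"},
    {"age": 35, "city": None},
    {"age": 30, "city": "Khulna"},
]


def ignore_tuples(rows):
    """Drop every row that has at least one missing attribute."""
    return [r for r in rows if None not in r.values()]


def fill_global(rows, attr, value="unknown"):
    """Replace missing values of `attr` with one global constant."""
    return [{**r, attr: value if r[attr] is None else r[attr]} for r in rows]


def fill_mean(rows, attr):
    """Replace missing numeric values of `attr` with the attribute mean."""
    avg = mean(r[attr] for r in rows if r[attr] is not None)
    return [{**r, attr: avg if r[attr] is None else r[attr]} for r in rows]


def fill_most_frequent(rows, attr):
    """Replace missing values with the most frequent observed value."""
    counts = Counter(r[attr] for r in rows if r[attr] is not None)
    top = counts.most_common(1)[0][0]
    return [{**r, attr: top if r[attr] is None else r[attr]} for r in rows]


print(fill_mean(rows, "age"))
print(fill_most_frequent(rows, "city"))
```

Every helper returns new rows and leaves the input untouched, so the strategies can be compared on the same data.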
Associations & Item sets
Associations:
An association is a rule of the form "if X then Y".
It is denoted as X -> Y.
Example: if there is an exam, then I study.
Item Sets:
If both X -> Y and Y -> X hold, then X and Y are called an item set.
Example:
People buying school books in January also buy notebooks.
People buying school notebooks in January also buy books.
Support & confidence
Support:
The proportion of transactions in the data set that contain the itemset.
Confidence:
The conditional probability that an item appears in a transaction given that another item appears in it.
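
In symbols, writing D for the set of transactions and X, Y for itemsets (this notation is an assumption added for illustration, since the definitions above are verbal), the two measures can be written as:

```latex
\mathrm{support}(X) = \frac{|\{\, t \in D : X \subseteq t \,\}|}{|D|},
\qquad
\mathrm{confidence}(X \rightarrow Y)
  = P(Y \mid X)
  = \frac{\mathrm{support}(X \cup Y)}{\mathrm{support}(X)}
```

Because both supports share the denominator |D|, confidence can equally be computed from raw transaction counts.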
Support & confidence
Support for {I1, I2}
= support_count(I1 ∪ I2) / |D|
= 4/9
Confidence for I1 → I2
= support_count(I1 ∪ I2) / support_count(I1)
= 4/6
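
The arithmetic can be checked with a short script. The transaction table from the slide is not present in this text, so the nine transactions below are an assumed data set, chosen only so that the counts match the values 4/9 and 4/6. The function names are likewise illustrative.

```python
# Assumed transaction table: the one on the slide did not survive,
# so these nine transactions are chosen to reproduce its counts.
D = [
    {"I1", "I2", "I5"},
    {"I2", "I4"},
    {"I2", "I3"},
    {"I1", "I2", "I4"},
    {"I1", "I3"},
    {"I2", "I3"},
    {"I1", "I3"},
    {"I1", "I2", "I3", "I5"},
    {"I1", "I2", "I3"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Fraction of transactions that contain `itemset`."""
    return support_count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support_count(A U B) / support_count(A) for the rule A -> B."""
    both = support_count(antecedent | consequent, transactions)
    return both / support_count(antecedent, transactions)


print(support_count({"I1", "I2"}, D), len(D))  # counts behind the support
print(support_count({"I1"}, D))                # denominator of the confidence
print(confidence({"I1"}, {"I2"}, D))
```

Representing each transaction as a set makes "contains the itemset" a plain subset test.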
Association rules
confidence(A → B) = support_count(A ∪ B) / support_count(A)
where support_count(A ∪ B) is the number of transactions containing the itemset A ∪ B, and support_count(A) is the number of transactions containing the itemset A.
Association rules can be generated as follows:
1. For each frequent itemset l, generate all nonempty subsets of l.
2. For every nonempty subset s of l, output the rule "s → (l - s)" if support_count(l) / support_count(s) >= min_conf, where min_conf is the minimum confidence threshold.
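
The two steps above can be sketched as a generator. This is a minimal illustration, not an optimized implementation: it assumes the frequent itemset has already been found, it recounts supports by scanning the transactions, and the sample transactions are hypothetical. The subset s = l is skipped because it would leave an empty right-hand side.

```python
from itertools import combinations


def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def generate_rules(frequent_itemset, transactions, min_conf):
    """Yield (s, l - s, confidence) for each rule meeting min_conf."""
    l = frozenset(frequent_itemset)
    # A frequent itemset occurs at least once, so its subsets do too.
    count_l = support_count(l, transactions)
    # Sizes 1 .. len(l) - 1: s = l is skipped because l - s would be empty.
    for size in range(1, len(l)):
        for subset in combinations(sorted(l), size):
            s = frozenset(subset)
            conf = count_l / support_count(s, transactions)
            if conf >= min_conf:
                yield s, l - s, conf


# Hypothetical transactions, used only to exercise the function.
transactions = [
    {"book", "notebook"},
    {"book", "notebook", "pen"},
    {"book"},
    {"notebook", "pen"},
]
rules = list(generate_rules({"book", "notebook"}, transactions, min_conf=0.6))
for s, rest, conf in rules:
    print(sorted(s), "->", sorted(rest), round(conf, 2))
```

Raising `min_conf` prunes the weaker rules; with this sample data a threshold of 0.7 leaves none.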
Summary
Basic topics: data mining, data cleaning, warehouse, OLAP
Terms: association, item set, support, confidence

More Related Content

What's hot

Introduction to data pre-processing and cleaning
Introduction to data pre-processing and cleaning Introduction to data pre-processing and cleaning
Introduction to data pre-processing and cleaning Matteo Manca
 
A basic course on Research data management, part 4: caring for your data, or ...
A basic course on Research data management, part 4: caring for your data, or ...A basic course on Research data management, part 4: caring for your data, or ...
A basic course on Research data management, part 4: caring for your data, or ...Leon Osinski
 
A classification of methods for frequent pattern mining
A classification of methods for frequent pattern miningA classification of methods for frequent pattern mining
A classification of methods for frequent pattern miningIOSR Journals
 
DataVsStatistics
DataVsStatisticsDataVsStatistics
DataVsStatisticsjpheintz
 
A basic course on Research data management, part 1: what and why
A basic course on Research data management, part 1: what and whyA basic course on Research data management, part 1: what and why
A basic course on Research data management, part 1: what and whyLeon Osinski
 
Data Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsData Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsDataminingTools Inc
 
Data mining nouman javed
Data mining   nouman javedData mining   nouman javed
Data mining nouman javednouman javed
 
Research trends in data warehousing and data mining
Research trends in data warehousing and data miningResearch trends in data warehousing and data mining
Research trends in data warehousing and data miningEr. Nawaraj Bhandari
 
Simulation and Performance Analysis of Long Term Evolution (LTE) Cellular Net...
Simulation and Performance Analysis of Long Term Evolution (LTE) Cellular Net...Simulation and Performance Analysis of Long Term Evolution (LTE) Cellular Net...
Simulation and Performance Analysis of Long Term Evolution (LTE) Cellular Net...ijsrd.com
 
Data pre processing
Data pre processingData pre processing
Data pre processingpommurajopt
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessingsuganmca14
 
Research Data Management
Research Data ManagementResearch Data Management
Research Data ManagementDaniel JACOB
 
EDI Training Module 11: Publishing Data in the EDI Repository
EDI Training Module 11:  Publishing Data in the EDI RepositoryEDI Training Module 11:  Publishing Data in the EDI Repository
EDI Training Module 11: Publishing Data in the EDI RepositoryEnvironmental Data Initiative
 
Data Mining: Classification and analysis
Data Mining: Classification and analysisData Mining: Classification and analysis
Data Mining: Classification and analysisDataminingTools Inc
 
A Study of Various Projected Data Based Pattern Mining Algorithms
A Study of Various Projected Data Based Pattern Mining AlgorithmsA Study of Various Projected Data Based Pattern Mining Algorithms
A Study of Various Projected Data Based Pattern Mining Algorithmsijsrd.com
 
Data Warehouse By Piyush
Data Warehouse By PiyushData Warehouse By Piyush
Data Warehouse By Piyushastronish
 

What's hot (20)

Introduction to data pre-processing and cleaning
Introduction to data pre-processing and cleaning Introduction to data pre-processing and cleaning
Introduction to data pre-processing and cleaning
 
A basic course on Research data management, part 4: caring for your data, or ...
A basic course on Research data management, part 4: caring for your data, or ...A basic course on Research data management, part 4: caring for your data, or ...
A basic course on Research data management, part 4: caring for your data, or ...
 
23.database
23.database23.database
23.database
 
A classification of methods for frequent pattern mining
A classification of methods for frequent pattern miningA classification of methods for frequent pattern mining
A classification of methods for frequent pattern mining
 
DataVsStatistics
DataVsStatisticsDataVsStatistics
DataVsStatistics
 
A basic course on Research data management, part 1: what and why
A basic course on Research data management, part 1: what and whyA basic course on Research data management, part 1: what and why
A basic course on Research data management, part 1: what and why
 
Data Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsData Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlations
 
Data mining nouman javed
Data mining   nouman javedData mining   nouman javed
Data mining nouman javed
 
Research trends in data warehousing and data mining
Research trends in data warehousing and data miningResearch trends in data warehousing and data mining
Research trends in data warehousing and data mining
 
Simulation and Performance Analysis of Long Term Evolution (LTE) Cellular Net...
Simulation and Performance Analysis of Long Term Evolution (LTE) Cellular Net...Simulation and Performance Analysis of Long Term Evolution (LTE) Cellular Net...
Simulation and Performance Analysis of Long Term Evolution (LTE) Cellular Net...
 
1 db terms
1 db terms1 db terms
1 db terms
 
Data pre processing
Data pre processingData pre processing
Data pre processing
 
Database
DatabaseDatabase
Database
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data structures
Data structuresData structures
Data structures
 
Research Data Management
Research Data ManagementResearch Data Management
Research Data Management
 
EDI Training Module 11: Publishing Data in the EDI Repository
EDI Training Module 11:  Publishing Data in the EDI RepositoryEDI Training Module 11:  Publishing Data in the EDI Repository
EDI Training Module 11: Publishing Data in the EDI Repository
 
Data Mining: Classification and analysis
Data Mining: Classification and analysisData Mining: Classification and analysis
Data Mining: Classification and analysis
 
A Study of Various Projected Data Based Pattern Mining Algorithms
A Study of Various Projected Data Based Pattern Mining AlgorithmsA Study of Various Projected Data Based Pattern Mining Algorithms
A Study of Various Projected Data Based Pattern Mining Algorithms
 
Data Warehouse By Piyush
Data Warehouse By PiyushData Warehouse By Piyush
Data Warehouse By Piyush
 

Viewers also liked

Mining Data from Reservoir Simulation Result
Mining Data from Reservoir Simulation ResultMining Data from Reservoir Simulation Result
Mining Data from Reservoir Simulation Resultakmaltk96
 
Keep austin weird 2014.ppt
Keep austin weird 2014.pptKeep austin weird 2014.ppt
Keep austin weird 2014.pptMelinda Brasher
 
Tornado re brand presentation (draft)(not for reproduction)
Tornado re brand presentation (draft)(not for reproduction)Tornado re brand presentation (draft)(not for reproduction)
Tornado re brand presentation (draft)(not for reproduction)Melinda Brasher
 
Test Powerpoint Upload
Test Powerpoint UploadTest Powerpoint Upload
Test Powerpoint UploadMatthew Walton
 
Test audio
Test audioTest audio
Test audioGalinaMi
 
Venticinque Aprile Un bellissimo giorno da ricordare e onorare
Venticinque   Aprile Un bellissimo giorno da ricordare e onorareVenticinque   Aprile Un bellissimo giorno da ricordare e onorare
Venticinque Aprile Un bellissimo giorno da ricordare e onorareLaura Franchini
 
Universal Design
Universal DesignUniversal Design
Universal Designsummerbloom
 
Nooges Brochure
Nooges BrochureNooges Brochure
Nooges Brochurenoogeking
 
The colors of the flag
The colors of the flagThe colors of the flag
The colors of the flagforever97
 
Mexican manufacturers inc 10 18
Mexican manufacturers inc 10 18Mexican manufacturers inc 10 18
Mexican manufacturers inc 10 18John Martino
 
Nooges-T Project
Nooges-T ProjectNooges-T Project
Nooges-T Projectnoogeking
 
Big Data vs. Smart Data: The Cook County Land Bank’s Data-Driven plan for lan...
Big Data vs. Smart Data: The Cook County Land Bank’s Data-Driven plan for lan...Big Data vs. Smart Data: The Cook County Land Bank’s Data-Driven plan for lan...
Big Data vs. Smart Data: The Cook County Land Bank’s Data-Driven plan for lan...Cook County Commissioner Bridget Gainer
 

Viewers also liked (20)

Mining Data from Reservoir Simulation Result
Mining Data from Reservoir Simulation ResultMining Data from Reservoir Simulation Result
Mining Data from Reservoir Simulation Result
 
10 flatteners
10 flatteners10 flatteners
10 flatteners
 
L’eutanasia
L’eutanasiaL’eutanasia
L’eutanasia
 
Activity in comp.
Activity in comp.Activity in comp.
Activity in comp.
 
Keep austin weird 2014.ppt
Keep austin weird 2014.pptKeep austin weird 2014.ppt
Keep austin weird 2014.ppt
 
Keep Austin Weird 2013
Keep Austin Weird 2013Keep Austin Weird 2013
Keep Austin Weird 2013
 
Tornado re brand presentation (draft)(not for reproduction)
Tornado re brand presentation (draft)(not for reproduction)Tornado re brand presentation (draft)(not for reproduction)
Tornado re brand presentation (draft)(not for reproduction)
 
Final photos
Final photosFinal photos
Final photos
 
Test Powerpoint Upload
Test Powerpoint UploadTest Powerpoint Upload
Test Powerpoint Upload
 
Test audio
Test audioTest audio
Test audio
 
Venticinque Aprile Un bellissimo giorno da ricordare e onorare
Venticinque   Aprile Un bellissimo giorno da ricordare e onorareVenticinque   Aprile Un bellissimo giorno da ricordare e onorare
Venticinque Aprile Un bellissimo giorno da ricordare e onorare
 
Universal Design
Universal DesignUniversal Design
Universal Design
 
Nooges Brochure
Nooges BrochureNooges Brochure
Nooges Brochure
 
Disney
DisneyDisney
Disney
 
The colors of the flag
The colors of the flagThe colors of the flag
The colors of the flag
 
Mexican manufacturers inc 10 18
Mexican manufacturers inc 10 18Mexican manufacturers inc 10 18
Mexican manufacturers inc 10 18
 
Nooges-T Project
Nooges-T ProjectNooges-T Project
Nooges-T Project
 
Big Data vs. Smart Data: The Cook County Land Bank’s Data-Driven plan for lan...
Big Data vs. Smart Data: The Cook County Land Bank’s Data-Driven plan for lan...Big Data vs. Smart Data: The Cook County Land Bank’s Data-Driven plan for lan...
Big Data vs. Smart Data: The Cook County Land Bank’s Data-Driven plan for lan...
 
Fotosintesis
FotosintesisFotosintesis
Fotosintesis
 
Jaguar
JaguarJaguar
Jaguar
 

Similar to Data Mining (Introduction)

Data Mining @ BSU Malolos 2019
Data Mining @ BSU Malolos 2019Data Mining @ BSU Malolos 2019
Data Mining @ BSU Malolos 2019Edwin S. Garcia
 
MS SQL SERVER: Introduction To Database Concepts
MS SQL SERVER: Introduction To Database ConceptsMS SQL SERVER: Introduction To Database Concepts
MS SQL SERVER: Introduction To Database Conceptssqlserver content
 
MS Sql Server: Introduction To Database Concepts
MS Sql Server: Introduction To Database ConceptsMS Sql Server: Introduction To Database Concepts
MS Sql Server: Introduction To Database ConceptsDataminingTools Inc
 
Data mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationData mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationDr. Abdul Ahad Abro
 
SIM PASCA CHAPTER 4.pdf
SIM PASCA CHAPTER 4.pdfSIM PASCA CHAPTER 4.pdf
SIM PASCA CHAPTER 4.pdfAdiSuputrq
 
Business Intelligence
Business IntelligenceBusiness Intelligence
Business IntelligenceSukirti Garg
 
Data Mining and Data Warehousing
Data Mining and Data WarehousingData Mining and Data Warehousing
Data Mining and Data WarehousingAswathy S Nair
 
DMDW Lesson 04 - Data Mining Theory
DMDW Lesson 04 - Data Mining TheoryDMDW Lesson 04 - Data Mining Theory
DMDW Lesson 04 - Data Mining TheoryJohannes Hoppe
 
Data warehousing interview questions
Data warehousing interview questionsData warehousing interview questions
Data warehousing interview questionsSatyam Jaiswal
 
Master Minds on Data Science - Arno Siebes
Master Minds on Data Science - Arno SiebesMaster Minds on Data Science - Arno Siebes
Master Minds on Data Science - Arno SiebesMedia Perspectives
 
Introduction to Data Science With R Notes
Introduction to Data Science With R NotesIntroduction to Data Science With R Notes
Introduction to Data Science With R NotesLakshmiSarvani6
 
DMDW Lesson 03 - Data Warehouse Theory
DMDW Lesson 03 - Data Warehouse TheoryDMDW Lesson 03 - Data Warehouse Theory
DMDW Lesson 03 - Data Warehouse TheoryJohannes Hoppe
 
DMML1_overview.ppt
DMML1_overview.pptDMML1_overview.ppt
DMML1_overview.pptbutest
 
Data Mining Concepts and Techniques
Data Mining Concepts and TechniquesData Mining Concepts and Techniques
Data Mining Concepts and TechniquesPratik Tambekar
 
A Survey on Approaches for Frequent Item Set Mining on Apache Hadoop
A Survey on Approaches for Frequent Item Set Mining on Apache HadoopA Survey on Approaches for Frequent Item Set Mining on Apache Hadoop
A Survey on Approaches for Frequent Item Set Mining on Apache HadoopIJTET Journal
 

Similar to Data Mining (Introduction) (20)

Data Mining @ BSU Malolos 2019
Data Mining @ BSU Malolos 2019Data Mining @ BSU Malolos 2019
Data Mining @ BSU Malolos 2019
 
MS SQL SERVER: Introduction To Database Concepts
MS SQL SERVER: Introduction To Database ConceptsMS SQL SERVER: Introduction To Database Concepts
MS SQL SERVER: Introduction To Database Concepts
 
MS Sql Server: Introduction To Database Concepts
MS Sql Server: Introduction To Database ConceptsMS Sql Server: Introduction To Database Concepts
MS Sql Server: Introduction To Database Concepts
 
Data mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationData mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, Classification
 
Lec 1 introduction
Lec 1 introductionLec 1 introduction
Lec 1 introduction
 
SIM PASCA CHAPTER 4.pdf
SIM PASCA CHAPTER 4.pdfSIM PASCA CHAPTER 4.pdf
SIM PASCA CHAPTER 4.pdf
 
Data mining
Data miningData mining
Data mining
 
Business Intelligence
Business IntelligenceBusiness Intelligence
Business Intelligence
 
Data Mining and Data Warehousing
Data Mining and Data WarehousingData Mining and Data Warehousing
Data Mining and Data Warehousing
 
DMDW Lesson 04 - Data Mining Theory
DMDW Lesson 04 - Data Mining TheoryDMDW Lesson 04 - Data Mining Theory
DMDW Lesson 04 - Data Mining Theory
 
Data warehousing interview questions
Data warehousing interview questionsData warehousing interview questions
Data warehousing interview questions
 
Master Minds on Data Science - Arno Siebes
Master Minds on Data Science - Arno SiebesMaster Minds on Data Science - Arno Siebes
Master Minds on Data Science - Arno Siebes
 
Introduction to Data Science With R Notes
Introduction to Data Science With R NotesIntroduction to Data Science With R Notes
Introduction to Data Science With R Notes
 
DMDW Lesson 03 - Data Warehouse Theory
DMDW Lesson 03 - Data Warehouse TheoryDMDW Lesson 03 - Data Warehouse Theory
DMDW Lesson 03 - Data Warehouse Theory
 
Database system Handbook 4th muhammad sharif.pdf
Database system Handbook 4th muhammad sharif.pdfDatabase system Handbook 4th muhammad sharif.pdf
Database system Handbook 4th muhammad sharif.pdf
 
Database system Handbook 4th muhammad sharif.pdf
Database system Handbook 4th muhammad sharif.pdfDatabase system Handbook 4th muhammad sharif.pdf
Database system Handbook 4th muhammad sharif.pdf
 
DMML1_overview.ppt
DMML1_overview.pptDMML1_overview.ppt
DMML1_overview.ppt
 
Data Mining Concepts and Techniques
Data Mining Concepts and TechniquesData Mining Concepts and Techniques
Data Mining Concepts and Techniques
 
A Survey on Approaches for Frequent Item Set Mining on Apache Hadoop
A Survey on Approaches for Frequent Item Set Mining on Apache HadoopA Survey on Approaches for Frequent Item Set Mining on Apache Hadoop
A Survey on Approaches for Frequent Item Set Mining on Apache Hadoop
 
Data warehousing and Data mining
Data warehousing and Data mining Data warehousing and Data mining
Data warehousing and Data mining
 

More from Ashis Kumar Chanda (20)

Word 2 vector
Word 2 vectorWord 2 vector
Word 2 vector
 
Multi-class Image Classification using deep convolutional networks on extreme...
Multi-class Image Classification using deep convolutional networks on extreme...Multi-class Image Classification using deep convolutional networks on extreme...
Multi-class Image Classification using deep convolutional networks on extreme...
 
Full resolution image compression with recurrent neural networks
Full resolution image compression with  recurrent neural networksFull resolution image compression with  recurrent neural networks
Full resolution image compression with recurrent neural networks
 
Understanding Natural Language Queries over Relational Databases
Understanding Natural Language Queries over Relational DatabasesUnderstanding Natural Language Queries over Relational Databases
Understanding Natural Language Queries over Relational Databases
 
03. Agile Development
03. Agile Development03. Agile Development
03. Agile Development
 
Software Cost Estimation
Software Cost EstimationSoftware Cost Estimation
Software Cost Estimation
 
Risk Management
Risk ManagementRisk Management
Risk Management
 
Project Management
Project ManagementProject Management
Project Management
 
MVC
MVCMVC
MVC
 
Requirements engineering
Requirements engineeringRequirements engineering
Requirements engineering
 
4. UML
4. UML4. UML
4. UML
 
2. Software process
2. Software process2. Software process
2. Software process
 
1. Introduction
1. Introduction1. Introduction
1. Introduction
 
Periodic pattern mining
Periodic pattern miningPeriodic pattern mining
Periodic pattern mining
 
FPPM algorithm
FPPM algorithmFPPM algorithm
FPPM algorithm
 
Secure software design
Secure software designSecure software design
Secure software design
 
Sequential logic circuit optimization
Sequential logic circuit optimizationSequential logic circuit optimization
Sequential logic circuit optimization
 
Introduction to CS
Introduction to CSIntroduction to CS
Introduction to CS
 
Iterative deepening search
Iterative deepening searchIterative deepening search
Iterative deepening search
 
CloudBus
CloudBusCloudBus
CloudBus
 

Recently uploaded

ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZTE
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 
microprocessor 8085 and its interfacing
microprocessor 8085  and its interfacingmicroprocessor 8085  and its interfacing
microprocessor 8085 and its interfacingjaychoudhary37
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSCAESB
 
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2RajaP95
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxvipinkmenon1
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerAnamika Sarkar
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girlsssuser7cb4ff
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escortsranjana rawat
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 

Recently uploaded (20)

ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
microprocessor 8085 and its interfacing
microprocessor 8085  and its interfacingmicroprocessor 8085  and its interfacing
microprocessor 8085 and its interfacing
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentation
 
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptx
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger

Data Mining (Introduction)

  • 1. Data Mining. Ashis Kumar Chanda, Department of Computer Science and Engineering, University of Dhaka
  • 2. Key concepts: what data mining is; why learn data mining; data types; warehouse & OLAP; data cleaning and integration; associations, item sets, support, confidence
  • 3. Data Mining: Data mining refers to mining knowledge from large amounts of data. It is also known as “Knowledge Discovery from Data” (KDD). The goal is to find hidden patterns.
  • 4. Why learn data mining: Not every kind of information can be obtained through queries. Queries do not support statistical analysis. With data mining we can also apply artificial intelligence to find new patterns or structures. A query returns values, whereas data mining provides insights that help in making (business) decisions. Example: women who live in “Dhanmondi” and are older than 40 most frequently buy “Jamdani Shari” at “Arong”.
  • 5. Data types: tabular (transaction data), the most commonly used; spatial data (remote sensing data, encoded data); tree data (XML); graphs (WWW, bio-molecular); sequences (DNA, activity logs); text and multimedia data
  • 6. Warehouse & OLAP: A warehouse is an archive of information gathered from multiple data sources. Consider a banking database in which each area has its own data source that stores all transactions of that area. Each data source provides a clean, safe copy of its data to the warehouse. [Figure labels: Warehouse, Data Source]
  • 7. Warehouse & OLAP: There are several issues to settle for a warehouse: when and how to gather data; what schema/pattern to use; data transformation and cleaning; how to update. “A warehouse is a collection of data marts”, where a data mart is a store of data in a specialized pattern.
  • 8. Warehouse & OLAP: OLAP stands for Online Analytical Processing. OLAP tools support interactive analysis of summary information. OLAP permits an analyst to view different summaries of multidimensional data. [Figure: Data Cube; surviving labels: Item name, Dress]
  • 9. Data cleaning: The data may contain missing, duplicate, or dirty values, so it needs to be cleaned. Some methods for handling missing values: ignore the tuple (not effective unless the tuple has many missing attributes); fill in the missing values manually (time consuming); fill with a global constant (such as “unknown”); use the attribute mean; use the most probable value.
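The missing-value strategies on the data cleaning slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the attribute names `age` and `city`, and the helper function names are all hypothetical, and the most frequent value is used as a simple stand-in for the "most probable value".

```python
from collections import Counter
from statistics import mean

# Hypothetical records; None marks a missing value.
rows = [
    {"age": 25, "city": "Dhaka"},
    {"age": None, "city": "Dhaka"},
    {"age": 35, "city": None},
    {"age": 30, "city": "Khulna"},
]

def drop_incomplete(rows):
    """Ignore every tuple that has at least one missing attribute."""
    return [r for r in rows if all(v is not None for v in r.values())]

def fill_with_constant(rows, attr, constant="unknown"):
    """Replace missing values of attr with a global constant."""
    return [{**r, attr: constant if r[attr] is None else r[attr]} for r in rows]

def fill_with_mean(rows, attr):
    """Replace missing values of a numeric attr with the mean of the known values."""
    avg = mean(r[attr] for r in rows if r[attr] is not None)
    return [{**r, attr: avg if r[attr] is None else r[attr]} for r in rows]

def fill_with_mode(rows, attr):
    """Replace missing values of attr with the most frequent known value
    (an assumed, simple proxy for the most probable value)."""
    known = [r[attr] for r in rows if r[attr] is not None]
    most_common = Counter(known).most_common(1)[0][0]
    return [{**r, attr: most_common if r[attr] is None else r[attr]} for r in rows]

print(fill_with_mean(rows, "age"))
print(fill_with_mode(rows, "city"))
```

Each helper returns a new list and leaves the input untouched, so the strategies can be compared side by side on the same data.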
  • 11. Associations & item sets: An association is a rule of the form “if X then Y”, denoted X → Y. Example: if there is an exam, then I study. Item sets: if both X → Y and Y → X hold, then X and Y are called an item set. Example: people who buy school books in January also buy notebooks, and people who buy school notebooks in January also buy books.
  • 12. Support & confidence: Support is the proportion of transactions in the data set that contain the itemset. Confidence is the conditional probability that an item appears in a transaction given that another item appears in it.
  • 13. Support & confidence: Support for {I1, I2} = support_count(I1 ∪ I2) / |D| = 4/9. Confidence for I1 → I2 = support_count(I1 ∪ I2) / support_count(I1) = 4/6.
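The two ratios on this slide can be checked with a short Python sketch. The slide's transaction table did not survive in the transcript, so the nine transactions below are an assumed stand-in chosen so that the counts match the slide (4 transactions contain both I1 and I2, 6 contain I1, and |D| = 9). The function names mirror the slide's `support_count` notation but are otherwise hypothetical.

```python
from fractions import Fraction

# Assumed data set D with nine transactions (not taken from the transcript).
transactions = [
    {"I1", "I2", "I5"},
    {"I2", "I4"},
    {"I2", "I3"},
    {"I1", "I2", "I4"},
    {"I1", "I3"},
    {"I2", "I3"},
    {"I1", "I3"},
    {"I1", "I2", "I3", "I5"},
    {"I1", "I2", "I3"},
]

def support_count(itemset, transactions):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)

def support(itemset, transactions):
    """Proportion of transactions that contain itemset."""
    return Fraction(support_count(itemset, transactions), len(transactions))

def confidence(antecedent, consequent, transactions):
    """support_count(antecedent ∪ consequent) / support_count(antecedent)."""
    both = support_count(antecedent | consequent, transactions)
    return Fraction(both, support_count(antecedent, transactions))

print(support({"I1", "I2"}, transactions))       # 4/9
print(confidence({"I1"}, {"I2"}, transactions))  # 2/3, i.e. 4/6 reduced
```

`Fraction` keeps the results exact, which makes it easy to compare them with the slide's 4/9 and 4/6.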
  • 14. Association rules: confidence(A → B) = support_count(A ∪ B) / support_count(A), where support_count(A ∪ B) is the number of transactions containing the itemset A ∪ B, and support_count(A) is the number of transactions containing the itemset A. Association rules can be generated as follows: (1) For each frequent itemset l, generate all nonempty subsets of l. (2) For every nonempty subset s of l, output the rule “s → (l − s)” if support_count(l) / support_count(s) ≥ min_conf, where min_conf is the minimum confidence threshold.
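The two-step rule generation procedure can be sketched as follows. This is a minimal illustration, not a full mining algorithm: it assumes the frequent itemset and the support counts of all its subsets are already known, the counts in `support_counts` are illustrative assumptions, and `generate_rules` is a hypothetical name. The sketch iterates over proper subsets only, since s = l would leave an empty right-hand side.

```python
from itertools import combinations

def generate_rules(frequent_itemset, support_counts, min_conf):
    """Return (s, l - s, confidence) for every rule s -> (l - s) with
    support_count(l) / support_count(s) >= min_conf."""
    l = frozenset(frequent_itemset)
    rules = []
    # Step 1: enumerate the nonempty proper subsets s of l.
    for size in range(1, len(l)):
        for subset in combinations(sorted(l), size):
            s = frozenset(subset)
            # Step 2: keep the rule if its confidence reaches the threshold.
            conf = support_counts[l] / support_counts[s]
            if conf >= min_conf:
                rules.append((sorted(s), sorted(l - s), conf))
    return rules

# Assumed support counts for the frequent itemset {I1, I2, I5} and its subsets.
support_counts = {
    frozenset({"I1"}): 6,
    frozenset({"I2"}): 7,
    frozenset({"I5"}): 2,
    frozenset({"I1", "I2"}): 4,
    frozenset({"I1", "I5"}): 2,
    frozenset({"I2", "I5"}): 2,
    frozenset({"I1", "I2", "I5"}): 2,
}

rules = generate_rules({"I1", "I2", "I5"}, support_counts, min_conf=0.7)
for antecedent, consequent, conf in rules:
    print(antecedent, "->", consequent, f"confidence={conf:.2f}")
```

With these assumed counts and a threshold of 0.7, only the three rules whose left-hand side contains I5 are kept; lowering `min_conf` to 0.5 would also admit {I1, I2} → {I5}.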
  • 15. Summary: Basic topics: data mining, data cleaning, warehouse, OLAP. Terms: association, item set, support, confidence.
  • 16. References: Data Mining: Concepts and Techniques by J. Han & M. Kamber; Database System Concepts by Abraham Silberschatz, Korth, and Sudarshan; lecture of Dr. S. Srinath, Institute of Technology at Madras, India