SlideShare a Scribd company logo

Chapter 13. Trends and Research Frontiers in Data Mining.ppt

Jiawei Han, Micheline Kamber and Jian Pei Data Mining: Concepts and Techniques, 3rd ed. The Morgan Kaufmann Series in Data Management Systems Morgan Kaufmann Publishers, July 2011. ISBN 978-0123814791

1 of 52
Download to read offline
Data Mining:
Concepts and Techniques
(3rd ed.)
Chapter 13
Subrata Kumer Paul
Assistant Professor, Dept. of CSE, BAUET
sksubrata96@gmail.com
Chapter 13. Trends and Research Frontiers in Data Mining.ppt
3
Chapter 13: Data Mining Trends and
Research Frontiers
 Mining Complex Types of Data
 Other Methodologies of Data Mining
 Data Mining Applications
 Data Mining and Society
 Data Mining Trends
 Summary
4
Mining Complex Types of Data
 Mining Sequence Data
 Mining Time Series
 Mining Symbolic Sequences
 Mining Biological Sequences
 Mining Graphs and Networks
 Mining Other Kinds of Data
5
Mining Sequence Data
 Similarity Search in Time Series Data
 Subsequence match, dimensionality reduction, query-based similarity
search, motif-based similarity search
 Regression and Trend Analysis in Time-Series Data
 long term + cyclic + seasonal variation + random movements
 Sequential Pattern Mining in Symbolic Sequences
 GSP, PrefixSpan, constraint-based sequential pattern mining
 Sequence Classification
 Feature-based vs. sequence-distance-based vs. model-based
 Alignment of Biological Sequences
 Pair-wise vs. multi-sequence alignment, substitution matirces, BLAST
 Hidden Markov Model for Biological Sequence Analysis
 Markov chain vs. hidden Markov models, forward vs. Viterbi vs. Baum-
Welch algorithms
6
Mining Graphs and Networks
 Graph Pattern Mining
 Frequent subgraph patterns, closed graph patterns, gSpan vs. CloseGraph
 Statistical Modeling of Networks
 Small world phenomenon, power law (log-tail) distribution, densification
 Clustering and Classification of Graphs and Homogeneous Networks
 Clustering: Fast Modularity vs. SCAN
 Classification: model vs. pattern-based mining
 Clustering, Ranking and Classification of Heterogeneous Networks
 RankClus, RankClass, and meta path-based, user-guided methodology
 Role Discovery and Link Prediction in Information Networks
 PathPredict
 Similarity Search and OLAP in Information Networks: PathSim, GraphCube
 Evolution of Social and Information Networks: EvoNetClus

Recommended

ML-Ops how to bring your data science to production
ML-Ops  how to bring your data science to productionML-Ops  how to bring your data science to production
ML-Ops how to bring your data science to productionHerman Wu
 
What Is Hadoop? | What Is Big Data & Hadoop | Introduction To Hadoop | Hadoop...
What Is Hadoop? | What Is Big Data & Hadoop | Introduction To Hadoop | Hadoop...What Is Hadoop? | What Is Big Data & Hadoop | Introduction To Hadoop | Hadoop...
What Is Hadoop? | What Is Big Data & Hadoop | Introduction To Hadoop | Hadoop...Simplilearn
 
Data Mining: Concepts and Techniques (3rd ed.) — Chapter 5
Data Mining:  Concepts and Techniques (3rd ed.)— Chapter 5 Data Mining:  Concepts and Techniques (3rd ed.)— Chapter 5
Data Mining: Concepts and Techniques (3rd ed.) — Chapter 5 Salah Amean
 
Probabilistic information retrieval models & systems
Probabilistic information retrieval models & systemsProbabilistic information retrieval models & systems
Probabilistic information retrieval models & systemsSelman Bozkır
 
WEKA: Output Knowledge Representation
WEKA: Output Knowledge RepresentationWEKA: Output Knowledge Representation
WEKA: Output Knowledge RepresentationDataminingTools Inc
 
Databricks: A Tool That Empowers You To Do More With Data
Databricks: A Tool That Empowers You To Do More With DataDatabricks: A Tool That Empowers You To Do More With Data
Databricks: A Tool That Empowers You To Do More With DataDatabricks
 

More Related Content

What's hot

Etl - Extract Transform Load
Etl - Extract Transform LoadEtl - Extract Transform Load
Etl - Extract Transform LoadABDUL KHALIQ
 
Tdm information retrieval
Tdm information retrievalTdm information retrieval
Tdm information retrievalKU Leuven
 
Design Patterns - General Introduction
Design Patterns - General IntroductionDesign Patterns - General Introduction
Design Patterns - General IntroductionAsma CHERIF
 
Introduction to WebSockets Presentation
Introduction to WebSockets PresentationIntroduction to WebSockets Presentation
Introduction to WebSockets PresentationJulien LaPointe
 
DBpedia Tutorial - Feb 2015, Dublin
DBpedia Tutorial - Feb 2015, DublinDBpedia Tutorial - Feb 2015, Dublin
DBpedia Tutorial - Feb 2015, Dublinm_ackermann
 
Java design patterns
Java design patternsJava design patterns
Java design patternsShawn Brito
 
Azure Data Factory Data Flows Training v005
Azure Data Factory Data Flows Training v005Azure Data Factory Data Flows Training v005
Azure Data Factory Data Flows Training v005Mark Kromer
 
Introduction To Hadoop | What Is Hadoop And Big Data | Hadoop Tutorial For Be...
Introduction To Hadoop | What Is Hadoop And Big Data | Hadoop Tutorial For Be...Introduction To Hadoop | What Is Hadoop And Big Data | Hadoop Tutorial For Be...
Introduction To Hadoop | What Is Hadoop And Big Data | Hadoop Tutorial For Be...Simplilearn
 
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & KamberChapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kambererror007
 
Top 10 Cypher Tuning Tips & Tricks
Top 10 Cypher Tuning Tips & TricksTop 10 Cypher Tuning Tips & Tricks
Top 10 Cypher Tuning Tips & TricksNeo4j
 

What's hot (20)

Java Servlets
Java ServletsJava Servlets
Java Servlets
 
Etl - Extract Transform Load
Etl - Extract Transform LoadEtl - Extract Transform Load
Etl - Extract Transform Load
 
Inverted index
Inverted indexInverted index
Inverted index
 
Tdm information retrieval
Tdm information retrievalTdm information retrieval
Tdm information retrieval
 
Design Patterns - General Introduction
Design Patterns - General IntroductionDesign Patterns - General Introduction
Design Patterns - General Introduction
 
03 preprocessing
03 preprocessing03 preprocessing
03 preprocessing
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
Introduction to WebSockets Presentation
Introduction to WebSockets PresentationIntroduction to WebSockets Presentation
Introduction to WebSockets Presentation
 
DBpedia Tutorial - Feb 2015, Dublin
DBpedia Tutorial - Feb 2015, DublinDBpedia Tutorial - Feb 2015, Dublin
DBpedia Tutorial - Feb 2015, Dublin
 
Graph databases
Graph databasesGraph databases
Graph databases
 
PPT on Hadoop
PPT on HadoopPPT on Hadoop
PPT on Hadoop
 
Java design patterns
Java design patternsJava design patterns
Java design patterns
 
Azure Data Factory Data Flows Training v005
Azure Data Factory Data Flows Training v005Azure Data Factory Data Flows Training v005
Azure Data Factory Data Flows Training v005
 
mendely.pptx
mendely.pptxmendely.pptx
mendely.pptx
 
Introduction To Hadoop | What Is Hadoop And Big Data | Hadoop Tutorial For Be...
Introduction To Hadoop | What Is Hadoop And Big Data | Hadoop Tutorial For Be...Introduction To Hadoop | What Is Hadoop And Big Data | Hadoop Tutorial For Be...
Introduction To Hadoop | What Is Hadoop And Big Data | Hadoop Tutorial For Be...
 
REST & RESTful Web Services
REST & RESTful Web ServicesREST & RESTful Web Services
REST & RESTful Web Services
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
Introduction
IntroductionIntroduction
Introduction
 
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & KamberChapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
 
Top 10 Cypher Tuning Tips & Tricks
Top 10 Cypher Tuning Tips & TricksTop 10 Cypher Tuning Tips & Tricks
Top 10 Cypher Tuning Tips & Tricks
 

Similar to Chapter 13. Trends and Research Frontiers in Data Mining.ppt

Chaper 13 trend, Han & Kamber
Chaper 13 trend, Han & KamberChaper 13 trend, Han & Kamber
Chaper 13 trend, Han & KamberHouw Liong The
 
Data Mining: Concepts and techniques: Chapter 13 trend
Data Mining: Concepts and techniques: Chapter 13 trendData Mining: Concepts and techniques: Chapter 13 trend
Data Mining: Concepts and techniques: Chapter 13 trendSalah Amean
 
Data Mining Application and Trends
Data Mining Application and TrendsData Mining Application and Trends
Data Mining Application and TrendsVijayasankariS
 
Unit 1 (Chapter-1) on data mining concepts.ppt
Unit 1 (Chapter-1) on data mining concepts.pptUnit 1 (Chapter-1) on data mining concepts.ppt
Unit 1 (Chapter-1) on data mining concepts.pptPadmajaLaksh
 
Introduction.ppt
Introduction.pptIntroduction.ppt
Introduction.pptbommaiah
 
Introduction to Data Mining
Introduction to Data MiningIntroduction to Data Mining
Introduction to Data MiningAbcdDcba12
 
An Overview Of Data Mining Techniques And Applications.Pdf
An Overview Of Data Mining Techniques And Applications.PdfAn Overview Of Data Mining Techniques And Applications.Pdf
An Overview Of Data Mining Techniques And Applications.PdfMichele Thomas
 
Chapter 1. Introduction
Chapter 1. IntroductionChapter 1. Introduction
Chapter 1. Introductionbutest
 
Data mining concepts and work
Data mining concepts and workData mining concepts and work
Data mining concepts and workAmr Abd El Latief
 
Additional themes of data mining for Msc CS
Additional themes of data mining for Msc CSAdditional themes of data mining for Msc CS
Additional themes of data mining for Msc CSThanveen
 
Knowledge discovery thru data mining
Knowledge discovery thru data miningKnowledge discovery thru data mining
Knowledge discovery thru data miningDevakumar Jain
 

Similar to Chapter 13. Trends and Research Frontiers in Data Mining.ppt (20)

13 trend
13 trend13 trend
13 trend
 
Chaper 13 trend, Han & Kamber
Chaper 13 trend, Han & KamberChaper 13 trend, Han & Kamber
Chaper 13 trend, Han & Kamber
 
Data Mining: Concepts and techniques: Chapter 13 trend
Data Mining: Concepts and techniques: Chapter 13 trendData Mining: Concepts and techniques: Chapter 13 trend
Data Mining: Concepts and techniques: Chapter 13 trend
 
Data mininng trends
Data mininng trendsData mininng trends
Data mininng trends
 
Data Mining Application and Trends
Data Mining Application and TrendsData Mining Application and Trends
Data Mining Application and Trends
 
Chaper 13 trend
Chaper 13 trendChaper 13 trend
Chaper 13 trend
 
Chapter 1. Introduction.ppt
Chapter 1. Introduction.pptChapter 1. Introduction.ppt
Chapter 1. Introduction.ppt
 
Introduction to data warehouse
Introduction to data warehouseIntroduction to data warehouse
Introduction to data warehouse
 
Unit 1 (Chapter-1) on data mining concepts.ppt
Unit 1 (Chapter-1) on data mining concepts.pptUnit 1 (Chapter-1) on data mining concepts.ppt
Unit 1 (Chapter-1) on data mining concepts.ppt
 
Introduction.ppt
Introduction.pptIntroduction.ppt
Introduction.ppt
 
unit 1 DATA MINING.ppt
unit 1 DATA MINING.pptunit 1 DATA MINING.ppt
unit 1 DATA MINING.ppt
 
Introduction to Data Mining
Introduction to Data MiningIntroduction to Data Mining
Introduction to Data Mining
 
An Overview Of Data Mining Techniques And Applications.Pdf
An Overview Of Data Mining Techniques And Applications.PdfAn Overview Of Data Mining Techniques And Applications.Pdf
An Overview Of Data Mining Techniques And Applications.Pdf
 
Chapter 1. Introduction
Chapter 1. IntroductionChapter 1. Introduction
Chapter 1. Introduction
 
Data mining concepts and work
Data mining concepts and workData mining concepts and work
Data mining concepts and work
 
G045033841
G045033841G045033841
G045033841
 
Additional themes of data mining for Msc CS
Additional themes of data mining for Msc CSAdditional themes of data mining for Msc CS
Additional themes of data mining for Msc CS
 
Data mining 1
Data mining 1Data mining 1
Data mining 1
 
Talk
TalkTalk
Talk
 
Knowledge discovery thru data mining
Knowledge discovery thru data miningKnowledge discovery thru data mining
Knowledge discovery thru data mining
 

More from Subrata Kumer Paul

Chapter 9. Classification Advanced Methods.ppt
Chapter 9. Classification Advanced Methods.pptChapter 9. Classification Advanced Methods.ppt
Chapter 9. Classification Advanced Methods.pptSubrata Kumer Paul
 
Chapter 8. Classification Basic Concepts.ppt
Chapter 8. Classification Basic Concepts.pptChapter 8. Classification Basic Concepts.ppt
Chapter 8. Classification Basic Concepts.pptSubrata Kumer Paul
 
Chapter 12. Outlier Detection.ppt
Chapter 12. Outlier Detection.pptChapter 12. Outlier Detection.ppt
Chapter 12. Outlier Detection.pptSubrata Kumer Paul
 
Chapter 7. Advanced Frequent Pattern Mining.ppt
Chapter 7. Advanced Frequent Pattern Mining.pptChapter 7. Advanced Frequent Pattern Mining.ppt
Chapter 7. Advanced Frequent Pattern Mining.pptSubrata Kumer Paul
 
Chapter 11. Cluster Analysis Advanced Methods.ppt
Chapter 11. Cluster Analysis Advanced Methods.pptChapter 11. Cluster Analysis Advanced Methods.ppt
Chapter 11. Cluster Analysis Advanced Methods.pptSubrata Kumer Paul
 
Chapter 10. Cluster Analysis Basic Concepts and Methods.ppt
Chapter 10. Cluster Analysis Basic Concepts and Methods.pptChapter 10. Cluster Analysis Basic Concepts and Methods.ppt
Chapter 10. Cluster Analysis Basic Concepts and Methods.pptSubrata Kumer Paul
 
Chapter 6. Mining Frequent Patterns, Associations and Correlations Basic Conc...
Chapter 6. Mining Frequent Patterns, Associations and Correlations Basic Conc...Chapter 6. Mining Frequent Patterns, Associations and Correlations Basic Conc...
Chapter 6. Mining Frequent Patterns, Associations and Correlations Basic Conc...Subrata Kumer Paul
 
Chapter 5. Data Cube Technology.ppt
Chapter 5. Data Cube Technology.pptChapter 5. Data Cube Technology.ppt
Chapter 5. Data Cube Technology.pptSubrata Kumer Paul
 
Chapter 3. Data Preprocessing.ppt
Chapter 3. Data Preprocessing.pptChapter 3. Data Preprocessing.ppt
Chapter 3. Data Preprocessing.pptSubrata Kumer Paul
 
Chapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Chapter 4. Data Warehousing and On-Line Analytical Processing.pptChapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Chapter 4. Data Warehousing and On-Line Analytical Processing.pptSubrata Kumer Paul
 
Data Mining Lecture_10(b).pptx
Data Mining Lecture_10(b).pptxData Mining Lecture_10(b).pptx
Data Mining Lecture_10(b).pptxSubrata Kumer Paul
 

More from Subrata Kumer Paul (20)

Chapter 9. Classification Advanced Methods.ppt
Chapter 9. Classification Advanced Methods.pptChapter 9. Classification Advanced Methods.ppt
Chapter 9. Classification Advanced Methods.ppt
 
Chapter 8. Classification Basic Concepts.ppt
Chapter 8. Classification Basic Concepts.pptChapter 8. Classification Basic Concepts.ppt
Chapter 8. Classification Basic Concepts.ppt
 
Chapter 2. Know Your Data.ppt
Chapter 2. Know Your Data.pptChapter 2. Know Your Data.ppt
Chapter 2. Know Your Data.ppt
 
Chapter 12. Outlier Detection.ppt
Chapter 12. Outlier Detection.pptChapter 12. Outlier Detection.ppt
Chapter 12. Outlier Detection.ppt
 
Chapter 7. Advanced Frequent Pattern Mining.ppt
Chapter 7. Advanced Frequent Pattern Mining.pptChapter 7. Advanced Frequent Pattern Mining.ppt
Chapter 7. Advanced Frequent Pattern Mining.ppt
 
Chapter 11. Cluster Analysis Advanced Methods.ppt
Chapter 11. Cluster Analysis Advanced Methods.pptChapter 11. Cluster Analysis Advanced Methods.ppt
Chapter 11. Cluster Analysis Advanced Methods.ppt
 
Chapter 10. Cluster Analysis Basic Concepts and Methods.ppt
Chapter 10. Cluster Analysis Basic Concepts and Methods.pptChapter 10. Cluster Analysis Basic Concepts and Methods.ppt
Chapter 10. Cluster Analysis Basic Concepts and Methods.ppt
 
Chapter 6. Mining Frequent Patterns, Associations and Correlations Basic Conc...
Chapter 6. Mining Frequent Patterns, Associations and Correlations Basic Conc...Chapter 6. Mining Frequent Patterns, Associations and Correlations Basic Conc...
Chapter 6. Mining Frequent Patterns, Associations and Correlations Basic Conc...
 
Chapter 5. Data Cube Technology.ppt
Chapter 5. Data Cube Technology.pptChapter 5. Data Cube Technology.ppt
Chapter 5. Data Cube Technology.ppt
 
Chapter 3. Data Preprocessing.ppt
Chapter 3. Data Preprocessing.pptChapter 3. Data Preprocessing.ppt
Chapter 3. Data Preprocessing.ppt
 
Chapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Chapter 4. Data Warehousing and On-Line Analytical Processing.pptChapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Chapter 4. Data Warehousing and On-Line Analytical Processing.ppt
 
Data Mining Lecture_8(a).pptx
Data Mining Lecture_8(a).pptxData Mining Lecture_8(a).pptx
Data Mining Lecture_8(a).pptx
 
Data Mining Lecture_9.pptx
Data Mining Lecture_9.pptxData Mining Lecture_9.pptx
Data Mining Lecture_9.pptx
 
Data Mining Lecture_7.pptx
Data Mining Lecture_7.pptxData Mining Lecture_7.pptx
Data Mining Lecture_7.pptx
 
Data Mining Lecture_10(b).pptx
Data Mining Lecture_10(b).pptxData Mining Lecture_10(b).pptx
Data Mining Lecture_10(b).pptx
 
Data Mining Lecture_8(b).pptx
Data Mining Lecture_8(b).pptxData Mining Lecture_8(b).pptx
Data Mining Lecture_8(b).pptx
 
Data Mining Lecture_6.pptx
Data Mining Lecture_6.pptxData Mining Lecture_6.pptx
Data Mining Lecture_6.pptx
 
Data Mining Lecture_11.pptx
Data Mining Lecture_11.pptxData Mining Lecture_11.pptx
Data Mining Lecture_11.pptx
 
Data Mining Lecture_4.pptx
Data Mining Lecture_4.pptxData Mining Lecture_4.pptx
Data Mining Lecture_4.pptx
 
Data Mining Lecture_12.pptx
Data Mining Lecture_12.pptxData Mining Lecture_12.pptx
Data Mining Lecture_12.pptx
 

Recently uploaded

Series of training sessions by our experts for you to provide necessary insig...
Series of training sessions by our experts for you to provide necessary insig...Series of training sessions by our experts for you to provide necessary insig...
Series of training sessions by our experts for you to provide necessary insig...AshishChanchal1
 
PM24_Oral_Presentation_Template_Guidelines.pptx
PM24_Oral_Presentation_Template_Guidelines.pptxPM24_Oral_Presentation_Template_Guidelines.pptx
PM24_Oral_Presentation_Template_Guidelines.pptxnissamant
 
Pointers and Array, pointer and String.pptx
Pointers and Array, pointer and String.pptxPointers and Array, pointer and String.pptx
Pointers and Array, pointer and String.pptxAnanthi Palanisamy
 
Forged Fitting Socket Welding Standard- ASME-B16.11-2001.pdf
Forged Fitting Socket Welding Standard- ASME-B16.11-2001.pdfForged Fitting Socket Welding Standard- ASME-B16.11-2001.pdf
Forged Fitting Socket Welding Standard- ASME-B16.11-2001.pdfVikasKumar11936
 
FSMR - FIRE SAFETY MAINTENANCE REPORT FORM.pdf
FSMR - FIRE SAFETY MAINTENANCE REPORT FORM.pdfFSMR - FIRE SAFETY MAINTENANCE REPORT FORM.pdf
FSMR - FIRE SAFETY MAINTENANCE REPORT FORM.pdfArnold Saludares
 
Lesson2 Stoichiometry and mass balance.pdf
Lesson2 Stoichiometry and mass balance.pdfLesson2 Stoichiometry and mass balance.pdf
Lesson2 Stoichiometry and mass balance.pdff1002753214
 
Basic Concepts of Material Science for Electrical and Electronic Materials ...
Basic Concepts of Material Science for  Electrical and Electronic Materials  ...Basic Concepts of Material Science for  Electrical and Electronic Materials  ...
Basic Concepts of Material Science for Electrical and Electronic Materials ...PeopleFinder
 
Energy Efficient Social Housing for One Manchester
Energy Efficient Social Housing for One ManchesterEnergy Efficient Social Housing for One Manchester
Energy Efficient Social Housing for One Manchestermark alegbe
 
HB Self-Body characteristics UHV understanding
HB Self-Body characteristics UHV understandingHB Self-Body characteristics UHV understanding
HB Self-Body characteristics UHV understandingLeoRaju4
 
Laser And its Application's - Engineering Physics
Laser And its Application's - Engineering PhysicsLaser And its Application's - Engineering Physics
Laser And its Application's - Engineering PhysicsPurva Nikam
 
Microservices: Benefits, drawbacks and are they for me?
Microservices: Benefits, drawbacks and are they for me?Microservices: Benefits, drawbacks and are they for me?
Microservices: Benefits, drawbacks and are they for me?Marian Marinov
 
Basic Instrumentation Symbols | P&ID | PFD | Gaurav Singh Rajput
Basic Instrumentation Symbols | P&ID | PFD | Gaurav Singh RajputBasic Instrumentation Symbols | P&ID | PFD | Gaurav Singh Rajput
Basic Instrumentation Symbols | P&ID | PFD | Gaurav Singh RajputGaurav Singh Rajput
 
Checklist to troubleshoot CD moisture profiles.docx
Checklist to troubleshoot CD moisture profiles.docxChecklist to troubleshoot CD moisture profiles.docx
Checklist to troubleshoot CD moisture profiles.docxNoman khan
 
CDE_Sustainability Performance_20240214.pdf
CDE_Sustainability Performance_20240214.pdfCDE_Sustainability Performance_20240214.pdf
CDE_Sustainability Performance_20240214.pdf8-koi
 
21 SCHEME_21EC53_VTU_MODULE-4_COMPUTER COMMUNCATION NETWORK.pdf
21 SCHEME_21EC53_VTU_MODULE-4_COMPUTER COMMUNCATION NETWORK.pdf21 SCHEME_21EC53_VTU_MODULE-4_COMPUTER COMMUNCATION NETWORK.pdf
21 SCHEME_21EC53_VTU_MODULE-4_COMPUTER COMMUNCATION NETWORK.pdfDr. Shivashankar
 
CHAPTER 1_ HTML and CSS Introduction Module
CHAPTER 1_ HTML and CSS Introduction ModuleCHAPTER 1_ HTML and CSS Introduction Module
CHAPTER 1_ HTML and CSS Introduction Modulessusera4f8281
 
B111_FTS_2011.06.01 Fuel Tank Safety.pdf
B111_FTS_2011.06.01 Fuel Tank Safety.pdfB111_FTS_2011.06.01 Fuel Tank Safety.pdf
B111_FTS_2011.06.01 Fuel Tank Safety.pdfKhoiTruong19
 
The Crystal London Arsitektur Hijau Dinda
The Crystal London Arsitektur Hijau DindaThe Crystal London Arsitektur Hijau Dinda
The Crystal London Arsitektur Hijau Dindadindapebriani27
 

Recently uploaded (20)

Series of training sessions by our experts for you to provide necessary insig...
Series of training sessions by our experts for you to provide necessary insig...Series of training sessions by our experts for you to provide necessary insig...
Series of training sessions by our experts for you to provide necessary insig...
 
PM24_Oral_Presentation_Template_Guidelines.pptx
PM24_Oral_Presentation_Template_Guidelines.pptxPM24_Oral_Presentation_Template_Guidelines.pptx
PM24_Oral_Presentation_Template_Guidelines.pptx
 
Pointers and Array, pointer and String.pptx
Pointers and Array, pointer and String.pptxPointers and Array, pointer and String.pptx
Pointers and Array, pointer and String.pptx
 
Forged Fitting Socket Welding Standard- ASME-B16.11-2001.pdf
Forged Fitting Socket Welding Standard- ASME-B16.11-2001.pdfForged Fitting Socket Welding Standard- ASME-B16.11-2001.pdf
Forged Fitting Socket Welding Standard- ASME-B16.11-2001.pdf
 
FSMR - FIRE SAFETY MAINTENANCE REPORT FORM.pdf
FSMR - FIRE SAFETY MAINTENANCE REPORT FORM.pdfFSMR - FIRE SAFETY MAINTENANCE REPORT FORM.pdf
FSMR - FIRE SAFETY MAINTENANCE REPORT FORM.pdf
 
Présentation IIRB 2024 M.Campoverde R.Duval
Présentation IIRB 2024 M.Campoverde R.DuvalPrésentation IIRB 2024 M.Campoverde R.Duval
Présentation IIRB 2024 M.Campoverde R.Duval
 
Lesson2 Stoichiometry and mass balance.pdf
Lesson2 Stoichiometry and mass balance.pdfLesson2 Stoichiometry and mass balance.pdf
Lesson2 Stoichiometry and mass balance.pdf
 
Basic Concepts of Material Science for Electrical and Electronic Materials ...
Basic Concepts of Material Science for  Electrical and Electronic Materials  ...Basic Concepts of Material Science for  Electrical and Electronic Materials  ...
Basic Concepts of Material Science for Electrical and Electronic Materials ...
 
Energy Efficient Social Housing for One Manchester
Energy Efficient Social Housing for One ManchesterEnergy Efficient Social Housing for One Manchester
Energy Efficient Social Housing for One Manchester
 
Présentation de F. Joudelat Congrès IIRB février 2024
Présentation de F. Joudelat Congrès IIRB février 2024Présentation de F. Joudelat Congrès IIRB février 2024
Présentation de F. Joudelat Congrès IIRB février 2024
 
HB Self-Body characteristics UHV understanding
HB Self-Body characteristics UHV understandingHB Self-Body characteristics UHV understanding
HB Self-Body characteristics UHV understanding
 
Laser And its Application's - Engineering Physics
Laser And its Application's - Engineering PhysicsLaser And its Application's - Engineering Physics
Laser And its Application's - Engineering Physics
 
Microservices: Benefits, drawbacks and are they for me?
Microservices: Benefits, drawbacks and are they for me?Microservices: Benefits, drawbacks and are they for me?
Microservices: Benefits, drawbacks and are they for me?
 
Basic Instrumentation Symbols | P&ID | PFD | Gaurav Singh Rajput
Basic Instrumentation Symbols | P&ID | PFD | Gaurav Singh RajputBasic Instrumentation Symbols | P&ID | PFD | Gaurav Singh Rajput
Basic Instrumentation Symbols | P&ID | PFD | Gaurav Singh Rajput
 
Checklist to troubleshoot CD moisture profiles.docx
Checklist to troubleshoot CD moisture profiles.docxChecklist to troubleshoot CD moisture profiles.docx
Checklist to troubleshoot CD moisture profiles.docx
 
CDE_Sustainability Performance_20240214.pdf
CDE_Sustainability Performance_20240214.pdfCDE_Sustainability Performance_20240214.pdf
CDE_Sustainability Performance_20240214.pdf
 
21 SCHEME_21EC53_VTU_MODULE-4_COMPUTER COMMUNCATION NETWORK.pdf
21 SCHEME_21EC53_VTU_MODULE-4_COMPUTER COMMUNCATION NETWORK.pdf21 SCHEME_21EC53_VTU_MODULE-4_COMPUTER COMMUNCATION NETWORK.pdf
21 SCHEME_21EC53_VTU_MODULE-4_COMPUTER COMMUNCATION NETWORK.pdf
 
CHAPTER 1_ HTML and CSS Introduction Module
CHAPTER 1_ HTML and CSS Introduction ModuleCHAPTER 1_ HTML and CSS Introduction Module
CHAPTER 1_ HTML and CSS Introduction Module
 
B111_FTS_2011.06.01 Fuel Tank Safety.pdf
B111_FTS_2011.06.01 Fuel Tank Safety.pdfB111_FTS_2011.06.01 Fuel Tank Safety.pdf
B111_FTS_2011.06.01 Fuel Tank Safety.pdf
 
The Crystal London Arsitektur Hijau Dinda
The Crystal London Arsitektur Hijau DindaThe Crystal London Arsitektur Hijau Dinda
The Crystal London Arsitektur Hijau Dinda
 

Chapter 13. Trends and Research Frontiers in Data Mining.ppt

  • 1. Data Mining: Concepts and Techniques (3rd ed.) Chapter 13 Subrata Kumer Paul Assistant Professor, Dept. of CSE, BAUET sksubrata96@gmail.com
  • 3. 3 Chapter 13: Data Mining Trends and Research Frontiers  Mining Complex Types of Data  Other Methodologies of Data Mining  Data Mining Applications  Data Mining and Society  Data Mining Trends  Summary
  • 4. 4 Mining Complex Types of Data  Mining Sequence Data  Mining Time Series  Mining Symbolic Sequences  Mining Biological Sequences  Mining Graphs and Networks  Mining Other Kinds of Data
  • 5. 5 Mining Sequence Data  Similarity Search in Time Series Data  Subsequence match, dimensionality reduction, query-based similarity search, motif-based similarity search  Regression and Trend Analysis in Time-Series Data  long term + cyclic + seasonal variation + random movements  Sequential Pattern Mining in Symbolic Sequences  GSP, PrefixSpan, constraint-based sequential pattern mining  Sequence Classification  Feature-based vs. sequence-distance-based vs. model-based  Alignment of Biological Sequences  Pair-wise vs. multi-sequence alignment, substitution matirces, BLAST  Hidden Markov Model for Biological Sequence Analysis  Markov chain vs. hidden Markov models, forward vs. Viterbi vs. Baum- Welch algorithms
  • 6. 6 Mining Graphs and Networks  Graph Pattern Mining  Frequent subgraph patterns, closed graph patterns, gSpan vs. CloseGraph  Statistical Modeling of Networks  Small world phenomenon, power law (log-tail) distribution, densification  Clustering and Classification of Graphs and Homogeneous Networks  Clustering: Fast Modularity vs. SCAN  Classification: model vs. pattern-based mining  Clustering, Ranking and Classification of Heterogeneous Networks  RankClus, RankClass, and meta path-based, user-guided methodology  Role Discovery and Link Prediction in Information Networks  PathPredict  Similarity Search and OLAP in Information Networks: PathSim, GraphCube  Evolution of Social and Information Networks: EvoNetClus
  • 7. 7 Mining Other Kinds of Data  Mining Spatial Data  Spatial frequent/co-located patterns, spatial clustering and classification  Mining Spatiotemporal and Moving Object Data  Spatiotemporal data mining, trajectory mining, periodica, swarm, …  Mining Cyber-Physical System Data  Applications: healthcare, air-traffic control, flood simulation  Mining Multimedia Data  Social media data, geo-tagged spatial clustering, periodicity analysis, …  Mining Text Data  Topic modeling, i-topic model, integration with geo- and networked data  Mining Web Data  Web content, web structure, and web usage mining  Mining Data Streams  Dynamics, one-pass, patterns, clustering, classification, outlier detection
  • 8. 8 Chapter 13: Data Mining Trends and Research Frontiers  Mining Complex Types of Data  Other Methodologies of Data Mining  Data Mining Applications  Data Mining and Society  Data Mining Trends  Summary
  • 9. 9 Other Methodologies of Data Mining  Statistical Data Mining  Views on Data Mining Foundations  Visual and Audio Data Mining
  • 10. 10 Major Statistical Data Mining Methods  Regression  Generalized Linear Model  Analysis of Variance  Mixed-Effect Models  Factor Analysis  Discriminant Analysis  Survival Analysis
  • 11. 11 Statistical Data Mining (1)  There are many well-established statistical techniques for data analysis, particularly for numeric data  applied extensively to data from scientific experiments and data from economics and the social sciences  Regression  predict the value of a response (dependent) variable from one or more predictor (independent) variables where the variables are numeric  forms of regression: linear, multiple, weighted, polynomial, nonparametric, and robust
  • 12. 12 Scientific and Statistical Data Mining (2)  Generalized linear models  allow a categorical response variable (or some transformation of it) to be related to a set of predictor variables  similar to the modeling of a numeric response variable using linear regression  include logistic regression and Poisson regression  Mixed-effect models  For analyzing grouped data, i.e. data that can be classified according to one or more grouping variables  Typically describe relationships between a response variable and some covariates in data grouped according to one or more factors
  • 13. 13 Scientific and Statistical Data Mining (3)  Regression trees  Binary trees used for classification and prediction  Similar to decision trees:Tests are performed at the internal nodes  In a regression tree the mean of the objective attribute is computed and used as the predicted value  Analysis of variance  Analyze experimental data for two or more populations described by a numeric response variable and one or more categorical variables (factors)
  • 14. 14 Statistical Data Mining (4)  Factor analysis  determine which variables are combined to generate a given factor  e.g., for many psychiatric data, one can indirectly measure other quantities (such as test scores) that reflect the factor of interest  Discriminant analysis  predict a categorical response variable, commonly used in social science  Attempts to determine several discriminant functions (linear combinations of the independent variables) that discriminate among the groups defined by the response variable www.spss.com/datamine/factor.htm
  • 15. 15 Statistical Data Mining (5)  Time series: many methods such as autoregression, ARIMA (Autoregressive integrated moving-average modeling), long memory time-series modeling  Quality control: displays group summary charts  Survival analysis  Predicts the probability that a patient undergoing a medical treatment would survive at least to time t (life span prediction)
  • 16. 16 Other Methodologies of Data Mining  Statistical Data Mining  Views on Data Mining Foundations  Visual and Audio Data Mining
  • 17. 17 Views on Data Mining Foundations (I)  Data reduction  Basis of data mining: Reduce data representation  Trades accuracy for speed in response  Data compression  Basis of data mining: Compress the given data by encoding in terms of bits, association rules, decision trees, clusters, etc.  Probability and statistical theory  Basis of data mining: Discover joint probability distributions of random variables
  • 18. 18  Microeconomic view  A view of utility: Finding patterns that are interesting only to the extent in that they can be used in the decision-making process of some enterprise  Pattern Discovery and Inductive databases  Basis of data mining: Discover patterns occurring in the database, such as associations, classification models, sequential patterns, etc.  Data mining is the problem of performing inductive logic on databases  The task is to query the data and the theory (i.e., patterns) of the database  Popular among many researchers in database systems Views on Data Mining Foundations (II)
  • 19. 19 Other Methodologies of Data Mining  Statistical Data Mining  Views on Data Mining Foundations  Visual and Audio Data Mining
  • 20. 20 Visual Data Mining  Visualization: Use of computer graphics to create visual images which aid in the understanding of complex, often massive representations of data  Visual Data Mining: discovering implicit but useful knowledge from large data sets using visualization techniques Computer Graphics High Performance Computing Pattern Recognition Human Computer Interfaces Multimedia Systems Visual Data Mining
  • 21. 21 Visualization  Purpose of Visualization  Gain insight into an information space by mapping data onto graphical primitives  Provide qualitative overview of large data sets  Search for patterns, trends, structure, irregularities, relationships among data.  Help find interesting regions and suitable parameters for further quantitative analysis.  Provide a visual proof of computer representations derived
  • 22. 22 Visual Data Mining & Data Visualization  Integration of visualization and data mining  data visualization  data mining result visualization  data mining process visualization  interactive visual data mining  Data visualization  Data in a database or data warehouse can be viewed  at different levels of abstraction  as different combinations of attributes or dimensions  Data can be presented in various visual forms
  • 23. 23 Data Mining Result Visualization  Presentation of the results or knowledge obtained from data mining in visual forms  Examples  Scatter plots and boxplots (obtained from descriptive data mining)  Decision trees  Association rules  Clusters  Outliers  Generalized rules
  • 24. 24 Boxplots from Statsoft: Multiple Variable Combinations
  • 25. 25 Visualization of Data Mining Results in SAS Enterprise Miner: Scatter Plots
  • 26. 26 Visualization of Association Rules in SGI/MineSet 3.0
  • 27. 27 Visualization of a Decision Tree in SGI/MineSet 3.0
  • 28. 28 Visualization of Cluster Grouping in IBM Intelligent Miner
  • 29. 29 Data Mining Process Visualization  Presentation of the various processes of data mining in visual forms so that users can see  Data extraction process  Where the data is extracted  How the data is cleaned, integrated, preprocessed, and mined  Method selected for data mining  Where the results are stored  How they may be viewed
  • 30. 30 Visualization of Data Mining Processes by Clementine Understand variations with visualized data See your solution discovery process clearly
  • 31. 31 Interactive Visual Data Mining  Using visualization tools in the data mining process to help users make smart data mining decisions  Example  Display the data distribution in a set of attributes using colored sectors or columns (depending on whether the whole space is represented by either a circle or a set of columns)  Use the display to which sector should first be selected for classification and where a good split point for this sector may be
  • 32. 32 Interactive Visual Mining by Perception-Based Classification (PBC)
  • 33. 33 Audio Data Mining  Uses audio signals to indicate the patterns of data or the features of data mining results  An interesting alternative to visual mining  An inverse task of mining audio (such as music) databases which is to find patterns from audio data  Visual data mining may disclose interesting patterns using graphical displays, but requires users to concentrate on watching patterns  Instead, transform patterns into sound and music and listen to pitches, rhythms, tune, and melody in order to identify anything interesting or unusual
  • 34. 34 Chapter 13: Data Mining Trends and Research Frontiers  Mining Complex Types of Data  Other Methodologies of Data Mining  Data Mining Applications  Data Mining and Society  Data Mining Trends  Summary
  • 35. 35 Data Mining Applications  Data mining: A young discipline with broad and diverse applications  There still exists a nontrivial gap between generic data mining methods and effective and scalable data mining tools for domain-specific applications  Some application domains (briefly discussed here)  Data Mining for Financial data analysis  Data Mining for Retail and Telecommunication Industries  Data Mining in Science and Engineering  Data Mining for Intrusion Detection and Prevention  Data Mining and Recommender Systems
  • 36. 36 Data Mining for Financial Data Analysis (I)  Financial data collected in banks and financial institutions are often relatively complete, reliable, and of high quality  Design and construction of data warehouses for multidimensional data analysis and data mining  View the debt and revenue changes by month, by region, by sector, and by other factors  Access statistical information such as max, min, total, average, trend, etc.  Loan payment prediction/consumer credit policy analysis  feature selection and attribute relevance ranking  Loan payment performance  Consumer credit rating
  • 37. 37  Classification and clustering of customers for targeted marketing  multidimensional segmentation by nearest-neighbor, classification, decision trees, etc. to identify customer groups or associate a new customer to an appropriate customer group  Detection of money laundering and other financial crimes  integration of from multiple DBs (e.g., bank transactions, federal/state crime history DBs)  Tools: data visualization, linkage analysis, classification, clustering tools, outlier analysis, and sequential pattern analysis tools (find unusual access sequences) Data Mining for Financial Data Analysis (II)
  • 38. 38 Data Mining for Retail & Telcomm. Industries (I)  Retail industry: huge amounts of data on sales, customer shopping history, e-commerce, etc.  Applications of retail data mining  Identify customer buying behaviors  Discover customer shopping patterns and trends  Improve the quality of customer service  Achieve better customer retention and satisfaction  Enhance goods consumption ratios  Design more effective goods transportation and distribution policies  Telcomm. and many other industries: Share many similar goals and expectations of retail data mining
  • 39. 39 Data Mining Practice for Retail Industry  Design and construction of data warehouses  Multidimensional analysis of sales, customers, products, time, and region  Analysis of the effectiveness of sales campaigns  Customer retention: Analysis of customer loyalty  Use customer loyalty card information to register sequences of purchases of particular customers  Use sequential pattern mining to investigate changes in customer consumption or loyalty  Suggest adjustments on the pricing and variety of goods  Product recommendation and cross-reference of items  Fraudulent analysis and the identification of usual patterns  Use of visualization tools in data analysis
  • 40. 40 Data Mining in Science and Engineering  Data warehouses and data preprocessing  Resolving inconsistencies or incompatible data collected in diverse environments and different periods (e.g. eco-system studies)  Mining complex data types  Spatiotemporal, biological, diverse semantics and relationships  Graph-based and network-based mining  Links, relationships, data flow, etc.  Visualization tools and domain-specific knowledge  Other issues  Data mining in social sciences and social studies: text and social media  Data mining in computer science: monitoring systems, software bugs, network intrusion
  • 41. 41 Data Mining for Intrusion Detection and Prevention  Majority of intrusion detection and prevention systems use  Signature-based detection: use signatures, attack patterns that are preconfigured and predetermined by domain experts  Anomaly-based detection: build profiles (models of normal behavior) and detect those that are substantially deviate from the profiles  What data mining can help  New data mining algorithms for intrusion detection  Association, correlation, and discriminative pattern analysis help select and build discriminative classifiers  Analysis of stream data: outlier detection, clustering, model shifting  Distributed data mining  Visualization and querying tools
  • 42. 42 Data Mining and Recommender Systems  Recommender systems: Personalization, making product recommendations that are likely to be of interest to a user  Approaches: Content-based, collaborative, or their hybrid  Content-based: Recommends items that are similar to items the user preferred or queried in the past  Collaborative filtering: Consider a user's social environment, opinions of other customers who have similar tastes or preferences  Data mining and recommender systems  Users C × items S: extract from known to unknown ratings to predict user-item combinations  Memory-based method often uses k-nearest neighbor approach  Model-based method uses a collection of ratings to learn a model (e.g., probabilistic models, clustering, Bayesian networks, etc.)  Hybrid approaches integrate both to improve performance (e.g., using ensemble)
  • 43. 43 Chapter 13: Data Mining Trends and Research Frontiers  Mining Complex Types of Data  Other Methodologies of Data Mining  Data Mining Applications  Data Mining and Society  Data Mining Trends  Summary
  • 44. 44 Ubiquitous and Invisible Data Mining  Ubiquitous Data Mining  Data mining is used everywhere, e.g., online shopping  Ex. Customer relationship management (CRM)  Invisible Data Mining  Invisible: Data mining functions are built in daily life operations  Ex. Google search: Users may be unaware that they are examining results returned by data  Invisible data mining is highly desirable  Invisible mining needs to consider efficiency and scalability, user interaction, incorporation of background knowledge and visualization techniques, finding interesting patterns, real-time, …  Further work: Integration of data mining into existing business and scientific technologies to provide domain-specific data mining tools
  • 45. 45 Privacy, Security and Social Impacts of Data Mining  Many data mining applications do not touch personal data  E.g., meteorology, astronomy, geography, geology, biology, and other scientific and engineering data  Many DM studies are on developing scalable algorithms to find general or statistically significant patterns, not touching individuals  The real privacy concern: unconstrained access of individual records, especially privacy-sensitive information  Method 1: Removing sensitive IDs associated with the data  Method 2: Data security-enhancing methods  Multi-level security model: permit to access to only authorized level  Encryption: e.g., blind signatures, biometric encryption, and anonymous databases (personal information is encrypted and stored at different locations)  Method 3: Privacy-preserving data mining methods
  • 46. 46 Privacy-Preserving Data Mining  Privacy-preserving (privacy-enhanced or privacy-sensitive) mining:  Obtaining valid mining results without disclosing the underlying sensitive data values  Often needs trade-off between information loss and privacy  Privacy-preserving data mining methods:  Randomization (e.g., perturbation): Add noise to the data in order to mask some attribute values of records  K-anonymity and l-diversity: Alter individual records so that they cannot be uniquely identified  k-anonymity: Any given record maps onto at least k other records  l-diversity: enforcing intra-group diversity of sensitive values  Distributed privacy preservation: Data partitioned and distributed either horizontally, vertically, or a combination of both  Downgrading the effectiveness of data mining: The output of data mining may violate privacy  Modify data or mining results, e.g., hiding some association rules or slightly distorting some classification models
  • 47. 47 Chapter 13: Data Mining Trends and Research Frontiers  Mining Complex Types of Data  Other Methodologies of Data Mining  Data Mining Applications  Data Mining and Society  Data Mining Trends  Summary
  • 48. 48 Trends of Data Mining  Application exploration: Dealing with application-specific problems  Scalable and interactive data mining methods  Integration of data mining with Web search engines, database systems, data warehouse systems and cloud computing systems  Mining social and information networks  Mining spatiotemporal, moving objects and cyber-physical systems  Mining multimedia, text and web data  Mining biological and biomedical data  Data mining with software engineering and system engineering  Visual and audio data mining  Distributed data mining and real-time data stream mining  Privacy protection and information security in data mining
  • 49. 49 Chapter 13: Data Mining Trends and Research Frontiers  Mining Complex Types of Data  Other Methodologies of Data Mining  Data Mining Applications  Data Mining and Society  Data Mining Trends  Summary
  • 50. 50 Summary  We present a high-level overview of mining complex data types  Statistical data mining methods, such as regression, generalized linear models, analysis of variance, etc., are popularly adopted  Researchers also try to build theoretical foundations for data mining  Visual/audio data mining has been popular and effective  Application-based mining integrates domain-specific knowledge with data analysis techniques and provide mission-specific solutions  Ubiquitous data mining and invisible data mining are penetrating our data lives  Privacy and data security are importance issues in data mining, and privacy-preserving data mining has been developed recently  Our discussion on trends in data mining shows that data mining is a promising, young field, with great, strategic importance
  • 51. 51 References and Further Reading  The books lists a lot of references for further reading. Here we only list a few books  E. Alpaydin. Introduction to Machine Learning, 2nd ed., MIT Press, 2011  S. Chakrabarti. Mining the Web: Statistical Analysis of Hypertex and Semi-Structured Data. Morgan Kaufmann, 2002  R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification, 2ed., Wiley-Interscience, 2000  D. Easley and J. Kleinberg. Networks, Crowds, and Markets: Reasoning about a Highly Connected World. Cambridge University Press, 2010.  U. Fayyad, G. Grinstein, and A. Wierse (eds.), Information Visualization in Data Mining and Knowledge Discovery, Morgan Kaufmann, 2001  J. Han, M. Kamber, J. Pei. Data Mining: Concepts and Techniques. Morgan Kaufmann, 3rd ed. 2011  T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed., Springer-Verlag, 2009  D. Koller and N. Friedman. Probabilistic Graphical Models: Principles and Techniques. MIT Press, 2009.  B. Liu. Web Data Mining, Springer 2006.  T. M. Mitchell. Machine Learning, McGraw Hill, 1997  M. Newman. Networks: An Introduction. Oxford University Press, 2010.  P.-N. Tan, M. Steinbach and V. Kumar, Introduction to Data Mining, Wiley, 2005  I. H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, Morgan Kaufmann, 2nd ed. 2005
  • 52. 52