Data Mining Applied. Author I: Dipl.-Inf. (FH) Johannes Hoppe. Author II: M.Sc. Johannes Hofmeister. Author III: Prof. Dr. Dieter Homeister. Dates: 01.04.2011, 08.04.2011, 15.04.2011.
01 Applications of Data Mining
Applications of Data Mining: Database marketing. Time-series prediction, detecting "trends". Detection (of whatever is detectable). Probability estimation. Information compression. Sensitivity analysis.
Database Marketing (1/2): Response modeling. Model the response of specific customers. Systematically select (existing and potential) customers. Base advertisements and promotions on these results (CRM). Visualization: a "lift chart" shows how successful the selection should be (later topic: DM validation).
Lift Chart Example: "When contacting 10% of customers, we should get 10% of responders using no model and 30% of responders using the given model."
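
The lift chart reading above can be reproduced with a few lines of Python. This is only a sketch: the 30 customers, their scores, and the function names `cumulative_gains` and `lift` are made up for illustration and are not taken from the slides.

```python
def cumulative_gains(scores, responded, fraction):
    """Share of all responders reached when contacting the top `fraction`
    of customers, ranked by model score (highest score first)."""
    ranked = sorted(zip(scores, responded), key=lambda pair: pair[0], reverse=True)
    n_contacted = round(len(ranked) * fraction)
    captured = sum(flag for _, flag in ranked[:n_contacted])
    return captured / sum(responded)


def lift(scores, responded, fraction):
    """Gains with the model divided by the gains of a random selection,
    which reaches on average `fraction` of the responders."""
    return cumulative_gains(scores, responded, fraction) / fraction


# Hypothetical data: 30 customers with strictly decreasing scores,
# 10 of whom responded (three of them among the three best scores).
scores = [1 - i / 30 for i in range(30)]
responders = {0, 1, 2, 5, 8, 11, 14, 18, 22, 27}
responded = [1 if i in responders else 0 for i in range(30)]

print(f"{cumulative_gains(scores, responded, 0.1):.0%} of responders in the top 10%")
print(f"lift at 10%: {lift(scores, responded, 0.1):.1f}")
```

With this toy data, contacting the top 10% reaches 30% of the responders, which corresponds to a lift of about 3 over contacting customers at random.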
Database Marketing (2/2): Cross-selling: selling additional products to existing customers. Question: which customer might buy which other product? Uses historical purchase data. Uses credit card information, lifestyle data, demographic data, etc. Other possible information: Did the customer request special information? How did the customer hear of the company?
Database Marketing (2/2), continued: Cross-selling: selling additional products to existing customers. The results are used for direct marketing, mailing lists, and direct advertising (Amazon). Amazon: "Customers who bought this item also bought" and "personalized recommendations".
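
A very simple way to approximate an "also bought" list is to count how often other items share a basket with the item in question. The sketch below assumes purchase histories are available as lists of item names; the baskets and the function name `also_bought` are hypothetical, and real recommendation systems are considerably more involved.

```python
from collections import Counter


def also_bought(baskets, item, top_n=3):
    """Return the items that most often appear in the same basket as `item`."""
    counts = Counter()
    for basket in baskets:
        items = set(basket)
        if item in items:
            counts.update(items - {item})
    # Sort by descending count, then by name so that ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:top_n]]


# Hypothetical purchase histories.
baskets = [
    ["camera", "sd card", "tripod"],
    ["camera", "sd card"],
    ["camera", "sd card", "bag"],
    ["camera", "tripod"],
    ["phone", "sd card"],
]

print(also_bought(baskets, "camera", top_n=2))
```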
Time-Series Prediction: Time series: stock prices, market shares, … Extrapolation of future values. Detection of newly arising trends, such as customers moving to other products. Own experience: German print magazines.
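
As a minimal illustration of extrapolating future values, the sketch below fits a straight trend line by least squares and continues it. The monthly figures and the names `fit_trend` and `extrapolate` are assumptions for this example; real time-series models (for instance the MS Time Series algorithm mentioned later) also handle seasonality and noise.

```python
def fit_trend(values):
    """Least-squares line through the points (0, v0), (1, v1), ...
    Returns (slope, intercept)."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extrapolate(values, steps):
    """Continue the fitted trend line for `steps` future periods."""
    slope, intercept = fit_trend(values)
    n = len(values)
    return [intercept + slope * (n + k) for k in range(steps)]


# Hypothetical monthly sales of a product that is slowly losing customers.
monthly_sales = [100, 98, 96, 94, 92]
print(fit_trend(monthly_sales))       # a negative slope signals a downward trend
print(extrapolate(monthly_sales, 2))  # forecast for the next two months
```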
Detection: Identification of the existence or occurrence of a condition. Fraud detection: identifying patterns/criteria to detect credit card fraud. Estimating creditworthiness (German Schufa). Predicting mail orders that will not be paid.
Detection: Identification of the existence or occurrence of a condition. Intrusion detection (in computer networks): find patterns that indicate when an attack is made on a network. E.g. clustering: small clusters are of high interest because they point to unusual cases. Defining classes may be useful, e.g. harmless, possibly harmful, harmful, immediately close LAN.
Detection: Identification of the existence or occurrence of a condition. Typical difficulties: Needs knowledge. DM costs. Cost of missing a fraud. Cost of false positives (e.g. falsely accusing someone of fraud, company image problems).
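
The trade-off between missed frauds and false positives can be made concrete by attaching a cost to each kind of error and comparing alarm thresholds. The scores, labels, and cost figures below are invented for illustration only.

```python
def flag(scores, threshold):
    """Raise an alarm for every case whose fraud score reaches the threshold."""
    return [score >= threshold for score in scores]


def total_cost(labels, flagged, cost_missed_fraud, cost_false_alarm):
    """Sum the cost of frauds that were not flagged and of honest cases that were."""
    missed = sum(1 for is_fraud, hit in zip(labels, flagged) if is_fraud and not hit)
    false_alarms = sum(1 for is_fraud, hit in zip(labels, flagged) if hit and not is_fraud)
    return missed * cost_missed_fraud + false_alarms * cost_false_alarm


# Hypothetical model scores and true outcomes (True = fraud).
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [True, False, True, False, True, False]

# Assumed costs: a missed fraud is ten times as expensive as a false alarm.
for threshold in (0.85, 0.5, 0.2):
    cost = total_cost(labels, flag(scores, threshold),
                      cost_missed_fraud=500, cost_false_alarm=50)
    print(threshold, cost)
```

Under these assumed costs the lowest threshold is cheapest, because a few extra false alarms cost less than a single missed fraud. With different cost figures the preferred threshold changes.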
Probability Estimation: Approximate the likelihood of an event given an observation, e.g. to classify a potential customer into an A, B, C range before any business takes place.
Information Compression: Can be viewed as a special type of estimation problem. For a given set of data, estimate the key components that can be used to construct the data.
Sensitivity Analysis: Understand how changes in one variable affect others. Identify the sensitivity of one variable to another (find out whether dependencies exist).
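
One simple first check for a dependency between two variables is the Pearson correlation coefficient. It only captures linear relationships, so it is a starting point rather than a full sensitivity analysis. The data and variable names below are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation: +1 or -1 for a perfect linear dependency,
    values near 0 when no linear dependency is visible."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical observations.
price = [1, 2, 3, 4, 5]
demand = [10, 8, 6, 4, 2]     # falls steadily as the price rises
shoe_size = [3, 1, 4, 1, 5]   # no obvious relation to the price

print(pearson(price, demand))
print(pearson(price, shoe_size))
```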
02 Data Mining Algorithms
Data Mining Algorithms: Different algorithms have different uses, and they can be combined. The choice of algorithm depends on what you want to do. Not every algorithm is suited to what you want to do.
Algorithms in SSAS, groups: Classification algorithms. Regression algorithms. Association algorithms. Segmentation algorithms. Sequence analysis algorithms. Plug-in algorithms.
Classification algorithms: Predict discrete attributes. Based on experience values. Algorithms in SSAS: Naive Bayes, Decision Trees, Neural Networks.
Regression algorithms: Predict continuous attributes. Otherwise the same as classification algorithms. Algorithms in SSAS: Linear Regression (line), Logistic Regression (curve), MS Time Series.
Association algorithms: Predict likely combinations. Find elements that occur in combination. Algorithms in SSAS: MS Association Algorithm (Apriori).
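
Apriori-style association mining builds on two measures, support and confidence. The sketch below computes both for a single rule. It is not the SSAS implementation, and the transactions and function names are made up for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing `antecedent`, the fraction that
    also contain `consequent`."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical shopping baskets.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread"},
    {"milk"},
]

print(support(transactions, {"bread", "butter"}))
print(confidence(transactions, {"bread"}, {"butter"}))
```

In this toy data half of all baskets contain both bread and butter, and two out of three bread buyers also buy butter.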
Segmentation algorithms: Also called "clustering algorithms". Group data with similar properties. Algorithms in SSAS: MS Clustering Algorithm (e.g. K-Means).
Sequence analysis algorithms: …are clustering algorithms. They consider the order, i.e. the sequence of values, while clustering. They do not group by similar properties; they group by similar sequences. Algorithms in SSAS: MS Sequence Clustering.
Plug-in algorithms: .NET wrapper for COM objects. Use ANY algorithm. Provided as an assembly (possible workshop to create one).
03 Repetition - Data Types, Content Types
Applying an Algorithm: Data types. Content types.
Data types: Define the structure of the values. Available data types: Text, Long, Boolean, Double, Date.
Content types: Define the behaviour of values. Discrete, Continuous, Discretized, Key, Key Sequence, Key Time, Ordered, Cyclical.
Content type: Discrete. Fixed set of values. Examples: Commute Distance: 1-2, 2-5, 5-10. Region: Pacific, Northern America, Europe. Name: … Boolean values are always discrete. Text is most likely discrete.
Content type: Continuous. Unlimited set of values; infinitely many items are possible. Examples: income, age. The difference between continuous and discrete is the most important one.
Content type: Discretized. Continuous values converted into discrete values. Examples: income to categories A, B, C, …; age to groups 0-20, 21-30, 31-40, …
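
Discretizing a continuous value amounts to looking up the bucket it falls into. The sketch below uses the age groups from the slide. The open-ended last group "41+" and the function name `discretize_age` are assumptions, and ages are assumed to be whole numbers.

```python
from bisect import bisect_left

# Upper bounds of the age groups 0-20, 21-30 and 31-40 from the slide.
AGE_UPPER_BOUNDS = [20, 30, 40]
# "41+" is an assumed catch-all group; the slide leaves the list open.
AGE_LABELS = ["0-20", "21-30", "31-40", "41+"]


def discretize_age(age):
    """Map a whole-number age to the label of its group."""
    return AGE_LABELS[bisect_left(AGE_UPPER_BOUNDS, age)]


for age in (20, 21, 35, 67):
    print(age, discretize_age(age))
```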
Content type: Key. Key: uniquely identifies a row. Key Sequence (sequence clustering models): series of events, sorted. Key Time (time series models): identifies values on a time scale.
Content type: Ordered. Discrete values that have a sorting order. No distances visible. No relations visible. Example: "One Star" to "Five Stars".
Content type: Cyclical. Discrete values that have a cyclical sorting order. Examples: Weekdays: Monday, Tuesday, …, Sunday, Monday, … (1, 2, 3, …, 7, 1, …). Months: Jan, Feb, Mar, …, Dec, Jan, … (1, 2, 3, …, 12, 1, …).
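
The wrap-around behaviour of cyclical values can be expressed with modular arithmetic. The sketch below assumes the 1-based numbering from the slide (weekdays 1 to 7, months 1 to 12); the function names are hypothetical.

```python
def next_value(value, cycle_length):
    """Successor in a 1-based cycle: after 7 comes 1, after 12 comes 1."""
    return value % cycle_length + 1


def cyclic_distance(a, b, cycle_length):
    """Shortest number of steps between two values, going either way round."""
    diff = abs(a - b) % cycle_length
    return min(diff, cycle_length - diff)


print(next_value(7, 7))            # the weekday after Sunday (7)
print(next_value(12, 12))          # the month after December (12)
print(cyclic_distance(12, 1, 12))  # December and January are neighbours
```

This is the property that distinguishes cyclical from merely ordered values: treated as plain numbers, December (12) and January (1) would appear eleven steps apart.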
Available Combinations
04 Data Mining Algorithms - Decision Trees
Decision Trees in general: Also known as classification trees. Goal: sequentially partition the data. Can detect non-linear relationships. Machine learning technique: separate the data into a training set and a test set. The training set is used to create the model based on certain criteria. The test set is used to verify the model.
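
The separation into a training set and a test set can be sketched as a random split. The 30% test share, the fixed seed, and the function name `train_test_split` are arbitrary choices for this illustration.

```python
import random


def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle the rows and hold back `test_fraction` of them for testing.
    Returns (training_rows, test_rows)."""
    rng = random.Random(seed)  # fixed seed so that the split is repeatable
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical case IDs standing in for customer records.
rows = list(range(10))
training_rows, test_rows = train_test_split(rows)
print(len(training_rows), len(test_rows))
```

The model is built from `training_rows` only; `test_rows` stay untouched until the model is verified, so the check is made on cases the model has not seen.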
Tree for the response to a mailing action:
2.6% response rate (total: 10,000 persons)
    Male: 3.2% (total: 4,677)
        Income > $30,000: 3.6%
        Income < $30,000: 2.3%
    Female: 2.1% (total: 5,323)
        Age > 40: 3.8%
        Age < 40: 3.2%
Using the trained tree: Example: the management decides to mail only to groups with a response rate > 3.5%. Groups selected from the trained tree: males with an income above $30,000 and females aged over 40.
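
Applying the management rule is a simple filter over the leaves of the tree. The sketch below uses the leaf response rates as given on the tree slide; the dictionary layout and the function name `segments_to_mail` are assumptions for illustration.

```python
# Leaf response rates as stated on the mailing-action tree slide.
LEAF_RESPONSE_RATES = {
    ("male", "income > $30,000"): 0.036,
    ("male", "income < $30,000"): 0.023,
    ("female", "age > 40"): 0.038,
    ("female", "age < 40"): 0.032,
}


def segments_to_mail(leaf_rates, min_rate):
    """Return the leaf segments whose response rate exceeds `min_rate`."""
    return [segment for segment, rate in leaf_rates.items() if rate > min_rate]


for segment in segments_to_mail(LEAF_RESPONSE_RATES, min_rate=0.035):
    print(segment)
```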
Decision Trees, pros: Very flexible, white-box model. KISS: Keep it simple, stupid! Little preparation and few resources needed. Cons: Can be tuned until death. Long time to build. Requires wisely selected training data! False training yields false results. A big tree might require disk swapping (computation might be difficult if the tree does not fit into main memory).
Project: "DMDW Mining Test"
Project: "DMDW Mining Test" (explanation of one node)
Project: "DMDW Mining Test" (shows connections; more useful if there are more predictable values)
Project: "DMDW Mining Test" (Generic Content Tree Viewer, DMX (Data Mining Extensions))
References for Decision Trees:
Olivia Parr Rud et al.: Data Mining Cookbook - Modeling Data for Marketing, Risk, and Customer Relationship Management, Wiley, 2001
David A. Grossman, Ophir Frieder: Introduction to Data Mining, Illinois Institute of Technology, 2005
Andrew W. Moore: Decision Trees, Carnegie Mellon University, http://www.autonlab.org/tutorials/dtree16.pdf
Nong Ye (ed.): The Handbook of Data Mining, Lawrence Erlbaum Associates, 2003
Sushmita Mitra, Tinku Acharya: Data Mining - Multimedia, Soft Computing and Bioinformatics, Wiley, 2003
http://en.wikipedia.org/wiki/Classification_tree
05 Data Mining Algorithms - Clustering
Clustering: Segmentation algorithm. Finds homogeneous groups within a set. Finds similar variables for different cases. Identifies new relationships that were unclear before (heuristics), e.g. "A person who rides a bike to work doesn't live far from his workplace" (this is not obvious).
Diagram: independent variables (X). Step 1, clustering: identify homogeneous subsets. Step 2, classification: classify, giving a description of each class.
1. Clustering: Reduces data to classes of equal types. Become friends with the data. Iterative algorithm: cluster, validate, classify, apply. Source: http://msdn.microsoft.com/en-us/library/ms174879.aspx
2. Classification: Create a description of a group. Give it a "name". Also called characterization.
Process: Start with random values. Rerunning will create different sets and different groups. A different clustering technique or algorithm will create different groups. Rerun on the same dataset with a new seed. Experts evaluate the classes found and their plausibility. Good classes are used for predictions. Flow: 1. cluster; evaluate and check; if good: 2. classify; apply (predict).
MS Clustering Algorithm: Combination of two algorithms. K-Means (hard): a data point can be in only one cluster. Expectation Maximization (soft): a data point has different combinations; it belongs to different clusters, and a probability is calculated for each. Source: http://msdn.microsoft.com/en-us/library/cc280445.aspx
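
The difference between hard and soft assignment can be shown for a single data point. The sketch below covers only the assignment step, not the full K-Means or EM training loops, and it is not the SSAS implementation. It assumes one-dimensional data, equal cluster weights, and Gaussian clusters sharing the spread `sigma`; the cluster centers are made up.

```python
from math import exp


def hard_assign(point, centers):
    """K-Means style: return the index of the single nearest center."""
    return min(range(len(centers)), key=lambda k: abs(point - centers[k]))


def soft_assign(point, centers, sigma=1.0):
    """EM style: return one membership probability per cluster, assuming
    equally weighted Gaussian clusters with standard deviation `sigma`."""
    weights = [exp(-((point - c) ** 2) / (2 * sigma ** 2)) for c in centers]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical cluster centers.
centers = [0.0, 4.0]

print(hard_assign(1.0, centers))  # exactly one cluster index
print(soft_assign(1.0, centers))  # mostly the first cluster, a little of the second
print(soft_assign(2.0, centers))  # halfway between both centers
```

A point halfway between two centers shows the contrast most clearly: hard assignment still has to pick one cluster, while soft assignment reports equal membership in both.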
Clustering, pros: No predictable variable to choose. Trains itself without much effort. Easy to configure. "Cons": Interpretation is everything. A good eye is needed. An expert has to check for plausibility.
Project: "DMDW Mining Test" (strongest relations only, number of matching cases for Region Europe)
Project: "DMDW Mining Test" (good to know: continuous attributes are shown by their arithmetic average)
Project: "DMDW Mining Test" (comparing two clusters)
THANK YOU FOR YOUR ATTENTION

More Related Content

What's hot

5.4 mining sequence patterns in biological data
5.4 mining sequence patterns in biological data5.4 mining sequence patterns in biological data
5.4 mining sequence patterns in biological data
Krish_ver2
 
Data Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsData Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlations
DataminingTools Inc
 
introduction to Data Structure and classification
 introduction to Data Structure and classification introduction to Data Structure and classification
introduction to Data Structure and classification
chauhankapil
 
Mining frequent patterns association
Mining frequent patterns associationMining frequent patterns association
Mining frequent patterns association
DeepaR42
 
Machine Learning and Real-World Applications
Machine Learning and Real-World ApplicationsMachine Learning and Real-World Applications
Machine Learning and Real-World Applications
MachinePulse
 
mapReduce for machine learning
mapReduce for machine learning mapReduce for machine learning
mapReduce for machine learning
Pranya Prabhakar
 
Feature Importance Analysis with XGBoost in Tax audit
Feature Importance Analysis with XGBoost in Tax auditFeature Importance Analysis with XGBoost in Tax audit
Feature Importance Analysis with XGBoost in Tax audit
Michael BENESTY
 
Chapter - 7 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kamber
Chapter - 7 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; KamberChapter - 7 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kamber
Chapter - 7 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kamber
error007
 
Multidimensioal database
Multidimensioal  databaseMultidimensioal  database
Multidimensioal database
TPO TPO
 
Linear regression on 1 terabytes of data? Some crazy observations and actions
Linear regression on 1 terabytes of data? Some crazy observations and actionsLinear regression on 1 terabytes of data? Some crazy observations and actions
Linear regression on 1 terabytes of data? Some crazy observations and actions
Hesen Peng
 
An improvised frequent pattern tree
An improvised frequent pattern treeAn improvised frequent pattern tree
An improvised frequent pattern tree
IJDKP
 
FiDoop: Parallel Mining of Frequent Itemsets Using MapReduce
FiDoop: Parallel Mining of Frequent Itemsets Using MapReduceFiDoop: Parallel Mining of Frequent Itemsets Using MapReduce
FiDoop: Parallel Mining of Frequent Itemsets Using MapReduce
IJCSIS Research Publications
 
Simulation and Performance Analysis of Long Term Evolution (LTE) Cellular Net...
Simulation and Performance Analysis of Long Term Evolution (LTE) Cellular Net...Simulation and Performance Analysis of Long Term Evolution (LTE) Cellular Net...
Simulation and Performance Analysis of Long Term Evolution (LTE) Cellular Net...
ijsrd.com
 
What is Machine Learning
What is Machine LearningWhat is Machine Learning
What is Machine Learning
Bhaskara Reddy Sannapureddy
 
Stock Market Prediction Using ANN
Stock Market Prediction Using ANNStock Market Prediction Using ANN
Stock Market Prediction Using ANN
Krishna Mohan Mishra
 
An intelligent scalable stock market prediction system
An intelligent scalable stock market prediction systemAn intelligent scalable stock market prediction system
An intelligent scalable stock market prediction system
Harshit Agarwal
 
Graph Tea: Simulating Tool for Graph Theory & Algorithms
Graph Tea: Simulating Tool for Graph Theory & AlgorithmsGraph Tea: Simulating Tool for Graph Theory & Algorithms
Graph Tea: Simulating Tool for Graph Theory & Algorithms
IJMTST Journal
 
Machine Learning Real Life Applications By Examples
Machine Learning Real Life Applications By ExamplesMachine Learning Real Life Applications By Examples
Machine Learning Real Life Applications By Examples
Mario Cartia
 
Lect12 graph mining
Lect12 graph miningLect12 graph mining
Lect12 graph mining
Houw Liong The
 
Graph based Approach and Clustering of Patterns (GACP) for Sequential Pattern...
Graph based Approach and Clustering of Patterns (GACP) for Sequential Pattern...Graph based Approach and Clustering of Patterns (GACP) for Sequential Pattern...
Graph based Approach and Clustering of Patterns (GACP) for Sequential Pattern...
AshishDPatel1
 

What's hot (20)

5.4 mining sequence patterns in biological data
5.4 mining sequence patterns in biological data5.4 mining sequence patterns in biological data
5.4 mining sequence patterns in biological data
 
Data Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsData Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlations
 
introduction to Data Structure and classification
 introduction to Data Structure and classification introduction to Data Structure and classification
introduction to Data Structure and classification
 
Mining frequent patterns association
Mining frequent patterns associationMining frequent patterns association
Mining frequent patterns association
 
Machine Learning and Real-World Applications
Machine Learning and Real-World ApplicationsMachine Learning and Real-World Applications
Machine Learning and Real-World Applications
 
mapReduce for machine learning
mapReduce for machine learning mapReduce for machine learning
mapReduce for machine learning
 
Feature Importance Analysis with XGBoost in Tax audit
Feature Importance Analysis with XGBoost in Tax auditFeature Importance Analysis with XGBoost in Tax audit
Feature Importance Analysis with XGBoost in Tax audit
 
Chapter - 7 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kamber
Chapter - 7 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; KamberChapter - 7 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kamber
Chapter - 7 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kamber
 
Multidimensioal database
Multidimensioal  databaseMultidimensioal  database
Multidimensioal database
 
Linear regression on 1 terabytes of data? Some crazy observations and actions
Linear regression on 1 terabytes of data? Some crazy observations and actionsLinear regression on 1 terabytes of data? Some crazy observations and actions
Linear regression on 1 terabytes of data? Some crazy observations and actions
 
An improvised frequent pattern tree
An improvised frequent pattern treeAn improvised frequent pattern tree
An improvised frequent pattern tree
 
FiDoop: Parallel Mining of Frequent Itemsets Using MapReduce
FiDoop: Parallel Mining of Frequent Itemsets Using MapReduceFiDoop: Parallel Mining of Frequent Itemsets Using MapReduce
FiDoop: Parallel Mining of Frequent Itemsets Using MapReduce
 
Simulation and Performance Analysis of Long Term Evolution (LTE) Cellular Net...
Simulation and Performance Analysis of Long Term Evolution (LTE) Cellular Net...Simulation and Performance Analysis of Long Term Evolution (LTE) Cellular Net...
Simulation and Performance Analysis of Long Term Evolution (LTE) Cellular Net...
 
What is Machine Learning
What is Machine LearningWhat is Machine Learning
What is Machine Learning
 
Stock Market Prediction Using ANN
Stock Market Prediction Using ANNStock Market Prediction Using ANN
Stock Market Prediction Using ANN
 
An intelligent scalable stock market prediction system
An intelligent scalable stock market prediction systemAn intelligent scalable stock market prediction system
An intelligent scalable stock market prediction system
 
Graph Tea: Simulating Tool for Graph Theory & Algorithms
Graph Tea: Simulating Tool for Graph Theory & AlgorithmsGraph Tea: Simulating Tool for Graph Theory & Algorithms
Graph Tea: Simulating Tool for Graph Theory & Algorithms
 
Machine Learning Real Life Applications By Examples
Machine Learning Real Life Applications By ExamplesMachine Learning Real Life Applications By Examples
Machine Learning Real Life Applications By Examples
 
Lect12 graph mining
Lect12 graph miningLect12 graph mining
Lect12 graph mining
 
Graph based Approach and Clustering of Patterns (GACP) for Sequential Pattern...
Graph based Approach and Clustering of Patterns (GACP) for Sequential Pattern...Graph based Approach and Clustering of Patterns (GACP) for Sequential Pattern...
Graph based Approach and Clustering of Patterns (GACP) for Sequential Pattern...
 

Viewers also liked

DMDW Lesson 01 - Introduction
DMDW Lesson 01 - IntroductionDMDW Lesson 01 - Introduction
DMDW Lesson 01 - Introduction
Johannes Hoppe
 
Ria 09 trends_and_technologies
Ria 09 trends_and_technologiesRia 09 trends_and_technologies
Ria 09 trends_and_technologiesJohannes Hoppe
 
DMDW Lesson 03 - Data Warehouse Theory
DMDW Lesson 03 - Data Warehouse TheoryDMDW Lesson 03 - Data Warehouse Theory
DMDW Lesson 03 - Data Warehouse Theory
Johannes Hoppe
 
DMDW Extra Lesson - NoSql and MongoDB
DMDW  Extra Lesson - NoSql and MongoDBDMDW  Extra Lesson - NoSql and MongoDB
DMDW Extra Lesson - NoSql and MongoDBJohannes Hoppe
 
2012-08-29 - NoSQL Bootcamp (Redis, RavenDB & MongoDB für .NET Entwickler)
2012-08-29 - NoSQL Bootcamp (Redis, RavenDB & MongoDB für .NET Entwickler)2012-08-29 - NoSQL Bootcamp (Redis, RavenDB & MongoDB für .NET Entwickler)
2012-08-29 - NoSQL Bootcamp (Redis, RavenDB & MongoDB für .NET Entwickler)
Johannes Hoppe
 
2017 - NoSQL Vorlesung Mosbach
2017 - NoSQL Vorlesung Mosbach2017 - NoSQL Vorlesung Mosbach
2017 - NoSQL Vorlesung Mosbach
Johannes Hoppe
 
NoSQL - Hands on
NoSQL - Hands onNoSQL - Hands on
NoSQL - Hands on
Johannes Hoppe
 
Exkurs: Save the pixel
Exkurs: Save the pixelExkurs: Save the pixel
Exkurs: Save the pixel
Johannes Hoppe
 

Viewers also liked (8)

DMDW Lesson 01 - Introduction
DMDW Lesson 01 - IntroductionDMDW Lesson 01 - Introduction
DMDW Lesson 01 - Introduction
 
Ria 09 trends_and_technologies
Ria 09 trends_and_technologiesRia 09 trends_and_technologies
Ria 09 trends_and_technologies
 
DMDW Lesson 03 - Data Warehouse Theory
DMDW Lesson 03 - Data Warehouse TheoryDMDW Lesson 03 - Data Warehouse Theory
DMDW Lesson 03 - Data Warehouse Theory
 
DMDW Extra Lesson - NoSql and MongoDB
DMDW  Extra Lesson - NoSql and MongoDBDMDW  Extra Lesson - NoSql and MongoDB
DMDW Extra Lesson - NoSql and MongoDB
 
2012-08-29 - NoSQL Bootcamp (Redis, RavenDB & MongoDB für .NET Entwickler)
2012-08-29 - NoSQL Bootcamp (Redis, RavenDB & MongoDB für .NET Entwickler)2012-08-29 - NoSQL Bootcamp (Redis, RavenDB & MongoDB für .NET Entwickler)
2012-08-29 - NoSQL Bootcamp (Redis, RavenDB & MongoDB für .NET Entwickler)
 
2017 - NoSQL Vorlesung Mosbach
2017 - NoSQL Vorlesung Mosbach2017 - NoSQL Vorlesung Mosbach
2017 - NoSQL Vorlesung Mosbach
 
NoSQL - Hands on
NoSQL - Hands onNoSQL - Hands on
NoSQL - Hands on
 
Exkurs: Save the pixel
Exkurs: Save the pixelExkurs: Save the pixel
Exkurs: Save the pixel
 

Similar to DMDW Lesson 05 + 06 + 07 - Data Mining Applied

BI Chapter 04.pdf business business business business
BI Chapter 04.pdf business business business businessBI Chapter 04.pdf business business business business
BI Chapter 04.pdf business business business business
JawaherAlbaddawi
 
Cssu dw dm
Cssu dw dmCssu dw dm
Cssu dw dmsumit621
 
Data mining concepts and work
Data mining concepts and workData mining concepts and work
Data mining concepts and work
Amr Abd El Latief
 
Chapter 1. Introduction
Chapter 1. IntroductionChapter 1. Introduction
Chapter 1. Introductionbutest
 
Data mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationData mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, Classification
Dr. Abdul Ahad Abro
 
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
Editor IJCATR
 
Data science technology overview
Data science technology overviewData science technology overview
Data science technology overview
Soojung Hong
 
Cyb 5675 class project final
Cyb 5675   class project finalCyb 5675   class project final
Cyb 5675 class project finalCraig Cannon
 
Introduction to-data-mining chapter 1
Introduction to-data-mining  chapter 1Introduction to-data-mining  chapter 1
Introduction to-data-mining chapter 1
Mahmoud Alfarra
 
algorithmic-decisions, fairness, machine learning, provenance, transparency
algorithmic-decisions, fairness, machine learning, provenance, transparencyalgorithmic-decisions, fairness, machine learning, provenance, transparency
algorithmic-decisions, fairness, machine learning, provenance, transparency
Paolo Missier
 
Unit 1.pptx
Unit 1.pptxUnit 1.pptx
Unit 1.pptx
DrThenmozhiSPESUMCA
 
data mining
data miningdata mining
data mining
manasa polu
 
Unit 1 (Chapter-1) on data mining concepts.ppt
Unit 1 (Chapter-1) on data mining concepts.pptUnit 1 (Chapter-1) on data mining concepts.ppt
Unit 1 (Chapter-1) on data mining concepts.ppt
PadmajaLaksh
 
ifip2008albashiri.pdf
ifip2008albashiri.pdfifip2008albashiri.pdf
ifip2008albashiri.pdf
KamalAlbashiri
 
What is data In your address the.docx
What is data In your address the.docxWhat is data In your address the.docx
What is data In your address the.docx
write31
 
Ci2004-10.doc
Ci2004-10.docCi2004-10.doc
Ci2004-10.docbutest
 
Massive Data Analysis- Challenges and Applications
Massive Data Analysis- Challenges and ApplicationsMassive Data Analysis- Challenges and Applications
Massive Data Analysis- Challenges and Applications
Vijay Raghavan
 
Clustering for Stream and Parallelism (DATA ANALYTICS)
Clustering for Stream and Parallelism (DATA ANALYTICS)Clustering for Stream and Parallelism (DATA ANALYTICS)
Clustering for Stream and Parallelism (DATA ANALYTICS)
DheerajPachauri
 
Data mining
Data miningData mining
Data mining
Daminda Herath
 

Similar to DMDW Lesson 05 + 06 + 07 - Data Mining Applied (20)

BI Chapter 04.pdf business business business business
BI Chapter 04.pdf business business business businessBI Chapter 04.pdf business business business business
BI Chapter 04.pdf business business business business
 
Cssu dw dm
Cssu dw dmCssu dw dm
Cssu dw dm
 
Data mining concepts and work
Data mining concepts and workData mining concepts and work
Data mining concepts and work
 
Chapter 1. Introduction
Chapter 1. IntroductionChapter 1. Introduction
Chapter 1. Introduction
 
Data mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationData mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, Classification
 
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
 
Data science technology overview
Data science technology overviewData science technology overview
Data science technology overview
 
Cyb 5675 class project final
Cyb 5675   class project finalCyb 5675   class project final
Cyb 5675 class project final
 
Introduction to-data-mining chapter 1
Introduction to-data-mining  chapter 1Introduction to-data-mining  chapter 1
Introduction to-data-mining chapter 1
 
algorithmic-decisions, fairness, machine learning, provenance, transparency
algorithmic-decisions, fairness, machine learning, provenance, transparencyalgorithmic-decisions, fairness, machine learning, provenance, transparency
algorithmic-decisions, fairness, machine learning, provenance, transparency
 
Unit 1.pptx
Unit 1.pptxUnit 1.pptx
Unit 1.pptx
 
data mining
data miningdata mining
data mining
 
Unit 1 (Chapter-1) on data mining concepts.ppt
Unit 1 (Chapter-1) on data mining concepts.pptUnit 1 (Chapter-1) on data mining concepts.ppt
Unit 1 (Chapter-1) on data mining concepts.ppt
 
ifip2008albashiri.pdf
ifip2008albashiri.pdfifip2008albashiri.pdf
ifip2008albashiri.pdf
 
What is data In your address the.docx
What is data In your address the.docxWhat is data In your address the.docx
What is data In your address the.docx
 
Ci2004-10.doc
Ci2004-10.docCi2004-10.doc
Ci2004-10.doc
 
Massive Data Analysis- Challenges and Applications
Massive Data Analysis- Challenges and ApplicationsMassive Data Analysis- Challenges and Applications
Massive Data Analysis- Challenges and Applications
 
Talk
TalkTalk
Talk
 
Clustering for Stream and Parallelism (DATA ANALYTICS)
Clustering for Stream and Parallelism (DATA ANALYTICS)Clustering for Stream and Parallelism (DATA ANALYTICS)
Clustering for Stream and Parallelism (DATA ANALYTICS)
 
Data mining
Data miningData mining
Data mining
 

More from Johannes Hoppe

Einführung in Angular 2
Einführung in Angular 2Einführung in Angular 2
Einführung in Angular 2
Johannes Hoppe
 
MDC kompakt 2014: Hybride Apps mit Cordova, AngularJS und Ionic
MDC kompakt 2014: Hybride Apps mit Cordova, AngularJS und IonicMDC kompakt 2014: Hybride Apps mit Cordova, AngularJS und Ionic
MDC kompakt 2014: Hybride Apps mit Cordova, AngularJS und Ionic
Johannes Hoppe
 
2015 02-09 - NoSQL Vorlesung Mosbach
2015 02-09 - NoSQL Vorlesung Mosbach2015 02-09 - NoSQL Vorlesung Mosbach
2015 02-09 - NoSQL Vorlesung Mosbach
Johannes Hoppe
 
2012-06-25 - MapReduce auf Azure
2012-06-25 - MapReduce auf Azure2012-06-25 - MapReduce auf Azure
2012-06-25 - MapReduce auf Azure
Johannes Hoppe
 
2013-06-25 - HTML5 & JavaScript Security
2013-06-25 - HTML5 & JavaScript Security2013-06-25 - HTML5 & JavaScript Security
2013-06-25 - HTML5 & JavaScript Security
Johannes Hoppe
 
2013-06-24 - Software Craftsmanship with JavaScript
2013-06-24 - Software Craftsmanship with JavaScript2013-06-24 - Software Craftsmanship with JavaScript
2013-06-24 - Software Craftsmanship with JavaScript
Johannes Hoppe
 
2013-06-15 - Software Craftsmanship mit JavaScript
2013-06-15 - Software Craftsmanship mit JavaScript2013-06-15 - Software Craftsmanship mit JavaScript
2013-06-15 - Software Craftsmanship mit JavaScript
Johannes Hoppe
 
2013 05-03 - HTML5 & JavaScript Security
2013 05-03 -  HTML5 & JavaScript Security2013 05-03 -  HTML5 & JavaScript Security
2013 05-03 - HTML5 & JavaScript Security
Johannes Hoppe
 
2013-03-23 - NoSQL Spartakiade
2013-03-23 - NoSQL Spartakiade2013-03-23 - NoSQL Spartakiade
2013-03-23 - NoSQL Spartakiade
Johannes Hoppe
 
2013 02-26 - Software Tests with Mongo db
2013 02-26 - Software Tests with Mongo db2013 02-26 - Software Tests with Mongo db
2013 02-26 - Software Tests with Mongo db
Johannes Hoppe
 
2013-02-21 - .NET UG Rhein-Neckar: JavaScript Best Practices
2013-02-21 - .NET UG Rhein-Neckar: JavaScript Best Practices2013-02-21 - .NET UG Rhein-Neckar: JavaScript Best Practices
2013-02-21 - .NET UG Rhein-Neckar: JavaScript Best Practices
Johannes Hoppe
 
2012-10-16 - WebTechCon 2012: HTML5 & WebGL
2012-10-16 - WebTechCon 2012: HTML5 & WebGL2012-10-16 - WebTechCon 2012: HTML5 & WebGL
2012-10-16 - WebTechCon 2012: HTML5 & WebGLJohannes Hoppe
 
2012-10-12 - NoSQL in .NET - mit Redis und Mongodb
2012-10-12 - NoSQL in .NET - mit Redis und Mongodb2012-10-12 - NoSQL in .NET - mit Redis und Mongodb
2012-10-12 - NoSQL in .NET - mit Redis und MongodbJohannes Hoppe
 
2012-09-18 - HTML5 & WebGL
2012-09-18 - HTML5 & WebGL2012-09-18 - HTML5 & WebGL
2012-09-18 - HTML5 & WebGL
Johannes Hoppe
 
2012-09-17 - WDC12: Node.js & MongoDB
2012-09-17 - WDC12: Node.js & MongoDB2012-09-17 - WDC12: Node.js & MongoDB
2012-09-17 - WDC12: Node.js & MongoDB
Johannes Hoppe
 
2012-05-14 NoSQL in .NET - mit Redis und MongoDB
2012-05-14 NoSQL in .NET - mit Redis und MongoDB2012-05-14 NoSQL in .NET - mit Redis und MongoDB
2012-05-14 NoSQL in .NET - mit Redis und MongoDB
Johannes Hoppe
 
2012-05-10 - UG Karlsruhe: NoSQL in .NET - mit Redis und MongoDB
2012-05-10 - UG Karlsruhe: NoSQL in .NET - mit Redis und MongoDB2012-05-10 - UG Karlsruhe: NoSQL in .NET - mit Redis und MongoDB
2012-05-10 - UG Karlsruhe: NoSQL in .NET - mit Redis und MongoDBJohannes Hoppe
 
2012-04-12 - AOP .NET UserGroup Niederrhein
2012-04-12 - AOP .NET UserGroup Niederrhein2012-04-12 - AOP .NET UserGroup Niederrhein
2012-04-12 - AOP .NET UserGroup NiederrheinJohannes Hoppe
 
2012-03-20 - Getting started with Node.js and MongoDB on MS Azure
2012-03-20 - Getting started with Node.js and MongoDB on MS Azure2012-03-20 - Getting started with Node.js and MongoDB on MS Azure
2012-03-20 - Getting started with Node.js and MongoDB on MS Azure
Johannes Hoppe
 
2012-01-31 NoSQL in .NET
2012-01-31 NoSQL in .NET2012-01-31 NoSQL in .NET
2012-01-31 NoSQL in .NET
Johannes Hoppe
 

DMDW Lesson 05 + 06 + 07 - Data Mining Applied

  • 1. STUDIEREN UND DURCHSTARTEN. Author I: Dip.-Inf. (FH) Johannes Hoppe; Author II: M.Sc. Johannes Hofmeister; Author III: Prof. Dr. Dieter Homeister; Date: 01.04.2011, 08.04.2011, 15.04.2011
  • 2. Data Mining Applied
  • 3. 01 Applications of Data Mining
  • 5. Applications of Data Mining: Database Marketing; Time-series prediction, detecting "trends"; Detection (of whatever is detectable); Probability Estimation; Information Compression; Sensitivity Analysis
  • 6. Applications of Data Mining: Database Marketing (1/2). Response modeling: a model for the response of specific customers; systematic selection of (old and potential) customers; advertisements and promotion based on these results (CRM). Visualization: a "lift chart" shows how successful the selection should be (later topic: DM validation).
  • 7. Lift Chart Example: "For contacting 10% of customers, using no model we should get 10% of responders and using the given model we should get 30% of responders."
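The lift chart statement on slide 7 can be read as a simple ratio. The following is a minimal sketch, assuming the usual definition of lift as the share of responders reached divided by the share of customers contacted; the function name `lift` is illustrative and not taken from the slides.

```python
def lift(responders_reached_share: float, customers_contacted_share: float) -> float:
    """Lift = share of all responders reached / share of customers contacted."""
    if customers_contacted_share <= 0:
        raise ValueError("customers_contacted_share must be positive")
    return responders_reached_share / customers_contacted_share


# Numbers from the slide: 10% of customers are contacted in both cases.
baseline_lift = lift(0.10, 0.10)  # no model: 10% of responders reached
model_lift = lift(0.30, 0.10)     # given model: 30% of responders reached

print(f"lift without model: {baseline_lift:.1f}")
print(f"lift with model:    {model_lift:.1f}")
```

A lift of about 3 means that, at this contact depth, the model finds roughly three times as many responders as a random selection would.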
  • 8. Applications of Data Mining: Database Marketing (2/2). Cross selling: selling additional products to existing customers. Question: which customer might buy which other product? Uses historical purchase data; uses credit card information, lifestyle data, demographic data, etc. Other possible information: did the customer query special information? How did the customer hear of the company?
  • 9. Applications of Data Mining: Database Marketing (2/2). Cross selling: selling additional products to existing customers. Results are used for direct marketing, mailing lists, and direct advertising (Amazon). Amazon: "Customers who bought this item also bought" and "personalized recommendations".
  • 10. Applications of Data Mining: Time-series prediction. Time series: stock prices, market shares, … Extrapolation of future values. Detection of newly arising trends, like customer movements to other products. Own experience: German print magazines.
  • 11. Applications of Data Mining: Detection. Identification of the existence or occurrence of a condition. Fraud detection: identifying patterns/criteria to detect credit card fraud; estimating creditworthiness (e.g. the German Schufa); prediction of mail orders that will not be paid.
  • 12. Applications of Data Mining: Detection. Identification of the existence or occurrence of a condition. Intrusion detection (in computer networks): find patterns that indicate when an attack is made on a network, e.g. clustering: small clusters are of high interest, they point to unusual cases. A definition of classes may be useful, e.g. harmless, possibly harmful, harmful, immediately close LAN.
  • 13. Applications of Data Mining: Detection. Identification of the existence or occurrence of a condition. Typical difficulties: needs knowledge; DM costs; cost of missing a fraud; cost of false positives (e.g. falsely accusing someone of fraud, company image problems).
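The trade-off between the cost of a missed fraud and the cost of a false positive can be made concrete with a small calculation. This is only a sketch: all counts and cost figures below are hypothetical and chosen for illustration, and `total_cost` is not a function from the slides.

```python
def total_cost(missed_frauds: int, false_alarms: int,
               cost_missed: float, cost_false_alarm: float) -> float:
    """Total cost of a detection setting: missed frauds plus false positives."""
    return missed_frauds * cost_missed + false_alarms * cost_false_alarm


# Hypothetical unit costs: a missed fraud is far more expensive than a false alarm.
COST_MISSED = 1000
COST_FALSE_ALARM = 20

# Hypothetical outcomes of two detector settings on the same data.
strict = total_cost(missed_frauds=2, false_alarms=50,
                    cost_missed=COST_MISSED, cost_false_alarm=COST_FALSE_ALARM)
lenient = total_cost(missed_frauds=10, false_alarms=5,
                     cost_missed=COST_MISSED, cost_false_alarm=COST_FALSE_ALARM)

print("strict setting: ", strict)
print("lenient setting:", lenient)
```

With these assumed costs the stricter setting is cheaper overall even though it raises many more false alarms; with other cost ratios the result can flip.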
  • 14. Applications of Data Mining: Probability Estimation. Approximate the likelihood of an event given an observation, e.g. to classify a potential customer into an A, B, C range before doing any business.
  • 15. Applications of Data Mining: Information Compression. Can be viewed as a special type of estimation problem. For a given set of data, estimate the key components that can be used to construct the data.
  • 16. Applications of Data Mining: Sensitivity Analysis. Understand how changes in one variable affect others. Identify the sensitivity of one variable to another (find out if dependencies exist).
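One simple way to probe sensitivity is to change one input at a time and observe the output. The sketch below assumes a hypothetical scoring function `toy_model`; both the model and its variables (`income`, `shoe_size`) are invented for illustration only.

```python
def sensitivity(model, inputs: dict, name: str, delta: float = 1.0) -> float:
    """Change in the model output when one input is increased by delta."""
    changed = dict(inputs)
    changed[name] = changed[name] + delta
    return model(changed) - model(inputs)


def toy_model(x: dict) -> float:
    # Hypothetical model: the score depends on income but not on shoe size.
    return 0.5 * x["income"] + 0.0 * x["shoe_size"]


base = {"income": 40, "shoe_size": 42}
for variable in base:
    print(variable, sensitivity(toy_model, base, variable))
```

A result of zero for a variable suggests that, for this model and this starting point, the output does not depend on it.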
  • 17. 02 Data Mining Algorithms
  • 18. Data Mining Algorithms: different algorithms, different uses; combined. The algorithm depends on what you want to do; not every algorithm is suited for what you want to do.
  • 19. Data Mining Algorithms: Algorithms in SSAS, groups: classification algorithms; regression algorithms; association algorithms; segmentation algorithms; sequence analysis algorithms; plug-in algorithms.
  • 20. Data Mining Algorithms: Classification algorithms. Predict discrete attributes, based on experience values. Algorithms in SSAS: Naive Bayes, Decision Trees, Neural Networks.
  • 21. Data Mining Algorithms: Regression algorithms. Predict continuous attributes; otherwise the same as classification algorithms. Algorithms in SSAS: Linear Regression (line), Logistic Regression (curve), MS Time Series.
  • 22. Data Mining Algorithms: Association algorithms. Predict likely combinations; find elements that occur in combination. Algorithms in SSAS: MS Association Algorithm (Apriori).
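Finding "elements that occur in combination" starts with counting how often item pairs appear together. The sketch below counts pair support only; it is not the full Apriori algorithm (no candidate pruning, no rules) and not what SSAS does internally. The baskets and the `min_support` threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support: float) -> dict:
    """Return item pairs whose support (share of baskets) is at least min_support."""
    counts = Counter()
    for basket in baskets:
        # Sort and de-duplicate so each pair is counted once per basket.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    n = len(baskets)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


# Hypothetical purchase data.
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk"],
]

print(frequent_pairs(baskets, min_support=0.5))
```

Here only the pair bread and butter appears in at least half of the baskets, so it is the only combination reported.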
  • 23. Data Mining Algorithms: Segmentation algorithms. Also called "clustering algorithms". Group data with similar properties. Algorithms in SSAS: MS Clustering Algorithms (e.g. K-Means).
  • 24. Data Mining Algorithms: Sequence analysis algorithms. …are clustering algorithms. Consider the sorting, i.e. the sequence of values, while clustering. Do not group by similar properties; group by similar sequences. Algorithms in SSAS: MS Sequence Clustering.
  • 25. Data Mining Algorithms: Plug-in algorithms. .NET wrapper for COM objects. Use ANY algorithm. Provided as an assembly (possible workshop to create one).
  • 26. 03 Repetition - Datatypes, Contenttypes
  • 27. Repetition - Datatypes, Contenttypes: Applying an algorithm: datatypes; contenttypes.
  • 28. Repetition - Datatypes, Contenttypes: Datatypes define the structure of the values. Available datatypes: Text, Long, Boolean, Double, Date.
  • 29. Repetition - Datatypes, Contenttypes: Contenttypes define the behaviour of values: Discrete, Continuous, Discretized, Key, Key Sequence, Key Time, Ordered, Cyclical.
  • 30. Repetition - Datatypes, Contenttypes: Contenttype: Discrete. Fixed set of values. Examples: Commute Distance: 1-2, 2-5, 5-10; Region: Pacific, Northern America, Europe; Name: …, …, … Boolean values are always discrete; text is most likely discrete.
  • 31. Repetition - Datatypes, Contenttypes: Contenttype: Continuous. Unlimited set of values; infinitely many items possible. Examples: Income, Age. The difference between Continuous and Discrete is the most important one.
  • 32. Repetition - Datatypes, Contenttypes: Contenttype: Discretized. Continuous values converted into discrete values. Examples: Income to categories: A, B, C, …; Age to groups: 0-20, 21-30, 31-40, …
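Discretization of the age example can be sketched as a lookup into sorted group boundaries. The groups 0-20, 21-30 and 31-40 come from the slide; the catch-all label "41+" is an assumption added here only so that every age maps to some group, since the slide leaves the remaining groups open.

```python
from bisect import bisect_left

# Upper bounds (inclusive) of the groups named on the slide.
UPPER_BOUNDS = [20, 30, 40]
# "41+" is a hypothetical catch-all, not a group from the slide.
LABELS = ["0-20", "21-30", "31-40", "41+"]


def discretize_age(age: int) -> str:
    """Map a continuous age value to its discrete group label."""
    if age < 0:
        raise ValueError("age must be non-negative")
    return LABELS[bisect_left(UPPER_BOUNDS, age)]


for age in (0, 20, 21, 35, 40, 67):
    print(age, "->", discretize_age(age))
```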
  • 33. Repetition - Datatypes, Contenttypes: Contenttype: Key. Key: uniquely identifies a row. Key Sequence (sequence clustering models): series of events, sorted. Key Time (time series models): identifies values on a time scale.
  • 34. Repetition - Datatypes, Contenttypes: Contenttype: Ordered. Discrete values that have a sorting order. No distances visible; no relations visible. Example: "One Star" to "Five Stars".
  • 35. Repetition - Datatypes, Contenttypes: Contenttype: Cyclical. Discrete values that have a cyclical sorting order. Examples: Weekdays: Monday, Tuesday, …, Sunday, Monday, … (1, 2, 3, …, 7, 1, …); Months: Jan, Feb, Mar, …, Dec, Jan, … (1, 2, 3, …, 12, 1, …).
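The wrap-around behaviour of a cyclical content type can be illustrated with modular arithmetic. This is a small sketch using the weekday example; the helper names `next_value` and `cyclical_distance` are illustrative.

```python
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def next_value(values, current):
    """Successor in a cyclical order: after the last value comes the first."""
    i = values.index(current)
    return values[(i + 1) % len(values)]


def cyclical_distance(values, a, b):
    """Shortest number of steps between two values, going either way round."""
    d = abs(values.index(a) - values.index(b))
    return min(d, len(values) - d)


print(next_value(WEEKDAYS, "Sunday"))
print(cyclical_distance(WEEKDAYS, "Monday", "Sunday"))
```

In a merely ordered type, Monday and Sunday would be six steps apart; treated as cyclical, they are neighbours.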
  • 37. 04 Data Mining Algorithms - Decision Trees
  • 38. Applied Data Mining - Decision Trees
  • 39. Applied Data Mining - Decision Trees: In general. Also known as classification trees. Goal: sequentially partition the data. Can detect non-linear relationships. Machine learning technique: separate the data into a training set and a test set; the training set is used to create the model based on certain criteria; the test set is used to verify the model.
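Separating data into a training set and a test set can be sketched as a seeded shuffle followed by a cut. The 30% test share and the seed below are assumptions for illustration; the slides do not prescribe a ratio.

```python
import random


def train_test_split(rows, test_fraction: float = 0.3, seed: int = 42):
    """Shuffle a copy of the rows and split it into (training set, test set)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))  # placeholder for real cases
train, test = train_test_split(rows)
print(len(train), "training cases,", len(test), "test cases")
```

The model is then built from `train` only and checked against `test`, which it has never seen.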
  • 40. Applied Data Mining - Decision Trees: Tree for the response to a mailing action. Root: 2.6% response rate (total: 10,000 persons). Male: 3.2% (total: 4,677), split by income: income > $30,000: 3.6%; income < $30,000: 2.3%. Female: 2.1% (total: 5,323), split by age: age > 40: 3.8%; age < 40: 3.2%.
  • 41. Applied Data Mining - Decision Trees: Using the trained tree. Example: the management decides to mail only to groups with a response rate > 3.5%. Trained tree, response rate > 3.5%: males with income > $30,000; females aged 40+.
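Using the trained tree amounts to filtering its leaves by response rate. The sketch below takes the four leaf rates as given on the mailing tree slide and applies the management's 3.5% threshold; the data layout and the function name `segments_to_mail` are illustrative.

```python
# Leaf response rates as stated on the mailing tree slide.
LEAVES = [
    {"segment": "male, income > $30,000", "response_rate": 0.036},
    {"segment": "male, income < $30,000", "response_rate": 0.023},
    {"segment": "female, age > 40", "response_rate": 0.038},
    {"segment": "female, age < 40", "response_rate": 0.032},
]


def segments_to_mail(leaves, min_rate: float):
    """Return the segments whose response rate exceeds the threshold."""
    return [leaf["segment"] for leaf in leaves if leaf["response_rate"] > min_rate]


selected = segments_to_mail(LEAVES, min_rate=0.035)
print(selected)
```

This yields the two groups named on the slide: males with an income above $30,000 and females over 40.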
  • 42. Applied Data Mining - Decision Trees: Pros: very flexible, white-box model; KISS: Keep it simple, stupid!; little preparation and few resources needed. Cons: can be tuned to death; long time to build; requires wisely selected training data (false training yields false results); a big tree might require disk swapping (computation might be difficult if it does not fit into main memory).
  • 44. Project: "DMDW Mining Test" (explanation of one node)
  • 45. Project: "DMDW Mining Test" (shows connections, more useful if there are more predictable values)
  • 46. Project: "DMDW Mining Test" (Generic Content Tree Viewer; DMX (Data Mining Extensions))
  • 47. References for Decision Trees: Olivia Parr Rud et al., Data Mining Cookbook - Modeling Data for Marketing, Risk, and Customer Relationship Management, Wiley, 2001; David A. Grossman, Ophir Frieder, Introduction to Data Mining, Illinois Institute of Technology, 2005; Andrew W. Moore, Decision Trees, Carnegie Mellon University, http://www.autonlab.org/tutorials/dtree16.pdf; Nong Ye (ed.), The Handbook of Data Mining, Lawrence Erlbaum Associates, 2003; Sushmita Mitra, Tinku Acharya, Data Mining - Multimedia, Soft Computing and Bioinformatics, Wiley, 2003; http://en.wikipedia.org/wiki/Classification_tree
  • 48. 05 Data Mining Algorithms - Clustering
  • 49. Data Mining Algorithms - Clustering [Figure labels: X, 1, 2]
  • 50. Data Mining Algorithms - Clustering: Clustering is a segmentation algorithm. Find homogeneous groups within a set. Find similar variables for different cases. Identify new relationships that were unclear before (heuristics), e.g. "A person who rides a bike to work doesn't live far from his workplace" (this is not obvious).
  • 51. [Figure: homogeneous subsets, independent variables, description of class; labels: classify, identify, X, 1, 2]
  • 52. [Figure: homogeneous subsets, independent variables, description of class; 1. Clustering, 2. Classification; labels: classify, identify, X, 1, 2]
  • 53. Clustering: 1. Clustering. Reduces data to classes of equal types. Become friends with the data. Iterative algorithm: Clustering, Validate, Classify, Apply. http://msdn.microsoft.com/en-us/library/ms174879.aspx
  • 54. Data Mining Algorithms - Clustering: 2. Classification. Create a description of a group; give it a "name". Also: characterization.
  • 55. Process: Start with random values. Reuse will create different sets and different groups. A different clustering technique / algorithm will create different groups. Reuse on the same dataset, reseed. An expert evaluates the found classes and their plausibility. Good classes are used for predictions. Flow: 1. Clustering; evaluate, check (good?); 2. Classify; apply (predict).
  • 56. Clustering: MS Clustering Algorithm. Combination of two algorithms. K-Means (hard): a datapoint can be in only one cluster. Expectation Maximization (soft): a datapoint has different combinations; a datapoint belongs to different clusters; the probability is calculated. Source: http://msdn.microsoft.com/en-us/library/cc280445.aspx
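The difference between hard and soft cluster membership can be shown on a single assignment step. The sketch below is not the MS Clustering implementation and not a full K-Means or EM run: it assumes one-dimensional data, two fixed cluster centers, and equal-weight Gaussian clusters with a common spread `sigma`, all chosen for illustration.

```python
import math


def hard_assignment(point: float, centers) -> int:
    """K-Means style: the point belongs to exactly one (the nearest) cluster."""
    return min(range(len(centers)), key=lambda k: abs(point - centers[k]))


def soft_assignment(point: float, centers, sigma: float = 1.0):
    """EM style: membership probability for every cluster (sums to 1)."""
    weights = [math.exp(-((point - c) ** 2) / (2 * sigma ** 2)) for c in centers]
    total = sum(weights)
    return [w / total for w in weights]


centers = [0.0, 4.0]  # hypothetical cluster centers
point = 1.0

print("hard:", hard_assignment(point, centers))
print("soft:", [round(p, 3) for p in soft_assignment(point, centers)])
```

The hard variant only reports the index of the nearest cluster, while the soft variant keeps a small probability for the farther cluster as well; a point exactly halfway between the centers gets equal probabilities.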
  • 57. Clustering: Pros: no predictable variable to choose; trains itself without much effort; easy to configure. "Cons": interpretation is everything; a good eye is needed; an expert has to check for plausibility.
  • 58. Project: "DMDW Mining Test" (strongest relations only, amount of matching cases for Region Europe)
  • 59. Project: "DMDW Mining Test" (good to know: continuous attributes are shown by their arithmetic mean)
  • 60. Project: "DMDW Mining Test" (comparing two clusters)
  • 61. THANK YOU FOR YOUR ATTENTION