SlideShare a Scribd company logo
1 of 30
January 20, 2018 Data Mining: Concepts and Techniques1
Data Mining:
Concepts and
Techniques
— Slides for Textbook —
— Chapter 4 —
©Jiawei Han and Micheline Kamber
Intelligent Database Systems Research Lab
School of Computing Science
Simon Fraser University, Canada
http://www.cs.sfu.ca
January 20, 2018 Data Mining: Concepts and Techniques2
Chapter 4: Data Mining Primitives,
Languages, and System Architectures
 Data mining primitives: What defines a data
mining task?
 A data mining query language
 Design graphical user interfaces based on a
data mining query language
 Architecture of data mining systems
 Summary
January 20, 2018 Data Mining: Concepts and Techniques3
Why Data Mining Primitives and
Languages?
 Finding all the patterns autonomously in a database?
— unrealistic because the patterns could be too many
but uninteresting
 Data mining should be an interactive process
 User directs what to be mined
 Users must be provided with a set of primitives to be
used to communicate with the data mining system
 Incorporating these primitives in a data mining query
language
 More flexible user interaction
 Foundation for design of graphical user interface
 Standardization of data mining industry and practice
January 20, 2018 Data Mining: Concepts and Techniques4
What Defines a Data Mining Task ?
 Task-relevant data
 Type of knowledge to be mined
 Background knowledge
 Pattern interestingness measurements
 Visualization of discovered patterns
January 20, 2018 Data Mining: Concepts and Techniques5
Task-Relevant Data (Minable View)
 Database or data warehouse name
 Database tables or data warehouse cubes
 Condition for data selection
 Relevant attributes or dimensions
 Data grouping criteria
January 20, 2018 Data Mining: Concepts and Techniques6
Types of knowledge to be mined
 Characterization
 Discrimination
 Association
 Classification/prediction
 Clustering
 Outlier analysis
 Other data mining tasks
January 20, 2018 Data Mining: Concepts and Techniques7
Background Knowledge:
Concept Hierarchies
 Schema hierarchy
 E.g., street < city < province_or_state <
country
 Set-grouping hierarchy
 E.g., {20-39} = young, {40-59} = middle_aged
 Operation-derived hierarchy
 email address: login-name < department <
university < country
 Rule-based hierarchy
 low_profit_margin (X) <= price(X, P1) and
cost (X, P2) and (P1 - P2) < $50
January 20, 2018 Data Mining: Concepts and Techniques8
Measurements of Pattern
Interestingness
 Simplicity
e.g., (association) rule length, (decision) tree size
 Certainty
e.g., confidence, P(A|B) = n(A and B)/ n (B),
classification reliability or accuracy, certainty factor,
rule strength, rule quality, discriminating weight, etc.
 Utility
potential usefulness, e.g., support (association),
noise threshold (description)
 Novelty
not previously known, surprising (used to remove
redundant rules, e.g., Canada vs. Vancouver rule
implication support ratio
January 20, 2018 Data Mining: Concepts and Techniques9
Visualization of Discovered Patterns
 Different backgrounds/usages may require different forms
of representation
 E.g., rules, tables, crosstabs, pie/bar chart etc.
 Concept hierarchy is also important
 Discovered knowledge might be more understandable
when represented at high level of abstraction
 Interactive drill up/down, pivoting, slicing and dicing
provide different perspective to data
 Different kinds of knowledge require different
representation: association, classification, clustering, etc.
January 20, 2018 Data Mining: Concepts and Techniques10
Chapter 4: Data Mining Primitives,
Languages, and System Architectures
 Data mining primitives: What defines a data
mining task?
 A data mining query language
 Design graphical user interfaces based on a
data mining query language
 Architecture of data mining systems
 Summary
January 20, 2018 Data Mining: Concepts and Techniques11
A Data Mining Query Language
(DMQL)
 Motivation
 A DMQL can provide the ability to support ad-hoc and
interactive data mining
 By providing a standardized language like SQL

Hope to achieve a similar effect like that SQL has on relational
database

Foundation for system development and evolution

Facilitate information exchange, technology transfer,
commercialization and wide acceptance
 Design
 DMQL is designed with the primitives described earlier
January 20, 2018 Data Mining: Concepts and Techniques12
Syntax for DMQL
 Syntax for specification of
 task-relevant data
 the kind of knowledge to be mined
 concept hierarchy specification
 interestingness measure
 pattern presentation and visualization
 Putting it all together — a DMQL query
January 20, 2018 Data Mining: Concepts and Techniques13
Syntax for task-relevant data
specification
 use database database_name, or use data
warehouse data_warehouse_name
 from relation(s)/cube(s) [where condition]
 in relevance to att_or_dim_list
 order by order_list
 group by grouping_list
 having condition
January 20, 2018 Data Mining: Concepts and Techniques14
Specification of task-relevant data
January 20, 2018 Data Mining: Concepts and Techniques15
Syntax for specifying the kind of
knowledge to be mined
 Characterization
Mine_Knowledge_Specification  ::=
mine characteristics [as pattern_name]
analyze measure(s)
 Discrimination
Mine_Knowledge_Specification  ::=
mine comparison [as pattern_name]
for target_class where target_condition 
{versus contrast_class_i where
contrast_condition_i} 
analyze measure(s)
 Association
Mine_Knowledge_Specification  ::=
mine associations [as pattern_name]
January 20, 2018 Data Mining: Concepts and Techniques16
Syntax for specifying the kind of
knowledge to be mined (cont.)
Classification
Mine_Knowledge_Specification  ::=
mine classification [as pattern_name]
analyze classifying_attribute_or_dimension
Prediction
Mine_Knowledge_Specification  ::=
mine prediction [as pattern_name]
analyze prediction_attribute_or_dimension
{set {attribute_or_dimension_i= value_i}}
January 20, 2018 Data Mining: Concepts and Techniques17
Syntax for concept hierarchy
specification
 To specify what concept hierarchies to use
use hierarchy <hierarchy> for <attribute_or_dimension>
 We use different syntax to define different type of hierarchies
 schema hierarchies
define hierarchy time_hierarchy on date as [date,month
quarter,year]
 set-grouping hierarchies
define hierarchy age_hierarchy for age on customer as
level1: {young, middle_aged, senior} <
level0: all
level2: {20, ..., 39} < level1: young
level2: {40, ..., 59} < level1: middle_aged
level2: {60, ..., 89} < level1: senior
January 20, 2018 Data Mining: Concepts and Techniques18
Syntax for concept hierarchy
specification (Cont.)
 operation-derived hierarchies
define hierarchy age_hierarchy for age on customer
as
{age_category(1), ..., age_category(5)} :=
cluster(default, age, 5) < all(age)
 rule-based hierarchies
define hierarchy profit_margin_hierarchy on item as
level_1: low_profit_margin < level_0: all
if (price - cost)< $50
level_1: medium-profit_margin < level_0: all
if ((price - cost) > $50) and ((price - cost)
<= $250))
level_1: high_profit_margin < level_0: all
if (price - cost) > $250
January 20, 2018 Data Mining: Concepts and Techniques19
Syntax for interestingness measure
specification
 Interestingness measures and thresholds can be
specified by the user with the statement:
with <interest_measure_name>  threshold =
threshold_value
 Example:
with support threshold = 0.05
with confidence threshold = 0.7 
January 20, 2018 Data Mining: Concepts and Techniques20
Syntax for pattern presentation and
visualization specification
 We have syntax which allows users to specify the display
of discovered patterns in one or more forms
display as <result_form>
 To facilitate interactive viewing at different concept level,
the following syntax is defined:
Multilevel_Manipulation  ::=   roll up on
attribute_or_dimension
| drill down on
attribute_or_dimension
| add attribute_or_dimension
| drop
attribute_or_dimension
January 20, 2018 Data Mining: Concepts and Techniques21
Putting it all together: the full
specification of a DMQL query
use database AllElectronics_db
use hierarchy location_hierarchy for B.address
mine characteristics as customerPurchasing
analyze count%
in relevance to C.age, I.type, I.place_made
from customer C, item I, purchases P, items_sold S,
works_at W, branch
where I.item_ID = S.item_ID and S.trans_ID = P.trans_ID
and P.cust_ID = C.cust_ID and P.method_paid =
``AmEx''
and P.empl_ID = W.empl_ID and W.branch_ID =
B.branch_ID and B.address = ``Canada" and I.price
>= 100
with noise threshold = 0.05
display as table
January 20, 2018 Data Mining: Concepts and Techniques22
Other Data Mining Languages
& Standardization Efforts
 Association rule language specifications
 MSQL (Imielinski & Virmani’99)
 MineRule (Meo Psaila and Ceri’96)
 Query flocks based on Datalog syntax (Tsur et al’98)
 OLEDB for DM (Microsoft’2000)
 Based on OLE, OLE DB, OLE DB for OLAP
 Integrating DBMS, data warehouse and data mining
 CRISP-DM (CRoss-Industry Standard Process for Data Mining)
 Providing a platform and process structure for effective data
mining
 Emphasizing on deploying data mining technology to solve
business problems
January 20, 2018 Data Mining: Concepts and Techniques23
Chapter 4: Data Mining Primitives,
Languages, and System Architectures
 Data mining primitives: What defines a data
mining task?
 A data mining query language
 Design graphical user interfaces based on a
data mining query language
 Architecture of data mining systems
 Summary
January 20, 2018 Data Mining: Concepts and Techniques24
Designing Graphical User Interfaces
based on a data mining query language
 What tasks should be considered in the design GUIs
based on a data mining query language?
 Data collection and data mining query composition
 Presentation of discovered patterns
 Hierarchy specification and manipulation
 Manipulation of data mining primitives
 Interactive multilevel mining
 Other miscellaneous information
January 20, 2018 Data Mining: Concepts and Techniques25
Chapter 4: Data Mining Primitives,
Languages, and System Architectures
 Data mining primitives: What defines a data
mining task?
 A data mining query language
 Design graphical user interfaces based on a
data mining query language
 Architecture of data mining systems
 Summary
January 20, 2018 Data Mining: Concepts and Techniques26
Data Mining System Architectures
 Coupling data mining system with DB/DW system
 No coupling—flat file processing, not recommended
 Loose coupling

Fetching data from DB/DW
 Semi-tight coupling—enhanced DM performance

Provide efficient implement a few data mining primitives in a
DB/DW system, e.g., sorting, indexing, aggregation, histogram
analysis, multiway join, precomputation of some stat functions
 Tight coupling—A uniform information processing
environment

DM is smoothly integrated into a DB/DW system, mining query
is optimized based on mining query, indexing, query
processing methods, etc.
January 20, 2018 Data Mining: Concepts and Techniques27
Chapter 4: Data Mining Primitives,
Languages, and System Architectures
 Data mining primitives: What defines a data
mining task?
 A data mining query language
 Design graphical user interfaces based on a
data mining query language
 Architecture of data mining systems
 Summary
January 20, 2018 Data Mining: Concepts and Techniques28
Summary
 Five primitives for specification of a data mining task
 task-relevant data
 kind of knowledge to be mined
 background knowledge
 interestingness measures
 knowledge presentation and visualization techniques
to be used for displaying the discovered patterns
 Data mining query languages
 DMQL, MS/OLEDB for DM, etc.
 Data mining system architecture
 No coupling, loose coupling, semi-tight coupling, tight
coupling
January 20, 2018 Data Mining: Concepts and Techniques29
References
 E. Baralis and G. Psaila. Designing templates for mining association rules. Journal of Intelligent
Information Systems, 9:7-32, 1997.
 Microsoft Corp., OLEDB for Data Mining, version 1.0, http://www.microsoft.com/data/oledb/dm,
Aug. 2000.
 J. Han, Y. Fu, W. Wang, K. Koperski, and O. R. Zaiane, “DMQL: A Data Mining Query Language
for Relational Databases”, DMKD'96, Montreal, Canada, June 1996.
 T. Imielinski and A. Virmani. MSQL: A query language for database mining. Data Mining and
Knowledge Discovery, 3:373-408, 1999.
 M. Klemettinen, H. Mannila, P. Ronkainen, H. Toivonen, and A.I. Verkamo. Finding interesting rules
from large sets of discovered association rules. CIKM’94, Gaithersburg, Maryland, Nov. 1994.
 R. Meo, G. Psaila, and S. Ceri. A new SQL-like operator for mining association rules. VLDB'96,
pages 122-133, Bombay, India, Sept. 1996.
 A. Silberschatz and A. Tuzhilin. What makes patterns interesting in knowledge discovery systems.
IEEE Trans. on Knowledge and Data Engineering, 8:970-974, Dec. 1996.
 S. Sarawagi, S. Thomas, and R. Agrawal. Integrating association rule mining with relational
database systems: Alternatives and implications. SIGMOD'98, Seattle, Washington, June 1998.
 D. Tsur, J. D. Ullman, S. Abitboul, C. Clifton, R. Motwani, and S. Nestorov. Query flocks: A
generalization of association-rule mining. SIGMOD'98, Seattle, Washington, June 1998.
January 20, 2018 Data Mining: Concepts and Techniques30
http://www.cs.sfu.ca/~han
Thank you !!!Thank you !!!

More Related Content

What's hot

Data Mining: an Introduction
Data Mining: an IntroductionData Mining: an Introduction
Data Mining: an IntroductionAli Abbasi
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discoveryHoang Nguyen
 
Data mining and its concepts
Data mining and its conceptsData mining and its concepts
Data mining and its conceptsBharadwaj Sharma
 
Knowledge discovery thru data mining
Knowledge discovery thru data miningKnowledge discovery thru data mining
Knowledge discovery thru data miningDevakumar Jain
 
Its all about data mining
Its all about data miningIts all about data mining
Its all about data miningJason Rodrigues
 
Knowledge Discovery and Data Mining
Knowledge Discovery and Data MiningKnowledge Discovery and Data Mining
Knowledge Discovery and Data MiningAmritanshu Mehra
 
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD Editor
 
Introduction to Data mining
Introduction to Data miningIntroduction to Data mining
Introduction to Data miningHadi Fadlallah
 
Data mining-2
Data mining-2Data mining-2
Data mining-2Nit Hik
 
Chapter - 5 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kamber
Chapter - 5 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; KamberChapter - 5 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kamber
Chapter - 5 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kambererror007
 
Data mining presentation.ppt
Data mining presentation.pptData mining presentation.ppt
Data mining presentation.pptneelamoberoi1030
 
Knowledge discovery process
Knowledge discovery process Knowledge discovery process
Knowledge discovery process Shuvra Ghosh
 
Data warehousing and online analytical processing
Data warehousing and online analytical processingData warehousing and online analytical processing
Data warehousing and online analytical processingVijayasankariS
 
Data Mining Concepts
Data Mining ConceptsData Mining Concepts
Data Mining Conceptsdataminers.ir
 
Introduction to Data Mining
Introduction to Data MiningIntroduction to Data Mining
Introduction to Data MiningSi Krishan
 

What's hot (20)

Data Mining: an Introduction
Data Mining: an IntroductionData Mining: an Introduction
Data Mining: an Introduction
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discovery
 
Data mining and its concepts
Data mining and its conceptsData mining and its concepts
Data mining and its concepts
 
Ch 1 intro_dw
Ch 1 intro_dwCh 1 intro_dw
Ch 1 intro_dw
 
Knowledge discovery thru data mining
Knowledge discovery thru data miningKnowledge discovery thru data mining
Knowledge discovery thru data mining
 
Its all about data mining
Its all about data miningIts all about data mining
Its all about data mining
 
Knowledge Discovery and Data Mining
Knowledge Discovery and Data MiningKnowledge Discovery and Data Mining
Knowledge Discovery and Data Mining
 
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
 
Introduction to Data mining
Introduction to Data miningIntroduction to Data mining
Introduction to Data mining
 
Data mining-2
Data mining-2Data mining-2
Data mining-2
 
Chapter - 5 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kamber
Chapter - 5 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; KamberChapter - 5 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kamber
Chapter - 5 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kamber
 
Data mining presentation.ppt
Data mining presentation.pptData mining presentation.ppt
Data mining presentation.ppt
 
Knowledge discovery process
Knowledge discovery process Knowledge discovery process
Knowledge discovery process
 
Data warehousing and online analytical processing
Data warehousing and online analytical processingData warehousing and online analytical processing
Data warehousing and online analytical processing
 
Data mininng trends
Data mininng trendsData mininng trends
Data mininng trends
 
Kdd process
Kdd processKdd process
Kdd process
 
Introduction to DataMining
Introduction to DataMiningIntroduction to DataMining
Introduction to DataMining
 
Data Mining Concepts
Data Mining ConceptsData Mining Concepts
Data Mining Concepts
 
Data mining
Data miningData mining
Data mining
 
Introduction to Data Mining
Introduction to Data MiningIntroduction to Data Mining
Introduction to Data Mining
 

Similar to 4lang

Data Mining Presentation on Science Day 2023
Data Mining Presentation on Science Day 2023Data Mining Presentation on Science Day 2023
Data Mining Presentation on Science Day 2023SakshiTiwari490123
 
Cssu dw dm
Cssu dw dmCssu dw dm
Cssu dw dmsumit621
 
Data Mining : Concepts and Techniques
Data Mining : Concepts and TechniquesData Mining : Concepts and Techniques
Data Mining : Concepts and TechniquesDeepaR42
 
Data Mining mod1 ppt.pdf bca sixth semester notes
Data Mining mod1 ppt.pdf bca sixth semester notesData Mining mod1 ppt.pdf bca sixth semester notes
Data Mining mod1 ppt.pdf bca sixth semester notesasnaparveen414
 
Introduction To Data Mining
Introduction To Data MiningIntroduction To Data Mining
Introduction To Data Miningdataminers.ir
 
Data Mining Xuequn Shang NorthWestern Polytechnical University
Data Mining Xuequn Shang NorthWestern Polytechnical UniversityData Mining Xuequn Shang NorthWestern Polytechnical University
Data Mining Xuequn Shang NorthWestern Polytechnical Universitybutest
 
Overview of business intelligence
Overview of business intelligenceOverview of business intelligence
Overview of business intelligenceAhsan Kabir
 
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...Denodo
 
Module Overview Careers in Analytics In this module, we .docx
Module Overview  Careers in Analytics In this module, we .docxModule Overview  Careers in Analytics In this module, we .docx
Module Overview Careers in Analytics In this module, we .docxaudeleypearl
 
Module Overview Careers in Analytics In this module, we .docx
Module Overview  Careers in Analytics In this module, we .docxModule Overview  Careers in Analytics In this module, we .docx
Module Overview Careers in Analytics In this module, we .docxroushhsiu
 

Similar to 4lang (20)

10appl
10appl10appl
10appl
 
Using the LEADing Data Reference Content
Using the LEADing Data Reference ContentUsing the LEADing Data Reference Content
Using the LEADing Data Reference Content
 
Data Mining Presentation on Science Day 2023
Data Mining Presentation on Science Day 2023Data Mining Presentation on Science Day 2023
Data Mining Presentation on Science Day 2023
 
Data-Mining-2.ppt
Data-Mining-2.pptData-Mining-2.ppt
Data-Mining-2.ppt
 
Data mining-2
Data mining-2Data mining-2
Data mining-2
 
Cssu dw dm
Cssu dw dmCssu dw dm
Cssu dw dm
 
big-data-anallytics.pptx
big-data-anallytics.pptxbig-data-anallytics.pptx
big-data-anallytics.pptx
 
Data Mining : Concepts and Techniques
Data Mining : Concepts and TechniquesData Mining : Concepts and Techniques
Data Mining : Concepts and Techniques
 
Data Mining mod1 ppt.pdf bca sixth semester notes
Data Mining mod1 ppt.pdf bca sixth semester notesData Mining mod1 ppt.pdf bca sixth semester notes
Data Mining mod1 ppt.pdf bca sixth semester notes
 
Introduction To Data Mining
Introduction To Data MiningIntroduction To Data Mining
Introduction To Data Mining
 
Part1
Part1Part1
Part1
 
5desc
5desc5desc
5desc
 
Data Mining Xuequn Shang NorthWestern Polytechnical University
Data Mining Xuequn Shang NorthWestern Polytechnical UniversityData Mining Xuequn Shang NorthWestern Polytechnical University
Data Mining Xuequn Shang NorthWestern Polytechnical University
 
3prep
3prep3prep
3prep
 
Overview of business intelligence
Overview of business intelligenceOverview of business intelligence
Overview of business intelligence
 
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
 
6asso
6asso6asso
6asso
 
9cmplx
9cmplx9cmplx
9cmplx
 
Module Overview Careers in Analytics In this module, we .docx
Module Overview  Careers in Analytics In this module, we .docxModule Overview  Careers in Analytics In this module, we .docx
Module Overview Careers in Analytics In this module, we .docx
 
Module Overview Careers in Analytics In this module, we .docx
Module Overview  Careers in Analytics In this module, we .docxModule Overview  Careers in Analytics In this module, we .docx
Module Overview Careers in Analytics In this module, we .docx
 

Recently uploaded

Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.soniya singh
 
Russian Call Girls in Kolkata Samaira 🤌 8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Samaira 🤌  8250192130 🚀 Vip Call Girls KolkataRussian Call Girls in Kolkata Samaira 🤌  8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Samaira 🤌 8250192130 🚀 Vip Call Girls Kolkataanamikaraghav4
 
Russian Call Girls in Kolkata Ishita 🤌 8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Ishita 🤌  8250192130 🚀 Vip Call Girls KolkataRussian Call Girls in Kolkata Ishita 🤌  8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Ishita 🤌 8250192130 🚀 Vip Call Girls Kolkataanamikaraghav4
 
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...APNIC
 
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
VIP Kolkata Call Girl Salt Lake 👉 8250192130 Available With Room
VIP Kolkata Call Girl Salt Lake 👉 8250192130  Available With RoomVIP Kolkata Call Girl Salt Lake 👉 8250192130  Available With Room
VIP Kolkata Call Girl Salt Lake 👉 8250192130 Available With Roomishabajaj13
 
Challengers I Told Ya ShirtChallengers I Told Ya Shirt
Challengers I Told Ya ShirtChallengers I Told Ya ShirtChallengers I Told Ya ShirtChallengers I Told Ya Shirt
Challengers I Told Ya ShirtChallengers I Told Ya Shirtrahman018755
 
How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)Damian Radcliffe
 
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebGDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebJames Anderson
 
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Russian Call girls in Dubai +971563133746 Dubai Call girls
Russian  Call girls in Dubai +971563133746 Dubai  Call girlsRussian  Call girls in Dubai +971563133746 Dubai  Call girls
Russian Call girls in Dubai +971563133746 Dubai Call girlsstephieert
 
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call GirlVIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girladitipandeya
 
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779Delhi Call girls
 
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptxAWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptxellan12
 
Russian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl ServiceRussian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl Servicegwenoracqe6
 
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024APNIC
 

Recently uploaded (20)

Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.
 
Russian Call Girls in Kolkata Samaira 🤌 8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Samaira 🤌  8250192130 🚀 Vip Call Girls KolkataRussian Call Girls in Kolkata Samaira 🤌  8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Samaira 🤌 8250192130 🚀 Vip Call Girls Kolkata
 
Russian Call Girls in Kolkata Ishita 🤌 8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Ishita 🤌  8250192130 🚀 Vip Call Girls KolkataRussian Call Girls in Kolkata Ishita 🤌  8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Ishita 🤌 8250192130 🚀 Vip Call Girls Kolkata
 
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
 
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
 
VIP Kolkata Call Girl Salt Lake 👉 8250192130 Available With Room
VIP Kolkata Call Girl Salt Lake 👉 8250192130  Available With RoomVIP Kolkata Call Girl Salt Lake 👉 8250192130  Available With Room
VIP Kolkata Call Girl Salt Lake 👉 8250192130 Available With Room
 
Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...
Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...
Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...
 
Challengers I Told Ya ShirtChallengers I Told Ya Shirt
Challengers I Told Ya ShirtChallengers I Told Ya ShirtChallengers I Told Ya ShirtChallengers I Told Ya Shirt
Challengers I Told Ya ShirtChallengers I Told Ya Shirt
 
How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)
 
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebGDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
 
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝
 
Russian Call girls in Dubai +971563133746 Dubai Call girls
Russian  Call girls in Dubai +971563133746 Dubai  Call girlsRussian  Call girls in Dubai +971563133746 Dubai  Call girls
Russian Call girls in Dubai +971563133746 Dubai Call girls
 
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call GirlVIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
 
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
 
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptxAWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
 
Russian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl ServiceRussian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl Service
 
Model Call Girl in Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in  Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in  Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
 
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
 
Rohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 

4lang

  • 1. January 20, 2018 Data Mining: Concepts and Techniques1 Data Mining: Concepts and Techniques — Slides for Textbook — — Chapter 4 — ©Jiawei Han and Micheline Kamber Intelligent Database Systems Research Lab School of Computing Science Simon Fraser University, Canada http://www.cs.sfu.ca
  • 2. January 20, 2018 Data Mining: Concepts and Techniques2 Chapter 4: Data Mining Primitives, Languages, and System Architectures  Data mining primitives: What defines a data mining task?  A data mining query language  Design graphical user interfaces based on a data mining query language  Architecture of data mining systems  Summary
  • 3. January 20, 2018 Data Mining: Concepts and Techniques3 Why Data Mining Primitives and Languages?  Finding all the patterns autonomously in a database? — unrealistic because the patterns could be too many but uninteresting  Data mining should be an interactive process  User directs what to be mined  Users must be provided with a set of primitives to be used to communicate with the data mining system  Incorporating these primitives in a data mining query language  More flexible user interaction  Foundation for design of graphical user interface  Standardization of data mining industry and practice
  • 4. January 20, 2018 Data Mining: Concepts and Techniques4 What Defines a Data Mining Task ?  Task-relevant data  Type of knowledge to be mined  Background knowledge  Pattern interestingness measurements  Visualization of discovered patterns
  • 5. January 20, 2018 Data Mining: Concepts and Techniques5 Task-Relevant Data (Minable View)  Database or data warehouse name  Database tables or data warehouse cubes  Condition for data selection  Relevant attributes or dimensions  Data grouping criteria
  • 6. January 20, 2018 Data Mining: Concepts and Techniques6 Types of knowledge to be mined  Characterization  Discrimination  Association  Classification/prediction  Clustering  Outlier analysis  Other data mining tasks
  • 7. January 20, 2018 Data Mining: Concepts and Techniques7 Background Knowledge: Concept Hierarchies  Schema hierarchy  E.g., street < city < province_or_state < country  Set-grouping hierarchy  E.g., {20-39} = young, {40-59} = middle_aged  Operation-derived hierarchy  email address: login-name < department < university < country  Rule-based hierarchy  low_profit_margin (X) <= price(X, P1) and cost (X, P2) and (P1 - P2) < $50
  • 8. January 20, 2018 Data Mining: Concepts and Techniques8 Measurements of Pattern Interestingness  Simplicity e.g., (association) rule length, (decision) tree size  Certainty e.g., confidence, P(A|B) = n(A and B)/ n (B), classification reliability or accuracy, certainty factor, rule strength, rule quality, discriminating weight, etc.  Utility potential usefulness, e.g., support (association), noise threshold (description)  Novelty not previously known, surprising (used to remove redundant rules, e.g., Canada vs. Vancouver rule implication support ratio
  • 9. January 20, 2018 Data Mining: Concepts and Techniques9 Visualization of Discovered Patterns  Different backgrounds/usages may require different forms of representation  E.g., rules, tables, crosstabs, pie/bar chart etc.  Concept hierarchy is also important  Discovered knowledge might be more understandable when represented at high level of abstraction  Interactive drill up/down, pivoting, slicing and dicing provide different perspective to data  Different kinds of knowledge require different representation: association, classification, clustering, etc.
  • 10. January 20, 2018 Data Mining: Concepts and Techniques10 Chapter 4: Data Mining Primitives, Languages, and System Architectures  Data mining primitives: What defines a data mining task?  A data mining query language  Design graphical user interfaces based on a data mining query language  Architecture of data mining systems  Summary
  • 11. January 20, 2018 Data Mining: Concepts and Techniques11 A Data Mining Query Language (DMQL)  Motivation  A DMQL can provide the ability to support ad-hoc and interactive data mining  By providing a standardized language like SQL  Hope to achieve a similar effect like that SQL has on relational database  Foundation for system development and evolution  Facilitate information exchange, technology transfer, commercialization and wide acceptance  Design  DMQL is designed with the primitives described earlier
  • 12. January 20, 2018 Data Mining: Concepts and Techniques12 Syntax for DMQL  Syntax for specification of  task-relevant data  the kind of knowledge to be mined  concept hierarchy specification  interestingness measure  pattern presentation and visualization  Putting it all together — a DMQL query
  • 13. January 20, 2018 Data Mining: Concepts and Techniques13 Syntax for task-relevant data specification  use database database_name, or use data warehouse data_warehouse_name  from relation(s)/cube(s) [where condition]  in relevance to att_or_dim_list  order by order_list  group by grouping_list  having condition
  • 14. January 20, 2018 Data Mining: Concepts and Techniques14 Specification of task-relevant data
  • 15. January 20, 2018 Data Mining: Concepts and Techniques15 Syntax for specifying the kind of knowledge to be mined  Characterization Mine_Knowledge_Specification  ::= mine characteristics [as pattern_name] analyze measure(s)  Discrimination Mine_Knowledge_Specification  ::= mine comparison [as pattern_name] for target_class where target_condition  {versus contrast_class_i where contrast_condition_i}  analyze measure(s)  Association Mine_Knowledge_Specification  ::= mine associations [as pattern_name]
  • 16. January 20, 2018 Data Mining: Concepts and Techniques16 Syntax for specifying the kind of knowledge to be mined (cont.) Classification Mine_Knowledge_Specification  ::= mine classification [as pattern_name] analyze classifying_attribute_or_dimension Prediction Mine_Knowledge_Specification  ::= mine prediction [as pattern_name] analyze prediction_attribute_or_dimension {set {attribute_or_dimension_i= value_i}}
  • 17. January 20, 2018 Data Mining: Concepts and Techniques17 Syntax for concept hierarchy specification  To specify what concept hierarchies to use use hierarchy <hierarchy> for <attribute_or_dimension>  We use different syntax to define different type of hierarchies  schema hierarchies define hierarchy time_hierarchy on date as [date,month quarter,year]  set-grouping hierarchies define hierarchy age_hierarchy for age on customer as level1: {young, middle_aged, senior} < level0: all level2: {20, ..., 39} < level1: young level2: {40, ..., 59} < level1: middle_aged level2: {60, ..., 89} < level1: senior
  • 18. January 20, 2018 Data Mining: Concepts and Techniques18 Syntax for concept hierarchy specification (Cont.)  operation-derived hierarchies define hierarchy age_hierarchy for age on customer as {age_category(1), ..., age_category(5)} := cluster(default, age, 5) < all(age)  rule-based hierarchies define hierarchy profit_margin_hierarchy on item as level_1: low_profit_margin < level_0: all if (price - cost)< $50 level_1: medium-profit_margin < level_0: all if ((price - cost) > $50) and ((price - cost) <= $250)) level_1: high_profit_margin < level_0: all if (price - cost) > $250
  • 19. January 20, 2018 Data Mining: Concepts and Techniques19 Syntax for interestingness measure specification  Interestingness measures and thresholds can be specified by the user with the statement: with <interest_measure_name>  threshold = threshold_value  Example: with support threshold = 0.05 with confidence threshold = 0.7 
  • 20. January 20, 2018 Data Mining: Concepts and Techniques20 Syntax for pattern presentation and visualization specification  We have syntax which allows users to specify the display of discovered patterns in one or more forms display as <result_form>  To facilitate interactive viewing at different concept level, the following syntax is defined: Multilevel_Manipulation  ::=   roll up on attribute_or_dimension | drill down on attribute_or_dimension | add attribute_or_dimension | drop attribute_or_dimension
  • 21. January 20, 2018 Data Mining: Concepts and Techniques21 Putting it all together: the full specification of a DMQL query use database AllElectronics_db use hierarchy location_hierarchy for B.address mine characteristics as customerPurchasing analyze count% in relevance to C.age, I.type, I.place_made from customer C, item I, purchases P, items_sold S, works_at W, branch where I.item_ID = S.item_ID and S.trans_ID = P.trans_ID and P.cust_ID = C.cust_ID and P.method_paid = ``AmEx'' and P.empl_ID = W.empl_ID and W.branch_ID = B.branch_ID and B.address = ``Canada" and I.price >= 100 with noise threshold = 0.05 display as table
  • 22. January 20, 2018 Data Mining: Concepts and Techniques22 Other Data Mining Languages & Standardization Efforts  Association rule language specifications  MSQL (Imielinski & Virmani’99)  MineRule (Meo Psaila and Ceri’96)  Query flocks based on Datalog syntax (Tsur et al’98)  OLEDB for DM (Microsoft’2000)  Based on OLE, OLE DB, OLE DB for OLAP  Integrating DBMS, data warehouse and data mining  CRISP-DM (CRoss-Industry Standard Process for Data Mining)  Providing a platform and process structure for effective data mining  Emphasizing on deploying data mining technology to solve business problems
  • 23. January 20, 2018 Data Mining: Concepts and Techniques23 Chapter 4: Data Mining Primitives, Languages, and System Architectures  Data mining primitives: What defines a data mining task?  A data mining query language  Design graphical user interfaces based on a data mining query language  Architecture of data mining systems  Summary
  • 24. January 20, 2018 Data Mining: Concepts and Techniques24 Designing Graphical User Interfaces based on a data mining query language  What tasks should be considered in the design GUIs based on a data mining query language?  Data collection and data mining query composition  Presentation of discovered patterns  Hierarchy specification and manipulation  Manipulation of data mining primitives  Interactive multilevel mining  Other miscellaneous information
  • 25. January 20, 2018 Data Mining: Concepts and Techniques25 Chapter 4: Data Mining Primitives, Languages, and System Architectures  Data mining primitives: What defines a data mining task?  A data mining query language  Design graphical user interfaces based on a data mining query language  Architecture of data mining systems  Summary
  • 26. January 20, 2018 Data Mining: Concepts and Techniques26 Data Mining System Architectures  Coupling data mining system with DB/DW system  No coupling—flat file processing, not recommended  Loose coupling  Fetching data from DB/DW  Semi-tight coupling—enhanced DM performance  Provide efficient implement a few data mining primitives in a DB/DW system, e.g., sorting, indexing, aggregation, histogram analysis, multiway join, precomputation of some stat functions  Tight coupling—A uniform information processing environment  DM is smoothly integrated into a DB/DW system, mining query is optimized based on mining query, indexing, query processing methods, etc.
  • 27. January 20, 2018 Data Mining: Concepts and Techniques27 Chapter 4: Data Mining Primitives, Languages, and System Architectures  Data mining primitives: What defines a data mining task?  A data mining query language  Design graphical user interfaces based on a data mining query language  Architecture of data mining systems  Summary
  • 28. January 20, 2018 Data Mining: Concepts and Techniques28 Summary  Five primitives for specification of a data mining task  task-relevant data  kind of knowledge to be mined  background knowledge  interestingness measures  knowledge presentation and visualization techniques to be used for displaying the discovered patterns  Data mining query languages  DMQL, MS/OLEDB for DM, etc.  Data mining system architecture  No coupling, loose coupling, semi-tight coupling, tight coupling
  • 29. January 20, 2018 Data Mining: Concepts and Techniques29 References  E. Baralis and G. Psaila. Designing templates for mining association rules. Journal of Intelligent Information Systems, 9:7-32, 1997.  Microsoft Corp., OLEDB for Data Mining, version 1.0, http://www.microsoft.com/data/oledb/dm, Aug. 2000.  J. Han, Y. Fu, W. Wang, K. Koperski, and O. R. Zaiane, “DMQL: A Data Mining Query Language for Relational Databases”, DMKD'96, Montreal, Canada, June 1996.  T. Imielinski and A. Virmani. MSQL: A query language for database mining. Data Mining and Knowledge Discovery, 3:373-408, 1999.  M. Klemettinen, H. Mannila, P. Ronkainen, H. Toivonen, and A.I. Verkamo. Finding interesting rules from large sets of discovered association rules. CIKM’94, Gaithersburg, Maryland, Nov. 1994.  R. Meo, G. Psaila, and S. Ceri. A new SQL-like operator for mining association rules. VLDB'96, pages 122-133, Bombay, India, Sept. 1996.  A. Silberschatz and A. Tuzhilin. What makes patterns interesting in knowledge discovery systems. IEEE Trans. on Knowledge and Data Engineering, 8:970-974, Dec. 1996.  S. Sarawagi, S. Thomas, and R. Agrawal. Integrating association rule mining with relational database systems: Alternatives and implications. SIGMOD'98, Seattle, Washington, June 1998.  D. Tsur, J. D. Ullman, S. Abitboul, C. Clifton, R. Motwani, and S. Nestorov. Query flocks: A generalization of association-rule mining. SIGMOD'98, Seattle, Washington, June 1998.
  • 30. January 20, 2018 Data Mining: Concepts and Techniques30 http://www.cs.sfu.ca/~han Thank you !!!Thank you !!!