SlideShare a Scribd company logo
1 of 17
The 5th International Joint
conference on Rules and Reasoning
Sept. 8–15, 2021
(virtually in) Leuven, Belgium
An Answer Set Programming based framework for High-Utility
Pattern Mining extended with Facets and Advanced Utility
Functions
Francesco Cauteruccio and Giorgio Terracina
DEMACS, University of Calabria, Italy
{cauteruccio, terracina}@mat.unical.it
Outline
• Context and Motivation
• Proposed Framework
• ASP Approach
• Experimental Evaluation
• Conclusion
1/17
Context and Motivation
• Pattern Mining is one of the most studied data mining branches
• Find interesting patterns (set of items) in a database of transactions
• Frequent pattern mining, sequential pattern mining, etc…
• High-Utility Pattern Mining (HUPM)
• Find patterns having a high-utility (w.r.t. some utility measure)
• Example: in a sales database, the utility of a pattern may be represented by the profit of some items bought
together.
• Basic assumption: each item is associated with one, static utility .
• However…
• The utility of an item can be defined from very different point of views,
• Transactions are not only flat lists of items but they can provide different level of abstractions.
Pattern Mining and High-Utility Pattern Mining
2/17
Context and Motivation
• We present a framework for HUPM extending basic notions and introducing:
• A higher level of abstraction with transaction set representation
• For each transaction, an Object, a Container and a Database level of aggregation can be defined.
• The notion of facet
• A facet is an attribute which can be associated with an item, a transaction, an object or a container;
• Each element may be characterized by more than one facet.
• A taxonomy of extended utility functions
• Based on the database structure and facets,
• They can be combined in several ways to fit different notions of utility.
An extended framework for HUPM
3/17
Proposed Framework
• The database representation is multi-layer
• 𝐷𝑎𝑡𝑎𝑏𝑎𝑠𝑒 → 𝐶𝑜𝑛𝑡𝑎𝑖𝑛𝑒𝑟 → 𝑂𝑏𝑗𝑒𝑐𝑡 → 𝑇𝑟𝑎𝑛𝑠𝑎𝑐𝑡𝑖𝑜𝑛
• Given a database 𝐷 and a set of transactions {𝑇!, … , 𝑇"}
• 𝐷 is organized as a set of containers 𝐶 = {𝐶!, … , 𝐶"}
• 𝐶# can be associated with a set of objects 𝑂 = {𝑂!, … , 𝑂$}
• 𝑂% contains a set of transactions {𝑇!, … , 𝑇&}
Extending the database structure
Running example depicting a sales database
5/17
Proposed Framework
• The utility of 𝑖 may be defined from
different perspectives: facets.
• Each item 𝑖 can be associated with
one or more facets.
• Facets can also be defined for
transactions, objects and containers.
• To describe facets, we use the notion
of utility vector.
Facets
6/17
Proposed Framework
• Item utility vector for an item 𝑖
• 𝐼𝑈! = [𝑖𝑢", 𝑖𝑢#, … , 𝑖𝑢$], and 𝑖𝑢% describes a certain facet of 𝑖
• Transaction utility vector for a transaction 𝑇#
• 𝑇𝑈&!
= [𝑡𝑢", 𝑡𝑢#, … , 𝑡𝑢'], and 𝑡𝑢% describes a certain facet of 𝑇(
• Internal utility of 𝑖 in 𝑇( is available and denoted as 𝑞(𝑖, 𝑇()
• Object utility vector
• 𝑂𝑈) = 𝑜𝑢", 𝑜𝑢#, … , 𝑜𝑢* , and 𝑜𝑢% describes a certain facet of 𝑂
• Container utility vector
• 𝐶𝑈+ = [𝑐𝑢", 𝑐𝑢#, … , 𝑐𝑢,], and 𝑐𝑢% describes a certain facet of 𝐶
Facets and utility vectors
facets: price, weight
7/17
Proposed Framework
• Intra-pattern utility function
• Let 𝑃 be a pattern with 𝑟 items, and 𝑇( be a transaction containing it,
• Let 𝐼𝑈𝑆 = {𝐼𝑈", … , 𝐼𝑈-} be the set of item utility vectors of 𝑃
• The intra-pattern utility function 𝐼𝑈&!
= 𝑖𝑝(𝑃, 𝑇(, 𝐼𝑈𝑆) generates an
unique item utility vector for the pattern occurrence.
• 𝑖𝑝 ⋅ can be any function combining the utilities across the facets,
such as SUM, MAX, MIN, AVG, etc…
• Pattern utility function
• The occurrence utility vector for a pattern 𝑃 is 𝑂𝑐𝑐𝑈&!
=
[𝐼𝑈&!
, 𝑇𝑈&!
, 𝑂𝑈&!
, 𝐶𝑈&!
]
• The pattern utility vector 𝑈. is the collection of all the occurrence
utility vectors of 𝑃: 𝑈. = ⋃&!∈&"
𝑂𝑐𝑐𝑈&!
• 𝑈. is a matrix (occurrence utility vectors × facets).
Advanced utility functions
In this example, 𝑖𝑝 = ∑ 𝑓𝑎𝑐𝑒𝑡 × 𝑖𝑛𝑡𝑒𝑟𝑛𝑎𝑙𝑈𝑡𝑖𝑙𝑖𝑡𝑦
8/17
Proposed Framework
• The utility 𝑢 of a pattern 𝑃 can be obtained as an arbitrary
combination of the values of 𝑈! using a function 𝑢(𝑃)
• 𝑢(𝑃) can be classified as
• Horizontal first 𝑢 𝑃 = 𝑓?(𝑓@(𝑢.)) combines by row (facets), then by
column (occurrences)
• Vertical first 𝑢 𝑃 = 𝑓@(𝑓?(𝑢.)) combines by column (occurrences),
then by row (facets)
• Mixed 𝑢 𝑃 = 𝑓 𝑢. combines the values at once
• All of these classifications can be further classified in
• Inter-transaction utility
• Pattern-vs-object utility
• Pattern-vs-container utility
• These can be exploited to define utility measures relating
item/transaction facets and one of the object/container facets,
such as Pearson and Multiple correlation.
Advanced utility functions
As an example, suppose we select 𝑓# =
𝑓𝑖𝑙𝑡𝑒𝑟 ⋅ and 𝑓$ = max ⋅ , that is we first
filter for a single facet, and then we take
the maximum across all the occurrences.
9/17
Proposed Framework
• Given a pattern 𝑃, we say that 𝑃 is an extended high-utility pattern if its utility 𝑢(𝑃) is greater than
a minimum threshold 𝑡ℎ$ and it occurs in at least 𝑡ℎ% transactions.
• The problem of extended high-utility pattern mining (e-HUPM) is to discover all the extended
high-utility patterns in a given database 𝐷.
The e–HUPM problem
10/17
ASP Approach
• We want to provide as much flexibility as possible in the definition of what is a useful pattern.
• Encoding the problem in Answer Set Programming (ASP) helps achieving the desidered flexibility
and modularity.
• Classic guess-and-check scenario
• We generate one answer set for each valid pattern,
• Complex utility functions are executed by means of external functions (e.g., in DLVHEX, WASP, clingo)
• Pattern validity criteria and filter can be easily applied by encoding them as rules.
The why and the how
11/17
ASP Approach
The encoding
%%% Input schema:
%container(ContainerId)
%object(ObjectId,ContainerId)
%transaction(Tid, ObjectId)
%item(Item, Tid, Position, Q)
%itemUtilityVector(Item, I1, ..., Il)
%transactionUtilityVector(Tid, T1, ..., Tm)
%objectUtilityVector(ObjectId, O1, ..., On)
%containerUtilityVector(ContainerId, C1, ..., Co)
%%% Parameters
occurrencesThreshold(...). utilityTreshold(...).
%%% Item pre-filtering
usefulItem(I):- item(I,_,_,_),....any condition on the items.
%%% Candidate pattern generation
{inCandidatePattern(I)}:- usefulItem(I).
%%% Occurrences computation and check
inTransaction(Tid):- transaction(Tid,_), not incomplete(Tid).
incomplete(TiD):- transaction(Tid,_), inCandidatePattern(I), not contains(I,Tid).
contains(I,Tid):- item(I,Tid,_,_).
:- #count{ Tid : inTransaction(Tid)}=N, N < Tho, occurrencesThreshold(Tho).
%%% Utility computation
patternItemUtilityVectors(Tid,Item,I1,...,Il,Q):- inCandidatePattern(Item),
itemUtilityVector(Item, I1, ..., Il), inTransaction(Tid), item(Item, Tid, Position, Q).
intraPatternUtilityVector(Tid,I1,...,Il):-
&computeIntraPatternUtility[patternItemUtilityVectors](Tid,I1,...,Il).
occurrenceUtilityVector(Tid,I1,...,Il,T1,...Tm,O1,...On,C1,...,Co):-
inTransaction(Tid), intraPatternUtilityVector(Tid,I1,...,Il)
transactionUtilityVector(Tid, T1, ..., Tm), transaction(Tid, ObjectId),
objectUtilityVector(ObjectId, O1, ..., On), object(ObjectId , ContainerId)
containerUtilityVector(ContainerId, C1, ..., Co).
:- &computeUtility[occurrenceUtilityVector](U), U < Thu, utilityTreshold(Thu).
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30 12/17
Experimental Evaluation
• Aspect-based sentiment analysis of scientific reviews [1]
• Each review is annotated with 8 different aspects: appropriataness, clarity, originality,
empirical/theoretical soudness, meaningful comparison, substance, impact and recommendation
• Each aspect has assigned one sentiment value from {𝑝𝑜𝑠𝑖𝑡𝑖𝑣𝑒, 𝑛𝑒𝑔𝑎𝑡𝑖𝑣𝑒, 𝑛𝑒𝑢𝑡𝑟𝑎𝑙, 𝑎𝑏𝑠𝑒𝑛𝑡}.
• 814 papers and a total of 1148 reviews.
• 2230 annotated lines with a total of 15124 distinct words.
Dataset
[1] Chakraborty, S., Goyal, P., Mukherjee, A.: Aspect-based sentiment analysis of scientific reviews. In: JCDL ’20: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020, Virtual Event, China, August 1-5,
2020. pp. 207–216 (2020), ACM
13/17
Experimental Evaluation
• Average running time
• 𝑢(𝑃): disagreement between originality and decision (% of
pattern occurrences showing a positive originality and a
reject decision, thresholds: [15, 50, 75, 100])
Quantitative analysis
• Comparison with HUPM systems
• No analysis involving more than one
elements can be devised with such
systems
• Here, the utility of an item is the absolute
value of the appropriateness aspect in
the sentence it first appears in.
14/17
Experimental Evaluation
• 𝑢 𝑃 : Pearson correlation between an aspect 𝑋
and the final decision on the corresponding
paper.
• «The set of review sentences containing 𝑃 are
characterized by a direct correlation between the
sentiment on 𝑋 and the final decision on the
corresponding paper»
• 𝑢(𝑃): Multiple correlation between two aspects
𝑋 and 𝑌 and the final decision on the
corresponding paper.
Qualitative analysis
15/17
Conclusion
• We introduced a general framework for HUPM with several extensions.
• The framework allows to work with multi-dimensional data and with different utility measures.
• A versatile and modular ASP encoding has been developed.
• We employed a real use case on paper review to carry out both quantitative and qualitative
analyses.
• Facets and advanced utility functions help reducing the amount of relevant patterns
• Useful in providing deep insights on the data.
• Not an ending point!
• Apply the framework to new contexts,
• Derive ad-hoc algorithms for the extended utility functions.
16/17
Thanks for your attention!
These slides available at https://bit.ly/ehupm-ruleml21
Francesco Cauteruccio
Research Fellow @ DEMACS, University of Calabria
cauteruccio@mat.unical.it
francescocauteruccio.info
@finalfire

More Related Content

What's hot

The science behind predictive analytics a text mining perspective
The science behind predictive analytics  a text mining perspectiveThe science behind predictive analytics  a text mining perspective
The science behind predictive analytics a text mining perspectiveankurpandeyinfo
 
lazy learners and other classication methods
lazy learners and other classication methodslazy learners and other classication methods
lazy learners and other classication methodsrajshreemuthiah
 
A Modified KS-test for Feature Selection
A Modified KS-test for Feature SelectionA Modified KS-test for Feature Selection
A Modified KS-test for Feature SelectionIOSR Journals
 
Data mining technique for classification and feature evaluation using stream ...
Data mining technique for classification and feature evaluation using stream ...Data mining technique for classification and feature evaluation using stream ...
Data mining technique for classification and feature evaluation using stream ...ranjit banshpal
 
Textmining Predictive Models
Textmining Predictive ModelsTextmining Predictive Models
Textmining Predictive Modelsguest0edcaf
 
2 introductory slides
2 introductory slides2 introductory slides
2 introductory slidestafosepsdfasg
 

What's hot (12)

The science behind predictive analytics a text mining perspective
The science behind predictive analytics  a text mining perspectiveThe science behind predictive analytics  a text mining perspective
The science behind predictive analytics a text mining perspective
 
2.mathematics for machine learning
2.mathematics for machine learning2.mathematics for machine learning
2.mathematics for machine learning
 
4 module 3 --
4 module 3 --4 module 3 --
4 module 3 --
 
lazy learners and other classication methods
lazy learners and other classication methodslazy learners and other classication methods
lazy learners and other classication methods
 
A Modified KS-test for Feature Selection
A Modified KS-test for Feature SelectionA Modified KS-test for Feature Selection
A Modified KS-test for Feature Selection
 
Matlab:Regression
Matlab:RegressionMatlab:Regression
Matlab:Regression
 
Data mining technique for classification and feature evaluation using stream ...
Data mining technique for classification and feature evaluation using stream ...Data mining technique for classification and feature evaluation using stream ...
Data mining technique for classification and feature evaluation using stream ...
 
Textmining Predictive Models
Textmining Predictive ModelsTextmining Predictive Models
Textmining Predictive Models
 
2 introductory slides
2 introductory slides2 introductory slides
2 introductory slides
 
7 decision tree
7 decision tree7 decision tree
7 decision tree
 
3 module 2
3 module 23 module 2
3 module 2
 
Matlab Data And Statistics
Matlab Data And StatisticsMatlab Data And Statistics
Matlab Data And Statistics
 

Similar to An Answer Set Programming based framework for High-Utility Pattern Mining extended with Facets and Advanced Utility Functions

Algorithms and Data Structures
Algorithms and Data StructuresAlgorithms and Data Structures
Algorithms and Data Structuressonykhan3
 
UNIT_5_Data Wrangling.pptx
UNIT_5_Data Wrangling.pptxUNIT_5_Data Wrangling.pptx
UNIT_5_Data Wrangling.pptxBhagyasriPatel2
 
11.query optimization to improve performance of the code execution
11.query optimization to improve performance of the code execution11.query optimization to improve performance of the code execution
11.query optimization to improve performance of the code executionAlexander Decker
 
Query optimization to improve performance of the code execution
Query optimization to improve performance of the code executionQuery optimization to improve performance of the code execution
Query optimization to improve performance of the code executionAlexander Decker
 
Predicting Stock Market Price Using Support Vector Regression
Predicting Stock Market Price Using Support Vector RegressionPredicting Stock Market Price Using Support Vector Regression
Predicting Stock Market Price Using Support Vector RegressionChittagong Independent University
 
Programming in python
Programming in pythonProgramming in python
Programming in pythonIvan Rojas
 
Introduction to Design Patterns in Javascript
Introduction to Design Patterns in JavascriptIntroduction to Design Patterns in Javascript
Introduction to Design Patterns in JavascriptSanthosh Kumar Srinivasan
 
Introduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnIntroduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnBenjamin Bengfort
 
Data structure and algorithm.
Data structure and algorithm. Data structure and algorithm.
Data structure and algorithm. Abdul salam
 
Data Structure and Algorithms
Data Structure and AlgorithmsData Structure and Algorithms
Data Structure and Algorithmsiqbalphy1
 
cs 601 - lecture 1.pptx
cs 601 - lecture 1.pptxcs 601 - lecture 1.pptx
cs 601 - lecture 1.pptxGopalPatidar13
 
Data Analysis – Technical learnings
Data Analysis – Technical learningsData Analysis – Technical learnings
Data Analysis – Technical learningsInvenkLearn
 
Preservation Planning using Plato, by Hannes Kulovits and Andreas Rauber
Preservation Planning using Plato, by Hannes Kulovits and Andreas RauberPreservation Planning using Plato, by Hannes Kulovits and Andreas Rauber
Preservation Planning using Plato, by Hannes Kulovits and Andreas RauberJISC KeepIt project
 
2021 icse reducedsylabiix-computer applications
2021 icse reducedsylabiix-computer applications2021 icse reducedsylabiix-computer applications
2021 icse reducedsylabiix-computer applicationsVahabshaik Shai
 
Data Structures - Lecture 1 - Unit 1.pptx
Data Structures  - Lecture 1 - Unit 1.pptxData Structures  - Lecture 1 - Unit 1.pptx
Data Structures - Lecture 1 - Unit 1.pptxDanielNesaKumarC
 
Scalable Similarity-Based Neighborhood Methods with MapReduce
Scalable Similarity-Based Neighborhood Methods with MapReduceScalable Similarity-Based Neighborhood Methods with MapReduce
Scalable Similarity-Based Neighborhood Methods with MapReducesscdotopen
 
Algorithms Analysis.pdf
Algorithms Analysis.pdfAlgorithms Analysis.pdf
Algorithms Analysis.pdfShaistaRiaz4
 

Similar to An Answer Set Programming based framework for High-Utility Pattern Mining extended with Facets and Advanced Utility Functions (20)

Algorithms and Data Structures
Algorithms and Data StructuresAlgorithms and Data Structures
Algorithms and Data Structures
 
UNIT_5_Data Wrangling.pptx
UNIT_5_Data Wrangling.pptxUNIT_5_Data Wrangling.pptx
UNIT_5_Data Wrangling.pptx
 
11.query optimization to improve performance of the code execution
11.query optimization to improve performance of the code execution11.query optimization to improve performance of the code execution
11.query optimization to improve performance of the code execution
 
Query optimization to improve performance of the code execution
Query optimization to improve performance of the code executionQuery optimization to improve performance of the code execution
Query optimization to improve performance of the code execution
 
Predicting Stock Market Price Using Support Vector Regression
Predicting Stock Market Price Using Support Vector RegressionPredicting Stock Market Price Using Support Vector Regression
Predicting Stock Market Price Using Support Vector Regression
 
Programming in python
Programming in pythonProgramming in python
Programming in python
 
Introduction to Design Patterns in Javascript
Introduction to Design Patterns in JavascriptIntroduction to Design Patterns in Javascript
Introduction to Design Patterns in Javascript
 
Introduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnIntroduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-Learn
 
Data structure and algorithm.
Data structure and algorithm. Data structure and algorithm.
Data structure and algorithm.
 
AutoML lectures (ACDL 2019)
AutoML lectures (ACDL 2019)AutoML lectures (ACDL 2019)
AutoML lectures (ACDL 2019)
 
algo 1.ppt
algo 1.pptalgo 1.ppt
algo 1.ppt
 
Data Structure and Algorithms
Data Structure and AlgorithmsData Structure and Algorithms
Data Structure and Algorithms
 
cs 601 - lecture 1.pptx
cs 601 - lecture 1.pptxcs 601 - lecture 1.pptx
cs 601 - lecture 1.pptx
 
Data Analysis – Technical learnings
Data Analysis – Technical learningsData Analysis – Technical learnings
Data Analysis – Technical learnings
 
Preservation Planning using Plato, by Hannes Kulovits and Andreas Rauber
Preservation Planning using Plato, by Hannes Kulovits and Andreas RauberPreservation Planning using Plato, by Hannes Kulovits and Andreas Rauber
Preservation Planning using Plato, by Hannes Kulovits and Andreas Rauber
 
Query processing System
Query processing SystemQuery processing System
Query processing System
 
2021 icse reducedsylabiix-computer applications
2021 icse reducedsylabiix-computer applications2021 icse reducedsylabiix-computer applications
2021 icse reducedsylabiix-computer applications
 
Data Structures - Lecture 1 - Unit 1.pptx
Data Structures  - Lecture 1 - Unit 1.pptxData Structures  - Lecture 1 - Unit 1.pptx
Data Structures - Lecture 1 - Unit 1.pptx
 
Scalable Similarity-Based Neighborhood Methods with MapReduce
Scalable Similarity-Based Neighborhood Methods with MapReduceScalable Similarity-Based Neighborhood Methods with MapReduce
Scalable Similarity-Based Neighborhood Methods with MapReduce
 
Algorithms Analysis.pdf
Algorithms Analysis.pdfAlgorithms Analysis.pdf
Algorithms Analysis.pdf
 

Recently uploaded

Fun for mover student's book- English book for teaching.pdf
Fun for mover student's book- English book for teaching.pdfFun for mover student's book- English book for teaching.pdf
Fun for mover student's book- English book for teaching.pdfhoangquan21999
 
Electricity and Circuits for Grade 9 students
Electricity and Circuits for Grade 9 studentsElectricity and Circuits for Grade 9 students
Electricity and Circuits for Grade 9 studentslevieagacer
 
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...kevin8smith
 
WASP-69b’s Escaping Envelope Is Confined to a Tail Extending at Least 7 Rp
WASP-69b’s Escaping Envelope Is Confined to a Tail Extending at Least 7 RpWASP-69b’s Escaping Envelope Is Confined to a Tail Extending at Least 7 Rp
WASP-69b’s Escaping Envelope Is Confined to a Tail Extending at Least 7 RpSérgio Sacani
 
TEST BANK for Organic Chemistry 6th Edition.pdf
TEST BANK for Organic Chemistry 6th Edition.pdfTEST BANK for Organic Chemistry 6th Edition.pdf
TEST BANK for Organic Chemistry 6th Edition.pdfmarcuskenyatta275
 
Information science research with large language models: between science and ...
Information science research with large language models: between science and ...Information science research with large language models: between science and ...
Information science research with large language models: between science and ...Fabiano Dalpiaz
 
Abortion uae unmarried price +27791653574 Contact Us Dubai Abu Dhabi Sharjah ...
Abortion uae unmarried price +27791653574 Contact Us Dubai Abu Dhabi Sharjah ...Abortion uae unmarried price +27791653574 Contact Us Dubai Abu Dhabi Sharjah ...
Abortion uae unmarried price +27791653574 Contact Us Dubai Abu Dhabi Sharjah ...mikehavy0
 
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...Ansari Aashif Raza Mohd Imtiyaz
 
GBSN - Biochemistry (Unit 3) Metabolism
GBSN - Biochemistry (Unit 3) MetabolismGBSN - Biochemistry (Unit 3) Metabolism
GBSN - Biochemistry (Unit 3) MetabolismAreesha Ahmad
 
GBSN - Biochemistry (Unit 8) Enzymology
GBSN - Biochemistry (Unit 8) EnzymologyGBSN - Biochemistry (Unit 8) Enzymology
GBSN - Biochemistry (Unit 8) EnzymologyAreesha Ahmad
 
Biochemistry and Biomolecules - Science - 9th Grade by Slidesgo.pptx
Biochemistry and Biomolecules - Science - 9th Grade by Slidesgo.pptxBiochemistry and Biomolecules - Science - 9th Grade by Slidesgo.pptx
Biochemistry and Biomolecules - Science - 9th Grade by Slidesgo.pptxjayabahari688
 
FORENSIC CHEMISTRY ARSON INVESTIGATION.pdf
FORENSIC CHEMISTRY ARSON INVESTIGATION.pdfFORENSIC CHEMISTRY ARSON INVESTIGATION.pdf
FORENSIC CHEMISTRY ARSON INVESTIGATION.pdfSuchita Rawat
 
In-pond Race way systems for Aquaculture (IPRS).pptx
In-pond Race way systems for Aquaculture (IPRS).pptxIn-pond Race way systems for Aquaculture (IPRS).pptx
In-pond Race way systems for Aquaculture (IPRS).pptxMAGOTI ERNEST
 
Lubrication System in forced feed system
Lubrication System in forced feed systemLubrication System in forced feed system
Lubrication System in forced feed systemADB online India
 
GBSN - Microbiology (Unit 7) Microbiology in Everyday Life
GBSN - Microbiology (Unit 7) Microbiology in Everyday LifeGBSN - Microbiology (Unit 7) Microbiology in Everyday Life
GBSN - Microbiology (Unit 7) Microbiology in Everyday LifeAreesha Ahmad
 
EU START PROJECT. START-Newsletter_Issue_4.pdf
EU START PROJECT. START-Newsletter_Issue_4.pdfEU START PROJECT. START-Newsletter_Issue_4.pdf
EU START PROJECT. START-Newsletter_Issue_4.pdfStart Project
 
X-rays from a Central “Exhaust Vent” of the Galactic Center Chimney
X-rays from a Central “Exhaust Vent” of the Galactic Center ChimneyX-rays from a Central “Exhaust Vent” of the Galactic Center Chimney
X-rays from a Central “Exhaust Vent” of the Galactic Center ChimneySérgio Sacani
 
Quantifying Artificial Intelligence and What Comes Next!
Quantifying Artificial Intelligence and What Comes Next!Quantifying Artificial Intelligence and What Comes Next!
Quantifying Artificial Intelligence and What Comes Next!University of Hertfordshire
 

Recently uploaded (20)

Fun for mover student's book- English book for teaching.pdf
Fun for mover student's book- English book for teaching.pdfFun for mover student's book- English book for teaching.pdf
Fun for mover student's book- English book for teaching.pdf
 
Electricity and Circuits for Grade 9 students
Electricity and Circuits for Grade 9 studentsElectricity and Circuits for Grade 9 students
Electricity and Circuits for Grade 9 students
 
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
 
WASP-69b’s Escaping Envelope Is Confined to a Tail Extending at Least 7 Rp
WASP-69b’s Escaping Envelope Is Confined to a Tail Extending at Least 7 RpWASP-69b’s Escaping Envelope Is Confined to a Tail Extending at Least 7 Rp
WASP-69b’s Escaping Envelope Is Confined to a Tail Extending at Least 7 Rp
 
TEST BANK for Organic Chemistry 6th Edition.pdf
TEST BANK for Organic Chemistry 6th Edition.pdfTEST BANK for Organic Chemistry 6th Edition.pdf
TEST BANK for Organic Chemistry 6th Edition.pdf
 
HIV AND INFULENZA VIRUS PPT HIV PPT INFULENZA VIRUS PPT
HIV AND INFULENZA VIRUS PPT HIV PPT  INFULENZA VIRUS PPTHIV AND INFULENZA VIRUS PPT HIV PPT  INFULENZA VIRUS PPT
HIV AND INFULENZA VIRUS PPT HIV PPT INFULENZA VIRUS PPT
 
Chemistry Data Delivery from the US-EPA Center for Computational Toxicology a...
Chemistry Data Delivery from the US-EPA Center for Computational Toxicology a...Chemistry Data Delivery from the US-EPA Center for Computational Toxicology a...
Chemistry Data Delivery from the US-EPA Center for Computational Toxicology a...
 
Information science research with large language models: between science and ...
Information science research with large language models: between science and ...Information science research with large language models: between science and ...
Information science research with large language models: between science and ...
 
Abortion uae unmarried price +27791653574 Contact Us Dubai Abu Dhabi Sharjah ...
Abortion uae unmarried price +27791653574 Contact Us Dubai Abu Dhabi Sharjah ...Abortion uae unmarried price +27791653574 Contact Us Dubai Abu Dhabi Sharjah ...
Abortion uae unmarried price +27791653574 Contact Us Dubai Abu Dhabi Sharjah ...
 
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
 
GBSN - Biochemistry (Unit 3) Metabolism
GBSN - Biochemistry (Unit 3) MetabolismGBSN - Biochemistry (Unit 3) Metabolism
GBSN - Biochemistry (Unit 3) Metabolism
 
GBSN - Biochemistry (Unit 8) Enzymology
GBSN - Biochemistry (Unit 8) EnzymologyGBSN - Biochemistry (Unit 8) Enzymology
GBSN - Biochemistry (Unit 8) Enzymology
 
Biochemistry and Biomolecules - Science - 9th Grade by Slidesgo.pptx
Biochemistry and Biomolecules - Science - 9th Grade by Slidesgo.pptxBiochemistry and Biomolecules - Science - 9th Grade by Slidesgo.pptx
Biochemistry and Biomolecules - Science - 9th Grade by Slidesgo.pptx
 
FORENSIC CHEMISTRY ARSON INVESTIGATION.pdf
FORENSIC CHEMISTRY ARSON INVESTIGATION.pdfFORENSIC CHEMISTRY ARSON INVESTIGATION.pdf
FORENSIC CHEMISTRY ARSON INVESTIGATION.pdf
 
In-pond Race way systems for Aquaculture (IPRS).pptx
In-pond Race way systems for Aquaculture (IPRS).pptxIn-pond Race way systems for Aquaculture (IPRS).pptx
In-pond Race way systems for Aquaculture (IPRS).pptx
 
Lubrication System in forced feed system
Lubrication System in forced feed systemLubrication System in forced feed system
Lubrication System in forced feed system
 
GBSN - Microbiology (Unit 7) Microbiology in Everyday Life
GBSN - Microbiology (Unit 7) Microbiology in Everyday LifeGBSN - Microbiology (Unit 7) Microbiology in Everyday Life
GBSN - Microbiology (Unit 7) Microbiology in Everyday Life
 
EU START PROJECT. START-Newsletter_Issue_4.pdf
EU START PROJECT. START-Newsletter_Issue_4.pdfEU START PROJECT. START-Newsletter_Issue_4.pdf
EU START PROJECT. START-Newsletter_Issue_4.pdf
 
X-rays from a Central “Exhaust Vent” of the Galactic Center Chimney
X-rays from a Central “Exhaust Vent” of the Galactic Center ChimneyX-rays from a Central “Exhaust Vent” of the Galactic Center Chimney
X-rays from a Central “Exhaust Vent” of the Galactic Center Chimney
 
Quantifying Artificial Intelligence and What Comes Next!
Quantifying Artificial Intelligence and What Comes Next!Quantifying Artificial Intelligence and What Comes Next!
Quantifying Artificial Intelligence and What Comes Next!
 

An Answer Set Programming based framework for High-Utility Pattern Mining extended with Facets and Advanced Utility Functions

  • 1. The 5th International Joint conference on Rules and Reasoning Sept. 8–15, 2021 (virtually in) Leuven, Belgium An Answer Set Programming based framework for High-Utility Pattern Mining extended with Facets and Advanced Utility Functions Francesco Cauteruccio and Giorgio Terracina DEMACS, University of Calabria, Italy {cauteruccio, terracina}@mat.unical.it
  • 2. Outline • Context and Motivation • Proposed Framework • ASP Approach • Experimental Evaluation • Conclusion 1/17
  • 3. Context and Motivation • Pattern Mining is one of the most studied data mining branches • Find interesting patterns (set of items) in a database of transactions • Frequent pattern mining, sequential pattern mining, etc… • High-Utility Pattern Mining (HUPM) • Find patterns having a high-utility (w.r.t. some utility measure) • Example: in a sales database, the utility of a pattern may be represented by the profit of some items bought together. • Basic assumption: each item is associated with one, static utility . • However… • The utility of an item can be defined from very different point of views, • Transactions are not only flat lists of items but they can provide different level of abstractions. Pattern Mining and High-Utility Pattern Mining 2/17
  • 4. Context and Motivation • We present a framework for HUPM extending basic notions and introducing: • A higher level of abstraction with transaction set representation • For each transaction, an Object, a Container and a Database level of aggregation can be defined. • The notion of facet • A facet is an attribute which can be associated with an item, a transaction, an object or a container; • Each element may be characterized by more than one facet. • A taxonomy of extended utility functions • Based on the database structure and facets, • They can be combined in several ways to fit different notions of utility. An extended framework for HUPM 3/17
  • 5. Proposed Framework • The database representation is multi-layer • 𝐷𝑎𝑡𝑎𝑏𝑎𝑠𝑒 → 𝐶𝑜𝑛𝑡𝑎𝑖𝑛𝑒𝑟 → 𝑂𝑏𝑗𝑒𝑐𝑡 → 𝑇𝑟𝑎𝑛𝑠𝑎𝑐𝑡𝑖𝑜𝑛 • Given a database 𝐷 and a set of transactions {𝑇!, … , 𝑇"} • 𝐷 is organized as a set of containers 𝐶 = {𝐶!, … , 𝐶"} • 𝐶# can be associated with a set of objects 𝑂 = {𝑂!, … , 𝑂$} • 𝑂% contains a set of transactions {𝑇!, … , 𝑇&} Extending the database structure Running example depicting a sales database 5/17
  • 6. Proposed Framework • The utility of 𝑖 may be defined from different perspectives: facets. • Each item 𝑖 can be associated with one or more facets. • Facets can also be defined for transactions, objects and containers. • To describe facets, we use the notion of utility vector. Facets 6/17
  • 7. Proposed Framework • Item utility vector for an item 𝑖 • 𝐼𝑈! = [𝑖𝑢", 𝑖𝑢#, … , 𝑖𝑢$], and 𝑖𝑢% describes a certain facet of 𝑖 • Transaction utility vector for a transaction 𝑇# • 𝑇𝑈&! = [𝑡𝑢", 𝑡𝑢#, … , 𝑡𝑢'], and 𝑡𝑢% describes a certain facet of 𝑇( • Internal utility of 𝑖 in 𝑇( is available and denoted as 𝑞(𝑖, 𝑇() • Object utility vector • 𝑂𝑈) = 𝑜𝑢", 𝑜𝑢#, … , 𝑜𝑢* , and 𝑜𝑢% describes a certain facet of 𝑂 • Container utility vector • 𝐶𝑈+ = [𝑐𝑢", 𝑐𝑢#, … , 𝑐𝑢,], and 𝑐𝑢% describes a certain facet of 𝐶 Facets and utility vectors facets: price, weight 7/17
  • 8. Proposed Framework • Intra-pattern utility function • Let 𝑃 be a pattern with 𝑟 items, and 𝑇( be a transaction containing it, • Let 𝐼𝑈𝑆 = {𝐼𝑈", … , 𝐼𝑈-} be the set of item utility vectors of 𝑃 • The intra-pattern utility function 𝐼𝑈&! = 𝑖𝑝(𝑃, 𝑇(, 𝐼𝑈𝑆) generates an unique item utility vector for the pattern occurrence. • 𝑖𝑝 ⋅ can be any function combining the utilities across the facets, such as SUM, MAX, MIN, AVG, etc… • Pattern utility function • The occurrence utility vector for a pattern 𝑃 is 𝑂𝑐𝑐𝑈&! = [𝐼𝑈&! , 𝑇𝑈&! , 𝑂𝑈&! , 𝐶𝑈&! ] • The pattern utility vector 𝑈. is the collection of all the occurrence utility vectors of 𝑃: 𝑈. = ⋃&!∈&" 𝑂𝑐𝑐𝑈&! • 𝑈. is a matrix (occurrence utility vectors × facets). Advanced utility functions In this example, 𝑖𝑝 = ∑ 𝑓𝑎𝑐𝑒𝑡 × 𝑖𝑛𝑡𝑒𝑟𝑛𝑎𝑙𝑈𝑡𝑖𝑙𝑖𝑡𝑦 8/17
  • 9. Proposed Framework • The utility 𝑢 of a pattern 𝑃 can be obtained as an arbitrary combination of the values of 𝑈! using a function 𝑢(𝑃) • 𝑢(𝑃) can be classified as • Horizontal first 𝑢 𝑃 = 𝑓?(𝑓@(𝑢.)) combines by row (facets), then by column (occurrences) • Vertical first 𝑢 𝑃 = 𝑓@(𝑓?(𝑢.)) combines by column (occurrences), then by row (facets) • Mixed 𝑢 𝑃 = 𝑓 𝑢. combines the values at once • All of these classifications can be further classified in • Inter-transaction utility • Pattern-vs-object utility • Pattern-vs-container utility • These can be exploited to define utility measures relating item/transaction facets and one of the object/container facets, such as Pearson and Multiple correlation. Advanced utility functions As an example, suppose we select 𝑓# = 𝑓𝑖𝑙𝑡𝑒𝑟 ⋅ and 𝑓$ = max ⋅ , that is we first filter for a single facet, and then we take the maximum across all the occurrences. 9/17
  • 10. Proposed Framework • Given a pattern 𝑃, we say that 𝑃 is an extended high-utility pattern if its utility 𝑢(𝑃) is greater than a minimum threshold 𝑡ℎ$ and it occurs in at least 𝑡ℎ% transactions. • The problem of extended high-utility pattern mining (e-HUPM) is to discover all the extended high-utility patterns in a given database 𝐷. The e–HUPM problem 10/17
  • 11. ASP Approach • We want to provide as much flexibility as possible in the definition of what is a useful pattern. • Encoding the problem in Answer Set Programming (ASP) helps achieving the desidered flexibility and modularity. • Classic guess-and-check scenario • We generate one answer set for each valid pattern, • Complex utility functions are executed by means of external functions (e.g., in DLVHEX, WASP, clingo) • Pattern validity criteria and filter can be easily applied by encoding them as rules. The why and the how 11/17
  • 12. ASP Approach The encoding %%% Input schema: %container(ContainerId) %object(ObjectId,ContainerId) %transaction(Tid, ObjectId) %item(Item, Tid, Position, Q) %itemUtilityVector(Item, I1, ..., Il) %transactionUtilityVector(Tid, T1, ..., Tm) %objectUtilityVector(ObjectId, O1, ..., On) %containerUtilityVector(ContainerId, C1, ..., Co) %%% Parameters occurrencesThreshold(...). utilityTreshold(...). %%% Item pre-filtering usefulItem(I):- item(I,_,_,_),....any condition on the items. %%% Candidate pattern generation {inCandidatePattern(I)}:- usefulItem(I). %%% Occurrences computation and check inTransaction(Tid):- transaction(Tid,_), not incomplete(Tid). incomplete(TiD):- transaction(Tid,_), inCandidatePattern(I), not contains(I,Tid). contains(I,Tid):- item(I,Tid,_,_). :- #count{ Tid : inTransaction(Tid)}=N, N < Tho, occurrencesThreshold(Tho). %%% Utility computation patternItemUtilityVectors(Tid,Item,I1,...,Il,Q):- inCandidatePattern(Item), itemUtilityVector(Item, I1, ..., Il), inTransaction(Tid), item(Item, Tid, Position, Q). intraPatternUtilityVector(Tid,I1,...,Il):- &computeIntraPatternUtility[patternItemUtilityVectors](Tid,I1,...,Il). occurrenceUtilityVector(Tid,I1,...,Il,T1,...Tm,O1,...On,C1,...,Co):- inTransaction(Tid), intraPatternUtilityVector(Tid,I1,...,Il) transactionUtilityVector(Tid, T1, ..., Tm), transaction(Tid, ObjectId), objectUtilityVector(ObjectId, O1, ..., On), object(ObjectId , ContainerId) containerUtilityVector(ContainerId, C1, ..., Co). :- &computeUtility[occurrenceUtilityVector](U), U < Thu, utilityTreshold(Thu). 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 12/17
  • 13. Experimental Evaluation • Aspect-based sentiment analysis of scientific reviews [1] • Each review is annotated with 8 different aspects: appropriataness, clarity, originality, empirical/theoretical soudness, meaningful comparison, substance, impact and recommendation • Each aspect has assigned one sentiment value from {𝑝𝑜𝑠𝑖𝑡𝑖𝑣𝑒, 𝑛𝑒𝑔𝑎𝑡𝑖𝑣𝑒, 𝑛𝑒𝑢𝑡𝑟𝑎𝑙, 𝑎𝑏𝑠𝑒𝑛𝑡}. • 814 papers and a total of 1148 reviews. • 2230 annotated lines with a total of 15124 distinct words. Dataset [1] Chakraborty, S., Goyal, P., Mukherjee, A.: Aspect-based sentiment analysis of scientific reviews. In: JCDL ’20: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020, Virtual Event, China, August 1-5, 2020. pp. 207–216 (2020), ACM 13/17
  • 14. Experimental Evaluation • Average running time • 𝑢(𝑃): disagreement between originality and decision (% of pattern occurrences showing a positive originality and a reject decision, thresholds: [15, 50, 75, 100]) Quantitative analysis • Comparison with HUPM systems • No analysis involving more than one elements can be devised with such systems • Here, the utility of an item is the absolute value of the appropriateness aspect in the sentence it first appears in. 14/17
  • 15. Experimental Evaluation • 𝑢 𝑃 : Pearson correlation between an aspect 𝑋 and the final decision on the corresponding paper. • «The set of review sentences containing 𝑃 are characterized by a direct correlation between the sentiment on 𝑋 and the final decision on the corresponding paper» • 𝑢(𝑃): Multiple correlation between two aspects 𝑋 and 𝑌 and the final decision on the corresponding paper. Qualitative analysis 15/17
  • 16. Conclusion • We introduced a general framework for HUPM with several extensions. • The framework allows to work with multi-dimensional data and with different utility measures. • A versatile and modular ASP encoding has been developed. • We employed a real use case on paper review to carry out both quantitative and qualitative analyses. • Facets and advanced utility functions help reducing the amount of relevant patterns • Useful in providing deep insights on the data. • Not an ending point! • Apply the framework to new contexts, • Derive ad-hoc algorithms for the extended utility functions. 16/17
  • 17. Thanks for your attention! These slides available at https://bit.ly/ehupm-ruleml21 Francesco Cauteruccio Research Fellow @ DEMACS, University of Calabria cauteruccio@mat.unical.it francescocauteruccio.info @finalfire