Semantic Meta-Mining of Knowledge Discovery
Processes
Agnieszka Lawrynowicz
collaboration with Jedrzej Potoniec, Maria C. Keet, Melanie Hilario,
Claudia d’Amato, Raul Palma and others - see acknowledgements
Poznan University of Technology
June 11, 2015
ADAA Seminar
Silesian University of Technology
Agnieszka Lawrynowicz collaboration with Jedrzej Potoniec, Maria C. Keet, Melanie Hilario, Claudia d’Amato, Raul Palma and others -Semantic Meta-Mining of Knowledge Discovery Processes
June 11, 2015 ADAA Semina
/ 50
Outline
Semantic data mining
Pattern discovery with Fr-ONT-Qu
Meta-mining of KD processes
▸ e-LICO Intelligent Discovery Assistant
▸ Data Mining OPtimization Ontology
▸ Semantic meta-mining
Summary and future work
Introduction: data mining
Input: a data table, text documents, ...
Output: a model, a pattern set
[Figure: data → DATA MINING → model, patterns]
Introduction: using background knowledge in data mining
Using background knowledge in data mining has been extensively
researched
hierarchy/taxonomy of attributes (Michalski et al., 1986, Srikant,
Agrawal, 1995)
Inductive Logic Programming (Muggleton, 1991, Lavrac and
Dzeroski, 1994)
relational learning (Quinlan, 1993, de Raedt, 2008)
semantic data mining tutorial @ ECML/PKDD’2011 (Lavrac,
Vavpetic, Lawrynowicz, Potoniec, Hilario, Kalousis)
Introduction: relational data mining
Input: a relational database, a graph, a set of logical facts, ...
Output: a model, a pattern set
[Figure: RELATIONAL DATA MINING → model, patterns]
Semantic data mining
Input:
a data table, text documents, Web pages, a relational database, a
graph, a set of logical facts, ...
one or more ontologies
Output: a model, a pattern set
[Figure: data and ontologies (annotations, mappings, vocabulary re-use) → SEMANTIC DATA MINING → model, patterns]
Fr-ONT-Qu
algorithm for mining patterns in RDF(S) data
patterns expressed as SPARQL queries
consists of: a refinement operator and a strategy to select best
patterns for further refinement
Overview
Input of the algorithm:
a declarative bias (B) that limits the search space (i.e. the classes
and properties to use) and a maximal number of iterations
two thresholds: one for keeping good-enough patterns and one for
selecting the best patterns to refine
a quality measure, chosen from several, to which the thresholds apply
(e.g. support on the KB)
beam search size
Example
B: classes: PassengerTrain, CargoTrain, property: hasEngine
1 Refine every pattern from the previous iteration by adding a single
restriction for a variable already existing in the pattern. E.g. for
the pattern {?x a :Train .}, the refinements are:
▸ {?x a :Train . ?x a :CargoTrain .}
▸ {?x a :Train . ?x a :PassengerTrain .}
▸ {?x a :Train . ?x :hasEngine ?y .}
2 Evaluate the patterns (using a quality measure such as support on a
data set) and select only the best ones
3 Repeat steps 1-2 as long as there are patterns to refine and the
maximal number of iterations has not been exceeded
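The loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Fr-ONT-Qu implementation: the toy triples, the names `match`, `support`, `refine` and `mine`, and the use of plain support as the single quality measure are all hypothetical, and no duplicate or redundancy pruning is done.

```python
# Toy RDF-like data as (subject, predicate, object) triples.
# All identifiers are illustrative, not taken from a real knowledge base.
TRIPLES = {
    ("t1", "a", "Train"), ("t1", "a", "CargoTrain"), ("t1", "hasEngine", "e1"),
    ("t2", "a", "Train"), ("t2", "a", "PassengerTrain"), ("t2", "hasEngine", "e2"),
    ("t3", "a", "Train"), ("t3", "a", "CargoTrain"),
}

# Declarative bias B: the classes and properties refinements may use.
BIAS_CLASSES = ["PassengerTrain", "CargoTrain"]
BIAS_PROPERTIES = ["hasEngine"]


def is_var(term):
    return term.startswith("?")


def match(pattern, triples, binding=None):
    """Yield every variable binding under which all triple patterns hold."""
    binding = binding or {}
    if not pattern:
        yield binding
        return
    (s, p, o), rest = pattern[0], pattern[1:]
    for ts, tp, to in triples:
        if tp != p:
            continue
        new = dict(binding)
        if all(
            new.setdefault(term, value) == value if is_var(term) else term == value
            for term, value in ((s, ts), (o, to))
        ):
            yield from match(rest, triples, new)


def support(pattern, triples):
    """Number of distinct values of ?x that satisfy the pattern."""
    return len({b["?x"] for b in match(pattern, triples)})


def variables(pattern):
    return sorted({t for s, _, o in pattern for t in (s, o) if is_var(t)})


def refine(pattern):
    """Step 1: add one restriction on a variable already in the pattern."""
    fresh = f"?v{len(variables(pattern))}"
    for var in variables(pattern):
        for cls in BIAS_CLASSES:
            if (var, "a", cls) not in pattern:
                yield pattern + ((var, "a", cls),)
        for prop in BIAS_PROPERTIES:
            yield pattern + ((var, prop, fresh),)


def mine(triples, max_iterations=2, min_support=1, beam=2):
    """Steps 2-3: score refinements, keep good ones, refine only the best."""
    frontier = [(("?x", "a", "Train"),)]
    kept = []
    for _ in range(max_iterations):
        scored = [(support(p, triples), p) for q in frontier for p in refine(q)]
        good = sorted((sp for sp in scored if sp[0] >= min_support), reverse=True)
        if not good:
            break
        kept.extend(good)                       # threshold 1: good enough to keep
        frontier = [p for _, p in good[:beam]]  # threshold 2: best ones to refine
    return kept
```

Calling `mine(TRIPLES, max_iterations=1)` returns the three refinements of `{?x a :Train .}` from the example, each paired with its support on the toy data.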
Trie data structure
Pattern based classification 1/2
Pattern based classification 2/2
We learn features that are optimized with regard to the (classification) task
Propositionalisation 1/2
Patterns
1) ?x a :Train . ?x :hasCar ?y
2) ?x a :Train . ?x :hasCar ?y . ?y :hasShape :rectangle
3) ?x a :Train . ?x :hasCar ?y . ?y :wheels :three
4) …
Dataset (Michalski’s train problem, 1977)
Propositionalisation 2/2
In this way, the learned features can be consumed by any off-the-shelf
’attribute-value’ classification algorithm
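As a rough illustration of propositionalisation, the sketch below turns each pattern into one Boolean column of an attribute-value table. The toy trains, the pattern names and the function `propositionalise` are assumptions made for this example; in the talk the patterns are SPARQL queries evaluated against an RDF store, which is replaced here by plain Python predicates.

```python
# Toy description of three trains as lists of cars; all values are illustrative.
TRAINS = {
    "t1": [{"shape": "rectangle", "wheels": 2}],
    "t2": [{"shape": "ellipse", "wheels": 3}, {"shape": "rectangle", "wheels": 2}],
    "t3": [],
}

# Each pattern is a named Boolean test on one train (standing in for a query).
PATTERNS = [
    ("has_car", lambda cars: len(cars) > 0),
    ("has_rectangular_car", lambda cars: any(c["shape"] == "rectangle" for c in cars)),
    ("has_three_wheel_car", lambda cars: any(c["wheels"] == 3 for c in cars)),
]


def propositionalise(examples, patterns):
    """Return one 0/1 feature vector per example, one column per pattern."""
    return {
        name: [int(test(cars)) for _, test in patterns]
        for name, cars in examples.items()
    }


table = propositionalise(TRAINS, PATTERNS)
```

Each row of `table` is an ordinary feature vector, so it can be passed to any attribute-value learner.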
What is RapidMiner? 1/2
What is RapidMiner? 2/2
RMonto - plugin to RapidMiner
Comparative experiments on classification of semantic data
1/2
we considered published work with available results and datasets
(including ESWC 2008 best paper, ESWC 2012 best paper)
various types of methods: kernel methods, statistical relational
classifier, concept learning algorithms
we strictly followed the tasks, protocols and experimental setups of
the methods
Comparative experiments on classification of semantic data
2/2
On the classification task, Fr-ONT-Qu outperformed state-of-the-art
approaches to classification of Semantic Web data
(see: “Pattern based feature construction in semantic data mining” by A.
Lawrynowicz, J. Potoniec, IJSWIS 10(1), 2014):
kernel methods by Bloehdorn et al. (2007) and Loesch et al. (ESWC 2012
best paper) on the SWRC AIFB dataset,
the statistical relational classifier SPARQL-ML by Kiefer et al. (ESWC
2008 best paper) on the SWRC AIFB dataset and the OWLS-TC v2.1
dataset,
the concept learning algorithms DL-FOIL by Fanizzi et al. (2008) and
the DL-Learner CELOE variant by Lehmann (2009), on all
measures, on the datasets BioPax, NTN and Financial
Overview of meta-learning
Meta-learning: learning to learn
application of machine learning techniques to meta-data about past
machine learning experiments;
the goal: to modify some aspect of the learning process to improve
the performance of the resulting model;
meta-mining: meta-learning applied to full data mining process
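A minimal sketch of the meta-learning idea, assuming a nearest-neighbour recommender over two dataset characteristics. The past experiments, the algorithm names and the functions `distance` and `recommend` are invented for illustration and do not come from the talk or from e-LICO.

```python
import math

# Meta-data about past experiments: dataset characteristics and the
# algorithm that performed best. All entries are made up for illustration.
PAST_EXPERIMENTS = [
    ({"n_instances": 150, "n_features": 4}, "decision_tree"),
    ({"n_instances": 60, "n_features": 5000}, "linear_svm"),
    ({"n_instances": 100000, "n_features": 20}, "naive_bayes"),
]


def distance(a, b):
    """Euclidean distance on a log scale, so large counts do not dominate."""
    return math.sqrt(sum((math.log10(a[k]) - math.log10(b[k])) ** 2 for k in a))


def recommend(meta_features, experiments=PAST_EXPERIMENTS):
    """Suggest the algorithm that won on the most similar past dataset."""
    _, best = min(experiments, key=lambda e: distance(meta_features, e[0]))
    return best
```

For a new dataset with 80 instances and 3000 features, `recommend` picks the algorithm recorded for the most similar past dataset, here the wide, small one. Meta-mining extends the same idea from single algorithms to whole workflows.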
Overview of the e-LICO system
[...] platform. This section presents the different components of the e-LICO architecture (Figure 1) and
shows how they interact to achieve the user's knowledge discovery goal.
The e-LICO infrastructure (depicted in the figure under the dashed line) is the means by which the
data-mining platform is delivered to scientists. The innovative core of the e-LICO platform is the
Intelligent Discovery Assistant (IDA, above the dashed line) with its planner and meta-learner.
However, to deliver the data-mining platform to its scientist users, there are several other services
and components. Figure 1 shows an overview of e-LICO's components and how they interact with
each other.
Figure 1. Overview of the e-LICO system.
There are two user-facing components for the e-LICO platform; these allow scientists to access data-
mining operators and/or other data processing services, to compose them into workflows and
execute them, collecting the results for interpretation or further analysis. These two central
infrastructure components are:
IDA architecture
[Figure: INTELLIGENT DISCOVERY ASSISTANT. Inputs: goal, data. Components: AI Planner, Probabilistic Ranker, Semantic Meta-Miner, DMEX-DB, DM Workflow Ontology (DMWF), DM Optimization Ontology (DMOP). Flows: planned workflows, ranked workflows, meta-mined model, training meta-data, top ranked workflows]
Ontology in computer science
“engineering artefact [...]” (Guarino 98)
“An ontology is a
formal specification → machine interpretation
of a shared → group of people, consensus
conceptualization → abstract model of phenomena, concepts
of a domain of interest” → domain knowledge
(Gruber 93)
Ontology = formal specification of a terminology (from a particular
domain)
Data Mining OPtimization Ontology (DMOP)
the primary goal of DMOP is to support all decision-making steps
that determine the outcome of the data mining process;
development started in EU FP7 project e-LICO (2009-2012);
DMOP v5.5: 723 classes, 111 properties, 4291 axioms;
highly axiomatized;
represented in Web Ontology Language (OWL 2);
Competency questions
“Given a data mining task/data set, which of the valid or applicable
workflows/algorithms will yield optimal results (or at least better results
than the others)?”
“Given a set of candidate workflows/algorithms for a given task/data
set, which data set/workflow/algorithm characteristics should be
taken into account in order to select the most appropriate one?”
and other, more fine-grained ones, e.g.:
“Which induction algorithms should I use (or avoid) when my dataset
has many more variables than instances?”
Architecture of DMOP knowledge base and its satellite
triple stores
[Figure: DMOP knowledge base and satellite triple stores. TBox: DMOP; ABox: Operator DB; DMEX-DB1, DMEX-DB2, …, DMEX-DBk. Labels: OWL 2; RDF triple store; formal conceptual framework of the data mining domain; accepted knowledge of DM tasks, algorithms, operators; specific DM applications (datasets, workflows, results); meta-miner’s training data; meta-miner’s prior DM knowledge]
The core concepts of DMOP
Fig. 1. The core concepts of DMOP.
DMOP: algorithm representation
Alignment of DMOP with DOLCE 1/3
Two main reasons to align DMOP with a foundational ontology:
considerations about attributes and data properties; extant
non-foundational ontology solutions were partial re-inventions of how
they are treated in a foundational ontology;
reuse of the ontology’s object properties;
Alignment of DMOP with DOLCE 2/3
Alignment of DMOP with DOLCE 3/3
Perdurant: DM-Experiment and DM-Operation are subclasses of
dolce:process;
Endurant: most DM classes, such as algorithm, software, strategy,
task, and optimization problem, are subclasses of
dolce:non-physical-endurant;
Quality: characteristics and parameters of DM entities made
subclasses of dolce:abstract-quality;
Abstract: for identifying discrete values, classes added as subclasses
of dolce:abstract-region;
object properties: DMOP reuses mainly DOLCE’s parthood, quality,
and quale relations;
each of the four main DOLCE branches has been used.
Qualities and attributes 1/4
How to handle ’attributes’ in OWL ontologies, and, in a broader context,
measurements?
easy way: attribute is a binary functional relation between a class and
a datatype
Elephant ⊑ =1 hasWeight.integer
Elephant ⊑ =1 hasWeightPrecise.real
Elephant ⊑ =1 hasWeightImperial.integer (in lbs)
building into one’s ontology application decisions about how to store
the data (and in which unit it is)
Qualities and attributes 2/4
How to handle ’attributes’ in OWL ontologies, and, in a broader context,
measurements?
more elaborate way: unfold the notion of an object’s property (e.g.
weight) from one attribute/OWL data property into at least two
properties: one OWL object property from the object to the ’reified
attribute’ (“quality property” represented as an OWL class) and
another property to the value(s)
▸ favoured in foundational ontologies;
▸ solves the problem of non-reusability of the ’attribute’ and prevents
duplication of data properties;
▸ neither ontology has any solution to represent actual values and units
of measurements;
in DMOP, measurements are more like values of parameters;
Qualities and attributes 3/4
[Figure: qualities and attributes in DMOP aligned with DOLCE. Classes: dolce:particular, dolce:non-physical-endurant, dolce:abstract, dolce:quality, dolce:abstract-quality, dolce:region, dolce:abstract-region, dolce:quale, DM-Data, DataTable, DataType, DataFormat, TableFormat, Characteristic, DataCharacteristic, Parameter, anyType. Properties: dolce:has-quality, dolce:has-quale, dolce:q-location, has-quality, hasDataType, hasDataValue, hasTableFormat]
Qualities and attributes 4/4
ModelingAlgorithm ⊑ =1 has-quality.LearningPolicy
LearningPolicy is a dolce:quality
LearningPolicy ⊑ =1 has-quale.Eager-Lazy
Eager-Lazy is a subclass of dolce:abstract-region
Eager-Lazy ⊑ ≤ 1 hasDataValue.anyType
In this way, the ontology can be linked to many different applications, which
may even use different data types, yet still agree on the meaning of the
characteristics and parameters (’attributes’) of the algorithms, tasks, and
other DM endurants.
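The chain of axioms above (algorithm, quality, quale, data value) can be mimicked with a handful of triples. This is a hedged sketch, not DMOP content: the individual names (`C4.5-LearningPolicy`, `kNN-EagerLazy`, and so on), the stored strings and the helper functions are hypothetical.

```python
# Reified 'attribute': algorithm -> quality -> quale (region) -> data value.
# Individual names and values are invented for illustration.
TRIPLES = [
    ("C4.5", "has-quality", "C4.5-LearningPolicy"),
    ("C4.5-LearningPolicy", "has-quale", "C4.5-EagerLazy"),
    ("C4.5-EagerLazy", "hasDataValue", "eager"),
    ("kNN", "has-quality", "kNN-LearningPolicy"),
    ("kNN-LearningPolicy", "has-quale", "kNN-EagerLazy"),
    ("kNN-EagerLazy", "hasDataValue", "lazy"),
]


def objects(subject, predicate):
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def learning_policy(algorithm, decode=str):
    """Follow the quality chain and decode the value as the caller prefers."""
    (quality,) = objects(algorithm, "has-quality")   # exactly one quality
    (quale,) = objects(quality, "has-quale")         # exactly one quale
    (value,) = objects(quale, "hasDataValue")        # at most one value stored
    return decode(value)
```

One application can read the value as a string, another as a Boolean (`decode=lambda v: v == "lazy"`), while both share the same meaning of "learning policy".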
Meta-modeling in DMOP 1/4
only processes (executions of workflows) and operations (executions
of operators) consume inputs and produce outputs
DM algorithms (as well as operators and workflows) can only specify
the type of input or output
inputs and outputs (DM-Dataset and DM-Hypothesis class hierarchy,
respectively) are modeled as subclasses of IO-Object class
Meta-modeling in DMOP 2/4
DM algorithms: classes or individuals? Individuals.
Problem: expressing types of inputs/outputs associated with
algorithm
”C4.5 specifiesInputClass CategoricalLabeledDataSet” 
Individual Class
(instance of DM-Algorithm) (subclass of DM-Hypothesis)
Meta-modeling in DMOP 3/4
Initial solution: one artificial class per each single algorithm with a
single instance corresponding to this particular algorithm
Problem: hasInput, hasOutput, specifiesInputClass,
specifiesOutputClass—assigned a common range—IO-Object
”C4.5 specifiesInputClass Iris” ?
Individual Individual
(instance of DM-Algorithm) (instance of DM-Hypothesis)
Iris is a concrete dataset. Clearly, no DM algorithm is designed
to handle only one particular dataset.
Meta-modeling in DMOP 4/4
Final solution: weak form of punning available in OWL 2
IO-Class: meta-class—the class of all classes of input and output
objects
”C4.5 specifiesInputClass CategoricalLabeledDataSet” 
Individual Individual
(instance of DM-Algorithm) (instance of IO-Class)
”DM-Process hasInput some CategoricalLabeledDataSet” 
Class Class
(subclass of dolce:process) (subclass of IO-Object)
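The punning solution can be illustrated with a tiny triple set in which one name is used both as a class and as an individual. This is a sketch under assumptions: the triples are a simplification written for this example (in particular, typing `Iris` as a `CategoricalLabeledDataSet` is illustrative), and the helper functions are hypothetical rather than part of DMOP or of any OWL API.

```python
TRIPLES = {
    # CategoricalLabeledDataSet used as a class ...
    ("CategoricalLabeledDataSet", "rdfs:subClassOf", "IO-Object"),
    ("Iris", "rdf:type", "CategoricalLabeledDataSet"),
    # ... and, by punning, as an individual of the meta-class IO-Class.
    ("CategoricalLabeledDataSet", "rdf:type", "IO-Class"),
    ("C4.5", "rdf:type", "DM-Algorithm"),
    ("C4.5", "specifiesInputClass", "CategoricalLabeledDataSet"),
}


def used_as_class(name):
    return any(
        (p == "rdf:type" and o == name)
        or (p == "rdfs:subClassOf" and name in (s, o))
        for s, p, o in TRIPLES
    )


def used_as_individual(name):
    return any(s == name and p == "rdf:type" for s, p, o in TRIPLES)


def punned_names():
    names = {t for s, _, o in TRIPLES for t in (s, o)}
    return {n for n in names if used_as_class(n) and used_as_individual(n)}


def valid_input_spec(algorithm):
    """specifiesInputClass must point at an instance of IO-Class."""
    targets = [o for s, p, o in TRIPLES if s == algorithm and p == "specifiesInputClass"]
    return all((t, "rdf:type", "IO-Class") in TRIPLES for t in targets)
```

With the range of `specifiesInputClass` restricted to `IO-Class`, a statement such as "C4.5 specifiesInputClass Iris" would fail the check, because `Iris` is not an instance of the meta-class.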
DMOP: further details
Data Mining Optimization Ontology. C. Maria Keet, Agnieszka
Lawrynowicz, Claudia d’Amato, Alexandros Kalousis, Phong Nguyen, Raul
Palma, Robert Stevens, and Melanie Hilario, Journal of Web Semantics,
DOI: 10.1016/j.websem.2015.01.001
Recap: Propositionalisation
Patterns
1) ?x a :Train . ?x :hasCar ?y
2) ?x a :Train . ?x :hasCar ?y . ?y :hasShape :rectangle
3) ?x a :Train . ?x :hasCar ?y . ?y :wheels :three
4) …
Dataset (Michalski’s train problem, 1977)
RapidMiner XML based workflow representation
Importing RapidMiner workflows to DMOP based RDF
format
Propositionalisation
[Figure labels: Workflow patterns; Dataset; DMOP-based RDF repository of DM processes]
Results of experiments. Below we present the results of the experimental evaluation of Fr-ONT-Qu
in the meta-mining scenario. In the experiments, we used OWLIM SE (v5.3.5849) as the
underlying reasoning engine and semantic store, with the owl2-rl-reduced-optimized ruleset.
The choice of this ruleset was motivated by the expressivity of our background knowledge
base, e.g. the existence of object property chains. During each cycle of cross-validation, Fr-ONT-Qu
discovered around 2000 patterns, and redundant patterns were subsequently pruned. We discuss
some of the discovered patterns below (for compactness, Bd denotes the body of the base
pattern used in the experiments). The first example pattern:
Q1 = select distinct ?x where { Bd ∪
  ?opex2 dmop:executes ?front0 .
  ?opex2 dmop:executes rm:RM-Decision_Tree .
  ?opex2 dmop:hasParameterSetting ?front1 .
  ?front0 dmop:executes rm:DM-Operator .
  ?front0 dmop:implements ?front2 .
  ?front2 a dmop:DM-Algorithm .
  ?front2 a dmop:InductionAlgorithm .
  ?front2 a dmop:ModelingAlgorithm .
  ?front2 a dmop:ClassificationModelingAlgorithm .
  ?front2 a dmop:ClassificationTreeInductionAlgorithm . }
was mined when Fr-ONT-Qu traversed down the algorithm class hierarchy, specializing
variable ?front2. In this way, it is possible to abstract from the level of operators (algorithm
implementations) to the level of algorithms and their taxonomy. For instance, both the
rm:RM-Decision_Tree and weka:Weka-J48 operators implement a classification tree induction
algorithm, and one may generalize over it. Patterns containing class hierarchies provide
expressivity similar to that of patterns mined in so-called generalized association rule mining.
The following pattern covers only those workflows that contain the ‘Decision Tree’ operator
for which the parameter minimal size for split has a value between 2 and 5.5:
Q2 = select distinct ?x where { Bd ∪
  ?opex2 dmop:executes ?front0 .
  ?opex2 dmop:executes rm:RM-Decision_Tree .
  ?opex2 dmop:hasParameterSetting ?front1 .
  ?front0 dmop:executes rm:DM-Operator .
  ?front1 dmop:setsValueOf ?front2 .
  ?front1 dmop:hasValue ?front3 .
  filter(2.000000 <= xsd:double(?front3) && xsd:double(?front3) <= 16.000000) .
  ?front2 dmop:hasParameterKey 'minimal_size_for_split' .
  ?front1 dmop:hasValue ?front3 .
  filter(2.000000 <= xsd:double(?front3) && xsd:double(?front3) <= 9.000000) .
  ?front1 dmop:hasValue ?front3 .
  filter(2.000000 <= xsd:double(?front3) && xsd:double(?front3) <= 5.500000) . }
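The three filters in Q2 move the upper bound from 16 to 9 to 5.5, which is consistent with halving the interval at each refinement step. The sketch below reproduces that sequence; the bisection rule is an inference from these numbers rather than something stated in the talk, and the function names `narrow_upper` and `to_filter` are hypothetical. The filters are assumed to be closed-interval tests.

```python
def narrow_upper(lower, upper, steps):
    """Return the intervals obtained by repeatedly halving from above."""
    bounds = [(lower, upper)]
    for _ in range(steps):
        upper = (lower + upper) / 2   # keep the lower half of the interval
        bounds.append((lower, upper))
    return bounds


def to_filter(var, lower, upper):
    """Render one interval as a SPARQL filter on a numeric variable."""
    return (
        f"filter({lower:f} <= xsd:double({var}) "
        f"&& xsd:double({var}) <= {upper:f})"
    )


intervals = narrow_upper(2.0, 16.0, steps=2)
filters = [to_filter("?front3", lo, hi) for lo, hi in intervals]
```

Each later filter implies the earlier ones, so only the last interval actually constrains the result; the earlier filters remain in the query as a trace of the refinement path.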
[Figure labels: Dataset characteristics; …; Features]
Semantic meta-mining experimental setup
baseline DM experiment set: 1581 RapidMiner workflows solving a
predictive modeling task on 11 UCI datasets
dataset characteristics meta-data stored in DMEX-DB containing
over 85 million RDF triples
workflow patterns represented as SPARQL queries using DMOP
entities
The inside of X-Validation operator with the workflow for
training and evaluating the pattern-based model
Semantic meta-mining results
McNemar’s test for pairs of classifiers performed with the null
hypothesis that a classifier built using dataset characteristics and a
mined pattern set has the same error rate as the baseline that used
dataset characteristics and only the names of the machine learning
DM operators
Test confirmed that classifiers trained using workflow patterns
performed significantly better (accuracy 0.927) than the baseline
(accuracy 0.890)
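For reference, McNemar's test compares two classifiers on the same test items using only the items on which they disagree. The sketch below uses the common chi-square form with continuity correction; whether the reported experiment used this variant or an exact binomial version is not stated, so treat that choice, and the function name `mcnemar`, as assumptions.

```python
import math


def mcnemar(y_true, pred_a, pred_b):
    """Return (chi-square statistic, p-value) for two paired classifiers."""
    # b: A correct and B wrong; c: A wrong and B correct.
    b = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a == t and p != t)
    c = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a != t and p == t)
    if b + c == 0:
        return 0.0, 1.0   # the classifiers never disagree
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

A small p-value rejects the null hypothesis that both classifiers have the same error rate. With few disagreements the chi-square approximation is rough and an exact test is usually preferred.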
Summary and future work
RMonto RapidMiner plugin, all experimental data and (meta-mining)
workflows are publicly available:
http://www.myexperiment.org/packs/421.html,
http://semantic.cs.put.poznan.pl/fr-ont/
LeoLOD project - Learning and Evolving Ontologies from Linked
Open Data (2013-2015)
▸ project funded by Foundation for Polish Science under the POMOST
program,
▸ Fr-ONT-Qu re-adapted for ontology learning,
▸ DMOP used to model provenance metadata (in industry: traceability)
of ontology learning workflows
DMOP is being aligned to OPMW (Open Provenance Model for
Workflows)
Acknowledgements
Foundation for Polish Science under the POMOST programme,
cofinanced from European Union, Regional Development Fund (No
POMOST/2013-7/8) (2013-2015)
EU FP7 ICT-2007.4.4 (No 231519) ”e-LICO: An e-Laboratory for
Interdisciplinary Collaborative Research in Data Mining and
Data-Intensive Science” (2009-2012)
RMonto, Meta-mining experiments, LeoLOD plugin done jointly with
Jedrzej Potoniec
Contributors to the development of DMOP and/or other e-LICO
infrastructure used in the research described in this presentation:
Melanie Hilario, C. Maria Keet, Claudia d’Amato, Huyen Do, Simon
Fischer, Dragan Gamberger, Lina Al-Jadir, Simon Jupp, Alexandros
Kalousis, Joerg-Uwe Kietz, Petra Kralj Novak, Babak Mougouie,
Phong Nguyen, Raul Palma, Floarea Serban, Robert Stevens, Anze
Vavpetic, Jun Wang, Derry Wijaya, Adam Woznica
Thanks to Veli Bicer for sharing the AIFB dataset
Agnieszka Lawrynowicz collaboration with Jedrzej Potoniec, Maria C. Keet, Melanie Hilario, Claudia d’Amato, Raul Palma and others -Semantic Meta-Mining of Knowledge Discovery Processes
June 11, 2015 ADAA Semina
/ 50

More Related Content

What's hot

A Scalable Approach for Efficiently Generating Structured Dataset Topic Profiles
A Scalable Approach for Efficiently Generating Structured Dataset Topic ProfilesA Scalable Approach for Efficiently Generating Structured Dataset Topic Profiles
A Scalable Approach for Efficiently Generating Structured Dataset Topic ProfilesBesnik Fetahu
 
OU Rise library analytics viz
OU Rise library analytics vizOU Rise library analytics viz
OU Rise library analytics vizTony Hirst
 
Mid-Ontology Learning from Linked Data @JIST2011
Mid-Ontology Learning from Linked Data @JIST2011Mid-Ontology Learning from Linked Data @JIST2011
Mid-Ontology Learning from Linked Data @JIST2011Lihua Zhao
 
Instance-Based Ontological Knowledge Acquisition
Semantic Meta-Mining of Knowledge Discovery Processes

  • 1. Semantic Meta-Mining of Knowledge Discovery Processes Agnieszka Lawrynowicz collaboration with Jedrzej Potoniec, Maria C. Keet, Melanie Hilario, Claudia d’Amato, Raul Palma and others - see acknowledgements Poznan University of Technology June 11, 2015 ADAA Seminar Silesian University of Technology
  • 2. Outline Semantic data mining Pattern discovery with Fr-ONT-Qu Meta-mining of KD processes ▸ e-LICO Intelligent Discovery Assistant ▸ Data Mining OPtimization Ontology ▸ Semantic meta-mining Summary and future work
  • 3. Introduction: data mining Input: a data table, text documents, ... Output: a model, a pattern set [Diagram: data → data mining → model, patterns]
  • 4. Introduction: using background knowledge in data mining Using background knowledge in data mining has been extensively researched hierarchy/taxonomy of attributes (Michalski et al., 1986, Srikant, Agrawal, 1995) Inductive Logic Programming (Muggleton, 1991, Lavrac and Dzeroski, 1994) relational learning (Quinlan, 1993, de Raedt, 2008) semantic data mining tutorial @ ECML/PKDD’2011 (Lavrac, Vavpetic, Lawrynowicz, Potoniec, Hilario, Kalousis)
  • 5. Introduction: relational data mining Input: a relational database, a graph, a set of logical facts, ... Output: a model, a pattern set [Diagram: relational data mining → model, patterns]
  • 6. Semantic data mining Input: a data table, text documents, Web pages, a relational database, a graph, a set of logical facts, ... one or more ontologies Output: a model, a pattern set [Diagram: data and ontologies (annotations, mappings, vocabulary re-use) → semantic data mining → model, patterns]
  • 7. Fr-ONT-Qu algorithm for mining patterns in RDF(s) data patterns expressed as SPARQL queries consists of: a refinement operator and a strategy to select best patterns for further refinement
  • 8. Overview Input of the algorithm: a declarative bias (B) to limit a search space (i.e. classes and properties to use) and maximal number of iterations 2 thresholds: for keeping good enough patterns and for refining best patterns several quality measures to select for thresholds (e.g. support on KB) beam search size
  • 9. Example. B: classes: PassengerTrain, CargoTrain; property: hasEngine. 1. Refine every pattern from the previous iteration by adding a single restriction for a variable already existing in the pattern. E.g. for the pattern {?x a :Train.}, its refinements are: ▸ {?x a :Train . ?x a :CargoTrain.} ▸ {?x a :Train . ?x a :PassengerTrain} ▸ {?x a :Train . ?x :hasEngine ?y} 2. Evaluate patterns (with some quality measure, such as support on a data set) and select only the best ones. 3. Repeat steps 1-2 as long as there are patterns for refinement and the maximal number of iterations is not exceeded.
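The refine, evaluate and repeat loop from slides 7 to 9 can be sketched in a few lines of Python. This is only an illustrative sketch, not the Fr-ONT-Qu implementation: the toy triples, the function names (`match`, `support`, `refine`, `mine`) and the parameters `min_support`, `beam` and `max_iter` are assumptions made for the example, patterns are plain tuples of triples rather than SPARQL queries, and the two thresholds from the slides are collapsed into a single support threshold plus a beam width.

```python
# Toy knowledge base of (subject, predicate, object) triples; hypothetical data.
KB = {
    ("t1", "a", "Train"), ("t1", "a", "CargoTrain"), ("t1", "hasEngine", "e1"),
    ("t2", "a", "Train"), ("t2", "a", "PassengerTrain"), ("t2", "hasEngine", "e2"),
    ("t3", "a", "Train"), ("t3", "a", "PassengerTrain"),
}

def is_var(term):
    return term.startswith("?")

def match(pattern, binding=None):
    """Yield every variable binding under which all triple patterns hold in KB."""
    binding = binding or {}
    if not pattern:
        yield binding
        return
    (s, p, o), rest = pattern[0], pattern[1:]
    for ks, kp, ko in KB:
        if kp != p:
            continue
        new = dict(binding)
        ok = True
        for term, value in ((s, ks), (o, ko)):
            if is_var(term):
                # Bind the variable, or check it against its earlier binding.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(rest, new)

def support(pattern):
    """Number of distinct bindings of ?x that satisfy the pattern."""
    return len({b["?x"] for b in match(pattern)})

def refine(pattern, classes, properties):
    """Add a single restriction on a variable that already occurs in the pattern."""
    variables = sorted({t for s, _, o in pattern for t in (s, o) if is_var(t)})
    fresh = f"?v{len(variables)}"  # name for a newly introduced variable
    for v in variables:
        for c in classes:
            if (v, "a", c) not in pattern:
                yield tuple(sorted(pattern + ((v, "a", c),)))
        for p in properties:
            yield tuple(sorted(pattern + ((v, p, fresh),)))

def mine(start, classes, properties, min_support=1, beam=2, max_iter=2):
    """Keep patterns with enough support; refine only the `beam` best per iteration."""
    kept, frontier = [], [start]
    for _ in range(max_iter):
        candidates = {r for p in frontier for r in refine(p, classes, properties)}
        scored = sorted(((support(c), c) for c in candidates),
                        key=lambda sc: (-sc[0], sc[1]))
        good = [(s, c) for s, c in scored if s >= min_support]
        kept.extend(good)
        frontier = [c for _, c in good[:beam]]
        if not frontier:
            break
    return kept

start = (("?x", "a", "Train"),)
results = mine(start, ["PassengerTrain", "CargoTrain"], ["hasEngine"])
```

Sorting each refined pattern gives it a canonical form, so the same set of restrictions reached along two different refinement paths is evaluated only once.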
  • 10. Trie data structure
  • 11. Pattern based classification 1/2
  • 12. Pattern based classification 2/2 We learn features that are optimized with regard to the (classification) task
  • 13. Propositionalisation 1/2 Patterns: 1) ?x a :Train . ?x :hasCar ?y 2) ?x a :Train . ?x :hasCar ?y . ?y :hasShape :rectangle 3) ?x a :Train . ?x :hasCar ?y . ?y :wheels :three 4) … Dataset (Michalski’s train problem, 1977)
  • 14. Propositionalisation 2/2 In this way, learned features may be consumed by any off-the-shelf ’attribute-value’ classification algorithm
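Propositionalisation as described on slides 13 and 14 can be illustrated with a small Python sketch: each mined pattern becomes a 0/1 column, and each example (here, a train) becomes a row. The triples below are hypothetical toy data in the spirit of Michalski's train problem, and the names `holds` and `propositionalise` are assumptions for this example, not part of Fr-ONT-Qu or RMonto.

```python
# Hypothetical toy data in the spirit of Michalski's trains.
KB = {
    ("t1", "a", "Train"), ("t1", "hasCar", "c1"), ("c1", "hasShape", "rectangle"),
    ("t2", "a", "Train"), ("t2", "hasCar", "c2"), ("c2", "wheels", "three"),
    ("t3", "a", "Train"),
}

# The three patterns listed on slide 13, written as lists of triple patterns.
PATTERNS = [
    [("?x", "a", "Train"), ("?x", "hasCar", "?y")],
    [("?x", "a", "Train"), ("?x", "hasCar", "?y"), ("?y", "hasShape", "rectangle")],
    [("?x", "a", "Train"), ("?x", "hasCar", "?y"), ("?y", "wheels", "three")],
]

def holds(pattern, binding):
    """True if some extension of `binding` satisfies every triple pattern in KB."""
    if not pattern:
        return True
    (s, p, o), rest = pattern[0], pattern[1:]
    for ks, kp, ko in KB:
        if kp != p:
            continue
        new = dict(binding)
        # Variables are bound (or checked against an earlier binding);
        # constants must match the triple exactly.
        if all(new.setdefault(t, v) == v if t.startswith("?") else t == v
               for t, v in ((s, ks), (o, ko))):
            if holds(rest, new):
                return True
    return False

def propositionalise(examples, patterns):
    """One row per example, one 0/1 column per pattern."""
    return {e: [int(holds(p, {"?x": e})) for p in patterns] for e in examples}

table = propositionalise(["t1", "t2", "t3"], PATTERNS)
for example, row in sorted(table.items()):
    print(example, row)
```

The resulting rows are ordinary attribute-value vectors, so they could be passed to any standard classifier together with the class labels.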
  • 15. What is RapidMiner? 1/2
  • 16. What is RapidMiner? 2/2
  • 17. What is RapidMiner? 2/2
  • 18. RMonto - plugin to RapidMiner
  • 19. Comparative experiments on classification of semantic data 1/2 we considered published work with available results and datasets (including ESWC 2008 best paper, ESWC 2012 best paper) various types of methods: kernel methods, statistical relational classifier, concept learning algorithms we strictly followed the tasks, protocols and experimental setups of the methods
  • 20. Comparative experiments on classification of semantic data 2/2 For the classification task, Fr-ONT-Qu outperformed state-of-the-art approaches to classification of Semantic Web data (see: "Pattern based feature construction in semantic data mining" by A. Lawrynowicz, J. Potoniec, IJSWIS 10(1), 2014): kernel methods by Bloehdorn et al. (2007) and Loesch et al. (ESWC 2012 best paper) on the SWRC AIFB dataset; the statistical relational classifier SPARQL-ML by Kiefer et al. (ESWC 2008 best paper) on the SWRC AIFB dataset and the OWLS-TC v2.1 dataset; the concept learning algorithms DL-FOIL by Fanizzi et al. (2008) and the cutting-edge CELOE variant of DL-Learner by Lehmann (2009), on all measures, on the BioPax, NTN and Financial datasets.
  • 21. Overview of meta-learning Meta-learning: learning to learn; the application of machine learning techniques to meta-data about past machine learning experiments; the goal: to modify some aspect of the learning process to improve the performance of the resulting model; meta-mining: meta-learning applied to the full data mining process.
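As a rough illustration of meta-learning, the sketch below recommends an algorithm for a new dataset by finding the most similar dataset among past experiments. The meta-data records, the two dataset characteristics, and the nearest-neighbour rule are all hypothetical simplifications chosen for brevity; they are not the meta-miner described in this talk.

```python
import math

# Hypothetical meta-data: (dataset characteristics, best-performing algorithm).
# Characteristics here are (number of instances, number of attributes).
past_experiments = [
    ((150, 4), "decision_tree"),
    ((60, 5000), "linear_svm"),
    ((10000, 20), "random_forest"),
]


def recommend(characteristics, experiments):
    """Recommend the algorithm that won on the most similar past dataset."""

    def distance(a, b):
        # Compare on a log scale so that large counts do not dominate.
        return math.dist([math.log10(v) for v in a], [math.log10(v) for v in b])

    _, best = min(experiments, key=lambda e: distance(e[0], characteristics))
    return best


# A new dataset with few instances and many attributes.
print(recommend((80, 3000), past_experiments))
```

Meta-mining extends this idea from single algorithms to whole workflows, so the records would describe complete data mining processes rather than one learner each.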
  • 22. Overview of the e-LICO system [...] platform. This section presents the different components of the e-LICO architecture (see figure) and shows how they interact to achieve the user's knowledge discovery goal. The e-LICO infrastructure (depicted in the figure under the dashed line) is the means by which the data-mining platform is delivered to scientists. The innovative core of the e-LICO platform is the Intelligent Discovery Assistant (IDA, above the dashed line) with its planner and meta-learner. However, to deliver the data-mining platform to its scientist users, there are several other services and components. The figure shows an overview of e-LICO's components and how they interact with each other. Figure: Overview of the e-LICO system. There are two user-facing components for the e-LICO platform; these allow scientists to access data-mining operators and/or other data processing services, to compose them into workflows and execute them, collecting the results for interpretation or further analysis. These two central infrastructure components are:
  • 23. IDA architecture [Diagram: Intelligent Discovery Assistant. Components: AI Planner, Probabilistic Ranker, Semantic Meta-Miner, DMEX DB. Labels: goal, data, DM Workflow Ontology (DMWF), planned workflows, ranked workflows, top ranked workflows, meta-mined model, DM Optimization Ontology (DMOP), training meta-data.]
  • 24. Ontology in computer science "engineering artefact [...]" (Guarino 98)
  • 25. Ontology in computer science "engineering artefact [...]" (Guarino 98); "An ontology is a formal specification (machine interpretation) of a shared (group of people, consensus) conceptualization (abstract model of phenomena, concepts) of a domain of interest" (domain knowledge) (Gruber 93). Ontology = formal specification of a terminology (from a particular domain)
  • 26. Data Mining OPtimization Ontology (DMOP) The primary goal of DMOP is to support all decision-making steps that determine the outcome of the data mining process; development started in the EU FP7 project e-LICO (2009-2012); DMOP v5.5: 723 classes, 111 properties, 4291 axioms; highly axiomatized; represented in the Web Ontology Language (OWL 2).
  • 27. Competency questions "Given a data mining task/data set, which of the valid or applicable workflows/algorithms will yield optimal results (or at least better results than the others)?" "Given a set of candidate workflows/algorithms for a given task/data set, which data set/workflow/algorithm characteristics should be taken into account in order to select the most appropriate one?" And other, more fine-grained questions, e.g.: "Which induction algorithms should I use (or avoid) when my dataset has many more variables than instances?"
  • 28. Architecture of DMOP knowledge base and its satellite triple stores [Diagram: DMOP (OWL 2) comprises a TBox, the formal conceptual framework of the data mining domain, and an ABox with the Operator DB, the accepted knowledge of DM tasks, algorithms and operators; together these are the meta-miner's prior DM knowledge. The RDF triple stores DMEX-DB1, DMEX-DB2, …, DMEX-DBk hold specific DM applications (datasets, workflows, results) and are the meta-miner's training data.]
  • 29. The core concepts of DMOP Fig. 1. The core concepts of DMOP.
  • 30. DMOP: algorithm representation
  • 31. Alignment of DMOP with DOLCE 1/3 Two main reasons to align DMOP with a foundational ontology: (1) considerations about attributes and data properties: extant non-foundational ontology solutions were partial re-inventions of how they are treated in a foundational ontology; (2) reuse of the ontology's object properties.
  • 32. Alignment of DMOP with DOLCE 2/3
  • 33. Alignment of DMOP with DOLCE 3/3 Perdurant: DM-Experiment and DM-Operation are subclasses of dolce:process; Endurant: most DM classes, such as algorithm, software, strategy, task, and optimization problem, are subclasses of dolce:non-physical-endurant; Quality: characteristics and parameters of DM entities are made subclasses of dolce:abstract-quality; Abstract: for identifying discrete values, classes are added as subclasses of dolce:abstract-region; object properties: DMOP reuses mainly DOLCE's parthood, quality, and quale relations; each of the four main DOLCE branches has been used.
  • 34. Qualities and attributes 1/4 How to handle 'attributes' in OWL ontologies, and, in a broader context, measurements? Easy way: an attribute is a binary functional relation between a class and a datatype: Elephant ⊑ =1 hasWeight.integer; Elephant ⊑ =1 hasWeightPrecise.real; Elephant ⊑ =1 hasWeightImperial.integer (in lbs). Drawback: this builds into one's ontology application decisions about how to store the data (and in which unit it is).
  • 35. Qualities and attributes 2/4 How to handle 'attributes' in OWL ontologies, and, in a broader context, measurements? More elaborate way: unfold the notion of an object's property (e.g. weight) from one attribute/OWL data property into at least two properties: one OWL object property from the object to the 'reified attribute' ("quality property" represented as an OWL class) and another property to the value(s) ▸ favoured in foundational ontologies; ▸ solves the problem of non-reusability of the 'attribute' and prevents duplication of data properties; ▸ neither ontology has any solution to represent actual values and units of measurement; measurements for DMOP are more like values for parameters.
  • 36. Qualities and attributes 3/4 [Diagram relating DMOP classes to DOLCE. Classes: DM-Data, DataType, DataFormat, TableFormat, DataTable, Characteristic, Parameter, DataCharacteristic, anyType, dolce:particular, dolce:non-physical-endurant, dolce:abstract, dolce:quality, dolce:abstract-quality, dolce:region, dolce:abstract-region, dolce:quale. Properties: hasDataValue, hasDataType, hasTableFormat, has-quality, dolce:has-quality, dolce:has-quale, dolce:q-location.]
  • 37. Qualities and attributes 4/4 ModelingAlgorithm ⊑ =1 has-quality.LearningPolicy; LearningPolicy is a dolce:quality; LearningPolicy ⊑ =1 has-quale.Eager-Lazy; Eager-Lazy is a subclass of dolce:abstract-region; Eager-Lazy ⊑ ≤ 1 hasDataValue.anyType. In this way, the ontology can be linked to many different applications, which may even use different data types yet still agree on the meaning of the characteristics and parameters ('attributes') of the algorithms, tasks, and other DM endurants.
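The quality/quale chain can be mirrored in ordinary code to show why it decouples meaning from storage. This is only an analogy under stated assumptions: the class names `Region`, `Quality` and `Algorithm`, the region name "Lazy", and the literals "lazy" and 0 are invented for illustration and are not DMOP or DOLCE identifiers.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Region:
    """Plays the role of a dolce:abstract-region such as Eager-Lazy."""
    name: str
    data_value: Any  # hasDataValue: the application-specific literal


@dataclass(frozen=True)
class Quality:
    """Plays the role of a dolce:quality such as LearningPolicy."""
    name: str
    quale: Region  # has-quale


@dataclass(frozen=True)
class Algorithm:
    name: str
    learning_policy: Quality  # has-quality


# Two hypothetical applications store the same region with different datatypes.
app_a = Algorithm("kNN", Quality("LearningPolicy", Region("Lazy", "lazy")))
app_b = Algorithm("kNN", Quality("LearningPolicy", Region("Lazy", 0)))


def same_meaning(x, y):
    """Compare quality and region names, ignoring the stored literal."""
    qx, qy = x.learning_policy, y.learning_policy
    return (qx.name, qx.quale.name) == (qy.name, qy.quale.name)


print(same_meaning(app_a, app_b))
```

With a plain data property the string and the integer would simply be different values; reifying the quality gives both applications a shared object to agree on.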
  • 38. Meta-modeling in DMOP 1/4 Only processes (executions of workflows) and operations (executions of operators) consume inputs and produce outputs; DM algorithms (as well as operators and workflows) can only specify the type of input or output; inputs and outputs (the DM-Dataset and DM-Hypothesis class hierarchies, respectively) are modeled as subclasses of the IO-Object class.
  • 39. Meta-modeling in DMOP 2/4 DM algorithms: classes or individuals? Individuals. Problem: expressing the types of inputs/outputs associated with an algorithm: "C4.5 specifiesInputClass CategoricalLabeledDataSet", where C4.5 is an individual (instance of DM-Algorithm) and CategoricalLabeledDataSet is a class (subclass of DM-Hypothesis).
  • 40. Meta-modeling in DMOP 3/4 Initial solution: one artificial class per algorithm, with a single instance corresponding to that particular algorithm. Problem: hasInput, hasOutput, specifiesInputClass and specifiesOutputClass are all assigned a common range, IO-Object: "C4.5 specifiesInputClass Iris"?, where C4.5 is an individual (instance of DM-Algorithm) and Iris is an individual (instance of DM-Hypothesis). Iris is a concrete dataset. Clearly, no DM algorithm is designed to handle only one particular dataset.
  • 41. Meta-modeling in DMOP 4/4 Final solution: the weak form of punning available in OWL 2. IO-Class: a meta-class, the class of all classes of input and output objects. "C4.5 specifiesInputClass CategoricalLabeledDataSet", where C4.5 is an individual (instance of DM-Algorithm) and CategoricalLabeledDataSet is an individual (instance of IO-Class); "DM-Process hasInput some CategoricalLabeledDataSet", where DM-Process is a class (subclass of dolce:process) and CategoricalLabeledDataSet is a class (subclass of IO-Object).
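A small sketch of what punning amounts to at the level of triples: the same name, CategoricalLabeledDataSet, is typed as an individual of IO-Class and also declared a subclass of IO-Object. The tuple encoding and the helper `punned_names` are assumptions made for illustration; an OWL 2 reasoner treats the class view and the individual view of a punned name as separate entities.

```python
# Triples as (subject, predicate, object); identifiers follow the slide.
triples = {
    ("C4.5", "rdf:type", "DM-Algorithm"),
    # Individual view: the name is an instance of the meta-class IO-Class.
    ("CategoricalLabeledDataSet", "rdf:type", "IO-Class"),
    ("C4.5", "specifiesInputClass", "CategoricalLabeledDataSet"),
    # Class view: the same name is a subclass of IO-Object.
    ("CategoricalLabeledDataSet", "rdfs:subClassOf", "IO-Object"),
}


def punned_names(triples):
    """Names used both as an individual (rdf:type) and as a class (rdfs:subClassOf)."""
    individuals = {s for s, p, _ in triples if p == "rdf:type"}
    classes = {s for s, p, _ in triples if p == "rdfs:subClassOf"}
    return individuals & classes


print(punned_names(triples))
```

Here only CategoricalLabeledDataSet plays both roles, which is what lets an algorithm individual point at a type of input rather than at a concrete dataset.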
  • 42. DMOP: further details The Data Mining Optimization Ontology. C. Maria Keet, Agnieszka Lawrynowicz, Claudia d'Amato, Alexandros Kalousis, Phong Nguyen, Raul Palma, Robert Stevens, and Melanie Hilario, Journal of Web Semantics, DOI: 10.1016/j.websem.2015.01.001
  • 43. Recap: Propositionalisation Patterns: 1) ?x a :Train . ?x :hasCar ?y 2) ?x a :Train . ?x :hasCar ?y . ?y :hasShape :rectangle 3) ?x a :Train . ?x :hasCar ?y . ?y :wheels :three 4) … Dataset (Michalski's train problem, 1977)
  • 44. RapidMiner XML based workflow representation
  • 45. Importing RapidMiner workflows to DMOP based RDF format
  • 46. Propositionalisation [Diagram labels: workflow patterns; dataset; DMOP-based RDF repository of DM processes; dataset characteristics; features.]
Results of experiments. Below we present the results of the experimental evaluation of Fr-ONT-Qu in the meta-mining scenario. In the experiments, we used OWLIM SE (v5.3.5849) as the underlying reasoning engine and semantic store, with the owl2-rl-reduced-optimized ruleset. The choice of this ruleset was motivated by the expressivity of our background knowledge base, e.g. the existence of object property chains. During each cycle of cross-validation, Fr-ONT-Qu discovered around 2000 patterns, and redundant patterns were subsequently pruned. We discuss some of the discovered patterns below (for compactness, Bd denotes the body of the base pattern used in the experiments). The first example pattern:
Q1 = select distinct ?x where { Bd ∪
  ?opex2 dmop:executes ?front0 .
  ?opex2 dmop:executes rm:RM-Decision_Tree .
  ?opex2 dmop:hasParameterSetting ?front1 .
  ?front0 dmop:executes rm:DM-Operator .
  ?front0 dmop:implements ?front2 .
  ?front2 a dmop:DM-Algorithm .
  ?front2 a dmop:InductionAlgorithm .
  ?front2 a dmop:ModelingAlgorithm .
  ?front2 a dmop:ClassificationModelingAlgorithm .
  ?front2 a dmop:ClassificationTreeInductionAlgorithm .
}
was mined when Fr-ONT-Qu traversed down the hierarchy of algorithm classes, specializing the variable ?front2. In this way, it is possible to abstract from the level of operators (algorithm implementations) to the level of algorithms and their taxonomy. For instance, both the rm:RM-Decision_Tree and weka:Weka-J48 operators implement a classification tree induction algorithm, and one may generalize over it. Patterns containing class hierarchies provide expressivity similar to that of patterns mined in so-called generalized association rule mining.
The following pattern covers only those workflows that contain the 'Decision Tree' operator for which the parameter minimal size for split has a value between 2 and 5.5:
Q2 = select distinct ?x where { Bd ∪
  ?opex2 dmop:executes ?front0 .
  ?opex2 dmop:executes rm:RM-Decision_Tree .
  ?opex2 dmop:hasParameterSetting ?front1 .
  ?front0 dmop:executes rm:DM-Operator .
  ?front1 dmop:setsValueOf ?front2 .
  ?front1 dmop:hasValue ?front3 .
  filter(2.000000 <= xsd:double(?front3) && xsd:double(?front3) <= 16.000000) .
  ?front2 dmop:hasParameterKey 'minimal_size_for_split' .
  ?front1 dmop:hasValue ?front3 .
  filter(2.000000 <= xsd:double(?front3) && xsd:double(?front3) <= 9.000000) .
  ?front1 dmop:hasValue ?front3 .
  filter(2.000000 <= xsd:double(?front3) && xsd:double(?front3) <= 5.500000) .
}
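The two kinds of refinement seen in Q1 and Q2, generalizing an operator to a class in the algorithm taxonomy and restricting a numeric parameter to an interval, can be sketched without a triple store. This is a simplified sketch under assumptions: the `parent` chain is inferred from the order of classes in Q1, operators are mapped directly to an algorithm class, and the workflow encoding and the function `covers` are hypothetical rather than part of Fr-ONT-Qu.

```python
# Assumed fragment of the algorithm taxonomy (child -> parent), following Q1.
parent = {
    "ClassificationTreeInductionAlgorithm": "ClassificationModelingAlgorithm",
    "ClassificationModelingAlgorithm": "ModelingAlgorithm",
    "ModelingAlgorithm": "InductionAlgorithm",
    "InductionAlgorithm": "DM-Algorithm",
}

# Simplification: each operator maps straight to the class of algorithm it implements.
implements = {
    "rm:RM-Decision_Tree": "ClassificationTreeInductionAlgorithm",
    "weka:Weka-J48": "ClassificationTreeInductionAlgorithm",
}


def algorithm_classes(operator):
    """Return the operator's algorithm class followed by all its ancestors."""
    cls = implements[operator]
    classes = [cls]
    while cls in parent:
        cls = parent[cls]
        classes.append(cls)
    return classes


def covers(workflow, algorithm_class, key=None, low=None, high=None):
    """True if some operator implements algorithm_class and, when a key is
    given, has that parameter set to a value within [low, high]."""
    for operator, params in workflow:
        if algorithm_class not in algorithm_classes(operator):
            continue
        if key is None:
            return True
        value = params.get(key)
        if value is not None and low <= float(value) <= high:
            return True
    return False


# Hypothetical workflows: lists of (operator, parameter settings).
w1 = [("rm:RM-Decision_Tree", {"minimal_size_for_split": "4"})]
w2 = [("weka:Weka-J48", {})]

print(covers(w1, "ClassificationTreeInductionAlgorithm"))
print(covers(w2, "ClassificationTreeInductionAlgorithm"))
print(covers(w2, "ClassificationTreeInductionAlgorithm",
             "minimal_size_for_split", 2.0, 5.5))
```

Both workflows match at the level of the algorithm class, while the interval condition keeps only the one with a matching parameter setting.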
  • 47. Semantic meta-mining experimental setup Baseline DM experiment set: 1581 RapidMiner workflows solving a predictive modeling task on 11 UCI datasets; dataset characteristics; meta-data stored in DMEX-DB, containing over 85 million RDF triples; workflow patterns represented as SPARQL queries using DMOP entities.
  • 48. The inside of X-Validation operator with the workflow for training and evaluating the pattern-based model
  • 49. Semantic meta-mining results McNemar's test for pairs of classifiers was performed with the null hypothesis that a classifier built using dataset characteristics and a mined pattern set has the same error rate as the baseline, which used dataset characteristics and only the names of the machine learning DM operators. The test confirmed that classifiers trained using workflow patterns performed significantly better (accuracy 0.927) than the baseline (accuracy 0.890).
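For readers unfamiliar with McNemar's test, the sketch below computes the statistic from the two discordant counts, using the common continuity-corrected form. The counts passed in at the end are made-up illustrative numbers, not the counts from these experiments, and the talk does not say which variant of the test was used.

```python
import math


def mcnemar(b, c):
    """McNemar's chi-squared statistic (continuity-corrected) and p-value.

    b: cases the first classifier got right and the second got wrong
    c: cases the first classifier got wrong and the second got right
    """
    if b + c == 0:
        return 0.0, 1.0
    statistic = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Illustrative counts only.
statistic, p_value = mcnemar(10, 30)
print(statistic, p_value)
```

Only the cases where the two classifiers disagree enter the statistic, which is why the test suits paired comparisons on the same test instances.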
  • 50. Summary and future work The RMonto RapidMiner plugin, all experimental data and (meta-mining) workflows are publicly available: http://www.myexperiment.org/packs/421.html, http://semantic.cs.put.poznan.pl/fr-ont/ LeoLOD project - Learning and Evolving Ontologies from Linked Open Data (2013-2015) ▸ project funded by the Foundation for Polish Science under the POMOST programme, ▸ Fr-ONT-Qu re-adapted for ontology learning, ▸ DMOP used to model provenance metadata (in industry: traceability) of ontology learning workflows. DMOP is being aligned to OPMW (Open Provenance Model for Workflows).
  • 51. Acknowledgements Foundation for Polish Science under the POMOST programme, co-financed by the European Union, Regional Development Fund (No POMOST/2013-7/8) (2013-2015); EU FP7 ICT-2007.4.4 (No 231519) "e-LICO: An e-Laboratory for Interdisciplinary Collaborative Research in Data Mining and Data-Intensive Science" (2009-2012). RMonto, the meta-mining experiments and the LeoLOD plugin were done jointly with Jedrzej Potoniec. Contributors to the development of DMOP and/or other e-LICO infrastructure used in the research described in this presentation: Melanie Hilario, C. Maria Keet, Claudia d'Amato, Huyen Do, Simon Fischer, Dragan Gamberger, Lina Al-Jadir, Simon Jupp, Alexandros Kalousis, Joerg-Uwe Kietz, Petra Kralj Novak, Babak Mougouie, Phong Nguyen, Raul Palma, Floarea Serban, Robert Stevens, Anze Vavpetic, Jun Wang, Derry Wijaya, Adam Woznica. Thanks to Veli Bicer for sharing the AIFB dataset.