Data Mining OPtimization Ontology and its application
to meta-mining of knowledge discovery processes
Agnieszka Lawrynowicz
collaboration with C. Maria Keet, Melanie Hilario, Claudia d’Amato,
Jedrzej Potoniec and others - see acknowledgements
Poznan University of Technology, Poland
25th September 2014
OEG group seminar at Universidad Politécnica de Madrid (UPM)
Outline
Overview of DMOP: purpose, scope, core classes
Modeling issues
▸ meta-modeling in DMOP;
▸ alignment of DMOP with the DOLCE foundational ontology;
▸ qualities and attributes;
▸ property chains in DMOP;
▸ other modeling considerations;
Meta-mining of KDD processes
▸ RapidMiner
▸ RMonto
▸ Fr-ONT-Qu
▸ experimental evaluation
Data Mining OPtimization Ontology (DMOP)
the primary goal of DMOP is to support all decision-making steps
that determine the outcome of the data mining process;
development started in EU FP7 project e-LICO (2009-2012);
DMOP v5.4: ∼ 750 classes, ∼ 200 properties, ∼ 3200 axioms;
highly axiomatized;
uses almost all OWL 2 features;
Overview of meta-learning
Meta-learning: learning to learn
application of machine learning techniques to meta-data about past
machine learning experiments;
the goal: to modify some aspect of the learning process to improve
the performance of the resulting model;
meta-mining: meta-learning applied to full data mining process
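A minimal sketch of the meta-learning idea, assuming a toy meta-dataset: the meta-features, algorithm names and accuracies below are invented for illustration and do not come from e-LICO. A 1-nearest-neighbour meta-learner recommends the algorithm that performed best on the most similar past dataset:

```python
import math

# Hypothetical meta-data about past experiments (all values are invented):
# (n_instances, n_features) -> accuracy obtained by each algorithm.
PAST_EXPERIMENTS = [
    ((150, 4), {"DecisionTree": 0.94, "NaiveBayes": 0.95}),
    ((60, 2000), {"DecisionTree": 0.71, "NaiveBayes": 0.83}),
    ((10000, 20), {"DecisionTree": 0.91, "NaiveBayes": 0.84}),
]


def meta_features(n_instances, n_features):
    """Log-scale the raw counts so that large datasets do not dominate distances."""
    return (math.log10(n_instances), math.log10(n_features))


def recommend(n_instances, n_features):
    """Return the best algorithm of the most similar past dataset (1-NN)."""
    query = meta_features(n_instances, n_features)
    _, results = min(
        PAST_EXPERIMENTS,
        key=lambda exp: math.dist(query, meta_features(*exp[0])),
    )
    return max(results, key=results.get)


# A new dataset with many more variables than instances.
suggestion = recommend(80, 1500)
```

Meta-mining extends this idea from single algorithms to whole workflows, which requires richer meta-data than the two counts used in this sketch.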
Overview of the e-LICO system
... platform. This section presents the different components of the e-LICO architecture (Figure 1) and
shows how they interact to achieve the user's knowledge discovery goal.
The e-LICO infrastructure (depicted in the figure under the dashed line) is the means by which the
data-mining platform is delivered to scientists. The innovative core of the e-LICO platform is the
Intelligent Discovery Assistant (IDA, above the dashed line) with its planner and meta-learner.
However, to deliver the data-mining platform to its scientist users, there are several other services
and components. Figure 1 shows an overview of e-LICO's components and how they interact with
each other.
Figure 1. Overview of the e-LICO system.
There are two user-facing components for the e-LICO platform; these allow scientists to access
data-mining operators and/or other data processing services, to compose them into workflows and
execute them, collecting the results for interpretation or further analysis. These two central
infrastructure components are:
Competency questions
“Given a data mining task/data set, which of the valid or applicable
workflows/algorithms will yield optimal results (or at least better results
than the others)?”
“Given a set of candidate workflows/algorithms for a given task/data
set, which data set/workflow/algorithm characteristics should be
taken into account in order to select the most appropriate one?”
and others more fine-grained, e.g.:
“Which induction algorithms should I use (or avoid) when my dataset
has many more variables than instances?”
Architecture of DMOP knowledge base and its satellite
triple stores
Figure 5: Architecture of DMOP knowledge base and its satellite triple stores
The core concepts of DMOP
Fig. 1. The core concepts of DMOP.
Meta-modeling in DMOP 1/4
only processes (executions of workflows) and operations (executions
of operators) consume inputs and produce outputs
DM algorithms (as well as operators and workflows) can only specify
the type of input or output
inputs and outputs (DM-Dataset and DM-Hypothesis class hierarchy,
respectively) are modeled as subclasses of IO-Object class
Meta-modeling in DMOP 2/4
DM algorithms: classes or individuals? Individuals.
Problem: expressing types of inputs/outputs associated with
algorithm
“C4.5 specifiesInputClass CategoricalLabeledDataSet” ✓
C4.5: individual (instance of DM-Algorithm)
CategoricalLabeledDataSet: class (subclass of DM-Dataset)
Meta-modeling in DMOP 3/4
Initial solution: one artificial class per algorithm, with a single
instance corresponding to that particular algorithm
Problem: hasInput, hasOutput, specifiesInputClass and
specifiesOutputClass were all assigned a common range, IO-Object
“C4.5 specifiesInputClass Iris” ?
C4.5: individual (instance of DM-Algorithm)
Iris: individual (instance of DM-Dataset)
Iris is a concrete dataset. Clearly, no DM algorithm is designed
to handle only one particular dataset.
Meta-modeling in DMOP 4/4
Final solution: weak form of punning available in OWL 2
IO-Class: meta-class, i.e. the class of all classes of input and output
objects
“C4.5 specifiesInputClass CategoricalLabeledDataSet” ✓
C4.5: individual (instance of DM-Algorithm)
CategoricalLabeledDataSet: individual (instance of IO-Class)
“DM-Process hasInput some CategoricalLabeledDataSet” ✓
DM-Process: class (subclass of dolce:process)
CategoricalLabeledDataSet: class (subclass of IO-Object)
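The punning solution can be sketched with a toy triple store in plain Python. The entity and property names follow the slide; the helper functions and the tuple encoding are assumptions made for illustration, not part of DMOP or of any OWL API. The point is that one name, CategoricalLabeledDataSet, appears both as a class and as an individual:

```python
# Toy triple store: a set of (subject, predicate, object) tuples.
triples = {
    # Class view: CategoricalLabeledDataSet is a subclass of IO-Object.
    ("CategoricalLabeledDataSet", "rdfs:subClassOf", "IO-Object"),
    # Individual view (punning): the same name is an instance of the meta-class.
    ("CategoricalLabeledDataSet", "rdf:type", "IO-Class"),
    ("C4.5", "rdf:type", "DM-Algorithm"),
    # Individual-to-individual assertion, made possible by the punned name.
    ("C4.5", "specifiesInputClass", "CategoricalLabeledDataSet"),
}


def is_class(name):
    """True if the name is used in class position (has a superclass)."""
    return any(s == name and p == "rdfs:subClassOf" for s, p, _ in triples)


def is_individual(name):
    """True if the name is used in individual position (has a type)."""
    return any(s == name and p == "rdf:type" for s, p, _ in triples)


def input_classes(algorithm):
    """Input classes that an algorithm individual specifies."""
    return {o for s, p, o in triples
            if s == algorithm and p == "specifiesInputClass"}
```

In OWL 2 DL the class view and the individual view of a punned name are interpreted independently, which is why the slide calls this a weak form of punning.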
Alignment of DMOP with DOLCE 1/2
Two main reasons to align DMOP with a foundational ontology:
considerations about attributes and data properties; extant
non-foundational ontology solutions were partial re-inventions of how
they are treated in a foundational ontology;
reuse of the ontology’s object properties;
Alignment of DMOP with DOLCE 2/2
Perdurant: DM-Experiment and DM-Operation are subclasses of
dolce:process;
Endurant: most DM classes, such as algorithm, software, strategy,
task, and optimization problem, are subclasses of
dolce:non-physical-endurant;
Quality: characteristics and parameters of DM entities made
subclasses of dolce:abstract-quality;
Abstract: for identifying discrete values, classes added as subclasses
of dolce:abstract-region;
object properties: DMOP reuses mainly DOLCE’s parthood, quality,
and quale relations;
each of the four DOLCE main branches has been used.
Qualities and attributes 1/4
How to handle ’attributes’ in OWL ontologies, and, in a broader context,
measurements?
easy way: attribute is a binary functional relation between a class and
a datatype
Elephant ⊑ =1 hasWeight.integer
Elephant ⊑ =1 hasWeightPrecise.real
Elephant ⊑ =1 hasWeightImperial.integer (in lbs)
drawback: this builds application decisions about how to store
the data (and in which unit it is) into the ontology
Qualities and attributes 2/4
How to handle ’attributes’ in OWL ontologies, and, in a broader context,
measurements?
more elaborate way: unfold the notion of an object’s property (e.g.
weight) from one attribute/OWL data property into at least two
properties: one OWL object property from the object to the ’reified
attribute’ (“quality property” represented as an OWL class) and
another property to the value(s)
▸ favoured in foundational ontologies;
▸ solves the problem of non-reusability of the ’attribute’ and prevents
duplication of data properties;
▸ neither ontology has any solution to represent actual values and units
of measurements;
measurements in DMOP are more like values for parameters;
Qualities and attributes 3/4
[Figure: DMOP qualities and attributes aligned with DOLCE. Classes shown: dolce:particular, dolce:non-physical-endurant, dolce:quality, dolce:abstract-quality, dolce:abstract, dolce:region, dolce:abstract-region, dolce:quale, DM-Data, DataTable, DataType, DataFormat, TableFormat, Characteristic, Parameter, DataCharacteristic, anyType. Relations shown: dolce:has-quality, has-quality, dolce:has-quale, dolce:q-location, hasDataType, hasTableFormat, hasDataValue.]
Qualities and attributes 4/4
ModelingAlgorithm ⊑ =1 has-quality.LearningPolicy
LearningPolicy is a dolce:quality
LearningPolicy ⊑ =1 has-quale.Eager-Lazy
Eager-Lazy is a subclass of dolce:abstract-region
Eager-Lazy ⊑ ≤ 1 hasDataValue.anyType
In this way, the ontology can be linked to many different applications, which
may even use different data types, yet still agree on the meaning of the
characteristics and parameters (’attributes’) of the algorithms, tasks, and
other DM endurants.
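The quality/quale pattern above can be sketched as a small object model. The class and field names mirror the axioms on this slide, but the Python encoding and the two example value encodings (a string in one application, a boolean in another) are assumptions made for illustration:

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Region:
    """Stands in for a subclass of dolce:abstract-region, e.g. Eager-Lazy."""
    name: str
    data_value: Optional[Any] = None  # hasDataValue: at most one, any datatype


@dataclass
class Quality:
    """Stands in for a dolce:quality, e.g. LearningPolicy."""
    name: str
    quale: Region  # has-quale: exactly one


@dataclass
class ModelingAlgorithm:
    name: str
    learning_policy: Quality  # has-quality: exactly one


# Two hypothetical applications encode the value differently ...
knn = ModelingAlgorithm(
    "kNN", Quality("LearningPolicy", Region("Eager-Lazy", "lazy")))
c45 = ModelingAlgorithm(
    "C4.5", Quality("LearningPolicy", Region("Eager-Lazy", True)))  # True = eager


def policy_region(algorithm):
    """... yet both agree on which region the value belongs to."""
    return algorithm.learning_policy.quale.name
```

The datatype of `data_value` is an application decision, while the meaning of the characteristic is fixed by the `Quality` and `Region` it is attached to.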
Property chains
DMOP has 11 property chains;
the principal issues in declaring safe property chains (guaranteed not to
cause unsatisfiable classes or other undesirable deductions) are
declaring and choosing properties, and their domain and range axioms;
all investigated in detail in (Keet, EKAW 2012) and adjusted where
necessary;
Example: hasMainTable ○ hasFeature ⊑ hasFeature
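What the example chain entails can be sketched in a few lines: whenever x hasMainTable y and y hasFeature z hold, x hasFeature z is inferred. The individuals below (irisDataSet, irisTable, petalLength) are hypothetical, and the function is a naive fixpoint computation, not how an OWL reasoner is implemented:

```python
def apply_chain(triples, first, second, implied):
    """Add (x, implied, z) for every x -first-> y -second-> z, up to a fixpoint."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        new = {
            (x, implied, z)
            for (x, p1, y) in inferred if p1 == first
            for (y2, p2, z) in inferred if p2 == second and y2 == y
        }
        if not new <= inferred:
            inferred |= new
            changed = True
    return inferred


# Hypothetical facts about a dataset and its main table.
facts = {
    ("irisDataSet", "hasMainTable", "irisTable"),
    ("irisTable", "hasFeature", "petalLength"),
}

# hasMainTable o hasFeature is a sub-property of hasFeature
entailed = apply_chain(facts, "hasMainTable", "hasFeature", "hasFeature")
```

The loop repeats because the implied property is also the second property of the chain, so a newly inferred hasFeature assertion can itself feed another application of the chain.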
Other modeling considerations
several other OWL 2 features were used;
ObjectInverseOf;
“object property characteristics” used sparingly, and only the basic
‘functional’ characteristic asserted;
local reflexivity investigated on a subsumes property for instances in
DMOP v5.2, but eventually modeled differently with classes and
metamodeling/punning;
DOLCE’s parthood is transitive and should be transitive in DMOP; but it
was discovered after the release of v5.3 that the object property copy
function in Protégé does not copy any property characteristics;
What is RapidMiner? 1/2
What is RapidMiner? 2/2
RMonto - plugin to RapidMiner
Fr-ONT-Qu
algorithm for mining patterns in RDF(S) data
patterns expressed as SPARQL queries
2 thresholds: for keeping good enough patterns and for refining best
patterns
several quality measures to select for thresholds (e.g. support on KB)
for the classification task, outperformed state-of-the-art approaches to
classification of Semantic Web data on tasks with available results
and datasets (see: “Pattern based feature construction in semantic
data mining” by A. Lawrynowicz, J. Potoniec, IJSWIS 10(1), 2014):
▸ kernel methods by Bloehdorn et al. (2007) and Loesch et al. (ESWC 2012
best paper) on the SWRC AIFB dataset,
▸ statistical relational classifier SPARQL-ML by Kiefer et al. (ESWC 2008
best paper) on the SWRC AIFB dataset and the OWLS-TC v2.1 dataset,
▸ concept learning algorithms DL-FOIL by Fanizzi et al. (2008) and the
DL-Learner cutting-edge CELOE variant by Lehmann (2009) on all
measures on the datasets BioPax, NTN, Financial
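The role of the two thresholds can be illustrated with a generic level-wise refinement loop. This is a simplified stand-in, not the Fr-ONT-Qu algorithm itself: patterns are item sets rather than SPARQL queries, the quality measure is plain support, the second threshold is assumed to be a "refine the best k" cut-off, and the example data are invented:

```python
def support(pattern, examples):
    """Fraction of examples that contain every item of the pattern."""
    return sum(pattern <= ex for ex in examples) / len(examples)


def mine_patterns(examples, keep_threshold, refine_best, max_size=3):
    """Keep patterns with support >= keep_threshold; refine only the best ones."""
    items = sorted(set().union(*examples))
    kept = {}
    frontier = [frozenset()]
    for _ in range(max_size):
        # Refinement step: extend each frontier pattern by one more item.
        candidates = {p | {i} for p in frontier for i in items if i not in p}
        scored = {c: support(c, examples) for c in candidates}
        # First threshold: which patterns are good enough to keep.
        good = {c: s for c, s in scored.items() if s >= keep_threshold}
        kept.update(good)
        # Second threshold: only the best patterns are refined further.
        ranked = sorted(good, key=lambda c: (-good[c], sorted(c)))
        frontier = ranked[:refine_best]
        if not frontier:
            break
    return kept


# Hypothetical workflows described by the steps they contain.
examples = [
    {"discretize", "prune", "select"},
    {"discretize", "prune"},
    {"discretize", "select"},
    {"prune"},
]
patterns = mine_patterns(examples, keep_threshold=0.5, refine_best=1)
```

Restricting refinement to the best patterns keeps the search space small at the cost of completeness: patterns whose parents were not refined are never generated.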
Fr-ONT-Qu - pattern based classification
Fr-ONT-Qu - trie data structure
Semantic meta-mining experimental setup
baseline DM experiment set: 1581 RapidMiner workflows solving a
predictive modeling task on 11 UCI datasets
dataset characteristics meta-data stored in DMEX-DB containing
over 85 million RDF triples
workflow patterns represented as SPARQL queries using DMOP
entities
Semantic meta-mining results
McNemar’s test for pairs of classifiers performed with the null
hypothesis that a classifier built using dataset characteristics and a
mined pattern set has the same error rate as the baseline that used
dataset characteristics and only the names of the machine learning
DM operators
Test confirmed that classifiers trained using workflow patterns
performed significantly better (accuracy 0.927) than the baseline
(accuracy 0.890)
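For reference, McNemar's test compares two classifiers on the same test cases using only the cases on which they disagree. The sketch below uses the common chi-squared form with continuity correction; whether the experiments used this exact variant is not stated in the slides, and the disagreement counts are invented solely to exercise the function:

```python
import math


def mcnemar(b, c):
    """McNemar's chi-squared test with continuity correction.

    b: cases the first classifier got right and the second got wrong
    c: cases the first classifier got wrong and the second got right
    Returns (statistic, p_value) for a chi-squared distribution with 1 d.o.f.
    """
    if b + c == 0:
        return 0.0, 1.0
    statistic = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of chi-squared with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical disagreement counts, not taken from the reported experiments.
statistic, p_value = mcnemar(b=40, c=15)
```

A small p-value leads to rejecting the null hypothesis that both classifiers have the same error rate.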
Acknowledgements
EU FP7 ICT-2007.4.4 (No 231519) “e-LICO: An e-Laboratory for
Interdisciplinary Collaborative Research in Data Mining and
Data-Intensive Science”
Foundation for Polish Science under the PARENT/BRIDGE
programme, co-financed by the European Union, Regional Development
Fund (No POMOST/2013-7/8)
Contributors to the development of DMOP and/or other e-LICO
infrastructure used in the research described in this presentation:
Claudia d’Amato, Huyen Do, Simon Fischer, Dragan Gamberger,
Melanie Hilario, Lina Al-Jadir, Simon Jupp, Alexandros Kalousis, C.
Maria Keet, Joerg-Uwe Kietz, Petra Kralj Novak, Babak Mougouie,
Phong Nguyen, Raul Palma, Jedrzej Potoniec, Floarea Serban, Robert
Stevens, Anze Vavpetic, Jun Wang, Derry Wijaya, Adam Woznica
RMonto and Meta-mining experiments done jointly with Jedrzej
Potoniec
Thanks to Veli Bicer for sharing the AIFB dataset
Bibliography
Keet, C.M., Lawrynowicz, A., d’Amato, C., Hilario, M.: Modeling issues & choices in the Data Mining OPtimization Ontology.
In Rodriguez-Muro, M., et al., eds.: OWLED. Volume 1080 of CEUR Workshop Proceedings, CEUR-WS.org (2013)
Hilario, M., Nguyen, P., Do, H., Woznica, A., Kalousis, A.: Ontology-Based Meta-Mining of Knowledge Discovery
Workflows. In N. Jankowski, W. Duch, K. Grabczewski (Eds.), Meta-Learning in Computational Intelligence (pp.
273-316). Springer (2011)
Potoniec, J., Lawrynowicz, A.: RMonto: Ontological extension to RapidMiner. Poster and Demo Session of the
ISWC 2011 - 10th International Semantic Web Conference (2011)
Lawrynowicz, A., Potoniec, J.: Pattern Based Feature Construction in Semantic Data Mining. IJSWIS 10(1) (2014)
Keet, C.M.: Detecting and Revising Flaws in OWL Object Property Expressions. EKAW 2012: 252-266
Serban, F., Vanschoren, J., Kietz, J.-U., Bernstein, A.: A survey of intelligent assistants for data analysis. ACM
Computing Surveys (2012)
Agnieszka Lawrynowicz collaboration with C. Maria Keet, Melanie Hilario, Claudia d’Amato, Jedrzej Potoniec and others - see acknowleData Mining OPtimization Ontology and its application to meta-mining of knowledge disco
25th September 2014 OEG group sem
/ 31

More Related Content

Similar to Data Mining OPtimization Ontology and its application to meta-mining of knowledge discovery processes

NERD: an open source platform for extracting and disambiguating named entitie...
NERD: an open source platform for extracting and disambiguating named entitie...NERD: an open source platform for extracting and disambiguating named entitie...
NERD: an open source platform for extracting and disambiguating named entitie...Raphael Troncy
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender SystemsMarcel Kurovski
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systemsinovex GmbH
 
Extracting Media Items from Multiple Social Networks
Extracting Media Items from Multiple Social NetworksExtracting Media Items from Multiple Social Networks
Extracting Media Items from Multiple Social NetworksRaphael Troncy
 
Lifelong Topic Modelling presentation
Lifelong Topic Modelling presentation Lifelong Topic Modelling presentation
Lifelong Topic Modelling presentation Daniele Di Mitri
 
Demystifying Ml, DL and AI
Demystifying Ml, DL and AIDemystifying Ml, DL and AI
Demystifying Ml, DL and AIGreg Werner
 
Big-Data Analytics for Media Management
Big-Data Analytics for Media ManagementBig-Data Analytics for Media Management
Big-Data Analytics for Media Managementtechkrish
 
Competitive advantage from Data Mining: some lessons learnt ...
Competitive advantage from Data Mining: some lessons learnt ...Competitive advantage from Data Mining: some lessons learnt ...
Competitive advantage from Data Mining: some lessons learnt ...butest
 
Competitive advantage from Data Mining: some lessons learnt ...
Competitive advantage from Data Mining: some lessons learnt ...Competitive advantage from Data Mining: some lessons learnt ...
Competitive advantage from Data Mining: some lessons learnt ...butest
 
Integration data models, Learning Layers project meeting in Bremen
Integration data models, Learning Layers project meeting in BremenIntegration data models, Learning Layers project meeting in Bremen
Integration data models, Learning Layers project meeting in BremenVladimir Tomberg
 
discopen
discopendiscopen
discopenJisc
 
Response needed 1The paper is well placed on the issues of the.docx
Response needed 1The paper is well placed on the issues of the.docxResponse needed 1The paper is well placed on the issues of the.docx
Response needed 1The paper is well placed on the issues of the.docxaudeleypearl
 
Personal Knowledge Graphs: Use Cases in e-learning Platforms
Personal Knowledge Graphs: Use Cases in e-learning PlatformsPersonal Knowledge Graphs: Use Cases in e-learning Platforms
Personal Knowledge Graphs: Use Cases in e-learning PlatformsEleniIlkou
 
Dominik Kowald PhD Defense Recommender Systems
Dominik Kowald PhD Defense Recommender SystemsDominik Kowald PhD Defense Recommender Systems
Dominik Kowald PhD Defense Recommender SystemsDominik Kowald
 
The MPEG-21 Multimedia Framework for Integrated Management of Environments en...
The MPEG-21 Multimedia Framework for Integrated Management of Environments en...The MPEG-21 Multimedia Framework for Integrated Management of Environments en...
The MPEG-21 Multimedia Framework for Integrated Management of Environments en...Alpen-Adria-Universität
 
Toward a System Building Agenda for Data Integration(and Dat.docx
Toward a System Building Agenda for Data Integration(and Dat.docxToward a System Building Agenda for Data Integration(and Dat.docx
Toward a System Building Agenda for Data Integration(and Dat.docxjuliennehar
 
Resource recommendation vs privacy enhancement
Resource recommendation vs privacy enhancementResource recommendation vs privacy enhancement
Resource recommendation vs privacy enhancementSilvia Puglisi
 

Similar to Data Mining OPtimization Ontology and its application to meta-mining of knowledge discovery processes (20)

NERD: an open source platform for extracting and disambiguating named entitie...
NERD: an open source platform for extracting and disambiguating named entitie...NERD: an open source platform for extracting and disambiguating named entitie...
NERD: an open source platform for extracting and disambiguating named entitie...
 
Applying Commercial Computer Vision Tools to Cope with Uncertainties in a Cit...
Applying Commercial Computer Vision Tools to Cope with Uncertainties in a Cit...Applying Commercial Computer Vision Tools to Cope with Uncertainties in a Cit...
Applying Commercial Computer Vision Tools to Cope with Uncertainties in a Cit...
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
 
Extracting Media Items from Multiple Social Networks
Extracting Media Items from Multiple Social NetworksExtracting Media Items from Multiple Social Networks
Extracting Media Items from Multiple Social Networks
 
Lifelong Topic Modelling presentation
Lifelong Topic Modelling presentation Lifelong Topic Modelling presentation
Lifelong Topic Modelling presentation
 
Demystifying Ml, DL and AI
Demystifying Ml, DL and AIDemystifying Ml, DL and AI
Demystifying Ml, DL and AI
 
Big-Data Analytics for Media Management
Big-Data Analytics for Media ManagementBig-Data Analytics for Media Management
Big-Data Analytics for Media Management
 
Competitive advantage from Data Mining: some lessons learnt ...
Competitive advantage from Data Mining: some lessons learnt ...Competitive advantage from Data Mining: some lessons learnt ...
Competitive advantage from Data Mining: some lessons learnt ...
 
Competitive advantage from Data Mining: some lessons learnt ...
Competitive advantage from Data Mining: some lessons learnt ...Competitive advantage from Data Mining: some lessons learnt ...
Competitive advantage from Data Mining: some lessons learnt ...
 
Network analysis
Network analysisNetwork analysis
Network analysis
 
Integration data models, Learning Layers project meeting in Bremen
Integration data models, Learning Layers project meeting in BremenIntegration data models, Learning Layers project meeting in Bremen
Integration data models, Learning Layers project meeting in Bremen
 
discopen
discopendiscopen
discopen
 
Winter Projects GDSC IITK
Winter Projects GDSC IITKWinter Projects GDSC IITK
Winter Projects GDSC IITK
 
Response needed 1The paper is well placed on the issues of the.docx
Response needed 1The paper is well placed on the issues of the.docxResponse needed 1The paper is well placed on the issues of the.docx
Response needed 1The paper is well placed on the issues of the.docx
 
Personal Knowledge Graphs: Use Cases in e-learning Platforms
Personal Knowledge Graphs: Use Cases in e-learning PlatformsPersonal Knowledge Graphs: Use Cases in e-learning Platforms
Personal Knowledge Graphs: Use Cases in e-learning Platforms
 
Dominik Kowald PhD Defense Recommender Systems
Dominik Kowald PhD Defense Recommender SystemsDominik Kowald PhD Defense Recommender Systems
Dominik Kowald PhD Defense Recommender Systems
 
The MPEG-21 Multimedia Framework for Integrated Management of Environments en...
The MPEG-21 Multimedia Framework for Integrated Management of Environments en...The MPEG-21 Multimedia Framework for Integrated Management of Environments en...
The MPEG-21 Multimedia Framework for Integrated Management of Environments en...
 
Toward a System Building Agenda for Data Integration(and Dat.docx
Toward a System Building Agenda for Data Integration(and Dat.docxToward a System Building Agenda for Data Integration(and Dat.docx
Toward a System Building Agenda for Data Integration(and Dat.docx
 
Resource recommendation vs privacy enhancement
Resource recommendation vs privacy enhancementResource recommendation vs privacy enhancement
Resource recommendation vs privacy enhancement
 

More from Agnieszka Ławrynowicz

CHIST-ERA 2019 - presentation of CAMIL (Poznan University of Technology)
CHIST-ERA 2019 - presentation of CAMIL (Poznan University of Technology)CHIST-ERA 2019 - presentation of CAMIL (Poznan University of Technology)
CHIST-ERA 2019 - presentation of CAMIL (Poznan University of Technology)Agnieszka Ławrynowicz
 
Ontologie w historyczno-geograficznych systemach informacyjnych
Ontologie w historyczno-geograficznych systemach informacyjnychOntologie w historyczno-geograficznych systemach informacyjnych
Ontologie w historyczno-geograficznych systemach informacyjnychAgnieszka Ławrynowicz
 
Hazardous Situation Ontology Design Pattern
Hazardous Situation Ontology Design Pattern Hazardous Situation Ontology Design Pattern
Hazardous Situation Ontology Design Pattern Agnieszka Ławrynowicz
 
Using Substitutive Itemset Mining Framework for Finding Synonymous Properties...
Using Substitutive Itemset Mining Framework for Finding Synonymous Properties...Using Substitutive Itemset Mining Framework for Finding Synonymous Properties...
Using Substitutive Itemset Mining Framework for Finding Synonymous Properties...Agnieszka Ławrynowicz
 

More from Agnieszka Ławrynowicz (6)

CHIST-ERA 2019 - presentation of CAMIL (Poznan University of Technology)
CHIST-ERA 2019 - presentation of CAMIL (Poznan University of Technology)CHIST-ERA 2019 - presentation of CAMIL (Poznan University of Technology)
CHIST-ERA 2019 - presentation of CAMIL (Poznan University of Technology)
 
Ontologie w historyczno-geograficznych systemach informacyjnych
Ontologie w historyczno-geograficznych systemach informacyjnychOntologie w historyczno-geograficznych systemach informacyjnych
Ontologie w historyczno-geograficznych systemach informacyjnych
 
ML Schema: Machine Learning Schema
ML Schema: Machine Learning SchemaML Schema: Machine Learning Schema
ML Schema: Machine Learning Schema
 
Hazardous Situation Ontology Design Pattern
Hazardous Situation Ontology Design Pattern Hazardous Situation Ontology Design Pattern
Hazardous Situation Ontology Design Pattern
 
Using Substitutive Itemset Mining Framework for Finding Synonymous Properties...
Using Substitutive Itemset Mining Framework for Finding Synonymous Properties...Using Substitutive Itemset Mining Framework for Finding Synonymous Properties...
Using Substitutive Itemset Mining Framework for Finding Synonymous Properties...
 
ZTG 2013 Agnieszka Ławrynowicz
ZTG 2013 Agnieszka ŁawrynowiczZTG 2013 Agnieszka Ławrynowicz
ZTG 2013 Agnieszka Ławrynowicz
 


Data Mining OPtimization Ontology and its application to meta-mining of knowledge discovery processes

  • 1. Data Mining OPtimization Ontology and its application to meta-mining of knowledge discovery processes
    Agnieszka Lawrynowicz, in collaboration with C. Maria Keet, Melanie Hilario, Claudia d’Amato, Jedrzej Potoniec and others (see acknowledgements)
    Poznan University of Technology, Poland
    25th September 2014, OEG group seminar at Universidad Politécnica de Madrid (UPM)
  • 2. Outline
    Overview of DMOP: purpose, scope, core classes
    Modeling issues
      ▸ meta-modeling in DMOP;
      ▸ alignment of DMOP with the DOLCE foundational ontology;
      ▸ qualities and attributes;
      ▸ property chains in DMOP;
      ▸ other modeling considerations;
    Meta-mining of KDD processes
      ▸ RapidMiner
      ▸ RMonto
      ▸ Fr-ONT-Qu
      ▸ experimental evaluation
  • 3. Data Mining OPtimization Ontology (DMOP)
    the primary goal of DMOP is to support all decision-making steps that determine the outcome of the data mining process;
    development started in the EU FP7 project e-LICO (2009-2012);
    DMOP v5.4: ∼750 classes, ∼200 properties, ∼3200 axioms;
    highly axiomatized;
    uses almost all OWL 2 features.
  • 4. Overview of meta-learning
    Meta-learning: learning to learn
      ▸ application of machine learning techniques to meta-data about past machine learning experiments;
      ▸ the goal: to modify some aspect of the learning process to improve the performance of the resulting model;
    Meta-mining: meta-learning applied to the full data mining process
  • 5. Overview of the e-LICO system
    This section presents the different components of the e-LICO architecture (see figure) and shows how they interact to achieve the user's knowledge discovery goal.
    The e-LICO infrastructure (depicted in the figure under the dashed line) is the means by which the data-mining platform is delivered to scientists. The innovative core of the e-LICO platform is the Intelligent Discovery Assistant (IDA, above the dashed line) with its planner and meta-learner. However, to deliver the data-mining platform to its scientist users, there are several other services and components.
    There are two user-facing components for the e-LICO platform; these allow scientists to access data-mining operators and/or other data processing services, to compose them into workflows and execute them, collecting the results for interpretation or further analysis.
    [Figure: Overview of the e-LICO system.]
  • 6. Competency questions
    “Given a data mining task/data set, which of the valid or applicable workflows/algorithms will yield optimal results (or at least better results than the others)?”
    “Given a set of candidate workflows/algorithms for a given task/data set, which data set/workflow/algorithm characteristics should be taken into account in order to select the most appropriate one?”
    and other, more fine-grained ones, e.g.:
    “Which induction algorithms should I use (or avoid) when my dataset has many more variables than instances?”
  • 7. Architecture of DMOP knowledge base and its satellite triple stores
    [Figure 5: Architecture of DMOP knowledge base and its satellite triple stores]
  • 8. The core concepts of DMOP
    [Fig. 1. The core concepts of DMOP.]
  • 9. Meta-modeling in DMOP 1/4
    only processes (executions of workflows) and operations (executions of operators) consume inputs and produce outputs;
    DM algorithms (as well as operators and workflows) can only specify the type of input or output;
    inputs and outputs (the DM-Dataset and DM-Hypothesis class hierarchies, respectively) are modeled as subclasses of the IO-Object class.
  • 10. Meta-modeling in DMOP 2/4
    DM algorithms: classes or individuals? Individuals.
    Problem: expressing the types of inputs/outputs associated with an algorithm
    “C4.5 specifiesInputClass CategoricalLabeledDataSet”
      ▸ C4.5: individual (instance of DM-Algorithm)
      ▸ CategoricalLabeledDataSet: class (subclass of DM-Hypothesis)
  • 11. Meta-modeling in DMOP 3/4
    Initial solution: one artificial class per algorithm, with a single instance corresponding to that particular algorithm
    Problem: hasInput, hasOutput, specifiesInputClass and specifiesOutputClass are all assigned a common range, IO-Object
    “C4.5 specifiesInputClass Iris” ?
      ▸ C4.5: individual (instance of DM-Algorithm)
      ▸ Iris: individual (instance of DM-Hypothesis)
    Iris is a concrete dataset. Clearly, no DM algorithm is designed to handle only one particular dataset.
  • 12. Meta-modeling in DMOP 4/4
    Final solution: the weak form of punning available in OWL 2
    IO-Class: a meta-class, the class of all classes of input and output objects
    “C4.5 specifiesInputClass CategoricalLabeledDataSet”
      ▸ C4.5: individual (instance of DM-Algorithm)
      ▸ CategoricalLabeledDataSet: individual (instance of IO-Class)
    “DM-Process hasInput some CategoricalLabeledDataSet”
      ▸ DM-Process: class (subclass of dolce:process)
      ▸ CategoricalLabeledDataSet: class (subclass of IO-Object)
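The punning idea on this slide can be illustrated with a minimal sketch, assuming a plain set of triples rather than a real OWL toolkit. The predicate strings `rdf:type` and `rdfs:subClassOf` and the helper functions are illustrative choices, not part of DMOP. The point is only that one name, CategoricalLabeledDataSet, is used both as a class and as an individual.

```python
# Illustrative sketch only: triples as plain tuples, no OWL reasoner involved.
TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

triples = {
    ("C4.5", TYPE, "DM-Algorithm"),
    # Class view: a class of input objects.
    ("CategoricalLabeledDataSet", SUBCLASS_OF, "IO-Object"),
    # Individual view: the same name as an instance of the meta-class.
    ("CategoricalLabeledDataSet", TYPE, "IO-Class"),
    ("C4.5", "specifiesInputClass", "CategoricalLabeledDataSet"),
}


def class_names(triples):
    """Names that occur in a class position."""
    names = set()
    for s, p, o in triples:
        if p == SUBCLASS_OF:
            names.update((s, o))
        elif p == TYPE:
            names.add(o)
    return names


def individual_names(triples):
    """Names that occur in an individual position."""
    names = set()
    for s, p, o in triples:
        if p == TYPE:
            names.add(s)
        elif p != SUBCLASS_OF:
            # Any other property is treated as relating two individuals.
            names.update((s, o))
    return names


def punned_names(triples):
    """Names used both as a class and as an individual."""
    return class_names(triples) & individual_names(triples)


print(punned_names(triples))
```

In OWL 2 DL the two views of a punned name are interpreted independently, which is why the slide calls this a weak form of punning.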
  • 13. Alignment of DMOP with DOLCE 1/2
    Two main reasons to align DMOP with a foundational ontology:
      ▸ considerations about attributes and data properties; extant non-foundational ontology solutions were partial re-inventions of how they are treated in a foundational ontology;
      ▸ reuse of the ontology’s object properties.
  • 14. Alignment of DMOP with DOLCE 2/2
    Perdurant: DM-Experiment and DM-Operation are subclasses of dolce:process;
    Endurant: most DM classes, such as algorithm, software, strategy, task, and optimization problem, are subclasses of dolce:non-physical-endurant;
    Quality: characteristics and parameters of DM entities are made subclasses of dolce:abstract-quality;
    Abstract: for identifying discrete values, classes are added as subclasses of dolce:abstract-region;
    object properties: DMOP reuses mainly DOLCE’s parthood, quality, and quale relations;
    each of the four main DOLCE branches has been used.
  • 15. Qualities and attributes 1/4
    How to handle ’attributes’ in OWL ontologies, and, in a broader context, measurements?
    easy way: an attribute is a binary functional relation between a class and a datatype
      Elephant ⊑ =1 hasWeight.integer
      Elephant ⊑ =1 hasWeightPrecise.real
      Elephant ⊑ =1 hasWeightImperial.integer (in lbs)
    this builds into one’s ontology application decisions about how to store the data (and in which unit it is)
  • 16. Qualities and attributes 2/4
    How to handle ’attributes’ in OWL ontologies, and, in a broader context, measurements?
    more elaborate way: unfold the notion of an object’s property (e.g. weight) from one attribute/OWL data property into at least two properties: one OWL object property from the object to the ’reified attribute’ (a “quality property” represented as an OWL class) and another property to the value(s)
      ▸ favoured in foundational ontologies;
      ▸ solves the problem of non-reusability of the ’attribute’ and prevents duplication of data properties;
      ▸ neither ontology has any solution to represent actual values and units of measurement;
    measurements in DMOP are more like values for parameters.
  • 17. Qualities and attributes 3/4
    [Diagram: how DMOP data, characteristic and parameter classes attach to DOLCE qualities and regions. Classes shown: DM-Data, DataTable, DataType, DataFormat, TableFormat, Characteristic, DataCharacteristic, Parameter, anyType, dolce:particular, dolce:non-physical-endurant, dolce:abstract, dolce:quality, dolce:abstract-quality, dolce:region, dolce:abstract-region, dolce:quale. Properties shown: dolce:has-quality, dolce:has-quale, dolce:q-location, has-quality, hasDataType, hasTableFormat, hasDataValue.]
  • 18. Qualities and attributes 4/4
      ModelingAlgorithm ⊑ =1 has-quality.LearningPolicy
    LearningPolicy is a dolce:quality
      LearningPolicy ⊑ =1 has-quale.Eager-Lazy
    Eager-Lazy is a subclass of dolce:abstract-region
      Eager-Lazy ⊑ ≤1 hasDataValue.anyType
    In this way, the ontology can be linked to many different applications, which may even use different data types, yet still agree on the meaning of the characteristics and parameters (’attributes’) of the algorithms, tasks, and other DM endurants.
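The three axioms above can be mirrored in a small data-structure sketch, assuming Python dataclasses as a stand-in for the OWL classes. The class and field names, the `same_meaning` helper, and the two example applications with their stored values are all hypothetical. The sketch only shows how two applications can store different data types while agreeing on the quality and the region.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Region:
    """Stand-in for a subclass of dolce:abstract-region, e.g. Eager-Lazy."""
    kind: str
    data_value: Optional[Any] = None  # at most one hasDataValue, any type


@dataclass(frozen=True)
class Quality:
    """Stand-in for a dolce:quality, e.g. LearningPolicy."""
    kind: str
    quale: Region  # exactly one has-quale


@dataclass(frozen=True)
class ModelingAlgorithm:
    name: str
    learning_policy: Quality  # exactly one has-quality


def same_meaning(a: Quality, b: Quality) -> bool:
    """Two qualities agree in meaning if quality kind and region kind match,
    regardless of how each application encodes the stored value."""
    return a.kind == b.kind and a.quale.kind == b.quale.kind


# Hypothetical application A stores the value as a string.
app_a = ModelingAlgorithm(
    "k-NN", Quality("LearningPolicy", Region("Eager-Lazy", "lazy")))
# Hypothetical application B stores a boolean "is eager" flag instead.
app_b = ModelingAlgorithm(
    "k-NN", Quality("LearningPolicy", Region("Eager-Lazy", False)))

print(same_meaning(app_a.learning_policy, app_b.learning_policy))
```

The stored values differ in type, but both applications refer to the same LearningPolicy quality located in the same Eager-Lazy region.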
  • 19. Property chains
    DMOP has 11 property chains;
    the principal issues in declaring safe property chains (guaranteed not to cause unsatisfiable classes or other undesirable deductions) are declaring and choosing properties, and their domain and range axioms;
    all were investigated in detail in (Keet, EKAW 2012) and adjusted where necessary;
    Example: hasMainTable ○ hasFeature ⊑ hasFeature
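The example chain reads: if x hasMainTable y and y hasFeature z, then x hasFeature z. A minimal sketch of that inference over plain triples follows, assuming hypothetical individuals (`iris_dataset`, `iris_table`, `petal_length`) and a hand-written fixpoint loop in place of an OWL reasoner.

```python
def apply_chain(triples, first, second, implied):
    """Apply the chain `first o second => implied` until no new triple appears."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        new = {
            (x, implied, z)
            for (x, p, y) in inferred if p == first
            for (y2, q, z) in inferred if q == second and y2 == y
        }
        if not new <= inferred:
            inferred |= new
            changed = True  # `implied` may feed the chain again, so repeat
    return inferred


# Hypothetical individuals, for illustration only.
facts = {
    ("iris_dataset", "hasMainTable", "iris_table"),
    ("iris_table", "hasFeature", "petal_length"),
}

closure = apply_chain(facts, "hasMainTable", "hasFeature", "hasFeature")
print(("iris_dataset", "hasFeature", "petal_length") in closure)
```

The loop runs to a fixpoint because the implied property here is the same as the second property of the chain, so a newly inferred triple can in principle enable further inferences.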
  • 20. Other modeling considerations
    several other OWL 2 features were used:
      ▸ ObjectInverseOf;
      ▸ “object property characteristics” used sparingly, and only the basic ‘functional’ characteristic asserted;
      ▸ local reflexivity investigated on a subsumes property for instances in DMOP v5.2, but eventually modeled differently with classes and metamodeling/punning;
      ▸ DOLCE’s parthood is transitive and should be transitive in DMOP, but it was discovered after the release of v5.3 that the object property copy function in Protégé does not copy any property characteristics.
  • 21. What is RapidMiner? 1/2
  • 22. What is RapidMiner? 2/2
  • 23. What is RapidMiner? 2/2
  • 24. RMonto - plugin to RapidMiner
  • 25. Fr-ONT-Qu
    algorithm for mining patterns in RDF(S) data;
    patterns expressed as SPARQL queries;
    2 thresholds: one for keeping good-enough patterns and one for refining the best patterns;
    several quality measures to choose from for the thresholds (e.g. support on the KB);
    on classification tasks with available results and datasets, it outperformed state-of-the-art approaches to classification of Semantic Web data (see: “Pattern based feature construction in semantic data mining” by A. Lawrynowicz, J. Potoniec, IJSWIS 10(1), 2014):
      ▸ kernel methods by Bloehdorn et al. (2007) and Loesch et al. (ESWC 2012 best paper) on the SWRC AIFB dataset;
      ▸ the statistical relational classifier SPARQL-ML by Kiefer et al. (ESWC 2008 best paper) on the SWRC AIFB dataset and the OWLS-TC v2.1 dataset;
      ▸ the concept learning algorithms DL-FOIL by Fanizzi et al. (2008) and the cutting-edge CELOE variant of DL-Learner by Lehmann (2009), on all measures, on the BioPax, NTN and Financial datasets.
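The two-threshold idea can be sketched as a generic level-wise pattern search. This is not the Fr-ONT-Qu implementation: patterns here are plain sets of atoms rather than SPARQL queries, the quality measure is simple support, and treating both thresholds as minimum quality values (`keep_min`, `refine_min`) is an assumption made for illustration.

```python
def support(pattern, examples):
    """Fraction of examples that contain every atom of the pattern."""
    return sum(pattern <= ex for ex in examples) / len(examples)


def mine(examples, atoms, keep_min, refine_min, max_size=3):
    """Level-wise search: keep patterns scoring >= keep_min,
    but only extend (refine) those scoring >= refine_min."""
    kept = {}
    frontier = [frozenset()]
    for _ in range(max_size):
        candidates = {p | {a} for p in frontier for a in atoms if a not in p}
        frontier = []
        for cand in sorted(candidates, key=sorted):
            quality = support(cand, examples)
            if quality >= keep_min:
                kept[cand] = quality      # good enough to report
            if quality >= refine_min:
                frontier.append(cand)     # good enough to refine further
    return kept


# Toy data, invented for the example.
examples = [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"b"}]
patterns = mine(examples, atoms=["a", "b", "c"], keep_min=0.5, refine_min=0.7)
for pattern, quality in patterns.items():
    print(sorted(pattern), quality)
```

With these toy values, the single-atom patterns for `a` and `b` pass both thresholds, while the two-atom pattern `{a, b}` is kept but not refined further.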
  • 26. Fr-ONT-Qu - pattern based classification
  • 27. Fr-ONT-Qu - trie data structure
  • 28. Semantic meta-mining experimental setup
    baseline DM experiment set: 1581 RapidMiner workflows solving a predictive modeling task on 11 UCI datasets;
    dataset characteristics;
    meta-data stored in DMEX-DB, containing over 85 million RDF triples;
    workflow patterns represented as SPARQL queries using DMOP entities.
  • 29. Semantic meta-mining results
    McNemar’s test for pairs of classifiers was performed with the null hypothesis that a classifier built using dataset characteristics and a mined pattern set has the same error rate as the baseline, which used dataset characteristics and only the names of the machine learning DM operators.
    The test confirmed that classifiers trained using workflow patterns performed significantly better (accuracy 0.927) than the baseline (accuracy 0.890).
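For readers unfamiliar with McNemar's test, here is a minimal sketch of its exact (binomial) two-sided form, which uses only the cases on which the two classifiers disagree. The slides do not say which variant of the test was used, and the disagreement counts below are invented for illustration; they are not the experiment's data.

```python
from math import comb


def mcnemar_exact(b, c):
    """Exact two-sided McNemar test.

    b: cases the first classifier got right and the second got wrong
    c: cases the first classifier got wrong and the second got right
    Under the null hypothesis of equal error rates, b ~ Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no disagreements, no evidence against the null
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Invented counts: baseline right / pattern-based wrong = 2, the reverse = 10.
p_value = mcnemar_exact(2, 10)
print(round(p_value, 4))
```

A small p-value leads to rejecting the null hypothesis that both classifiers have the same error rate.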
  • 30. Acknowledgements
    EU FP7 ICT-2007.4.4 (No 231519) “e-LICO: An e-Laboratory for Interdisciplinary Collaborative Research in Data Mining and Data-Intensive Science”
    Foundation for Polish Science under the PARENT/BRIDGE programme, co-financed by the European Union, Regional Development Fund (No POMOST/2013-7/8)
    Contributors to the development of DMOP and/or other e-LICO infrastructure used in the research described in this presentation: Claudia d’Amato, Huyen Do, Simon Fischer, Dragan Gamberger, Melanie Hilario, Lina Al-Jadir, Simon Jupp, Alexandros Kalousis, C. Maria Keet, Joerg-Uwe Kietz, Petra Kralj Novak, Babak Mougouie, Phong Nguyen, Raul Palma, Jedrzej Potoniec, Floarea Serban, Robert Stevens, Anze Vavpetic, Jun Wang, Derry Wijaya, Adam Woznica
    RMonto and meta-mining experiments done jointly with Jedrzej Potoniec
    Thanks to Veli Bicer for sharing the AIFB dataset
  • 31. Bibliography
    Keet, C.M., Lawrynowicz, A., d’Amato, C., Hilario, M.: Modeling issues, choices in the data mining optimization ontology. In Rodriguez-Muro, M., et al., eds.: OWLED. Volume 1080 of CEUR Workshop Proceedings, CEUR-WS.org (2013)
    Hilario, M., Nguyen, P., Do, H., Woznica, A., Kalousis, A.: Ontology-Based Meta-Mining of Knowledge Discovery Workflows. In N. Jankowski, W. Duch, K. Grabczewski (Eds.), Meta-Learning in Computational Intelligence (pp. 273-316). Springer (2011)
    Potoniec, J., Lawrynowicz, A.: RMonto: Ontological extension to RapidMiner. Poster and Demo Session of the ISWC 2011 - 10th International Semantic Web Conference (2011)
    Lawrynowicz, A., Potoniec, J.: Pattern Based Feature Construction in Semantic Data Mining. IJSWIS 10(1) (2014)
    Keet, C.M.: Detecting and Revising Flaws in OWL Object Property Expressions. EKAW 2012: 252-266
    Serban, F., Vanschoren, J., Kietz, J.-U., Bernstein, A.: A survey of intelligent assistants for data analysis. ACM Computing Surveys (2012)