Euler: A Logic-Based Toolkit for Aligning and Reconciling Multiple Taxonomic Perspectives
Mingmin Chen¹, Shizhuo Yu¹, Parisa Kianmajd¹, Nico Franz², Shawn Bowers³, Bertram Ludäscher⁴
¹ Dept. of Computer Science, University of California, Davis
² School of Life Sciences, Arizona State University
³ Dept. of Computer Science, Gonzaga University
⁴ GSLIS & NCSA, University of Illinois at Urbana-Champaign
Outline
• Meet Nico, Curator of Insects
• TAP: The Taxonomy Alignment Problem
• Euler/X
– Logic Inside! (X in FOL, RCC, ASP)
• Related Projects
B. Ludäscher, Euler: Reasoning about Taxonomies, CIRSS Seminar, 9/19/2014
Meet Prof. Nico Franz: Curator of Insects @ ASU
What Nico et al. do for a living …
Use Case: Perelleschus sec. 2001 & 2006
Perelleschus salpinflexus sec. Franz & Cardona-Duque (2013), DOI:10.1080/14772000.2013.806371 ¹
¹ Input articulations: Franz & Cardona-Duque. 2013. Description of two new species and phylogenetic reassessment of Perelleschus Wibmer & O'Brien, 1986 (Coleoptera: Curculionidae), with a complete taxonomic concept history of Perelleschus sec. Franz & Cardona-Duque, 2013. Systematics and Biodiversity 11: 209–236.
Merge analyses: Franz et al. 2014. Reasoning over taxonomic change: exploring alignments for the Perelleschus use case. PLoS ONE. (in press)
Goal: Align two phylogenies with differential taxon sampling
T1: Perelleschus sec. 2001
• Phylogenetic revision
• 8 ingroup species concepts
• 2 outgroup concepts
• 18 concepts total
T2: Perelleschus sec. 2006
• Exemplar analysis
• 2 ingroup species concepts
• 1 outgroup concept
• 7 concepts total
Source: Nico Franz. Explaining taxonomy's legacy to computers – how and why? The Meaning of Names: Naming Diversity in the 21st Century, Museum of Natural History, U of Colorado, 9/30/2014.
What Nico does for a living (cont’d): The Indoors Part
• Go fun places, find new bugs, study them …
– “Bugs-R-Us” (see taxonbytes.org)
• Now: Compare, align and revise taxonomies, based on careful observation, “character” data, expertise …
• Formally:
– Input: T1 + T2 (taxonomies) + A (expert articulations)
– Output: revised, “merged” taxonomy (-ies) T3
Taxonomy Alignment Problem (TAP)
• Given:
– Taxonomies T1, T2
• incl. constraints (coverage, disjointness)
– Set of articulations (an alignment) A
• Find:
– Combined (“merged”) taxonomy T3 (= T1 + T2 + A)
• Is it a taxonomy? Or a DAG?
– Optional:
• Final alignment (should be minimal)
Real Example: Turn this …
[Figure: expert input alignment between T1 (concepts 1.11–1.27 and 1.12L) and T2 (concepts 2.35–2.54 and 2.36L); articulation labels include ==, !, "< OR ==" and "> OR !"]
Nodes: T1 18, T2 21
Edges: isa_1 17, isa_2 20, Art. 20
… into this! (Perelleschus Alignment Result)
• T3 := T1 and T2 are “merged”
– Blue dashed: overlaps → resolve via “zoom-in view”
[Figure: merged taxonomy graph over the T1 and T2 concepts]
Nodes: Taxonomy 1 5, Taxonomy 2 8, MERGED Taxa 13
Edges: Overlaps 10, Input 24, INFERRED 5
So how does it work?
• If you have 3 concepts A, B, and C.
• Assume you know something about
– A ⇔R1 B (e.g., R1: A is a subset of B)
– B ⇔R2 C (e.g., R2: B is disjoint from C)
• Now what can you say about this:
– A ⇔R3 C
• Yes??
• … it follows that R3: A is disjoint from C!
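
The inference on this slide is a composition of two relations. A minimal sketch, assuming taxa are modelled as non-empty subsets of a tiny universe: it enumerates every A, B, C with A R1 B and B R2 C and collects the relations that can hold between A and C. The names `rcc5` and `compose` and the 4-atom universe are illustrative choices, not part of Euler/X.

```python
from itertools import combinations

def rcc5(x: frozenset, y: frozenset) -> str:
    """Classify two non-empty sets into one of the five base relations."""
    if x == y:
        return "=="
    if x < y:          # proper subset
        return "<"
    if x > y:          # proper superset
        return ">"
    if x & y:          # some shared elements, neither contains the other
        return "><"
    return "!"         # no shared elements

def nonempty_subsets(universe):
    """Yield every non-empty subset of the universe as a frozenset."""
    items = sorted(universe)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def compose(r1: str, r2: str, universe=frozenset(range(4))) -> set:
    """Relations possible between A and C when A r1 B and B r2 C."""
    regions = list(nonempty_subsets(universe))
    possible = set()
    for a in regions:
        for b in regions:
            if rcc5(a, b) != r1:
                continue
            for c in regions:
                if rcc5(b, c) == r2:
                    possible.add(rcc5(a, c))
    return possible

print(compose("<", "!"))  # {'!'}
```

With `compose("!", "!")` the result contains all five base relations, which shows why some pairs of articulations leave the third relation entirely open.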
Articulation Language (RCC-5)
• How does the expert express the known (or assumed) relationship between taxa A and B?
• How can A and B be related?
• Use basic set constraints (B5):
– A = B (equals, EQ) (==)
– A < B (proper part of, PP) (<)
– A > B (inverse proper part of, IPP) (>)
– A o B (partially overlaps, PO) (><)
– A ! B (disjoint “region”, DR) (!)
Taxonomies and Articulations in Euler
A taxonomy T is a triple (N, ≼, ϕ) with names (taxa) N, a partial order (is-a) ≼, and taxonomic constraints ϕ.
• Sibling Disjointness: sibling taxa do not overlap
• (Parent) Coverage: The union of the children “covers” the parent → no “missing” children
An articulation is a relation (set-constraint) between taxa A and B. One, and only one, of the following base relations B5 must hold:
(i) congruence
(ii) proper part
(iii) inverse proper part
(iv) partial overlap
(v) disjointness
There are 32 (= 2^5) possible disjunctions for representing partial information.
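
The two taxonomic constraints can be checked directly once each taxon is given a set of members. A minimal sketch, assuming a taxonomy is passed as a parent-to-children mapping plus an `extent` mapping from taxon name to member set; both structures and the function name `check_taxonomy` are hypothetical and only serve the illustration.

```python
from itertools import combinations

def check_taxonomy(children: dict, extent: dict) -> list:
    """Return the violated constraints; an empty list means both hold."""
    violations = []
    for parent, kids in children.items():
        # Sibling disjointness: no two children share a member.
        for x, y in combinations(kids, 2):
            if extent[x] & extent[y]:
                violations.append(f"sibling disjointness: {x} overlaps {y}")
        # Coverage: the union of the children equals the parent.
        covered = set().union(*(extent[k] for k in kids))
        if covered != extent[parent]:
            violations.append(f"coverage: children of {parent} do not cover it")
    return violations

children = {"a": ["b", "c"]}
good = {"a": {1, 2, 3}, "b": {1}, "c": {2, 3}}
bad = {"a": {1, 2, 3}, "b": {1, 2}, "c": {2}}

print(check_taxonomy(children, good))  # []
print(check_taxonomy(children, bad))   # one disjointness and one coverage violation
```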
R32 lattice of 32 (= 2^5) disjunctions over B5
Base relations:
= EQ(x,y) Equals
< PP(x,y) Proper Part of
> iPP(x,y) Inverse Proper Part
o PO(x,y) Partially Overlaps
! DR(x,y) Disjoint from
Level 5 (tautology, TRUE): {= < > o !}
Level 4: {= < o !} {= < > o} {= > o !} {= < > !} {< > o !}
Level 3: {= < o} {= o !} {= < !} {< o !} {= > o} {= < >} {= > !} {< > o} {> o !} {< > !}
Level 2: {= o} {= <} {= !} {< o} {o !} {< !} {= >} {> o} {< >} {> !}
Level 1 (BASE-5 relations): {=} {o} {<} {!} {>}
Level 0 (contradiction, FALSE): ∅
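
The lattice is the powerset of the five base relations, ordered by set inclusion, so level k holds the disjunctions with exactly k disjuncts. A short sketch that rebuilds the levels; the name `r32_levels` is illustrative.

```python
from itertools import combinations

B5 = ("=", "<", ">", "o", "!")

def r32_levels() -> dict:
    """Map each level k to the disjunctions made of exactly k base relations."""
    return {k: [frozenset(c) for c in combinations(B5, k)]
            for k in range(len(B5) + 1)}

levels = r32_levels()
for k in sorted(levels, reverse=True):
    print(f"Level {k}: {len(levels[k])} disjunctions")
```

The level sizes are the binomial coefficients 1, 5, 10, 10, 5, 1, which sum to 32.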
Some History
• … Aristotle …
• … Euler …
• …
• … Greg Whitbread …
• [BPB93] J. H. Beach, S. Pramanik, and J. H. Beaman. Hierarchic taxonomic databases. Advances in Computer Methods for Systematic Biology: Artificial Intelligence, Databases, Computer Vision, 1993.
• [Ber95] Walter G. Berendsohn. The concept of “potential taxa” in databases. Taxon, 44:207–212, 1995.
• [Ber03] Walter G. Berendsohn. MoReTax – Handling Factual Information Linked to Taxonomic Concepts in Biology. No. 39 in Schriftenreihe für Vegetationskunde. Bundesamt für Naturschutz, 2003.
• [GG03] M. Geoffroy and A. Güntsch. Assembling and navigating the potential taxon graph. In [Ber03], pages 71–82, 2003.
• [TL07] Thau, D., & Ludäscher, B. (2007). Reasoning about taxonomies in first-order logic. Ecological Informatics, 2(3), 195-209.
• [FP09] Franz, N. M., & Peet, R. K. (2009). Perspectives: towards a language for mapping relationships among taxonomic concepts. Systematics and Biodiversity, 7(1), 5-20.
• …
What’s in a name? Euler Diagrams 
• Project named after Euler Diagrams: 
IF A is-a B 
AND C and B are disjoint 
------------------------------------ 
THEN: A and C are disjoint! 
Euler Diagrams as Trees (or Graphs)
[Figure: a containment hierarchy (taxonomy) and an equivalent graph (w/ transitive edges); same information]
Represent Phylogenies as Trees …
T1: Perelleschus sec. 2001
• Phylogenetic revision
• 8 ingroup species concepts
• 2 outgroup concepts
• 18 concepts total
[Figure: T1 drawn as a tree over concepts 1.11–1.27 and 1.12L]
… for all taxonomies of interest …
[Figure: T1 (concepts 1.11–1.27, 1.12L) and T2 (concepts 2.35–2.54, 2.36L) drawn as trees]
… ready, rotate by 90°, set …
[Figure: the T1 and T2 trees, rotated by 90°]
Go! An expert input alignment!
Just add some Euler Reasoning …
[Figure: the expert input alignment between T1 and T2, as on the “Real Example” slide]
Nodes: T1 18, T2 21
Edges: isa_1 17, isa_2 20, Art. 20
Euler/X toolkit in a single screenshot (desktop version, IX-2014)
… et voilà! The merged T3 (= T1 & T2 & A)
The Euler reasoner(s) infer:
- Grey: “perfect match” (congruences)
- Green, Yellow: “keepers” from T1, T2
- Red edges: deduced subset/“sub-class” relations
- Blue edges: deduced overlaps
[Figure: merged taxonomy graph T3]
But wait: PW1 …
[Figure: possible world PW1 of the merged taxonomy]
… PW2
[Figure: possible world PW2 of the merged taxonomy]
… PW3
[Figure: possible world PW3 of the merged taxonomy]
… PW4
[Figure: possible world PW4 of the merged taxonomy]
… PW5
[Figure: possible world PW5 of the merged taxonomy]
Hmmm… depending on input alignment: PW1
[Figure: possible world PW1]
… and PW2 are the only solutions! What happened?
[Figure: possible world PW2]
TAP: Possible Outcomes
[Figure: input alignment of T1 (1.a with isa-children 1.b, 1.c) and T2 (2.d with isa-children 2.e, 2.f), linked by = and < articulations]
• Inconsistent! → Diagnosis (Reiter)
– Black-Box Provenance: the lattice of articulation subsets
{A1, A2, A3, A4}
{A1, A2, A3} {A1, A2, A4} {A1, A3, A4} {A2, A3, A4}
{A1, A2} {A1, A3} {A2, A3} {A1, A4} {A2, A4} {A3, A4}
{A1} {A2} {A3} {A4}
{ }
• Ambiguous! → Multiple Possible Worlds
[Figures: three merged results over 1.a, 1.b, 1.c, 2.d, 2.e, 2.f]
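
The outcomes on this slide can be reproduced for the six-taxon toy example by brute force. A minimal sketch, under these assumptions: leaf taxa are non-empty subsets of a 4-atom universe, parents are the union of their children (coverage), siblings are disjoint, and an articulation is a triple `(taxon, allowed base relations, taxon)`. The articulations A1: a = d, A2: b < e, A3: c < f are taken from the Hybrid Provenance slide; A4 is not recoverable from the slide text and is left out. This is not how Euler/X computes possible worlds; it only illustrates the idea.

```python
from itertools import combinations, product

ATOMS = range(4)  # assumption: four atoms are enough for this toy example
REGIONS = [frozenset(c) for r in range(1, len(ATOMS) + 1)
           for c in combinations(ATOMS, r)]

def rcc5(x, y):
    """Classify two non-empty sets into one of the five base relations."""
    if x == y:
        return "=="
    if x < y:
        return "<"
    if x > y:
        return ">"
    return "><" if x & y else "!"

def models():
    """Yield extents for T1 (a: b, c) and T2 (d: e, f) that satisfy
    coverage and sibling disjointness."""
    for b, c, e, f in product(REGIONS, repeat=4):
        if b & c or e & f:
            continue
        yield {"a": b | c, "b": b, "c": c, "d": e | f, "e": e, "f": f}

def possible_worlds(articulations):
    """Distinct assignments of base relations to all T1 x T2 pairs."""
    worlds = set()
    for m in models():
        if all(rcc5(m[x], m[y]) in allowed for x, allowed, y in articulations):
            worlds.add(tuple(rcc5(m[x], m[y]) for x in "abc" for y in "def"))
    return worlds

A1 = ("a", {"=="}, "d")
A2 = ("b", {"<"}, "e")
A3 = ("c", {"<"}, "f")

print(len(possible_worlds([A1, A2, A3])))  # 0: inconsistent
print(len(possible_worlds([A1, A2])))      # 1: a unique merge
print(len(possible_worlds([A1])) > 1)      # True: ambiguous
```

Because the second element of each triple is a set, a disjunctive articulation such as "< OR ==" can be written as `{"<", "=="}`.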
Euler/X Toolkit and Workflow
• FO reasoning about taxonomies (MFOL)
• Earlier: CleanTax
– Prover9/Mace4
• Now: Euler
– ASP Reasoners (DLV, Clingo)
– Specialized reasoners (PyRCC)
– …
– X = ASP, RCC, …
Reducing Ambiguity
• Possible Worlds (PWs) View
• Aggregate View (AV)
• Cluster View (CV)
Explore!
Common Outcome: Inconsistency!
[Figure: the input alignment and the black-box provenance lattice over {A1, A2, A3, A4}; Inconsistent! → Diagnosis (Reiter)]
• Need to debug the input articulations → (black-box) diagnosis!
• Focus:
– How do we efficiently compute the diagnostic lattice?
• Also:
– How to visualize..
A Hybrid Diagnosis Approach Combining Black-Box and White-Box Reasoning
Mingmin Chen¹, Shizhuo Yu¹, Nico Franz², Shawn Bowers³, Bertram Ludäscher⁴
¹ Department of Computer Science, University of California, Davis
² School of Life Sciences, Arizona State University
³ Department of Computer Science, Gonzaga University
⁴ GSLIS & NCSA, University of Illinois at Urbana-Champaign
Example Instance (from synthetic benchmark suite)
• Here: N = 10 taxa in T1, T2
• Euler/X finds: inconsistent!
• → diagnostic lattice of 2^10 = 1024 nodes
→ Find minimal inconsistent subset (MIS)
→ maximal consistent subset (MCS) ..
→ show to user!
Visualizing Diagnoses
N = 10 articulations → 2^10 = 1024 node diagnostic lattice
Better Idea: Just show MIS, MCS
N = 4 articulations → 2^4 = 16 node diagnostic lattice, but 3 MCS and 2 MIS are enough!
Visualizing Diagnoses
1024 node lattice .. but 4 MCS and 1 MIC tell it all!
Visualizing Diagnoses
Example from RuleML’14 paper: N=12 → 4096 nodes .. but 7 MCS and 5 MIC tell it all!
Black-Box Inconsistency Analysis (Diagnostic Lattice)
• Then: What happens if you can’t have all (here: 4) articulations together?
– Repair: find & revise minimal inconsistent subsets (Min-Incons)
– Expand: find maximal consistent subsets (Max-Cons) & revise outs
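
Min-Incons and Max-Cons sets can be read off the diagnostic lattice once every node is labelled consistent or inconsistent. A brute-force sketch, assuming a black-box oracle `is_consistent(subset)`; it tests all 2^n subsets, so it is not the Hitting Set Tree algorithm mentioned elsewhere in the deck. The oracle `toy_oracle` is a stub: it flags any subset containing A1, A2 and A3, matching the conflict shown on the Hybrid Provenance slide, and treats a hypothetical fourth articulation A4 as independent.

```python
from itertools import combinations

def diagnose(articulations, is_consistent):
    """Label every subset of articulations, then return (MIS, MCS)."""
    lattice = [frozenset(c) for r in range(len(articulations) + 1)
               for c in combinations(articulations, r)]
    status = {s: is_consistent(s) for s in lattice}
    everything = set(articulations)
    # Minimal inconsistent: inconsistent, but dropping any one member repairs it.
    mis = [s for s in lattice
           if not status[s] and all(status[s - {x}] for x in s)]
    # Maximal consistent: consistent, but adding any one member breaks it.
    mcs = [s for s in lattice
           if status[s] and all(not status[s | {x}] for x in everything - s)]
    return mis, mcs

def toy_oracle(subset):
    """Stub oracle (assumption): A1, A2 and A3 together are inconsistent."""
    return not {"A1", "A2", "A3"} <= subset

mis, mcs = diagnose(["A1", "A2", "A3", "A4"], toy_oracle)
print([sorted(s) for s in mis])  # [['A1', 'A2', 'A3']]
print([sorted(s) for s in mcs])
```

The one-step checks are sufficient because consistency is monotone: every subset of a consistent set is consistent, and every superset of an inconsistent set is inconsistent.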
Inconsistency Analysis (Diagnostic Lattice)
• Black-box Analysis (Hitting Set algo.) yields a Diagnosis (lattice)
– for n=4 articulations, there are 168 possible diagnoses
– depending on expected “red/green areas” → explore space differently
• The Min-Incons (MIS) and Max-Cons (MCS) sets determine all others
→ Repair MIS and/or Expand MCS
• |articulations| = n → |possible diagnoses| = |monotonic Boolean functions| = Dedekind Number (n): 2, 3, 6, 20, 168, 7581, 7828354, ...
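
A diagnosis labels each subset of articulations as consistent or inconsistent, and the labelling must be monotone: adding articulations can never turn an inconsistent set consistent. Counting such labellings is counting monotone Boolean functions. The sketch below checks the first values of the sequence by brute force; it enumerates all truth tables, so it is only practical for n up to 4.

```python
def dedekind(n: int) -> int:
    """Count monotone Boolean functions of n variables by brute force."""
    points = range(1 << n)
    # Pairs (x, y) where y is x with exactly one more variable switched on.
    covers = [(x, x | (1 << i)) for x in points
              for i in range(n) if not x & (1 << i)]
    count = 0
    for table in range(1 << (1 << n)):  # every truth table as a bitmask
        if all(((table >> x) & 1) <= ((table >> y) & 1) for x, y in covers):
            count += 1
    return count

print([dedekind(n) for n in range(5)])  # [2, 3, 6, 20, 168]
```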
Improving Diagnosis
• Reiter’s “black-box” (model-based) diagnosis helps debug the articulations
• Limited scalability (inherent complexity)
• But every bit helps:
– Hitting Set Algorithm (“logarithmic extraction”)
• Our idea:
– Exploit “white-box” reasoning information → RULES to the rescue
Key Idea: exploit white-box info
• We use Answer Set Programming (ASP) to solve Taxonomy Alignment Problem (TAP)
• Inconsistency = “False” is derived in the head:
False :- <denial of integrity constraint>
• Apply provenance trick from databases :-)
– What articulations contribute to a derivation of “False”?
– Eliminate those that don’t!
→ an example of reusing inferences across separate black-box tests!
The Provenance “Trick”
Hybrid Provenance
[Figure: the input alignment of T1 (1.a with isa-children 1.b, 1.c) and T2 (2.d with isa-children 2.e, 2.f)]
• Black-box Provenance: A3: c < f
• White-box Provenance:
– r3: a = b ∪ c
– r4: b ∩ c = ∅
– r7: d = e ∪ f
– r8: e ∩ f = ∅
– A1: a = d
– A2: b < e
– a = e ∪ f
– A1 + A2 + … ⇒ f < c
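
The white-box idea can be illustrated by tracking, for each derived fact, which input facts it depends on. A minimal sketch with hand-written propositional Horn rules standing in for the ASP encoding: the rule bodies are an assumed reading of the derivation on this slide (A1 and r7 give a = e ∪ f; together with r3, r4, r8 and A2 that gives f < c; f < c and A3 give False). The articulation A4 is hypothetical and added only to show an input that does not contribute.

```python
def lineage(rules, inputs):
    """Forward-chain Horn rules; map each derivable atom to the set of
    input facts used in some derivation of it."""
    support = {fact: {fact} for fact in inputs}
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if all(atom in support for atom in body):
                new = set().union(*(support[atom] for atom in body))
                if not new <= support.get(head, set()):
                    support.setdefault(head, set()).update(new)
                    changed = True
    return support

# Hypothetical propositional encoding of the slide's derivation.
RULES = [
    ("a=e|f", ["A1", "r7"]),
    ("f<c", ["a=e|f", "r3", "r4", "r8", "A2"]),
    ("false", ["f<c", "A3"]),
]
INPUTS = ["r3", "r4", "r7", "r8", "A1", "A2", "A3", "A4"]

support = lineage(RULES, INPUTS)
suspects = sorted(x for x in support.get("false", set()) if x.startswith("A"))
print(suspects)  # ['A1', 'A2', 'A3']
```

Only the articulations in `suspects` need to go into the black-box search; A4 never appears in a derivation of `false` and can be set aside.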
The Hybrid Approach
Hybrid Approach
• What articulations contribute to some inconsistency?
• Good old black-box (HST)
Benchmark Results
• White-box < Hybrid < Black-box (runtimes)
• Note: white-box does not give you a diagnosis
• Potassco < DLV
Benchmark DLV
• White-box < Hybrid < Black-box (runtimes)
• Potassco < DLV
Benchmark Clingo
• White-box < Hybrid < Black-box (runtimes)
• Potassco < DLV
Summary: Hybrid Diagnosis
• ASP rules can be used to efficiently solve real-world taxonomy reasoning problems
• Reiter’s diagnosis useful to debug inconsistent alignments
• Adding a “white-box” provenance approach speeds up state-of-the-art HST algorithm by eliminating independent articulations
• Future work:
– Further improvements, including parallelism:
• Trade-off with sharing inferences across parallel instances
Related Projects
The Data Life Cycle 
Data Quality & Curation Workflows 
• Collections & occurrence data is 
all over the map 
– … literally (off the map!) 
• Issues: 
– Lat/Long transposition, 
coordinate & projection issues 
– Data entry/creation, “fuzzy” 
data, naming issues, bit rot, 
data conversions and 
transformations, schema 
mappings, … (you name it) 
• Filtered-Push Collaboration 
Filtered-Push: Kurator (Data Curation Workflows) 
Tianhong Song, Sven Köhler, Lei Dou (former member)
From Tool Users to Tool Makers 
Screen capture… 
back to the original definition 
Theory meets Practice
Under the hood: Logic (ASP)
Summary & Invitation
• Building open source tools for
– Euler: Reasoning about taxonomies (& data integration)
– Kurator: Data Curation workflows
• … and other scientific workflows
• Topic not covered:
– (Game) Theory of Provenance (DAIS talk @CS, 10/7/2014)
• Looking for:
– new collaborators, students, ..
• Let’s meet!
– ludaesch@illinois.edu

More Related Content

Similar to Euler: A Logic­‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives

AI3391 Artificial Intelligence Session 26 First order logic.pptx
AI3391 Artificial Intelligence Session 26 First order logic.pptxAI3391 Artificial Intelligence Session 26 First order logic.pptx
AI3391 Artificial Intelligence Session 26 First order logic.pptxAsst.prof M.Gokilavani
 
Further pure mathematics 3 coordinate systems
Further pure mathematics 3  coordinate systemsFurther pure mathematics 3  coordinate systems
Further pure mathematics 3 coordinate systemsDennis Almeida
 
J.F. van Werven - Reconsidering Recursion (Master's Thesis)
J.F. van Werven - Reconsidering Recursion (Master's Thesis)J.F. van Werven - Reconsidering Recursion (Master's Thesis)
J.F. van Werven - Reconsidering Recursion (Master's Thesis)Jorike van Werven
 
Integrated modelling Cape Town
Integrated modelling Cape TownIntegrated modelling Cape Town
Integrated modelling Cape TownBob O'Hara
 
Franz 2014 ESA Aligning Insect Phylogenies Perelleschus and Other Cases
Franz 2014 ESA Aligning Insect Phylogenies Perelleschus and Other CasesFranz 2014 ESA Aligning Insect Phylogenies Perelleschus and Other Cases
Franz 2014 ESA Aligning Insect Phylogenies Perelleschus and Other Casestaxonbytes
 
T. Pradeu & M. Lemoine: Philosophy in Science: Definition and Boundaries
T. Pradeu & M. Lemoine: Philosophy in Science: Definition and BoundariesT. Pradeu & M. Lemoine: Philosophy in Science: Definition and Boundaries
T. Pradeu & M. Lemoine: Philosophy in Science: Definition and Boundariesjemille6
 
24-Day Content Area Li.docx
24-Day Content Area Li.docx24-Day Content Area Li.docx
24-Day Content Area Li.docxtamicawaysmith
 
Partitions or R³ in unit circles and the Axiom of Choice
Partitions or R³ in unit circles and the Axiom of ChoicePartitions or R³ in unit circles and the Axiom of Choice
Partitions or R³ in unit circles and the Axiom of ChoiceAzul Lihuen Fatalini
 
ICIS Module Spec - BI2S12 VERTEBRATE ZOOLOGY
ICIS Module Spec - BI2S12 VERTEBRATE ZOOLOGYICIS Module Spec - BI2S12 VERTEBRATE ZOOLOGY
ICIS Module Spec - BI2S12 VERTEBRATE ZOOLOGYDaniel Band
 
advanced engineering mathematics-erwin kreyszig.pdf
advanced engineering mathematics-erwin kreyszig.pdfadvanced engineering mathematics-erwin kreyszig.pdf
advanced engineering mathematics-erwin kreyszig.pdfcibeyo cibeyo
 
ICIS Module Spec - BI2S12 VERTEBRATE ZOOLOGY
ICIS Module Spec - BI2S12 VERTEBRATE ZOOLOGYICIS Module Spec - BI2S12 VERTEBRATE ZOOLOGY
ICIS Module Spec - BI2S12 VERTEBRATE ZOOLOGYDaniel Band
 
Variations in citation practices across the scientific landscape: Analysis ba...
Variations in citation practices across the scientific landscape: Analysis ba...Variations in citation practices across the scientific landscape: Analysis ba...
Variations in citation practices across the scientific landscape: Analysis ba...Wout Lamers
 
Visualization of composition and characteristics of scientific elites at an I...
Visualization of composition and characteristics of scientific elites at an I...Visualization of composition and characteristics of scientific elites at an I...
Visualization of composition and characteristics of scientific elites at an I...KNOWeSCAPE2014
 
Why (categorical) representation theory?
Why (categorical) representation theory?Why (categorical) representation theory?
Why (categorical) representation theory?Daniel Tubbenhauer
 

Similar to Euler: A Logic­‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives (20)

Waterloo.ppt
Waterloo.pptWaterloo.ppt
Waterloo.ppt
 
Mapping of Knowledge in DDC Arun Joseph
Mapping of Knowledge in DDC Arun JosephMapping of Knowledge in DDC Arun Joseph
Mapping of Knowledge in DDC Arun Joseph
 
AI3391 Artificial Intelligence Session 26 First order logic.pptx
AI3391 Artificial Intelligence Session 26 First order logic.pptxAI3391 Artificial Intelligence Session 26 First order logic.pptx
AI3391 Artificial Intelligence Session 26 First order logic.pptx
 
Further pure mathematics 3 coordinate systems
Further pure mathematics 3  coordinate systemsFurther pure mathematics 3  coordinate systems
Further pure mathematics 3 coordinate systems
 
J.F. van Werven - Reconsidering Recursion (Master's Thesis)
J.F. van Werven - Reconsidering Recursion (Master's Thesis)J.F. van Werven - Reconsidering Recursion (Master's Thesis)
J.F. van Werven - Reconsidering Recursion (Master's Thesis)
 
Integrated modelling Cape Town
Integrated modelling Cape TownIntegrated modelling Cape Town
Integrated modelling Cape Town
 
Franz 2014 ESA Aligning Insect Phylogenies Perelleschus and Other Cases
Franz 2014 ESA Aligning Insect Phylogenies Perelleschus and Other CasesFranz 2014 ESA Aligning Insect Phylogenies Perelleschus and Other Cases
Franz 2014 ESA Aligning Insect Phylogenies Perelleschus and Other Cases
 
Points of view in conceptual space
Points of view in conceptual spacePoints of view in conceptual space
Points of view in conceptual space
 
T. Pradeu & M. Lemoine: Philosophy in Science: Definition and Boundaries
T. Pradeu & M. Lemoine: Philosophy in Science: Definition and BoundariesT. Pradeu & M. Lemoine: Philosophy in Science: Definition and Boundaries
T. Pradeu & M. Lemoine: Philosophy in Science: Definition and Boundaries
 
24-Day Content Area Li.docx
24-Day Content Area Li.docx24-Day Content Area Li.docx
24-Day Content Area Li.docx
 
Partitions or R³ in unit circles and the Axiom of Choice
Partitions or R³ in unit circles and the Axiom of ChoicePartitions or R³ in unit circles and the Axiom of Choice
Partitions or R³ in unit circles and the Axiom of Choice
 
ICIS Module Spec - BI2S12 VERTEBRATE ZOOLOGY
ICIS Module Spec - BI2S12 VERTEBRATE ZOOLOGYICIS Module Spec - BI2S12 VERTEBRATE ZOOLOGY
ICIS Module Spec - BI2S12 VERTEBRATE ZOOLOGY
 
advanced engineering mathematics-erwin kreyszig.pdf
advanced engineering mathematics-erwin kreyszig.pdfadvanced engineering mathematics-erwin kreyszig.pdf
advanced engineering mathematics-erwin kreyszig.pdf
 
ICIS Module Spec - BI2S12 VERTEBRATE ZOOLOGY
ICIS Module Spec - BI2S12 VERTEBRATE ZOOLOGYICIS Module Spec - BI2S12 VERTEBRATE ZOOLOGY
ICIS Module Spec - BI2S12 VERTEBRATE ZOOLOGY
 
Variations in citation practices across the scientific landscape: Analysis ba...
Variations in citation practices across the scientific landscape: Analysis ba...Variations in citation practices across the scientific landscape: Analysis ba...
Variations in citation practices across the scientific landscape: Analysis ba...
 
Visualization of composition and characteristics of scientific elites at an I...
Visualization of composition and characteristics of scientific elites at an I...Visualization of composition and characteristics of scientific elites at an I...
Visualization of composition and characteristics of scientific elites at an I...
 
Why (categorical) representation theory?
Why (categorical) representation theory?Why (categorical) representation theory?
Why (categorical) representation theory?
 
LDA on social bookmarking systems
LDA on social bookmarking systemsLDA on social bookmarking systems
LDA on social bookmarking systems
 
Prep for TNReady!
Prep for TNReady!Prep for TNReady!
Prep for TNReady!
 
All teams together
All teams togetherAll teams together
All teams together
 

Euler: A Logic-Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives

  • 1. Euler: A Logic-Based Toolkit for Aligning and Reconciling Multiple Taxonomic Perspectives. Mingmin Chen¹, Shizhuo Yu¹, Parisa Kianmajd¹, Nico Franz², Shawn Bowers³, Bertram Ludäscher⁴. ¹ Dept. of Computer Science, University of California, Davis; ² School of Life Sciences, Arizona State University; ³ Dept. of Computer Science, Gonzaga University; ⁴ GSLIS & NCSA, University of Illinois at Urbana-Champaign
  • 2. Outline • Meet Nico, Curator of Insects • TAP: The Taxonomy Alignment Problem • Euler/X – Logic Inside! (X in FOL, RCC, ASP) • Related Projects B. Ludäscher Euler: Reasoning about Taxonomies CIRSS Seminar 9/19/2014 2
  • 3. Meet Prof. Nico Franz: Curator of Insects @ ASU
  • 4. What Nico et al. do for a living …
  • 5. Use Case: Perelleschus sec. 2001 & 2006. Perelleschus salpinflexus sec. Franz & Cardona-Duque (2013), DOI:10.1080/14772000.2013.806371. Input articulations: Franz & Cardona-Duque. 2013. Description of two new species and phylogenetic reassessment of Perelleschus Wibmer & O'Brien, 1986 (Coleoptera: Curculionidae), with a complete taxonomic concept history of Perelleschus sec. Franz & Cardona-Duque, 2013. Systematics and Biodiversity 11: 209–236. Merge analyses: Franz et al. 2014. Reasoning over taxonomic change: exploring alignments for the Perelleschus use case. PLoS ONE. (in press)
  • 6. Goal: Align two phylogenies with differential taxon sampling. T1: Perelleschus sec. 2001 • Phylogenetic revision • 8 ingroup species concepts • 2 outgroup concepts • 18 concepts total. T2: Perelleschus sec. 2006 • Exemplar analysis • 2 ingroup species concepts • 1 outgroup concept • 7 concepts total. Source: Nico Franz. Explaining taxonomy's legacy to computers – how and why? The Meaning of Names: Naming Diversity in the 21st Century, Museum of Natural History, U of Colorado, 9/30/2014.
  • 7. What Nico does for a living (cont’d): The Indoors Part • Go fun places, find new bugs, study them … – “Bugs-R-Us” (see taxonbytes.org) • Now: Compare, align and revise taxonomies, based on careful observation, “character” data, expertise … • Formally: – Input: T1 + T2 (taxonomies) + A (expert articulations) – Output: revised, “merged” taxonomy (-ies) T3
  • 8. Taxonomy Alignment Problem (TAP) • Given: – Taxonomies T1, T2, incl. constraints (coverage, disjointness) – Set of articulations (an alignment) A • Find: – Combined (“merged”) taxonomy T3 (= T1 + T2 + A) • Is it a taxonomy? Or a DAG? – Optional: • Final alignment (should be minimal)
  • 9. Real Example: Turn this … [Figure: input alignment graph. Taxonomy 1: 18 nodes, 17 is-a edges; Taxonomy 2: 21 nodes, 20 is-a edges; 20 articulations, each labelled “==”, “!”, or a disjunction such as “< OR ==” or “> OR !”.]
  • 10. … into this! (Perelleschus Alignment Result) • T3 := T1 and T2 are “merged” – Blue dashed: overlaps → resolve via “zoom-in view” [Figure: merged taxonomy graph. Legend: nodes: Taxonomy 1: 5, Taxonomy 2: 8, merged taxa: 13; edges: overlaps: 10, input: 24, inferred: 5.]
  • 11. So how does it work? • Suppose you have 3 concepts A, B, and C. • Assume you know something about – A ⇔R1 B (e.g., R1: A is a subset of B) – B ⇔R2 C (e.g., R2: B is disjoint from C) • Now what can you say about – A ⇔R3 C? • Yes? • … it follows that R3: A is disjoint from C!
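The inference on this slide can be checked mechanically on small examples. The following is a minimal sketch, not part of Euler/X: it treats concepts as non-empty finite sets and brute-forces every triple over a small universe to confirm that "A is a proper subset of B" and "B is disjoint from C" never occur together with an overlap between A and C. The function names and the universe size are illustrative assumptions.

```python
from itertools import combinations


def nonempty_subsets(universe):
    """Return all non-empty subsets of a finite universe as frozensets."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


def inference_holds(universe):
    """Check: whenever A is a proper subset of B and B is disjoint from C,
    A is also disjoint from C (for all non-empty A, B, C in the universe)."""
    regions = nonempty_subsets(universe)
    for a in regions:
        for b in regions:
            if not a < b:              # need A to be a proper subset of B
                continue
            for c in regions:
                if b.isdisjoint(c) and not a.isdisjoint(c):
                    return False       # counterexample found
    return True


if __name__ == "__main__":
    print(inference_holds({1, 2, 3, 4}))
```

A brute-force check over a four-element universe is of course not a proof, but it makes the shape of the reasoning step concrete: two known relations constrain the third.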
  • 12. Articulation Language (RCC-5) • How does the expert express the known (or assumed) relationship between taxa A and B? • How can A and B be related? • Use basic set constraints (B5): – A = B (equals, EQ) (==) – A < B (proper part of, PP) (<) – A > B (inverse proper part of, IPP) (>) – A o B (partially overlaps, PO) (><) – A ! B (disjoint “region”, DR) (!)
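To make the five base relations concrete, here is a small sketch that classifies two non-empty finite sets into exactly one RCC-5 relation, using the symbols from the slide (==, <, >, ><, !). The function name `rcc5` and the set-based encoding of taxa are assumptions made for illustration; Euler/X itself reasons over logical constraints rather than explicit member sets.

```python
def rcc5(a, b):
    """Return the single RCC-5 base relation holding between sets a and b.

    Symbols follow the slide: "==" (EQ), "<" (PP), ">" (IPP),
    "><" (PO, partial overlap), "!" (DR, disjoint).
    """
    a, b = frozenset(a), frozenset(b)
    if not a or not b:
        raise ValueError("RCC-5 regions are assumed to be non-empty")
    if a == b:
        return "=="
    if a < b:          # a is a proper subset of b
        return "<"
    if a > b:          # a is a proper superset of b
        return ">"
    if a & b:          # some shared members, neither contains the other
        return "><"
    return "!"         # no shared members


if __name__ == "__main__":
    print(rcc5({"x", "y"}, {"y", "z"}))
```

Because the five cases are tested in order and the last one is a catch-all, every pair of non-empty sets gets exactly one answer, which is the "one, and only one" property the articulation language relies on.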
  • 13. Taxonomies and Articulations in Euler. A taxonomy T is a triple (N, ≼, ϕ) with names (taxa) N, a partial order (is-a) ≼, and taxonomic constraints ϕ. • Sibling Disjointness: sibling taxa do not overlap • (Parent) Coverage: the union of the children “covers” the parent → no “missing” children. An articulation is a relation (set constraint) between taxa A and B. One, and only one, of the following base relations B5 must hold: (i) congruence, (ii) proper part, (iii) inverse proper part, (iv) partial overlap, (v) disjointness. There are 32 (= 2^5) possible disjunctions for representing partial information.
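The two taxonomic constraints named on this slide can be illustrated on explicit sets. The sketch below assumes a toy taxonomy given as a parent-to-children map plus a hypothetical "extension" (set of members) for each taxon; the taxon names, the data, and both function names are invented for illustration and do not come from the slides.

```python
def sibling_disjoint(children, extension):
    """Sibling Disjointness: no two children of the same parent share members."""
    for kids in children.values():
        for i, first in enumerate(kids):
            for second in kids[i + 1:]:
                if extension[first] & extension[second]:
                    return False
    return True


def covers_parent(children, extension):
    """Coverage: the union of the children's members equals the parent's members."""
    for parent, kids in children.items():
        union = set().union(*(extension[k] for k in kids))
        if union != extension[parent]:
            return False
    return True


# Hypothetical toy taxonomy (not from the slides).
children = {"Animal": ["Bird", "Fish"]}
extension = {
    "Animal": {"robin", "eagle", "trout"},
    "Bird": {"robin", "eagle"},
    "Fish": {"trout"},
}

if __name__ == "__main__":
    print(sibling_disjoint(children, extension), covers_parent(children, extension))
```

Adding a member to "Animal" that belongs to neither child would violate coverage, which is the "missing children" situation the slide mentions.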
  • 14. R32 lattice of 32 (= 2^5) disjunctions over B5 • Level 5 (tautology): = < > o ! (TRUE) • Level 4: = < o ! | = < > o | = > o ! | = < > ! | < > o ! • Level 3: = < o | = o ! | = < ! | < o ! | = > o | = < > | = > ! | < > o | > o ! | < > ! • Level 2: = o | = < | = ! | < o | o ! | < ! | = > | > o | < > | > ! • Level 1 (BASE-5 relations): = EQ(x,y) Equals | < PP(x,y) Proper Part of | > iPP(x,y) Inverse Proper Part | o PO(x,y) Partially Overlaps | ! DR(x,y) Disjoint from • Level 0 (contradiction): ∅ (FALSE)
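The level structure of the R32 lattice is simply the powerset of the five base relations, grouped by size. As a minimal sketch (the names `B5` and `r32_levels` are illustrative, not Euler/X identifiers), the levels can be enumerated like this:

```python
from itertools import combinations

# The five RCC-5 base relations, written with the slide's one-character symbols.
B5 = ("=", "<", ">", "o", "!")


def r32_levels():
    """Group all subsets (disjunctions) of B5 by how many base relations they contain."""
    return {k: [frozenset(c) for c in combinations(B5, k)]
            for k in range(len(B5) + 1)}


levels = r32_levels()
# Number of disjunctions on each level, from level 0 (contradiction) to level 5 (tautology).
level_sizes = [len(levels[k]) for k in range(len(B5) + 1)]

if __name__ == "__main__":
    print(level_sizes, sum(level_sizes))
```

The level sizes are the binomial coefficients of 5, so they sum to 2^5 = 32, matching the count on the slide.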
  • 15. Some History • … Aristotle … • … Euler … • … • … Greg Whitbread … • [BPB93] J. H. Beach, S. Pramanik, and J. H. Beaman. Hierarchic taxonomic databases. Advances in Computer Methods for Systematic Biology: Artificial Intelligence, Databases, Computer Vision, 1993. • [Ber95] Walter G. Berendsohn. The concept of “potential taxa” in databases. Taxon, 44:207–212, 1995. • [Ber03] Walter G. Berendsohn. MoReTax – Handling Factual Information Linked to Taxonomic Concepts in Biology. No. 39 in Schriftenreihe für Vegetationskunde. Bundesamt für Naturschutz, 2003. • [GG03] M. Geoffroy and A. Güntsch. Assembling and navigating the potential taxon graph. In [Ber03], pages 71–82, 2003. • [TL07] Thau, D., & Ludäscher, B. (2007). Reasoning about taxonomies in first-order logic. Ecological Informatics, 2(3), 195–209. • [FP09] Franz, N. M., & Peet, R. K. (2009). Perspectives: towards a language for mapping relationships among taxonomic concepts. Systematics and Biodiversity, 7(1), 5–20. • …
  • 16. What’s in a name? Euler Diagrams • Project named after Euler Diagrams: IF A is-a B AND C and B are disjoint, THEN A and C are disjoint!
  • 17. Euler Diagrams as Trees (or Graphs): a containment hierarchy (taxonomy) and an equivalent graph (w/ transitive edges) carry the same information
  • 18. Represent Phylogenies as Trees … T1: Perelleschus sec. 2001 • Phylogenetic revision • 8 ingroup species concepts • 2 outgroup concepts • 18 concepts total [Figure: T1 drawn as a tree with node labels 1.11 through 1.27, including 1.12L.]
  • 19. … for all taxonomies of interest … [Figure: T1 (nodes 1.11 through 1.27, including 1.12L) and T2 (nodes 2.35 through 2.54, including 2.36L) drawn as trees.]
  • 20. … ready, rotate by 90°, set … [Figure: the T1 and T2 trees, rotated by 90°.]
  • 21. Go! An expert input alignment! Just add some Euler Reasoning … [Figure: input alignment graph. Taxonomy 1: 18 nodes, 17 is-a edges; Taxonomy 2: 21 nodes, 20 is-a edges; 20 articulations, each labelled “==”, “!”, or a disjunction such as “< OR ==” or “> OR !”.]
  • 22. Euler/X toolkit in a single screenshot (desktop version, IX-2014)
  • 23. … et voilà! The merged T3 (= T1 & T2 & A). The Euler reasoner(s) infer: – Grey: “perfect match” (congruences) – Green, Yellow: “keepers” from T1, T2 – Red edges: deduced subset/“sub-class” relations – Blue edges: deduced overlaps [Figure: merged taxonomy graph T3.]
  • 24. But wait: PW1 … [Figure: possible world PW1 of the merged taxonomy.]
  • 25. … PW2 [Figure: possible world PW2 of the merged taxonomy.]
  • 26. … PW3 [Figure: possible world PW3 of the merged taxonomy.]
  • 27. … PW4 [Figure: possible world PW4 of the merged taxonomy.]
  • 28. … PW5 [Figure: possible world PW5 of the merged taxonomy.]
  • 29. Hmmm… depending on input alignment: PW1 [Figure: possible world PW1 of the merged taxonomy.]
  • 30. … and PW2 are the only solutions! What happened? [Figure: possible world PW2 of the merged taxonomy.]
  • 31. TAP: Possible Outcomes [Figure: input alignment of taxonomy 1 (concepts 1.a, 1.b, 1.c) and taxonomy 2 (concepts 2.d, 2.e, 2.f), with is-a edges and four articulations labelled “<” or “=”.]
  • 32. TAP: Possible Outcomes [Figure: the input alignment of slide 31.] • Inconsistent! → Diagnosis (Reiter): Black-Box Provenance [Figure: lattice of all subsets of the articulations {A1, A2, A3, A4}, from {A1, A2, A3, A4} down to { }.]
  • 33. TAP: Possible Outcomes [Figure: the input alignment of slide 31.] • Inconsistent! → Diagnosis (Reiter): Black-Box Provenance [Figure: lattice of all subsets of {A1, A2, A3, A4}.] • Ambiguous! → Multiple Possible Worlds [Figure: one merged taxonomy over 1.a, 1.b, 1.c, 2.d, 2.e, 2.f.]
  • 34. TAP: Possible Outcomes [Figure: the input alignment of slide 31.] • Inconsistent! → Diagnosis (Reiter): Black-Box Provenance [Figure: lattice of all subsets of {A1, A2, A3, A4}.] • Ambiguous! → Multiple Possible Worlds [Figure: two merged taxonomies over 1.a, 1.b, 1.c, 2.d, 2.e, 2.f.]
  • 35. TAP: Possible Outcomes 1.a 1.b isa 1.c isa 2.d < 2.f < 2.e = < isa isa Input Alignment {A1, A2, A3, A4} Black-­‐Box Provenance {A1, A2, A3} {A1, A2, A4} {A1, A3, A4} {A2, A3, A4} {A1, A2} {A1, A3} {A2, A3} {A1, A4} {A2, A4} {A3, A4} {A1} {A2} {A3} {A4} { } Inconsistent! è Diagnosis (Reiter) = 2.e 1.b 1.c 1.a 2.d 2.f Ambiguous! è Mul5ple Possible Worlds 2.f 1.c 1.b 1.a 2.d 2.e 1.b 2.e 1.a 2.d 1.c 2.f B. Ludäscher Euler: Reasoning about Taxonomies CIRSS Seminar 9/19/2014 35
  • 36. Euler/X Toolkit and Workflow • FO reasoning about taxonomies (MFOL) • Earlier: CleanTax – Prover9/Mace4 • Now: Euler – ASP Reasoners (DLV, Clingo) – Specialized reasoners (PyRCC) – … – X = ASP, RCC, …
  • 37. Reducing Ambiguity: Possible Worlds (PWs) View; Aggregate View (AV); Cluster View (CV). Explore!
  • 38. Common Outcome: Inconsistency! Input Alignment [Figure: concepts 1.a, 1.b, 1.c and 2.d, 2.e, 2.f with four isa edges, linked by one "=" and three "<" articulations]. Black-Box Provenance lattice: {A1, A2, A3, A4}; {A1, A2, A3} {A1, A2, A4} {A1, A3, A4} {A2, A3, A4}; {A1, A2} {A1, A3} {A2, A3} {A1, A4} {A2, A4} {A3, A4}; {A1} {A2} {A3} {A4}; { }. Inconsistent! → Diagnosis (Reiter) • Need to debug the input articulations → (black-box) diagnosis! • Focus: – How do we efficiently compute the diagnostic lattice? • Also: – How to visualize..
  • 39. A Hybrid Diagnosis Approach Combining Black-Box and White-Box Reasoning. Mingmin Chen1, Shizhuo Yu1, Nico Franz2, Shawn Bowers3, Bertram Ludäscher4. 1 Department of Computer Science, University of California, Davis; 2 School of Life Sciences, Arizona State University; 3 Department of Computer Science, Gonzaga University; 4 GSLIS & NCSA, University of Illinois at Urbana-Champaign
  • 40. Example Instance (from synthetic benchmark suite) • Here: N = 10 taxa in T1, T2 • Euler/X finds: inconsistent! • → diagnostic lattice of 2^10 = 1024 nodes → Find minimal inconsistent subset (MIS) → maximal consistent subset (MCS) .. → show to user!
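The black-box view on this slide can be sketched in a few lines of Python: enumerate every node of the 2^N lattice, ask a consistency oracle about each one, and keep only the minimal inconsistent subsets (MIS) and maximal consistent subsets (MCS). This is a brute-force illustration, not the Euler/X implementation. The oracle `toy_is_consistent` and its conflict list `TOY_CONFLICTS` are hypothetical stand-ins for a call to a real reasoner.

```python
from itertools import combinations

# Hypothetical conflicts, for illustration only. A real run would invoke
# a reasoner (e.g. an ASP solver) instead of this lookup.
TOY_CONFLICTS = [frozenset({"A3", "A4"}), frozenset({"A1", "A2", "A3"})]


def toy_is_consistent(subset):
    """A subset is inconsistent iff it contains one of the toy conflicts."""
    return not any(conflict <= subset for conflict in TOY_CONFLICTS)


def diagnose(articulations, is_consistent):
    """Enumerate the full 2^N lattice and return (MIS list, MCS list)."""
    items = sorted(articulations)
    lattice = [frozenset(c)
               for r in range(len(items) + 1)
               for c in combinations(items, r)]
    status = {node: is_consistent(node) for node in lattice}
    # MIS: inconsistent, but dropping any single articulation restores consistency.
    mis = [s for s in lattice
           if not status[s] and all(status[s - {a}] for a in s)]
    # MCS: consistent, but adding any missing articulation breaks consistency.
    mcs = [s for s in lattice
           if status[s] and all(not status[s | {a}] for a in items if a not in s)]
    return mis, mcs


mis, mcs = diagnose({"A1", "A2", "A3", "A4"}, toy_is_consistent)
for s in mis:
    print("MIS:", sorted(s))
for s in mcs:
    print("MCS:", sorted(s))
```

Because the oracle is called once per lattice node, the cost doubles with every added articulation, which is why the full lattice is only practical for small N.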
  • 41. Visualizing Diagnoses: N = 10 articulations → 2^10 = 1024 node diagnostic lattice
  • 42. Better Idea: Just show MIS, MCS. N = 4 articulations → 2^4 = 16 node diagnostic lattice, but 3 MCS and 2 MIS are enough!
  • 43. Visualizing Diagnoses: 1024 node lattice .. but 4 MCS and 1 MIC tell it all!
  • 44. Visualizing Diagnoses. Example from RuleML’14 paper: N=12 → 4096 nodes .. but 7 MCS and 5 MIC tell it all!
  • 45. Black-Box Inconsistency Analysis (Diagnostic Lattice) • Then: What happens if you can’t have all (here: 4) articulations together? – Repair: find & revise minimal inconsistent subsets (Min-Incons) – Expand: find maximal consistent subsets (Max-Cons) & revise outs
  • 46. Inconsistency Analysis (Diagnostic Lattice) • Black-box Analysis (Hitting Set algo.) yields a Diagnosis (lattice) – for n=4 articulations, there are 168 possible diagnoses – depending on expected “red/green areas” → explore space differently • The Min-Incons (MIS) and Max-Cons (MCS) sets determine all others → Repair MIS and/or Expand MCS • |articulations| = n → |possible diagnoses| = |monotonic Boolean functions| = Dedekind Number (n): 2, 3, 6, 20, 168, 7581, 7828354, ...
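The counts on this slide can be checked by brute force for small n. A diagnosis labels each lattice node consistent or inconsistent, and consistency is monotone: every subset of a consistent set is consistent. So each diagnosis corresponds to a down-closed family of subsets, and counting those families gives the Dedekind numbers. The sketch below is an illustrative check, not part of Euler/X, and the function name is an assumption.

```python
def count_down_closed_families(n):
    """Count families of subsets of an n-element set that are closed
    under taking subsets. Subsets and families are encoded as bitmasks."""
    num_subsets = 1 << n
    count = 0
    for family in range(1 << num_subsets):
        ok = True
        for s in range(num_subsets):
            if (family >> s) & 1:
                # Every subset obtained by dropping one element of s
                # must also belong to the family.
                for i in range(n):
                    if (s >> i) & 1 and not (family >> (s ^ (1 << i))) & 1:
                        ok = False
                        break
            if not ok:
                break
        if ok:
            count += 1
    return count


# n = 4 already requires scanning 2^16 candidate families.
counts = [count_down_closed_families(n) for n in range(5)]
print(counts)  # [2, 3, 6, 20, 168]
```

The enumeration is doubly exponential in n, so it is only usable up to n = 4; it is meant to show why the space of possible diagnoses grows so quickly, not to compute larger Dedekind numbers.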
  • 47. Improving Diagnosis • Reiter’s “black-box” (model-based) diagnosis helps debug the articulations • Limited scalability (inherent complexity) • But every bit helps: – Hitting Set Algorithm (“logarithmic extraction”) • Our idea: – Exploit “white-box” reasoning information → RULES to the rescue
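One way to extract a single minimal inconsistent subset without walking the whole lattice is a divide-and-conquer search in the style of QuickXplain. The slide does not say which extraction procedure the talk used, so the sketch below is an assumption shown only to illustrate the idea. The oracle `toy_is_consistent` and its conflict list are hypothetical.

```python
# Hypothetical conflicts, for illustration only.
TOY_CONFLICTS = [frozenset({"A3", "A4"}), frozenset({"A1", "A2", "A3"})]


def toy_is_consistent(subset):
    return not any(conflict <= frozenset(subset) for conflict in TOY_CONFLICTS)


def minimal_inconsistent_subset(articulations, is_consistent):
    """Return one MIS via QuickXplain-style splitting, or None if the
    whole set is consistent. Assumes consistency is monotone."""
    items = list(articulations)
    if is_consistent(items):
        return None
    if not items:
        return frozenset()

    def qx(background, delta_added, candidates):
        # If the background grew and is already inconsistent,
        # no candidate is needed to explain the conflict.
        if delta_added and not is_consistent(background):
            return []
        if len(candidates) == 1:
            return list(candidates)
        mid = len(candidates) // 2
        left, right = candidates[:mid], candidates[mid:]
        from_right = qx(background + left, bool(left), right)
        from_left = qx(background + from_right, bool(from_right), left)
        return from_left + from_right

    return frozenset(qx([], False, items))


one_mis = minimal_inconsistent_subset(["A1", "A2", "A3", "A4"], toy_is_consistent)
print(sorted(one_mis))
```

The splitting lets the search discard half of the remaining candidates with a single oracle call whenever the other half already explains the conflict, which is where the savings over one-at-a-time removal come from.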
  • 48. Key Idea: exploit white-box info • We use Answer Set Programming (ASP) to solve the Taxonomy Alignment Problem (TAP) • Inconsistency = “False” is derived in the head: False :- <denial of integrity constraint> • Apply provenance trick from databases :-) – What articulations contribute to a derivation of “False”? – Eliminate those that don’t! → an example of reusing inferences across separate black-box tests!
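The provenance idea can be illustrated with a small forward-chaining sketch in Python: each input articulation carries a tag, each derived atom inherits the tags of the atoms it was derived from, and the tags that reach "false" are the only articulations the black-box search still needs to consider. This is not the talk's ASP encoding. The ground rules below are a hand-written assumption loosely modeled on the a/b/c and d/e/f example from the slides, and the fourth articulation (`other_articulation`, tagged A4) is a hypothetical placeholder.

```python
def contributing_articulations(facts, rules, goal="false"):
    """facts: atom -> set of articulation tags (empty set for taxonomy facts).
    rules: list of (head, [body atoms]), already ground.
    Returns the tags occurring in some derivation of `goal`."""
    lineage = {atom: set(tags) for atom, tags in facts.items()}
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if all(b in lineage for b in body):
                new_tags = set().union(*(lineage[b] for b in body))
                old_tags = lineage.get(head)
                if old_tags is None or not new_tags <= old_tags:
                    lineage[head] = (old_tags or set()) | new_tags
                    changed = True
    return lineage.get(goal, set())


facts = {
    "a=d": {"A1"}, "b<e": {"A2"}, "c<f": {"A3"},
    "other_articulation": {"A4"},          # hypothetical, unrelated to the conflict
    "a=b+c": set(), "b.c disjoint": set(),  # taxonomy 1 constraints (untagged)
    "d=e+f": set(), "e.f disjoint": set(),  # taxonomy 2 constraints (untagged)
}
rules = [
    ("a=e+f", ["a=d", "d=e+f"]),
    ("f<c", ["a=e+f", "a=b+c", "b.c disjoint", "e.f disjoint", "b<e"]),
    ("false", ["f<c", "c<f"]),
]

relevant = contributing_articulations(facts, rules)
print(sorted(relevant))  # ['A1', 'A2', 'A3']
```

In this toy run A4 never reaches "false", so it can be dropped before any black-box test, shrinking the lattice from 2^4 to 2^3 nodes.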
  • 49. The Provenance “Trick”
  • 50. Hybrid Provenance. Black-box Provenance: A3: c < f. Input Alignment [Figure: concepts 1.a, 1.b, 1.c and 2.d, 2.e, 2.f with four isa edges, linked by one "=" and three "<" articulations]
  • 51. Hybrid Provenance. Black-box Provenance: A3: c < f. White-box Provenance: A1: a = d; r7: d = e ∪ f; a = e ∪ f; r3: a = b ∪ c; r4: b ∩ c = ∅; r8: e ∩ f = ∅; A2: b < e; A1+A2 + … => f < c. Input Alignment [Figure: concepts 1.a, 1.b, 1.c and 2.d, 2.e, 2.f with four isa edges, linked by one "=" and three "<" articulations]
  • 52. The Hybrid Approach
  • 53. Hybrid Approach: What articulations contribute to some inconsistency? Good old black-box (HST)
  • 54. Benchmark Results • White-box < Hybrid < Black-box (runtimes) • Note: white-box does not give you a diagnosis • Potassco < DLV
  • 55. Benchmark DLV • White-box < Hybrid < Black-box (runtimes) • Potassco < DLV
  • 56. Benchmark Clingo • White-box < Hybrid < Black-box (runtimes) • Potassco < DLV
  • 57. Summary: Hybrid Diagnosis • ASP rules can be used to efficiently solve real-world taxonomy reasoning problems • Reiter’s diagnosis useful to debug inconsistent alignments • Adding a “white-box” provenance approach speeds up state-of-the-art HST algorithm by eliminating independent articulations • Future work: – Further improvements, including parallelism: • Trade-off with sharing inferences across parallel instances
  • 58. Related Projects
  • 59. The Data Life Cycle
  • 60. Data Quality & Curation Workflows • Collections & occurrence data is all over the map – … literally (off the map!) • Issues: – Lat/Long transposition, coordinate & projection issues – Data entry/creation, “fuzzy” data, naming issues, bit rot, data conversions and transformations, schema mappings, … (you name it) • Filtered-Push Collaboration
  • 62. Filtered-Push: Kurator (Data Curation Workflows). Tianhong Song, Sven Köhler, Lei Dou (former member)
  • 63. From Tool Users to Tool Makers. Screen capture… back to the original definition
  • 64. Theory meets Practice
  • 65. Under the hood: Logic (ASP)
  • 66. Summary & Invitation • Building open source tools for – Euler: Reasoning about taxonomies (& data integration) – Kurator: Data Curation workflows • … and other scientific workflows • Topic not covered: – (Game) Theory of Provenance (DAIS talk @CS, 10/7/2014) • Looking for: – new collaborators, students, .. • Let’s meet! – ludaesch@illinois.edu