Slideshow transcript
Slide 1: Deploying Semantic Technologies for Digital Publishing: A Case Study from Logos Bible Software. Sean Boisen (sean@logos.com). Slides at: http://semanticbible.org/other/presentations/2007-SemTech/
Slide 2: Outline • Background: application and motivation • Scope and Overview • Technical Challenges: – Reification for provenance data – Converting legacy data – Tools for knowledge extension • Future directions
Slide 3: Who Am I? • 19 years with BBN Technologies – Information extraction, human language technology – Scientist, technology manager • Semantic Web hobbyist • Senior Information Architect at Logos • One-man semantic band
Slide 4: The Importance of the Bible as a Semantic Domain • The most widely distributed book – 35M Bibles and Testaments in 2005 • The most widely translated work – > 2000 languages – 41 languages at www.biblegateway.com • Spans 1000s of years of ancient history
Slide 5: Logos Bible Software • High-end desktop digital library – > 7000 titles – Resources in a dozen languages – Users in 180 countries – Extensive cross-indexing and hyperlinking • Leading publisher and developer of digital resources for Bible study
Slide 6: Logos Value • Digital library with hyperlinked references and citations • Information integration for navigation, search • Support for original languages • Search • New content to enrich Bible study
Slide 7: The Bible Knowledgebase (BK) • A machine-readable knowledge base of semantically-organized Bible data – In OWL – Linked to Biblical texts – Search, navigation, visualization • Relationships support discovery and exploration • Reusable content (unlike prose) • Integration framework for library resources (future) • Today: named people and places, and their relationships • Tomorrow: chronology, events, concepts, non-named things, key terms, topics, …
Slide 8: Approach • Build on Semantic Web standards • Model the domain rather than annotate texts • Layer knowledge: first entities, then relationships • Be conservative in what we assert and provide references as evidence • Try to avoid philosophy and focus on end-user value
Slide 9: The Semantic Value Proposition • Identify and disambiguate entities (beyond names) – 30 people named Zechariah – Jesus’ disciple: Peter, Simon, Simeon, Cephas … – Judah: person, tribe, territory • Link reference information to passages for background • Provide a rich set of relationships to encourage exploration and discovery • Provide consistent cross-resource indexing • Leverage third-party tools • Provide scalability • Avoid reinventing the wheel
Slide 10: User Benefits • Disambiguation makes search work better • Passage guide displays relevant entities to provide background information • Relationships encourage browsing and exploration • Visualization makes complex information easier to grasp
Slide 11: Development Tools • Ontology development and instance creation with Protégé • Legacy data conversion and data merging through XSLT • Storage in Sesame • Some integration code in Python for loading and querying RDF • TBD
Slide 12: Most Important BK Classes • > 60 classes in all (not counting reified relationships) • Many upper classes are not instantiated • General coordination of class names with SUMO – But not true re-use
Slide 13: BK Classes for Places
Slide 14: BK Abstract Classes
Slide 15: BK Instances • ~100k triples • ~3000 people instances – Aaron to Zurishaddai – Names (various languages) • ~20k passage references for assertions • 90 cities, other places • Ethnicities, belief systems, languages, social roles, organizations
Slide 16: Major BK Relationships [Table with columns Domain, Property, Range; cells in extraction order:] Member of; Group; Family Relationships; Human; Human; Knows, collaborates, antagonist, enemy; Social role, (attributes); Ethnicity, Belief; Native, resident, visited place; Region; Region; Subregion; Geolocation data; Latitude, longitude, etc. • And inverse relationships …
Slide 17: Challenge: Assertions about Properties • Provenance is important to the domain and application • Problem: how to make assertions about properties – <#John.3, isFatherOf, #Peter>: says who? [Diagram: #John.3 connected to #Peter and to #Andrew.1 by the inverse properties isFatherOf and hasFather]
Slide 18: Reification • Merriam-Webster: – “to regard (something abstract) as a material or concrete thing” • Model the relationship between instances as an instance itself
Slide 19: Reified Relationships • Solution: make the relationship an object about which we can make assertions – All “simple” properties get more complex [Diagram: reified nodes #John.3_parent_Peter and #John.3_parent_Andrew.1 connect #John.3 to #Peter and #Andrew.1 through isFatherOf, hasFather, isSonOf and hasSon; reference properties on the reified nodes point to “bible.64.1.42”, “bible.64.21.15”, “bible.64.21.16”, “bible.64.21.17”]
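A minimal sketch of the reification on Slide 19, using plain Python tuples as triples rather than any RDF library. The property names and the URI pattern come from the slides; the direction of each property, the class name `Parent`, and the pairing of this particular reference with this relationship are assumptions made for illustration.

```python
# Triples are plain (subject, predicate, object) tuples; no RDF library is assumed.

def reify_parent(father, son, references):
    """Return triples for a reified father/son relationship with provenance."""
    # URI pattern from the slides: #<personURI>_<relation>_<personURI>
    rel = f"{father}_parent_{son.lstrip('#')}"
    triples = {
        (rel, "rdf:type", "Parent"),   # hypothetical class name
        (father, "isFatherOf", rel),   # assumed direction: person -> relation
        (rel, "hasSon", son),          # assumed direction: relation -> person
        (son, "isSonOf", rel),
        (rel, "hasFather", father),
    }
    # Provenance hangs off the relation node, not off either person.
    triples |= {(rel, "reference", ref) for ref in references}
    return triples

# Illustrative call; which passage supports which relationship is assumed here.
triples = reify_parent("#John.3", "#Peter", ["bible.64.1.42"])
```

Because the relationship is now a node of its own, the question "says who?" from Slide 17 has somewhere to attach an answer: any number of `reference` triples can be added without touching the people involved.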
Slide 20: Some Consequences of Reification • Class and property instance overhead – 2 simple inverse properties become 4 properties and 1 class – Abstract hierarchy of classes of reified relationships – Add overhead as well to ontology development, query construction, etc. • Symmetric and transitive properties • Challenges for reasoning – Restrictions come from a combination of properties and reified classes
Slide 21: Reified Classes (Family) • All binary relationships with appropriate restrictions on their arguments (max 2, range restrictions, etc.)
Slide 22: Other Reified Classes
Slide 23: Properties Between Reified Properties [Diagram: reif:isFatherOf, reif:hasFather, reif:isSonOf and reif:hasSon linked to one another by owl:inverseOf, reif:inverseOf, reif:pairedProperty, reif:onsetOf and reif:codaOf] • Beyond OWL • Defined with respect to particular reified classes • Automatically derivable from the ontology
Slide 24: Reified Relationships: Names • Appellations (names) are class instances – An Appellation instance has string representations (in various languages) – Keeps all the facts about a name (different language versions, pronunciation, literal meaning, etc.) in one place • An individual has a (reified) NamingRelation to an Appellation instance – Mentions of the individual are properties of the NamingRelation
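The name model on Slide 24 can be sketched with two small Python dataclasses. The class names `Appellation` and `NamingRelation` are from the slide; the field names, the helper functions, the URIs, and the English form "Joseph" are assumptions for illustration, and the sample data is loosely based on the Barnabas example on the following slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Appellation:
    """A name as an instance: all its language forms live in one place."""
    uri: str
    forms: Dict[str, str] = field(default_factory=dict)  # language tag -> string

@dataclass
class NamingRelation:
    """Reified link from an individual to an Appellation, carrying mentions."""
    individual: str
    appellation: Appellation
    references: List[str] = field(default_factory=list)

barnabas_name = Appellation(
    "#NameOfBarnabas", {"en": "Barnabas", "el": "Βαρναβᾶς", "es": "Bernabé"}
)
joseph_name = Appellation("#NameOfJoseph", {"en": "Joseph"})  # assumed form

relations = [
    NamingRelation("#Barnabas", barnabas_name),
    NamingRelation("#Barnabas", joseph_name),
    NamingRelation("#Joseph2.1", joseph_name, ["bible.61.1.16", "bible.61.1.18"]),
]

def names_of(individual, lang="en"):
    """All names of one individual in the requested language."""
    return [r.appellation.forms[lang] for r in relations
            if r.individual == individual and lang in r.appellation.forms]

def individuals_named(appellation):
    """All individuals sharing one Appellation instance."""
    return [r.individual for r in relations if r.appellation is appellation]
```

Keeping the references on the `NamingRelation` rather than on the person or the name means a passage can support "this person is called Joseph here" without implying anything about other people who share that name.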
Slide 25: Reified Names Example [Diagram: #Barnabas (rdf:type bk:Man) is linked through the bk:AppellationRel node “#Barnabas namedBy Barnabas” (isNamedBy, hasAppellation) to the bk:Name node “#NameOf Barnabas”, which has hasName values "Barnabas"@en, "Βαρναβᾶς"@el, "Bernabé"@es, hasPhoneticRepresentation bär'nə-bəs, and hasPronunciation www.libronix.com/bkaudio/barnabas.wav; #Barnabas is also linked through “#Barnabas namedBy Joseph” to “#NameOf Joseph”, as is #Joseph2.1 through “#Joseph2.1 namedBy Joseph”; reference values “bible.61.1.16”, “bible.61.1.18”] Etc. And all the right-to-left equivalents …
Slide 26: Challenge: Converting Legacy Data • Strategy: use XSL to generate RDF matching the ontology – Legacy XML data organized by name and by person – Generate reified relations from simple ones • Lookup table for reified inverse properties (but kb query would be cleaner) – Both sides of family relationships are defined independently • URI Naming – Map different XML names to a single URI – Generate shared URIs for reified relations like #<personURI>_<relation>_<personURI> – RDF merging connects them in the kb • Why not owl:sameAs? – Additional complexity but no practical benefit for internal-only data
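The URI naming idea on Slide 26 can be sketched in Python (the slides describe doing this in XSL; Python is used here only for illustration). Each legacy record is converted on its own, but both sides compute the same relation URI, so a plain set union stands in for the RDF merge. The `LEGACY` table, the function names, the `reif:` prefixes on the output properties, and the direction of each property are assumptions.

```python
# Hypothetical lookup table: legacy property -> (relation name, subject is the parent?)
LEGACY = {
    "isFatherOf": ("parent", True),
    "hasFather": ("parent", False),
}

def relation_uri(first, relation, second):
    """Shared URI following the pattern #<personURI>_<relation>_<personURI>."""
    return f"{first}_{relation}_{second.lstrip('#')}"

def convert(subject, prop, obj):
    """Turn one simple legacy triple into its half of the reified relation."""
    relation, subject_is_parent = LEGACY[prop]
    parent, child = (subject, obj) if subject_is_parent else (obj, subject)
    rel = relation_uri(parent, relation, child)
    if subject_is_parent:
        return {(parent, "reif:isFatherOf", rel), (rel, "reif:hasSon", child)}
    return {(child, "reif:isSonOf", rel), (rel, "reif:hasFather", parent)}

# The two sides are defined independently in the legacy data ...
from_john = convert("#John.3", "isFatherOf", "#Peter")
from_peter = convert("#Peter", "hasFather", "#John.3")

# ... and meet at the shared relation URI once merged.
kb = from_john | from_peter
```

Since both conversions always derive the URI from the parent first, the order in which records are processed does not matter, and no owl:sameAs statements are needed to join the two halves.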
Slide 27: Converting Legacy Data (2) • Other OWL data with different URIs and non-reified relations – Map entities to common URIs (shared across both legacy data sets) – Adopt same URI construction principles – Expand out reified relations – RDF merge in the kb
Slide 28: Legacy Data Pipeline [Pipeline diagram; labels in extraction order: Biblical People XML (Aaron.xml, …); … in RDF (Aaron.rdf); XSL; Loader; BK ontology; merge; map; NTNames; XSL; BK-NTNames (OWL); Other data (OWL)] • Query (SeRQL, SPARQL) • Extract • API • Web service
Slide 29: Challenge: Maintenance and Extension • How to lower the skill threshold for extending the data? • Approach: – Distinguish different operations 1. Adding new instances of relationships (easy) 2. Adding non-relational attributes (easy) 3. Adding new instances of basic entities (a little harder) 4. Fixing bad data, extending the ontology (hard) – Get the core entities right first (enables #1) – Develop specialized tools that • Are constrained in scope • Provide simple choices • Hide complications (like reification)
Slide 30: How Do We Deliver Semantics? • Part of a consumer software application: not on the open web • Not practical to ship an RDF store • Likely: combination of – Some static results shipped with product – Some web service support for dynamic information – A web portal with richer search capabilities
Slide 31: Open Architecture Issues • Visualization – Likely: custom MFC • End-user query – Likely: at most, templated queries • Reasoning – Necessary, but …
Slide 32: Future Extensions to BK • Place names and related properties • Brief descriptions for entities • Place people in Biblical eras • Narrative role (greetings in epistles, scene participants, background) • Key events from narratives • Concepts • Unnamed things (descriptions, pronouns) • Headwords and lexical relationships
Slide 33: References • Weaving the New Testament into the Semantic Web, http://semanticbible.org/other/presentations/2006-sbl • Suggested Upper Merged Ontology (SUMO), http://www.ontologyportal.org/ • Defining N-ary Relations on the Semantic Web, http://www.w3.org/TR/swbp-n-aryRelations/


