Deploying Semantic Technologies for Digital Publishing: A Case Study from Logos Bible Software


    Deploying Semantic Technologies for Digital Publishing: A Case Study from Logos Bible Software - Presentation Transcript

    1. Deploying Semantic Technologies for Digital Publishing: A Case Study from Logos Bible Software. Sean Boisen (sean@logos.com). Slides at: http://semanticbible.org/other/presentations/2007-SemTech/
    2. Outline
      • Background: application and motivation
      • Scope and Overview
      • Technical Challenges:
        • Reification for provenance data
        • Converting legacy data
        • Tools for knowledge extension
      • Future directions
    3. Who Am I?
      • 19 years with BBN Technologies
        • Information extraction, human language technology
        • Scientist, technology manager
      • Semantic Web hobbyist
      • Senior Information Architect at Logos
      • One-man semantic band
    4. The Importance of the Bible as a Semantic Domain
      • The most widely distributed book
        • 35M Bibles and Testaments in 2005
      • The most widely translated work
        • > 2000 languages
        • 41 languages at www.biblegateway.com
      • Spans 1000s of years of ancient history
    5. Logos Bible Software
      • High-end desktop digital library
        • > 7000 titles
        • Resources in a dozen languages
        • Users in 180 countries
        • Extensive cross-indexing and hyperlinking
      • Leading publisher and developer of digital resources for Bible study
    6. Logos Value
      • Digital library with hyperlinked references and citations
      • Information integration for navigation, search
      • Support for original languages
      • Search
      • New content to enrich Bible study
    7. The Bible Knowledgebase (BK)
      • A machine-readable knowledgebase of semantically-organized Bible data
        • In OWL
        • Linked to Biblical texts
        • Search, navigation, visualization
      • Relationships support discovery and exploration
      • Reusable content (unlike prose)
      • Integration framework for library resources (future)
      • Today: named people and places, and their relationships
      • Tomorrow: chronology, events, concepts, non-named things, key terms, topics, …
    8. Approach
      • Build on Semantic Web standards
      • Model the domain rather than annotate texts
      • Layer knowledge: first entities, then relationships
      • Be conservative in what we assert and provide references as evidence
      • Try to avoid philosophy and focus on end-user value
    9. The Semantic Value Proposition
      • Identify and disambiguate entities (beyond names)
        • 30 people named Zechariah
        • Jesus’ disciple: Peter, Simon, Simeon, Cephas …
        • Judah: person, tribe, territory
      • Link reference information to passages for background
      • Provide a rich set of relationships to encourage exploration and discovery
      • Provide consistent cross-resource indexing
      • Leverage third-party tools
      • Provide scalability
      • Avoid reinventing the wheel
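The disambiguation idea on this slide can be sketched as a lookup from surface names to entity identifiers. This is only an illustration: the identifiers (`bk:Peter`, `bk:Judah.tribe`, and so on), the type labels, and the function names are hypothetical and not taken from BK.

```python
from collections import defaultdict

# Hypothetical entity records: identifier -> (type, known names).
# The identifiers and type labels are illustrative, not BK's own.
ENTITIES = {
    "bk:Peter": ("Man", ["Peter", "Simon", "Simeon", "Cephas"]),
    "bk:Judah.person": ("Man", ["Judah"]),
    "bk:Judah.tribe": ("Tribe", ["Judah"]),
    "bk:Judah.territory": ("Region", ["Judah"]),
}


def build_name_index(entities):
    """Map each lower-cased surface name to the set of entities bearing it."""
    index = defaultdict(set)
    for entity_id, (_entity_type, names) in entities.items():
        for name in names:
            index[name.lower()].add(entity_id)
    return index


def candidates(index, name, entity_type=None, entities=ENTITIES):
    """Return the entities a name may refer to, optionally filtered by type."""
    found = index.get(name.lower(), set())
    if entity_type is not None:
        found = {e for e in found if entities[e][0] == entity_type}
    return sorted(found)


INDEX = build_name_index(ENTITIES)
```

Here one name ("Judah") yields several candidate entities, while several names ("Simon", "Cephas") lead to one entity. A search feature could use the type filter to narrow the candidates.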
    10. User Benefits
      • Disambiguation makes search work better
      • Passage guide displays relevant entities to provide background information
      • Relationships encourage browsing and exploration
      • Visualization makes complex information easier to grasp
    11. Development Tools
      • Ontology development and instance creation with Protégé
      • Legacy data conversion and data merging through XSLT
      • Storage in Sesame
      • Some integration code in Python for loading and querying RDF
      • TBD
    12. Most Important BK Classes
      • > 60 classes in all (not counting reified relationships)
      • Many upper classes are not instantiated
      • General coordination of class names with SUMO
        • But not true re-use
    13. BK Classes for Places
    14. BK Abstract Classes
    15. BK Instances
      • ~100k triples
      • ~3000 people instances
        • Aaron to Zurishaddai
        • Names (various languages)
      • ~20k passage references for assertions
      • 90 cities, other places
      • Ethnicities, belief systems, languages, social roles, organizations
    16. Major BK Relationships (property, with domain and range where the slide gives them)
      • Family relationships: Human to Human
      • Knows, collaborates, antagonist, enemy
      • Member of: Group
      • Native, resident, visited place: Region
      • Subregion: Region
      • (Attributes) Social role, Ethnicity, Belief
      • Geolocation data: latitude, longitude, etc.
      • And inverse relationships …
    17. Challenge: Assertions about Properties
      • Provenance is important to the domain and application
      • Problem: how to make assertions about properties
        • <#John.3, isFatherOf, #Peter>: says who?
      [Diagram: #John.3 linked to #Peter and to #Andrew.1 by isFatherOf, with hasFather as the inverse]
    18. Reification
      • Merriam-Webster:
        • “to regard (something abstract) as a material or concrete thing”
      • Model the relationship between instances as an instance itself
    19. Reified Relationships
      • Solution: make the relationship an object about which we can make assertions
        • All “simple” properties get more complex
      [Diagram: reified instances #John.3_parent_Andrew.1 and #John.3_parent_Peter sit between #John.3 and #Andrew.1 / #Peter, connected by isFatherOf, hasFather, isSonOf, and hasSon; passage references “bible.64.1.42”, “bible.64.21.15”, “bible.64.21.16”, “bible.64.21.17” attach to the reified instances via reference]
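A minimal sketch of the reification step, assuming one reading of the diagram: the person points at the relationship instance with isFatherOf / isSonOf, and the instance points back with hasFather / hasSon. The class and function names are hypothetical, and the direction of each property is an assumption rather than something the slide states.

```python
from dataclasses import dataclass, field


@dataclass
class ReifiedRelation:
    """A relationship modelled as an instance, so it can carry evidence."""
    uri: str
    references: list = field(default_factory=list)


def reify_parent(father, son, references):
    """Return the reified instance plus the four triples that link it.

    Property directions here are an assumed reading of the slide's diagram.
    """
    rel = ReifiedRelation(f"{father}_parent_{son.lstrip('#')}", list(references))
    triples = [
        (father, "isFatherOf", rel.uri),
        (rel.uri, "hasFather", father),
        (son, "isSonOf", rel.uri),
        (rel.uri, "hasSon", son),
    ]
    return rel, triples


rel, triples = reify_parent(
    "#John.3", "#Peter",
    ["bible.64.21.15", "bible.64.21.16", "bible.64.21.17"],
)
```

The passage references now hang off `rel`, which answers the "says who?" question that a bare `<#John.3, isFatherOf, #Peter>` triple cannot.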
    20. Some Consequences of Reification
      • Class and property instance overhead
        • 2 simple inverse properties become 4 properties and 1 class
        • Abstract hierarchy of classes of reified relationships
        • Add overhead as well to ontology development, query construction, etc.
      • Symmetric and transitive properties
      • Challenges for reasoning
        • Restrictions come from a combination of properties and reified classes
    21. Reified Classes (Family)
      • All binary relationships with appropriate restrictions on their arguments (max 2, range restrictions, etc.)
    22. Other Reified Classes
    23. Properties Between Reified Properties
      • Beyond OWL
      • Defined with respect to particular reified classes
      • Automatically derivable from the ontology
      [Diagram: reif:isFatherOf, reif:isSonOf, reif:hasSon, and reif:hasFather, related to one another by owl:inverseOf, reif:inverseOf, reif:pairedProperty, reif:onsetOf, and reif:codaOf]
    24. Reified Relationships: Names
      • Appellations (names) are class instances
        • An Appellation instance has string representations (in various languages)
        • Keeps all the facts about a name (different language versions, pronunciation, literal meaning, etc.) in one place
      • An individual has a (reified) NamingRelation to an Appellation instance
        • Mentions of the individual are properties of the NamingRelation
    25. Reified Names Example
      [Diagram: #Barnabas (rdf:type bk:Man) and #Joseph2.1 are linked by isNamedBy / hasAppellation through bk:NameRel instances (#Barnabas namedBy Barnabas, #Barnabas namedBy Joseph, #Joseph2.1 namedBy Joseph) to bk:Appellation instances (#NameOf Barnabas, #NameOf Joseph). The Barnabas appellation has hasName “Barnabas”@en, "Βαρναβᾶς"@el, "Bernabé"@es; hasPhoneticRepresentation bär'nə-bəs; hasPronunciation www.libronix.com/bkaudio/barnabas.wav. Passage references “bible.61.1.16”, “bible.61.1.18”, etc. attach via reference. And all the right-to-left equivalents …]
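The naming model above can be sketched with two small record types. This is an illustration under assumptions: the Python class and field names are hypothetical, the English form "Joseph" is assumed, and attaching the two passage references to the #Joseph2.1 naming relation is one reading of the diagram rather than something it states.

```python
from dataclasses import dataclass, field


@dataclass
class Appellation:
    """A name as an instance: facts about the name are kept in one place."""
    uri: str
    forms: dict = field(default_factory=dict)  # language tag -> string form
    phonetic: str = ""


@dataclass
class NamingRelation:
    """Reified link between an individual and an Appellation."""
    individual: str
    appellation: Appellation
    references: list = field(default_factory=list)  # passages with mentions


def names_of(individual, relations, lang="en"):
    """List an individual's names in one language, via its NamingRelations."""
    return [
        rel.appellation.forms[lang]
        for rel in relations
        if rel.individual == individual and lang in rel.appellation.forms
    ]


name_barnabas = Appellation(
    "#NameOfBarnabas",
    {"en": "Barnabas", "el": "Βαρναβᾶς", "es": "Bernabé"},
)
name_joseph = Appellation("#NameOfJoseph", {"en": "Joseph"})  # assumed form

RELATIONS = [
    NamingRelation("#Barnabas", name_barnabas),
    NamingRelation("#Barnabas", name_joseph),
    # Which relation carries these references is an assumption.
    NamingRelation("#Joseph2.1", name_joseph, ["bible.61.1.16", "bible.61.1.18"]),
]
```

Because two individuals share the one `#NameOfJoseph` instance, facts about the name itself are recorded once, while mentions stay with each naming relation.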
    26. Challenge: Converting Legacy Data
      • Strategy: use XSL to generate RDF matching the ontology
        • Legacy XML data organized by name and by person
        • Generate reified relations from simple ones
          • Lookup table for reified inverse properties (but kb query would be cleaner)
        • Both sides of family relationships are defined independently
      • URI Naming
        • Map different XML names to a single URI
        • Generate shared URIs for reified relations like #<personURI>_<relation>_<personURI>
        • RDF merging connects them in the kb
      • Why not owl:sameAs?
        • Additional complexity but no practical benefit for internal-only data
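The URI strategy above can be sketched in Python (the slides describe doing this in XSL, so this is a stand-in, not the actual conversion). The lookup table contents, function names, and property directions are assumptions. The point is that each side of a family relationship, converted independently, yields the same reified URI, so a plain set union merges them.

```python
# Hypothetical lookup table: simple property -> (relation name, subject is parent?)
REIFIED = {
    "isFatherOf": ("parent", True),
    "hasFather": ("parent", False),
}


def reify(subject, prop, obj):
    """Expand one simple fact into reified triples with a shared URI."""
    relation, subject_is_parent = REIFIED[prop]
    parent, child = (subject, obj) if subject_is_parent else (obj, subject)
    # Pattern from the slide: #<personURI>_<relation>_<personURI>
    uri = f"#{parent}_{relation}_{child}"
    return {
        (f"#{parent}", "isFatherOf", uri),
        (uri, "hasFather", f"#{parent}"),
        (f"#{child}", "isSonOf", uri),
        (uri, "hasSon", f"#{child}"),
    }


def merge(*graphs):
    """Merge triple sets; identical triples collapse, as in an RDF merge."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


# The father's record and the son's record are converted independently.
from_father = reify("John.3", "isFatherOf", "Peter")
from_son = reify("Peter", "hasFather", "John.3")
kb = merge(from_father, from_son)
```

Since both records generate `#John.3_parent_Peter`, the merged graph holds four triples rather than eight, with no need for owl:sameAs.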
    27. Converting Legacy Data (2)
      • Other OWL data with different URIs and non-reified relations
        • Map entities to common URIs (shared across both legacy datasets)
        • Adopt same URI construction principles
        • Expand out reified relations
        • RDF merge in the kb
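Mapping entities to common URIs can be sketched as a rewrite pass over triples before the merge. The `ntn:` identifiers below are hypothetical placeholders, not actual NTNames URIs, and the function name is illustrative.

```python
def apply_merge_map(triples, merge_map):
    """Rewrite subjects and objects to common URIs; leave unmapped terms alone."""
    return {
        (merge_map.get(s, s), p, merge_map.get(o, o))
        for s, p, o in triples
    }


# Hypothetical source URIs mapped onto the shared BK-style URIs.
MERGE_MAP = {
    "ntn:SimonPeter": "#Peter",
    "ntn:JohnFatherOfPeter": "#John.3",
}

other_data = {("ntn:JohnFatherOfPeter", "isFatherOf", "ntn:SimonPeter")}
mapped = apply_merge_map(other_data, MERGE_MAP)
```

After this pass the non-reified relation uses the same entity URIs as the rest of the data, so it can be expanded into reified form and merged like any other.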
    28. Legacy Data Pipeline
      [Diagram: Biblical People XML (Aaron.xml, …) passes through XSL to become RDF (Aaron.rdf, …); NTNames (OWL) and other data (OWL) pass through XSL with a BK-NTNames merge map; a Loader combines the results with the BK ontology, which then supports:]
      • Query (SeRQL, SPARQL)
      • Extract
      • API
      • Web service
    29. Challenge: Maintenance and Extension
      • How to lower the skill threshold for extending the data?
      • Approach:
        • Distinguish different operations
          • Adding new instances of relationships (easy)
          • Adding non-relational attributes (easy)
          • Adding new instances of basic entities (a little harder)
          • Fixing bad data, extending the ontology (hard)
        • Get the core entities right first (enables #1)
        • Develop specialized tools that
          • Are constrained in scope
          • Provide simple choices
          • Hide complications (like reification)
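A specialized tool of this kind might look like the following sketch: the editor picks from a small menu and supplies a passage reference, and the helper builds the reified structure behind the scenes. The menu entries, class, and function names are hypothetical, and this is not a description of any actual Logos tool.

```python
# Hypothetical menu: label -> (relation name, role of first arg, role of second arg)
CHOICES = {
    "father of": ("parent", "isFatherOf", "isSonOf"),
}


class KnowledgeBase:
    """A toy store: known entity URIs plus a set of triples."""

    def __init__(self, entities):
        self.entities = set(entities)
        self.triples = set()


def add_relationship(kb, first, choice, second, reference):
    """Add a relationship between existing entities, hiding the reification."""
    if choice not in CHOICES:
        raise ValueError(f"unknown relationship: {choice!r}")
    for entity in (first, second):
        if entity not in kb.entities:
            raise ValueError(f"unknown entity: {entity!r}")
    relation, first_role, second_role = CHOICES[choice]
    uri = f"{first}_{relation}_{second.lstrip('#')}"
    kb.triples |= {
        (first, first_role, uri),
        (second, second_role, uri),
        (uri, "reference", reference),
    }
    return uri


kb = KnowledgeBase({"#John.3", "#Peter"})
uri = add_relationship(kb, "#John.3", "father of", "#Peter", "bible.64.21.15")
```

Rejecting unknown entities keeps the "easy" operation (adding relationship instances) separate from the harder one (adding new basic entities), which matches the distinction drawn on this slide.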
    30. How Do We Deliver Semantics?
      • Part of a consumer software application: not on the open web
      • Not practical to ship an RDF store
      • Likely: combination of
        • Some static results shipped with product
        • Some web service support for dynamic information
        • A web portal with richer search capabilities
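One way the "static results" option could work is to precompute lookups from the knowledgebase and ship them as a data file. The sketch below builds a passage-to-entities index and serializes it as JSON; the input pairs, function names, and the choice of JSON are all assumptions for illustration.

```python
import json
from collections import defaultdict


def passage_index(mentions):
    """Group entity URIs by the passage reference that mentions them."""
    index = defaultdict(set)
    for entity, passage in mentions:
        index[passage].add(entity)
    return {passage: sorted(entities) for passage, entities in index.items()}


def export_static(mentions):
    """Serialize the index so it can ship with the product as a static file."""
    return json.dumps(passage_index(mentions), sort_keys=True, indent=2)


# Illustrative (entity, passage) pairs, as might be extracted from the kb.
MENTIONS = [
    ("#Peter", "bible.64.21.15"),
    ("#John.3", "bible.64.21.15"),
    ("#Peter", "bible.64.21.16"),
]

static_json = export_static(MENTIONS)
```

A desktop client could load such a file to show relevant entities for a passage without an RDF store, leaving richer queries to a web service.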
    31. Open Architecture Issues
      • Visualization
        • Likely: custom MFC
      • End-user query
        • Likely: at most, templated queries
      • Reasoning
        • Necessary, but …
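Templated end-user queries could be sketched with Python's `string.Template`: the user supplies only an entity name, which is validated before being placed into a fixed SPARQL pattern. The namespace `http://example.org/bk#`, the template name, and the query shape (following the reified father/son properties) are assumptions.

```python
import re
from string import Template

# Only simple local names may be substituted into a query.
SAFE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9.]*")

# Hypothetical query templates; the user never writes SPARQL directly.
TEMPLATES = {
    "children_of": Template(
        "SELECT ?child WHERE { "
        "<${ns}${person}> <${ns}isFatherOf> ?rel . "
        "?rel <${ns}hasSon> ?child . }"
    ),
}


def render(name, person, ns="http://example.org/bk#"):
    """Fill a named template after validating the user-supplied value."""
    if not SAFE_NAME.fullmatch(person):
        raise ValueError(f"unsafe entity name: {person!r}")
    return TEMPLATES[name].substitute(ns=ns, person=person)


query = render("children_of", "John.3")
```

The template also hides the two-step path through the reified relationship, so the end user asks a simple question while the query handles the extra hop.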
    32. Future Extensions to BK
      • Place names and related properties
      • Brief descriptions for entities
      • Place people in Biblical eras
      • Narrative role (greetings in epistles, scene participants, background)
      • Key events from narratives
      • Concepts
      • Unnamed things (descriptions, pronouns)
      • Headwords and lexical relationships
    33. References
      • Weaving the New Testament into the Semantic Web, http://semanticbible.org/other/presentations/2006-sbl/Weaving.xhtml
      • Suggested Upper Merged Ontology (SUMO), http://www.ontologyportal.org/
      • Defining N-ary Relations on the Semantic Web, http://www.w3.org/TR/swbp-n-aryRelations


    Presented May 24, 2007 at the Semantic Technology C…


    © All Rights Reserved
