AURA Wiki - Knowledge Acquisition with a Semantic Wiki Application



Constructing an AI knowledge base requires decomposing complex sentences into simplified statements with encoding concepts. Because knowledge engineering is costly and complex, we ran an experiment to test whether college students could do this task using a semantic wiki. The wiki also tracked the progress of each student and provided an integrated environment for our knowledge workers.

In this presentation we will discuss the layout of the imported data within the wiki, the user experience throughout the publishing process, the underlying technologies behind the wiki app, and the preliminary results of the experiment. The semantic wiki web application included the following technologies:

• Semantic MediaWiki Plus, which provides an object-oriented framework for semi-structured data.
• JavaScript, HTML5, and AJAX service-based graphing of triples and entities within the project and for interconnected services.
• Faceted browsing and semantic pivoting among related entities: textbook paragraphs, sentences, concepts, and sentence encodings.
• Virtuoso integration with the knowledge base.

Published in: Technology, Education

  • Hello, looking over the program I’m aware this is a pretty competitive hour for talks… we’re doing this right after lunch… going against a Google talk… and with a cryptic title about artificial intelligence engines and a semantic media wiki installation.
  • This talk covers an experiment we ran in the last six months of 2012. The experiment involves populating a symbolic AI and our approach to lowering the cost of encoding a textbook into the knowledge base. We’re going to expand on the process for adding new data to the knowledge base, and on our attempt to lower the cost structure by having domain experts use an installation of Semantic MediaWiki created specifically to populate AURA.
  • So let’s begin with AURA. AURA itself is pretty large, so I chose one screenshot to include on one slide. This isn’t even a screenshot of AURA doing anything beyond one screen used for populating the knowledge base and debugging a question into an explanation via concept maps. This screen quickly became a major choke point for populating the concept maps that make up the underlying knowledge base. In fact, it got exponentially more expensive and time consuming to add new concepts and relations to AURA as more chapters were encoded. This is a good screenshot because you see AURA failing to answer a question because it needs more data encoded. Looking at the third arrow, AURA is saying that the group of CMaps needed to answer the question “What are the parts of the Eukaryotic Cell” does not exist. So it’s time to start the process for adding these concept maps from the textbook…
  • A process that looks roughly like this… I don’t want to dwell too long on all the steps shown here, but as you can see it’s quite an extensive process to add even trivial data to the knowledge base. This is the work process of several groups, from Knowledge Engineers to SRI research groups to biologists and teachers. When project management was asked which step needed the most focus to speed up data population, it came down to number 2… actually, the first part of #2…
  • We cared about this step. Authoring the “Universal Truth” portion of this process was time consuming, expensive, and getting more difficult as the knowledge base grew. It required trained biologists, trained educators who were familiar with the source text, and the knowledge engineering team focused on hiring individuals who could be trained to understand “how” to encode these universal truths. A large part of the experiment was dedicated to training students to recognize a universal truth and to derive universal truths from source sentences. We also specifically created work paths within our Semantic MediaWiki installation to aid in recognizing and constructing Universal Truths.
  • … and that wasn’t an easy task due to the nature of a “Universal Truth”. - Read definition – So easy enough to understand? I chose a sentence from Wikipedia to demonstrate just how easy this task can get – Read sentence – Any guesses on how many universal truths lie in that sentence? Well, just at a glance I found 5, and the last one is probably not valid, since it is composed of two truths, that water has a chemical formula and that H2O is a chemical formula, plus a statement connecting water to H2O.
  • With all of that in mind and facing a pretty significant problem adding more content to AURA, we devised an experiment with the explicit intent to outsource universal truth authoring to the greatest number of domain experts. This is our “bullet list of pain” thinly veiled as “project goals”…
  • And finally… with our simple problem complete with simple project goals, we decided on the easiest group of people in the world to schedule – College Students. -- read points –
      • Students attending University of Washington or recent graduates
      • All have a background in biology or life sciences
      • Native English speakers with excellent writing skills
      • Each student read the chapters in question and was provided with an iPad running the Inquire application
      • Students were paid for their time
  • We developed Aura Wiki as a portal for annotating a textbook with Universal Truths, designed to build on each aspect of the project, assuming the students pass the current project goal (i.e., one painful bullet point). Here is an example of the entry point to the wiki functioning as a portal, and an early version of the UT authoring page at the sentence level.
  • We also decided to take on the task of storing and marking up the entire textbook with semantic entities. First we began at the top level, importing standard table of contents data into a set of wiki pages marked by category – read top section point – Then we added the markup, including glossaries, a taxonomy of existing concepts imported from AURA, and existing universal truths imported from the current system as examples.
  • Frequently deemed the ugliest - and most common - page on the website, it quickly became the focal point for UI/UX improvements as we realized it wasn’t really plausible to provide random sentences to users for UT annotation. These pages were originally created as background pages for tracking textbook properties and were not intended to be navigational elements. However, users would often leave the UT authoring page soon after creating their first set of annotations and navigate to the actual textbook table of contents pages, generating these criticisms…
  • Once the import was complete and we added the annotation pages, this was the site map structure that emerged.
      • Where we intended the users to stay and focus
      • Everything the users found and decided to use
      • A proposed review system for moderators / trusted users
      • Removed to Google Analytics
  • -- Add arrows and explain turning on –
      • First we had our import sources and the addition of knowledge engineering UTs, including marking up pages with additional semantic properties.
      • The data was normalized for wiki presentation and queries.
      • The wiki portions of AURA wiki and the import agents to create the textbook pages.
      • Finally, the export and sync agents to push/pull UTs to/from AURA.
  • After all of the importing, normalization, alignment of wiki semantic properties to AURA’s ontology, and addition of pre-existing Universal Truths, we ended up with a sentence annotation page that looks like this. On this page you can … - read slides –
      • Read Sentence
      • Access Sentence Context
      • Access Neighboring Sentences
      • Check & Submit Relevancy
      • Check & Submit Authoring Status
      • Display Existing Universal Truths
      • Author Universal Truths
    And on closer inspection…
  • Here is the expanded view of the context surrounding a sentence available for UT annotation.
      • Each page has a unique id for the table of contents element
      • The sentence itself is an element
      • Elements pointing to the previous and next sentences
      • Elements pointing to top level entities
      • Users can update the sentence’s relevancy and encoding status
  • Each sentence has a collection of universal truths, each represented by a wiki page, that are created inline on the sentence page. On this page you’re viewing the expanded editing pane for adding a universal truth, including:
      • The listing of existing universal truths applied to the sentence
      • The UT authoring block
      • Two autocomplete boxes for applying additional semantic properties to the universal truth
  • Semantic properties of a universal truth page:
      • Reference sentence
      • The universal truth text
      • UT concept – AURA provided
      • UT context – AURA provided
      • Accuracy rating for the universal truth
      • Date created, approved, and when ratings were applied
  • Open questions:
      • How do we show progress?
      • How do we show community contributors?
      • How do we focus members on a specific chapter or sentence?
      • How do we train users in what a universal truth entails? – Guided Tutorial
      • There were several requests for unique MediaWiki extensions
  • Our original text view needed to be expanded to add context for authoring. -- 4 clicks -- The problem is that this made pages very long, so authoring UTs required a lot of scrolling up and down the page in our original format.
  • These pages were created behind the scenes by the UT inline authoring component, and there was a huge debate on whether they should be visible to users. While they are important to the wiki for queries, moderating universal truths, and exporting semantic properties, the operations provided by default wiki pages conflicted with some of our original assumptions. -- 4 clicks --
  • As with the second proposal, it soon became obvious that people couldn’t moderate a universal truth without the full context of a paragraph and possibly even an entire textbook section. This meant we had to remove the ability to approve and deny universal truths across sentences and focus on the annotations per sentence.
  • Results of the student test:
      • 6 University of Washington Students participated in the test
      • Each received 45 minutes of training on creating Universal Truths
      • Each was given 1 hour and a pre-selected list of sentences on a user page to complete
      • The groups generated over 100 Universal Truths each session
      • They averaged 37 Universal Truths an hour per student
      • Students were frequently observed using their domain experience to construct UTs not specifically worded in the source sentence
  • Inquire is a complex iPad application, and I chose one wireframe to put on one slide. You’re looking at Inquire displaying the online textbook portion of AURA.
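
The sentence and Universal Truth pages described in these notes can be pictured as two small record types. The following Python sketch is only an illustration under stated assumptions: the class names, field names, property labels, and the page id `Sentence-0001` are hypothetical and are not taken from the actual Aura Wiki installation. The only externally grounded piece is the `[[Property::value]]` form, which is Semantic MediaWiki's inline annotation syntax. The example sentence and its five UTs come from the "What is a Universal Truth?" slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UniversalTruth:
    """One UT wiki page. Field names are illustrative, not Aura Wiki's real properties."""
    reference_sentence: str                 # page id of the source sentence
    text: str                               # the universal truth itself
    concept: Optional[str] = None           # UT concept (provided by AURA)
    context: Optional[str] = None           # UT context (provided by AURA)
    accuracy_rating: Optional[int] = None   # rating applied by reviewers

    def to_smw_annotations(self) -> str:
        """Render every field that is set as a Semantic MediaWiki inline annotation."""
        pairs = [
            ("Reference sentence", self.reference_sentence),
            ("UT text", self.text),
            ("UT concept", self.concept),
            ("UT context", self.context),
            ("Accuracy rating", self.accuracy_rating),
        ]
        return "\n".join(
            f"[[{name}::{value}]]" for name, value in pairs if value is not None
        )


@dataclass
class SentencePage:
    """One sentence page with links to its neighbors and its inline-authored UTs."""
    page_id: str
    text: str
    previous_id: Optional[str] = None
    next_id: Optional[str] = None
    relevant: Optional[bool] = None
    truths: List[UniversalTruth] = field(default_factory=list)

    def add_truth(self, text: str) -> UniversalTruth:
        """Create a UT that points back to this sentence and attach it."""
        ut = UniversalTruth(reference_sentence=self.page_id, text=text)
        self.truths.append(ut)
        return ut


# The example sentence and its five UTs from the talk; the page id is made up.
sentence = SentencePage(
    page_id="Sentence-0001",
    text=("Water is composed of two Hydrogen element molecules and one Oxygen "
          "element molecule with the chemical formula H2O"),
)
for ut_text in [
    "Water is composed of hydrogen",
    "Water is composed of oxygen",
    "Hydrogen is an element",
    "Oxygen is an element",
    "Water has the chemical formula H2O",
]:
    sentence.add_truth(ut_text)

print(sentence.truths[0].to_smw_annotations())
```

Keeping each UT as its own record that points back to its sentence mirrors the notes' description of UTs as separate wiki pages created inline on the sentence page, which is what makes them queryable and exportable on their own.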

    1. 1. AUTHORING WITH AURA WIKI - SemTechBiz 2013, San Francisco
    2. 2. Today we will be talking about… • Populating a Symbolic AI – Aura • The spiraling cost structure for encoding data into a symbolic AI • How do we bring low cost domain experts into the process? • Creating a Semantic MediaWiki Installation • Importing a textbook into Semantic MediaWiki and marking up pages with properties • Customizing the installation for annotating textbook sentences
    3. 3. AURA
    4. 4. 1) Determining Relevance -- 2% time: Highlighting, Diagram Analysis; QA Check; Status Labeling: Relevant, Irrelevant (Closed) • 2) Reaching Consensus -- 14% time: Universal Truth Authoring, Concept Chosen; QA Check • 3) Encoding Planning -- 35% time: Group Common UTs, ID KR/KE Issues, ID Already Encoded, Write How to Encode; Pre-Planning, QA Check; Status Labeling: Encoding Complete, KR Issue (Closed) • 4) Encoding -- 10% time: Encode, File JIRA Issues; QA Check; Status Labeling: Encoding Complete, KE Issue • 5) Key Term Review -- 25% time: KR Evaluated by Modeling Expert and Biologist, Encoder Makes Changes; KR Evaluated by Modeling Expert and Biologist; QA Check • 6) Question-Based Testing -- 14% time: Use Minimal Test Suite, Reasoning JIRA Issues Filed, Encoder Fills KB Gaps; QA Check with Screenshots of “Passing” Comparison and Relationship Questions
    5. 5. -- How to choose a concept given a UT? -- How to produce UTs from sentences? [Diagram: Book, Chapter, Sentence, UT, Chapter UT, CMap, KB] 2) Reaching Consensus -- 14% time: Universal Truth Authoring, Concept Chosen
    6. 6. What is a Universal Truth? • “A Universal Truth is a stand-alone, unambiguous declarative sentence about a textbook topic that expresses a single fact that is universally true” - AURA Knowledge Engineering Manual • “Water is composed of two Hydrogen element molecules and one Oxygen element molecule with the chemical formula H2O” • Water is composed of hydrogen • Water is composed of oxygen • Hydrogen is an element • Oxygen is an element • Water has the chemical formula H2O • Does: “Water is a compound” count?
    7. 7. Project Goals • “Crowd Source Universal Truth Authoring” • Can Domain Experts Author Useful Universal Truths? • Can We Speed Up Encoding a Textbook with Input from Domain Experts? • Can We Create a UT Authoring Portal for Multiple Textbooks? • Can Existing Social Networks Provide Domain Experts Capable of UT Authoring? • Could Gamification be Applied to An Existing Portal to Add Non-Domain Experts?
    8. 8. About the Domain Experts • Students attending University of Washington or recent graduates • All have a background in biology or life sciences • Native English speakers with excellent writing skills • Each student read the chapters in question and was provided with an iPad running the Inquire application • Students were paid for their time
    9. 9. A Semantic MediaWiki Portal
    10. 10. Storing a Text Book in Aura Wiki • The wiki was created with instances of page types composed of textbook sentences • Sentence • Paragraph • Section • Chapter • Book • The wiki also has imported resources to aid in the UT authoring process • Glossary Pages • Taxonomy Concepts • Universal Truths – Human and Machine
    11. 11. Navigating Aura Wiki • Where’s the next sentence?
    12. 12. Navigating Aura Wiki
    13. 13. Authoring Universal Truths • Components: • Read Sentence • Access Sentence Context • Access Neighboring Sentences • Check & Submit Relevancy • Check & Submit Authoring Status • Display Existing Universal Truths • Author Universal Truths
    14. 14. Authoring Universal Truths • Semantic Wiki Properties • Each page has a unique id for the table of contents element • The sentence itself is an element • Elements pointing to the previous and next sentences. • Elements pointing to top level entities • Users can update the sentence’s relevancy and encoding status. • Sentence and Context View
    15. 15. Authoring Universal Truths • Input form for new UT. First two inputs are required.
    16. 16. Authoring Universal Truths • Semantic Wiki Properties • Reference sentence • The universal truth text • UT concept – AURA provided • UT context – AURA provided • Accuracy rating for the universal truth • Date created, approved, and when ratings were applied • Universal Truth
    17. 17. PROPOSALS - User Experience Review
    18. 18. Navigating Aura Wiki • Unregistered and Registered Main Pages • Unregistered users are locked out • Registration is turned off for anonymous users • Unique Extensions Proposed for Guided Authoring
    19. 19. How to View a Textbook Paragraph? • Auto create triple format UTs from sentence?
    20. 20. How to View a Universal Truth Page? • How do we unify versions of the page for export to AURA?
    21. 21. Knowledge Engineer Editing
    22. 22. Knowledge Engineer Editing
    23. 23. STUDENT REVIEW - Can Experts Author Universal Truths?
    24. 24. Domain Expert Authoring Statistics • 6 University of Washington Students participated in the test • Each received 45 minutes of training on creating Universal Truths • Each was given 1 hour and a pre-selected list of sentences on a user page to complete • The groups generated over 100 Universal Truths each session • They averaged 37 Universal Truths an hour per student • Students were frequently observed using their domain experience to construct UTs not specifically worded in the source sentence (i.e., “Water is a compound”)
    25. 25. CONCLUSION
    26. 26. Project Goals • “Crowd Source Universal Truth Authoring” • Can Domain Experts Author Useful Universal Truths? • Can We Speed Up Encoding a Textbook with Input from Domain Experts?
    27. 27. Project Goals • “Crowd Source Universal Truth Authoring” • Can We Create a UT Authoring Portal for Multiple Textbooks?
    28. 28. Project Goals • “Crowd Source Universal Truth Authoring” • Can Existing Social Networks Provide Domain Experts Capable of UT Authoring? • Could Gamification be Applied to An Existing Portal to Add Non-Domain Experts?
    30. 30. THANK YOU (clap now)
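
As a quick check on the process slide (slide 4), the per-step time shares can be tabulated and ranked. This is a small sketch: the percentages are copied from the slide, while the dictionary name `TIME_SHARE` and the variable names are illustrative choices.

```python
# Share of total process time per step, as listed on slide 4 (percent).
TIME_SHARE = {
    "1) Determining Relevance": 2,
    "2) Reaching Consensus": 14,
    "3) Encoding Planning": 35,
    "4) Encoding": 10,
    "5) Key Term Review": 25,
    "6) Question-Based Testing": 14,
}

# The six shares should account for the whole process.
total = sum(TIME_SHARE.values())

# Rank steps from most to least time consuming.
ranked = sorted(TIME_SHARE.items(), key=lambda item: item[1], reverse=True)

for step, percent in ranked:
    print(f"{percent:>3}%  {step}")
print(f"{total:>3}%  total")
```

The shares sum to 100%. Step 2, where Universal Truth authoring happens, accounts for 14% of the time on this slide; the talk singles it out because project management identified it as the step to focus on for speeding up data population, not because it is the largest share.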