AUTHORING WITH
AURA WIKI
SemTechBiz 2013, San Francisco
Today we will be talking about…
• Populating a Symbolic AI – Aura
• The spiraling cost structure for encoding data into
a symbolic AI
• How do we bring low-cost domain experts into the
process?
• Creating a Semantic MediaWiki Installation
• Importing a textbook into Semantic MediaWiki
and marking up pages with properties
• Customizing the installation for annotating
textbook sentences
AURA
1) Determining Relevance -- 2% time
Highlighting, Diagram Analysis
QA Check
Status Labeling: Relevant, Irrelevant (Closed)
2) Reaching Consensus -- 14% time
Universal Truth Authoring, Concept Chosen
QA Check
3) Encoding Planning -- 35% time
Group Common UTs, ID KR/KE Issues,
ID Already Encoded, Write How to Encode
Pre-Planning, QA Check
Status Labeling: Encoding Complete, KR Issue (Closed)
4) Encoding -- 10% time
Encode, File JIRA Issues
QA Check
Status Labeling: Encoding Complete, KE Issue
5) Key Term Review -- 25% time
KR Evaluated by Modeling Expert and Biologist,
Encoder Makes Changes
KR Evaluated by Modeling Expert and Biologist
QA Check
6) Question-Based Testing -- 14% time
Use Minimal Test Suite, Reasoning JIRA Issues Filed,
Encoder Fills KB Gaps
QA Check with Screenshots of “Passing” Comparison
and Relationship Questions
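The time shares on this slide can be tabulated to see which steps dominate the workflow. The following is only a small sketch: the percentages are copied from the slide, while the names `STEP_SHARE` and `ranked_steps` are illustrative and not part of AURA.

```python
# Share of total effort per workflow step, as listed on the slide.
STEP_SHARE = {
    "Determining Relevance": 2,
    "Reaching Consensus": 14,
    "Encoding Planning": 35,
    "Encoding": 10,
    "Key Term Review": 25,
    "Question-Based Testing": 14,
}


def ranked_steps(shares):
    """Return (step, percent) pairs sorted from most to least time-consuming."""
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


total = sum(STEP_SHARE.values())
for step, percent in ranked_steps(STEP_SHARE):
    print(f"{step:<24}{percent:>3}%")
print(f"{'Total':<24}{total:>3}%")
```

The shares add up to 100%, and Reaching Consensus, the step this talk targets, accounts for 14% of the effort.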
-- How to choose a concept given a UT?
-- How to produce UTs from sentences?
[Diagram labels: Book, Chapter, Sentence, UT, Chapter UT, CMap, KB]
2) Reaching Consensus -- 14% time
Universal Truth Authoring, Concept Chosen
What is a Universal Truth?
• “A Universal Truth is a stand-alone, unambiguous
declarative sentence about a textbook topic that
expresses a single fact that is universally true”
- AURA Knowledge Engineering Manual
• “Water is composed of two Hydrogen element molecules
and one Oxygen element molecule with the chemical
formula H2O”
• Water is composed of hydrogen
• Water is composed of oxygen
• Hydrogen is an element
• Oxygen is an element
• Water has the chemical formula H2O
• Does: “Water is a compound” count?
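One way to picture the decomposition above is as a source sentence that owns a list of candidate Universal Truths. This is a minimal sketch, not AURA's data model: the class `SourceSentence` and its `add_ut` method are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SourceSentence:
    """A textbook sentence and the Universal Truths (UTs) derived from it."""
    text: str
    universal_truths: list = field(default_factory=list)

    def add_ut(self, ut_text):
        """Attach one candidate UT, rejecting blanks and skipping exact duplicates."""
        cleaned = ut_text.strip()
        if not cleaned:
            raise ValueError("a Universal Truth cannot be empty")
        if cleaned not in self.universal_truths:
            self.universal_truths.append(cleaned)


sentence = SourceSentence(
    "Water is composed of two Hydrogen element molecules and one Oxygen "
    "element molecule with the chemical formula H2O"
)
for ut in [
    "Water is composed of hydrogen",
    "Water is composed of oxygen",
    "Hydrogen is an element",
    "Oxygen is an element",
    "Water has the chemical formula H2O",
]:
    sentence.add_ut(ut)

print(len(sentence.universal_truths))
```

Whether an inferred statement such as “Water is a compound” belongs in the list is a judgment call that the structure itself does not answer.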
Project Goals
• “Crowd Source Universal Truth
Authoring”
• Can Domain Experts Author Useful Universal
Truths?
• Can We Speed Up Encoding a Textbook with Input
from Domain Experts?
• Can We Create a UT Authoring Portal for Multiple
Textbooks?
• Can Existing Social Networks Provide Domain
Experts Capable of UT Authoring?
• Could Gamification be Applied to An Existing Portal
to Add Non-Domain Experts?
About the Domain Experts
• Students attending University of Washington or
recent graduates
• All have a background in biology or life sciences
• Native English speakers with excellent writing
skills
• Each student read the chapters in question and
was provided with an iPad running the Inquire
application
• Students were paid for their time
A Semantic MediaWiki Portal
Storing a Textbook in Aura Wiki
• The wiki was created with instances of page types
composed of textbook sentences
• Sentence
• Paragraph
• Section
• Chapter
• Book
• The wiki also has imported resources to aid in the UT
authoring process
• Glossary Pages
• Taxonomy Concepts
• Universal Truths – Human and Machine
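The page-type hierarchy above could be modeled by deriving each page's type from its depth in the textbook. The sketch below assumes a hypothetical title scheme such as "Sentence 1.4.2.3.5"; the actual Aura Wiki page names are not given in the slides.

```python
# Page types from the slide, ordered from the top of the hierarchy down.
PAGE_TYPES = ["Book", "Chapter", "Section", "Paragraph", "Sentence"]


def page_title(path):
    """Build a (hypothetical) wiki page title from a position path like (1, 4, 2)."""
    if not 1 <= len(path) <= len(PAGE_TYPES):
        raise ValueError("path must have between 1 and 5 components")
    page_type = PAGE_TYPES[len(path) - 1]
    return f"{page_type} {'.'.join(str(part) for part in path)}"


def parent_title(path):
    """Return the title of the enclosing page, or None for a book."""
    return page_title(path[:-1]) if len(path) > 1 else None


print(page_title((1, 4, 2, 3, 5)))
print(parent_title((1, 4, 2, 3, 5)))
```

Deriving the type from the path length keeps every sentence traceable to its paragraph, section, chapter, and book without storing the type separately.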
Navigating Aura Wiki
Where’s
the next
sentence?
Navigating Aura Wiki
Authoring Universal Truths
• Components :
• Read Sentence
• Access Sentence Context
• Access Neighboring
Sentences
• Check & Submit Relevancy
• Check & Submit Authoring
Status
• Display Existing Universal
Truths
• Author Universal Truths
Authoring Universal Truths
• Semantic Wiki Properties
• Each page has a unique id
for the table of contents
element
• The sentence itself is an
element
• Elements pointing to the
previous and next
sentences.
• Elements pointing to top
level entities
• Users can update the
sentence's relevance and
encoding status.
Sentence and Context View
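The properties listed for a sentence page map naturally onto Semantic MediaWiki's in-text annotation syntax, `[[Property name::value]]`. The sketch below generates such wikitext; the property names ("TOC id", "Previous sentence", and so on) and the status value are assumptions made for illustration, not the names used in Aura Wiki.

```python
def sentence_annotations(toc_id, text, chapter, relevance, status,
                         previous_page=None, next_page=None):
    """Return Semantic MediaWiki annotations for one sentence page.

    Property names here are hypothetical; only the [[name::value]]
    and [[Category:...]] syntax comes from Semantic MediaWiki itself.
    """
    pairs = [
        ("TOC id", toc_id),
        ("Sentence text", text),
        ("Previous sentence", previous_page),
        ("Next sentence", next_page),
        ("Part of chapter", chapter),
        ("Relevance", relevance),
        ("Encoding status", status),
    ]
    # The first and last sentences have no previous or next neighbor.
    lines = [f"[[{name}::{value}]]" for name, value in pairs if value is not None]
    lines.append("[[Category:Sentence]]")
    return "\n".join(lines)


wikitext = sentence_annotations(
    toc_id="1.4.2.3.5",
    text="Water is a compound.",
    chapter="Chapter 1.4",
    relevance="Relevant",
    status="Not started",
    next_page="Sentence 1.4.2.3.6",
)
print(wikitext)
```

Storing the previous and next links as properties is what lets a page answer "where's the next sentence?" with a query rather than manual navigation.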
Authoring Universal Truths
Input form for new UT.
First two inputs are
required.
Authoring Universal Truths
• Semantic Wiki Properties
• Reference sentence
• The universal truth text
• UT concept – AURA provided
• UT context – AURA provided
• Accuracy rating for the universal
truth
• Date created, approved, and
when ratings were applied
Universal Truth
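A Universal Truth page with these properties could be written with Semantic MediaWiki's `#set` parser function and listed on a sentence page with an `#ask` inline query. The sketch below only builds the wikitext strings; the property names, category name, and sample values are assumptions, and the helper names `set_call` and `ask_uts_for` are illustrative.

```python
def set_call(properties):
    """Render a {{#set: ...}} call that stores properties on the current page."""
    pairs = " | ".join(f"{name}={value}" for name, value in properties.items())
    return "{{#set: " + pairs + " }}"


def ask_uts_for(sentence_page):
    """Render an {{#ask: ...}} query listing the UTs that reference a sentence."""
    return (
        "{{#ask: [[Category:Universal Truth]] "
        f"[[Reference sentence::{sentence_page}]] "
        "|?UT text |?Accuracy rating |format=table }}"
    )


ut_page = set_call({
    "Reference sentence": "Sentence 1.4.2.3.5",  # hypothetical page title
    "UT text": "Water is a compound",
    "Date created": "2012-10-01",                # illustrative value only
})
print(ut_page)
print(ask_uts_for("Sentence 1.4.2.3.5"))
```

Keeping each UT on its own page means ratings and approval dates can be queried and exported independently of the sentence that produced it.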
PROPOSALS
User Experience Review
Navigating Aura Wiki
• Unregistered and Registered Main Pages
• Unregistered users are locked out
• Registration is turned off for anonymous users
• Unique Extensions Proposed for Guided Authoring
How to View a Textbook Paragraph?
Auto create triple
format UTs from
sentence?
How to View a Universal Truth Page?
How do we unify
versions of the
page for export
to AURA?
Knowledge Engineer Editing
Knowledge Engineer Editing
STUDENT REVIEW
Can Experts Author Universal Truths?
Domain Expert Authoring Statistics
• 6 University of Washington Students participated in the
test
• Each received 45 minutes of training on creating
Universal Truths
• Each was given 1 hour and a pre-selected list of
sentences on a user page to complete
• The groups generated over 100 Universal Truths in each
session
• They averaged 37 Universal Truths an hour per student
• Students were frequently observed using their domain
experience to construct UTs not specifically worded in the
source sentence (e.g., “Water is a compound”)
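The reported rate can be turned into a rough per-UT time and a combined output figure. This is back-of-the-envelope arithmetic only: it assumes every one of the six students authored at the average rate for a full hour, which the slides do not state.

```python
STUDENTS = 6              # from the slide
UTS_PER_STUDENT_HOUR = 37  # average from the slide

# Average time one student spent per Universal Truth.
minutes_per_ut = 60 / UTS_PER_STUDENT_HOUR

# Assumption: all six students worked one full hour at the average rate.
combined_uts = STUDENTS * UTS_PER_STUDENT_HOUR

print(f"{minutes_per_ut:.2f} minutes per UT")
print(f"{combined_uts} UTs across all students")
```

At that pace a student produces a Universal Truth roughly every minute and a half.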
CONCLUSION
Project Goals
• “Crowd Source Universal Truth
Authoring”
• Can Domain Experts Author Useful Universal
Truths?
• Can We Speed Up Encoding a Textbook with
Input from Domain Experts?
Project Goals
• “Crowd Source Universal Truth
Authoring”
• Can We Create a UT Authoring Portal for
Multiple Textbooks?
Project Goals
• “Crowd Source Universal Truth
Authoring”
• Can Existing Social Networks Provide Domain
Experts Capable of UT Authoring?
• Could Gamification be Applied to An Existing
Portal to Add Non-Domain Experts?
QUESTIONS?
COMMENTS?
THANK YOU
(clap now)

AURA Wiki - Knowledge Acquisition with a Semantic Wiki Application


Editor's Notes

  • #2 Hello, looking over the program I’m aware this is a pretty competitive hour for talks… we’re doing this right after lunch… going against a Google talk… and with a cryptic title about artificial intelligence engines and a Semantic MediaWiki installation.
  • #3 This talk is going to cover an experiment we ran in the last 6 months of 2012. The experiment involves a symbolic AI population program and our solution to lowering the costs associated with encoding a textbook into the Knowledge Base. We’re going to expand on the process for adding new data to the knowledge base, and on our attempt to lower the cost structure by having domain experts work in an installation of Semantic MediaWiki built specifically to populate Aura.
  • #4 So let’s begin with AURA, and AURA itself is pretty large… so I chose one screenshot to include on one slide. In fact, this isn’t even a screenshot of AURA doing anything beyond one screen used to populate the knowledge base, and debugging a question into an explanation via concept maps. This screen quickly became a major choke point when it comes to populating the concept maps that make up the underlying knowledge base. In fact, it got exponentially more expensive and time consuming to add new concepts and relations to AURA as more chapters were encoded. This is a good screenshot because you see AURA failing to answer a question because it needs more data encoded. Looking at the third arrow, AURA is saying that the group of CMaps needed to answer the question “What are the parts of the Eukaryotic Cell” does not exist. So it’s time to start the process for adding these concept maps from the textbook…
  • #5 A process that looks roughly like this… I don’t want to dwell too long on all the steps shown here, but as you can see it takes quite an extensive process to add even trivial data to the knowledge base. This is the work process of several groups, from Knowledge Engineers to SRI research groups to biologists and teachers. When project management was asked which step needed to be focused on to speed up data population, it came down to number 2… Actually the first part of #2…
  • #6 We cared about this step. Authoring the “Universal Truth” portion of this process was time consuming, expensive, and getting more difficult as the knowledge base grew. It required trained biologists, trained educators who were used to the source text, and the knowledge engineering team focused on hiring individuals who could be trained to understand “how” to encode these universal truths. A large part of the experiment was dedicated to training students in recognizing a universal truth and how to derive them from source sentences. We also specifically created work paths within our Semantic MediaWiki installation to aid in recognizing and constructing Universal Truths.
  • #7 … and that wasn’t an easy task due to the nature of a “Universal Truth”. - Read definition - So easy enough to understand? I chose a sentence from Wikipedia to demonstrate just how easy this task can get - Read sentence - Any guesses on how many universal truths lie in that sentence? Well, just at a glance I found 5, and the last one is probably not valid, since it bundles several truths: water has a chemical formula, H2O is a chemical formula, and a statement connecting water to H2O.
  • #8 With all of that in mind and facing a pretty significant problem adding more content to AURA, we devised an experiment with the explicit intent to outsource universal truth authoring to the greatest number of domain experts. This is our “bullet list of pain” thinly veiled as “project goals”…
  • #9 And finally… with our simple problem complete with simple project goals, we decided on the easiest group of people in the world to schedule: college students. -- read points -- Students attending University of Washington or recent graduates; all have a background in biology or life sciences; native English speakers with excellent writing skills; each student read the chapters in question and was provided with an iPad running the Inquire application; students were paid for their time.
  • #10 We developed Aura Wiki as a portal for annotating a textbook with Universal Truths, designed to build on each aspect of the project, assuming the students pass the current project goal (i.e., one painful bullet point). Here is an example of the entry point to the wiki functioning as a portal, and an early version of the UT authoring page at a sentence level.
  • #11 We also decided to take on the task of storing and marking up the entire textbook with semantic entities. First we began with the top level, importing standard table of contents data into a set of wiki pages marked by category - read top section point - Then we added the markup, including glossaries and a taxonomy of existing concepts imported from Aura, and we imported existing universal truths from the current system as examples.
  • #12 Frequently deemed the ugliest, and most common, page on the website, it quickly became the focal point for UIX improvements as we realized it wasn’t really plausible to provide random sentences to users for UT annotation. These pages were created originally as background pages for tracking textbook properties and were not originally intended to be navigational elements. However, users would often leave the UT authoring page soon after creating their first set of annotations and navigate to the actual textbook table of contents pages, generating these criticisms…
  • #13 Once the import was complete and we added the annotation pages, this was the site map structure that emerged: where we intended the users to stay and focus; everything the users found and decided to use; a proposed review system for moderators / trusted users; removed to Google Analytics.
  • #14 -- Add arrows and explain turning on -- First we had our import sources and the addition of knowledge engineering UTs, including marking up pages with additional semantic properties. The data was normalized for wiki presentation and queries. Then the wiki portions of AURA wiki and the import agents to create the textbook pages. Finally, the export and sync agents to push/pull UTs to/from AURA.
  • #15 After all of the importing, normalization, alignment of wiki semantic properties to AURA’s ontology, and addition of pre-existing Universal Truths, we ended up with a sentence annotation page that looks like this. On this page you can … - read slides - read the sentence; access sentence context; access neighboring sentences; check and submit relevancy; check and submit authoring status; display existing Universal Truths; author Universal Truths. And on closer inspection…
  • #16 Here is the expanded view of the context surrounding a sentence available for UT annotation. Each page has a unique id for the table of contents element. The sentence itself is an element. There are elements pointing to the previous and next sentences, and elements pointing to top level entities. Users can update the sentence's relevance and encoding status.
  • #17 Each sentence has a collection of universal truths, each represented by a wiki page, that are created inline on the sentence page. On this page you’re viewing the expanded editing pane for adding a universal truth, including: the listing of existing universal truths applied to the sentence; the UT authoring block; and two autocomplete boxes for applying additional semantic properties to the universal truth.
  • #18 Reference sentence; the universal truth text; UT concept (AURA provided); UT context (AURA provided); accuracy rating for the universal truth; date created, approved, and when ratings were applied.
  • #20 How do we show progress? How do we show community contributors? How do we focus members on a specific chapter or sentence? How do we train users in what a universal truth entails? (Guided tutorial.) There were several requests for unique MediaWiki extensions.
  • #21 Our original text view needed to be expanded to add context for authoring. -- 4 clicks -- The problem is that this made pages very long, so authoring UTs required a lot of scrolling up and down the page in our original format.
  • #22 These pages were created behind the scenes by the UT inline authoring component, and there was a huge debate on whether they should be visible to users. While these pages are important to the wiki for queries, moderating universal truths, and exporting semantic properties, the operations provided by default wiki pages conflicted with some of our original assumptions. -- 4 clicks --
  • #24 Like the second proposal it soon became obvious people couldn’t moderate a universal truth without the full context of a paragraph and possibly even an entire textbook section. This meant we had to remove the ability to approve and deny universal truths across sentences and focus on the annotations per sentence.
  • #26 6 University of Washington students participated in the test; each received 45 minutes of training on creating Universal Truths; each was given 1 hour and a pre-selected list of sentences on a user page to complete; the groups generated over 100 Universal Truths each session; they averaged 37 Universal Truths an hour per student; students were frequently observed using their domain experience to construct UTs not specifically worded in the source sentence.
  • #31 A complex iPad application, and I chose one wireframe to put on one slide. You’re looking at Inquire displaying the online textbook portion of Aura.