Constructing an AI knowledge base requires decomposing complex sentences into simplified statements and encoding their concepts. Because of the cost and complexity of professional knowledge engineering, we designed an experiment in which college students performed this task using a semantic wiki. The wiki also tracked each student's progress and provided an integrated environment for our knowledge workers.
In this presentation we will discuss the layout of the imported data within the wiki, the user experience throughout the publishing process, the underlying technologies behind the wiki application, and the preliminary results of the experiment. The semantic wiki web application included the following technologies (a query sketch follows the list):
• Semantic MediaWiki Plus, which provides an object-oriented framework for semi-structured data.
• JavaScript, HTML5, and AJAX service-based graphing of triples and entities within the project and across interconnected services.
• Faceted browsing and semantic pivoting among related entities: textbook paragraphs, sentences, concepts, and sentence encodings.
• Virtuoso integration with the knowledge base.
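To make the faceted browsing concrete, Semantic MediaWiki exposes inline queries that can pivot across these entities. A minimal sketch, assuming hypothetical category and property names such as Category:Sentence, Part of chapter, and Has encoding status (the actual Aura Wiki schema was not published):

```
{{#ask:
 [[Category:Sentence]]
 [[Part of chapter::Chapter 9]]
 [[Has encoding status::Relevant]]
 |?Has sentence text
 |?Has encoding status
 |format=table
 |limit=20
}}
```

A query like this runs against the wiki's property annotations, the same data that a Virtuoso-backed triple store would hold once those annotations are exported as RDF.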
2. Today we will be talking about…
• Populating a Symbolic AI – Aura
• The spiraling cost structure for encoding data into a symbolic AI
• How do we bring low-cost domain experts into the process?
• Creating a Semantic MediaWiki installation
• Importing a textbook into Semantic MediaWiki and marking up pages with properties
• Customizing the installation for annotating textbook sentences
4. The encoding workflow:
1) Determining Relevance -- 2% time
Highlighting, Diagram Analysis
QA Check
Status Labeling: Relevant, Irrelevant (Closed)
2) Reaching Consensus -- 14% time
Universal Truth Authoring, Concept Chosen
QA Check
3) Encoding Planning -- 35% time
Group Common UTs, ID KR/KE Issues, ID Already Encoded, Write How to Encode
Pre-Planning, QA Check
Status Labeling: Encoding Complete, KR Issue (Closed)
4) Encoding -- 10% time
Encode, File JIRA Issues
QA Check
Status Labeling: Encoding Complete, KE Issue
5) Key Term Review -- 25% time
KR Evaluated by Modeling Expert and Biologist, Encoder Makes Changes
QA Check
6) Question-Based Testing -- 14% time
Use Minimal Test Suite, Reasoning JIRA Issues Filed, Encoder Fills KB Gaps
QA Check with Screenshots of "Passing" Comparison and Relationship Questions
5. -- How to choose a concept given a UT?
-- How to produce UTs from sentences?
[Diagram: the KB Book breaks down into Chapters and Sentences; each Sentence yields multiple UTs, and each UT is mapped into CMaps in the knowledge base]
2) Reaching Consensus -- 14% time
Universal Truth Authoring, Concept Chosen
6. What is a Universal Truth?
• "A Universal Truth is a stand-alone, unambiguous declarative sentence about a textbook topic that expresses a single fact that is universally true" – AURA Knowledge Engineering Manual
• Example source sentence: "Water is composed of two Hydrogen element molecules and one Oxygen element molecule with the chemical formula H2O"
• Water is composed of hydrogen
• Water is composed of oxygen
• Hydrogen is an element
• Oxygen is an element
• Water has the chemical formula H2O
• Does "Water is a compound" count?
7. Project Goals
• "Crowd Source Universal Truth Authoring"
• Can Domain Experts Author Useful Universal Truths?
• Can We Speed Up Encoding a Textbook with Input from Domain Experts?
• Can We Create a UT Authoring Portal for Multiple Textbooks?
• Can Existing Social Networks Provide Domain Experts Capable of UT Authoring?
• Could Gamification Be Applied to an Existing Portal to Add Non-Domain Experts?
8. About the Domain Experts
• Students attending the University of Washington or recent graduates
• All have a background in biology or life sciences
• Native English speakers with excellent writing skills
• Each student read the chapters in question and was provided with an iPad running the Inquire application
• Students were paid for their time
10. Storing a Textbook in Aura Wiki
• The wiki was created with instances of page types composed of textbook sentences (a wikitext sketch follows this list):
• Sentence
• Paragraph
• Section
• Chapter
• Book
• The wiki also has imported resources to aid in the UT authoring process:
• Glossary Pages
• Taxonomy Concepts
• Universal Truths – Human and Machine
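A plausible shape for one such Sentence page in wikitext. The page types and navigation properties come from the deck; every identifier below (page titles, property names, values) is a hypothetical illustration:

```
<!-- Page: Sentence:Ch9-Para3-S2 (hypothetical title scheme) -->
[[Category:Sentence]]
[[Has TOC id::9.3.2]]
[[Has sentence text::Water is composed of two Hydrogen element molecules and one Oxygen element molecule with the chemical formula H2O]]
[[Part of paragraph::Paragraph:Ch9-Para3]]
[[Has previous sentence::Sentence:Ch9-Para3-S1]]
[[Has next sentence::Sentence:Ch9-Para3-S3]]
[[Has relevancy status::Relevant]]
[[Has encoding status::Not encoded]]
```

Chaining a part-of property from sentence up through paragraph, section, and chapter to the Book page is what lets the wiki pivot between levels with simple property queries.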
15. Authoring Universal Truths
• Semantic Wiki Properties
• Each page has a unique id for the table of contents element
• The sentence itself is an element
• Elements pointing to the previous and next sentences
• Elements pointing to top-level entities
• Users can update the sentence's relevancy and encoding status
Sentence and Context View
17. Authoring Universal Truths
• Semantic Wiki Properties (sketched in wikitext below)
• Reference sentence
• The universal truth text
• UT concept – AURA provided
• UT context – AURA provided
• Accuracy rating for the universal truth
• Dates created, approved, and when ratings were applied
Universal Truth
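Putting those properties together, a Universal Truth page might look like the following wikitext. The property list mirrors the slide above; all names and values are hypothetical stand-ins for the unpublished Aura Wiki schema:

```
[[Category:Universal Truth]]
[[Has reference sentence::Sentence:Ch9-Para3-S2]]
[[Has UT text::Water is composed of hydrogen]]
[[Has UT concept::Water]]           <!-- chosen from the AURA-provided concept taxonomy -->
[[Has UT context::Chemical-Entity]] <!-- chosen from the AURA-provided context list -->
[[Has accuracy rating::4]]
[[Has creation date::2012-10-15]]
```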
19. Navigating Aura Wiki
• Unregistered and Registered Main Pages
• Unregistered users are locked out
• Registration is turned off for anonymous users (a configuration sketch follows this list)
• Unique Extensions Proposed for Guided Authoring
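The lockout behavior described above maps onto standard MediaWiki permission settings. A minimal LocalSettings.php sketch; the project's actual configuration was not published:

```php
# Require login before any page can be read
$wgGroupPermissions['*']['read'] = false;
# Disable self-service account creation for anonymous visitors
$wgGroupPermissions['*']['createaccount'] = false;
# Keep the login page itself reachable for unregistered users
$wgWhitelistRead = array( 'Special:UserLogin' );
```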
20. How to View a Textbook Paragraph?
Auto-create triple-format UTs from the sentence?
21. How to View a Universal Truth Page?
How do we unify versions of the page for export to AURA?
25. Domain Expert Authoring Statistics
• 6 University of Washington students participated in the test
• Each received 45 minutes of training on creating Universal Truths
• Each was given 1 hour and a pre-selected list of sentences on a user page to complete
• The groups generated over 100 Universal Truths each session
• They averaged 37 Universal Truths an hour per student
• Students were frequently observed using their domain experience to construct UTs not specifically worded in the source sentence (e.g., "Water is a compound")
27. Project Goals
• "Crowd Source Universal Truth Authoring"
• Can Domain Experts Author Useful Universal Truths?
• Can We Speed Up Encoding a Textbook with Input from Domain Experts?
28. Project Goals
• "Crowd Source Universal Truth Authoring"
• Can We Create a UT Authoring Portal for Multiple Textbooks?
29. Project Goals
• "Crowd Source Universal Truth Authoring"
• Can Existing Social Networks Provide Domain Experts Capable of UT Authoring?
• Could Gamification Be Applied to an Existing Portal to Add Non-Domain Experts?
Hello, looking over the program I'm aware this is a pretty competitive hour for talks… we're doing this right after lunch… going against a Google talk… and with a cryptic title about artificial intelligence engines and a Semantic MediaWiki installation.
This talk covers an experiment we ran during the last six months of 2012: an experiment involving a symbolic AI population program and our solution for lowering the costs associated with encoding a textbook into its knowledge base. We're going to walk through the process for adding new data to the knowledge base, and our attempt to lower the cost structure by bringing in domain experts through an installation of Semantic MediaWiki specifically created to populate AURA.
So let's begin with AURA. AURA itself is pretty large, so I chose one screenshot to include on one slide. In fact, this isn't even a screenshot of AURA doing anything beyond one screen used to populate the knowledge base and debug a question into an explanation via concept maps. This screen quickly became a major choke point for populating the concept maps that compose the underlying knowledge base; it got exponentially more expensive and time-consuming to add new concepts and relations as more chapters were encoded into AURA. This is a good screenshot because you see AURA failing to answer a question because it needs more data encoded. Looking at the third arrow, AURA is saying that the group of CMaps needed to answer the question "What are the parts of the Eukaryotic Cell" does not exist. So it's time to start the process for adding these concept maps from the textbook…
A process that looks roughly like this. I don't want to dwell on all the steps shown here too long, but as shown above it takes quite a lot of work to add even trivial data to the knowledge base. This is the work process of several groups, from knowledge engineers to SRI research groups to biologists and teachers. When project management was asked which step most needed attention to speed up data population, it came down to number 2… actually, the first part of step 2…
We cared about this step. Authoring the "Universal Truth" portion of this process was time-consuming, expensive, and getting more difficult as the knowledge base grew. It required trained biologists, trained educators who were familiar with the source text, and a knowledge engineering team focused on hiring individuals who could be trained to understand how to encode these universal truths. A large part of the experiment was dedicated to training students to recognize a universal truth and derive them from source sentences. We also specifically created work paths within our Semantic MediaWiki installation to aid in recognizing and constructing Universal Truths.
… and that wasn't an easy task, due to the nature of a "Universal Truth". – read definition – Easy enough to understand? I chose a sentence from Wikipedia to demonstrate just how tricky this task can get. – read sentence – Any guesses on how many universal truths lie in that sentence? At a glance I found five, and the last one is probably not valid, since it is really composed of two truths: that water has a chemical formula, that H2O is a chemical formula, and then a statement connecting water to H2O.
With all of that in mind and facing a pretty significant problem adding more content to AURA, we devised an experiment with the explicit intent to outsource universal truth authoring to the greatest number of domain experts. This is our “bullet list of pain” thinly veiled as “project goals”…
And finally, with our simple problem and its simple project goals, we decided on the easiest group of people in the world to schedule: college students. – read points – Students attending the University of Washington or recent graduates; all with a background in biology or life sciences; native English speakers with excellent writing skills. Each student read the chapters in question and was provided with an iPad running the Inquire application, and students were paid for their time.
We developed Aura Wiki, designed as a portal for annotating a textbook with Universal Truths, to build on each aspect of the project, assuming the students passed the current project goal (i.e., one painful bullet point). Here is an example of the entry point to the wiki functioning as a portal, and an early version of the UT authoring page at the sentence level.
We also decided to take on the task of storing and marking up the entire textbook with semantic entities. First we began at the top level, importing standard table of contents data into a set of wiki pages marked by category – read top section point. Then we added the markup, including glossaries, a taxonomy of existing concepts imported from AURA, and existing universal truths imported from the current system as examples.
Frequently deemed the ugliest, and most common, page on the website, it quickly became the focal point for UI/UX improvements as we realized it wasn't really plausible to provide random sentences to users for UT annotation. These pages were created originally as background pages for tracking textbook properties and were not intended to be navigational elements. However, users would often leave the UT authoring page soon after creating their first set of annotations, navigating to the actual textbook table of contents pages and generating these criticisms…
Once the import was complete and we added the annotation pages, this was the site map structure that emerged: where we intended the users to stay and focus; everything the users actually found and decided to use; a proposed review system for moderators / trusted users; removed to Google Analytics.
– add arrows and explain turning on – First we had our import sources and the addition of knowledge engineering UTs, including marking up pages with additional semantic properties. The data was normalized for wiki presentation and queries. Next came the wiki portions of Aura Wiki and the import agents that create the textbook pages. Finally, the export and sync agents push and pull UTs to and from AURA (a sketch of such an export call follows).
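One way such an export agent could pull UTs is through Semantic MediaWiki's action=ask API module, which runs the same inline-query syntax over HTTP. A sketch of the request, reusing the hypothetical property names from the earlier examples (the query string would be URL-encoded in practice):

```
GET /wiki/api.php?action=ask&format=json
    &query=[[Category:Universal Truth]][[Has accuracy rating::>3]]
           |?Has UT text|?Has UT concept|?Has reference sentence
```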
After all of the importing, normalization, alignment of wiki semantic properties to AURA's ontology, and addition of pre-existing Universal Truths, we ended up with a sentence annotation page that looks like this. On this page you can – read slides – read the sentence; access the sentence context; access neighboring sentences; check and submit relevancy; check and submit authoring status; display existing Universal Truths; and author Universal Truths. And on closer inspection…
Here is the expanded view of the context surrounding a sentence available for UT annotation: each page has a unique id for the table of contents element; the sentence itself is an element; elements point to the previous and next sentences; elements point to top-level entities; and users can update the sentence's relevancy and encoding status.
Each sentence has a collection of universal truths, each represented by a wiki page, that are created inline on the sentence page. On this page you're viewing the expanded editing pane for adding a universal truth, including: the listing of existing universal truths applied to the sentence; the UT authoring block; and two autocomplete boxes for applying additional semantic properties to the universal truth.
Reference sentence; the universal truth text; UT concept – AURA provided; UT context – AURA provided; accuracy rating for the universal truth; dates created, approved, and when ratings were applied.
How do we show progress? How do we show community contributors? How do we focus members on a specific chapter or sentence? How do we train users in what a universal truth entails – a guided tutorial? There were several requests for unique MediaWiki extensions (one extension-free progress sketch follows).
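For the progress question, one extension-free option would be Semantic MediaWiki's built-in count result format over the same hypothetical properties used earlier; a minimal sketch:

```
Sentences marked complete so far:
{{#ask:
 [[Category:Sentence]]
 [[Has encoding status::Encoding Complete]]
 |format=count
}}
```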
Our original text view needed to be expanded to add context for authoring. – 4 clicks – The problem is that this made pages very long, so authoring UTs required a lot of scrolling up and down the page in our original format.
These pages were created behind the scenes by the UT inline authoring component, and there was a huge debate over whether they should be visible to users. While important to the wiki for queries, moderating universal truths, and exporting semantic properties, the operations provided by default wiki pages conflicted with some of our original assumptions. – 4 clicks –
As with the second proposal, it soon became obvious people couldn't moderate a universal truth without the full context of a paragraph, and possibly even an entire textbook section. This meant we had to remove the ability to approve and deny universal truths across sentences and focus on the annotations per sentence.
Six University of Washington students participated in the test. Each received 45 minutes of training on creating Universal Truths, and each was given one hour and a pre-selected list of sentences on a user page to complete. The groups generated over 100 Universal Truths each session, averaging 37 Universal Truths an hour per student. Students were frequently observed using their domain experience to construct UTs not specifically worded in the source sentence.
Inquire is a complex iPad application, and I chose one wireframe to put on one slide. You're looking at Inquire displaying the online textbook portion of AURA.