© 2009 IBM Corporation
Conversational Internet:
A natural language interface for webpages
Dale Lane – IBM United Kingdom
14 May 2013
dale.lane@uk.ibm.com
© 2012 IBM Corporation
Challenge
Motivation
Understanding the page : Identifying type
Understanding the page : Identifying navigation options
Understanding the page : Identifying calls-to-action
Understanding the user : Retrieving information
Understanding the user : Mouse actions
Understanding the user : Keyboard actions
Conversational Internet
• Presenting an early-stage prototype being developed to explore the potential for
  question answering as an alternative approach to screen readers for retrieving
  information from web pages
• Architecture and approach inspired by an active area of research and development in
  question answering over knowledge derived from a corpus of documents
  (Ferrucci, Lally, Chu-Carroll, et al.)
  http://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=6177717
Implementation
[Architecture diagram: a client (browser extension) and a server (Java, UIMA, LanguageWare)]
[Sequence diagram: the client sends a new request (“what can I do?”); the server sends a response with a conv. id (“the options are...”); the client sends a follow-up request with that conv. id (“I want to do...”)]
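The request and follow-up exchange on this slide can be sketched as a small in-memory session store. This is a minimal sketch under stated assumptions, not the prototype's code: the `ConversationStore` class, its method names and the use of UUIDs as conversation ids are all hypothetical.

```python
import uuid


class ConversationStore:
    """Hypothetical server-side store keyed by conversation id."""

    def __init__(self):
        self._history = {}

    def new_request(self, utterance):
        # A new request carries no id, so the server allocates one and
        # would return it to the client alongside the response.
        conv_id = uuid.uuid4().hex
        self._history[conv_id] = [utterance]
        return conv_id

    def follow_up(self, conv_id, utterance):
        # A follow-up carries the id from the earlier response, so the
        # server can interpret it in the context of the previous turns.
        if conv_id not in self._history:
            raise KeyError("unknown conversation id")
        self._history[conv_id].append(utterance)
        return list(self._history[conv_id])


store = ConversationStore()
conv_id = store.new_request("what can I do?")
turns = store.follow_up(conv_id, "I want to do...")
```

Keeping the history on the server means the client only has to echo the id back, which matches the "request with conv. id" step in the diagram.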
Implementation : Client
• Firefox extension
• Submits current state of the page to server for analysis
• Training mode
Implementation : Server
• Analyzing the page
• Processing user queries
Implementation : Server : Understanding the page
What type of site is this?
• Machine learning classifiers
• Whitelists of known domains
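One way the two signals above could be combined is to consult the whitelist first and fall back to a classifier for unknown domains. The sketch below is illustrative only: the domain list, the labels and the keyword scorer (standing in for a trained classifier) are assumptions, and the slides do not say in which order the prototype applies the two techniques.

```python
from urllib.parse import urlparse

# Hypothetical whitelist; the real lists are not given in the slides.
KNOWN_DOMAINS = {"example-shop.com": "shopping", "example-news.com": "news"}

# Hypothetical keyword cues standing in for a trained classifier.
KEYWORD_CUES = {
    "shopping": {"basket", "checkout", "price"},
    "news": {"headline", "reporter", "breaking"},
}


def classify_site(url, page_text):
    """Return a site-type label, or "unknown" if nothing matches."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # Known domains are answered directly from the whitelist.
    if host in KNOWN_DOMAINS:
        return KNOWN_DOMAINS[host]
    # Otherwise score each label by how many of its cue words appear.
    words = set(page_text.lower().split())
    scores = {label: len(cues & words) for label, cues in KEYWORD_CUES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"
```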
What can we infer from markup used?
• Semantic tags
• ARIA
• CSS class names
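The three markup signals above can be illustrated for one page element, a navigation region. This is a minimal sketch using Python's standard `html.parser`; the `NavigationHintParser` name and the class-name cues ("nav", "menu") are assumptions chosen for illustration, not the prototype's rules.

```python
from html.parser import HTMLParser


class NavigationHintParser(HTMLParser):
    """Collects elements whose markup suggests a navigation region."""

    NAV_CLASS_HINTS = ("nav", "menu")  # hypothetical class-name cues

    def __init__(self):
        super().__init__()
        self.hints = []  # (tag, reason) pairs in document order

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        css_classes = (attributes.get("class") or "").lower().split()
        if tag == "nav":
            self.hints.append((tag, "semantic tag"))
        elif attributes.get("role") == "navigation":
            self.hints.append((tag, "ARIA role"))
        elif any(hint in name
                 for name in css_classes
                 for hint in self.NAV_CLASS_HINTS):
            self.hints.append((tag, "CSS class name"))


parser = NavigationHintParser()
parser.feed('<nav></nav><div role="navigation"></div>'
            '<ul class="main-menu"></ul><p class="body"></p>')
```

The checks run from the most explicit signal to the least, so an element is reported once with the strongest reason that applies.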
What does the structure of the page layout suggest?
• Machine learning models for common page elements
What does the text of the page tell us?
• Natural Language Processing using LanguageWare to recognize common forms of call-to-action
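As a rough illustration of recognizing call-to-action text, the sketch below matches link or button text that opens with an imperative verb. It uses a plain regular expression as a stand-in for LanguageWare rules; the verb list and the `find_calls_to_action` function are assumptions, and real rules would be considerably richer.

```python
import re

# Hypothetical verb list standing in for LanguageWare rules.
CTA_PATTERN = re.compile(
    r"^(buy|subscribe|sign up|register|download|donate|book|order)\b",
    re.IGNORECASE,
)


def find_calls_to_action(link_texts):
    """Return the texts that begin with one of the imperative verbs."""
    return [text for text in link_texts if CTA_PATTERN.match(text.strip())]
```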
Implementation : Server : Responding to queries
Interpreting the query
• NLP rules created with LanguageWare to map to closest known command type
• WordNet to attempt matches using synonyms of unknown terms
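The two steps above, mapping a query to the closest known command type and falling back to synonyms for unknown terms, can be sketched as follows. Everything here is hypothetical: the command types, the trigger words, and the small hand-written `SYNONYMS` table, which stands in for a WordNet lookup so that the example has no external dependencies.

```python
# Hypothetical command types and the words that trigger them.
COMMAND_TRIGGERS = {
    "list_options": {"options", "do"},
    "read_content": {"read", "say"},
    "follow_link": {"open", "go"},
}

# Tiny hand-written synonym table standing in for a WordNet lookup.
SYNONYMS = {"choices": "options", "recite": "read", "visit": "go"}


def interpret(query):
    """Return the best-matching command type, or None if nothing matches."""
    words = [w.strip("?.!,").lower() for w in query.split()]
    # Replace terms that have a known synonym; leave other words unchanged.
    normalised = {SYNONYMS.get(w, w) for w in words}
    scores = {cmd: len(triggers & normalised)
              for cmd, triggers in COMMAND_TRIGGERS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

For instance, "What are my choices?" contains no trigger word directly, but "choices" is mapped to "options" and so resolves to the same command as "what can I do?".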
Extracting the requested information
• The requested information is extracted from the serialized CAS created by the first
  pipeline, by retrieving the sections that carry the relevant annotations
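A UIMA CAS holds the analyzed text together with annotations that mark spans of it by begin and end offsets. The sketch below shows only the retrieval idea on plain Python objects; it does not read a real serialized CAS, and the `Annotation` class, the `covered_text` function and the type names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    type_name: str  # hypothetical type names such as "Navigation"
    begin: int      # offset of the first covered character
    end: int        # offset just past the last covered character


def covered_text(document_text, annotations, wanted_type):
    """Return the text spans covered by annotations of the wanted type."""
    return [document_text[a.begin:a.end]
            for a in annotations
            if a.type_name == wanted_type]


text = "Home News Sport | Subscribe today"
annotations = [
    Annotation("Navigation", 0, 15),
    Annotation("CallToAction", 18, 33),
]
navigation = covered_text(text, annotations, "Navigation")
```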
Preparing a response
• Speech generated using Nuance NDev and streamed to client
Future work / Limitations
• Use cases
• Usability testing
• RIA / AJAX sites
• Paper submitted to W4A
  – http://goo.gl/3X2iv
• Overview presentation
  – http://youtu.be/uS6oquJdgbw
• Demonstration of the prototype
  – http://youtu.be/tSGyPCcO-bY
Dale Lane
dale.lane@uk.ibm.com
@dalelane
