Intent Classification of Voice Queries on Mobile Devices
 


Intent Classification of Voice Queries on Mobile Devices,
Subhabrata Mukherjee, Ashish Verma and Kenneth W. Church. In Proceedings of the 22nd International World Wide Web Conference (WWW 2013), Rio de Janeiro, Brazil, May 13-17, 2013 (Poster)


    Presentation Transcript

    Subhabrata Mukherjee†, Ashish Verma†, Kenneth W. Church‡
    †IBM India Research Lab, ‡IBM T.J. Watson Research Center

    Introduction
    • Mobile query classification is difficult because queries are short and noisy, and because many of them are interactive or personalized: map (where is the closest restaurant), command-and-control (open facebook), dialogue (how are you?), joke (will you marry me?), etc.
    • Voice queries are harder than typed queries because of the errors introduced by the automatic speech recognizer (ASR).
    • This work brings the complexities of voice search and intent classification together by proposing a multi-stage classifier for intent classification.

    Intent Classification
    • Stage 1: A multi-class Naïve Bayes classifier uses the bag of words, part-of-speech (POS) tags, and domain words of the query as features, and predicts the top K most probable classes for the query.
    • Domain words (DW) of a query are extracted from the URLs of the top-ranked pages that a search engine retrieves for it. For example, for the query "the avengers" the search engine yields the DWs imdb, marvel, en.wikipedia, youtube, etc.
    • DWs introduce external knowledge into the classifier (Movies, Music, Sports, etc.) and keep the online data requirement low.
    • POS tags detect patterns for Command-and-Control, Map, Knowledge, Dialogue, broken queries (Endpoint), etc.
    • Stage 2: The top K ranked classes are passed, with some additional features, to a logistic regression classifier.
    • Features such as DW and URL ranking, DW and query overlap (Navigational), and substrings with maximum information gain for a class help differentiate between the close query categories from Stage 1.
    • Stage 3: The predictions of the independent binary classifiers are combined into a single prediction.

    Class-specific feature examples:
    • Navigational: query and DW overlap
    • Map: presence of the substrings find, close, direction, near, where in the query
    • Movie: presence of IMDB as the topmost DW
    • Command-and-Control: query starting with a verb
    • Websearch: presence of Wikipedia in the topmost two DWs

    Evaluation
    • 52,282 unique queries, with 104,950 impressions in total, were collected from an Android voice search application and manually tagged. The average number of words per query is 2.3. The average word error rate of the ASR engine is 20%.
    • The figure compares the F1 scores of different models over the 13 most frequent query classes.
    [Figure: F1 score (axis 0 to 1.2) per query class (weather, map, nav, command, knowledge, movies, dialogue, restaurant, web, music, sports, acceptor, endpoint) for the models Reg+Vocab+POS+Yahoo, Vocab+POS+Yahoo, Vocab+Yahoo, Vocab+POS, and Vocab]

    Example Queries
    Class                 Query
    Knowledge             What is pi
    Endpoint              How to spell
    Navigational          Best buy com
    Command-and-Control   Call mom
    Map                   Find nearest Subway
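
    The poster does not say how a URL is reduced to a domain word. The following is a minimal sketch of one plausible reading, assuming the DW is the host name with a leading "www." and the final label removed. The function names and the URL list are hypothetical; only the resulting DWs for "the avengers" come from the poster.

```python
from urllib.parse import urlparse


def domain_word(url):
    """Reduce a URL to a domain word, e.g. http://en.wikipedia.org/... -> 'en.wikipedia'."""
    host = urlparse(url).hostname or ""  # lower-cased host, or None when absent
    if host.startswith("www."):
        host = host[len("www."):]
    labels = host.split(".")
    # Assumption: the final label is a top-level domain and carries no intent signal.
    return ".".join(labels[:-1]) if len(labels) > 1 else host


def domain_words(ranked_urls, top_n=5):
    """Return the de-duplicated domain words of the top_n URLs, in rank order."""
    words = []
    for url in ranked_urls[:top_n]:
        dw = domain_word(url)
        if dw and dw not in words:
            words.append(dw)
    return words


# Hypothetical ranked result list for the query "the avengers".
urls = [
    "http://www.imdb.com/find?q=the+avengers",
    "http://marvel.com/",
    "http://en.wikipedia.org/wiki/The_Avengers",
    "http://www.youtube.com/results?search_query=the+avengers",
]
print(domain_words(urls))  # ['imdb', 'marvel', 'en.wikipedia', 'youtube']
```

    Multi-part suffixes such as "co.uk" are not handled by this heuristic. Handling them would need a public-suffix list.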
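
    Stage 1 can be illustrated with a small multinomial Naïve Bayes that returns the top K classes. This is a sketch under stated assumptions, not the authors' implementation: it uses add-one smoothing, omits the POS features, and trains on a toy set built from the poster's own example queries. The class `NaiveBayesTopK`, the helper `features`, and the test query "the hulk" are hypothetical.

```python
import math
from collections import Counter, defaultdict


def features(query, dws=()):
    """Bag of words plus domain words; the 'DW=' prefix keeps the two apart."""
    return query.lower().split() + ["DW=" + dw for dw in dws]


class NaiveBayesTopK:
    def fit(self, examples):
        """examples: iterable of (feature_list, label) pairs."""
        self.class_counts = Counter()
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()
        for feats, label in examples:
            self.class_counts[label] += 1
            self.feat_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def top_k(self, feats, k=3):
        """Return the k labels with the highest log posterior, best first."""
        n = sum(self.class_counts.values())
        v = len(self.vocab)
        scores = {}
        for label, count in self.class_counts.items():
            total = sum(self.feat_counts[label].values())
            score = math.log(count / n)  # log prior
            for f in feats:
                # Add-one (Laplace) smoothing, an assumption of this sketch.
                score += math.log((self.feat_counts[label][f] + 1) / (total + v))
            scores[label] = score
        return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy training set taken from the poster's example queries.
train = [
    (features("what is pi"), "knowledge"),
    (features("how to spell"), "endpoint"),
    (features("best buy com"), "navigational"),
    (features("call mom"), "command-and-control"),
    (features("open facebook"), "command-and-control"),
    (features("find nearest subway"), "map"),
    (features("where is the closest restaurant"), "map"),
    (features("the avengers", ["imdb", "marvel", "en.wikipedia", "youtube"]), "movies"),
]
model = NaiveBayesTopK().fit(train)
print(model.top_k(features("call mom"), k=2))
print(model.top_k(features("the hulk", ["imdb", "marvel"]), k=2))
```

    In the second call the unseen word "hulk" contributes nothing, so the domain words decide the top class. This is the kind of external knowledge the poster attributes to DWs.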
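
    The class-specific cues listed on the poster (query and DW overlap, map substrings, IMDB as topmost DW, Wikipedia in the top two DWs, a leading verb) can be expressed as binary features for a Stage 2 classifier. The sketch below is illustrative only. The feature names are hypothetical, and the small `COMMAND_VERBS` set is a stand-in for the POS tagger, which the poster does not name.

```python
MAP_CUES = ("find", "close", "direction", "near", "where")  # substrings from the poster
COMMAND_VERBS = {"call", "open", "play", "send"}  # hypothetical stand-in for a POS tagger


def stage2_features(query, dws):
    """Binary cues for a query and its ranked domain words (topmost first)."""
    q = query.lower()
    tokens = q.split()
    squashed = "".join(tokens)  # "best buy com" -> "bestbuycom"
    return {
        # Navigational: some DW (last label only) occurs in the query.
        "nav_overlap": any(dw and dw.split(".")[-1] in squashed for dw in dws),
        # Map: any cue substring occurs in the query.
        "map_cue": any(cue in q for cue in MAP_CUES),
        # Movie: IMDB is the topmost DW.
        "movie_top_dw": bool(dws) and dws[0] == "imdb",
        # Websearch: Wikipedia is among the topmost two DWs.
        "wiki_top2": any("wikipedia" in dw for dw in dws[:2]),
        # Command-and-Control: the query starts with a verb.
        "starts_with_verb": bool(tokens) and tokens[0] in COMMAND_VERBS,
    }


# The DW list for "best buy com" is a hypothetical example.
print(stage2_features("best buy com", ["bestbuy", "en.wikipedia"]))
print(stage2_features("the avengers", ["imdb", "marvel", "en.wikipedia", "youtube"]))
```

    A dictionary like this could be turned into a feature vector for a logistic regression over the top K classes. How the poster's system encodes and weights these cues is not stated.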