
Inside view of "Clova Inside" - Natural language understanding system to support users anytime, anywhere


Toshinori Sato (overlast)
LINE / VA Development Team

An AI assistant with a VoiceUI enables intuitive device operation through natural actions such as human speech and emotion. It is an example of a product that requires multiple AI technologies to work simultaneously, including natural language processing, voice recognition, speech synthesis, image processing, and information retrieval. LINE Clova is a platform for adding AI assistant features to various smart devices so that they can respond to user commands given in natural language, such as voice and text. This session is a candid talk on the development, operation, and related topics of the natural language understanding (NLU) technology that is integral to realizing VoiceUI on Clova. Topics include an overview of the NLU system on the current Clova platform, issues and solutions in developing an AI assistant for smart devices, co-existence with skills developed and operated by third parties, LINE’s views on consistent VoiceUI design, content development case studies, and upcoming topics that LINE must address. AI assistants and smart devices are still in their early days, and no one is certain what the flagship skill or application will look like. An NLU system for smart devices must therefore keep supporting a growing range of device types and features, as well as a growing number of contents and services and their updates. In addition, issues that become evident when human speech and text are used as queries must also be resolved. This session is intended to convey the fascination of working with AI assistant technology by sharing the excitement of facing new challenges in the development process and the technological and business opportunities in overcoming them.

Published in: Technology


  1. 1. Behind the “Clova Inside”: Development of a Natural Language Understanding system that supports you at any time and anywhere. Toshinori Sato (@overlast), LINE / Search&Clova Center
  2. 2. Toshinori Sato (@overlast) ● Senior Software Engineer ● Natural Language Processing ● Information Retrieval ● Clova ● Japanese NLU system ● OSS ● Main Contributor of NEologd project ● mecab-ipadic-NEologd
  3. 3. Agenda • VoiceUI of Clova • Inside of our NLU system • Various issues related to NLU • Future work
  4. 4. Smart Devices: Smart Speakers, Smart Displays, Smart Watch, TV, …
  5. 5. VoiceUI of Clova
  6. 6. User to Clova: (Clova, tell me Shinjuku’s weather)
  7. 7. User to Clova: (Clova, tell me Shinjuku’s weather). Clova: (Today's weather in Shinjuku is sunny, the temperature is ...)
  8. 8. User: (Clova, weather?) → user’s speech data → Clova
  9. 9. User: (Clova, weather?) → user’s speech data → Speech Recognition. Clova: (The current weather in Shinjuku is sunny)
  10. 10. User: (Clova, weather?) → user’s speech data → Speech Recognition → recognized text list: (Weather). Clova: (The current weather in Shinjuku is sunny)
  11. 11. User: (Clova, weather?) → user’s speech data → Speech Recognition → recognized text list: (Weather) → NLU / DM. Clova: (The current weather in Shinjuku is sunny)
  12. 12. User: (Clova, weather?) → user’s speech data → Speech Recognition → recognized text list: (Weather) → NLU / DM → NLU Result. Clova: (The current weather in Shinjuku is sunny). NLU Result: Domain = Weather; Intention = Inform; Main Goal = General; Place = (empty); Time = (empty)
  13. 13. User: (Clova, weather?). NLU Result: Domain = Weather; Intention = Inform; Main Goal = General; Place = (empty); Time = (empty). The user wants: weather:inform:general. Place slot of NLU Result: unknown. Time slot of NLU Result: unknown.
  14. 14. User: (Clova, weather?). User’s current position: Shinjuku. Clova’s default time value: Present. NLU Result: Domain = Weather; Intention = Inform; Main Goal = General; Place = Shinjuku; Time = Present
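
Slides 13 and 14 describe filling the unknown Place and Time slots from context. The following is a minimal sketch of that idea, not Clova's actual implementation: the `NLUResult` class and the `fill_missing_slots` function are hypothetical names, and the only facts taken from the slides are the slot names, the user's current position (Shinjuku), and the default time value (Present).

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class NLUResult:
    """Hypothetical container mirroring the Key / Value table on the slides."""
    domain: str
    intention: str
    main_goal: str
    place: Optional[str] = None  # None means the slot is unknown
    time: Optional[str] = None   # None means the slot is unknown


def fill_missing_slots(result: NLUResult, current_position: str,
                       default_time: str = "Present") -> NLUResult:
    """Fill only the slots the user did not specify; keep explicit values."""
    return replace(
        result,
        place=result.place if result.place is not None else current_position,
        time=result.time if result.time is not None else default_time,
    )


# "Clova, weather?" gives no place and no time.
bare = NLUResult(domain="Weather", intention="Inform", main_goal="General")
filled = fill_missing_slots(bare, current_position="Shinjuku")
print(filled)
```

A slot the user did state, such as an explicit place name, is left untouched by this sketch.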
  15. 15. User: (Clova, weather?) → user’s speech data → Speech Recognition → recognized text list: (Weather) → NLU / DM → NLU Result → Weather web API. Clova: (The current weather in Shinjuku is sunny). NLU Result: Domain = Weather; Intention = Inform; Main Goal = General; Place = Shinjuku; Time = Present
  16. 16. User: (Clova, weather?) → user’s speech data → Speech Recognition → recognized text list: (Weather) → NLU / DM → NLU Result → Weather web API → generated text. Clova: (The current weather in Shinjuku is sunny). NLU Result: Domain = Weather; Intention = Inform; Main Goal = General; Place = Shinjuku; Time = Present
  17. 17. User: (Clova, weather?) → user’s speech data → Speech Recognition → recognized text list: (Weather) → NLU / DM → NLU Result → Weather web API → generated text: (The current weather in Shinjuku is sunny) → Speech Synthesis. NLU Result: Domain = Weather; Intention = Inform; Main Goal = General; Place = Shinjuku; Time = Present
  18. 18. User: (Clova, weather?) → user’s speech data → Speech Recognition → recognized text list: (Weather) → NLU / DM → NLU Result → Weather web API → generated text: (The current weather in Shinjuku is sunny) → Speech Synthesis → synthesized speech data → Clova. NLU Result: Domain = Weather; Intention = Inform; Main Goal = General; Place = Shinjuku; Time = Present
  19. 19. User: (Clova, weather?) → user’s speech data → Speech Recognition → recognized text list: (Weather) → NLU / DM → NLU Result → Weather web API → generated text: (The current weather in Shinjuku is sunny) → Speech Synthesis → synthesized speech data → Clova. NLU Result: Domain = Weather; Intention = Inform; Main Goal = General; Place = Shinjuku; Time = Present
  21. 21. Inside of our NLU system
  22. 22. NLU / DM Natural Language Understanding & Dialog Management
  23. 23. User: (Clova, weather?) → user’s speech data → Speech Recognition → recognized text list: (Weather) → NLU / DM → NLU Result → Weather web API → generated text: (The current weather in Shinjuku is sunny) → Speech Synthesis → synthesized speech data → Clova. NLU Result: Domain = Weather; Intention = Inform; Main Goal = General; Place = Shinjuku; Time = Present
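
The flow on this slide can be read as a chain of four stages. The sketch below only illustrates that chaining under stated assumptions: `handle_utterance` and every `fake_*` stage are hypothetical stand-ins written for this example, and none of them reflects a real Clova API.

```python
from typing import Callable, Dict, List

NLUResult = Dict[str, str]


def handle_utterance(
    speech_data: bytes,
    recognize: Callable[[bytes], List[str]],
    understand: Callable[[List[str]], NLUResult],
    call_api: Callable[[NLUResult], str],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Chain the stages in the order the slide shows them."""
    recognized_texts = recognize(speech_data)    # Speech Recognition
    nlu_result = understand(recognized_texts)    # NLU / DM
    generated_text = call_api(nlu_result)        # e.g. Weather web API
    return synthesize(generated_text)            # Speech Synthesis


# Hypothetical stub stages so the sketch runs end to end.
def fake_recognize(speech_data: bytes) -> List[str]:
    return ["Weather"]


def fake_understand(texts: List[str]) -> NLUResult:
    return {"Domain": "Weather", "Intention": "Inform", "Main Goal": "General",
            "Place": "Shinjuku", "Time": "Present"}


def fake_weather_api(result: NLUResult) -> str:
    return f"The current weather in {result['Place']} is sunny"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")  # stands in for synthesized speech data


audio = handle_utterance(b"raw audio", fake_recognize, fake_understand,
                         fake_weather_api, fake_synthesize)
print(audio)
```

Passing the stages in as callables keeps each one replaceable, which matches the way the slides treat speech recognition, NLU / DM, the web API, and speech synthesis as separate boxes.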
  24. 24. User’s Voice → Clova → NLU / DM → NLU Results (Key / Value: Domain, Intention, Main Goal, Place, Time). NLU / DM operations: Keyword Extraction, Classify, Filtering, Ranking. Modes: ML, Rule
  25. 25. Part of elements of NLU Result. Domain: function name. Intention: Clova’s action for a user. Main Goal: data type to use during action. Place: value to use for weather API. Time: value to use for weather API. Domain : Intention : Main Goal: identifier for deciding Clova's action.
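
To make the "Domain : Intention : Main Goal" identifier concrete, here is a small sketch that builds the identifier and uses it to pick an action. The `action_id` and `dispatch` functions and the `HANDLERS` table are hypothetical, and lowercasing the parts is an assumption based on the `weather:inform:general` spelling shown on slide 13.

```python
from typing import Callable, Dict, Optional


def action_id(domain: str, intention: str, main_goal: str) -> str:
    """Join the three elements into one identifier (lowercasing is assumed)."""
    return ":".join(part.lower() for part in (domain, intention, main_goal))


# Hypothetical handler table keyed by the identifier.
HANDLERS: Dict[str, Callable[[Optional[str], Optional[str]], str]] = {
    "weather:inform:general":
        lambda place, time: f"call weather API with place={place}, time={time}",
}


def dispatch(nlu_result: Dict[str, str]) -> str:
    """Look up the handler for an NLU result and pass it the slot values."""
    key = action_id(nlu_result["Domain"], nlu_result["Intention"],
                    nlu_result["Main Goal"])
    handler = HANDLERS.get(key)
    if handler is None:
        return f"no handler registered for {key}"
    return handler(nlu_result.get("Place"), nlu_result.get("Time"))


print(dispatch({"Domain": "Weather", "Intention": "Inform",
                "Main Goal": "General", "Place": "Shinjuku", "Time": "Present"}))
```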
  26. 26. Example: “ 13 ”
  27. 27. Example: “ 13 ”. User’s Voice → Clova → NLU / DM: 13 … …
  28. 28. Example: “ 13 ”. User’s Voice → Clova → NLU / DM: 13 … … Operations: Keyword Extraction, Classify, Filtering, Ranking. Modes: ML, Rule
  29. 29. NLU action 1: Keyword Extraction. User’s Voice → Clova → NLU / DM: 13 … … Operations: Keyword Extraction, Classify, Filtering, Ranking. Modes: ML, Rule
  30. 30. 13 These are keywords, right?
  31. 31. Advance preparation to extract keywords. Step 1: Construct dictionaries to extract keywords. Step 2: Create a trie-based index (a surface string index for each dictionary). Step 3: Create an index of a key-value database (an individual index including all entries of each dictionary).
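
The three preparation steps can be sketched as follows. This is an illustration under assumptions, not the production design: the entry layout (surface, yomi, baseform, payload) is inferred from the fields shown on later slides, the romanized surfaces `kyou` and `13ji` are placeholders for the Japanese strings that did not survive in this transcript, and a plain nested `dict` stands in for whatever trie and key-value store the real system uses.

```python
from typing import Any, Dict, Iterable, List, Tuple

# Step 1: a dictionary is a list of (surface, yomi, baseform, payload) entries.
Entry = Tuple[str, str, str, Dict[str, Any]]

DATETIME_DICT: List[Entry] = [
    ("kyou", "kyou", "today", {"days": 0}),            # placeholder surface
    ("13ji", "juusanji", "13 o'clock", {"hour": 13}),  # placeholder surface
]

END = ""  # end-of-surface marker; never collides with a real character key


def build_trie(surfaces: Iterable[str]) -> Dict[str, Any]:
    """Step 2: index the surface strings of one dictionary in a trie."""
    root: Dict[str, Any] = {}
    for surface in surfaces:
        node = root
        for ch in surface:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def build_kv_index(entries: Iterable[Entry]) -> Dict[str, List[Dict[str, Any]]]:
    """Step 3: map each surface to all of its full entries."""
    index: Dict[str, List[Dict[str, Any]]] = {}
    for surface, yomi, baseform, payload in entries:
        index.setdefault(surface, []).append(
            {"yomi": yomi, "baseform": baseform, "dict": payload})
    return index


def contains(trie: Dict[str, Any], surface: str) -> bool:
    """Check whether a complete surface string is stored in the trie."""
    node = trie
    for ch in surface:
        if ch not in node:
            return False
        node = node[ch]
    return END in node


datetime_trie = build_trie(entry[0] for entry in DATETIME_DICT)
datetime_kv = build_kv_index(DATETIME_DICT)
print(contains(datetime_trie, "13ji"), datetime_kv["13ji"])
```

Keeping the trie limited to surface strings and storing the full entries separately mirrors the split between Step 2 and Step 3 on the slide.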
  32. 32. Construct dictionaries to extract keywords. POI (Point of Interest): … / / / {“address”: “ 1-1-1”, … } / / / {“address”: “ 1 8 1 ”, … } / / / {“address”: “ 1 8 1 ”, … } … Datetime: … / / / {“days”: 0} / / / {“days”: 0} … 13 / / 13 / {“hour”: 13} / / 13 / {“hour”: 13} …
  33. 33. Construct dictionaries to extract keywords. POI (Point of Interest): … / / / {“address”: “ 1-1-1”, … } / / / {“address”: “ 1 8 1 ”, … } / / / {“address”: “ 1 8 1 ”, … } … Datetime: … / / / {“days”: 0} / / / {“days”: 0} … 13 / / 13 / {“hour”: 13} / / 13 / {“hour”: 13} …
  34. 34. Create a trie-based index. Datetime: … / / / {“days”: 0} / / / {“days”: 0} … 13 / / 13 / {“hour”: 13} / / 13 / {“hour”: 13} … → Surface of datetime: … … 13 … → Trie-based Index
  35. 35. Create an index of key-value database. Keys: … … Values (POI: Point of Interest): … / / / {…} / / / {…} / / / {…} … → Index of key-value DB
  36. 36. Search in slot using trie-based index. slot candidate: datetime. 13 13 1 13 13 1 13
  37. 37. Search results for datetime slot (Trie-based Index, Index of key-value DB). slot candidate: datetime. 13 13 13 1 13 13 1 13
  38. 38. Get entries for each datetime value (Trie-based Index, Index of key-value DB). slot candidate: datetime. 13 13 13 1 / / / {“days”: 0} 13 13 1 13 13 / / 13 / {“hour”: 13}
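
Slides 36 to 38 search the query with the trie and then fetch the entry for each hit. A minimal sketch of that lookup follows, assuming a nested-`dict` trie and a plain `dict` as the key-value index. The romanized query and surfaces (`kyou`, `13ji`, `3ji`) are placeholders for the original Japanese text, and `find_matches` is a hypothetical name.

```python
from typing import Any, Dict, Iterable, List, Tuple

END = ""  # end-of-surface marker inside the trie


def build_trie(surfaces: Iterable[str]) -> Dict[str, Any]:
    """Store each surface string character by character."""
    root: Dict[str, Any] = {}
    for surface in surfaces:
        node = root
        for ch in surface:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def find_matches(text: str, trie: Dict[str, Any]) -> List[Tuple[int, int, str]]:
    """Return every (start, end, surface) whose surface occurs in text."""
    matches = []
    for start in range(len(text)):
        node = trie
        for end in range(start, len(text)):
            node = node.get(text[end])
            if node is None:
                break  # no stored surface continues with this character
            if END in node:
                matches.append((start, end + 1, text[start:end + 1]))
    return matches


# Placeholder key-value index: surface -> payload of the datetime dictionary.
DATETIME_KV = {"kyou": {"days": 0}, "13ji": {"hour": 13}, "3ji": {"hour": 3}}
trie = build_trie(DATETIME_KV)

candidates = find_matches("kyou no 13ji", trie)
for start, end, surface in candidates:
    print(start, end, surface, DATETIME_KV[surface])
```

The scan deliberately keeps overlapping candidates such as `13ji` and `3ji`. Choosing between them is left to later steps, which is consistent with the slides listing Filtering and Ranking as separate operations.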
  39. 39. AVG 0.050S. (Weather) Key / Value: Domain = Weather; Intention = Inform; Main Goal = General; Place = Shibuya FUKURASU; Time = Today, 13 O’clock
  40. 40. Example: Rule based keyword extraction. 13 → datetime, datetime, POI (Point of Interest)
  41. 41. Example: Rule based keyword extraction. 13 → datetime / keyword {“days”: 0}; datetime / keyword {“hour”: 13}; POI (Point of Interest)
  42. 42. Example: Rule based keyword extraction. 13 → datetime / keyword {“days”: 0}; datetime / keyword {“hour”: 13}; POI / NE {“surface”: “ ”, “yomi”: “ ”, “baseform”: “ ”, “dict”: {“address”: “ 1-1-1”}}
  43. 43. NLU action 2: Classification. User’s Voice → Clova → NLU / DM: 13 … … Operations: Keyword Extraction, Classify, Filtering, Ranking. Modes: ML, Rule
  44. 44. Example: Rule based classification. 13 → datetime, datetime, POI (Point of Interest). NLU Results: Domain = (empty); Intention = (empty); Main Goal = (empty); Place = Happouen; Time = Today, 13 O’clock
  45. 45. Example: Rule based classification. 13 → datetime, datetime, POI (Point of Interest). Rule: datetime - datetime - POI - weather( ) → Domain: weather; Intention: inform; Main Goal: general. NLU Results: Domain = (empty); Intention = (empty); Main Goal = (empty); Place = Happouen; Time = Today, 13 O’clock
  46. 46. Example: Rule based classification. 13 → datetime, datetime, POI (Point of Interest). Rule: datetime - datetime - POI - weather( ) → Domain: weather; Intention: inform; Main Goal: general. NLU Results: Domain = weather; Intention = Inform; Main Goal = general; Place = Happouen; Time = Today, 13 O’clock
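
The rule on slides 45 and 46 maps a sequence of extracted keyword types to a Domain, Intention, and Main Goal. One way to sketch such a rule table is shown below. The `RULES` structure and the `classify` function are hypothetical, and treating the trailing `weather( )` element as a keyword of type `weather` is an assumption about what the slide's notation means.

```python
from typing import Dict, List, Optional, Tuple

Keyword = Tuple[str, str]  # (keyword type, value)

# Each rule: a sequence of keyword types -> (Domain, Intention, Main Goal).
RULES: List[Tuple[Tuple[str, ...], Tuple[str, str, str]]] = [
    (("datetime", "datetime", "POI", "weather"), ("weather", "inform", "general")),
]


def classify(keywords: List[Keyword]) -> Optional[Dict[str, str]]:
    """Return an NLU result if the keyword types match a rule, else None."""
    types = tuple(kind for kind, _ in keywords)
    for pattern, (domain, intention, main_goal) in RULES:
        if types == pattern:
            places = [value for kind, value in keywords if kind == "POI"]
            times = [value for kind, value in keywords if kind == "datetime"]
            return {
                "Domain": domain,
                "Intention": intention,
                "Main Goal": main_goal,
                "Place": ", ".join(places),
                "Time": ", ".join(times),
            }
    return None  # no rule fired


extracted = [("datetime", "Today"), ("datetime", "13 O'clock"),
             ("POI", "Happouen"), ("weather", "weather")]
print(classify(extracted))
```

An exact match on the type sequence is precise but brittle: any word order the rule author did not anticipate returns `None`.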
  47. 47. NLU action 3: Filtering. User’s Voice → Clova → NLU / DM: 13 … … Operations: Keyword Extraction, Classify, Filtering, Ranking. Modes: ML, Rule
  48. 48. NLU action 4: Ranking. User’s Voice → Clova → NLU / DM: 13 … … Operations: Keyword Extraction, Classify, Filtering, Ranking. Modes: ML, Rule
  49. 49. Successful acquisition of NLU results. User’s Voice → Clova → NLU / DM: 13 … … Operations: Keyword Extraction, Classify, Filtering, Ranking. Modes: ML, Rule. NLU Results: Domain = weather; Intention = Inform; Main Goal = general; Place = Happouen; Time = Today, 13 O’clock
  50. 50. Various issues related to NLU
  51. 51. Difficulty of Speech Recognition
  52. 52. Emergence of new / unknown words 13 rare case
  53. 53. Emergence of new / unknown words 13 rare case 13 S frequently occur
  54. 54. Example: Rule based keyword extraction. 13 → datetime / keyword {“days”: 0}; datetime / keyword {“hour”: 13}; POI / NE {“surface”: “ ”, “yomi”: “ ”, “baseform”: “Shibuya FUKURASU”}, {“surface”: “ ”, “yomi”: “ ”, “baseform”: “ ”}
  55. 55. Ambiguity in query notation. 13. Audio noise affected. Typical examples are covered by NLU. Both are correct. Enrich synonym entries.
  56. 56. User's speech quality. 13 813. eight + 13 O’clock = 813 O’clock. to + kyo no = Tokyo no
  57. 57. Difficulty of Speech Recognition: Emergence of new / unknown words; Ambiguity in query notation; User's speech quality. 13 S 13 13 813
  58. 58. Difficulty of NLU
  59. 59. Free utterance of user's query ● 13 ● 13 ●13 ● ● ● ● …
  60. 60. Ambiguity in query notation derived from speech recognition. 13. Audio noise affected. Typical examples are covered by NLU. Both are correct. Enrich synonym entries.
  61. 61. Ranking of executable skills. Query: … Executable skill: Weather, Chat, or Both (Weather & Chat). 14
  62. 62. Difficulty of NLU: Free utterance of user's query; Ambiguity in query notation derived from speech recognition; Ranking of executable skills (Query: … Executable skill: Weather, Chat or both). 13 13 (13 )13 (13 ) 13
  63. 63. Methods to detect a user’s request: Rule based, Machine Learning based. Rule based: ✅ Best precision ✅ Easy to control ❌ Low recall ❌ High maintenance cost
  64. 64. Methods to detect a user’s request: Rule based, Machine Learning based. Rule based: ✅ Best precision ✅ Easy to control ❌ Low recall ❌ High maintenance cost. Machine Learning based: ✅ High recall ✅ Low maintenance cost ❌ Cold start problem ❌ Instability during data update
  65. 65. Methods to detect a user’s request: Rule based, Machine Learning based. Rule based: ✅ Best precision ✅ Easy to control ❌ Low recall ❌ High maintenance cost. Machine Learning based: ✅ High recall ✅ Low maintenance cost ❌ Cold start problem ❌ Instability during data update. Clova: I need both methods!!
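
The slides do not say how the two methods are combined. One common arrangement, shown here purely as an assumption, is to trust a rule whenever one fires (for precision) and fall back to a machine-learned classifier otherwise (for recall). Every name below is hypothetical, and both classifiers are toy stand-ins rather than real models.

```python
from typing import Callable, Optional, Tuple


def detect_request(
    text: str,
    rule_classifier: Callable[[str], Optional[str]],
    ml_classifier: Callable[[str], Tuple[str, float]],
    min_confidence: float = 0.7,
) -> Tuple[Optional[str], str]:
    """Try the rule first; use the ML guess only if it is confident enough."""
    label = rule_classifier(text)
    if label is not None:
        return label, "rule"
    label, confidence = ml_classifier(text)
    if confidence >= min_confidence:
        return label, "ml"
    return None, "none"


def toy_rule(text: str) -> Optional[str]:
    # Stand-in for a hand-written rule.
    return "weather:inform:general" if "weather" in text.lower() else None


def toy_ml(text: str) -> Tuple[str, float]:
    # Stand-in for a trained model returning (label, confidence).
    return ("chat", 0.9) if text.endswith("?") else ("chat", 0.3)


print(detect_request("Tell me the weather", toy_rule, toy_ml))
print(detect_request("How are you?", toy_rule, toy_ml))
print(detect_request("hmm", toy_rule, toy_ml))
```

The confidence threshold is an illustrative knob. In this arrangement it is where the instability of a freshly retrained model would be contained.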
  66. 66. Play a song that fits the title of an animation. User: (Play Evangelion's theme) → user’s speech data → Clova → NLU Result: Domain = Music; Intention = Play; Main Goal = Track; Keyword = Evangelion's theme → Theme synonym dictionary: Key = Evangelion; Value = A Cruel Angel's Thesis → Music search API. Clova: OK!! Let’s play “A Cruel Angel's Thesis”
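
The synonym lookup on this slide can be sketched as a dictionary that rewrites the Keyword slot before the music search. This is a minimal illustration: the `THEME_SYNONYMS` table holds only the one entry shown on the slide, and `resolve_track_keyword` with its English "'s theme" suffix handling is a hypothetical simplification of whatever matching the real system performs on Japanese text.

```python
from typing import Dict

# Theme synonym dictionary: work title -> track title (entry from the slide).
THEME_SYNONYMS: Dict[str, str] = {
    "evangelion": "A Cruel Angel's Thesis",
}

THEME_SUFFIX = "'s theme"


def resolve_track_keyword(keyword: str) -> str:
    """Rewrite "<title>'s theme" to a known track title, else keep the keyword."""
    lowered = keyword.lower()
    if lowered.endswith(THEME_SUFFIX):
        title = lowered[: -len(THEME_SUFFIX)]
        return THEME_SYNONYMS.get(title, keyword)
    return keyword


nlu_result = {"Domain": "Music", "Intention": "Play", "Main Goal": "Track",
              "Keyword": "Evangelion's theme"}
nlu_result["Keyword"] = resolve_track_keyword(nlu_result["Keyword"])
print(nlu_result["Keyword"])
```

Keywords with no dictionary entry pass through unchanged, so the music search still receives the user's original wording.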
  67. 67. Agenda • VoiceUI of Clova • Inside of our NLU system • Various issues related to NLU • Future work
  68. 68. Future work
  69. 69. Most attractive point : Voice I/O Synthesized Voice Human User AI Assistant Spoken Voice
  70. 70. Show the weather ( )
  71. 71. VUI is ultimately defined by you
  72. 72. Thinking about human happiness more seriously than a human being does
