This document discusses NUGU, an AI assistant created by SK Telecom. It provides information on NUGU's capabilities and growth, as well as how it is used by SK Telecom, by third parties, and for the benefit of society. Key statistics show dramatic growth in NUGU's weekly and monthly active users from 2016 to 2019. The document also outlines the architecture and technology components that enable natural language interactions across devices and platforms.
The NUGU SDK allows developers to build voice assistants across platforms using common APIs and interfaces. It provides modular components for audio playback, speech recognition, text-to-speech and more that can be customized. The SDK uses a dependency injection philosophy to isolate implementations and allow composite protocols, enabling services like automated bark recognition to be added.
The document discusses NUGU SDK, which allows developers to connect applications to NUGU AI assistants. It describes the SDK architecture including layers for the SDK, device control, and applications. It also outlines key SDK functions like device authentication flows, integrating with the SDK, and capability interfaces that applications use to control device functions through directives from the NUGU server. Extension agents allow supporting new capabilities by creating custom directives and applications.
[NUGU CONFERENCE 2019] Track A-4: Zero-shot learning for Personalized Text-to-S... | NUGU developers
The document discusses NUGU's DNN TTS system and zero-shot learning speech synthesis technology. It provides an overview of T-Voice 1.0 and 1.5 systems, reviews pros and cons of different TTS approaches, and describes plans to develop a DNN-based personalized TTS system using techniques like speaker encoding networks and fine-tuning pre-trained models to generate voices for new speakers with limited training data.
[NUGU CONFERENCE 2019] Track A-3: Introduction to NUGU Personalized Music Recommendation Technology | NUGU developers
The document introduces NUGU's personalized music recommendation system. It discusses challenges like short listening durations and large music libraries. It presents a two-stage hybrid recommendation architecture that generates candidates with various logics before re-ranking based on user preferences and context. Context awareness considers factors like seasonality and occasions. A real-time feedback loop dynamically optimizes recommendations based on implicit and explicit user feedback. Future work includes generating personalized playlists based on time, place, and occasion preferences.
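The two-stage architecture described above (candidate generation followed by re-ranking) can be illustrated with a minimal sketch. This is not NUGU's implementation: the function names, the additive scoring rule, and the `preference` and `context_boost` dictionaries are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

Track = str
# Hypothetical type: a candidate generator maps a user id to a list of tracks.
CandidateGenerator = Callable[[str], List[Track]]


def generate_candidates(user_id: str,
                        generators: List[CandidateGenerator]) -> List[Track]:
    """Stage 1: pool candidates from several generators, dropping duplicates."""
    seen = set()
    pooled = []
    for generator in generators:
        for track in generator(user_id):
            if track not in seen:
                seen.add(track)
                pooled.append(track)
    return pooled


def rerank(candidates: List[Track],
           preference: Dict[Track, float],
           context_boost: Dict[Track, float],
           top_k: int) -> List[Track]:
    """Stage 2: order candidates by an assumed additive score.

    score = user preference + context boost (e.g. season or occasion).
    Tracks missing from either dictionary contribute 0.0 for that term.
    """
    def score(track: Track) -> float:
        return preference.get(track, 0.0) + context_boost.get(track, 0.0)

    return sorted(candidates, key=score, reverse=True)[:top_k]
```

In a sketch like this, a feedback loop would amount to updating the `preference` values between calls; how the real system weights implicit and explicit feedback is not stated in the summary.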
[NUGU CONFERENCE 2019] Track A-2: Introduction to NUGU call Technologies and Services | NUGU developers
NUGU call is a hands-free calling platform that allows connections anywhere through NUGU Touch Points. It supports multi-device connections under one account and has voice UX features like initiating and ending calls through voice commands. Voice quality is maintained through standards for loudness, frequency response, and other factors. Real-time communication uses internet protocols and signaling for call setup, media exchange, and termination. Upcoming features will expand NUGU call to support video calls and intelligent contextual commands.
1. STRAIGHT is speech synthesis software that can synthesize high-quality speech from text in various languages.
2. It uses a vocoder-based synthesis method that can synthesize natural-sounding speech that closely matches the target speaker's voice.
3. The software is open-source and free for research and educational purposes.
18. Incremental Clustering
- Grows the initial clusters based on new input
- Because the initial clusters are preserved, cluster assignments stay stable over time
Procedure
- Initial Clustering
Run batch clustering on a small initial set of samples
- Identification (Cluster Based, Every Sample)
For each new input, decide whether it belongs to an existing cluster
- Cluster Update (Triggered Execution)
Re-identification for samples that failed identification because of input order
Batch clustering over the samples not included in any existing cluster (noise samples)
Merge the newly formed clusters with the existing clusters (Rank-order Merge*), periodic or triggered
Face Recognition: Face Clustering
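
The procedure above can be sketched roughly as follows. This is a simplified illustration, not the method from the slides: the greedy `batch_cluster` helper, the `radius` and `min_size` parameters, and the centroid-distance merge are all assumptions, and the merge step in particular is a stand-in for the Rank-order Merge the slide names.

```python
import math


def centroid(points):
    """Mean of a non-empty list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(axis) / n for axis in zip(*points))


def batch_cluster(points, radius):
    """Greedy stand-in for a batch clustering algorithm (assumption)."""
    groups = []
    for p in points:
        for g in groups:
            if math.dist(p, centroid(g)) <= radius:
                g.append(p)
                break
        else:
            groups.append([p])
    return groups


class IncrementalClusterer:
    def __init__(self, initial_samples, radius, min_size=2):
        self.radius = radius
        self.min_size = min_size
        self.clusters = []  # existing clusters are only ever extended
        self.noise = []     # samples that matched no existing cluster
        # Initial Clustering: batch clustering on a small initial sample set.
        for g in batch_cluster(initial_samples, radius):
            (self.clusters.append(g) if len(g) >= min_size
             else self.noise.extend(g))

    def identify(self, sample):
        """Identification: add sample to the nearest cluster within radius.

        Returns the cluster index, or None if the sample became noise.
        """
        best, best_d = None, self.radius
        for i, c in enumerate(self.clusters):
            d = math.dist(sample, centroid(c))
            if d <= best_d:
                best, best_d = i, d
        if best is None:
            self.noise.append(sample)
        else:
            self.clusters[best].append(sample)
        return best

    def update(self):
        """Cluster Update: re-identify, batch-cluster noise, then merge."""
        # 1. Re-identification of samples that failed earlier.
        pending, self.noise = self.noise, []
        for s in pending:
            self.identify(s)
        # 2. Batch clustering over the remaining noise samples.
        pending, self.noise = self.noise, []
        for g in batch_cluster(pending, self.radius):
            if len(g) < self.min_size:
                self.noise.extend(g)
                continue
            # 3. Merge by centroid distance (NOT Rank-order Merge).
            target = next((c for c in self.clusters
                           if math.dist(centroid(g), centroid(c)) <= self.radius),
                          None)
            if target is not None:
                target.extend(g)
            else:
                self.clusters.append(g)
```

In this sketch `identify` runs for every incoming sample, while `update` would be called periodically or once the noise list grows past some threshold, mirroring the "periodic or triggered" note above.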