Building a Voice-Enabled Client on iPhone
with Alexa, Lex, and Polly
About Me
Classmethod, Inc.
Application Engineer
Shinichi Hirauchi (平内真一)
• Mobile App Services Division (iOS engineer)
• Author of the free software BlackJumboDog
• Microsoft MVP (since January 2013)
Agenda
1. Overview
2. Alexa
3. Lex
4. Polly
5. Summary
1. Overview
[Figure: overview diagram; recoverable labels: Skill, Lambda]
Goals
1. An overview of Amazon's voice services
2. How to implement a voice-enabled client
3. Choosing a service to match your requirements
2. Alexa
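Alexa Architecture
User: "What's the weather in Tokyo today?"
Alexa: "Today's weather in Tokyo is sunny, occasionally cloudy.
The high will be 27 degrees ..."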
Alexa Architecture
Source: AWS re:Invent 2016: How Capital One Built a Voice-Based Banking Skill for Amazon Echo (ALX201)
Alexa Alexa Voice Service
https://developer.amazon.com/public/solutions/alexa/alexa-voice-service/content/avs-api-overview
http://dev.classmethod.jp/smartphone/alexa-client-friendly-voice-assistant/
Alexa
• Endpoints/Protocol
• Interface
• Registration
• Authorization
Alexa Endpoints/Protocol
Region          Supported Countries   URL
North America   US                    https://avs-alexa-na.amazon.com
Europe          UK, Germany           https://avs-alexa-eu.amazon.com
Alexa Endpoints/Protocol
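:method = GET
:scheme = https
:path = /{{API version}}/directives
authorization = Bearer {{YOUR_ACCESS_TOKEN}}
Downchannel stream
• Delivers cloud-initiated directives such as timers and alarms
• Must be established within 10 seconds of connecting
• Stays open in a half-closed state from the client
• Kept open from AVS for the life of the connection
• Long pauses between directives are not unusual
API version: v20160207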
Alexa Endpoints/Protocol
:method = GET
:scheme = https
:path = /ping
authorization = Bearer {{YOUR_ACCESS_TOKEN}}
Ping and Timeout
• Send a ping once every 5 minutes
Alexa Endpoints/Protocol
:method = POST
:scheme = https
:path = /{{API version}}/events
authorization = Bearer {{YOUR_ACCESS_TOKEN}}
content-type = multipart/form-data; boundary={{BOUNDARY_TERM_HERE}}
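The three requests in this section (downchannel, ping, events) differ only in method, path, and headers, so they can be sketched with one small builder. This is a minimal illustration, not the talk's implementation: `AVSRegion`, `AVSRequest`, and `makeAVSRequest` are hypothetical names, and the sketch assumes `URLSession` negotiates HTTP/2 with the server on its own.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives here on Linux
#endif

/// Base URLs taken from the endpoint table (hypothetical enum name).
enum AVSRegion: String {
    case northAmerica = "https://avs-alexa-na.amazon.com"
    case europe = "https://avs-alexa-eu.amazon.com"
}

/// The three kinds of request a client makes on its AVS connection.
enum AVSRequest {
    case downchannel  // GET  /{API version}/directives
    case ping         // GET  /ping
    case events       // POST /{API version}/events
}

let avsAPIVersion = "v20160207"
let pingInterval: TimeInterval = 5 * 60  // one ping every 5 minutes

/// Hypothetical helper: builds the URLRequest for one AVS call.
func makeAVSRequest(_ kind: AVSRequest,
                    region: AVSRegion,
                    accessToken: String,
                    boundary: String = "BOUNDARY_TERM_HERE") -> URLRequest {
    let path: String
    switch kind {
    case .downchannel: path = "/\(avsAPIVersion)/directives"
    case .ping:        path = "/ping"
    case .events:      path = "/\(avsAPIVersion)/events"
    }
    var request = URLRequest(url: URL(string: region.rawValue + path)!)
    request.setValue("Bearer \(accessToken)", forHTTPHeaderField: "authorization")
    switch kind {
    case .events:
        request.httpMethod = "POST"
        request.setValue("multipart/form-data; boundary=\(boundary)",
                         forHTTPHeaderField: "content-type")
    case .downchannel, .ping:
        request.httpMethod = "GET"
    }
    return request
}
```

A client would typically open the downchannel first, then send events on the same connection and schedule a ping every `pingInterval` seconds.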
Alexa Endpoints/Protocol
https://developer.amazon.com/public/solutions/alexa/alexa-voice-service/docs/avs-http2-requests
Parts of the multipart message:
• JSON: events and directives
• Audio: voice data
Alexa Endpoints/Protocol
Audio encoding
• 16-bit Linear PCM (LPCM16)
• 16 kHz sample rate
• Single channel
• Little-endian byte order
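The multipart body of an events request (a JSON part followed by an audio part) can be assembled by hand, as sketched below. The part names `metadata` and `audio` and their content types are assumptions based on the AVS HTTP/2 documentation and should be checked against it; `makeRecognizeBody` and `pcmDuration` are hypothetical helper names.

```swift
import Foundation

/// Bytes per second for the required audio format:
/// 16,000 samples/s x 2 bytes per sample (16-bit) x 1 channel.
let pcmBytesPerSecond = 16_000 * 2 * 1

/// Duration in seconds of a raw LPCM16 mono 16 kHz buffer.
func pcmDuration(byteCount: Int) -> Double {
    return Double(byteCount) / Double(pcmBytesPerSecond)
}

/// Builds a two-part multipart/form-data body:
/// part 1 is the JSON event, part 2 is the recorded audio.
func makeRecognizeBody(eventJSON: Data, audio: Data, boundary: String) -> Data {
    var body = Data()
    func appendText(_ text: String) {
        body.append(Data(text.utf8))
    }
    // Part 1: JSON metadata (assumed part name "metadata").
    appendText("--\(boundary)\r\n")
    appendText("Content-Disposition: form-data; name=\"metadata\"\r\n")
    appendText("Content-Type: application/json; charset=UTF-8\r\n\r\n")
    body.append(eventJSON)
    // Part 2: raw audio bytes (assumed part name "audio").
    appendText("\r\n--\(boundary)\r\n")
    appendText("Content-Disposition: form-data; name=\"audio\"\r\n")
    appendText("Content-Type: application/octet-stream\r\n\r\n")
    body.append(audio)
    // Closing boundary.
    appendText("\r\n--\(boundary)--\r\n")
    return body
}
```

The `boundary` passed here has to match the one declared in the request's content-type header.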
Alexa Interface
Interface            Description
SpeechSynthesizer    Alexa's speech output interface
SpeechRecognizer     The core AVS interface; every user utterance is sent with a Recognize event
Speaker              Volume control for the device or application, including mute and unmute
Settings             Manages the product's Alexa settings, such as locale
PlaybackController   Navigates the playback queue via button affordances
AudioPlayer          Manages and controls audio playback
Alerts               Sets, stops, and deletes timers and alarms
System               Provides client information to Alexa
Alexa Interface
SpeechRecognizer Interface
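Event/Directive                Description
Recognize Event                Sends the user's speech
StopCapture Directive          Tells the client to stop recording
ExpectSpeech Directive         Tells the client to start recording (the conversation continues)
ExpectSpeechTimedOut Event     Timeout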
Alexa Interface
SpeechSynthesizer Interface
Event/Directive          Description
Speak Directive          Alexa's spoken response
SpeechStarted Event      Playback started
SpeechFinished Event     Playback finished
Alexa Interface
https://developer.amazon.com/public/solutions/alexa/alexa-voice-service/reference/speechrecognizer
Alexa Interface
Step  Event                                         Code  Directive
1     Start recording
2     Stop recording
3     SpeechRecognizer.Recognize + recorded audio   200   SpeechSynthesizer.Speak + Alexa audio
4     SpeechSynthesizer.SpeechStarted               204   If the conversation continues: SpeechRecognizer.ExpectSpeech
                                                          If the conversation does not continue: no directive
5     Start playback
6     Finish playback
7     SpeechSynthesizer.SpeechFinished              204
If SpeechRecognizer.ExpectSpeech was returned, go back to step 1.
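The record, recognize, speak, and repeat cycle above can be modeled as a small state machine. The sketch below is only an illustration of the flow: `DialogLoop` and `ClientState` are hypothetical names, and no networking or audio is performed.

```swift
/// States the client moves through during one dialog turn.
enum ClientState: Equatable {
    case idle, recording, waitingForResponse, playing
}

/// Tracks the turn-taking flow and the events the client would send.
struct DialogLoop {
    private(set) var state: ClientState = .idle
    private(set) var sentEvents: [String] = []
    private var expectSpeech = false

    // Step 1: start recording.
    mutating func startRecording() {
        state = .recording
    }

    // Steps 2-3: stop recording and send Recognize with the audio.
    mutating func stopRecording() {
        sentEvents.append("SpeechRecognizer.Recognize")
        state = .waitingForResponse
    }

    // Directives coming back from AVS.
    mutating func receive(directive: String) {
        switch directive {
        case "SpeechSynthesizer.Speak":
            // Steps 4-5: report SpeechStarted and begin playback.
            sentEvents.append("SpeechSynthesizer.SpeechStarted")
            state = .playing
        case "SpeechRecognizer.ExpectSpeech":
            // Remember that the conversation continues.
            expectSpeech = true
        default:
            break
        }
    }

    // Steps 6-7: playback ended, report SpeechFinished.
    mutating func playbackFinished() {
        sentEvents.append("SpeechSynthesizer.SpeechFinished")
        // Back to step 1 if ExpectSpeech was returned, otherwise idle.
        state = expectSpeech ? .recording : .idle
        expectSpeech = false
    }
}
```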
Alexa Registration
https://developer.amazon.com
Alexa Registration
Application Type ID
Used for authentication
Alexa Registration
Key / Bundle Id
Used for authentication
Alexa Registration
Create new
Alexa Registration
You need the following three values:
• Application Type ID (cannot be changed)
• Bundle ID (more can be added)
• Key (more can be added)
Cannot be deleted
Alexa Authorization
https://developer.amazon.com/sdk-download
Alexa Authorization
To use the SDK from Swift, create a Bridging-Header:
#import <LoginWithAmazon/LoginWithAmazon.h>
Alexa Authorization
To receive the redirect through a URL scheme, register under URL Schemes a scheme
made of the Bundle Identifier prefixed with amzn-.
amzn-jp.classmethod.us.ios.Friendly
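The scheme rule above (the amzn- prefix plus the Bundle Identifier) is simple enough to express and check in code. A minimal sketch follows; `loginWithAmazonScheme` and `isLoginRedirect` are hypothetical helper names and are not part of the Login with Amazon SDK.

```swift
import Foundation

/// Returns the URL scheme to register: "amzn-" followed by the bundle identifier.
func loginWithAmazonScheme(bundleIdentifier: String) -> String {
    return "amzn-" + bundleIdentifier
}

/// Checks whether an incoming URL uses the login redirect scheme.
/// URL schemes are case-insensitive, so both sides are lowercased.
func isLoginRedirect(_ url: URL, bundleIdentifier: String) -> Bool {
    let expected = loginWithAmazonScheme(bundleIdentifier: bundleIdentifier).lowercased()
    return url.scheme?.lowercased() == expected
}
```

In an app, the bundle identifier would normally come from `Bundle.main.bundleIdentifier` rather than a literal.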
Alexa Authorization
Developer portal
Alexa Authorization
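import UIKit

class AppDelegate: UIResponder, UIApplicationDelegate {
    // Called when the app is opened through the amzn- URL scheme after login.
    func application(_ application: UIApplication,
                     open url: URL,
                     sourceApplication: String?,
                     annotation: Any) -> Bool {
        // Hand the redirect URL to the Login with Amazon SDK.
        return AIMobileLib.handleOpen(url, sourceApplication: sourceApplication)
    }
}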
Alexa Authorization
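@IBAction func tapLoginButton(_ sender: Any) {
    // Scope data is a JSON string. Replace the placeholders with the
    // product ID and device serial number from the developer portal.
    let scopeData = """
    {"alexa:all": {
        "productID": "YOUR_PRODUCT_ID",
        "productInstanceAttributes": {
            "deviceSerialNumber": "YOUR_DEVICE_SERIAL_NUMBER"
        }
    }}
    """
    AIMobileLib.authorizeUser(forScopes: ["alexa:all"],
                              delegate: self,
                              options: [kAIOptionScopeData: scopeData])
}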
Developer portal
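Because the scope data passed at login is a JSON string, building it from a dictionary avoids quoting mistakes. This is a sketch under assumptions: `makeScopeData` is a hypothetical helper, and the key layout follows the scope data shown on this slide.

```swift
import Foundation

/// Serializes the alexa:all scope data for the given product and device.
func makeScopeData(productID: String, deviceSerialNumber: String) throws -> String {
    let scope: [String: Any] = [
        "alexa:all": [
            "productID": productID,
            "productInstanceAttributes": [
                "deviceSerialNumber": deviceSerialNumber
            ]
        ]
    ]
    let data = try JSONSerialization.data(withJSONObject: scope)
    // JSONSerialization emits UTF-8, so this conversion does not fail.
    return String(data: data, encoding: .utf8)!
}
```

The returned string is what would be passed as the value for `kAIOptionScopeData`.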
Alexa Authorization
https://www.amazon.com
3. Lex
Lex Architecture
AWS Console (Lex)
Lex Architecture
Source: Amazon Polly and Amazon Lex Workshop
Lex Demo App
http://dev.classmethod.jp/smartphone/ios-lex-tap/
Lex
• Endpoints/API
• AWS mobile SDK
• Authorization
• SessionAttribute
Lex Endpoints/API
Service                  Region                  URL
Model building service   US East (N. Virginia)   https://models.lex.us-east-1.amazonaws.com
Runtime service          US East (N. Virginia)   https://runtime.lex.us-east-1.amazonaws.com
Lex Endpoints/API
Amazon Lex Model Building Service
• CreateBot
• DeleteBot
• DeleteBotAlias
• GetBot
• GetBotAlias
• GetBotAliases
• GetBuiltinIntent
• GetBuiltinSlotTypes
• PutBot
• PutBotAlias
• PutIntent
• PutSlotType
• etc.
Lex Endpoints/API
Amazon Lex Runtime Service
• PostContent
• PostText
Lex Endpoints/API
PostContent   Sends user input (text or speech) to Amazon Lex
PostText      Sends user input (text only) to Amazon Lex

PostContent request:
POST /bot/botName/alias/botAlias/user/userId/content HTTP/1.1
x-amz-lex-session-attributes: sessionAttributes
Content-Type: contentType
Accept: accept

inputStream

PostText request:
POST /bot/botName/alias/botAlias/user/userId/text HTTP/1.1
Content-type: application/json

{
  "inputText": "string",
  "sessionAttributes": {
    "string": "string"
  }
}
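The PostText request above maps directly onto a `Codable` body and a path built from the bot name, alias, and user ID. The sketch below only constructs the request; `PostTextBody` and `makePostTextRequest` are hypothetical names, and the AWS Signature Version 4 signing that the runtime endpoint requires is deliberately left out (the AWS mobile SDK handles it).

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives here on Linux
#endif

/// JSON body of a PostText request.
struct PostTextBody: Codable {
    let inputText: String
    let sessionAttributes: [String: String]
}

enum PostTextError: Error {
    case invalidURL
}

/// Builds an unsigned PostText request (signing is not shown).
func makePostTextRequest(botName: String,
                         botAlias: String,
                         userId: String,
                         body: PostTextBody) throws -> URLRequest {
    let base = "https://runtime.lex.us-east-1.amazonaws.com"
    let path = "/bot/\(botName)/alias/\(botAlias)/user/\(userId)/text"
    guard let url = URL(string: base + path) else {
        throw PostTextError.invalidURL
    }
    var request = URLRequest(url: url)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-type")
    request.httpBody = try JSONEncoder().encode(body)
    return request
}
```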
Lex AWS mobile SDK for iOS
http://dev.classmethod.jp/smartphone/amazon-lex-ios-sdk/
1. pod install
2. Microphone usage permission (Info.plist)
3. Issue an identity with Cognito and initialize Lex
4. AWSLexVoiceButton
Lex AWS mobile SDK for iOS
source 'https://github.com/CocoaPods/Specs.git'
target 'BotSampleApp' do
  platform :ios, '9.0'
  use_frameworks!
  pod 'AWSLex'
  pod 'AWSCognito'
end
Lex AWS mobile SDK for iOS
<key>NSMicrophoneUsageDescription</key>
<string>For interaction with Amazon Lex</string>
Lex AWS mobile SDK for iOS
import AWSCore
import AWSLex

// Cognito: obtain credentials from an identity pool.
let poolId = "us-east-1:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx"
let credentialsProvider = AWSCognitoCredentialsProvider(regionType: .USEast1, identityPoolId: poolId)
let configuration = AWSServiceConfiguration(region: .USEast1, credentialsProvider: credentialsProvider)
AWSServiceManager.default().defaultServiceConfiguration = configuration

// Lex: register the bot, once for the voice button and once for text chat.
let botName = "LexSample"
let botAlias = "$LATEST"
let chatConfig = AWSLexInteractionKitConfig.defaultInteractionKitConfig(withBotName: botName, botAlias: botAlias)
AWSLexInteractionKit.register(with: configuration!, interactionKitConfiguration: chatConfig, forKey: "AWSLexVoiceButton")
AWSLexInteractionKit.register(with: configuration!, interactionKitConfiguration: chatConfig, forKey: "chatConfig")
Lex AWS mobile SDK for iOS
• AWSLexInteractionDelegate
• AWSLexAudioPlayerDelegate
• AWSLexMicrophoneDelegate
Lex AWS mobile SDK for iOS
InteractionDelegate
• Fulfillment
• switchMode
Lex AWS mobile SDK for iOS
AudioPlayerDelegate
• started
• finished
Lex AWS mobile SDK for iOS
MicrophoneDelegate
• start
• end
• SoundLevelChanged
Lex Authorization
Lex SessionAttribute
Lambda input (sessionAttributes is passed in)
{
  "currentIntent": {
    "name": "intent-name",
    "slots": {"slot-name": "value"},
    "confirmationStatus": "None, Confirmed, or Denied (intent confirmation, if configured)"
  },
  "bot": {
    "name": "bot-name",
    "alias": "bot-alias",
    "version": "bot-version"
  },
  "userId": "User ID specified in the POST request to Amazon Lex.",
  "inputTranscript": "Text used to process the request",
  "invocationSource": "FulfillmentCodeHook or DialogCodeHook",
  "outputDialogMode": "Text or Voice, based on ContentType request header in runtime API request",
  "sessionAttributes": {
    "key1": "value1",
    "key2": "value2"
  }
}
http://docs.aws.amazon.com/ja_jp/lex/latest/dg/lambda-input-response-format.html
Lex SessionAttribute
Lambda response (sessionAttributes is returned)
{
  "sessionAttributes": {
    "key1": "value1",
    "key2": "value2"
  },
  "dialogAction": {
    "type": "Close",
    "fulfillmentState": "Fulfilled or Failed",
    "message": {
      "contentType": "PlainText or SSML",
      "content": "Message to convey to the user. For example, Thanks, your pizza has been ordered."
    },
    "responseCard": {
      "version": integer-value,
      "contentType": "application/vnd.amazonaws.card.generic",
      "genericAttachments": [
        {
          "title": "card-title",
          "subTitle": "card-sub-title",
          "imageUrl": "URL of the image to be shown",
          ...
http://docs.aws.amazon.com/ja_jp/lex/latest/dg/lambda-input-response-format.html
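To make the round trip of sessionAttributes through the Lambda function concrete, the sketch below models a subset of the input and a Close response as Swift types. Swift is used here only to show the data flow, whatever language the Lambda function is actually written in. `LexEvent`, `LexResponse`, `handle`, and the `turnCount` attribute are all hypothetical.

```swift
import Foundation

/// Subset of the Lambda input that this sketch needs.
struct LexEvent: Decodable {
    struct Intent: Decodable {
        let name: String
        let slots: [String: String?]  // unfilled slots arrive as null
    }
    let currentIntent: Intent
    let invocationSource: String
    let sessionAttributes: [String: String]?
}

/// Subset of the Lambda response: a Close action plus session attributes.
struct LexResponse: Encodable {
    struct Message: Encodable {
        let contentType: String
        let content: String
    }
    struct DialogAction: Encodable {
        let type: String
        let fulfillmentState: String
        let message: Message
    }
    let sessionAttributes: [String: String]
    let dialogAction: DialogAction
}

/// Carries incoming attributes forward and adds a counter (hypothetical key).
func handle(_ event: LexEvent) -> LexResponse {
    var attributes = event.sessionAttributes ?? [:]
    let turns = (Int(attributes["turnCount"] ?? "") ?? 0) + 1
    attributes["turnCount"] = String(turns)
    let message = LexResponse.Message(
        contentType: "PlainText",
        content: "Handled intent \(event.currentIntent.name).")
    let action = LexResponse.DialogAction(
        type: "Close", fulfillmentState: "Fulfilled", message: message)
    return LexResponse(sessionAttributes: attributes, dialogAction: action)
}
```

Whatever the function puts into `sessionAttributes` comes back to the client in the runtime API response and can be sent again with the next request.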
Lex SessionAttribute
Client POST (sessionAttributes is sent)
POST /bot/botName/alias/botAlias/user/userId/text HTTP/1.1
Content-type: application/json

{
  "inputText": "string",
  "sessionAttributes": {
    "string": "string"
  }
}
http://docs.aws.amazon.com/ja_jp/lex/latest/dg/API_runtime_PostText.html
Lex SessionAttribute
Client response (sessionAttributes is received)
HTTP/1.1 200
Content-type: application/json

{
  "dialogState": "string",
  "intentName": "string",
  "message": "string",
  "responseCard": {
    "contentType": "string",
    "version": "string"
  },
  "sessionAttributes": {
    "string": "string"
  },
  "slots": {
    "string": "string"
  },
  "slotToElicit": "string"
}
http://docs.aws.amazon.com/ja_jp/lex/latest/dg/API_runtime_PostText.html
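On the client side, the PostText response can be decoded with `Codable`, and the returned sessionAttributes kept for the next request. A minimal sketch, assuming only the fields shown above; `PostTextResponse` and `LexSession` are hypothetical names and not SDK types.

```swift
import Foundation

/// Fields of the PostText response used by this sketch.
struct PostTextResponse: Decodable {
    let dialogState: String?
    let intentName: String?
    let message: String?
    let sessionAttributes: [String: String]?
    let slots: [String: String?]?
    let slotToElicit: String?
}

/// Keeps the latest session attributes between requests.
struct LexSession {
    private(set) var attributes: [String: String] = [:]

    /// Decodes a response and stores its attributes for the next POST.
    @discardableResult
    mutating func update(with responseData: Data) throws -> PostTextResponse {
        let response = try JSONDecoder().decode(PostTextResponse.self, from: responseData)
        attributes = response.sessionAttributes ?? [:]
        return response
    }
}
```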
Lex SessionAttribute
"sessionAttributes": {
  "cards": [
    {
      "title": "Mocha",
      "subtitle": "Please enjoy the fragrant mocha",
      "body": "$4.15",
      "imageUrl": "https://exsample.com/mocha.png",
      "slotName": "CoffeeType",
      "slotValue": "Mocha"
    },
    ...
When a card is tapped, its value is sent as text.
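The card data carried in sessionAttributes can be decoded on the client and turned into the text that is sent when a card is tapped. This sketch assumes the `cards` value arrives as a JSON-encoded string, since session attributes in the runtime API are a string-to-string map; `Card`, `decodeCards`, and `inputText(forTapped:)` are hypothetical names.

```swift
import Foundation

/// One selectable card, with the fields shown on the slide.
struct Card: Codable, Equatable {
    let title: String
    let subtitle: String
    let body: String
    let imageUrl: String
    let slotName: String
    let slotValue: String
}

/// Decodes the cards stored under the "cards" key; returns [] if absent or invalid.
func decodeCards(from sessionAttributes: [String: String]) -> [Card] {
    guard let json = sessionAttributes["cards"],
          let cards = try? JSONDecoder().decode([Card].self, from: Data(json.utf8)) else {
        return []
    }
    return cards
}

/// The text to send to Lex when the user taps a card: the card's slot value.
func inputText(forTapped card: Card) -> String {
    return card.slotValue
}
```

The returned text would then be sent with a text request, exactly as if the user had typed or spoken it.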
4. Polly
Polly Architecture
https://www.slideshare.net/AmazonWebServices/amazon-polly?qid=c9c8e2c4-1c27-41c1-bc5f-9b1e0c651c21&v=&b=&from_search=11
Polly
• AWS mobile SDK
• Authorization
Polly AWS mobile SDK
source 'https://github.com/CocoaPods/Specs.git'
target 'BotSampleApp' do
  platform :ios, '9.0'
  use_frameworks!
  pod 'AWSPolly'
  pod 'AWSCognito'
end
Polly AWS mobile SDK
import AVFoundation
import AWSPolly

// AWSPollySynthesizeSpeechURLBuilderRequest: describes the speech to synthesize.
let input = AWSPollySynthesizeSpeechURLBuilderRequest()
input.text = textView.text                      // text to speak
input.outputFormat = AWSPollyOutputFormat.mp3
input.voiceId = AWSPollyVoiceId.joanna          // voice type
// Ask for a pre-signed URL that streams the synthesized audio.
let builder = AWSPollySynthesizeSpeechURLBuilder.default().getPreSignedURL(input)
builder.continueOnSuccessWith { (awsTask: AWSTask<NSURL>) -> Any? in
    let url = awsTask.result!
    // Play the audio with AVFoundation (audioPlayer is an AVPlayer).
    self.audioPlayer.replaceCurrentItem(with: AVPlayerItem(url: url as URL))
    self.audioPlayer.play()
    return nil
}
http://dev.classmethod.jp/smartphone/amazon-polly-sdk/
Polly Authorization
Polly AWS mobile SDK
• 24 languages
• Voice types
5. Summary
[Figure: summary diagram; recoverable labels: Skill, Lambda, Intent, Slots, SessionAttributes]
Summary
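1. An overview of Amazon's voice services
2. How to implement a voice client
3. Choosing a service to match your requirements
Thank you, and I look forward to your continued support.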

Developers.io 2017: Building a Voice-Enabled Client on iPhone with Alexa, Lex, and Polly