Otto was my plush monkey; now
it is my voice assistant.
Software
Base Software
Orchestration and virtualization
Docker is an open platform for developers and
sysadmins to build, ship, and run distributed
applications, whether on laptops, data center VMs,
or the cloud.
Runtime
Node.js® is a JavaScript runtime built on Chrome's
V8 JavaScript engine.
DBMS
MongoDB is a document database with the
scalability and flexibility that you want with the
querying and indexing that you need.
Speech Recognizer
Google Cloud Speech API enables developers
to convert audio to text by applying powerful
neural network models in an easy to use API.
NLP
Dialogflow is an end-to-end development suite
for building conversational interfaces for
websites, mobile applications, popular messaging
platforms, and IoT devices
TTS
Amazon Polly is a cloud service that converts text
into lifelike speech. You can use Amazon Polly to
develop applications that increase engagement
and accessibility.
Hotword detector
Snowboy is a highly customizable hotword
detection engine that is embedded, real-time, and
always listening (even when offline). It is
compatible with Raspberry Pi, (Ubuntu) Linux, and
Mac OS X.
Architecture
Architecture for client mode
[Diagram: Client, Server, Database, NLP, SR, TTS]
Architecture for messaging bots
Server
The server listens for incoming requests from messaging platforms.
I/O Drivers and Accessories
I/O Drivers
I/O drivers are how the AI handles input and output.

Every I/O module knows how to handle user input and
output to the user.
[Diagram: User, I/O driver, App; arrows labeled Input, startInput, Output]
I/O Drivers
Examples of I/O drivers:
- IO.Telegram: handles I/O for a Telegram bot
- IO.Messenger: handles I/O for a Facebook Messenger bot
- IO.Test: handles I/O using the CLI (used for testing)
- IO.Rest: handles I/O via an HTTP REST API
- IO.Kid: handles input using a microphone and a speech
recognizer, and output using a TTS via a speaker

IO.Kid
It uses your microphone to record your voice; once it detects a hotword
(for example, "Hey BOT"), it streams the audio to an online
speech recognizer.
When you finish talking, it sends the recognized speech to the AI, which
returns a fulfillment.

The fulfillment is sent to an online TTS to get an audio file that is
played through the speaker.
https://github.com/kopiro/otto-ai/blob/master/src/io/kid.js
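The IO.Kid loop described above can be read as a small state machine. The sketch below is a hypothetical illustration, not code from kid.js: the state names (idle, listening, resolving, speaking) and the event names (hotword, speechEnd, fulfillment, playbackEnd) are assumptions chosen for this example.

```javascript
// Hypothetical sketch of the IO.Kid loop as a state machine.
// State and event names are assumptions, not identifiers from otto-ai.
const transitions = {
  idle: { hotword: 'listening' },          // hotword heard: start streaming to SR
  listening: { speechEnd: 'resolving' },   // user stopped talking: send text to the AI
  resolving: { fulfillment: 'speaking' },  // AI returned a fulfillment: send it to TTS
  speaking: { playbackEnd: 'idle' },       // audio played: wait for the next hotword
};

// Return the next state, or stay in the current one if the event does not apply.
function nextState(state, event) {
  const next = transitions[state] && transitions[state][event];
  return next || state;
}

// Walk through one full interaction, starting from idle.
const events = ['hotword', 'speechEnd', 'fulfillment', 'playbackEnd'];
const trace = events.reduce(
  (states, event) => states.concat(nextState(states[states.length - 1], event)),
  ['idle']
);
console.log(trace.join(' -> '));
```

In this sketch, events that arrive in the wrong state (for example a second hotword while speaking) are simply ignored.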
IO.Kid: HW to SR
[Diagram: the User says "Hey, Otto"; IO.Kid (Client) redirects the microphone stream to SR]
IO.Kid: SR to NLP
[Diagram: the User says "What time is it?"; IO.Kid (Client) streams it to SR, and the recognized text "What time is it?" goes to NLP]
IO.Kid: NLP to Fulfillment to TTS
[Diagram: NLP returns { "action": "date.now" }; the Server resolves the action via webhook; IO.Kid (Client) sends the fulfillment "It's 18.15" to TTS]
IO.Kid: TTS to Speaker
[Diagram: TTS returns audio.mp3 to IO.Kid (Client), which plays it on the Speaker (Output)]
IO.Telegram
It listens for your Telegram bot's chat events via webhook (or via polling) and
sends the text to the AI, which returns an output.

The output is used to respond to the user's request via Telegram.
https://github.com/kopiro/otto-ai/blob/master/src/io/telegram.js
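As a rough sketch of the first step, a Telegram update can be reduced to a text request of the shape { "type": "text", "text": ... } used in the flow for this driver. The function name toTextRequest and the chatId field are assumptions made for this example; message.chat.id and message.text are fields of the Telegram Bot API Update object.

```javascript
// Hypothetical helper: turn a Telegram Bot API update into a text request.
// toTextRequest and chatId are names invented for this sketch.
function toTextRequest(update) {
  const message = update && update.message;
  // Ignore updates that carry no text (photos, stickers, and so on).
  if (!message || typeof message.text !== 'string') return null;
  return {
    chatId: message.chat.id,
    type: 'text',
    text: message.text,
  };
}

const request = toTextRequest({
  update_id: 1,
  message: { message_id: 10, chat: { id: 42, type: 'private' }, text: 'What time is it?' },
});
console.log(request);
```

Keeping the chat id next to the text is one way to let the driver route the AI's output back to the right conversation.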
IO.Telegram: Hotword to NLP
[Diagram: the User writes on Telegram; IO.Telegram (Server) receives { "type": "text", "text": "What time is it?" } and sends "What time is it?" to NLP]
IO.Telegram: NLP to Fulfillment to Telegram
[Diagram: NLP returns { "action": "date.now" }; the Server resolves the action via webhook; IO.Telegram (Server) sends the fulfillment "It's 18.15" to the User via Telegram]
I/O Accessories
I/O accessories are similar to drivers, but they don't handle input and output
directly.

They can be attached to an I/O driver to perform additional tasks.
Examples of I/O accessories:
- Chromecast
- GPIO_Button
- Leds
- Mopidy
I/O Accessories
Accessories listen for I/O driver events and, when an output to a driver is
requested, this output can be forwarded to accessories.
Each accessory has a method called canHandleOutput that should return one of:
- YES_AND_BREAK
- YES_AND_CONTINUE
- NO

Depending on this return value, the IOManager forwards the output to the next
configured driver or stops the chain.
Example: https://github.com/kopiro/otto-ai/blob/master/src/io_accessories/chromecast.js
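The forwarding rule can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the IOManager implementation: only canHandleOutput and the three return values come from the slides, while forwardOutput, the output() method, and the stub accessories are invented here. It also assumes that YES_AND_CONTINUE means "handle and keep going", YES_AND_BREAK means "handle and stop", and NO means "skip".

```javascript
// Hypothetical sketch of the forwarding chain.
// Only canHandleOutput and the three return values come from the slides;
// forwardOutput, output() and the stub accessories are invented here.
const YES_AND_BREAK = 'YES_AND_BREAK';
const YES_AND_CONTINUE = 'YES_AND_CONTINUE';
const NO = 'NO';

function forwardOutput(accessories, fulfillment, session) {
  const handledBy = [];
  for (const accessory of accessories) {
    const verdict = accessory.canHandleOutput(fulfillment, session);
    if (verdict === NO) continue;          // not interested: try the next one
    accessory.output(fulfillment, session);
    handledBy.push(accessory.id);
    if (verdict === YES_AND_BREAK) break;  // handled: stop the chain
  }
  return handledBy;
}

// Stub accessories that only record what they were asked to output.
const log = [];
const stub = (id, verdictFor) => ({
  id,
  canHandleOutput: verdictFor,
  output: (f) => log.push(`${id}: ${f.speech}`),
});

const accessories = [
  stub('leds', () => YES_AND_CONTINUE),
  stub('mopidy', (f) => (f.data.music ? YES_AND_BREAK : NO)),
  stub('chromecast', (f) => (f.data.video ? YES_AND_BREAK : NO)),
  stub('never-reached', () => YES_AND_CONTINUE),
];

const handled = forwardOutput(accessories, { speech: 'Playing video', data: { video: {} } });
console.log(handled);
```

Here the LEDs react and let the output through, Mopidy declines because there is no music, and the Chromecast handles the video and stops the chain.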
Intents and Entities
Intents
An intent represents a mapping between what a user says
and what action should be taken by your software.
Entities
Entities are tools used for extracting parameters.
Actions
Actions
An action corresponds to the step your application will take when a
specific intent has been triggered by a user’s input. 
In the library, an action is a responder for an intent that contains logic.
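// Responder for the 'hello.name' intent.
module.exports = async function({ sessionId, result }, session) {
  const { parameters: p, fulfillment } = result;
  // Reject the request when the NLP did not extract a name.
  if (p.name == null) throw 'Invalid parameters';
  return {
    speech: `Hello ${p.name}!`
  };
};
// Set the id after module.exports; assigning exports.id before the
// reassignment above would leave it on an object that is never exported.
module.exports.id = 'hello.name';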
Actions: local vs remote
Each action can potentially run on the server or on the client.

This is possible because both platforms are built on the same
runtime (Node.js).
In the intent, you can specify whether the action should preferably run on
the server or on a client.

For example, a very computationally intensive action (such as an algorithm to
find the next move in a chess game) should run on a powerful server and
only return the output.
Actions: trust boundary
[Diagram: the internal network is a trusted boundary; a local action inside it is OK, a remote action reaching into it is denied]
If, instead, you have to control your home lights, you should run the action locally on the
client, because the client is on the same Wi-Fi network as your lights; this
avoids exposing your IoT devices to the Internet.
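The choice between local and remote execution can be sketched as a small decision function. This is an illustration only: preferredSide, needsLocalNetwork, decideExecution, and the action ids below are hypothetical names, not options or identifiers from otto-ai.

```javascript
// Hypothetical sketch: decide where an action should run.
// preferredSide and needsLocalNetwork are invented fields, not otto-ai options.
function decideExecution(action, currentSide) {
  // Actions that talk to devices on the home network must stay on the client.
  const target = action.needsLocalNetwork
    ? 'client'
    : (action.preferredSide || 'server');
  return target === currentSide
    ? { run: true, delegateTo: null }
    : { run: false, delegateTo: target };
}

// Example actions (ids are made up for this sketch).
const chess = { id: 'chess.move', preferredSide: 'server' };
const lights = { id: 'lights.on', preferredSide: 'server', needsLocalNetwork: true };

console.log(decideExecution(chess, 'client'));
console.log(decideExecution(lights, 'client'));
console.log(decideExecution(lights, 'server'));
```

In this sketch the network requirement overrides the preference, so the lights action stays inside the trusted boundary even if the intent asks for the server.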
Action (run in server mode)
[Diagram: Client, NLP, Server, Action; numbered steps 0 to 5]
Action (run in client mode)
[Diagram: Client, NLP, Action; numbered steps 0 to 3]
Fulfillment
Fulfillment
A fulfillment is the output of an intent, whether it was performed by
an action or a simple output string.
Fulfillment transformer
Every fulfillment passes through a transformer, where it can be filtered or
altered in some way.
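const _ = require('lodash');

// fulfillmentSanitizer and Translator are not shown on this slide.
async function fulfillmentTransformer(fulfillment, session) {
  fulfillment = fulfillmentSanitizer(fulfillment);
  // Copy payload fields into data without overwriting existing keys.
  _.defaults(fulfillment.data, fulfillment.payload);
  // Translate the speech, if any, into the session's target language.
  if (!_.isEmpty(fulfillment.speech)) {
    fulfillment.speech = await Translator.translate(
      fulfillment.speech,
      session.getTranslateTo()
    );
  }
  return fulfillment;
}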
Fulfillment types
speech | String that could be spoken or written
data.error | Error object to send.
data.language | Language override for speech.
data.replies[] | List of choices that the user can select.
data.url | URL to send or to open
data.music | Music to send or to play.
data.feedback | Boolean value indicating that this is temporary feedback until the real response is sent
data.game | Game that can be handled via Telegram.
data.video | Video to send or to show.
data.audio | Audio to send or to show.
data.image | Image to send or to show.
data.lyrics | Lyrics object of a song.
data.voice | Audio file to send or play via voice middlewares.
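To make the table concrete, here is a hypothetical fulfillment that combines speech with two of the data fields. The values, the shape of data.replies (plain strings), and the usedDataFields helper are assumptions for this sketch; the slides only give the field names and their meaning.

```javascript
// Hypothetical fulfillment using field names from the table above.
// The values and the shape of data.replies are assumptions.
const fulfillment = {
  speech: 'Which one do you choose?',
  data: {
    language: 'en',
    replies: ['Rock', 'Paper', 'Scissors'],
  },
};

// List the data.* fields that are set (helper invented for this sketch).
function usedDataFields(f) {
  return Object.keys(f.data || {}).map((key) => `data.${key}`);
}

console.log(usedDataFields(fulfillment));
```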
Complete Flow
[Diagram: User -> Input -> NLP -> Intent -> Action -> Fulfillment -> Output -> User]
What can I do?
Intents implemented right now
- Akinator: uses machine learning to guess a celebrity
- Alarm: sets alarms, meetings, and timers
- Chess: plays chess with a minimax algorithm
- Coinflip: flips a coin
- Date.now: tells the current date
- Gocrazy: says random words
- Lyrics.Search: searches the lyrics of a track
- Lyrics.Track: finds a track from its lyrics
- Metronome: runs a metronome
- SCF: plays Rock-Paper-Scissors (Sasso-Carta-Forbice)
Intents implemented right now
- Torrent.Download: searches and downloads torrents
- Translate.Text: translates text into various languages
- Weather.Search: gets weather information
- Music.*: searches music on Spotify and plays it over a speaker or Chromecast
- Youtube.*: searches videos on YouTube and plays them over a Chromecast
- Lights.*: powers Xiaomi lights on/off and changes their color or intensity
- Draw: searches for an image
- Knowledge.Get: uses WolframAlpha to answer general knowledge questions
- Camera.Spy: records a video and uploads it to the cloud
- SmallTalk.*: all kinds of dialogue. Thanks @ValentinaCiav
How to write and test an action
git clone https://github.com/kopiro/otto-ai/



... configure ...
cp ./src/actions/__example.js ./src/actions/namespace/newaction.js


... develop ...



node main.js
Hardware
Base hardware
Raspberry Pi Zero W
Re-Speaker 2-Mics Pi HAT
PowerBoost 500 Charger
Additional hardware
LiPo Battery 3.5V
Push Button On/Off Switch Button
Speaker
Re-Speaker 2-Mics Pi HAT
The board is developed based on WM8960, a low power stereo codec.

There are 2 microphones on both sides of the board for collecting sounds and it also provides
3 APA102 RGB LEDs, 1 User Button and 2 on-board Grove interfaces for expanding your
applications.
PowerBoost 500 Charger
With a built-in battery charger circuit, you'll be able to keep your project running even while
recharging the battery!

This little DC/DC boost converter module can be powered by any 3.7V LiIon/LiPoly battery, and
convert the battery output to 5.2V DC for running your 5V projects.
[Wiring diagram labels: EN, GND, USB, 5V, GPIO pins, GPIO8]
Otto AI
Otto AI

More Related Content

What's hot

Local-Link Networking
Local-Link NetworkingLocal-Link Networking
Local-Link Networkingsinchume
 
Devoxx uk 2019 digital jukebox
Devoxx uk 2019 digital jukeboxDevoxx uk 2019 digital jukebox
Devoxx uk 2019 digital jukeboxScott Sosna
 
Adhearsion and Telegraph Framework Presentation
Adhearsion and Telegraph Framework PresentationAdhearsion and Telegraph Framework Presentation
Adhearsion and Telegraph Framework PresentationJustin Grammens
 
Asterisk-Java Framework Presentation
Asterisk-Java Framework PresentationAsterisk-Java Framework Presentation
Asterisk-Java Framework PresentationJustin Grammens
 
Artificial Neural Networks on a Tic Tac Toe console application
Artificial Neural Networks on a Tic Tac Toe console applicationArtificial Neural Networks on a Tic Tac Toe console application
Artificial Neural Networks on a Tic Tac Toe console applicationEduardo Gulias Davis
 
Getting started with Linux and Python by Caffe
Getting started with Linux and Python by CaffeGetting started with Linux and Python by Caffe
Getting started with Linux and Python by CaffeLihang Li
 
The Casting Demonstrator. Using the Raspberry Pi for graphics and simulated f...
The Casting Demonstrator. Using the Raspberry Pi for graphics and simulated f...The Casting Demonstrator. Using the Raspberry Pi for graphics and simulated f...
The Casting Demonstrator. Using the Raspberry Pi for graphics and simulated f...InfinIT - Innovationsnetværket for it
 
Bh us-02-kaminsky-blackops
Bh us-02-kaminsky-blackopsBh us-02-kaminsky-blackops
Bh us-02-kaminsky-blackopsDan Kaminsky
 
10 World’s Leading Speech or Voice Recognition Software That Can 3X Your Prod...
10 World’s Leading Speech or Voice Recognition Software That Can 3X Your Prod...10 World’s Leading Speech or Voice Recognition Software That Can 3X Your Prod...
10 World’s Leading Speech or Voice Recognition Software That Can 3X Your Prod...nehachhh
 
Building AI-powered Apps on AWS
Building AI-powered Apps on AWSBuilding AI-powered Apps on AWS
Building AI-powered Apps on AWSAdrian Hornsby
 
A quick overview of why to use and how to set up iPython notebooks for research
A quick overview of why to use and how to set up iPython notebooks for researchA quick overview of why to use and how to set up iPython notebooks for research
A quick overview of why to use and how to set up iPython notebooks for researchAdam Pah
 
Internet of Things With PHP
Internet of Things With PHPInternet of Things With PHP
Internet of Things With PHPAdam Englander
 

What's hot (14)

Local-Link Networking
Local-Link NetworkingLocal-Link Networking
Local-Link Networking
 
Devoxx uk 2019 digital jukebox
Devoxx uk 2019 digital jukeboxDevoxx uk 2019 digital jukebox
Devoxx uk 2019 digital jukebox
 
Adhearsion and Telegraph Framework Presentation
Adhearsion and Telegraph Framework PresentationAdhearsion and Telegraph Framework Presentation
Adhearsion and Telegraph Framework Presentation
 
Asterisk-Java Framework Presentation
Asterisk-Java Framework PresentationAsterisk-Java Framework Presentation
Asterisk-Java Framework Presentation
 
Artificial Neural Networks on a Tic Tac Toe console application
Artificial Neural Networks on a Tic Tac Toe console applicationArtificial Neural Networks on a Tic Tac Toe console application
Artificial Neural Networks on a Tic Tac Toe console application
 
Getting started with Linux and Python by Caffe
Getting started with Linux and Python by CaffeGetting started with Linux and Python by Caffe
Getting started with Linux and Python by Caffe
 
The Casting Demonstrator. Using the Raspberry Pi for graphics and simulated f...
The Casting Demonstrator. Using the Raspberry Pi for graphics and simulated f...The Casting Demonstrator. Using the Raspberry Pi for graphics and simulated f...
The Casting Demonstrator. Using the Raspberry Pi for graphics and simulated f...
 
Konstruktion omkring en Raspberry Pi
Konstruktion omkring en Raspberry PiKonstruktion omkring en Raspberry Pi
Konstruktion omkring en Raspberry Pi
 
Bh us-02-kaminsky-blackops
Bh us-02-kaminsky-blackopsBh us-02-kaminsky-blackops
Bh us-02-kaminsky-blackops
 
10 World’s Leading Speech or Voice Recognition Software That Can 3X Your Prod...
10 World’s Leading Speech or Voice Recognition Software That Can 3X Your Prod...10 World’s Leading Speech or Voice Recognition Software That Can 3X Your Prod...
10 World’s Leading Speech or Voice Recognition Software That Can 3X Your Prod...
 
Building AI-powered Apps on AWS
Building AI-powered Apps on AWSBuilding AI-powered Apps on AWS
Building AI-powered Apps on AWS
 
A quick overview of why to use and how to set up iPython notebooks for research
A quick overview of why to use and how to set up iPython notebooks for researchA quick overview of why to use and how to set up iPython notebooks for research
A quick overview of why to use and how to set up iPython notebooks for research
 
Internet of Things With PHP
Internet of Things With PHPInternet of Things With PHP
Internet of Things With PHP
 
Teaming up with robot!
Teaming up with robot!Teaming up with robot!
Teaming up with robot!
 

Similar to Otto AI

Machine learning, WTF!?
Machine learning, WTF!? Machine learning, WTF!?
Machine learning, WTF!? Alê Borba
 
Why And When Should We Consider Stream Processing In Our Solutions Teqnation ...
Why And When Should We Consider Stream Processing In Our Solutions Teqnation ...Why And When Should We Consider Stream Processing In Our Solutions Teqnation ...
Why And When Should We Consider Stream Processing In Our Solutions Teqnation ...Soroosh Khodami
 
Building Voice Controls and Integrating with Automation Actions on an IoT Net...
Building Voice Controls and Integrating with Automation Actions on an IoT Net...Building Voice Controls and Integrating with Automation Actions on an IoT Net...
Building Voice Controls and Integrating with Automation Actions on an IoT Net...Intel® Software
 
The Future of Cross-Platform is Native
The Future of Cross-Platform is NativeThe Future of Cross-Platform is Native
The Future of Cross-Platform is NativeJustin Mancinelli
 
Voice Assistant Expert Services
Voice Assistant Expert ServicesVoice Assistant Expert Services
Voice Assistant Expert ServicesJamie (Taka) Wang
 
DevNet @TAG - Spark & Tropo APIs - Milan/Rome May 2016
DevNet @TAG - Spark & Tropo APIs - Milan/Rome May 2016DevNet @TAG - Spark & Tropo APIs - Milan/Rome May 2016
DevNet @TAG - Spark & Tropo APIs - Milan/Rome May 2016Cisco DevNet
 
Otra forma de hacer aplicaciones de telefonía
Otra forma de hacer aplicaciones de telefoníaOtra forma de hacer aplicaciones de telefonía
Otra forma de hacer aplicaciones de telefoníaMartin Perez
 
JmDNS : Service Discovery for the 21st Century
 JmDNS : Service Discovery for the 21st Century JmDNS : Service Discovery for the 21st Century
JmDNS : Service Discovery for the 21st CenturyGnu Alsonative
 
JmDNS : Service Discovery for the 21st Century
 JmDNS : Service Discovery for the 21st Century JmDNS : Service Discovery for the 21st Century
JmDNS : Service Discovery for the 21st CenturyGnu Alsonative
 
Game programming with Groovy
Game programming with GroovyGame programming with Groovy
Game programming with GroovyJames Williams
 
New Artificial Intelligence and IoT Services (Lex, Polly, Rekognition, Greeng...
New Artificial Intelligence and IoT Services (Lex, Polly, Rekognition, Greeng...New Artificial Intelligence and IoT Services (Lex, Polly, Rekognition, Greeng...
New Artificial Intelligence and IoT Services (Lex, Polly, Rekognition, Greeng...Amazon Web Services
 
Build Great Networked APIs with Swift, OpenAPI, and gRPC
Build Great Networked APIs with Swift, OpenAPI, and gRPCBuild Great Networked APIs with Swift, OpenAPI, and gRPC
Build Great Networked APIs with Swift, OpenAPI, and gRPCTim Burks
 
Cloud-Native Roadshow - Google - Atlanta
Cloud-Native Roadshow - Google - AtlantaCloud-Native Roadshow - Google - Atlanta
Cloud-Native Roadshow - Google - AtlantaVMware Tanzu
 
Creating Flash Content for Multiple Screens
Creating Flash Content for Multiple ScreensCreating Flash Content for Multiple Screens
Creating Flash Content for Multiple Screenspaultrani
 
Flash for Mobile Devices
Flash for Mobile DevicesFlash for Mobile Devices
Flash for Mobile Devicespaultrani
 
Serverless APIs with Apache OpenWhisk
Serverless APIs with Apache OpenWhiskServerless APIs with Apache OpenWhisk
Serverless APIs with Apache OpenWhiskDaniel Krook
 
Minor PPT.pptx
Minor PPT.pptxMinor PPT.pptx
Minor PPT.pptxNewbGaming
 
AWS re:Invent 2016: Robots: The Fading Line Between Real and Virtual Worlds (...
AWS re:Invent 2016: Robots: The Fading Line Between Real and Virtual Worlds (...AWS re:Invent 2016: Robots: The Fading Line Between Real and Virtual Worlds (...
AWS re:Invent 2016: Robots: The Fading Line Between Real and Virtual Worlds (...Amazon Web Services
 

Similar to Otto AI (20)

Machine learning, WTF!?
Machine learning, WTF!? Machine learning, WTF!?
Machine learning, WTF!?
 
Why And When Should We Consider Stream Processing In Our Solutions Teqnation ...
Why And When Should We Consider Stream Processing In Our Solutions Teqnation ...Why And When Should We Consider Stream Processing In Our Solutions Teqnation ...
Why And When Should We Consider Stream Processing In Our Solutions Teqnation ...
 
Building Voice Controls and Integrating with Automation Actions on an IoT Net...
Building Voice Controls and Integrating with Automation Actions on an IoT Net...Building Voice Controls and Integrating with Automation Actions on an IoT Net...
Building Voice Controls and Integrating with Automation Actions on an IoT Net...
 
The Future of Cross-Platform is Native
The Future of Cross-Platform is NativeThe Future of Cross-Platform is Native
The Future of Cross-Platform is Native
 
Voice Assistant Expert Services
Voice Assistant Expert ServicesVoice Assistant Expert Services
Voice Assistant Expert Services
 
DevNet @TAG - Spark & Tropo APIs - Milan/Rome May 2016
DevNet @TAG - Spark & Tropo APIs - Milan/Rome May 2016DevNet @TAG - Spark & Tropo APIs - Milan/Rome May 2016
DevNet @TAG - Spark & Tropo APIs - Milan/Rome May 2016
 
01 introduction
01 introduction01 introduction
01 introduction
 
Otra forma de hacer aplicaciones de telefonía
Otra forma de hacer aplicaciones de telefoníaOtra forma de hacer aplicaciones de telefonía
Otra forma de hacer aplicaciones de telefonía
 
JmDNS : Service Discovery for the 21st Century
 JmDNS : Service Discovery for the 21st Century JmDNS : Service Discovery for the 21st Century
JmDNS : Service Discovery for the 21st Century
 
JmDNS : Service Discovery for the 21st Century
 JmDNS : Service Discovery for the 21st Century JmDNS : Service Discovery for the 21st Century
JmDNS : Service Discovery for the 21st Century
 
Game programming with Groovy
Game programming with GroovyGame programming with Groovy
Game programming with Groovy
 
New Artificial Intelligence and IoT Services (Lex, Polly, Rekognition, Greeng...
New Artificial Intelligence and IoT Services (Lex, Polly, Rekognition, Greeng...New Artificial Intelligence and IoT Services (Lex, Polly, Rekognition, Greeng...
New Artificial Intelligence and IoT Services (Lex, Polly, Rekognition, Greeng...
 
Ruby voip
Ruby voipRuby voip
Ruby voip
 
Build Great Networked APIs with Swift, OpenAPI, and gRPC
Build Great Networked APIs with Swift, OpenAPI, and gRPCBuild Great Networked APIs with Swift, OpenAPI, and gRPC
Build Great Networked APIs with Swift, OpenAPI, and gRPC
 
Cloud-Native Roadshow - Google - Atlanta
Cloud-Native Roadshow - Google - AtlantaCloud-Native Roadshow - Google - Atlanta
Cloud-Native Roadshow - Google - Atlanta
 
Creating Flash Content for Multiple Screens
Creating Flash Content for Multiple ScreensCreating Flash Content for Multiple Screens
Creating Flash Content for Multiple Screens
 
Flash for Mobile Devices
Flash for Mobile DevicesFlash for Mobile Devices
Flash for Mobile Devices
 
Serverless APIs with Apache OpenWhisk
Serverless APIs with Apache OpenWhiskServerless APIs with Apache OpenWhisk
Serverless APIs with Apache OpenWhisk
 
Minor PPT.pptx
Minor PPT.pptxMinor PPT.pptx
Minor PPT.pptx
 
AWS re:Invent 2016: Robots: The Fading Line Between Real and Virtual Worlds (...
AWS re:Invent 2016: Robots: The Fading Line Between Real and Virtual Worlds (...AWS re:Invent 2016: Robots: The Fading Line Between Real and Virtual Worlds (...
AWS re:Invent 2016: Robots: The Fading Line Between Real and Virtual Worlds (...
 

Recently uploaded

GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 

Recently uploaded (20)

GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 

Otto AI

  • 1. Otto was my monkey plush, now is my vocal assistant.
  • 4. Orchestration and virtualization Docker is an open platform for developers and sysadmins to build, ship, and run distributed applications, whether on laptops, data center VMs, or the cloud.
  • 5. Runtime Node.js® is a JavaScript runtime built on Chrome's V8 JavaScript engine.
  • 6. DBMS MongoDB is a document database with the scalability and flexibility that you want with the querying and indexing that you need.
  • 7. Speech Recognizer Google Cloud Speech API enables developers to convert audio to text by applying powerful neural network models in an easy to use API.
  • 8. NLP Dialogflow is an end-to-end development suite for building conversational interfaces for websites, mobile applications, popular messaging platforms, and IoT devices
  • 9. TTS Amazon Polly is a cloud service that converts text into lifelike speech. You can use Amazon Polly to develop applications that increase engagement and accessibility.
  • 10. Hotword detector Snowboy is a highly customizable hotword detection engine that is embedded, real-time, and always listening (even when offline), compatible with Raspberry Pi, (Ubuntu) Linux, and Mac OS X.
  • 12. Architecture for client mode [Diagram components: Client, Server, Database, SR, NLP, TTS]
  • 13. Architecture for messaging bots [Diagram component: Server] The server listens for incoming requests from messaging platforms.
  • 14. I/O Drivers and Accessories
  • 15. I/O Drivers I/O drivers are the way the AI handles input and output.
 Every I/O module knows how to handle user input and output to the user. [Diagram labels: User, I/O, App, Input, startInput, Output]
  • 16. I/O Drivers Examples of I/O drivers are:
 - IO.Telegram: handles I/O for a Telegram bot
 - IO.Messenger: handles I/O for a Facebook Messenger bot
 - IO.Test: handles I/O using the CLI (used for test purposes)
 - IO.Rest: handles I/O via an HTTP REST API
 - IO.Kid: handles input using a microphone and a speech recognizer, and output using a TTS played through a speaker
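 The driver contract described above can be sketched as follows. This is a minimal illustration, not the project's actual code: only the names startInput and output come from the slides, while EchoDriver, receive, attach, and the 'input' event are hypothetical.

```javascript
// Hypothetical sketch of an I/O driver. Only `startInput` and `output`
// are names taken from the slides; everything else is an assumption.
const EventEmitter = require('events');

class EchoDriver extends EventEmitter {
  constructor() {
    super();
    this.started = false;
    this.sent = []; // outputs delivered to the user, kept for inspection
  }

  // Begin accepting user input. A real driver would open a microphone,
  // register a webhook, or show a CLI prompt here.
  startInput() {
    this.started = true;
  }

  // Simulate the user typing a line of text (hypothetical helper).
  receive(text) {
    if (!this.started) throw new Error('startInput() was not called');
    this.emit('input', { text });
  }

  // Deliver a fulfillment back to the user.
  output(fulfillment) {
    this.sent.push(fulfillment.speech);
  }
}

// Stand-in for the app: it answers every input by echoing it back.
function attach(driver) {
  driver.on('input', ({ text }) => {
    driver.output({ speech: `You said: ${text}` });
  });
  driver.startInput();
}

const driver = new EchoDriver();
attach(driver);
driver.receive('What time is it?');
console.log(driver.sent);
```

 The point of the design is that the app only sees input events and an output method, so a Telegram, REST, or voice driver can be swapped in without changing the app.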

  • 17. IO.Kid It uses your microphone to capture your voice; once it detects a hotword (example: Hey BOT), it sends the stream to an online speech recognizer. When you finish talking, it sends the recognized speech to the AI, which returns a fulfillment.
 
 The fulfillment is sent to an online TTS to get an audio file that is played over the speaker. https://github.com/kopiro/otto-ai/blob/master/src/io/kid.js
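 The flow above (hotword, speech recognition, AI, TTS) can be sketched as a chain of asynchronous stages. All four stages below are hypothetical stubs that operate on plain strings instead of audio; they stand in for Snowboy, the speech recognizer, the NLP service, and the TTS service, and none of the names come from the project.

```javascript
// Hypothetical stubs: each one stands in for an external service.
const stages = {
  // Hotword detector: is the wake phrase at the start of the "audio"?
  detectHotword: async (audio) => audio.startsWith('hey otto'),
  // Speech recognizer: strip the wake phrase and return the spoken text.
  recognize: async (audio) => audio.replace(/^hey otto,?\s*/, ''),
  // AI: map the recognized text to a fulfillment.
  resolve: async (text) => ({
    speech: text === 'what time is it?' ? "It's 18.15" : 'Sorry?',
  }),
  // TTS: turn the speech into a (fake) audio file descriptor.
  synthesize: async (speech) => ({ file: 'audio.mp3', speech }),
};

// Runs one utterance through the whole pipeline.
// Returns null when no hotword is detected, so the client stays idle.
async function handleUtterance(audio, s = stages) {
  if (!(await s.detectHotword(audio))) return null;
  const text = await s.recognize(audio);
  const fulfillment = await s.resolve(text);
  return s.synthesize(fulfillment.speech);
}

const result = handleUtterance('hey otto, what time is it?');
result.then((audio) => console.log(audio));
```

 Passing the stages as a parameter keeps the sketch testable: each stub can be replaced by a real client without touching the control flow.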
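  • 18. IO.Kid: HW to SR [Diagram: the user says "Hey, Otto"; on the client, IO.Kid redirects the microphone stream to the SR]
  • 19. IO.Kid: SR to NLP [Diagram: the user says "What time is it?"; the SR returns the text "What time is it?", which the client's IO.Kid passes to the NLP]
  • 20. IO.Kid: NLP to Fulfillment to TTS [Diagram: the NLP returns { "action": "date.now" }; a webhook on the server resolves the action to "It's 18.15", which the client's IO.Kid sends to the TTS]
  • 21. IO.Kid: TTS to Speaker [Diagram: on the client, IO.Kid receives audio.mp3 from the TTS and outputs it to the speaker]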
  • 22. IO.Telegram It listens via webhook (or via polling) for the chat events of your Telegram bot and sends the text to the AI, which returns an output.
 The output is used to respond to the user's request via Telegram. https://github.com/kopiro/otto-ai/blob/master/src/io/telegram.js
  • 23. IO.Telegram: Hotword to NLP [Diagram: the Telegram user sends { "type": "text", "text": "What time is it?" }; on the server, IO.Telegram passes "What time is it?" to the NLP]
  • 24. IO.Telegram: NLP to Fulfillment to Telegram [Diagram: the NLP returns { "action": "date.now" }; a webhook on the server resolves the action to "It's 18.15", which IO.Telegram sends back to the Telegram user]
  • 25. I/O Accessories I/O accessories are similar to drivers, but they don't handle input and output directly.
 They can be attached to an I/O driver to perform additional tasks. Examples of I/O accessories are:
 - Chromecast
 - GPIO_Button
 - Leds
 - Mopidy
  • 26. I/O Accessories Accessories listen for I/O driver events and, when an output to a driver is requested, this output can be forwarded to the accessories. Each accessory has a method called canHandleOutput that should return one of:
 - YES_AND_BREAK
 - YES_AND_CONTINUE
 - NO
 
 Depending on this return value, the IOManager forwards the output to the next configured driver or stops the chain. Example: https://github.com/kopiro/otto-ai/blob/master/src/io_accessories/chromecast.js
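 One way to read the three return values is sketched below. This is an assumption about the semantics, not the IOManager's actual code: NO skips the accessory, YES_AND_CONTINUE handles the output and keeps going, and YES_AND_BREAK handles it and stops the chain. The forwardOutput function and the accessory objects are hypothetical.

```javascript
const YES_AND_BREAK = 'YES_AND_BREAK';
const YES_AND_CONTINUE = 'YES_AND_CONTINUE';
const NO = 'NO';

// Hypothetical dispatcher: walks the chain in order and returns the ids
// of the accessories that handled the output.
function forwardOutput(accessories, fulfillment) {
  const handledBy = [];
  for (const accessory of accessories) {
    const verdict = accessory.canHandleOutput(fulfillment);
    if (verdict === NO) continue; // not interested, try the next one
    accessory.output(fulfillment);
    handledBy.push(accessory.id);
    if (verdict === YES_AND_BREAK) break; // handled exclusively, stop here
  }
  return handledBy;
}

// Hypothetical accessories, loosely named after the ones on the slides.
const hasData = (f, key) => Boolean(f.data && f.data[key]);
const leds = {
  id: 'leds',
  canHandleOutput: () => YES_AND_CONTINUE, // blink on every output
  output() {},
};
const chromecast = {
  id: 'chromecast',
  canHandleOutput: (f) => (hasData(f, 'video') ? YES_AND_BREAK : NO),
  output() {},
};
const mopidy = {
  id: 'mopidy',
  canHandleOutput: (f) => (hasData(f, 'music') ? YES_AND_BREAK : NO),
  output() {},
};

const chain = [leds, chromecast, mopidy];
console.log(forwardOutput(chain, { data: { video: { id: 'abc' } } }));
```

 With this reading, the order of the chain matters: an accessory that returns YES_AND_BREAK hides the output from everything configured after it.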
  • 28. Intents An intent represents a mapping between what a user says and what action should be taken by your software.
  • 29. Entities Entities are tools used for extracting parameters.
  • 31. Actions An action corresponds to the step your application will take when a specific intent has been triggered by a user’s input. In the library, it is a responder for an intent that has logic inside.
 
     exports.id = 'hello.name';
     module.exports = async function({ sessionId, result }, session) {
       let { parameters: p, fulfillment } = result;
       if (p.name == null) throw 'Invalid parameters';
       return { speech: `Hello ${p.name}!` };
     };
  • 32. Actions: local vs remote Each action can potentially run on the server or on the client.
 This is possible because both platforms share the same language (Node.js). In the intent, you can specify whether the action should preferably run on the server or on a client.
 
 For example, a very computationally intensive action (such as an algorithm to find the next move in a chess game) should run on a powerful server and only return the output.
  • 33. Actions: trust boundary [Diagram: at the trusted boundary of the internal network, a local action is OK and a remote action is denied]
 Conversely, if you have to control your home lights, you should run the action locally on the client, taking advantage of the fact that the client is on the same Wi-Fi network as your lights. This avoids exposing your IoT devices to the Internet.
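 The local-versus-remote decision can be sketched as a small routing function. Everything here is hypothetical: the runOn field, the isClient flag, and the returned labels are invented for illustration, and the slides do not show how the library actually encodes this preference.

```javascript
// Hypothetical routing rule for where an action executes.
// `intent.runOn` is an invented field: 'client', 'server', or undefined.
// `context.isClient` says whether the code is currently on the client.
function pickExecutor(intent, context) {
  if (intent.runOn === 'client') {
    // Actions that touch LAN devices stay inside the trusted network:
    // they run locally on the client and are refused anywhere else.
    return context.isClient ? 'local' : 'denied';
  }
  if (intent.runOn === 'server') {
    // Heavy actions are delegated to the server when we are on a client.
    return context.isClient ? 'remote' : 'local';
  }
  // No preference: run wherever the request currently is.
  return 'local';
}

console.log(pickExecutor({ runOn: 'client' }, { isClient: true }));
console.log(pickExecutor({ runOn: 'server' }, { isClient: true }));
```

 In this sketch a lights action marked for the client never runs on the server, which mirrors the "Denied" arrow of the trust-boundary diagram.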
  • 34. Action (run in server mode) [Diagram: Client, NLP, Server, Action; numbered steps 1, 2, 3, 4; label 50]
  • 35. Action (run in client mode) [Diagram: Client, NLP, Action; numbered steps 2, 3; label 10]
  • 37. Fulfillment A fulfillment is the output of an intent, whether it was performed by an action or a simple output string.
  • 38. Fulfillment transformer Every fulfillment passes through a transformer, where it can be filtered or altered in some way.
 
     async function fulfillmentTransformer(fulfillment, session) {
       fulfillment = fulfillmentSanitizer(fulfillment);
       _.defaults(fulfillment.data, fulfillment.payload);
       if (!_.isEmpty(fulfillment.speech)) {
         fulfillment.speech = await Translator.translate(
           fulfillment.speech,
           session.getTranslateTo()
         );
       }
       return fulfillment;
     }
  • 39. Fulfillment types
 speech | String that could be spoken or written.
 data.error | Error object to send.
 data.language | Language override for speech.
 data.replies[] | List of choices that the user can select.
 data.url | URL to send or to open.
 data.music | Music to send or to play.
 data.feedback | Boolean value indicating that this is temporary feedback until the real response is sent.
 data.game | Game that can be handled via Telegram.
 data.video | Video to send or to show.
 data.audio | Audio to send or to play.
 data.image | Image to send or to show.
 data.lyrics | Lyrics object of a song.
 data.voice | Audio file to send or play via voice middlewares.
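 As an illustration of how a text-based driver might consume these fields, the sketch below turns a fulfillment into a list of chat messages. The renderForChat function and the message shapes are hypothetical, and it assumes data.error has a message property and data.replies is an array of strings; only the field names speech, data.error, data.url, and data.replies come from the table above.

```javascript
// Hypothetical renderer for a chat-style driver.
function renderForChat(fulfillment) {
  const data = fulfillment.data || {};
  const messages = [];

  // An error replaces any other output (assumed behaviour).
  if (data.error) {
    messages.push({ type: 'text', text: `Error: ${data.error.message}` });
    return messages;
  }

  if (fulfillment.speech) {
    messages.push({ type: 'text', text: fulfillment.speech });
  }
  if (data.url) {
    messages.push({ type: 'text', text: data.url });
  }
  // Offer the choices as a keyboard the user can tap.
  if (Array.isArray(data.replies) && data.replies.length > 0) {
    messages.push({ type: 'keyboard', options: data.replies });
  }
  return messages;
}

console.log(
  renderForChat({ speech: 'Heads or tails?', data: { replies: ['Heads', 'Tails'] } })
);
```

 A voice driver would read the same fulfillment differently, for example by speaking speech and ignoring data.replies, which is why the fields are kept separate from any one output channel.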
  • 42. What can I do?
  • 43. Intents implemented right now
 Akinator
 Uses machine learning to guess a celebrity
 
 Alarm
 Set alarms, meetings, timers
 
 Chess
 Plays chess with a minimax algorithm
 
 Coinflip
 Flip a coin
 
 Date.now
 Tell the current date
 
 Gocrazy
 Say random words
 
 Lyrics.Search
 Search the lyrics of a track
 
 Lyrics.Track
 Search a track from its lyrics
 
 Metronome
 Act as a metronome
 
 SCF
 Play Sasso-Carta-Forbice (Rock-Paper-Scissors)

  • 44. Intents implemented right now
 Torrent.Download
 Search and download torrents
 
 Translate.Text
 Translate a text into various languages
 
 Weather.Search
 Get information about the weather
 
 Music.*
 Search music on Spotify and play it over a speaker or a Chromecast
 
 Youtube.*
 Search videos on YouTube and play them over a Chromecast
 
 Lights.*
 Power Xiaomi lights on/off, change color or intensity
 
 Draw
 Search an image
 
 Knowledge.Get
 Uses WolframAlpha to get all kinds of universal knowledge
 
 Camera.Spy
 Record a video and upload it to the cloud
 
 SmallTalk.*
 All kinds of dialogues.
 Thanks @ValentinaCiav
  • 45. How to write and test an action
 git clone https://github.com/kopiro/otto-ai/
 ... configure ...
 cp ./src/actions/__example.js ./src/actions/namespace/newaction.js
 ... develop ...
 node main.js
  • 48. Additional hardware: Re-Speaker 2-Mics Pi HAT, PowerBoost 500 Charger, LiPo Battery 3.5V, Push Button, On/Off Switch Button, Speaker
  • 49. Re-Speaker 2-Mics Pi HAT The board is developed based on WM8960, a low power stereo codec.
 There are 2 microphones on both sides of the board for collecting sounds and it also provides 3 APA102 RGB LEDs, 1 User Button and 2 on-board Grove interfaces for expanding your applications.
  • 50. PowerBoost 500 Charger With a built-in battery charger circuit, you'll be able to keep your project running even while recharging the battery!
 This little DC/DC boost converter module can be powered by any 3.7V LiIon/LiPoly battery, and convert the battery output to 5.2V DC for running your 5V projects.