RoBoHoN
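黃弘偉
About Me
• 黃弘偉, four years as a hard-grinding "iron man" at 鴻海 (Foxconn)
• Motto
– never try, never know
• Experience
– 3 months on a customer-service robot
– 3 months on a phone robot
– Day to day: writing apps, data analysis, odd jobs…
Two years ago, I was at a robot exhibition...
Industrial robots
Pet robots
Educational robots
Household robots
Exploration robots
Service robots
Idol robots
Entertainment robots
Sports robots
The star of the show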
• Pepper
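– Developed by Aldebaran Robotics of France
– Linux based + Robot OS
– Predecessor
• The biped robot NAO
• Mainly for academic research
• Around 300,000
– SoftBank + 鴻海 (Foxconn) + Alibaba
• Bringing it to market
• Enterprises -> developers -> consumers
Two legs -> 3 omnidirectional wheels
Today's star is right here…
RoBoHoN opening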
• Video
– RoBoHoN dancing the "月薪嬌妻" dance
• https://twitter.com/bluemidorin/status/808659732487512064
Outline
• What is RoBoHoN?
• RoBoHoN – Motion
• RoBoHoN – Voice
• RoBoHoN – Vision
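Father of the robot
高橋智隆 (Tomotaka Takahashi)
Research associate professor at the University of Tokyo, founder and CEO of Robo Garage, and visiting professor at Osaka Electro-Communication University.
Weekly Robi
Kirobo
And so the robot phone was born…
What kind of concept is RoBoHoN?
Robot Phone
Android based + Proprietary lib
RoBoHoN introduction
• Video
– RoBoHoN introduction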
ROBOHON – MOTION
RoBoHoN – Motion
• Parts of the Motion
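– Head
• 3 servo motors
• 2 eye LEDs
• 1 mouth LED
– Legs
• 3 servo motors each
– Arms
• 2 servo motors each
– 13 movable parts in total
Head Pose Angles
RoBoHoN – Motion
• Servo motor
– Developed by 並木精密寶石 (Namiki Precision Jewel) of Japan
– Among the world's smallest servo motors
– Clutch + servo circuit + motor
– Unit price: about US$24
– US$312 for the motors alone (= 24 × 13)
23% smaller than usual
37.5Kg
20 servo motors
13 servo motors
RoBoHoN – Motion
• Movement
– SHIN-Walk patent
• Robi Video
• RoBoHoN Video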
RoBoHoN – Motion
• Video
– The Robi Walk.mp4
• https://www.youtube.com/watch?v=AOdwe-GJIi0
– SHARP RoBoHoN Action (Walking) 4K
• https://www.youtube.com/watch?v=F9aWMXDD-ZY
– SHARP RoBoHoN Action (Sitdown) 4K
• https://www.youtube.com/watch?v=r0WgqrnCH00
– SHARP RoBoHoN Action (Standup) 4K
• https://www.youtube.com/watch?v=tX6Y3YJFdns
– SHARP RoBoHoN Action (Take Picture) 4K
• https://www.youtube.com/watch?v=b-FqT0TWgHc
RoBoHoN – Motion
• How are the motions created?
– A timeline + the state of the 13 servo motors
– Dragging poses in the Robot Emulator
– Posing the robot by hand
– State Machine
Walk Greet
Stand
Dance
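The "timeline + 13 servo states" idea and the state machine can be sketched as follows. This is a minimal illustration, not the RoBoHoN SDK: `pose_at`, `MotionStateMachine`, the linear interpolation, and the transition table (every motion starts from and returns to Stand) are all assumptions made for the example.

```python
from bisect import bisect_right

NUM_SERVOS = 13  # movable parts listed on the slide


def pose_at(keyframes, t_ms):
    """Linearly interpolate the servo angles at time t_ms.

    keyframes: list of (time_ms, [angle_deg] * NUM_SERVOS), sorted by time.
    """
    times = [t for t, _ in keyframes]
    if t_ms <= times[0]:
        return list(keyframes[0][1])
    if t_ms >= times[-1]:
        return list(keyframes[-1][1])
    i = bisect_right(times, t_ms)  # index of the first keyframe after t_ms
    (t0, a0), (t1, a1) = keyframes[i - 1], keyframes[i]
    w = (t_ms - t0) / (t1 - t0)
    return [x + w * (y - x) for x, y in zip(a0, a1)]


# Hypothetical transition table: Stand is the hub between motions.
TRANSITIONS = {
    "Stand": {"Walk", "Greet", "Dance"},
    "Walk": {"Stand"},
    "Greet": {"Stand"},
    "Dance": {"Stand"},
}


class MotionStateMachine:
    def __init__(self):
        self.state = "Stand"

    def request(self, target):
        """Switch to target if the transition is allowed; return True on success."""
        if target in TRANSITIONS[self.state]:
            self.state = target
            return True
        return False


# Hypothetical clip: servo 0 moves from 0 to 90 degrees over one second.
rest = [0.0] * NUM_SERVOS
raised = [90.0] + [0.0] * (NUM_SERVOS - 1)
clip = [(0, rest), (1000, raised)]
print(pose_at(clip, 500)[0])  # 45.0
```

Keeping each motion as a keyframe clip and letting the state machine decide which clip may play next is one simple way to stop the robot from jumping, say, from Walk straight into Dance.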
ROBOHON – VOICE
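Bots hidden in dating apps
A bunch of girls message you
You think spring has arrived, but…
Free members can only send one message
Bots hidden in dating apps
To secure your lifelong happiness, you top up your account…
Faced with the girl's intense pursuit…
She vanishes into thin air
Chatbot
• Are chatbots starting to appear all around us?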
Garbage In, Garbage Out
Keyword Handling
One question, one answer
RoBoHoN – Voice
• Video
– RoBoHoN singing in a round
Source: https://www.facebook.com/robohon.jp
RoBoHoN – Voice
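Nearby attractions
Take a photo
Make a call
Weather
Recommend songs
Time to wake up!
Today's schedule
Happy birthday
Voiceprint / speech recognition
Learns user behavior
Affinity system
Cloud
Speech recognition engine
Speech recognition server
Speech recognition database
Recognition result
HVML dialogue engine
Dialogue script
Dialogue scenario database
Scenario selection
Command parsing
Speech / voiceprint recognition engine
Motion control
Speech synthesis
Noise reduction
Audio input from the mic
Dialogue scenario server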
RoBoHoN – Voice
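• Microphone layout
– 3 in the head
– 1 in the torso
– A typical phone has 2
• The extra microphones improve sound pickup and cancel noise, giving speech recognition a clearer voice signal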
RoBoHoN – Voice
• Noise Reduction
– Noise-suppression signal
• The inverted (anti-phase) waveform of the noise
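The anti-phase idea can be shown with synthetic signals. This is an idealized sketch: it assumes a reference microphone that hears only the noise, perfectly time-aligned with the primary microphone, which real hardware does not give you. The sample rate, frequencies, and variable names are invented for the illustration.

```python
import math

RATE = 8000  # samples per second (assumed)


def tone(freq_hz, n, amp=1.0):
    """Generate n samples of a sine wave."""
    return [amp * math.sin(2 * math.pi * freq_hz * i / RATE) for i in range(n)]


def rms(x):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(v * v for v in x) / len(x))


n = 800
speech = tone(300, n, amp=0.5)  # stand-in for the voice
noise = tone(50, n, amp=1.0)    # stand-in for ambient noise

primary = [s + z for s, z in zip(speech, noise)]  # mic that hears voice + noise
reference = noise                                 # idealized noise-only mic

suppression = [-z for z in reference]             # anti-phase copy of the noise
cleaned = [p + a for p, a in zip(primary, suppression)]

residual = rms([c - s for c, s in zip(cleaned, speech)])
print(f"level before: {rms(primary):.3f}, after: {rms(cleaned):.3f}")
print(f"noise left after cancellation: {residual:.2e}")
```

In practice the reference signal also contains some speech and arrives with a different delay and gain, so an adaptive filter is normally used to estimate the noise before subtracting it.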
RoBoHoN – Voice
• Beamforming
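A delay-and-sum beamformer is the simplest form of beamforming: delay each microphone so that sound from the target direction lines up, then average. The sketch below assumes three microphones and whole-sample arrival delays chosen for the example; it does not describe RoBoHoN's actual array processing.

```python
import math


def delay(signal, d):
    """Delay a signal by d samples, padding the front with zeros."""
    return [0.0] * d + signal[: len(signal) - d]


def delay_and_sum(channels, steering_delays):
    """Align each channel by its steering delay, then average the channels."""
    max_d = max(steering_delays)
    aligned = [delay(ch, max_d - d) for ch, d in zip(channels, steering_delays)]
    n = len(channels[0])
    return [sum(ch[i] for ch in aligned) / len(aligned) for i in range(n)]


def power(x):
    """Mean squared amplitude."""
    return sum(v * v for v in x) / len(x)


n = 400
source = [math.sin(2 * math.pi * i / 20) for i in range(n)]  # period: 20 samples

# Hypothetical arrival delays (in samples) at the 3 mics for the target direction.
arrival = [0, 3, 6]
mics = [delay(source, d) for d in arrival]

on_target = delay_and_sum(mics, arrival)     # steered at the source
off_target = delay_and_sum(mics, [0, 0, 0])  # steered somewhere else

print(power(on_target) > power(off_target))  # True
```

Steering at the source adds the three copies in phase, while the wrong steering delays make them partly cancel, which is why the array favors one direction.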
RoBoHoN – Voice
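• Voice Activity Detection (VAD)
– Overview
• An utterance is divided into silence, transition, speech, and end segments.
• Extract the speech segment
– VAD algorithms:
• Short-time zero-crossing rate and energy thresholds
• Frequency-domain variance and entropy
• Classifiers, e.g. GMM?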
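The first algorithm in the list, short-time energy plus zero-crossing rate, can be sketched as below. The frame size, hop, and both thresholds are assumptions chosen for this synthetic signal, and the rule used here (loud and low ZCR) only finds voiced speech; a fuller detector would also use ZCR to extend the endpoints into unvoiced sounds.

```python
import math
import random


def frames(signal, size, hop):
    """Split a signal into overlapping frames."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def short_time_energy(frame):
    return sum(v * v for v in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def detect_speech(signal, size=200, hop=100, energy_thr=0.01, zcr_thr=0.25):
    """Flag a frame as speech when it is loud enough and not noise-like."""
    return [
        short_time_energy(f) > energy_thr and zero_crossing_rate(f) < zcr_thr
        for f in frames(signal, size, hop)
    ]


random.seed(0)
RATE = 8000  # samples per second (assumed)
quiet = [random.uniform(-0.01, 0.01) for _ in range(800)]  # low-level noise
voiced = [0.5 * math.sin(2 * math.pi * 200 * i / RATE) for i in range(800)]
signal = quiet + voiced + quiet

flags = detect_speech(signal)
print("".join("S" if f else "." for f in flags))
```

Frames that lie entirely inside the tone are marked `S`; frames that straddle a boundary may go either way, which is the "transition segment" the slide mentions.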
RoBoHoN – Voice
• Automatic Speech Recognition (ASR)
• Basic model architecture
– Phonemes -> pinyin -> text
– Feature extraction
• Framing
• Feature vectors
– Model training / matching
• Acoustic model: phonemes -> pinyin
• Dictionary: pinyin -> text
• Language model: statistical regularities of the language
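The editor's notes describe HMM decoding as building a state network and then finding the path that best matches the sound. A toy version of that search is the Viterbi algorithm, sketched below. The two phoneme states, the two quantized feature symbols "A" and "B", every probability, and the one-entry dictionary are made up for the example.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely state sequence for a list of observations."""
    # best[t][s] = (log-probability of the best path ending in s at time t,
    #               previous state on that path)
    best = [{s: (math.log(start_p[s]) + math.log(emit_p[s][obs[0]]), None)
             for s in states}]
    for o in obs[1:]:
        column = {}
        for s in states:
            score, prev = max(
                (best[-1][p][0] + math.log(trans_p[p][s]), p) for p in states
            )
            column[s] = (score + math.log(emit_p[s][o]), prev)
        best.append(column)
    # Backtrack from the best final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return path[::-1]


def collapse(path):
    """Merge consecutive repeats: ['n', 'n', 'i', 'i'] -> ['n', 'i']."""
    out = []
    for s in path:
        if not out or out[-1] != s:
            out.append(s)
    return out


# Hypothetical acoustic model over two phonemes.
states = ["n", "i"]
start_p = {"n": 0.9, "i": 0.1}
trans_p = {"n": {"n": 0.6, "i": 0.4}, "i": {"n": 0.1, "i": 0.9}}
emit_p = {"n": {"A": 0.8, "B": 0.2}, "i": {"A": 0.2, "B": 0.8}}

frames_seen = ["A", "A", "B", "B"]  # one quantized feature symbol per frame
path = viterbi(frames_seen, states, start_p, trans_p, emit_p)

DICTIONARY = {"ni": "你"}  # hypothetical pinyin -> text dictionary
pinyin = "".join(collapse(path))
text = DICTIONARY.get(pinyin)
print(path, pinyin, text)
```

A real recognizer works on continuous feature vectors such as MFCCs and adds a language model to choose between words that share the same pinyin.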
RoBoHoN – Voice
• HOYA speech synthesis technology
– VoiceText
– HMM-based
"Speech recognition is technology; speech synthesis is art."
RoBoHoN – Voice
• HVML
– HVML = Hyper Voice Markup Language
– XML - compliant language
– Scenario Dialogue Technology: scenario-based dialogue scripts
<hvml version="2.0">
  <head>
    <producer>jp.co.sharp.sample.simple</producer>
    <situation topic_id="0001" trigger="user-word">Hello eq ${Lvcsr:Basic}</situation>
  </head>
  <body>
    <topic id="0001" listen="false">
      <action index="1">
        <speech>Hello. Jin Cheng Wu.</speech>
      </action>
    </topic>
  </body>
</hvml>
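Because HVML is XML-compliant, the structure of the sample above can be inspected with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` to list which trigger leads to which spoken line. It only reads the markup; how the RoBoHoN dialogue engine actually evaluates `Hello eq ${Lvcsr:Basic}` is not something this example models.

```python
import xml.etree.ElementTree as ET

# The sample script from the slide, unchanged.
HVML = """<hvml version="2.0">
<head>
<producer>jp.co.sharp.sample.simple</producer>
<situation topic_id="0001" trigger="user-word">Hello eq ${Lvcsr:Basic}</situation>
</head>
<body>
<topic id="0001" listen="false">
<action index="1">
<speech>Hello. Jin Cheng Wu.</speech>
</action>
</topic>
</body>
</hvml>"""

root = ET.fromstring(HVML)

# Map each topic id to the lines it speaks, in document order.
topics = {
    topic.get("id"): [s.text for s in topic.iter("speech")]
    for topic in root.iter("topic")
}

# Show which situation (trigger + condition) starts which topic.
for situation in root.iter("situation"):
    topic_id = situation.get("topic_id")
    print(situation.get("trigger"), "|", situation.text, "->", topics[topic_id])
```

Reading the file this way makes the split visible: `<head>` declares when a scenario fires, and `<body>` holds the topics and actions that run once it does.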
RoBoHoN – Voice
• AIML
– AIML = Artificial Intelligence Markup Language
– XML - compliant language
– Pattern (template) matching
A commercial corpus with more than 800,000 sentences
SoundHound
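The pattern-matching idea behind AIML can be illustrated with a toy matcher. This is not an AIML interpreter: the category list, the `respond` function, and the single `*` wildcard handling are simplifications invented for the example.

```python
import re

# Hypothetical categories in the spirit of AIML's pattern/template pairs.
CATEGORIES = [
    ("HELLO", "Hi there!"),
    ("MY NAME IS *", "Nice to meet you, {star}."),
    ("*", "Sorry, I did not understand."),
]


def normalize(text):
    """Uppercase the input and strip punctuation before matching."""
    return " ".join(re.sub(r"[^\w\s]", "", text).upper().split())


def respond(text):
    """Return the template of the first category whose pattern matches."""
    words = normalize(text)
    for pattern, template in CATEGORIES:
        # Turn the '*' wildcard into a capture group; everything else is literal.
        regex = "^" + re.escape(pattern).replace(r"\*", "(.+)") + "$"
        m = re.match(regex, words)
        if m:
            star = m.group(1).title() if m.groups() else ""
            return template.format(star=star)
    return ""


print(respond("Hello!"))
print(respond("My name is robohon"))
print(respond("What time is it?"))
```

This is the "one question, one answer" style of chatbot: every reply comes from a hand-written pattern, so coverage depends entirely on the size of the corpus.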
RoBoHoN – Voice
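• Android speech recognition
– A feature added in Android L and later
• When a key phrase is detected, perform the corresponding action
– Android SoundTrigger
• keyphrase detection
• e.g. "OK Google"
[Figure: SoundTrigger stack with the labels Hardware, SoundTrigger HAL, SoundTrigger, APP, and onRecognition]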
RoBoHoN – Voice
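• Android speech recognition
– SoundModel
• Phoneme network
• Features of the key phrase
• Features of each user's voice
– A robust SoundModel
• Needs different pronunciations
• Needs different genders
• Needs different age groups
• Needs samples from 100 or more people
[Figure: phoneme network for "Hey" with the nodes HH, AE, IH, IY, Y]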
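If you want to build up your knowledge in this area…
• "深度學習零基礎進階" (deep learning from zero to advanced) - 雷鋒網
– The deep learning "bible"
– Building a deep learning knowledge map
– The ImageNet revolution
– Speech recognition is great
– And more
If you want to develop apps for it…
• First, you need a RoBoHoN
– Then download the SDK from the official site
RoBoHoN applications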
ROBOHON - VISION
RoBoHoN - Vision
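• RoBoHoN's computer vision
– On wake-up
• Checks whether the person is its owner
– When taking a photo
• Detects faces and smiles
– When browsing photos
• Comments on the photo content
– How many people are in the photo
– Whether the photo contains a baby or a pet
– Whether the people in the photo are smiling happily
– When the photo was taken
– Where the photo was taken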
EXIF
RoBoHoN - Vision
• Omron - OKAO Vision
– "OKAO" means "face" in Japanese
– Features
• Identify faces and people's position
• Hand gesture recognition and hand detection
• Find facial parts
• Beautify a face
• Recognize facial details
• Understand images
RoBoHoN - Vision
• Identify faces and people's position
– Face Detection
– Human Body Detection
– Object Tracking
Haar-like + AdaBoost + Cascades
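The "Haar-like + AdaBoost + Cascades" recipe can be sketched in a few lines. The integral image and the two-rectangle feature below are standard; the stump thresholds, weights, and stage threshold are placeholder numbers that AdaBoost training would normally learn, and nothing here reflects OKAO Vision's internals.

```python
def integral_image(img):
    """ii[y][x] = sum of img over rows < y and columns < x (zero border)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle with top-left pixel (x, y), in four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_vertical(ii, x, y, w, h):
    """Haar-like edge feature: lower half minus upper half of a w-by-2h window."""
    return rect_sum(ii, x, y + h, w, h) - rect_sum(ii, x, y, w, h)


def stage_passes(ii, x, y, stumps, stage_threshold):
    """One cascade stage: weighted vote of threshold stumps over Haar features."""
    score = 0.0
    for w, h, feat_thr, alpha in stumps:
        if two_rect_vertical(ii, x, y, w, h) > feat_thr:
            score += alpha
    return score >= stage_threshold


def window_is_candidate(ii, x, y, cascade):
    """A window survives only if every stage accepts it (early rejection)."""
    return all(stage_passes(ii, x, y, stumps, thr) for stumps, thr in cascade)


# Toy 4x4 images: dark top / bright bottom, and a flat gray image.
edge = [[0] * 4, [0] * 4, [10] * 4, [10] * 4]
flat = [[5] * 4 for _ in range(4)]

# Hypothetical one-stage cascade with one stump: (width, height, threshold, weight).
cascade = [([(4, 2, 40, 1.0)], 0.5)]

print(two_rect_vertical(integral_image(edge), 0, 0, 4, 2))   # 80
print(window_is_candidate(integral_image(edge), 0, 0, cascade))
print(window_is_candidate(integral_image(flat), 0, 0, cascade))
```

The integral image makes every rectangle sum cost four lookups regardless of size, and the cascade lets most non-face windows be rejected after only a few cheap features.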
RoBoHoN - Vision
• Applications
– AF(Auto focus) / AE(Auto Exposure)
– Lighting correction
– Activity region
RoBoHoN - Vision
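• Hand gesture recognition and hand detection
– The position of the hand relative to the face is an important cue
– Gestures are recognized by tracking the hand's motion trajectory
Kinect V2 - Vision
• RGB-D camera + skeleton tracking
– Seriously powerful!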
RoBoHoN - Vision
• Facial parts detection, face direction estimation
– Face parts contour detection
– Gaze & Blink Estimation
• Applications
RoBoHoN - Vision
• Beautify a face
– Smart Beautifier
– Eye Enlargement and Eye Beautifier
– Red-Eye Reduction
– Other correction functions
RoBoHoN - Vision
• Recognize facial details
– Face Recognition
• Person search and authentication
– Gender & Age Estimation
RoBoHoN - Vision
• Recognize facial details
– Baby Recognition
• Estimation for babies aged 0 to 3
– Smile Degree Estimation
RoBoHoN - Vision
• Recognize facial details
– Expression Estimation
• neutral, happiness, surprise, anger and sadness
RoBoHoN - Vision
• Recognize facial details
– Pet Detection
• Can detect the faces of dogs and cats.
• Dog people vs. cat people
RoBoHoN - Vision
• Recognize facial details
– Applications
• Image search
• Image commentary
• Smile-triggered photo capture
RoBoHoN - Vision
• Understand images
– Scene recognition
• Can recognize 7 scenes including scenery, flower, cooking, snow, sunset, night and twilight.
– Subject detection
• Can detect the main subject in the image automatically
RoBoHoN - Vision
• Understand images
– Applications
Robot Analysis

Editor's Notes

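  1. Lead-in
  2. The creator, 高橋智隆 (Tomotaka Takahashi), researches, creates, develops, designs, and builds humanoid robot prototypes from scratch. He developed RoBoHoN, Kirobo, Robi, Ropid, Chroino, FT, Evolta, Tachikoma, and VisiON, and is a leader in consumer robots. Weekly Robi has sold more than 100,000 worldwide; collecting every part needed to assemble one robot means buying 70 issues, for a total of more than 130,000 yen (about NT$45,000). Kirobo was the first robot to go to space and chat with an astronaut. Both robots stand out for an excellent motion system and simple voice control. http://robotstart.info/2015/10/06/robo-garage-robohon-robi-design.html
  3. http://www.jiqiren.tv/robi-robohon.html
  4. Over to PIKO 太郎 to explain
  5. https://www.facebook.com/robohon.jp/posts/814430342024009
  6. https://www.facebook.com/robohon.jp/posts/814430342024009 並木精密寶石 (Namiki Precision Jewel) of Japan. There are 13 movable parts, each fitted with its own motor unit. The enlarged motor image at the top right of the screen carries the word "Namiki", which shows it is a Namiki Precision Jewel product. Including the clutch, servo circuit, and casing, each unit costs about US$24. The whole motor unit is only the size of a little finger, which puts it among the smallest in the world. With 13 motors at US$24 each, the motors alone cost US$312 (= US$24 × 13). Forcing a servo by hand easily damages it, so under normal conditions a clutch is used to disengage the gears.
  7. http://robotstart.info/2015/10/06/robo-garage-robohon-robi-design.html
  8. http://raionnoie.blogspot.tw/2013/03/shin-walk.html http://blogs.itmedia.co.jp/honjo/2010/12/infinity-ventur.html https://www.youtube.com/watch?v=AOdwe-GJIi0 https://www.youtube.com/watch?v=F9aWMXDD-ZY It cleverly uses the two pivot points of the Z-shaped leg for balance, so Robi can lift one leg and step without falling over.
  9. https://www.youtube.com/watch?v=k-ZyIzwkTKY 【约炮软件暗藏陷阱】囧的呼唤211期
  10. https://www.youtube.com/watch?v=k-ZyIzwkTKY 【约炮软件暗藏陷阱】囧的呼唤211期 Turing test: after a series of questions and answers, you cannot tell that the other party is a bot.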
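  11. Software architecture diagram. Source: http://ascii.jp/elem/000/001/249/1249275/
  12. https://robohon.com/sdk/app.php
  13. Japanese original: http://www.nikkei.com/article/DGXMZO05502680R00C16A8000000/ Mainland Chinese translation: http://big5.nikkeibp.com.cn/news/mobi/78670-201607291615.html http://big5.nikkeibp.com.cn/news/mobi/78671-201607291656.html
  14. Both microphones pick up the ambient sound at the same time. Reference below: https://zh.wikipedia.org/wiki/%E8%AA%9E%E9%9F%B3%E5%8A%A0%E5%BC%B7
  15. The zero-crossing rate (ZCR) is the number of times the audio signal crosses zero within each frame. It has the following properties. In general, noise and unvoiced sounds both have a higher ZCR than voiced sounds (sounds with a clearly identifiable pitch, such as vowels). Noise and unvoiced sounds are hard to tell apart by ZCR alone, since which one is higher depends on the recording conditions and the ambient noise, but unvoiced sounds are usually louder than noise. ZCR is commonly used for endpoint detection, especially to estimate where unvoiced sounds start and end. It can also be used to estimate the fundamental frequency of a signal, but this is error-prone, so preprocessing is needed first. http://mirlab.org/jang/books/audioSignalProcessing/basicFeatureZeroCrossingRate.asp?title=5-2%20Zero%20Crossing%20Rate%20(%B9L%B9s%B2v)&language=chinese https://zh.wikipedia.org/wiki/%E8%AF%AD%E9%9F%B3%E6%B4%BB%E6%80%A7%E6%A3%80%E6%B5%8B#.E7.AE.97.E6.B3.95.E6.A6.82.E8.BF.B0
  16. https://www.zhihu.com/question/20398418 MFCC. Hidden Markov Model (HMM): step one, build a state network; step two, find the path through the state network that best matches the sound.
  17. Speech synthesis: the voice was recorded 4 hours a day for a month straight. http://voicetext.jp/case/robot/html/casestudy20160603_robohon.html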
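  18. Used to describe scenario dialogues
  19. Hound's strengths include navigation, local search, information such as weather, stocks, time zones, and geography, hotel information, flight information, news, image and video search, and currency conversion.
  20. Bism for RoBoHoN http://blog.goo.ne.jp/robologoo/e/fb59f3fd3a77f8395c7dc74061bd1aca
  21. System analysis
  22. Face detection
  23. It mainly measures the distance between the two eyes and the distance from the midpoint between the eyes to the tip of the nose or the lips, then judges age from the ratio. In younger people the two distances differ less; in older people the ratio widens.
  24. 梅長蘇 vs. 都敏俊
  25. Salient region; largest contour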