5. Azure Cognitive Services
From faces to feelings, allow your
apps to understand images and video
Hear and speak to your users by filtering noise, identifying
speakers, and understanding intent
Process text and learn how to recognize what
users want
Tap into rich knowledge amassed from
the web, academia, or your own data
Access billions of web pages, images, videos, and news with
the power of Bing APIs
9. Bing Search
• Lets developers integrate a search function into their apps so that
users can find webpages, images, news, locations, and more
without advertisements
• For knowledge mining
12. Speech-to-Text
• Speech-to-text service
• Improves meeting efficiency by transcribing conversations in real-time
• Helps safeguard data with industry-leading security and compliance
certifications.
• Integrates with a variety of meeting conference solutions including
Microsoft Teams and other third-party meeting software.
• SDK is available.
15. Speaker Verification
• Text-dependent verification means
speakers need to choose the same
passphrase to use during both enrollment
and verification phases.
• Text-independent verification means
speakers can speak in everyday language in
the enrollment and verification phases.
16. Text-to-Speech
• Convert text into human-like synthesized speech.
• Offers 75+ standard voices in more than 45 languages and locales, and 5
neural voices
• Tune voice output by easily adjusting rate, pitch, pronunciation,
pauses, and more.
• Speech synthesis
• Asynchronous synthesis of long audio
• Speech Synthesis Markup Language (SSML)
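As an illustration of tuning voice output through SSML, here is a minimal Python sketch that builds an SSML document with an adjusted rate and pitch and a trailing pause. The helper name `build_ssml` and the default voice name are assumptions made for this example, not part of the slides. Check the voice list of your Speech resource before using a specific voice.

```python
from xml.sax.saxutils import escape


def build_ssml(text, voice="en-US-JennyNeural", rate="+10%", pitch="-2st", pause_ms=500):
    """Return an SSML string that speaks `text` with adjusted prosody, then pauses.

    `voice` is a placeholder; substitute a voice available to your resource.
    """
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">'
        f'<voice name="{voice}">'
        # <prosody> adjusts speaking rate and pitch for the enclosed text.
        f'<prosody rate="{rate}" pitch="{pitch}">{escape(text)}</prosody>'
        # <break> inserts a pause of the given length.
        f'<break time="{pause_ms}ms"/>'
        '</voice></speak>'
    )


ssml = build_ssml("Welcome to the meeting. Q&A starts in five minutes.")
print(ssml)
```

The text is XML-escaped so that characters such as `&` do not break the markup. The resulting string is what you would pass to a speech synthesis request.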
23. Language Understanding
• Applies custom machine-learning intelligence to a user's
conversational, natural language text to predict overall meaning, and
pull out relevant, detailed information.
• Often used in chatbots or conversational bots.
25. Use Cases of Language Understanding
• Automate order capture for your application
• Automate social media feedback & response
• Conversation bot for your HR service, IT service, or other customer
services.
• Integrate with speech services to enable your app to respond to
voice requests from users.
27. QnA Maker
• Natural Language Processing (NLP) service
• Create a natural conversational layer over your data
• Find the most appropriate answer for any input from your custom
knowledge base (KB) of information
29. Immersive Reader
• Embed text reading and comprehension capabilities into applications
• Features:
• Reading aloud,
• translating languages, and
• focusing attention through highlighting
• No machine learning expertise is required.
36. What proves that Immersive Reader helps people
with reading?
• A 2017 study by RTI International showed that reading comprehension
among groups of fourth-grade students improved by an average of 10 percent.
39. Text Analytics
• For text mining and text analysis
• Understand the context in a conversation better
• Sentiment analysis, opinion mining, key phrase extraction, language
detection, and named entity recognition
• Supports more than 20 languages
41. Text Analytics for Health
• Extract information from unstructured English-language text in clinical
documents such as patient intake forms, doctor's notes, research
papers, and discharge summaries
43. Personalizer
• Provide information about your users and content and receive the top
action to show your users.
• No need to clean and label data before using Personalizer.
• Provide feedback to Personalizer when it is convenient to you.
• View real-time analytics.
• Use Personalizer as part of a larger data science effort to validate
existing experiments.
46. Where can I use Personalizer?
• Personalize what article is highlighted on a news website.
• Display a personalized "recommended item" on a shopping website.
• Suggest user interface elements such as filters to apply to a specific
photo.
47. How do I use Personalizer?
• Send information (features) about your users and the content
(actions) to personalize. Personalizer responds with the top action.
• Send feedback to Personalizer about how well the ranking worked as
a number typically between 0 and 1.
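The two steps above (send features and actions, then send feedback) can be sketched as request bodies in Python. This is only a sketch: the field names `contextFeatures`, `actions`, `eventId`, and `value` follow the Personalizer Rank and Reward REST calls as best recalled here and should be checked against the current API reference. The helper names and sample features are hypothetical, and clamping the reward to the range 0 to 1 is a choice made for this example.

```python
import json
import uuid


def build_rank_request(context_features, actions, event_id=None):
    """Body for a Rank call: who the user is plus the candidate actions."""
    return {
        "eventId": event_id or str(uuid.uuid4()),  # ties the later reward to this ranking
        "contextFeatures": context_features,       # information about the user or session
        "actions": actions,                        # content items to choose from
    }


def build_reward_request(score):
    """Body for a Reward call: how well the chosen action worked."""
    # Assumption: keep the score inside [0, 1], the range the slide calls typical.
    return {"value": max(0.0, min(1.0, float(score)))}


rank_body = build_rank_request(
    context_features=[{"timeOfDay": "morning"}, {"device": "mobile"}],
    actions=[
        {"id": "article-politics", "features": [{"topic": "politics"}]},
        {"id": "article-sports", "features": [{"topic": "sports"}]},
    ],
    event_id="demo-event-1",
)
user_clicked = True  # stand-in for a real signal from your app
reward_body = build_reward_request(1 if user_clicked else 0)

print(json.dumps(rank_body, indent=2))
print(json.dumps(reward_body))
```

The reward is sent separately, using the same event ID, once your app knows whether the top action worked for the user.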
49. Computer Vision
• Computer vision is an area of artificial intelligence (AI) in which
software systems are designed to perceive the world visually, through
cameras, images, and video.
• Computer vision is one of the core areas of artificial intelligence (AI),
and focuses on creating solutions that enable AI-enabled applications
to "see" the world and make sense of it.
50. Use Cases of Computer Vision
• Analyze an image and suggest an appropriate caption.
• Suggest relevant tags that could be used to index an image.
• Categorize an image.
• Identify objects in an image.
• Detect faces and people in an image.
• Recognize celebrities and landmarks in an image.
• Read text in an image.
51. What can CV tell us?
• A black and white photo of a city
• A black and white photo of a large city
• A large white building in a city
52. Not only that! It tags too!
• Tagging
• Type of identified object
• Bounding Box
• Set of coordinates (Top, left, width and height)
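To make the bounding box format concrete, the sketch below converts a (left, top, width, height) box into corner coordinates, which is what most drawing and cropping functions expect. The `BoundingBox` class and its field names are illustrative assumptions. The actual key names and units (pixels or normalized values) depend on the service response you are reading.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """A box given by its top-left corner plus its size."""
    left: int
    top: int
    width: int
    height: int

    @property
    def right(self):
        return self.left + self.width

    @property
    def bottom(self):
        return self.top + self.height

    def corners(self):
        """Return (left, top, right, bottom), e.g. for cropping an image."""
        return (self.left, self.top, self.right, self.bottom)

    def area(self):
        return self.width * self.height


# A hypothetical detection: an object tagged "building" and where it sits in the image.
tag, box = "building", BoundingBox(left=10, top=20, width=100, height=50)
print(tag, box.corners(), box.area())
```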
59. Custom Vision
• Azure Custom Vision is an image recognition service that lets you
build, deploy, and improve your own image identifiers.
• An image identifier applies labels (which represent classes or objects)
to images, according to their visual characteristics.
• The Custom Vision service uses a machine learning algorithm to
analyze images.
60. What can Custom Vision do?
• Classification
• Object Detection
• Export as a standalone offline model for your app development.
66. Video Indexer
• Video Indexer provides the ability to extract deep
insights (with no need for data analysis or coding
skills) using machine learning models based on
multiple channels (voice, vocals, visual).
• The service enables deep search, reduces
operational costs, enables new monetization
opportunities, and creates new user experiences on
large archives of videos (with low entry barriers).
67. Video Indexer
• Keywords extraction
• Named entities extraction
• Topic inference
• Artifacts
• Sentiment analysis: Identifies positive, negative, and neutral
sentiments from speech and visual text.
69. Use Cases of Video Indexer
• Deep search
• Content creation
• Accessibility
• Monetization
• Content moderation
• Recommendations
70. Video Indexer
Face detection
Celebrity identification
Account-based face identification
Visual text recognition
Visual content moderation
Labels identification
Scene segmentation
Shot detection
Black frame detection
Keyframe extraction
Rolling credits
Animated characters detection
Editorial shot type detection
Audio transcription
Automatic language detection
Multi-language speech identification and transcription
Two channel processing
Closed captioning
Noise reduction
Transcript customization (CRIS)
Speaker enumeration
Speaker statistics
Textual content moderation
Audio effects
Emotion detection
Translation
71. Form Recognizer
• Extract text and data from business forms and documents.
• Easily extract text and structure with a simple REST API
• Pre-trained model:
• Receipt
• Business Card
• Layouts
• Custom Trained Model
• Supports printed and handwritten forms, PDFs and images.
• Container support
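As a rough illustration of the REST API, the sketch below assembles, without sending it, a request to analyze a receipt with the pre-trained model. The URL path (`/formrecognizer/v2.1/prebuilt/receipt/analyze`), the `source` body field, and the helper name are assumptions based on the v2.1 REST API as recalled here. Verify them against the API reference for the version you use. The endpoint, key, and document URL are placeholders.

```python
import json


def build_receipt_analyze_request(endpoint, key, document_url):
    """Return (url, headers, body) for a pre-trained receipt analysis call.

    Nothing is sent here; pass the pieces to the HTTP client of your choice.
    """
    # Assumed path for the v2.1 pre-trained receipt model.
    url = endpoint.rstrip("/") + "/formrecognizer/v2.1/prebuilt/receipt/analyze"
    headers = {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": key,  # Cognitive Services resource key
    }
    # The document is referenced by URL rather than uploaded as bytes.
    body = json.dumps({"source": document_url}).encode("utf-8")
    return url, headers, body


url, headers, body = build_receipt_analyze_request(
    endpoint="https://example-resource.cognitiveservices.azure.com/",  # placeholder
    key="<your-key>",                                                  # placeholder
    document_url="https://example.com/receipt.jpg",                    # placeholder
)
print(url)
```

Analysis of this kind is typically asynchronous: the service accepts the request and the result is fetched later from a separate operation URL.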
72. What can you do with Form Recognizer?
• Automate conversion of written text to digital text
• Automate capturing receipt data
• Automate converting business cards into digital contacts
74. Sample Form Recognizer tool
• Client library / REST API quickstart (all languages, multiple scenarios)
• Web UI quickstarts
• Train with labels - sample labeling tool
• REST samples (GitHub)
• Extract text, selection marks and table structure from documents
• Extract layout data - Python
75. Sample Form Recognizer tool
• Train custom models and extract form data
• Train without labels - Python
• Train with labels - Python
• Extract data from invoices
• Extract invoice data - Python
• Extract data from sales receipts
• Extract receipt data - Python
• Extract data from business cards
• Extract business card data - Python
77. Things you may be interested in.
• AI Labs@hmheng (GitHub):
https://github.com/hmheng/AILabs