This document summarizes the history and development of speech recognition technology. It discusses how early systems used statistical methods and large computational resources to train models. It outlines key applications and areas like dictation, command and control, telephony, automobiles, and mobile devices. The document describes the basic process of how speech recognition works including feature extraction, labeling, and classification. It highlights challenges for the future like improving recognition in noisy environments, better integration across devices and languages, and combining modalities like voice, text, and sensors.
What's been cooking in India: Presentation at Indian Digital Summit, 2012 LunaErgonomics
This presentation was made by Luna Ergonomics during the India Digital Summit, New Delhi, 2012 by Abhijit Bhattacharjee, CEO. Indian Language technologies.
A Quick Check List Before Buying Smartphone. Chico Mobile
Looking to buy a smartphone? 10 Things You Must Check Before Buying a New Phone. Smartphones are now more important than luxuries.
official website : chicomobile.ph
IP's 20 year evolution - adaptation or extinction Design And Reuse
From its infancy 20 years ago, the semiconductor IP industry has evolved into the major driving force of today’s semiconductor landscape. This talk will take a historical view of the changes in the industry over the past twenty years, looking at how the landscape (environment) has changed and how individual companies either adapted or simply went away. What will the IP industry and players in the future look like?
A Mobile Centric View of Silicon Valley - January 2011 Lars Kamp
A presentation held at Opinno in San Francisco to a delegation from PromoMadrid. Goal was to provide a quick overview of major trends in mobile in 30 min.
What is your process when designing a website? How do you involve users in the process? That is exactly what Tomáš Ludvík of Dobrý web talked about at the UX čtvrtkon community meetup in České Budějovice.
User Experience orientation as a marketing strategy? Jakub Krčmář
A presentation from UX Konference 2013, where I tried to explain how to implement a UX orientation in a company as a marketing strategy. Supplemented with examples and a number of recommendations. I talked a lot alongside it, so the slides alone will not tell you much. :)
UX mindset, or how to do digital projects properly Jakub Krčmář
A lecture for 2nd-year students at UTB in Zlín, 13 October 2014.
Together with Jan Habich, who lectured on the topic Everyone is a @Project manager: https://www.youtube.com/watch?v=chzSDVbRW70&list=UUQld-8imlOodQQrNYqj4lvg
The goal of User eXperience design is to design software that people enjoy using. Their needs and habits are often completely different from what we think :-) The lecture presents the basic tools of UX design, including practical demonstrations: user research, personas, sketching, wireframes, detailed designs, and usability testing.
User Experience (UX) expresses a visitor's overall level of satisfaction with a website.
If the experience is good, the visitors to your website are satisfied ... and reach the conversion goal you have set.
If it is bad, they will not take to your website ... and the competition is only a few mouse clicks away ...
Use of assembly language[edit]Historical perspective[edit]Assemb.pdf annethafashion
Use of assembly language
Historical perspective
Assembly languages, and the use of the word assembly, date to the introduction of the stored-program
computer. The Electronic Delay Storage Automatic Calculator (EDSAC) had an
assembler called initial orders featuring one-letter mnemonics in 1949.[18] SOAP (Symbolic
Optimal Assembly Program) was an assembly language for the IBM 650 computer written by
Stan Poley in 1955.[19]
Assembly languages eliminate much of the error-prone, tedious, and time-consuming first-
generation programming needed with the earliest computers, freeing programmers from tedium
such as remembering numeric codes and calculating addresses. They were once widely used for
all sorts of programming. However, by the 1980s (1990s on microcomputers), their use had
largely been supplanted by higher-level languages, in the search for improved programming
productivity. Today assembly language is still used for direct hardware manipulation, access to
specialized processor instructions, or to address critical performance issues. Typical uses are
device drivers, low-level embedded systems, and real-time systems.
Historically, numerous programs have been written entirely in assembly language. Operating
systems were entirely written in assembly language until the introduction of the Burroughs MCP
(1961), which was written in Executive Systems Problem Oriented Language (ESPOL), an Algol
dialect. Many commercial applications were written in assembly language as well, including a
large amount of the IBM mainframe software written by large corporations. COBOL,
FORTRAN and some PL/I eventually displaced much of this work, although a number of large
organizations retained assembly-language application infrastructures well into the 1990s.
Most early microcomputers relied on hand-coded assembly language, including most operating
systems and large applications. This was because these systems had severe resource constraints,
imposed idiosyncratic memory and display architectures, and provided limited, buggy system
services. Perhaps more important was the lack of first-class high-level language compilers
suitable for microcomputer use. A psychological factor may have also played a role: the first
generation of microcomputer programmers retained a hobbyist, "wires and pliers" attitude.
In a more commercial context, the biggest reasons for using assembly language were minimal
bloat (size), minimal overhead, greater speed, and reliability.
Typical examples of large assembly language programs from this time are IBM PC DOS
operating systems and early applications such as the spreadsheet program Lotus 1-2-3. Even into
the 1990s, most console video games were written in assembly, including most games for the
Mega Drive/Genesis and the Super Nintendo Entertainment System.[citation needed] According
to some[who?] industry insiders, the assembly language was the best computer language to use
to get the best performance out of the Sega Saturn, a co.
Wearable Computing and Human Computer Interfaces Jeffrey Funk
These slides discuss how improvements in ICs, MEMS, cameras, and other electronic components are making wearable computing and new forms of human-computer interfaces economically feasible. Improvements in digital signal processing ICs and MEMS-based microphones are rapidly improving the technical and economical feasibility of voice-recognition based interfaces. Improvements in 2D and 3D image sensors (e.g., camera ICs) are rapidly improving the technical and economical feasibility of gesture-based interfaces, augmented reality, and virtual reality. Improvements in ICs, MEMS, displays and other components are rapidly making many forms of wearable computing economically feasible; these include many forms of head, arm, torso, and leg-mounted displays. Improvements in the materials for both non-invasive and invasive brain scans are rapidly improving the technical and economical feasibility of neural interfaces.
Mobile application development is the process of creating software applications that run on a mobile device, and a typical mobile application utilizes a network connection to work with remote computing resources. In these slides you will learn about the basics of the Android operating system.
This presentation was delivered to a "Web Enabled Business" class at Simon Fraser University in Vancouver. The topic is speech recognition technology, and the presentation covers its origins, how it works, issues, latest trends and future opportunities.
The web is the platform - why FirefoxOS matters Tristan Nitot
Tristan Nitot, Principal Mozilla Evangelist and Mozilla Europe founder explains why FirefoxOS (Mozilla's mobile platform) and why Web browsers matter when it comes to freedom and innovation.
Radisys - Engage Digital - TADSummit Nov 2022 Alan Quayle
TADSummit 2022 8/9 Nov Aveiro Portugal
Advancing Cloud Communications Well Beyond the Basics – Leveraging AI and ML in all aspects of programmable communications applications and media analytics
Adnan Saleem, CTO – Software and Cloud Solutions at Radisys Corporation
Programmable cloud communications to date, though well adopted, provide predominantly only basic calling and messaging services
Advancements in AI and ML technologies enable a broad range of new rich and immersive digital engagement experiences, including AR/MR/XR and spatial codecs
AI/ML driven virtual assistants with speech and interactive computer generated video technology and computer vision capabilities are complex and compute intensive
Video and speech analytics from live media streams with NLU and computer vision technologies enable a wide range of new applications in numerous verticals
Radisys Engage Digital platform streamlines and accelerates the creation and deployment of these sophisticated applications with low-code to no-code based APIs/SDKs and Visual Design Tools
1. Voice: The New UI for Mobile Devices
Jan Šedivý
WORLD USABILITY DAY – 2012
2. Fred Jelinek (1932-2010)
During 21 years at IBM Research and nearly two decades at Johns Hopkins, he pioneered the statistical methods that enable modern computers to understand spoken language.
“He envisioned applying the mathematics of probability to the problem of processing speech and language,” said Sanjeev Khudanpur, a Johns Hopkins associate
4. Speech reco benefits
Speech is rich:
• Speech is much richer than two mouse buttons
• Disambiguation, dialog
• Show me all emails from David about Linux server
• “Call David”: David Smith or Stone? Home or cell?
Text entry:
• Speech expresses not only text entry but C&C, search, URI entry
• Speech entry is part of the keyboard
• “command box”, general source of information
WYSIWYG == What You Say Is What You Get
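The "Call David" dialog on this slide can be illustrated with a minimal sketch. The address book, the function name `resolve_call`, and the wording of the follow-up questions are all hypothetical; the slide only states that speech interfaces need disambiguation, not how any product implements it.

```python
CONTACTS = [  # hypothetical address book, not from the slides
    {"first": "David", "last": "Smith", "phones": {"home": "555-0101", "cell": "555-0102"}},
    {"first": "David", "last": "Stone", "phones": {"cell": "555-0199"}},
    {"first": "Anna", "last": "Novak", "phones": {"cell": "555-0150"}},
]

def resolve_call(first, last=None, phone_type=None, contacts=CONTACTS):
    """Return ("call", number) if the request is unambiguous, else ("ask", question)."""
    matches = [c for c in contacts
               if c["first"].lower() == first.lower()
               and (last is None or c["last"].lower() == last.lower())]
    if not matches:
        return ("ask", f"I could not find {first}. Who do you want to call?")
    if len(matches) > 1:
        # Several people share the first name: ask which one is meant.
        return ("ask", f"{first} " + " or ".join(c["last"] for c in matches) + "?")
    phones = matches[0]["phones"]
    if phone_type is None and len(phones) > 1:
        # One person, several numbers: ask which number to use.
        return ("ask", " or ".join(phones).capitalize() + "?")
    if phone_type is not None and phone_type not in phones:
        return ("ask", f"There is no {phone_type} number. " + " or ".join(phones).capitalize() + "?")
    number = phones[phone_type] if phone_type else next(iter(phones.values()))
    return ("call", number)
```

Each "ask" result would be spoken back to the user, and the answer fed into the next call as `last` or `phone_type`, so the dialog narrows the request one step at a time.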
5. Elements of success
Best accuracy:
• Access to huge content: Internet, YouTube, maps, music, pictures, SMS, email…
• Train on all available data: contact, location names, addresses, email, documents content, history, personalization and other sensors: GPS, accelerometers, camera, compass
• Computationally expensive - huge clusters of computers to speed up training
Great UI design:
• speech reco must not introduce any friction to the interface
• keyboard, touch screen, multi-touch, speaker, microphone
• OS control, part of the OS, noise reduction, AD converter
• Use all sensors available on the phone to inject extra information to app
7. Speech recognition areas
• Command control, digit dictation
• Creation of texts, dictation
• Telephony IVR
• Mobile devices
• Voice search
• Automotive
Speech is the most natural way we communicate
8. The main areas in time perspective
[Timeline figure, axis 1995 to 2005: PC – C&C, dictation; Telephony; Automotive; Mobile devices UI]
9. A little more history
1993 IBM Personal Dictation System (IBM PC, audio adapter card)
1996 VoiceType (Win 95, dictation, isolated words, email, …)
1996 Nuance deployed its first commercial speech application
1997 Dragon Systems unveiled its NaturallySpeaking
1999 VoiceXML
2000 Telephony applications, IVR
2002 Car control (control car equipment, make a phone call, select music, dictate address to navigation)
2003 Microsoft includes speech in Office 2003
2007 Growth of mobile phones/devices
2008 Google launches speech for Search on iPhone
2009 Nuance acquires rights to IBM's speech technology patents
2011 iOS 5, Siri
11. Speech recognition – high level
Front End (feature extraction):
• Digitize audio: AD converter
• FFT, Non-lin, DFFT
Labeling:
• triphones, prototypes
Back End (classification):
• Search: LM, HMM, Viterbi
• Text output
Application API
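The front-end stage of this pipeline can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the system described in the slides: the 25 ms window, the Hamming window, the logarithm as the non-linearity, and the DCT-II as the final transform are common textbook choices that the slide does not confirm, and the usual mel filterbank is left out for brevity. The function names are hypothetical.

```python
import cmath
import math

def frames(samples, rate, win_ms=25, step_ms=10):
    """Split a list of samples into overlapping windows, one every step_ms."""
    win = int(rate * win_ms / 1000)
    step = int(rate * step_ms / 1000)
    return [samples[i:i + win] for i in range(0, len(samples) - win + 1, step)]

def power_spectrum(frame):
    """Hamming-window the frame and return |DFT|^2 for bins 0..N/2.

    A naive O(N^2) DFT is used for clarity; a real front end would use an FFT.
    """
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)))
                for k, x in enumerate(frame)]
    spectrum = []
    for b in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * b * k / n) for k, x in enumerate(windowed))
        spectrum.append(abs(s) ** 2)
    return spectrum

def cepstrum(frame, n_coeffs=13):
    """Log-compress the power spectrum, then apply a DCT-II to get cepstral coefficients."""
    log_spec = [math.log(p + 1e-10) for p in power_spectrum(frame)]
    m = len(log_spec)
    return [sum(v * math.cos(math.pi * c * (j + 0.5) / m) for j, v in enumerate(log_spec))
            for c in range(n_coeffs)]
```

Calling `cepstrum` on each element of `frames(samples, rate)` yields one feature vector per 10 ms step, which is the kind of input the labeling stage consumes.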
13. IBM speech recognition – the early days
Large vocabulary, dictation (1990…)
Office correspondence task – Tangora
Written in Fortran
IBM RISC System/6000, AIX, Tangora
Albert Tangora (July 2, 1903 – April 7, 1978) set the world speed record for sustained typing on a manual keyboard for one hour, 147 words per minute, on October 22, 1923.
14. How to get reco running on PC - 1994
Front End:
• Add-on board with ASIC
• Integer version on CPU
Hierarchical labeler:
• Input - 39 dim cepstrum coeffs feature vector each 10 ms
• Output - 100 most likely prototypes out of 30k, diagonal Gaussians
Search:
• Statistical LM – high compression, log,
• Viterbi search, Hidden Markov Models
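The labeler and search stages named on this slide can be sketched as follows. This is a flat, floating-point illustration with hypothetical function names: the hierarchical shortlist, the integer arithmetic, and the compressed language model mentioned in the slide are not reproduced, and the HMM here is a plain state-transition table rather than a triphone network.

```python
import math

def diag_gauss_loglik(x, mean, var):
    """Log-likelihood of vector x under a Gaussian with diagonal covariance."""
    return -0.5 * sum(math.log(2 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))

def top_prototypes(x, prototypes, n=100):
    """Return the indices of the n best-scoring (mean, var) prototypes, best first."""
    scored = sorted(((diag_gauss_loglik(x, m, v), i) for i, (m, v) in enumerate(prototypes)),
                    reverse=True)
    return [i for _, i in scored[:n]]

def viterbi(obs_loglik, log_trans, log_init):
    """Most likely HMM state path.

    obs_loglik[t][s]: log-likelihood of frame t in state s
    log_trans[p][s]:  log-probability of moving from state p to state s
    log_init[s]:      log-probability of starting in state s
    """
    n_states = len(log_init)
    score = [log_init[s] + obs_loglik[0][s] for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of state s at frame t
    for t in range(1, len(obs_loglik)):
        prev, score, pointers = score, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda p: prev[p] + log_trans[p][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s] + obs_loglik[t][s])
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

With the figures from the slide, `x` would be a 39-dimensional cepstral vector, `prototypes` a list of about 30,000 (mean, variance) pairs, and `n=100`; scoring only the shortlisted prototypes is what keeps the later search affordable.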
15. How to get reco running on Embedded - 1999
Easy port to Embedded:
• Resource efficient speech recognition engine
• Written in C/C++
• Integer implementation, GCC compiler
• Simple API to customize for any platform
Basic reco:
• Grammar support for command control applications
• Special emphasis on digit recognition
• Robust front end for noisy environments
Cars applications:
• Command control
• Digit and name dialing
• Navigation control
• On-board entertainment control
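A command-and-control grammar of the kind listed here restricts the recognizer to a small set of phrases. The sketch below shows the idea on already-recognized words; the phrase patterns, the name list, the subsystem list, and the function `parse_command` are hypothetical examples, not the grammar format of the engine described in the slides (which was written in C/C++).

```python
DIGITS = {"zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3", "four": "4",
          "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9"}
NAMES = {"david", "anna"}          # hypothetical phone-book entries
TARGETS = {"radio", "navigation"}  # hypothetical car subsystems

def parse_command(words):
    """Map a recognized word sequence to an (action, argument) pair, or None if out of grammar."""
    # dial <digit>+
    if len(words) >= 2 and words[0] == "dial" and all(w in DIGITS for w in words[1:]):
        return ("dial_number", "".join(DIGITS[w] for w in words[1:]))
    # call <name>
    if len(words) == 2 and words[0] == "call" and words[1] in NAMES:
        return ("dial_name", words[1])
    # turn (on|off) <target>
    if len(words) == 3 and words[0] == "turn" and words[1] in ("on", "off") and words[2] in TARGETS:
        return ("switch_" + words[1], words[2])
    return None
```

Because only phrases in the grammar can be returned, anything else is rejected rather than guessed, which suits noisy in-car use where a wrong action costs more than a repeated prompt.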
21. Factors accelerating better mobile apps
Basic phone
More powerful CPU, more memory
Connectivity, Internet
Much better UI, multi-touch screen
Rapid growth of mobile phones/devices is driving the adoption of speech recognition
22. Why is reco so important for mobile?
Small screen
Limited keyboard
Difficult text entry
Difficult to navigate
Slow, unreliable connectivity (latency)
Speech is fundamentally changing the mobile user experience
27. iOS Siri versus Google search
Siri are "natural language processing" apps that use statistical
Siri is deep in iOS: start apps, make calls, set meetings
Google is deep in the search engine
Can't launch apps with Google, but you can dictate an email or a text message.
Google is faster (much faster)
Future – combination of AI and different UI
29. Future challenges
Better recognition, ROBUSTNESS (noisy conditions, dictation)
Better UI integration (speech button)
Multiple languages (how would a German native search for an address in France?)
Switching between multiple languages
UI combining multiple modalities (voice, text, video, sensors)
Work on dictated text correction
Better integration of speech reco to special applications
ECSS 2010, 10/12/2010