This was a good idea: people were already working on it back then, and I think it has since come close to being perfected. The concept is simple: put a speech-to-text engine on the other side of a phone line, let people talk, and then send the output to them by email or store it somewhere. I had done a deal for Harman back in 1995 looking at related technologies, and I wondered why this hadn't been done by the time I thought of it, which I think was in the late 1990s.
32 Speech to Text Phone Based Translation
1. 1CODE
This presentation is property of
Edward J Martin, Jr. of Irving, Texas
214.574.7554
Jay Martin, who is the author.
No copying of it has been authorized.
Date: January 2001 Reference: Sprectre1 BBN GTE.ppt
Presentation:
Jay Martin
Dick Williamson
Spectre1 BBN/GTE
Speech Recognition
2. 2CODE
We (Spectre) need to do four things
Spectre1 Speech What we need to do?
• A. New business issues
– Incorporation (do we need to?) and expenses
• B. Develop an approach to talk to them.
– Research companies and latest developments in this area (Jay)
– Select partners and decide on order (Jay)
– Contacts - start meeting people in field, find out who-is-who
– Avenues in - business development and marketing
– Logistics
– Non-disclosures (Roger)
• C. Develop “our pitch” to the Speech Recognition companies.
– Think about how we will partner with them (‘get paid’)
– What do we want? (How much of the revenue stream?) %?
• D. Develop details of how it would function/operate.
Spectre
3. 3CODE
Notes to myself
Spectre1 Speech What we need to do?
• What is it and how will it work?
• What we need to do to start it and close a deal?
• Think of security issues
– Sending bogus data
– Using it and not paying
– Stealing people's info
– Using recorded voice
• Dead time to eliminate ambient noise when starting
• Training the system to your voice
• Free trials for a period or number of emails/volume of data
Spectre
4. 4CODE
Notes to myself 2
Spectre1 Speech What we need to do?
• Which suppliers do we talk to?
• Who do we talk to beforehand?
• What story do we make up to avoid spilling the beans?
• Look over list of speech organizations
• Web portal could be base for Verizon ads
• Hook up to translator software
• How to segment and target people who already use voice
• Cues for customer service via text feedback from recordings
• How to tie with existing telecommunication services and billing
Spectre
5. 5CODE
I Situation & Idea
II Industry & Competition
III Description of Operations
IV Risks and Unresolved Issues
V The Offer & Next Steps
Agenda
6. 6CODE
Voice and Speech Recognition have improved greatly, but the focus
on the challenge of PC-based systems has consumed resources.
Situation State of Voice & Speech Recognition
• Everyone is trying to get algorithms to fit onto PCs
• Continued fear of use and resistance by many
• After more than 10 years of developing products, the market is
consolidating and competitors are starting to be eliminated
• Bad press due to recent problems at L&H
• Highly public failings of major player(s)
• Frustration with released interactive products - poor performance
The question everyone wants to answer is “how to get a decent voice
product on a PC?” - we don’t think that is the right question to ask.
7. 7CODE
Even with quantum leaps in PC computing power, mainframe
system capabilities will always be superior to those of speech-enabled PCs.
Situation PC vs. Mainframe
• Need for hardware to support PC-based systems, always trying to catch up
• PC-based systems require massive RAM and hard drive space
• Upgrading to the latest development is costly or cumbersome
• Requirement that the PC must be with you, on, and set up to accept speech
• Need to have a microphone, or accept quality problems from using the
PC's built-in microphone
• Additional costs in marketing, packaging, manufacturing, distribution and
support of PC-based systems
The question everyone wants to answer is “how to get a decent voice
product on a PC?” - we don’t think that is the right question to ask.
8. 8CODE
Our idea recognizes the superiority of mainframe over PC for
this application, and we believe it is the best answer for using speech.
• Create a speech recognition system utilizing mainframe algorithms with
telephone as the input device and email or fax as the output.
– Latest and most powerful speech algorithms are always being applied
– Cost of consumer and retail products eliminated
– Challenges of PCs completely eliminated
- (RAM, storage requirements, upgrading, operating, user support)
The Idea Summary
9. 9CODE
Need to show interfaces
• Interfaces and possible functions
–Mouth
–Handset
–PC Screen
–Mouse
–Keyboard
The Idea Summary
10. 10CODE
The concept is that callers would speak to a very powerful voice-text
translating computer, and would receive files by email.
1 Phone Call
• Meetings
• Stenographers
• Calling in trip details
• Brainstorming
• Reporting on meetings/sales visits
• Interview notes
2 Powerful Computer System/Voice Translator
• Translate from speech to text
• Look for spelling errors and highlight questionable items
3 Text Delivery
• Vocal
• Print out
• Electronic text
• Email
• Get off net
• Have made into final deliverable
The Idea Basic System Operation
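The "highlight questionable items" step in the middle stage could be sketched roughly as follows. This is only an illustration in Python, assuming the speech engine reports a confidence score for each word; the Word class, the 0.6 threshold, the [?...?] markers and the sample words are all hypothetical and not part of the deck.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    confidence: float  # 0.0 to 1.0, as reported by a hypothetical recognizer


def render_transcript(words, threshold=0.6):
    """Join recognized words, wrapping low-confidence ones in [?...?]."""
    parts = []
    for word in words:
        if word.confidence < threshold:
            parts.append(f"[?{word.text}?]")  # flag for the reader to check
        else:
            parts.append(word.text)
    return " ".join(parts)


# Made-up recognizer output for one phone call.
call = [
    Word("sales", 0.93),
    Word("visit", 0.88),
    Word("wint", 0.35),
    Word("well", 0.91),
]
transcript = render_transcript(call)
print(transcript)
```

The flagged words are the ones a caller, or a clean-up service, would review before the text is made into a final deliverable.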
11. 11CODE
The benefits of this mostly are derived from reduced time spent
communicating or digitizing, and the applications are many.
• Benefits
– Reduced hardware required for all users
– Flexible
– Cheap
– Easy to pick up text
– Always have latest/most powerful algorithm
– PC not needed
– Allows multi-tasking
– Software maintenance not needed by user
– Can store digital speech, queue it and process it later; conversion
does not need to be instantaneous
The Idea Benefits and Applications
12. 12CODE
The benefits of this mostly are derived from reduced time spent
communicating or digitizing, and the applications are many.
• Applications
– Record meeting minutes or a presentation transcript
– Dictate while on the road
– Create text of speeches and voicemails
– Write letters
– Record impressions versus writing down notes
– Record/document phone conversations
The Idea Benefits and Applications
13. 13CODE
The benefits of this mostly are derived from reduced time spent
communicating or digitizing, and the applications are many.
• Applications
– Dictations to yourself
– Emails to list via voice
– Meeting minutes
– Class notes
– TV programs
– Support for deaf
– Text response to voicemails - build Vmail system with link to copy into
text and respond in writing
– Software to record .wav files and upload via Internet
– Eavesdropping
– Tests - speak answers and can be evaluated
– Court and legal meetings
The Idea Benefits and Applications
14. 14CODE
The benefits of this mostly are derived from reduced time spent
communicating or digitizing, and the applications are many.
• Applications
– Reporting crime scenes
– Trip reports
– News from remote locations
– Direct updates of text to owners' websites
– Record on PC remotely and upload later, like replicating
– Three-way conference call where the third party is a translator
– Analyze conversations in other languages
– Help desk conversations
– 911 Operator logs
– Speeches and Question & Answer
– Why this instead of PC based?
The Idea Benefits and Applications
18. 18CODE
Execution will require software development and hardware
acquisition, unless capacity is already available.
1 Phone Call: Audio Receipt Software/Router
• Voice Data Processing and Storage
• Multiple Call Receiving Capability
2 Powerful Computer System/Voice Translator: Speech to Text Conversion
• Voice to Text Conversion
3 Text Delivery: Email Server
• Text to Email Conversion
The Idea Major Hardware/Software Requirements
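As a rough illustration of how these pieces might fit together, the sketch below queues several incoming calls, converts each one, and wraps the result in an email message. It is a minimal Python sketch under stated assumptions: the CallQueue class, the fake_speech_to_text placeholder (which only decodes bytes) and the example.com addresses are hypothetical. Only the standard-library email.message.EmailMessage class is a real API.

```python
from collections import deque
from email.message import EmailMessage


def fake_speech_to_text(audio: bytes) -> str:
    """Placeholder for the real speech engine: just decodes the bytes."""
    return audio.decode("utf-8")


class CallQueue:
    """Stores received calls so they can be converted later, in order."""

    def __init__(self):
        self._pending = deque()

    def receive_call(self, caller_email: str, audio: bytes) -> None:
        # Multiple call receiving: every call is stored until processed.
        self._pending.append((caller_email, audio))

    def process_all(self):
        """Convert each stored call to text and wrap it in an email."""
        messages = []
        while self._pending:
            caller_email, audio = self._pending.popleft()
            msg = EmailMessage()
            msg["To"] = caller_email
            msg["From"] = "transcripts@example.com"  # hypothetical sender
            msg["Subject"] = "Your transcript"
            msg.set_content(fake_speech_to_text(audio))
            messages.append(msg)
        return messages


queue = CallQueue()
queue.receive_call("a@example.com", b"trip report for Tuesday")
queue.receive_call("b@example.com", b"meeting minutes")
outbox = queue.process_all()
print(len(outbox), "messages ready for the email server")
```

Because the calls are stored first, the conversion does not have to keep pace with the callers; actually sending the messages would be a separate step.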
19. 19CODE
The methods of charging and delivery of the product(s) both have a
variety of options.
• Retrieval and Delivery
– Sent via Email
– Faxed
– Picked up on the Internet
– Printed, cleaned up, made into a report and sent by Mail/Fedex
• Payment Options
– Account on-line
– 900 number charges
– Credit card or calling card
– Caller ID or specialty numbers for large accounts
– Have on internet server waiting to be picked up and paid for
– PayPal with return receipt for accepting incoming emails
The Idea Details
20. 20CODE
To better understand how it will operate, we will break the entire
system into six discrete parts, including the user/customer.
A User(s)
• Select 800 or 900
• Dial phone #
• Speak
• Enter options
• Enter account #
• Confirm account
• Select recipient(s)
B Phone/PC
• Receive speech
• Convert speech to digital
• Receive instructions
• Transmit signal
• Transmit instructions to ‘receiving hardware’
C Receiving Hardware
• Receive signal
• Receive instructions
• Record signal
• Queue signal
• Archive signal
• Confirm back to user
• Interact with user
• Sort and prepare to enter translator
• Confirm ID by voice recognition
• Give a customized greeting to user, double as security
• Pass to ‘main’
D Main Computer(s) (CRAY)
• Receive signal
• Process speech
• Translate to text
• Highlight questionable items
• Review structure
• Translate language if needed
• Format
• Pass to ‘sending’
E Sending Hardware
• Receive text
• Add address(es)
• Put on notes to text or email separately
• Put on “path”
• Send to ‘clean-up’ department if requested
• Sending billing information or update/receipt
• Charge credit card or wait for payment
• Inform back office support groups
• Release ‘user’ text
F Recipient
• Receive/send
• Email
• Fax
• Text waiting at site
• Mailed in electronic (disk) and hard copy
• Website update
• Printer
• Cell or PDA
• Pager
The Idea Detailed Schematic
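The hand-off between parts in the schematic can be sketched as a chain of functions that each receive a job and pass it on. This is only a minimal Python sketch under stated assumptions: the Job class, the function names, the account number and the addresses are hypothetical, and the "audio" is a plain string standing in for the digitized speech signal. Part A, the user, is represented by the data used to create the job.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    account: str
    recipients: list
    audio: str  # stand-in for the digitized speech signal
    text: str = ""
    trace: list = field(default_factory=list)


def phone(job):  # part B
    job.trace.append("B: transmit signal and instructions")
    return job


def receiving(job):  # part C
    job.trace.append("C: record, queue and archive signal")
    return job


def main_computer(job):  # part D
    # Placeholder: treat the stand-in audio as already recognized text.
    job.text = job.audio.strip()
    job.trace.append("D: translate to text and format")
    return job


def sending(job):  # part E
    count = len(job.recipients)
    job.trace.append(f"E: address to {count} recipient(s), bill {job.account}")
    return job


STAGES = [phone, receiving, main_computer, sending]


def run(job):
    """Pass the job through parts B to E, then hand the text to part F."""
    for stage in STAGES:
        job = stage(job)
    return {recipient: job.text for recipient in job.recipients}


job = Job("1001", ["a@example.com", "b@example.com"], " trip details ")
delivered = run(job)
print(delivered)
print("\n".join(job.trace))
```

Keeping each part as its own function mirrors the schematic: any one stage, such as the translator, could be swapped out without touching the others.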
21. 21CODE
There are various paths that this could take, and understanding this
will help better identify our customers and their needs.
User(s): 1 User, 2+ User
Sending: Phone, Mail, PC or Net
Spectre
Receiving: Email, Our Site, Their Site, Fax, Mail, Printer, Other
Above is a logic tree structure for users (one or more), input device
options and output options. Shade indicates expected frequency.
The Idea Paths
22. 22CODE
XXXXXX
• Single user
– Will only need to process one set of speech algorithms
– No overlap
– Will not need to separate and identify from others
– Should be able to remain close to the receiver or microphone
• More than one user
– Confusion between speakers
– More noise, talking over each other
– Will need to distinguish who said what
– Training required, will need to introduce each speaker to system
The Idea User(s) Issues
23. 23CODE
XXXXXX
• How will they speak? Can we accept any rate of speed?
• Noise
• Voice training
• Confusion between voices
• Security
• Fraud
• Pranks
• How will they enter in account information?
The Idea Sending Issues
24. 24CODE
XXXXXX
• Security
• Billing (how to)
• How will you deliver text? How many sites? Store or speak?
The Idea Receiving Issues
25. 25CODE
Separate Slides
Spectre1 Speech XXXX
• Base of Operations
• Slice by Area
– Mouth/finger (paragraphs, accounts, list-self, group, other) to phone
– Phone to computer
– Computer to user, internet site, email, fax, emails, website, other
• Interface/connection
The Idea Summary
26. 26CODE
XXXXXX
Spectre1 Speech Attracting Customers
• How will you attract new customers?
• Open up account and try for free first time
• How will you Advertise? Where?
• Target high-end and likely users (review applications & segment)
• Who will be the most likely users?
The Idea Summary
27. 27CODE
How would we obtain ‘cash’ for this service?
Spectre1 Speech Revenue
• 900 number
• Special dial-in number
• 800 number with credit card for account, could use Voice Recognition
• PayPal or on-line, pay at acceptance of email
• Site with charge option
The Idea Summary
28. 28CODE
Billing - how we would collect revenue
Spectre1 Speech Charges
• How will we charge? Collect?
• Charge by volume (length of text) or by length of speech? Variety of options/incentives
• Options
– Preparation into report or letter, clean-up services
– Translation from other languages
– Frequency of use or group discounts
The Idea Summary
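One way the charging questions above could be worked through is a small calculation that bills by length of speech, adds fees for optional services, and applies a group discount. This is only a Python sketch: every rate, fee, discount and name in it is a made-up assumption for illustration, since the deck does not set any prices.

```python
from decimal import Decimal

# All figures below are hypothetical placeholders, not proposed prices.
RATE_PER_MINUTE = Decimal("0.50")
OPTION_FEES = {
    "clean_up": Decimal("5.00"),     # preparation into report or letter
    "translation": Decimal("3.00"),  # translation from other languages
}
GROUP_DISCOUNT = Decimal("0.10")     # frequency-of-use or group discount


def charge(minutes, options=(), group_account=False):
    """Return the charge for one call, billed by length of speech."""
    total = RATE_PER_MINUTE * Decimal(minutes)
    for option in options:
        total += OPTION_FEES[option]
    if group_account:
        total -= total * GROUP_DISCOUNT
    return total.quantize(Decimal("0.01"))


basic = charge(10)
with_extras = charge(10, ["clean_up"], group_account=True)
print(basic, with_extras)
```

Billing by length of text instead would only change the first line of the function, for example to a rate per word.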
29. 29CODE
XXXXXX
Spectre1 Speech XXXX
• Concept & Technical Details/Options/Additional Services
• Rationale
• Applications
• Customers
The Idea Summary
30. 30CODE
There were another 30 pages in this deck, mostly pages to be filled in.
I cut them off, since it was already too long for my SlideShare.
Spectre1 Speech XXXX
• I am pretty sure this idea was already being done to an extent when I
thought of it; I think I found a medical transcription company in Atlanta
doing something similar. A fun coincidence is that one of the gurus in this
area was Ray Kurzweil, whose company I was familiar with. I didn’t find out
until 10 years later, but one of his big backers at the start of his career,
back in the early 1980s, was one of my own uncles.
• Also, I just swapped emails with Dick Williamson for one of the first times
in 15 years, another interesting coincidence. I forget why I included him,
as I know he was at Polaroid.