What Are the Three Types of Speech Technology?
First, not just voice, but artificial intelligence

Although Apple "guru" Steve Jobs has departed and the iPhone 5 did not appear as expected, the launch of the iPhone 4S still captured the attention of Apple fans and the industry. According to American Telephone and Telegraph (AT&T), it received more than 200,000 orders for the iPhone 4S within 12 hours of the launch, and one of the phone's most compelling features is the voice assistant known as Siri.

On the morning of October 4, Phil Schiller, Apple's vice president of worldwide product marketing, and Scott Forstall, the vice president responsible for iOS software, introduced Siri at the launch event.

What is Siri?

Forstall demonstrated it on stage. He picked up an iPhone 4S and asked, "How is the weather today?" The screen immediately displayed the day's forecast. He then asked, "Do I need an umbrella?" Siri replied at once that it would rain today. He went on to show other functions, including searching, setting alarms, and making reservations.

Siri is not the same as ordinary voice search. It can understand what you say, understand what you mean, and even answer your questions. It feels like a real personal assistant, and an understanding one at that. However you phrase a question, it reasons and responds the way a person would, rather than returning a canned, preprogrammed reply.

Siri can do more than answer questions; it can also carry out some basic tasks for you.
For example, you can tell Siri to send a text message to your father, remind you of a dentist appointment, or find a route to your destination. Do not worry that Siri is not smart enough: it works out which application is needed to accomplish each task and knows exactly how to call it.

Siri also offers "voice to text" dictation. You simply tap the microphone and speak the content you want to send, and Siri converts what you say into text and sends it. Beyond text messages, Siri is reportedly built into some third-party applications as well, so that you can update Facebook or Twitter or send instant messages just by speaking.

If you think Siri is simple voice control, something the voice assistant on your Android or Nokia phone can already do, you are wrong.

Let us look at Siri's bloodline. The company, acquired by Apple, descends directly from the U.S. military's CALO (Cognitive Assistant that Learns and Organizes) project, the largest artificial intelligence project in history, which brought together leading experts from across the field.

If you have seen Steven Spielberg's film "A.I.", you will have some idea of artificial intelligence: the technology that lets a robot provide "dialogue, natural language understanding, vision, speech, machine learning, planning, reasoning, and service delegation" as integrated services. Siri's technology derives from artificial intelligence research rather than from speech recognition research. It can independently analyze the voice commands a user issues and give the correct answers and guidance, all without any prior training by the user.

In a video posted on a foreign technology blog, a reviewer put a number of obscure or ambiguous questions to Siri. For example: "Is there a romantic French restaurant nearby?" "How many octaves does a piano have?" "Why is the sky blue?" To humans these are commonplace questions, but they are hard for a machine to understand, and the adjective "romantic" is especially difficult. Nevertheless, Siri could answer them.
You can even declare your feelings to Siri by saying, "I love you!" Its reply is wonderful: "I hope you don't say that to other cell phones."

The reviewer then wrote on the blog: "Android's Voice Actions is very good technology, but it really is not a product on the same level as Siri. Siri is very cool. Voice Actions saves us typing and touch operations, but it is too complex to operate, and only geeks will use it; mothers will choose Siri."

Second, the three types of speech technology

Beyond artificial intelligence, Siri's essential functions still rest on speech recognition. Its speech recognition engine comes from Nuance, a company that has come to dominate mobile phone input methods worldwide.

Speech technology is not a sudden revolution. Even before the computer was invented, the idea of automatic speech recognition was on the agenda, and the early vocoder can be considered a prototype of speech recognition and synthesis. The "Radio Rex" toy dog, produced in the 1920s, may have been the first speech recognizer: when the dog's name was called, it popped out of its base.

Over the past two decades, speech recognition technology has made significant progress and has begun to move from the laboratory to the marketplace. Large companies such as IBM, Apple, Microsoft, Google, AT&T, and NTT have for years invested heavily in research on practical speech recognition systems. Publicly available speech solutions include IBM ViaVoice, Dragon Systems' NaturallySpeaking, the Nuance Voice Platform, Microsoft Whisper, Sun's VoiceTone, and iFLYTEK's products, among others.

"Speech technology is a typical interdisciplinary science that involves many fields. It is not something money alone can buy; it takes a certain level of expertise. You can download our app and try it for yourself," Jiang Tao, vice president of iFLYTEK, told Electronic Engineering Times. According to Jiang, speech technology currently has three broad branches:

The first is speech synthesis (TTS, text to speech), the technology that reads text aloud. It was developed relatively early and is the more mature.

The second is speech recognition (ASR), which has several subdivisions. One relatively mature subdivision is command recognition (voice command), which executes a specified command from a limited set when you say it; much of the speech recognition on early mobile phones was at this level. Another is speech evaluation, which assesses how closely your pronunciation matches the standard for a language and offers guidance.

The third is voiceprint recognition. Because each person's vocal cords have unique physical properties, a voice is as distinctive as a fingerprint or an iris. This technology is currently used mainly for voice-based security, since it can tell different speakers apart.

Jiang Tao told Electronic Engineering Times that speech recognition (ASR) is the most complex of these technologies. The industry generally uses recognition rate as the criterion for evaluating software, but results are affected by environmental factors: the speaker's tone and speed, noise, the microphone, and, for cloud-based recognition, the quality of the transmission channel. "There are many, many factors beyond the system's control. When measuring two recognition engines, what really matters is actual use, because each person's specific environment is different. In the end, consumers vote with their mobile phones."

Speech recognition technology has yet to get rid of various environmental factors. Currently, the greatest impact on recognition results comes from noisy public places, where one can hardly expect a phone to understand your words amid the clamor. Obviously, this limits where speech technology can be applied today. In noisy environments, speech recognition depends on a special noise-cancelling microphone, which is not realistic for most users. In public places, a person can unconsciously filter out other voices and ambient sound and pick out the one voice they want to hear. Can speech recognition technology do the same? In fact, it is a difficult task.

One industry insider wrote on a microblog: "The iPhone 4 and iPhone 4S have a microphone that filters out background noise. Users of Nuance's voice-to-text products are no doubt familiar with the situation: voice input requires good sound quality and low noise, and even then the results are not 100% accurate. On this basis, we believe that with the lower-quality microphone input of the iPad and iPod touch, Siri cannot deliver its best results there in the short term."

Moreover, bandwidth problems can affect the effective transmission of speech. The data rate that speech technology requires depends mainly on the desired voice quality: the higher the fidelity, the greater the traffic. The speech encodings in widespread use today are 8-bit and 16-bit. Speech coded at bit rates below 1000 bit/s differs greatly from speech in normal conditions. This matters especially on narrow-bandwidth channels such as acoustic and underground communication and secure tactical voice communication; in these cases speech recognition must contend with the particular properties of a signal degraded by limited bandwidth, interference, and so on.

3, mixed multi-language recognition and unlimited-vocabulary recognition

Current acoustic models and language models are still limited, so users cannot yet use an unrestricted vocabulary in speech recognition.
If a speaker suddenly switches from Chinese to English, Russian, or French, the machine does not know what to do and returns a few unintelligible phrases. Likewise, if a user occasionally uses jargon from a specialized field, such as "signal-to-noise ratio", the response can be strange. In the author's experience, mixed Chinese and English input, and input interspersed with numbers and special characters, remains difficult.

At present, this makes speech applications hard to develop and release.

As for security, cloud-based recognition is carried out on the server. But phone calls and text messages are in fact also stored by the operators, so the technology itself poses no special security issue. What matters is industry practice and management controls, and with the service in the hands of a few large companies, I do not think it will be too chaotic. E-mail has had security problems for many years, yet people have not stopped using it.

4, practicality

Michael Okuda, a foreign technology blogger, is skeptical about Siri's practicality. "It is just a demo for now, to say nothing of a revolution. Siri cannot handle large amounts of text entry; speech is transcribed within limits, and recognition is confined to particular applications. I have always believed that Apple will have to educate users to initiate these actions."

He believes spoken input may not be as efficient. "Imagine I want to find an image and I say, 'the top one, left, down one, picture number 3362, no, not the left one.' That is simply much slower than tapping the image," Michael said.
"I think natural language is bound to run into a lot of limitations."

Fourth, cloud platforms accelerate the arrival of Chinese speech recognition

Siri's debut has drawn attention to speech recognition technology. It is understood that Siri launched on the iPhone 4S in English, French, and German, and its release is likely to have a huge positive effect on domestic Chinese speech technology vendors. But Siri is not just speech recognition; it is semantic recognition of natural language, understanding what is said and responding appropriately. Chinese natural language semantics is the difficulty and obstacle Siri faces. Apple itself attaches great importance to the Chinese market, as could be seen from the Chinese language and input method support in the iPhone operating system as early as the first-generation iPhone.

In fact, IBM released its ViaVoice Chinese speech recognition system in 1997, and ViaVoice 98 could recognize Mandarin spoken with regional accents such as Shanghai, Cantonese, and Sichuan. It came with a basic vocabulary of 32,000 words, expandable to 65,000, including common office terms, and with its "correction mechanism" the recognition rate could reach 95%. In 2002, the Institute of Automation of the Chinese Academy of Sciences and its Pattek company jointly released PattekASR, a line of Chinese speech products aimed at different platforms and applications, ending the monopoly that foreign companies had held over Chinese speech recognition products since 1998.

Perhaps you remember the Jinli "Voice King" that appeared on television a few years ago, a phone that could make calls and send and receive text messages by voice and could also read documents aloud. It can be called domestic handset makers' first attempt at speech.

Zhuangzhuan T State Kun, a vice president of technology, told Electronic Engineering Times that the speech recognition in products like the Voice King relied on a number of predefined commands or command combinations. The first Voice King was basically bounded by the phone's own processor: limited by computing speed and memory, it could recognize only about a thousand statements, and the experience was not good.
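The bounded command recognition described above can be illustrated with a minimal sketch. It assumes the recognizer has already produced a text transcript, which is a simplification: handsets of that era matched acoustic patterns rather than text. The command list, the `match_command` helper, and the 0.6 similarity cutoff are all hypothetical and are not taken from any Jinli or iFLYTEK product.

```python
import difflib

# Hypothetical fixed command set; a real handset would ship its own list.
COMMANDS = ["call home", "send message", "open camera", "set alarm", "play music"]


def match_command(transcript, commands=COMMANDS, cutoff=0.6):
    """Return the closest predefined command, or None if nothing is close enough."""
    normalized = transcript.lower().strip()
    # get_close_matches ranks candidates by similarity ratio and drops
    # anything scoring below the cutoff.
    matches = difflib.get_close_matches(normalized, commands, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    for spoken in ["Call home", "set alarn", "why is the sky blue"]:
        print(repr(spoken), "->", match_command(spoken))
```

Anything outside the fixed list is rejected, which is why an open question such as "why is the sky blue" gets no answer from this kind of system, and why a vocabulary of about a thousand statements feels limiting.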
iFLYTEK vice president Jiang Tao revealed that the new version of the Jinli voice phone uses iFLYTEK's cloud recognition: running on servers in the cloud, it can recognize hundreds of thousands of entries.

It is understood that domestic customers' awareness of speech technology is still in its infancy, and there are not many mature speech solutions in the Chinese market. Xu Jingming, executive vice president of iFLYTEK, told Electronic Engineering Times that Apple's practical deployment and promotion of Siri will accelerate the development and spread of the industry.

According to Xu Jingming, iFLYTEK is pursuing speech recognition along two lines. The first is its own voice input products: drawing on its Voice Cloud and large databases, the company has raised recognition accuracy for voice input in standard Mandarin to 95%. The second is an open cloud platform for application developers; applications added this year, including financial software and mapping software such as Kay, have iFLYTEK speech recognition built in. In addition, the company has begun working with China Telecom to integrate its technology into communication applications.

Since iFLYTEK launched its Voice Cloud on October 28, 2010, the platform has attracted more than 500 partners, including Sina, Sohu, Tencent, and Lenovo, and Voice Cloud users now exceed 20 million. Like Nuance, iFLYTEK offers an open cloud platform on which developers can use its speech engine to build a variety of third-party voice applications.

As for the artificial intelligence that Siri applies, the Little Q robot launched jointly by Tencent and the Chinese Academy of Sciences is an important domestic attempt to achieve a degree of thought and understanding. With the iPhone 4S giving the field a vigorous push and smartphones developing rapidly, I believe Chinese manufacturers will develop artificial intelligence software of their own before long.