Published on

IJERA (International journal of Engineering Research and Applications) is International online, ... peer reviewed journal. For more detail or submit your article, please visit

Published in: Technology
  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide


  1. 1. D.Ambika, V.Radha / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 Vol. 2, Issue 5, September- October 2012, pp.1044-1049 Secure Speech Communication – A Review D.Ambika*, V.Radha** *(Department of Computer Science, Avinashilingam Institute for Home Science and Higher education for women, Coimbatore-43) ** (Department of Computer Science, Avinashilingam Institute for Home Science and Higher education for women, Coimbatore-43)ABSTRACT Secure speech communication has been of and strong encryption for providing securegreat importance in civil, commercial and military communication.communication systems. As speechcommunication becomes widely used and even This paper is organized as follows. Section 2more vulnerable, the importance of providing a gives details about the Speaker Identification Sectionhigh level of security becomes a major issue. The 3 presents the Speech Coding. Section 4 deals withmain objective of this paper is to increase the the Speech Encryption and Decryption. Section 5security, and to remove the redundancy for speech explains about the Literature Survey Section 6 givescommunication system under the global context of about the Proposed Methodology. Finally, thesecure communication. So it deals with the Conclusion is summarized in Section 7 with futureintegrating of speech coding, with speaker work.authentication and strong encryption. This paperalso gives an overview and techniques available in 2. SPEAKER IDENTIFICATIONspeech coding, speaker Identification, Encryption During communication the enemy mayand Decryption. The primary objective of this attempts to act as the authorized user and also tries topaper is to summarize some of the well known gain access in it, so to avoid it the sender have tomethods used in various stages for secure speech prove that he is an authenticate speaker againstcommunication system. masquerade. A masquerade is a kind of attack where the enemy attempts to act as the authorized user andKeywords: Acoustic Environment, Speaker also tries to gain access in the communication .ThereAuthentication, Speech coding, Speech encryption by using control one may pass the operational orderwith decryption, Speaker Identification favorable to him which may leads to vulnerability. In such cases a security alert is needed which will save1. INTRODUCTION information. So using speech recognition algorithm, The increasing demand of multimedia the speaker is identified. All the authorized speakers’applications in communication system has tiled the names are indexed and by using the hashingway for secure communication. This is necessary to algorithm the speaker can be retrieved from theovercome unauthorized modifications and unwanted database. The identified speaker can be authenticateddisclosure while transmitting speech and other data, and by combining secure hashing algorithm bits withespecially in wireless channels. In secure speech encoded speech the message digest is recovered. Thecommunication systems the redundancy of the hash functions are used to compress an n (arbitrarily)language plays an important role. The more large number of bits into a small number of bits (e.g.redundant the language, the easier it is for an intruder 512) secure information is transmitted towards theto decipher the information with ease and receiver through different channels. If there is anyconvenience. That is why many real-world mismatch the destination recipient will get alert.cryptographic implementations use a compression There are many ideal cryptographic hash functionsprogram to reduce the size of the signal before available [3]. Some of the hashing functions areencryption. [1]. Compression of signal to lower rates Message Digest, Secure Hash Algorithm (SHA), andwith good speech quality not only eliminates the RIPEMD, GOST and HAVAL. Although there is aredundancy issue but also provides a lower long list of hash functions many of the functions arebandwidth signal, which solves multiple problems in found to be vulnerable. The SHA-0 and SHA-1werecommunication and multimedia applications [1]. The developed by National security Agency. Still there ispossible threats which could attack in passive or a competition for the replacement for SHA-2[4], andactive way includes eavesdropping, modification, also to ensure the long term toughness of applicationsreplay, masquerading, penetration and repudiation that uses hash function.[2]. Speech is used by human to convey informationto one another and this paper concentrates on the 3. SPEECH CODINGintegrating of speech coding, speaker identification The need for elimination, reduction of the redundancy or irrelevant information from the analog 1044 | P a g e
  2. 2. D.Ambika, V.Radha / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 Vol. 2, Issue 5, September- October 2012, pp.1044-1049signals and gave birth to an area of speech coding. capable of providing speech at 9.6-32 kbps withCommercial systems that rely on efficient speech speech quality comparable to that of ADPCM andcoding include cellular communication, voice over ADM. In transform coding the signal is transformedinternet protocol (VOIP), videoconferencing, to its representation in another domain in which it canelectronic toys, archiving, and digital simultaneous be compressed well than in its original form.voice and data (DSVD), as well as numerous PC-based games and multimedia applications. [5].The 3.2 Parametric Coding Methodsproperties of speech coders include low bit rate, high The parametric coding methods are capablespeech quality, low coding delay, robustness in the of providing good quality of speech. Linearpresence of channel errors [6]. The speech coding can Predictive Coding (LPC), Residual Excited Linearbe Lossy or Lossyless. In lossy compression the predictive Coding (RELP) and Mixed Excitationactual data can be recovered from the compressed Linear Predictive Coding (MELP) are the popularfile. In Lossless compression, the actual data cannot example under this class. The LPC method canbe retrieved from the compressed file even it gives produce intelligible speech at 2.4 kbps and it is one ofbest possible quality for the given technique [7]. The the earliest speech coders proposed in literature. LPCquality of speech drops drastically if the encoding bit at bit rates 600 BPS were given by Kang et al [12].rate is reduced beyond a limit. Different speech Atal and Remde established a Multipulse Excitationcoding schemes have resulted into various speech Model and given the improvement related to classicalcodecs and it can be broadly classified depending LPC, and then Self Excitated Vocoders and Residualupon the bit rate at which they operate [8]. There is Excited Linear Predictive (RELP) coders wereanother way of classifying the speech coding introduced. The encoded prediction residual in RELPtechniques which is based on the concepts utilized is used to excite the synthesis filter. Speech qualityand it can broadly be categorized into: offered by RELP coders at 4.8 kbps is higher than  Waveform Coding that of two-state excited LPC coders. To remove the  Parametric Coding annoying artefacts in LPC such as buzzes, tonal  Hybrid Coding noises etc., the MELP uses sophisticated excitation and improved filter model with additional parameters3.1 Waveform Coding to capture signal dynamics with improved accuracy. Waveform coding is used to analyze code MELP utilized vector quantization for LSFand reconstruct original speech sample by sample. It parameters and achieved improvement in speechincludes time domain coding and frequency domain quality with naturalness, smoothness and adaptabilitycoding. The method such as Pulse Code Modulation to diverse signal conditions, in comparison with 2.4(PCM), Differential PCM (DPCM) [9], Adaptive kbps LPC [13] without elevating the bit rate.DPCM (ADPCM), Delta Modulation (DM), andAdaptive PCMID are some of the popular time 3.3 Hybrid Codersdomain waveform coding techniques and Transform Hybrid coding combines strengths ofCoding (IC), Sub band Coding (SBC) are a few waveform coding and parametric coding techniques.spectral domain waveform coding techniques. The Like a parametric coder, it relies on speechPulse Code Modulation (PCM) [9] which is used to production model. Hybrid coders are used to encodedigitize the signals through signal conversion. speech, and the bandwidth requirements lies betweenDifferential Pulse Code Modulation (DPCM) can be 4.8 and 16 kbps. The hybrid coders include CELP,analog signal or a digital signal. It uses the baseline MPE and RPE coders. Multipulse Pulse Excitedof PCM but it adds some functionality based on the coding (MPE) and Regular Pulse Excited codingprediction of the samples of the signal. In DPCM, (RPE) techniques try to improve the speech qualityfirst an estimate of each sample is found based on by giving a better representation of the excitationprediction from past few samples and then the signal and it produces high quality speech at 9.6difference of estimate from the original. The DPCM kbps. Codebook Excited Linear Prediction (CELP)can provide PCM quality of speech at 56kbps. The technique can be also called as analysis-by-synthesisADPCM Adaptive Differential Pulse Code (Abs) technique with the bit rates of 4.8 kbps. CELPModulation (ADPCM) [10] which is used to provide and its variants are the most outstandingmuch lower data rates by using a functional model of representatives of this class which dominates mediumthe human speaking mechanism at the receiver end bit rate coders. Idea of CELP was born as an attempt[11]. The frequency domain includes sub band to improve on LPC coder. It basically covered a widecoding and the transform coding. Advantages with range of bit rates 4.8-16 kbps.SBC is that quantization noise in each band getsisolated from others and also bit rate optimization can 4. SPEECH ENCRYPTION ANDbe achieved by assigning more number of bits to DECRYPTIONspeech signal in lower frequency bands(that is Cryptography is the science of convertingresponsible for intelligibility) than in higher information from the comprehensible form tofrequency bands. Variants of sub band coding are incomprehensible form for its secured 1045 | P a g e
  3. 3. D.Ambika, V.Radha / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 Vol. 2, Issue 5, September- October 2012, pp.1044-1049communication over the insecure channel. and the receiver should know the secret key, whichEncryption is a very common technique for will be helpful to encrypt and decrypt all thepromoting the security and it is a much stronger information’s. Fastness is the major advantage ofmethod of protecting speech communication than any using symmetric encryption. Some of the commonlyform of scrambling. The various advantages of used symmetric encryption algorithms are listed inencryption includes that [14] it can protect table 2[16]information stored on the computer fromunauthorized access and while it is in transit from one 4.2 Asymmetric Encryptioncomputer system to another it can protect In asymmetric encryption, different keys caninformation. In general, encryption techniques are be used to encrypt and decrypt the data. Theclassified into two broad categories [15] such as encryption key is public whereas the decryption keySymmetric and Asymmetric Encryption is private. However, it has two major disadvantages such as it is based on nontrivial mathematical4.1 Symmetric Encryption computations, and it is very slower than theThe symmetric encryption also called as single-key symmetric ones. Some of the popular examples ofencryption, one-key encryption or private key asymmetric encryption algorithm include RSA, DSAencryption. It uses the same secret key to encrypt and and PGP [17].The RSA encryption is the best knowndecrypt the information. It is essential that the sender public keySymmetric Developer BlockEncryption Cryptanalysis resistance Security sizeAlgorithmAdvanced Vincent Rijmen 128-, It is very Strong against truncated More secureEncryption and Joan Daemen 192- or differential, linear, interpolationStandard in 2000 256-bit and square attacksData Encryption IBM in 1977 64bit Vulnerable to differential and linear ProvenStandard block crypt analysis inadequateTriple Data 1978 64bit Vulnerable to differential, Brute One only weakEncryption block force attacker could be analyze which is exit inStandard plain text using differential crypt DES analysisCAST Carlisle Adams 64bit Very fast and and Stafford block efficient [18] Tavares in1996algorithm, named after its investors: Rivest, Shamir presented an automated method for cryptanalysis ofand Adleman. The key used for encryption is a public DFT based analog speech scramblers throughkey and the key used for decryption is a private key. statistical estimation treatment. E.V.Stansfield et al inThe Digital Signature Algorithm (DSA) is a United Speech processing techniques for HF radio securityStates Federal Government standard or FIPS for [20] explains the techniques used to provide securedigital signatures. It was proposed by the National conversational speech communications over HF radioInstitute of Standards and Technology (NIST) in channel. An efficient implementation of multi-primeAugust 1991 for use in their Digital Signature on DSP Processor proposed by K. Anand etal inStandard (DSS). PGP stands for Pretty Good Privacy 2003[21] implemented Montgomery squaringwhich is a public-private key cryptography system reduction method which speed ups by 10.15% forallows for users to easily integrate the use of various key sizes Texas Instrument TMS320C6201encryption in their daily tasks, such as electronic mail DSP and authentication, and protecting files Cryptanalysis of adaptive arithmetic codingstored on a computer. It was originally designed by encryption scheme presented by J Lim et al inPhil Zimmerman. It uses idea, cast or triple des for 1997[22] carried an analysis on different plaintextactual data encryption and RSA (with up to 2048- and cipher text and subsequent results were evaluatedbit key) or DH/DSS (with 1024-bit signature key and accordingly. Speaker Recognition from Coded4096-bit encryption key) for key management and Speech and the Effects of Score Normalization wasdigital signatures. proposed by R.B. Dunn et al [23]. This paper explains about the effect of speech coding on5. REVIEW OF LITERATURE automatic speaker recognition where training and W.W Chang et al., in the “automated testing conditions are matched and mismatched. Incryptanalysis of DFT-based speech scramblers” [19] the recognition performance there is little loss in the toll quality speech coders and more loss when lower quality speech coders are used. Both types of score 1046 | P a g e
  4. 4. D.Ambika, V.Radha / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 Vol. 2, Issue 5, September- October 2012, pp.1044-1049normalization considerably improve performance, multi-hop wireless links that addresses both theand it can eliminate the performance loss when there efficient use of network, node resources and securityis a mismatch between training and testing against unwanted eavesdroppers. It is shown thatconditions. when the Shannon lower bound is satisfied with equality for rate distortion optimal scalable coding, Jan Silovsky et al presented their work in the transmission of the enhancement layer in-the-clearpaper titled Assessment of Speaker Recognition on provides no information regarding the core layer.Lossy Codecs used for Transmission of Speech in2011[24].This paper investigates the effect of lossy 6. PROPOSED METHODOLOGYcodecs used in telephony on text-independent speakerrecognition. Here the speaker recognition Input speechperformance is degraded due to the bandwidth usage,transmission packet loss and utilization of Speaker Identificationdiscontinuous transmission techniques. There is onlylittle loss in recognition performance for codecsoperating at bit rates of about 15 kb/s and the bestoverall performance was observed for the SILK Speech Encodingcodec. R.B. Dunn et al present their work on”Speaker Recognition from coded Speech and theEffects of Score Normalization”, in [25] MIT Lincoln EncryptionLaboratory, Lexington. The author investigates theeffect of speech coding on automatic speakerrecognition when training and testing conditions arematched and mismatched. This paper use standardspeech coding algorithms (GSM, G.729, G.723,MELP) and a speaker recognition system based ongaussian mixture models adapted from a universal Decryptionbackground model for experimentation.Aman Chadha, Divya Jyoti, M. Mani Roja, reviewedtheir work in the paper titled” Text-Independent Speech DecodingSpeaker Recognition for Low SNR Environmentswith Encryption” in [26]. The main objective of thispaper is to implement a robust and secure voicerecognition system using minimum resources Speaker Identificationoffering optimum performance in noisyenvironments. The proposed text-independent voicerecognition system makes use of multilevel Synthetic Speechcryptography to preserve data integrity while intransit or storage. The experimental results show that Is speakerthe proposed algorithm can decrypt the signal under Identity matchtest with exponentially reducing Mean Square Error Yes Noover an increasing range of SNR. Further, itoutperforms the conventional algorithms in actualidentification tasks even in noisy environments. Speaker Accepted Authentication A. R. Stauffer, A. D. Lawson ,” SpeakerRecognition on Lossy Compressed Speech using the Fig. 1 Secure Speech Alert CommunicationSpeex Codec” in 2009 in[27]. This paper examinesthe impact of lossy speech coding with Speex on The overall proposed methodology forGMM-UBM speaker recognition (SR). Results show secured speech communication is given in figure 1that Speex is effective for compression of data used which is illustrated as follows:in SR and that Speex coding can improve o First of all the input speech signal is given by theperformance on data compressed by the GSM codec sender..J. D. Gibson, A. Servetti focus on Selective o But before transmitting to the receiver theEncryption and Scalable Speech Coding for Voice speaker is identified by using the speechCommunications over Multi-Hop Wireless Links in recognition algorithm to prove that he is an[28]. This paper proposes and investigates a authorized speaker.combination of scalable speech coding and selectiveencryption for secure voice communication over 1047 | P a g e
  5. 5. D.Ambika, V.Radha / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 Vol. 2, Issue 5, September- October 2012, pp.1044-1049o Along with the speaker identification the original Applications (0975 – 8887) Volume 38– speech is compressed using the compression No.11, January 2012 algorithm [7] Dr.V.Radha et al,” Comparative Analysis ofo As the next step the compressed speech is Compression Techniques for Tamil Speech encrypted and the secure information is Datasets, IEEE, ICRTIT, June 3-5, 2011 transmitted towards the receiver through [8] Venkatesh Krishnan,” A Framework For different channels Low Bit-Rate Speech Coding In Noisyo The receiver decompresses the data and decrypts Environment”, A school of Electrical and the information to get the original speech and Computer Engineering also he checks by using the hashing algorithm [9] P.Cummiskey et al,”Adaptive Quantization that the original message came from the in Differential PCM Coding of speech, “Bell authorized user. sys.Tech.J.Vol.52.No.7.p.1105.sept.1973o If the speaker identity matches then the receiver [10] G.Kang and D.Coutler,”600 Bit-per-second follows the instructions according to the sender voice digitizer(linear predictive formato If there is any mismatch the destination recipient coder),”NRL Report 8043,Nov 1976 will get alert. [11] Jorgen Ahlberg,” Speech & Audio Coding” TSBK01 Image Coding and Data7. CONCLUSION AND FUTURE WORK Compression Lecture 11, 2003 Speech processing for secured [12] G.Kang,”Application of linear prediction tocommunication has been in development for more a narrow band voice digitizer,”NRL Reportthan 50 years. This paper gives the various 7774,Oct.1974techniques in the field of Speech Coding, Speaker [13] Benesty,Sondhi, Huang, “SpringerIdentification with Encryption and Decryption. The Handbook of Speech Processing”various approaches available for developing a [14] Nehaluddin Ahmad,,”Privacy and the Indiansecured communication are clearly explained. In Constitution: A case study of Encryption”,recent years, the need for secured communication Communication of the IBIMA volumeresearch based on Speech Coding with cryptography 7,2009 ISSN:1943-7765has highly increased. This paper is based on [15] S.Rajanarayanan and A. Pushparaghavan,compression technologies that exploit this technology “Recent Developments in Signal Encryptionwith encryption and decryption for an environment – A Critical Survey”, International Journalthat promotes and facilitates the use of safe of Scientific and Research Publications,communication. More specifically, the application Volume 2, Issue 6, June 2012, ISSN 2250-takes responsibility for the speaker identification 3153using the cryptographic hash functions so that the [16] Hamdan.O.Alanazi etal, “New Comparativeunauthorized speakers will find difficult to trace out Study Between DES, 3DES and AES withinthe original data. The future work is to find out the Nine Factors”, Journal of computing,optimal method suitable for this environment to volume2, issue3, march 2010provide better secured communication. [17] tric-encryption-vs-asymmetric.htmlREFERENCES [18] Http:// [1] A. Jameel et al, ” A robust secure speech key-infrastructure/symmetric- encryption- communication system using ITU-T G.723.1 algorithms.htm and TMS320C6711 DSP”, Microprocessors [19] W.W Chang et al., ”The automated and Microsystems, volume 30,2006,Pages cryptanalysis of DFT-based speech 26-32 scramblers,” IEICE Trans .Information and [2] A. Jameel, “Transform-domain and DSP system,E83,pp.2107-2112,2000 based secure speech communication”, [20] E.V Stansfield et al., “Speech processing Microprocessors and Microsystems,2007, techniques for HF radio security,” IEEE 335–346 Proce.,vol.136,1989 [3] [21] K.Anand,et al.,”An efficient implementation hash_function of multi-prime on DSP [4] Processor,”inProc.ICASSP,2003,pp.413-416 3/index.html [22] J.Lim et al.,”Cryptanalysis of adaptive [5] Mark Hasegawa-Johnson et al,” Speech arithmetic coding encryption scheme,” in Coding: Fundamentals And Applications”, Proc. ACISP,1997 pp.216-227 Wiley Encyclopedia of Telecommunications [23] R.B. Dunn, T.F. et al, “Speaker Recognition [6] Akella Amarendra Babu et al, ,”Robust from Coded Speech and the Effects of Score speech processing in EW environment”, Normalization”, IT Lincoln Laboratory, International Journal of Computer Lexington, MA 1048 | P a g e
  6. 6. D.Ambika, V.Radha / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 Vol. 2, Issue 5, September- October 2012, pp.1044-1049[24] Jan Silovsky, Petr Cerva, Jindrich Zdansky,”Assessment of Speaker Recognition on Lossy Codecs Used for Transmission of Speech”, 53rd International Symposium ELMAR-2011, 14-16 September , Zadar, Croatia[25] R.B. Dunn, T.F. Quatieri, D.A. Reynolds, J.P. Campbell, “Speaker Recognition from Coded Speech and the Effects of Score Normalization”, IT Lincoln Laboratory, Lexington, MA[26] Aman Chadha, Divya Jyoti, M. Mani Roja, ”Text-Independent Speaker Recognition for Low SNR Environments with Encryption”, International Journal of Computer Applications (0975 – 8887) Volume 31– No. 10, October 2011[27] A. R. Stauffer, A. D. Lawson,” Speaker Recognition on Lossy Compressed Speech using the Speex Codec” 2009 ISCA 6-10 September, Brighton UK[28] J. D. Gibson, A. Servetti ,” Selective Encryption and Scalable Speech Coding for Voice Communications over Multi-Hop Wireless Links “ 1049 | P a g e