Psychoacoustic Approaches to Audio Steganography Report



This paper explores methods of audio steganography with emphasis on psychoacoustic approaches. Specifically, it describes a project that had the requirement of hiding a text-based message inside an audio signal with minimal or no distortion of the signal as perceived by the human ear. The theory and experimental results of each approach are discussed.


ECES 434 Report
Cody A. Ray
Drexel University
Fall 2009

Introduction

Steganography is the art and science of writing hidden messages in such a way that no one, apart from the sender and intended recipient, suspects the existence of the message, a form of security through obscurity. The word steganography is of Greek origin and means "concealed writing". Apart from the obvious applications of transporting hidden information between entities, the methods of steganography are also used within copyright protection, the detection of content manipulation, fingerprinting, and watermarking.

The objective of this project was to explore methods of audio steganography with emphasis on psychoacoustic approaches. Specifically, the project has the requirement of hiding a text-based message inside an audio signal with minimal or no distortion of the signal as perceived by the human ear. In all approaches, we assume that the length of the message to be hidden is much smaller than the number of samples in the original sound track. We did not consider the resilience of the embedded message to attacks or otherwise “friendly” transformations of the host signal.

Approaches

We will compare and contrast three different approaches to audio steganography.

1. The first and simplest method we implemented is known as the Least-Significant Bit (LSB) method. In the LSB method, the least significant bit of each sampling point of the original signal is substituted with a bit of a binary message.
2. The second method we demonstrated was an amplitude modulation (AM) algorithm for the time domain. We slice the time signal into “blocks” and scale each block according to bits of the message.
3. The last method we explored was use of the MPEG Model 1 Layer 1 psychoacoustic model to calculate the unnecessary bits using the signal-to-mask ratios (SMR). Then we replace the unnecessary bits with those of the message.

Least-Significant Bit Method

The method of least-significant bit (LSB) coding is the simplest technique for embedding information in a digital audio file. The least-significant bit of each sample in the signal is substituted with a bit from the secret message. One bit is embedded per sample; thus, the LSB method allows for encoding a large amount of data.

To recover the message hidden inside an LSB encoded audio track, the receiver needs to know the sequence of indices corresponding to each embedded sample. There are a number of methods used to choose the subset of samples in which to embed bits of the message; however, whatever the method, the receiver must also know the algorithm used for selecting the samples. One trivial method starts at a constant distance from the beginning of the audio track and performs LSB coding until the message has been completely embedded within the signal, without changing any of the remaining samples.
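The trivial LSB variant, which embeds one message bit per sample starting at a constant offset, can be sketched roughly as follows. This is a minimal illustration, not the code used in the project: it assumes the samples are already available as a list of signed integers (for example, 16-bit PCM values), and the names `text_to_bits`, `bits_to_text`, `lsb_embed`, `lsb_extract`, and the `offset` parameter are hypothetical.

```python
def text_to_bits(message):
    """Expand a text message into a flat list of bits, most significant bit first."""
    return [(byte >> shift) & 1
            for byte in message.encode("utf-8")
            for shift in range(7, -1, -1)]


def bits_to_text(bits):
    """Pack a list of bits (a multiple of 8 long) back into text."""
    data = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        data.append(byte)
    return data.decode("utf-8")


def lsb_embed(samples, message, offset=0):
    """Replace the LSB of consecutive samples, starting at `offset`, with message bits.

    `offset` is an assumed parameter standing in for the "constant distance
    from the beginning of the audio track"; all other samples are left untouched.
    """
    bits = text_to_bits(message)
    if offset + len(bits) > len(samples):
        raise ValueError("message does not fit in the host signal")
    stego = list(samples)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[offset + i] = (stego[offset + i] & ~1) | bit
    return stego


def lsb_extract(stego, n_bits, offset=0):
    """Read `n_bits` LSBs starting at `offset` and decode them as text."""
    bits = [stego[offset + i] & 1 for i in range(n_bits)]
    return bits_to_text(bits)


if __name__ == "__main__":
    host = [1000, -1001, 32767, -32768, 0, 7] * 8  # made-up sample values
    stego = lsb_embed(host, "Hi", offset=4)
    print(lsb_extract(stego, n_bits=16, offset=4))
```

In this sketch the receiver has to be told both the offset and the number of embedded bits, and each modified sample differs from the host by at most one quantization level.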
However, this approach creates an easy-to-detect statistical anomaly as the probabilities are non-uniform across the sample set.

One way to avoid this issue is by padding the message with random bits in order to make the message length the same as the number of samples. However, we’re now embedding far more information than required to convey the given message. By modifying more of the file than necessary, we’re increasing the amount of noise in the signal, which in turn increases the probability of detection of the hidden message.

A more sophisticated approach involves the use of a random number generator to spread the secret message out over the audio track in a random manner. One popular approach uses a shared secret as a seed for the random number generator, allowing the sender and receiver to independently construct the same pseudorandom sequence of sample indices. One drawback is the necessity to avoid collisions created by using the same sample index twice; a bookkeeping system can be used to track previous indices. Alternatively, a pseudorandom permutation of the entire set can be constructed through the use of a secure hash function.

None of the above variants requires the original audio track to recover the message.

Since we did not consider resilience to attacks in this study, we implemented the trivial method outlined above. As a matter of practical concern, we also prefixed the message with an identifier string to mark the file as containing a secret message, and included the size of the secret message to guide the receiver as to where to stop decoding the signal.

Time-domain Amplitude Modulation Method

Time domain amplitude modulation (TDAM) capitalizes on the difficulty of perceiving subtle changes in loudness. The signal is sliced in the time domain, and the message is encoded as a scale factor applied to each time slice. One bit is encoded per block, where the block size is the ratio of the number of samples to the number of message bits. Consequently, this technique can encode only a smaller message than LSB coding can.

To recover the message hidden inside a TDAM encoded audio track, the receiver needs access to the original audio file, and must know the scale factors used in coding the message. Extraction is done by scaling the original file by the lowest scale factor, and comparing whether each frame of the “dirty” signal is greater than the scaled original.

Many of the issues addressed in the previous section on LSB coding apply to TDAM as well. These issues will not be covered again here. Note, however, that this method doesn’t require any additional data to be embedded, and the signal is modified uniformly.

Amplitude Modulation via Psychoacoustic Models

The most sophisticated approach is amplitude modulation in the frequency domain based upon the MPEG Model 1 Layer 1 psychoacoustic model. The basic algorithm is as follows:

1. Calculate the power spectrum.
2. Identify the tonal and non-tonal components.
3. Decimate the maskers to eliminate all irrelevant maskers.
4. Compute the individual masking thresholds.
5. Compute the global masking threshold.
6. Determine the minimum masking threshold in each subband.
7. Shape the power of the message below the masking threshold.

The psychoacoustic model identifies components in the signal that do not affect perception. The masking threshold defines the frequency response of the loudness threshold minimum filter, which is used to shape the message. The filtered message is scaled to shift the message noise and added to the delayed original signal in order to produce the “dirty” track.
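Returning to the time-domain amplitude modulation method, the embedding and extraction steps can be sketched as below. This is only an illustration under stated assumptions: the scale factors 0.98 and 0.99 are picked for the example, leftover samples after the last full block are left unscaled, and the function names `tdam_embed` and `tdam_extract` are hypothetical. Extraction compares the summed magnitude of each “dirty” block against the original scaled by the lowest factor, which is one possible reading of the comparison described for TDAM.

```python
LOW, HIGH = 0.98, 0.99  # assumed scale factors for a 0 bit and a 1 bit


def tdam_embed(samples, bits, low=LOW, high=HIGH):
    """Scale each block of the signal by `low` or `high` according to one message bit."""
    if not bits or len(bits) > len(samples):
        raise ValueError("need at least one sample per message bit")
    block = len(samples) // len(bits)  # block size = samples per message bit
    dirty = [float(s) for s in samples]  # trailing leftover samples stay unscaled
    for i, bit in enumerate(bits):
        factor = high if bit else low
        for j in range(i * block, (i + 1) * block):
            dirty[j] = factor * samples[j]
    return dirty


def tdam_extract(dirty, original, n_bits, low=LOW):
    """Recover bits by comparing each dirty block with the original scaled by `low`."""
    block = len(original) // n_bits
    bits = []
    for i in range(n_bits):
        span = range(i * block, (i + 1) * block)
        dirty_level = sum(abs(dirty[j]) for j in span)
        floor_level = sum(abs(low * original[j]) for j in span)
        # A block scaled by the higher factor is strictly louder than the floor.
        bits.append(1 if dirty_level > floor_level else 0)
    return bits


if __name__ == "__main__":
    original = [((i * 37) % 201) - 100 for i in range(80)]  # made-up host signal
    message_bits = [1, 0, 1, 1, 0, 0, 1, 0]
    dirty = tdam_embed(original, message_bits)
    print(tdam_extract(dirty, original, len(message_bits)))
```

A completely silent block cannot carry a 1 bit in this sketch, because scaling zeros by either factor gives the same result.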
Results

[Figure: LSB Coding for Mono Wav]
[Figure: LSB Coding for Stereo Wav]
[Figure: Time Domain Amplitude Modulation for Mono Wav]
[Figure: Time Domain Amplitude Modulation for Stereo Wav]
Discussion and Conclusion

We tested on both mono and stereo channel wave files. These are depicted in the results section above. It should be noted that the magnitude axis for the mono channel is always twice as large as that of the stereo files due to an artifact from the preprocessing stage, where we converted the original stereo wav file to a mono wav file. Also, note that our time-domain amplitude modulation approach currently outputs a mono channel WAVE format file regardless of the number of channels available in the input.

In LSB coding, when modifying the least significant bit in the first coding system, the “bin” into which the quantized signal falls is being directly modified. Since we’re only modifying the quantization level by one, at worst, we’re only modifying the time-domain signal by a small value that’s dependent on the number of bits used for quantization. Effectively, we’re increasing the noise due to quantization and hiding the message in this noise. This induces a small penalty due to being an audibly perceptible modification.

In TDAM coding, we’re decreasing the amplitude of the time-domain signal by 1-2%. This will primarily affect the perceived loudness of the sound. Additionally, this coding system could slightly affect perception of pitch, due to intensity-dependent factors related to the perception of pitch. However, because the scale is small, this system produces a “dirty” audio signal that yields a negligible difference from the magnitude of the original signal.

Unfortunately, time did not allow for the completion of the MPEG-based steganography system implementation prior to final reporting. However, the hypothesis that each approach is successively better than the previous held for the two methods we completed, which suggests that, when completed, this technique will be superior to the others.

Bibliography

Arnold, Michael. “Audio Watermarking.” Published: November 1, 2001. Access: December 2, 2009.

Cvejic, Nedeljko. “Algorithms for Audio Watermarking and Steganography.” University of Oulu. 2004.

Garcia, R.A. “Digital Watermarking of Audio Signals using Psychoacoustic Auditory Model and Spread Spectrum Theory.” Preprints-Audio Engineering Society. Citeseer. 1999.

Petitcolas, Fabien. “MPEG for MATLAB.” Published: August 11, 2003. Access: December 2, 2009.

Welsh, Eric. Chen, Alex. Shehad, Nader. Virani, Aamir. “W.A.V.S Compression.”

Wikipedia. “Steganography.” Access: November 17, 2009.

Wilson, Scott. “Microsoft WAVE soundfile format.” Published: January 20, 2003. Access: November 19, 2009.