Reverse-engineering a proprietary sound sample format: A detective story Andrew Bulhak  http://dev.null.org/acb/

Reverse-Engineering a Proprietary Sound Sample Format

  1. Reverse-engineering a proprietary sound sample format: A detective story
     Andrew Bulhak
     http://dev.null.org/acb/
  2. The Problem
     • You make electronic music with softsynth plugins
     • Your drum machine plugin uses a proprietary format for its samples
     • You want to move your samples to an open format, like AIFF
  3. Why?
     • To use your samples (or presets) with other software
     • To use samples with hardware devices
     • Because open formats are always better than proprietary ones
     • For the challenge
  4. The Drum Plugin
     • Linplug RMIV
  5. The Drum Plugin (2)
     • Linplug RMIV
         • A VST/AudioUnit plugin that works with sequencer software
         • Can play both sample-based and synthesised sounds
         • Stores samples in a custom format named .D4T
         • Can import sounds in AIFF and WAV formats, converting them to .D4T
  6. Examining the D4T format
     • Looking at the same sounds in both WAV and D4T format:
     • Do you notice a pattern?
         % ls -l
         -rw-r--r-- 1 acb staff 20586 26 May 2000 606bd.wav
         -rw-r--r-- 1 acb staff 40904 11 Apr 14:06 606bd.D4T
         -rw-r--r-- 1 acb staff 15182 26 May 2000 606ch.wav
         -rw-r--r-- 1 acb staff 30096 11 Apr 14:06 606ch.D4T
         -rw-r--r-- 1 acb staff 33900 26 May 2000 606ht.wav
         -rw-r--r-- 1 acb staff 67536 11 Apr 14:06 606ht.D4T
         -rw-r--r-- 1 acb staff 31426 26 May 2000 606lt.wav
         -rw-r--r-- 1 acb staff 62588 11 Apr 14:06 606lt.D4T
  7. The D4T Format (2)
     • Each D4T file is roughly twice the size of its corresponding WAV
     • The WAVs use 16-bit samples
     • Therefore, D4T uses 32-bit samples
     • Hypothesis: D4T uses 32-bit float samples, between -1.0 and 1.0 (as that makes more sense than 32-bit ints).
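The "roughly twice the size" observation can be checked with a few lines of Python. This is only a sketch: the sizes are copied from the `ls -l` listing above, and the names `sizes` and `ratios` are ours.

```python
# File sizes in bytes, copied from the `ls -l` listing: (WAV, D4T).
sizes = {
    "606bd": (20586, 40904),
    "606ch": (15182, 30096),
    "606ht": (33900, 67536),
    "606lt": (31426, 62588),
}

# Ratio of D4T size to WAV size for each sound.
ratios = {name: d4t / wav for name, (wav, d4t) in sizes.items()}

for name, ratio in ratios.items():
    print("%s: D4T/WAV = %.3f" % (name, ratio))
```

Every ratio comes out slightly under 2, which is consistent with 16-bit samples becoming 32-bit ones while the header overhead stays small.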
  8. Testing the Hypothesis
     • Using hexdump(1), examine the first nonzero samples of a WAV and a D4T. Then see if they're equivalent.
     • I.e., if W is the WAV sample and D is the D4T one, check whether W = int(D * 0x8000)
  9. Testing the Hypothesis (2)
         % hexdump -C 606bd.wav |head
         00000000 52 49 46 46 62 50 00 00 57 41 56 45 66 6d 74 20 |RIFFbP..WAVEfmt |
         00000010 10 00 00 00 01 00 01 00 44 ac 00 00 88 58 01 00 |........D?...X..|
         00000020 02 00 10 00 64 61 74 61 d0 4f 00 00 77 03 96 0b |....data?O..w...|
         % hexdump 606bd.D4T|head
         0000000 00 00 00 00 00 00 00 00 00 04 08 40 01 04 29 00
         0000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
         0000020 00 00 00 00 00 00 00 00 00 c0 dd 3c 00 60 b9 3d
     • Thus our first samples are 0x0377 (stored little-endian as 77 03 in the WAV) and the float represented by 00 c0 dd 3c in the D4T.
  10. Testing the Hypothesis (3)
      • We can use Python's struct module to unpack binary floats.
          >>> import struct
          >>> struct.unpack('<f', b'\x00\xc0\xdd\x3c')
          (0.027069091796875,)
      • Aha, a float between -1.0 and 1.0.
          >>> int(struct.unpack('<f', b'\x00\xc0\xdd\x3c')[0] * 0x8000)
          887
          >>> '%x' % 887
          '377'
      • Eureka!
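The same check can be extended to the second sample visible in both hexdumps (96 0b in the WAV, 00 60 b9 3d in the D4T). A small sketch, with the bytes copied from the dumps above and variable names of our own choosing:

```python
import struct

# Bytes copied from the two hexdumps above.
wav_bytes = bytes.fromhex("7703960b")          # two 16-bit little-endian ints
d4t_bytes = bytes.fromhex("00c0dd3c0060b93d")  # two 32-bit little-endian floats

wav_samples = struct.unpack("<2h", wav_bytes)
d4t_samples = struct.unpack("<2f", d4t_bytes)

# Scale each float by 0x8000 and compare with the corresponding WAV sample.
converted = tuple(int(d * 0x8000) for d in d4t_samples)

print(wav_samples, converted)
```

Both pairs agree, so the match on the first sample was not a coincidence.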
  11. A First Attempt
      • We can now write a simple Python script for converting sound files. Our script will:
          • Open a .D4T file
          • Skip 40 bytes
          • Read the remainder of the file, treating it as floats
          • Write all that to an AIFF file (using Python's aifc module).
              • You could just as easily write WAV
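The steps above can be sketched as follows. This is not the author's script: it writes WAV through the standard `wave` module rather than AIFF, works on bytes in memory instead of file paths, and the function name `naive_d4t_to_wav` and the clamping step are our own assumptions.

```python
import io
import struct
import wave

HEADER_SIZE = 40  # bytes to skip before the sample data


def naive_d4t_to_wav(d4t_data, sample_rate=44100):
    """Convert raw D4T bytes to WAV bytes, assuming mono at a fixed rate."""
    body = d4t_data[HEADER_SIZE:]
    count = len(body) // 4
    # Treat the rest of the file as 32-bit little-endian floats.
    floats = struct.unpack("<%df" % count, body[:count * 4])
    # Scale to 16-bit ints; clamp because 1.0 * 0x8000 would not fit.
    ints = [max(-0x8000, min(0x7FFF, int(f * 0x8000))) for f in floats]
    out = io.BytesIO()
    with wave.open(out, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(struct.pack("<%dh" % count, *ints))
    return out.getvalue()
```

To convert a real file, read it with `open(path, "rb").read()` and write the returned bytes to a `.wav` file.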
  12. A First Attempt (2)
      • Our script has a few limitations:
          • It assumes a default sample rate (44100, in this case)
          • It assumes samples have one channel
              • If a D4T has two channels, the resulting AIFF will be mono, with one channel's audio playing after the other's.
      • As such, we need to extract more information from our D4T files.
  13. The D4T Header
      • The secrets must be encoded in the nonzero bytes at the top of the file.
          % hexdump 606bd.D4T|head
          0000000 00 00 00 00 00 00 00 00 00 04 08 40 01 04 29 00
          0000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
          0000020 00 00 00 00 00 00 00 00 00 c0 dd 3c 00 60 b9 3d
      • But what do they mean?
  14. Number of Channels
      • We examine two D4T files, one mono and one stereo:
          % hexdump 606bd.D4T|head
          0000000 00 00 00 00 00 00 00 00 00 04 08 40 01 04 29 00
          0000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
          0000020 00 00 00 00 00 00 00 00 00 c0 dd 3c 00 60 b9 3d
          % hexdump BgBeatSnare.D4T|head -2
          0000000 00 00 00 00 00 00 00 00 00 1d 2c 00 02 04 29 00
          0000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      • Aha! We've found our channel count: the byte at offset 12 is 01 in the mono file and 02 in the stereo one.
  15. File Size
      • The group of bytes immediately before the channel count grows with the file size, but doesn't translate into a sensible number of either bytes or samples.
          % ls -l
          -rw-r--r-- 1 acb wheel 40904 11 Apr 14:06 606bd.D4T
          -rwxr-xr-x 1 acb wheel 697976 8 Jul 00:42 808Kick.D4T
          -rwxr-xr-x 1 acb wheel 8152 8 Jul 00:42 GenericSynBass.D4T
          % hexdump 606bd.D4T | head -1
          0000000 00 00 00 00 00 00 00 00 00 04 08 40 01 04 29 00
          % hexdump 808Kick.D4T | head -1
          0000000 00 00 00 00 00 00 00 00 00 45 4f 24 02 04 29 00
          % hexdump GenericSynBass.D4T | head -1
          0000000 00 00 00 00 00 00 00 00 00 00 51 0c 01 01 63 4d
  16. File Size (2)
      • We create a few AIFFs of specific sizes, get RMIV to convert them into .D4Ts, and examine those:
          % ls -l
          -rw-r--r-- 1 acb admin 132 8 Jul 14:18 test_23x1.D4T
          -rw-r--r-- 1 acb admin 136 8 Jul 14:18 test_24x1.D4T
          -rw-r--r-- 1 acb admin 140 8 Jul 14:18 test_25x1.D4T
          % hexdump test_23x1.D4T |head -1
          0000000 00 00 00 00 00 00 00 00 00 00 00 5c 01 04 29 00
          % hexdump test_24x1.D4T |head -1
          0000000 00 00 00 00 00 00 00 00 00 00 00 60 01 04 29 00
          % hexdump test_25x1.D4T |head -1
          0000000 00 00 00 00 00 00 00 00 00 00 01 00 01 04 29 00
      • At small sizes, this looks like a byte count (0x5c = 92 and 0x60 = 96, the file size minus the 40-byte header), though after 96 bytes there's a discontinuity: 100 bytes is stored as 01 00.
  17. File Size - The Answer
      • It appears that each byte in the file size can only contain a value under 100 (0x64).
      • From this, we can conclude that the size is encoded in binary-coded centimal.
          • Each byte contains a base-100 digit, or two decimal digits, as a number from 0 to 99.
          • I.e., 10632 would be 0x01 0x06 0x20
          • Why? I have no idea.
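Binary-coded centimal is easy to decode: treat each byte as one base-100 digit, most significant first. A minimal sketch; the function names `decode_bcc` and `encode_bcc` are ours, not taken from the original script.

```python
def decode_bcc(data):
    """Decode binary-coded centimal: one base-100 digit per byte, big-endian."""
    value = 0
    for byte in data:
        if byte > 99:
            raise ValueError("not a base-100 digit: 0x%02x" % byte)
        value = value * 100 + byte
    return value


def encode_bcc(value, length):
    """Encode a non-negative integer as `length` base-100 digits."""
    digits = []
    for _ in range(length):
        value, digit = divmod(value, 100)
        digits.append(digit)
    if value:
        raise OverflowError("value does not fit in %d BCC bytes" % length)
    return bytes(reversed(digits))


# The size fields from the hexdumps, decoded.
for name, field in [("606bd", "00040840"), ("808Kick", "00454f24"),
                    ("GenericSynBass", "0000510c"), ("test_25x1", "00000100")]:
    print(name, decode_bcc(bytes.fromhex(field)))
```

Each decoded value equals the size of the corresponding file minus 40 bytes, i.e. the length of the sample data that follows the header.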
  18. Sample Rate
      • Now that we know about binary-coded centimal, the rest of the puzzle falls into place.
      • The remaining 3 bytes after the channel count are the sample rate in BCC, i.e.:
          % hexdump 606bd.D4T|head
          0000000 00 00 00 00 00 00 00 00 00 04 08 40 01 04 29 00
      • 0x04 0x29 0x00 = 04 41 00 = 44100
  19. Summary: The D4T Format
      • We now know the structure of a .D4T:
          • 8 zero bytes
          • 4-byte total length (L) of sample data, in bytes, BCC-encoded
          • 1-byte channel count (C)
          • 3-byte sample rate, BCC-encoded
          • 24 zero bytes
          • for i in 1..C:
              • (L / C) / 4 samples (32-bit little-endian floats), comprising that channel
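The layout above translates fairly directly into a parser. The following is a sketch under the assumptions stated in the summary, not the author's converter; `bcc`, `parse_d4t` and `interleave` are hypothetical names. Since each channel is stored as one contiguous run, the samples have to be interleaved before they can be written to a frame-based format such as WAV or AIFF.

```python
import struct


def bcc(data):
    """Decode binary-coded centimal (one base-100 digit per byte)."""
    value = 0
    for byte in data:
        value = value * 100 + byte
    return value


def parse_d4t(data):
    """Split raw D4T bytes into (channel_count, sample_rate, planes)."""
    if len(data) < 40 or data[:8] != bytes(8):
        raise ValueError("does not look like a D4T header")
    length = bcc(data[8:12])    # L: sample data length in bytes
    channels = data[12]         # C: channel count
    rate = bcc(data[13:16])     # sample rate
    if channels == 0 or len(data) < 40 + length:
        raise ValueError("inconsistent D4T header")
    per_channel = (length // channels) // 4
    total = per_channel * channels
    floats = struct.unpack("<%df" % total, data[40:40 + total * 4])
    # Channels are stored one after another, not interleaved.
    planes = [floats[i * per_channel:(i + 1) * per_channel]
              for i in range(channels)]
    return channels, rate, planes


def interleave(planes):
    """Reorder per-channel runs into frame order (L, R, L, R, ...)."""
    return [sample for frame in zip(*planes) for sample in frame]
```

For a stereo file, `interleave(planes)` yields the left and right samples alternately, which is the order a WAV or AIFF writer expects.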
  20. Conclusion
      • Armed with this knowledge, it is possible to write a Python script to convert D4T files to AIFF.
      • That script lives at http://dev.null.org/code/dermiv/
