1. Digital Speech within 125 Hz Bandwidth (DS-125)
by Michael E. Lebo
Abstract: Sound cards have 24 bit accuracy and 96,000 Hz sample rates, which lets
artificial intelligence methods find unique sounds within live voice every .008
seconds. A new binary code sends the speech and timing within 125 Hz bandwidth
to a lookup table of audio clips, which are played and understood in real time. We
can recognize sounds within this synthetic live voice and ignore distortion between
these sounds. Someone, anyone, please finish this project for me.
Distortion: The Shannon-Hartley Theorem implies the transmission bandwidth cannot be
less than the bandwidth of voice without adding distortion. Let’s add that distortion
at the right time. The project name is digital speech, not digital voice. What if up
to 40% of the live voice is distorted? Would that constitute failure? What if your
mind didn’t recognize distortion during that 40% of the time? There are known sounds that make
up speech and there must be transition times between those known sounds,
because the parts of the mouth, tongue, lips, jaw, etc. take time to change from one
known sound to another. Although your ears can hear these transitions, why should
your mind use them?
Synthetic live voice is played on the computer’s speaker using extremely short audio
clips. The audio clips were made in the past from your live voice. But the phase at
the end of any audio clip must match the phase at the beginning of the next audio
clip or distortion occurs. If the amplitude of all audio clips at the beginning and end
was zero, the phase would always match, but this reduction of amplitude creates a
new kind of distortion. The solution is to extend the duration of these audio clips by
.04 seconds (five times faster than an eye blink or 3840 samples of the master clock)
and use that time to slowly reduce the amplitude at the end of the first audio clip
from full to zero and to slowly raise the amplitude at the beginning of the next audio
clip from zero to full, then overlap the audio clips by .04 seconds. This is the time
when the 40% distortion fulfills the Shannon-Hartley Theorem.
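The overlap described above can be sketched in a few lines. This is only an illustration: the linear ramp shape and the function name crossfade are assumptions, since the text says only that the amplitude is reduced and raised "slowly". The clips are plain lists of samples.

```python
SAMPLE_RATE = 96_000                  # master clock, samples per second
OVERLAP = round(0.04 * SAMPLE_RATE)   # 0.04 s overlap = 3840 samples

def crossfade(first, second, overlap=OVERLAP):
    """Join two clips, overlapping the end of `first` with the start of `second`.

    Assumption: a linear fade. `first` ramps from full to zero while
    `second` ramps from zero to full over the same `overlap` samples.
    """
    if overlap > min(len(first), len(second)):
        raise ValueError("both clips must be at least as long as the overlap")
    split = len(first) - overlap
    mixed = []
    for i in range(overlap):
        fade_in = i / overlap          # rises from 0 toward 1
        fade_out = 1.0 - fade_in       # falls from 1 toward 0
        mixed.append(first[split + i] * fade_out + second[i] * fade_in)
    # Untouched start of the first clip, the mixed region, untouched rest of the second.
    return first[:split] + mixed + second[overlap:]
```

Because the two ramps always add up to one, two clips with the same steady level join without a dip in amplitude. The joined clip is shorter than the two clips laid end to end by exactly the overlap.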
2. Proof-of-concept: For a development system only one computer is needed to take
your live voice from the microphone to a headset. Others who want to repeat this
proof-of-concept must use their live voice to make the required parts of this project.
In the final system a one or a zero goes into a transport device every .008 seconds
(768 samples), a rate of 125 digits per second, which is the 125 Hz bandwidth. A sound
detector that continually finds sounds in .008 second steps is required to know whether
the next transport digit is a one or a zero. But sound detectors can make distortion.
I have limited the total number
of sounds in voice to 88, but what if more are needed? Linguists have found
between 40 and 47 phonemes that make up the English language, meaning there is a
low risk of adding distortion. I have divided voice into 16 logarithmic bands of
frequencies for this sound detector. The cochlear implant, used by deaf people to
hear, has only eleven frequency bands, meaning there is a low risk of adding
distortion. The 16 filters are not perfect, but they are very fast. Because the same
filters are used to make the numbers and compare those numbers, the errors cancel.
Each of the 16 filters has an amplitude detector and a frequency detector. The 32
output numbers from these detectors, together with their first and second derivatives,
total 96 numbers every .008 seconds. Some of the 96 numbers make the 88
comparisons for your known sounds. But most of the time the 96 numbers from the
sound detector do not match any of your 88 known sounds. Those .008 second time
slots are assigned as unknowns.
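A rough sketch of how the 96 numbers could be assembled and compared every .008 seconds follows. Everything in it is an assumption made for illustration: the derivatives are taken as simple differences between consecutive time slots, and each known sound is stored as a set of allowed ranges for only some of the 96 numbers. The names feature_vector and classify are hypothetical.

```python
BANDS = 16                 # logarithmic frequency bands
DETECTORS = 2 * BANDS      # one amplitude and one frequency detector per band
SLOT_SAMPLES = 768         # 0.008 s at 96,000 samples per second

def feature_vector(older, previous, current):
    """Build the 96 numbers for one time slot from three 32-number readings.

    Assumption: derivatives are finite differences between time slots.
    """
    if not (len(older) == len(previous) == len(current) == DETECTORS):
        raise ValueError("each reading must hold 32 detector outputs")
    first = [c - p for c, p in zip(current, previous)]
    second = [c - 2 * p + o for c, p, o in zip(current, previous, older)]
    return list(current) + first + second      # 32 + 32 + 32 = 96

def classify(vector, known_sounds):
    """Return the index of the first matching known sound, or None.

    Assumption: each known sound is a dict {index into vector: (low, high)},
    so only some of the 96 numbers take part in each comparison.
    """
    for sound_id, ranges in enumerate(known_sounds):
        if all(low <= vector[i] <= high for i, (low, high) in ranges.items()):
            return sound_id
    return None   # this time slot is assigned as unknown
```

A slot that matches none of the stored ranges comes back as None, which corresponds to the unknown time slots described above.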
You cannot change your mouth, tongue, lips, jaw, etc. in .008 seconds. When the
sound detector finds a match to your unique cluster of numbers for one of your
known sounds, that cluster often repeats. Sounds come in groups. Between your
known sounds are unknown sounds. The rule to fix the unknown sounds is to make
half of the unknown time slots match your earlier known sound, and the rest of the
unknown time slots match your next known sound. Each of your new known sound
groups without unknown time slots must fit a unique code of ones and zeros at a 125
Hz rate to be sent into the transport device for each of your 88 sounds.
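The rule for fixing unknown time slots can be sketched as below. The list holds one entry per .008 second slot, with None for an unknown. Two details are assumptions, because the text does not cover them: when a run of unknowns has an odd length the extra slot goes to the next known sound, and a run at the very start or end of the list takes the only neighbor it has.

```python
def fill_unknowns(slots):
    """Replace each run of None with the neighboring known sounds.

    The first half of a run takes the earlier known sound, the rest
    takes the next known sound.
    """
    filled = list(slots)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i
        while i < n and filled[i] is None:
            i += 1
        end = i                                   # one past the run
        before = filled[start - 1] if start > 0 else None
        after = filled[end] if end < n else None
        if before is None:                        # run at the start (assumption)
            before = after
        if after is None:                         # run at the end (assumption)
            after = before
        half = (end - start) // 2
        for j in range(start, end):
            filled[j] = before if j < start + half else after
    return filled
```

For example, fill_unknowns([5, 5, None, None, None, None, 7, 7]) gives [5, 5, 5, 5, 7, 7, 7, 7], so the two sound groups grow to meet in the middle of the gap.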
3. The Lebo code: I invented the Lebo code of ones and zeros, which contains both
speech and timing. All Lebo codes start with a one and end with two or more zeros,
which makes 88 minimum-length Lebo codes with eleven or fewer digits. To extend
the time of any of the 88 Lebo codes in .008 second steps, add extra zeros to its
minimum number of digits. The time to send any Lebo code with extensions through
a transport device is exactly the same time that plays its extended audio clip. A
special Lebo code turns on the squelch that mutes the audio just before the transport
device stops sending. The minimum number of digits in the 88 Lebo codes ranges
from 3 to 11 with associated audio clips lasting from .024 to .088 seconds (2304 to
8448 samples). In live voice some sounds have a shorter duration and should be
assigned to Lebo codes with a smaller minimum number of digits. The smallest Lebo
code is assigned to “no sound”, which is the most used of all sounds, the easiest
sound to detect and has no distortion. Sounds with a longer duration should be
assigned to Lebo codes with a larger minimum number of digits.
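The count of 88 can be checked with a short enumeration. One rule used here is an assumption that the text does not state: no two zeros in a row appear before the ending zeros. It is assumed because it lets a receiver find where each code ends, and because it gives exactly 88 codes of 3 to 11 digits. The function names are hypothetical.

```python
import re

SLOT_SAMPLES = 768    # one digit lasts 0.008 s, or 768 samples at 96,000 Hz

def minimal_codes(max_digits=11):
    """List every minimum-length code: starts with 1, ends with exactly 00.

    Assumption: the part before the final 00 ends with 1 and never
    contains two zeros in a row.
    """
    codes = []
    for digits in range(3, max_digits + 1):
        body_len = digits - 2
        for value in range(1 << body_len):
            body = format(value, f"0{body_len}b")
            if body[0] == "1" and body[-1] == "1" and "00" not in body:
                codes.append(body + "00")
    return codes

def extend(code, extra_slots):
    """Lengthen a code by 0.008 s per extra zero."""
    return code + "0" * extra_slots

def clip_samples(code):
    """Samples of audio played while the code is being sent."""
    return len(code) * SLOT_SAMPLES

def split_stream(bits):
    """Cut a received digit stream back into codes with their extensions."""
    return re.findall(r"1(?:0?1)*00+", bits)

codes = minimal_codes()
print(len(codes))                                      # 88
print(codes[0], clip_samples(codes[0]))                # 100 2304
print(len(codes[-1]), clip_samples(codes[-1]))         # 11 8448
```

Under this assumption the shortest code is 100, which would be the one assigned to “no sound”, and split_stream("10011000101000") returns the three codes 100, 11000 and 101000 with their extra zeros still attached.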
When a sender talks too fast, a new sound group can be shorter than the minimum
number of digits of its Lebo code. If the group fills half or more of that minimum,
the full minimum-length Lebo code is sent anyway, and for each missing digit that was
added, one zero is removed from a future Lebo code that has extra zeros. If the group
fills less than half of the minimum, the new sound group is removed, and for each
digit of the removed group one extra zero is added to a future Lebo code. This adds a
little time distortion, but our minds can compensate. You would tell the sender to
slow down.
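One possible reading of this rule is sketched below. It is an interpretation, not a confirmed design: the running total named debt, the choice to change at most one zero per future code, and the function name schedule are all assumptions. Each sound group is given as a pair of a sound number and the count of .008 second slots it was detected for.

```python
def schedule(groups, min_digits):
    """Decide how many digits to send for each detected sound group.

    groups:     list of (sound_id, slots_detected)
    min_digits: dict mapping sound_id to the minimum digits of its code
    Returns a list of (sound_id, digits_sent).
    """
    sent = []
    debt = 0    # > 0: digits sent ahead of real time; < 0: time still owed
    for sound_id, slots in groups:
        minimum = min_digits[sound_id]
        if slots >= minimum:
            digits = slots
            if debt > 0 and digits > minimum:
                digits -= 1          # remove one extra zero to catch up
                debt -= 1
            elif debt < 0:
                digits += 1          # add one extra zero to fill removed time
                debt += 1
        elif 2 * slots >= minimum:
            digits = minimum         # too short: send the full minimum code
            debt += minimum - slots
        else:
            debt -= slots            # far too short: remove the group
            continue
        sent.append((sound_id, digits))
    return sent
```

With min_digits = {0: 3, 1: 6}, the groups [(1, 4), (0, 5), (0, 5)] are sent as [(1, 6), (0, 4), (0, 4)]. The total stays at 14 digits, so the timing recovers after two codes.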
QPSK-125 error correction and encryption could modify the Lebo code for DS-125.
Why do this? Long ago writers needed a transport device to rapidly send written
words to far away readers. Morse code with a small bandwidth filled part of that
need, even though the penmanship was removed. Today astronauts going to Mars need a
transport device to send their live voices to and from Earth. The Lebo code with a
small bandwidth fills part of that need.
4. Radios are used to send live voice from one place to another. But noise gets in the
way of the signal. If there is not enough signal compared to the noise, you cannot
hear the live voice. Traditionally the signal is increased by amplifiers or bigger
antennas, but the DS-125 software reduces the voice bandwidth, which reduces the
noise. The cost of bigger amplifiers and/or antennas is prohibitive, while DS-125
software can be free and has no weight. The future uses of DS-125 include live
voice from the Earth to the Moon and back to the Earth (EME), live voice that
modulates an Extremely Low Frequency (ELF) radio signal for global deep
underwater voice communications, saving 32,000 hours of only your encrypted live
voice on a 16 Gbit flash drive, etc.
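The storage figure and the noise benefit can be checked with simple arithmetic. Two things here are assumptions: the storage count covers only the raw 125 digits per second with no file or encryption overhead, and the noise comparison uses a 3000 Hz analog voice channel as the reference, a number that does not come from the text. It also assumes noise power is proportional to bandwidth.

```python
import math

DIGIT_RATE = 125    # digits per second, one every 0.008 s

def storage_bits(hours):
    """Raw bits needed to store DS-125 digits for the given number of hours."""
    return DIGIT_RATE * 3600 * hours

def noise_reduction_db(reference_bandwidth_hz, new_bandwidth_hz=125):
    """Drop in noise power, in dB, when the bandwidth is narrowed.

    Assumption: noise power is proportional to bandwidth.
    """
    return 10 * math.log10(reference_bandwidth_hz / new_bandwidth_hz)

print(storage_bits(32_000) / 1e9)             # 14.4 (Gbit)
print(round(noise_reduction_db(3000), 1))     # 13.8 (dB)
```

So 32,000 hours comes to 14.4 Gbit, which fits in 16 Gbit. Against the assumed 3000 Hz channel the noise falls by about 13.8 dB.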
The result: When all 88 of your sounds can be detected, your unknown time
slots are removed, and your new sound groups with less than half of their minimum
number of digits are removed, you can hear your synthetic live voice by playing your
pre-recorded audio clips (arrays of numbers or samples) on the headset. If you can
understand yourself all the time, the development system works. If you can’t
understand yourself, the development system fails. The development system
works only if all parts are complete.
How do you find your unique cluster combinations for each of your 88 sounds to
compare with your 96 unknown numbers every .008 seconds? Example: If the 16
peak amplitude numbers from the 16 filters are below a fixed value, then the
unknown sound must be “no sound”, which is the first of the 88 known sounds. Your
other 87 unique combinations of cluster numbers and your audio clips need to be
made. How do you make them?
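The “no sound” example is the one comparison that is fully described, and it can be written down directly. The threshold value is hypothetical and would have to be found by experiment for each microphone and voice.

```python
BANDS = 16

def is_no_sound(peak_amplitudes, threshold):
    """True when all 16 band peak amplitudes are below the fixed value."""
    if len(peak_amplitudes) != BANDS:
        raise ValueError("expected one peak amplitude per filter band")
    return all(a < threshold for a in peak_amplitudes)
```

A time slot where is_no_sound returns True would be given the first of the 88 known sounds. The other 87 comparisons need cluster numbers that are still to be made.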
To test the proof-of-concept someone needs to finish doing the project with or
without my help. If I am still alive, please contact me at mike.lebo@gmail.com