1. CIIC 5995-100 / ICOM 5995-100
Human Perspective in Artificial Intelligence
(HPAI)
Professor José Meléndez, PhD
"Music is the universal language of mankind."
- Henry Wadsworth Longfellow (1807-1882)
2. Previously in HPAI CIIC/ICOM 5995
• The “Environment” is Reality
• Only 5 types of Sensible
Energies or Interactions.
• Our “Representations” of our
Observable Environment Create
Our Realities
• Human senses only detect a
very small part of what is
sensible of Reality
• Some animals can do better…
3. Next Up – Human vs. Artificial Sensing
• Sound Sensing
• How we Hear
• Frequency Range
• Dynamic Ranges
• Artificial Hearing
• Light Sensing
• Smell Sensing
• Taste Sensing
• Touch Sensing
• Internal Sensing
4. Human Ear – aka “Mic”
• Outer Ear collects vibrating air to move Ear Drum
• Middle Ear couples energy into vibrating bones
• Inner Ear couples energy into vibrating fluid
• Vibrating hair cells in the Cochlea transform energy into electrical impulses
17. Sound Intensity vs. Volume
http://hearinglosshelp.com/blog/converting-decibels-to-sound-intensities/
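A minimal sketch of the decibel-to-intensity conversion this slide points to, assuming the conventional reference intensity of 1e-12 W/m^2 for 0 dB. The function names are illustrative and not from the course material.

```python
# Reference intensity for 0 dB (assumed: the conventional threshold of hearing).
I0 = 1e-12  # W/m^2

def db_to_intensity(level_db):
    """Convert a sound intensity level in dB to an intensity in W/m^2."""
    return I0 * 10 ** (level_db / 10)

def intensity_ratio(db_a, db_b):
    """How many times more intense a sound at db_a is than one at db_b."""
    return 10 ** ((db_a - db_b) / 10)

# Every 10 dB step multiplies the intensity by 10.
for level in (0, 30, 60, 90, 120):
    print(f"{level:3d} dB -> {db_to_intensity(level):.1e} W/m^2")
```

The scale is logarithmic, so a 20 dB difference corresponds to a 100x difference in intensity, not 2x.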
18. Sound Range at Middle Frequency
http://www.cochlea.org/en/hear/human-auditory-range
At 1-2 kHz
19. Sound Range at Middle Frequency
http://www.cochlea.org/en/hear/human-auditory-range
• Human conversation takes place well within the frequency range heard
• How low a volume we can hear varies with frequency
• How high a volume we can hear also varies with frequency
23. Next Up – Human vs. Artificial Sensing
• Sound Sensing
• Artificial Hearing
• Light Sensing
• Smell Sensing
• Taste Sensing
• Touch Sensing
• Internal Sensing
24. Alexander Graham Bell’s Mic
• Upper Box collects vibrating air to move Lower Diaphragm (“Drum”)
• Drum couples energy into attached vibrating Pin
• Pin contact area with conducting fluid (acid) changes
• Vibrating energy transforms into oscillating electrical current
https://www.antiquetelephonehistory.com/images/first-liquid.jpg
25. Cochlear Implants
• Microphones transform vibrations into digital audio signals
• Digital audio signals are transmitted into the ear by radio frequency (RF)
• RF signals are converted to electrical impulses at the electrode array in the cochlea
Cochlear Implants, Blausen.com staff (2014). "Medical gallery of Blausen Medical 2014"
26. Sound Sensors – “Microphones”
Courtesy of Vesper
• Vibrating air moves Diaphragm
• Vibrating Diaphragm and
“backplate” transform energy
into changing electrical capacitance
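A rough sketch of why a moving diaphragm changes capacitance, assuming the diaphragm and backplate behave like an ideal parallel-plate capacitor (C = epsilon_0 * A / d). The plate area and gap below are hypothetical values chosen for illustration, not Vesper specifications.

```python
EPSILON_0 = 8.854e-12  # permittivity of free space, F/m

def plate_capacitance(area_m2, gap_m):
    """Ideal parallel-plate capacitance with an air gap."""
    return EPSILON_0 * area_m2 / gap_m

AREA = 1e-6      # hypothetical diaphragm area: 1 mm^2
REST_GAP = 4e-6  # hypothetical rest gap: 4 um

# Positive displacement = diaphragm pushed toward the backplate.
for displacement in (-0.5e-6, 0.0, 0.5e-6):
    gap = REST_GAP - displacement
    c = plate_capacitance(AREA, gap)
    print(f"gap {gap * 1e6:.1f} um -> {c * 1e12:.3f} pF")
```

As the sound pressure narrows the gap the capacitance rises, and as it widens the gap the capacitance falls, which is the signal the readout electronics measure.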
27. Sound Sensors – “Microphones”
Courtesy of Vesper
• Vibrating air moves Plates
• Plates couple energy into
vibrating Piezoelectric Film
• Vibrating Piezoelectric film
transforms energy into
electrical charge
30. Human Sound Applications of AI
• Voice Machine Communications
• Audio Fingerprinting
• Audio Classification / Labeling
• Voice to Text
• Audio Source Separation
• Audio Search
• Music Transcription
• Voice Language Translation
• Music Recommendation
SOME EXAMPLES
31. Audio Classification Example
• Audio Detection (microphone)
• Convert vibrational energy to analog electric signals
• Convert analog electric signals to digital data
• Digitization and Storage
• Convert digital data to audio file format of choice
• Save data
• Access and Conditioning
• Decode the saved audio file back to digital data
• Extract features of interest from the audio data
• Train AI on the selected features
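The steps above can be sketched end to end with the Python standard library. This is a toy illustration under stated assumptions: a synthesized sine tone stands in for the microphone and ADC, WAV is the file format of choice, zero-crossing rate is the single extracted feature, and a nearest-centroid rule stands in for "AI" training. All function names are hypothetical.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 8000  # Hz, illustrative choice

def synth_tone(freq_hz, seconds=0.25, amplitude=0.8):
    """Stand-in for microphone + ADC: a sine tone as 16-bit integers."""
    n = int(SAMPLE_RATE * seconds)
    return [int(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
            for i in range(n)]

def save_wav(samples):
    """Digitization and storage: pack samples into an in-memory WAV file."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes = 16 bits per sample
        w.setframerate(SAMPLE_RATE)
        w.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buf.getvalue()

def load_wav(data):
    """Access: decode the WAV bytes back into a list of integer samples."""
    with wave.open(io.BytesIO(data), "rb") as w:
        frames = w.readframes(w.getnframes())
    return list(struct.unpack(f"<{len(frames) // 2}h", frames))

def zero_crossing_rate(samples):
    """Feature: fraction of adjacent sample pairs whose sign differs."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return crossings / (len(samples) - 1)

def train(labelled_features):
    """Nearest-centroid 'training': the mean feature value per label."""
    sums = {}
    for label, feat in labelled_features:
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + feat, count + 1)
    return {label: total / count for label, (total, count) in sums.items()}

def classify(model, feat):
    """Pick the label whose centroid is closest to the feature."""
    return min(model, key=lambda label: abs(model[label] - feat))

def feature_from_tone(freq_hz):
    """Run one tone through the whole chain: detect, store, load, extract."""
    return zero_crossing_rate(load_wav(save_wav(synth_tone(freq_hz))))

training = ([("low", feature_from_tone(f)) for f in (200, 250, 300)]
            + [("high", feature_from_tone(f)) for f in (1500, 1800, 2000)])
model = train(training)
print(classify(model, feature_from_tone(280)))
print(classify(model, feature_from_tone(1700)))
```

A real classifier would use richer features (for example, spectral features) and a trained model, but the stages map one to one onto the bullets above.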
33. Audio Classification Example
Audio Signal Chain – Analog to Digital Conversion
Resolution = 2^b levels, where b = number of bits (16-24 typical)
Sampling Frequency (typically > 2x the maximum content frequency)
https://www.analyticsvidhya.com/blog/2017/08/audio-voice-processing-deep-learning/
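A short sketch of the two conversion parameters on this slide: resolution as 2^b quantization levels, and a sampling frequency above twice the highest content frequency. The helper names are illustrative, and the 20 kHz figure is an assumed upper limit for human hearing.

```python
import math

def quantization_levels(bits):
    """Number of distinct amplitude values a b-bit converter can output."""
    return 2 ** bits

def dynamic_range_db(bits):
    """Ratio between full scale and one step, in dB: 20 * log10(2^b)."""
    return 20 * math.log10(2 ** bits)

def min_sampling_rate(max_content_hz):
    """Lower bound on the sampling rate; practical rates sit above it."""
    return 2 * max_content_hz

def quantize(x, bits):
    """Round x in [-1, 1] to the nearest step of a signed b-bit grid."""
    top = 2 ** (bits - 1) - 1
    return round(x * top) / top

for bits in (16, 24):
    print(f"{bits} bits: {quantization_levels(bits)} levels, "
          f"about {dynamic_range_db(bits):.1f} dB")

print("Minimum rate for 20 kHz content:", min_sampling_rate(20000), "Hz")
print("0.3 quantized to 4 bits:", quantize(0.3, 4))
```

Each extra bit doubles the number of levels, which adds roughly 6 dB of dynamic range.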