PSYCHOACOUSTIC AUDIOMETRY AND EVOKED PHYSIOLOGICAL MEASUREMENT OF AUDITORY SENSITIVITY By Dr. Syed Salman Hussaini
PURE-TONE AUDIOMETRY The purpose of pure-tone audiometry is to determine hearing threshold levels for pure tones. The threshold of hearing is defined as the level of a sound at which, under specified conditions, a person gives 50 percent of correct detection responses on repeated trials. Specified conditions means the type of sound and ways of presenting the test sound. The normal test sound is pure tone pulses at standardized frequencies in the range of 125-8000 Hz and the normal presentation mode is monaurally by means of a standardized type of earphone.
Hearing thresholds are sensitive indicators of the functional state of the peripheral part of the auditory sense organ. Pure tone audiometry has become the standard method for quantitative description of degree of hearing loss. It also provides information regarding the localization of the lesion that causes the hearing loss.
Equipment A pure tone audiometer is designed in terms of a set of basic functions. Sinusoidal signals, pure tones, at standardized frequencies in octave steps from 125 to 8000 Hz constitute the basic test signals generated by the oscillator function. In order to obtain improved resolution in the range where most hearing losses occur, the intermediate frequencies 1500, 3000 and 6000 Hz are usually also included. Equipment providing test signals at higher frequencies (8000 to 16,000 Hz) than the conventional range is also available.
Tone pulses are formed by the pulse former function which controls rise time, duration and fall time. The attenuator is a volume control that allows the change of signal level in calibrated steps along a decibel scale. For the normal manual procedures, as well as computer-controlled versions, the optimum step size is 5 dB. The attenuator scale is calibrated in decibel hearing level. This scale has its zero at a sound level which for each tone frequency corresponds approximately to the average normal threshold of hearing for young, otologically normal subjects.
After suitable amplification, the signal is finally delivered to one of the earphones. The most commonly used earphone type is the supra-aural Telephonics TDH-39 or its later versions Telephonics TDH-49 and 50. Larger circum-aural earphones, Sennheiser HDA-200, are housed in relatively effective noise-excluding muffs. Insert earphones of type Etymotic Research EAR-Tone 3A, coupled to the ear canal by means of foam inserts, offer another alternative with superior attenuation of ambient noise compared to supra-aural earphones.
Bone conduction A bone vibrator can be used as an alternative output transducer, whereby mechanical vibrations coupled to the skull bone at the mastoid process behind the external ear are used to stimulate the cochlea through bone conduction. The sensitivity to detect these mechanical vibrations depends on the sensorineural function alone with negligible influence from the outer and middle ears. Thus, a comparison of results obtained by air and by bone conduction provides evidence for the localization of a lesion.
In addition to the direct mechanical pathway to the cochlea there are three additional routes that may interfere. 1. The vibrations of the skull give rise to a relative motion of the ossicular chain, the middle ear component. 2. They also give rise to vibrations of the walls of the external auditory canal, in particular the outer cartilaginous part - this is the external ear component. When the external ear is open, this component is relatively weak, but when the ear canal is occluded, e.g. by an earphone, a hearing aid or a hearing protector, it may become significantly larger.
3. Finally, a fluid component may also provide a pathway for bone conduction stimuli to the inner ear via the cerebrospinal fluid (CSF) and its connection to the perilymph of the inner ear. The actual stimulation of the cochlea by bone conduction stimuli is determined by the vectorial sum of all these components.
Masking noise A test signal from a bone vibrator will reach both cochleae at approximately equal levels. In air conduction testing with earphones, a certain degree of vibratory energy is transferred to the skull. For the supra-aural earphones, the level of this vibratory signal is on average 60 dB below that of the air conduction signal generated, but may be as low as 40 dB below. Thus, when testing the poorer ear in cases of asymmetrical hearing loss, if the side difference is 40 dB or greater there is a risk that the test signal heard is actually a bone-conducted signal picked up by the contralateral ear rather than an air-conducted signal picked up by the poorer test ear.
In order to allow reliable measurements of hearing thresholds for bone conduction in general, as well as for air conduction in the poorer ear when there is a side difference of 40 dB or more in air conduction hearing thresholds, masking of the contralateral non-test ear is necessary. Insert earphones have a much weaker mechanical coupling to the listener's head, and therefore transfer less energy to the skull. When insert earphones are used, the level of the transferred bone conduction component is at least 20 dB lower than that generated by supra-aural earphones.
Masking is performed by the presentation of narrow-band filtered random noise with a centre frequency equal to the test tone frequency being used. Thus, a clinical pure-tone audiometer needs a second channel where random noise is generated, band-pass filtered, attenuated and amplified in order to be presented as a continuous sound through one of the earphones available. The calibration of the attenuator when used for adjusting the masking noise level is in decibel effective masking level.
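The masking rule stated above can be sketched as a small decision function. This is only an illustration under stated assumptions: the function name, its parameters and the per-frequency comparison are hypothetical, and the 40 dB default is the side-difference figure quoted in the text for supra-aural earphones.

```python
def needs_contralateral_masking(test_type, test_ear_ac=None, nontest_ear_ac=None,
                                min_side_difference=40):
    """Return True when the non-test ear should be masked (illustrative rule only).

    test_type: "bone" or "air".
    test_ear_ac, nontest_ear_ac: air-conduction thresholds in dB HL at one frequency.
    min_side_difference: side difference in dB from which masking is required
    (40 dB is the value given in the text; other values are the caller's assumption).
    """
    if test_type == "bone":
        # Bone-conducted signals reach both cochleae at roughly equal levels.
        return True
    if test_type != "air":
        raise ValueError("test_type must be 'air' or 'bone'")
    # Air conduction: mask when the test ear is poorer by the side difference or more.
    return (test_ear_ac - nontest_ear_ac) >= min_side_difference


print(needs_contralateral_masking("air", test_ear_ac=70, nontest_ear_ac=20))
```

A side difference of exactly 40 dB is treated as requiring masking, matching the wording "40 dB or more".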
Test environment Psychoacoustic tests require a concentrated undisturbed listener. This concerns absence of visual and other sensory distractions, acceptable ventilation and sufficiently low ambient sound levels. The requirement on ambient sound levels is of particular importance in threshold audiometry in order to avoid test stimuli of very low sound levels being masked by unwanted sounds in the test room.
Test method The test should always begin with the better ear if the test subject is aware of a side difference. A clear instruction to the subject is an essential part, informing about the test procedure, the listening task, i.e. trying to detect test tones that may be very faint, and the response task, usually pressing a button in response to detected tone. Two alternative methods for the manual determination of hearing thresholds are described.
The ascending method (modified Hughson-Westlake method). After familiarization by presenting a clearly audible test tone, it is based on repeated ascents from inaudible to just audible stimuli in steps of 5 dB. As soon as the listener responds, the level is decreased by 10 dB and a new ascent is started. The hearing threshold level is defined as the stimulus level at which the listener first has given three correct responses after three to five ascending series of stimuli. The first test frequency is 1000 Hz followed by the higher frequencies in rising order and finally the lower frequencies in falling order.
The alternative manual method is the bracketing method, which is based on alternating ascending and descending series. When a response has occurred in an ascending series, the level is increased another 5 dB and a descending series is started in 5 dB steps. When no response occurs, the level is lowered another 5 dB and a new ascending series is presented. This is repeated until three ascending and three descending series have been completed. The hearing threshold level is the average of the three lowest audible levels in the ascending series and the three lowest audible levels in the descending series.
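The ascending procedure can be sketched in code for a simulated listener. This is a simplified illustration, not a clinical implementation: the function name, the `hears` callback, the starting level and the floor and ceiling values are all assumptions, and after each response the sketch keeps dropping in 10 dB steps until the tone is inaudible before it starts the next ascent.

```python
def ascending_threshold(hears, start_level=40, floor=-10, ceiling=120,
                        max_ascents=5, responses_needed=3):
    """Estimate a hearing threshold level (dB HL) with 5 dB up / 10 dB down steps.

    hears: hypothetical callback, hears(level) -> True if the listener responds.
    Returns the level with `responses_needed` responses, or None if not reached.
    """
    level = start_level
    responses = {}  # level -> number of ascents that ended with a response there
    for _ in range(max_ascents):
        # Drop in 10 dB steps until the tone is inaudible (or the floor is reached).
        while level > floor and hears(level):
            level = max(floor, level - 10)
        # Ascend in 5 dB steps until the listener responds.
        while level < ceiling and not hears(level):
            level += 5
        if not hears(level):
            return None  # no response even at the ceiling
        responses[level] = responses.get(level, 0) + 1
        if responses[level] == responses_needed:
            return level
    return None


# Idealized listener who detects every tone at or above 35 dB HL.
print(ascending_threshold(lambda level: level >= 35))
```

A real listener responds probabilistically near threshold, which is why the procedure allows up to five ascents to collect three responses at the same level.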
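The averaging step of the bracketing method can be written out as follows. The function name and the example levels are hypothetical; the sketch assumes the lowest audible level of each of the three ascending and three descending series has already been noted by the tester.

```python
def bracketing_threshold(ascending_levels, descending_levels):
    """Average the lowest audible levels (dB HL) from three ascending and
    three descending series, as described for the bracketing method."""
    if len(ascending_levels) != 3 or len(descending_levels) != 3:
        raise ValueError("expected three ascending and three descending levels")
    levels = list(ascending_levels) + list(descending_levels)
    return sum(levels) / len(levels)


# Illustrative values only.
print(bracketing_threshold([35, 35, 40], [30, 35, 35]))
```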
When masking is needed, narrowband masking noise should be presented to the non-test ear. This procedure is called the plateau-seeking method. The masking noise is presented continuously to the contralateral ear and is introduced first at the level where it is just audible. It is increased in steps of 5 dB as long as each increase in masker level requires an increase in test tone level in order for the test tone to remain audible. When the masking level can be increased by three 5 dB steps or more without affecting the audibility of the test tone, this constitutes the masking plateau.
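The plateau criterion can be sketched as a search over recorded data. The assumptions here are that the tester has noted the just-audible test tone level after each 5 dB increase of the masker, in order, and that the function name and data layout are illustrative only.

```python
def find_plateau(tone_levels, steps_required=3):
    """Return the test tone level (dB HL) at the masking plateau, or None.

    tone_levels: just-audible tone level recorded at each successive 5 dB
    masker step. The plateau is reached when the masker has been raised
    `steps_required` times in a row without the tone level changing.
    """
    unchanged_steps = 0
    for previous, current in zip(tone_levels, tone_levels[1:]):
        if current == previous:
            unchanged_steps += 1
            if unchanged_steps >= steps_required:
                return current
        else:
            unchanged_steps = 0
    return None


# Tone level rises with the masker at first, then stays at 55 dB HL.
print(find_plateau([40, 45, 50, 55, 55, 55, 55, 60]))
```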
Screening audiometry Screening audiometry is a simplified procedure which aims at identifying those listeners whose hearing threshold levels exceed a certain level, the screening level, without spending time on those whose hearing threshold levels are better than this level. Thus, it allows a possibility of saving test time, which is meaningful provided that no valuable information is lost. The normal prerequisites for choosing a screening procedure and a specific screening level are that the subjects identified by the procedure can be offered a meaningful intervention and/or that hearing threshold levels better than the screening level represent variations within the normal limits.
Screening audiometry is often performed when testing school children and a common screening level is then 20 dB HL. This level represents the statistical borderline for normal hearing threshold levels, corresponding approximately to the average plus two standard deviations.
Sources of error PHYSIOLOGICAL SOURCES The auditory pathways always have a certain spontaneous activity, also in the complete absence of any audible sound. This activity can be considered as a physiological noise. When a test signal is presented at a barely audible sound level, it will introduce a certain regularity in the otherwise irregular spontaneous activity in the auditory pathways. Because of this irregularity there will be a certain degree of randomness with regard to the probability of the test tone being audible or not.
PSYCHOLOGICAL SOURCES Each test is to start with the tester instructing the test subject about the procedure and the subject's role in the process. The test subject's concentration is probably the dominating source of error. Even in a relatively limited test session involving only air-conduction testing, the goal of keeping a high concentration level on the listening task without thinking of anything else is virtually impossible to reach. When the test involves not only air-conduction but also bone-conduction and masking, fatigue will of course affect the listener's ability to concentrate and cooperate. The test environment should be designed for optimum concentration.
METHODOLOGICAL SOURCES The instruction of the test subject is an important part of any psychophysical test. The various parameters of the actual test method, i.e. the presentation of the test tones, their duration, intervals between successive test tones and order of varying the test tone level, are other factors that may affect the test result. An important part of the standardized method is also the definition of hearing threshold level and how it is determined according to the subject's response pattern relative to the presented stimuli.
PHYSICAL/ACOUSTICAL SOURCES Supra-aural earphones are supposed to provide a close fit to the outer ear without leakage, but exceptions often occur in practice. A leakage usually affects the lowest test frequencies, making the actual sound levels generated at the ear drum lower than they should be without leakage. At the highest test frequencies the wavelength of the sound is of the same order of magnitude as the dimensions of the enclosed cavity underneath the earphone. Even relatively small changes in position of the earphone in relation to the ear canal opening may affect the sound pressure at the ear drum.
The placement of the bone vibrator on the mastoid process behind the pinna has to take place without the aid of any clear anatomical landmark. The use of insert earphones makes the placement somewhat easier. The only important factor to control here is to insert the foam ear tip by its full length into the ear canal. Ambient sound levels constitute another potential source of error. Correct calibration of the test equipment is obviously important. The equipment may change slowly over time due to ageing in electronic components or it may change suddenly.
Clinical applications An air-conduction pure-tone audiogram is the basic test measure which is used to express the degree of hearing loss. It is an important basis for assessing the needs for intervention, including the fitting of hearing aids. A complete audiogram, i.e. using both air- and bone-conduction, provides important information for the topical diagnosis in terms of differentiation between conductive and sensorineural disorders. Equal loss for air- and bone-conduction indicates a sensorineural lesion whereas a larger loss by air-conduction than by bone-conduction, an air-bone gap, indicates the presence of a conductive lesion. The air-bone gap at a single frequency needs to be at least 15 dB in order to be considered statistically significant.
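The air-bone gap comparison can be illustrated with a short sketch. The function name, the dictionary layout and the example thresholds are hypothetical; only the 15 dB significance criterion comes from the text.

```python
def air_bone_gaps(air, bone, min_gap=15):
    """Compute the air-bone gap per frequency.

    air, bone: dicts mapping frequency in Hz to threshold in dB HL.
    Returns {frequency: (gap_dB, is_significant)} for frequencies present
    in both audiograms; a gap of at least `min_gap` dB counts as significant.
    """
    result = {}
    for freq in sorted(set(air) & set(bone)):
        gap = air[freq] - bone[freq]
        result[freq] = (gap, gap >= min_gap)
    return result


# Illustrative audiogram: a significant gap at 500 Hz only.
air = {500: 50, 1000: 45, 2000: 40}
bone = {500: 10, 1000: 35, 2000: 40}
print(air_bone_gaps(air, bone))
```

A significant gap points to a conductive lesion at that frequency, while equal air- and bone-conduction losses point to a sensorineural lesion.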
Special tests with pure tones In clinical applications related to hearing aid fitting, use is being made of test methods describing the loudness function of the ear to be fitted as a basis for compression of input sound levels to be provided by the hearing aid for the ear with reduced dynamic range. A simple way to provide additional information is to measure the uncomfortable loudness level (UCL), as a measure of the upper limit of the dynamic range. Stimuli may be either pure tones or narrow bands of noise, presented as pure tone or noise pulses at increasing levels until the subject indicates the highest acceptable loudness.
SPEECH AUDIOMETRY The measurement of speech recognition ability is an important functional complement to pure tone audiometry. Speech audiometry is a more complex procedure. In pure tone audiometry, the listening task is simply detection of the stimuli. However, in speech audiometry the usual task is not detection but recognition, which requires both detection and identification of the phonemes and recognition of sets of phonemes as words. Thus, depending on the choice of speech test material and the listener's task, the test result reflects not only the auditory function, but is also affected by cognitive and linguistic functions.
Equipment Equipment for speech audiometry is usually integrated with a clinical pure-tone audiometer. Recordings should preferably be digital. The digital technology provides the significant advantage of constant quality of the recording, independent of how many times it has been used. Live voice should be avoided where possible. In conventional clinical speech audiometry, the test signal is presented monaurally by means of earphones. Contralateral masking may be needed to prevent the test signal from being heard by the better ear when testing the poorer ear in cases with large asymmetries.
Speech audiometry is often performed in sound field where the speech signal is presented by means of a loudspeaker. This test situation is much more difficult to standardize than when earphones are used. The characteristics of the test room are of great importance. A free sound field can be achieved only in an anechoic room, where no reflective surfaces are present and only the direct sound from the loudspeaker will reach the listener.
In sound field audiometry, the loudspeaker should be placed in front of the listener at a distance of at least 1 meter and at the same height as the head of the listener. If the speech signal is to be presented with a background of competing noise, the noise may be presented either through the same loudspeaker as the speech or through a pair of loudspeakers located on either side of the frontal loudspeaker, at a recommended angle of ±45°. When the competing noise is presented through the two noise loudspeakers, the result of the speech recognition test is affected by the listener's ability to take advantage of binaural hearing.
Speech level Hearing level for speech, expressed as dB HL, is measured as the speech level relative to the reference level for speech. Reference level for speech is defined as the median value of the speech recognition threshold levels of a sufficiently large number of otologically normal persons, of both sexes, aged between 18 and 25 years inclusive and for whom the test material is appropriate.
Speech material The speech test material may range widely, from nonsense combinations of consonants and vowels, so-called logatoms, to natural connected speech. The simplest test items such as logatoms or monosyllabic words primarily reflect the function of the peripheral part of the auditory system. Bisyllabic words provide considerable redundancy (repetition) - only parts of the complete word may be sufficient for the listener to recognize in order to be able to guess correctly by using his linguistic knowledge.
Sentences may be designed to offer low or high redundancy. When high-redundancy sentences and connected speech are used as test material, the listener's cognitive and linguistic functions will have significant effects on the test results in addition to the auditory function. Logatoms and single words provide the best test reliability at the expense of validity. Sentences and connected speech provide increased validity but at the expense of reliability.
Speech in background noise In order to simulate real life situations, speech audiometry is often performed with the speech signal presented against a background noise. The degree of masking of the speech signal by the noise depends on the sound pressure level of the noise in relation to the sound pressure level of the speech signal. This relation will affect to what extent the speech signal will be audible above the noise. The difference between speech level and the sound pressure level of the noise is called the signal-to-noise ratio or the speech-to-noise (S/N) ratio, expressed in dB.
Test method The speech recognition threshold level is defined as the lowest speech level at which the speech recognition score is equal to 50 percent for a given test subject, a specified speech signal and a specified manner of signal presentation. The instruction of the test subject before commencing the actual measurement is essential. The tester is to inform about which ear is to be tested when using earphones, what type of test items will be presented, and the listener's response task, which usually is to repeat orally what he thought he heard. The recommended procedure that is most commonly used starts with familiarizing the listener by presenting the first test item at a clearly audible level.
A suitable starting level is a hearing level for speech that is 20-30 dB above the average pure tone hearing thresholds at 500, 1000 and 2000 Hz. Then the speech level is reduced in steps of 5 dB, presenting two test items on each level, until the listener no longer responds correctly to all test items. At this level, a set of test items, consisting of at least ten items, is now presented. If more than 50 percent of the items are correctly recognized, the level is reduced by 5 dB and another set of test items is presented. This descending procedure is repeated until the score at a certain level is below 50 percent. The speech recognition threshold level is the integer value of the level corresponding to 50 percent correct.
When speech recognition in a background of competing noise is to be determined, this can be carried out according to two alternative methods. 1. The speech signal and the noise are presented at fixed sound levels and the speech recognition score under these conditions is determined. 2. The speech signal is presented at a fixed level and the noise level is varied in order to determine the S/N at which the listener's speech recognition score reaches a certain value, usually 50 percent correctly recognized test items.
Sources of error An important factor in speech audiometry is the effect of the listener's linguistic skills. The test lists should present words that are included in the listener's vocabulary. This is important when testing children with a constantly expanding vocabulary. The language used for testing should be the listener's native language. Especially in more difficult listening situations, the difference in performance between native and non-native listeners becomes a significant factor.
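The last step, reading off the level that corresponds to 50 percent correct, can be sketched as follows. The text does not say how that level is obtained from the measured scores, so the linear interpolation between the two bracketing levels is an assumption, as are the function name and the example scores.

```python
def speech_recognition_threshold(scores):
    """Estimate the speech recognition threshold level in dB HL.

    scores: list of (level_dB_HL, percent_correct) pairs in descending
    level order, e.g. one pair per set of test items.
    Assumption: the 50 percent point is found by linear interpolation
    between the last level scoring >= 50 and the first level scoring < 50,
    then rounded to an integer.
    """
    for (hi_level, hi_score), (lo_level, lo_score) in zip(scores, scores[1:]):
        if hi_score >= 50 > lo_score:
            fraction = (hi_score - 50) / (hi_score - lo_score)
            return round(hi_level - fraction * (hi_level - lo_level))
    return None  # the scores never crossed 50 percent


# Illustrative scores from three sets of test items.
print(speech_recognition_threshold([(40, 90), (35, 60), (30, 10)]))
```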
Cognitive functions constitute another area that influences speech recognition ability, in particular speech recognition in noise. It is well known that cognitive functions on average deteriorate progressively above approximately 70-75 years of age. Thus, not only reduced auditory acuity, but also cognitive decline, may affect speech recognition performance among elderly listeners.
CLINICAL APPLICATIONS OF SPEECH AUDIOMETRY In general, the speech recognition threshold (SRT) is expected to agree within 10 dB with the pure tone average (PTA), i.e. the average of the pure tone hearing threshold levels at 500, 1000 and 2000 Hz. If the SRT is significantly poorer than the PTA, this may indicate a retrocochlear or central lesion. If, on the other hand, SRT is significantly better than the PTA, this might indicate nonorganic hearing loss.
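The SRT and PTA cross-check can be sketched as a small comparison. The function names and return labels are hypothetical, and the sketch treats any difference above the 10 dB tolerance as "significant", which is a simplifying assumption.

```python
def pure_tone_average(thresholds):
    """Average of the hearing threshold levels (dB HL) at 500, 1000 and 2000 Hz."""
    return (thresholds[500] + thresholds[1000] + thresholds[2000]) / 3


def compare_srt_with_pta(srt, thresholds, tolerance=10):
    """Compare the SRT with the PTA; higher dB HL values mean poorer hearing."""
    difference = srt - pure_tone_average(thresholds)
    if abs(difference) <= tolerance:
        return "consistent"
    if difference > 0:
        return "srt_poorer"   # may indicate a retrocochlear or central lesion
    return "srt_better"       # might indicate nonorganic hearing loss


# Illustrative thresholds with a PTA of 40 dB HL.
thresholds = {500: 30, 1000: 40, 2000: 50}
print(compare_srt_with_pta(45, thresholds))
```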
For listeners with normal or only mild cochlear hearing loss, the maximum speech recognition score should reach 100 percent or close. This also holds for pure conductive hearing loss, although obviously the speech level has to be raised above normal in order to reach this maximum. For more pronounced cochlear lesions, maximum speech recognition score may be significantly below 100 percent. This is likely to be due to both various kinds of distortion caused by the cochlear lesion, as well as the typical large difference in sensitivity between low and high frequency ranges.
Retrocochlear lesions often show significantly poorer performance than expected according to their pure tone audiogram. Patients with central auditory lesions are likely to perform relatively normally in conventional speech audiometry in quiet but below normal on tests using dichotic speech or distorted speech.
Otoacoustic emissions Otoacoustic emissions (OAE) are acoustic signals emitted from the cochlea to the middle ear and into the external ear canal where they are recorded. They are generated by active mechanical contraction of the outer hair cells, spontaneously or in response to sound. There are four types of OAEs: spontaneous OAEs (SOAE), transient evoked OAEs (TEOAE), distortion product OAEs (DPOAE), and stimulus frequency OAEs (SFOAE).
All four types of OAEs are recorded with a sensitive, low noise microphone that is placed in the sealed external ear canal. When OAEs are evoked, the sealed probe includes a tube for sound delivery to the ear canal, in addition to the recording microphone. The microphone records all sounds in the ear canal, and these include, in addition to OAEs, the sound evoking the OAEs when TEOAEs or DPOAEs are recorded, as well as other patient-generated and ambient sounds.
TRANSIENT-EVOKED EMISSIONS The delay between stimulus offset and onset of the evoked emissions varies between 4 ms, for high frequencies, and 20 ms for low frequencies. This temporal separation helps in visual identification and separation of the transient-evoked emissions from the stimulus that evoked them, that is also recorded. Thus, TEOAEs are typically presented as an amplitude/time plot of the acoustic waveform recorded from the ear canal. TEOAEs greater than 20 dB sound pressure level (SPL) can be recorded from newborns, while responses from children and adults range between 10 and 15 dB SPL. The most effective stimulus to evoke TEOAEs is a tone burst in the mid-frequencies.
DISTORTION PRODUCT EMISSIONS DPOAEs are generated in the cochlea in response to two simultaneous pure-tone stimuli (primary tones). This tonal response is not present in the eliciting stimuli, and is therefore referred to as a distortion. Because DPOAEs are separated in frequency from the eliciting stimuli, they can be recorded in the presence of the stimulating tones and separated from them by spectral analysis. Their magnitude is very small (5-15 dB SPL), approximately 60-70 dB below the level of the stimuli used to evoke them. DPOAEs are attributed to nonlinearity of motion of the outer hair cells, particularly at low stimulus levels.
Neurogenic evoked potentials Electric signals generated by the auditory pathway from the cochlea to the cerebral cortex are the most reliable and widely used physiological estimators of auditory sensitivity. Such recordings belong to a class of electrophysiology called evoked potentials, or event-related potentials, defined as changes in voltage that occur at a particular time before, during or after a change in the physical world and/or some psychological process that gave rise to these voltage changes. When evoked by a stimulus in the physical world outside the brain, they are called exogenous, whereas if evoked by a psychological, cognitive process within the brain they are termed endogenous.
COCHLEAR POTENTIALS Cochlear potentials that can be recorded include the cochlear microphonic potentials, the summating potential and the compound action potential of the auditory nerve. These potentials are most readily recorded in the electrocochleogram (ECog), which can be recorded from an electrode that is inserted as close as possible to the cochlea. In electrocochleography, a needle electrode is inserted through the tympanic membrane to rest on the promontory. In case of a perforated eardrum, a ball-tipped electrode is inserted through the perforation to rest on the round window. Transtympanically recorded cochlear potentials are 20 times larger than those recorded noninvasively from the ear canal, and 10 times larger than those recorded noninvasively from an electrode resting on the tympanic membrane.
Cochlear potentials can be recorded noninvasively from the ear canal using an electrode resting on the tympanic membrane. The compound action potential is recorded as a major negative peak of a few µV, called N1 or action potential (AP), at approximately 1.5 ms after stimulus onset at the eardrum, followed by a minor negativity called N2, at approximately 2.5 ms. The summating potential (SP) precedes N1 as a negative step-like deflection from baseline. The ECog is affected by auditory sensitivity in the range of 1000 to 4000 Hz and is independent of the subject's state of arousal or the effects of drugs.
AUDITORY NERVE AND BRAINSTEM POTENTIALS Auditory nerve and brainstem evoked potentials (ABEP) are optimally recorded from the scalp by disc or cup electrodes in response to high intensity clicks presented at a rate of approximately ten per second. The normal waveform includes a series of five to seven voltage oscillations, approximately 1 ms apart during the first 6-10 ms after stimulus onset. The first peak in the sequence, peak I, is the only one to survive section of the auditory nerve central to the internal auditory canal, placing its generator in the cochlea. It is synchronous with N1 of the ECog.
The second peak, II, is synchronous with proximal auditory nerve activity, with overlapping activity from the auditory nerve and from the cochlear nucleus. Peak II is generated in the vicinity of the auditory nerve's entry into the brainstem. Peak III generators span the lower brainstem between the cochlear nucleus, through the trapezoid body to the superior olivary complex. For practical clinical purposes the generators may be attributed to the lower pons.
The fourth component is not always identified. Peak IV is usually partially merged with V, creating a bifid IV-V complex. Generators of this complex are the upper pons, between the superior olivary complex, through the lateral lemniscus, with possible contribution from the inferior colliculus. For clinical purposes, the IV-V complex can be attributed to the upper pons and its junction with the midbrain.
Audiometric ABEP measures include the lowest stimulus intensity at which a response is detected (detection threshold). Peak latencies measure the time lapse between stimulus onset and the time of highest synchronous activity. Multiple factors that contribute to peak latency have led to the definition of interpeak latency difference measures. The most widely used interpeak latency differences are V-I, between cochlea and ponto-midbrain junction; III-I, between cochlea and ponto-medullary junction; and V-III, along the pons.
A significant interaural difference in respective measures, in response to left and right ear stimulation, may be indicative of a unilateral functional abnormality. ABEPs are affected by a variety of nonpathologic factors which include the subject's age, body temperature and gender, as well as stimulus factors such as frequency composition, intensity, presentation rate and envelope. ABEP peak latencies shorten with increasing stimulus intensity. During childhood, the peak amplitudes are typically larger than in adults, particularly component I which is often larger than V.
TYMPANOMETRIC ACOUSTIC REFLEX Tympanometry is the continuous measurement of middle ear impedance as air pressure in the sealed external ear canal is varied. The measurement device delivers a tone toward the tympanic membrane, and the impedance or admittance of the middle ear is quantified based on the intensity and other properties of the tone in the ear canal. The graph describing the mechanical properties of the middle ear as a function of pressure in the external ear canal is the tympanogram.
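The interpeak latency differences are simple subtractions of peak latencies, as the following sketch shows. The function name and dictionary keys are hypothetical, and the example latencies are illustrative values rather than normative data.

```python
def interpeak_latencies(peak_latencies_ms):
    """Compute the V-I, III-I and V-III interpeak latency differences in ms.

    peak_latencies_ms: dict with the latencies of peaks "I", "III" and "V",
    each measured from stimulus onset.
    """
    i = peak_latencies_ms["I"]
    iii = peak_latencies_ms["III"]
    v = peak_latencies_ms["V"]
    return {"V-I": v - i, "III-I": iii - i, "V-III": v - iii}


# Illustrative latencies only.
print(interpeak_latencies({"I": 1.5, "III": 3.5, "V": 5.5}))
```

Because each difference subtracts a shared starting point, factors that delay all peaks equally cancel out of the interpeak measures.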
Tympanograms typically show compliance (the inverse of stiffness) as the measured aspect of impedance, and there are three main types: Type A, which shows a clear peak of compliance between 0 and -100 mm of water, associated with normal function; Type B, where no peak in the compliance is noted, typically associated with fluid in the middle ear cavity; and Type C, resembling type A but peaking at a pressure more negative than -100 mm of water, most commonly found in patients with inadequate ventilation of the middle ear, such as with Eustachian tube dysfunction.
A variant of type A (As) with abnormally low compliance (unusually high impedance) is found in patients with fixation of the ossicular chain, such as in otosclerosis. Conversely, an abnormally high compliance (unusually low impedance) type A tympanogram (Ad) is typical in patients with disruption of the ossicular chain, rendering the ear drum abnormally mobile and compliant. When the tympanic membrane is ruptured or perforated, tympanometry cannot be conducted and the ear canal volume indicated by the tympanometer, once the seal of the canal has been verified, is larger than 2.5 cc.
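The tympanogram types can be summarized in a small classifier. This sketch covers only the pressure-based types A, B and C and the ear canal volume check; the As and Ad variants are left out because the text gives no numeric compliance limits for them. The function name, parameters and return labels are hypothetical.

```python
def classify_tympanogram(peak_pressure_mm_h2o, ear_canal_volume_cc=None):
    """Classify a tympanogram from its compliance peak (illustrative only).

    peak_pressure_mm_h2o: pressure of the compliance peak in mm of water,
    or None when no peak is present.
    ear_canal_volume_cc: measured ear canal volume, if available.
    """
    if ear_canal_volume_cc is not None and ear_canal_volume_cc > 2.5:
        return "possible perforation"  # volume above 2.5 cc with a verified seal
    if peak_pressure_mm_h2o is None:
        return "B"  # no compliance peak
    if -100 <= peak_pressure_mm_h2o <= 0:
        return "A"  # peak between 0 and -100 mm of water
    if peak_pressure_mm_h2o < -100:
        return "C"  # peak more negative than -100 mm of water
    return "unclassified"  # positive peak pressure is not covered by the text


print(classify_tympanogram(-20))
```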
The acoustic impedance of the ear changes when the muscles of the middle ear contract. The acoustic reflex is the contraction of the middle ear stapedius muscle, attached to the posterior part of the stapes, in response to medium to high intensity sounds. The reflex arc includes the auditory nerve, brainstem neurons connecting the cochlear nucleus ipsilateral to the stimulated ear with bilateral neurons in the motor nuclei of the facial nerve.