Sensation and Perception Chapter 10


nerve deafness

Damage to the cochlea or the path to the cortex: 1. Cilia or Hair Cells 2. Basilar Membrane 3. Auditory Nerve 4. Olive 5. Auditory Tract 6. Inferior Colliculus 7. MGN of Thalamus 8. Auditory Projections

causes of type II nerve deafness

Degenerative nerve disease, congenital disorder, infection, stroke, trauma. Treatment: No cure at present. Stem cell research is designed to allow new nerve growth in damaged areas. It has already worked with Parkinson's and Tourette's and should work for vision, audition, Alzheimer's, epilepsy, stroke, etc.

auditory scene analysis

Separation of sounds into distinct streams - how is it that complex sounds are grouped appropriately? Common Time Course or Sequential Integration - sounds are grouped together according to which ones vary together over time; eventually distinct sources of sound will produce distinct sounds. Sound Location or Simultaneous Integration - sounds which emanate from the same source. Spectral Harmonics - harmonic frequencies are multiples of the fundamental frequency. Familiar sounds - groups of sounds which have been perceived together in the past are more easily grouped together in the present. Visual (or other perceptual) input can help to sort out which sounds belong with which source.
How do listeners know how far a sound is? -Simplest cue: Relative intensity of sound -Inverse-square law: As distance from a source increases, intensity decreases such that the decrease in intensity is proportional to the distance squared -Spectral composition of sounds: Higher frequencies decrease in energy more than lower frequencies as sound waves travel from the source to the ear -Relative amounts of direct vs. reverberant energy.
The acoustic environment can be a busy place. In most natural situations, the sound source that one is listening to is not the only source present. Consider, for example, conversing with a friend at a party where many other people are talking, music is playing, chips are being munched, the door is being opened and closed, and so on. Now consider simpler environments, such as those you choose for studying. To read this chapter, you probably chose the spot where you are right now because it was relatively quiet, but stop and listen carefully. Is a heater, air conditioner, computer, refrigerator, or some combination of these devices humming in the background? Environments with multiple sound sources are the rule, not the exception. Note that the visual system also has to contend with a busy world, but eyes can be directed to any part of the scene that is of interest to the visual system. Moreover, the rods and cones on the right side of the retina always see the objects on the left, politely leaving the receptors on the other side of the retina to collect data about objects on the right. For an auditory scene, the situation is greatly complicated by the fact that all the sound waves from all the sound sources in the environment are summed together in a single complex sound wave. You can move your ears around all you want, but everyone's voice at the party still has to be picked up by the same two sets of cochlear hair cells. Separating different sounds from one another is like living in a world in which everything is made of glass: it is difficult to distinguish separate objects, because they all merge into a single combination of shapes. Somehow, however, the auditory system contends quite well with the situation: our perception is typically of a world with easily separable sounds. We can understand the conversation of a dance partner at a party, and we can pick out a favorite instrument in the band. This distinction of auditory events or objects in the broader auditory environment is commonly referred to as source segregation or auditory scene analysis.

summary

1. Listeners use small differences, in time and intensity, across the two ears to learn the direction in the horizontal plane (azimuth) from which a sound comes. 2. Time and intensity differences across the two ears are not sufficient to fully indicate the location from which a sound comes. In particular, they are not enough to indicate whether sounds come from the front or the back, or from higher or lower (elevation). 3. The pinna, ear canal, head, and torso alter the intensities of different frequencies for sounds coming from different places in space, and listeners use these changes in intensity across frequency to identify the location from which a sound comes. 4. Perception of auditory distance is similar to perception of visual depth because no single characteristic of the signal can inform a listener about how distant a sound source is. Listeners must combine intensity, spectral composition, and relative amounts of direct and reflected energy of sounds to estimate distance to a sound source. 5. Many natural sounds, including music and human speech, have rich harmonic structure with energy at integer multiples of the fundamental frequency, and listeners are especially good at perceiving the pitch of harmonic sounds. 6. Important perceptual qualities of complex sounds are timbre (conveyed by the relative amounts of energy at different frequencies) and the onset and offset properties of attack and decay, respectively. 7. Because all the sounds in the environment are summed into a single waveform that reaches each ear, a major challenge for hearing is to separate sound sources in the combined signal. This general process is known as auditory scene analysis. Sound source segregation succeeds by using multiple characteristics of sounds, including spatial location, similarity in frequency and timbre, onset properties, and familiarity. 8. In everyday environments, sounds to which a person is listening often are interrupted by other, louder sounds. Perceptual restoration is a process by which missing or degraded acoustic signals are perceptually replaced. 9. Auditory attention has many aspects in common with visual attention. It is a balance between being able to make use of sounds one needs to hear in the midst of competing sounds, and being on alert for new auditory information.

sensory substitution

ASL (American Sign Language); Closed Captioning Foundation

auditory distance perception

Although it is important to know what direction a sound is coming from, none of the cues we've discussed so far (ITD, ILD, DTF) provide much information concerning the distance between a listener and a sound source that is much more than an arm's length away. The simplest cue for judging the distance of a sound source is the relative intensity of the sound. Because sounds become less intense with greater distance, listeners have little difficulty perceiving the relative distances of two identical sound sources. Unfortunately, this cue suffers from the same problem as relative size in depth perception. Interpreting the cue requires one to make assumptions about the sound sources that may turn out to be false (e.g., the softer sounding frog might be very close, with its croaks muffled by surrounding vegetation). The effectiveness of relative intensity decreases quickly as distance increases, because sound intensity decreases according to the inverse-square law. When sound sources are close to the listener, a small difference in distance can produce a relatively large intensity difference. - For example, a sound that is 1 meter away is more intense by 6 decibels (dB) than a sound that is 2 meters away. But the same 1-meter difference between sound sources 39 and 40 meters away produces an intensity change of only a fraction of 1 dB. -As one might expect from these facts, listeners are fairly good at using intensity differences to determine distance when sounds are presented within 1 meter of the head (Brungart, Durlach, and Rabinowitz, 1999), but listeners tend to consistently underestimate the distance to sound sources farther away, and the amount of underestimation is larger for greater distances. Intensity works best as a distance cue when the sound source or the listener is moving. If a croaking frog starts hopping toward you, you will know it because its croaks will become louder and louder. Listeners also get some information about how far away a source is when they move through the environment. This is because, in a manner akin to motion parallax in the perception of visual depth, sounds that are farther away do not seem to change direction in relation to the listener as much as nearer sounds do. Another possible cue for auditory distance is the spectral composition of sounds. The sound-absorbing qualities of air dampen high frequencies more than low frequencies, so when sound sources are far away, higher frequencies decrease in energy more than lower frequencies as the sound waves travel from the source to the ear. Thus, the farther away a sound source is, the "muddier" it sounds. This change in spectral composition is noticeable only for fairly large distances, greater than 1000 meters. You experience the change in spectral composition when you hear thunder from near your window or from far away. Note that this auditory cue is analogous to the visual depth cue of aerial perspective. A final distance cue stems from the fact that, in most environments, the sound that arrives at the ear is some combination of direct energy (which arrives directly from the source) and reverberant energy (which has bounced off surfaces in the environment). -The relative amounts of direct versus reverberant energy inform the listener about distance because when a sound source is close to a listener, most of the energy reaching the ear is direct, whereas reverberant energy provides a greater proportion of the total when the sound source is farther away.
Suppose you're attending a concert. The intensities of the musician's song and your neighbor's whispered comments might be identical, but the singer's voice will take time to bounce off the concert hall's walls before reaching your ear, whereas you will hear only the direct energy from your neighbor's whispers. As it happens, reverberations appear to be important for judging the loudness of sounds. You do not judge that a coyote is howling softly just because it is far away, so how do you estimate how loud the howls really are? Recall that the sound to which you are listening rapidly decreases in energy, following the inverse-square law. However, reverberations do not fall off so quickly, because the surfaces that sounds bounce off do not move when the sound source becomes closer or farther away.
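As a quick numerical check of the inverse-square law figures above, here is a minimal Python sketch (illustrative only; the function name and distances are mine, not from the text) that converts the intensity ratio between two listening distances into decibels.

import math

def level_change_db(d_near, d_far):
    # Inverse-square law: intensity falls off as 1/distance^2, so the level
    # change in dB between two distances is 10 * log10((d_near / d_far) ** 2).
    intensity_ratio = (d_near / d_far) ** 2
    return 10 * math.log10(intensity_ratio)

print(round(level_change_db(1, 2), 2))    # about -6.02 dB: 1 m vs. 2 m
print(round(level_change_db(39, 40), 2))  # about -0.22 dB: 39 m vs. 40 m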

attack and decay

Another very important quality of a complex sound is the way it begins (the attack of the sound) and ends (the sound's decay) (Figure 10.17). Our auditory systems are very sensitive to attack and decay characteristics. For example, important contrasts between speech sounds in the words bill and will or the words chip and ship relate to differences in how quickly sound energy increases at the onset (the rate of attack). The same musical instrument can have quite different attacks, from the rapid onset of a plucked violin string to the gradual onset of a bowed string (Figure 10.17a and b). How quickly a sound decays depends on how long it takes for the vibrating object creating the sound (e.g., the violin string) to dissipate energy and stop moving. One of the more challenging aspects of designing music synthesizers that could mimic real musical instruments was learning how to mimic the attacks and decays of the instruments.
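As a rough illustration of attack and decay (not from the text; all parameter values here are assumed), the NumPy sketch below shapes the same tone with either a fast, pluck-like attack or a gradual, bow-like attack.

import numpy as np

def enveloped_tone(freq, attack_s, decay_s, sustain_s=0.3, sr=44100):
    # Sine tone shaped by a linear attack, a brief steady portion,
    # and an exponential decay.
    n_attack, n_sustain, n_decay = (int(sr * s) for s in (attack_s, sustain_s, decay_s))
    env = np.concatenate([
        np.linspace(0.0, 1.0, n_attack, endpoint=False),  # attack: amplitude ramps up
        np.ones(n_sustain),                               # steady portion
        np.exp(-5.0 * np.linspace(0.0, 1.0, n_decay)),    # decay: amplitude dies away
    ])
    t = np.arange(env.size) / sr
    return env * np.sin(2 * np.pi * freq * t)

plucked = enveloped_tone(262, attack_s=0.005, decay_s=0.5)  # rapid onset, like a plucked string
bowed = enveloped_tone(262, attack_s=0.150, decay_s=0.3)    # gradual onset, like a bowed string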

continuity and restoration effects

As already discussed, the sound we're trying to listen to at any given time is usually not the only sound in the environment. In addition to dealing with overlapping auditory streams, we also often have to deal with the total masking of one sound source by another for brief periods. Suppose you're listening on your cell phone as a friend gives you directions to the restaurant where you're to meet for lunch. A car may honk, a baby may cry, or your cell phone may produce a short burst of static, but if you're paying attention and the interruption is not too long, you will probably be able to "hear through" the interruption. This effect is consistent with the Gestalt principle of good continuation: the continuous auditory stream is heard to continue behind the masking sound. Auditory researchers have labeled these phenomena "continuity effects" or "perceptual restoration effects"—the latter label arising because the auditory system appears to restore the portion of the continuous stream that was blocked out by the interrupting sound (R. M. Warren, 1984). In this sense, auditory restoration is analogous to the visual system's filling in the portions of a background object that is sitting behind an occluding object. Continuity effects have been demonstrated in the laboratory with a wide variety of target sounds and interrupting sounds. The simplest version of such an experiment is to delete portions of a pure tone and replace them with noise. The tone will sound continuous if the noise is intense enough to have masked the tone, had it been present. Kluender and Jenison (1992) used signal detection methodology with a slightly more complex version of the continuity effect (Figure 10.24b). In their experiments, listeners heard tone glides, in which a sine wave tone varies continuously in frequency over time (the resulting sound is similar to that of a slide whistle). When intense noise is superimposed over part of the glide, listeners report hearing the glide continue behind the noise. Kluender and Jenison created stimuli in which the middle portion of the glide either was present with the noise or was completely removed. In trials in which the noise was shortest and most intense, the signal detection measure of discriminability (d′) dropped to 0. For these trials, perceptual restoration was complete: listeners had no idea whether or not the glide was actually present with the noise. The compelling nature of perceptual restoration suggests that at some point the restored missing sounds are encoded in the brain as if they were actually present in the signal. Imaging studies of humans, who can report when they do and do not hear the tone through the noise, show metabolic activity in A1 that is consistent with what listeners report hearing, whether or not the tone is present (Riecke et al., 2007, 2009). Macaque monkeys also hear tones being restored even when interrupted by noise (Petkov, O'Connor, and Sutter, 2003), and neurons in A1 of monkeys show the same responses to real and restored tones (Petkov, O'Connor, and Sutter, 2007). These data from the auditory cortex cannot tell us whether the glides were restored in the cortex or at a point earlier in auditory processing. However, they make it easier to understand why perceptually restored sounds really sound present.
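For readers unfamiliar with the d′ measure mentioned above, the short Python sketch below shows the standard signal-detection computation; the hit and false-alarm rates are hypothetical numbers, not Kluender and Jenison's data. A d′ of 0 means the listener cannot tell glide-present trials from glide-absent trials.

from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate):
    # d' = z(hit rate) - z(false-alarm rate)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)

print(round(d_prime(0.95, 0.10), 2))  # easy discrimination: d' around 2.9
print(round(d_prime(0.60, 0.60), 2))  # complete perceptual restoration: d' = 0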

Auditory Pathways

Auditory Nerve - Axons from hair cells. Cochlear Nucleus - Sends information from the auditory nerve to the Superior Olive and to the Inferior Colliculus. Superior Olive - Analogous to the Optic Chiasm; information from both ears crosses over to be sent to both hemispheres. Inferior Colliculus - Analogous to the Superior Colliculus for vision; orienting and reflexive localization -- recent studies show multimodal neurons in the colliculus which share visual and auditory information for orientation movements. Medial Geniculate Nucleus (MGN) - Relays information from the Inferior Colliculus to A1.

auditory attention

Auditory attention can also differ from visual attention in ways that reflect the differences between senses. You just learned about how the auditory system is exquisitely fast and sensitive. Earlier, we pointed out that hearing works great at a distance, in the dark, and around obstacles. These facts make hearing our primary sense for being vigilant in our surroundings—your first line of defense in a sensory world. We see the auditory system playing the role of sentinel in the acoustic startle reflex. Just as its name implies, this is the very rapid bodily movement that arises following a loud, abrupt sound. This reflex is very fast: muscle twitches may follow the sound by as little as 10 ms (Musiek, 2003). Because the time between ear (sound) and spinal cord (movement) is so brief, there can be no more than a few brainstem neurons between them. Being afraid increases acoustic startle (Davis, 2006). Directors of horror movies seem to know this as they gradually ramp up your anxiety before the big event, and then the surprise is usually loud.

complex sounds

But pure sine wave tones, like pure single-wavelength light sources, are rare in the real world, where objects and events that matter to listeners are more complex, more interesting, and therefore more challenging for researchers to study. -harmonics

summary of nerve deafness

Caused by damage between cochlea and cortex Cochlear Implants Stem Cell Research

restoration of complex sounds

Complex sounds such as music and speech can also be perceptually restored. When DeWitt and Samuel (1990) played familiar melodies with notes excised and replaced by noise, listeners perceived the missing notes as if they were present. Restoration was so complete that listeners could not report which notes had been removed or replaced with noise. The researchers also tested whether familiarity of the melodies mattered. Just as you might expect, listeners were much less likely to "hear" a missing note in an unfamiliar melody. Seeba and Klump (2009) trained European starlings (Figure 10.25) to peck when they heard a difference between two parts of starling song, called motifs. When starlings heard motifs with short snippets filled with noise or with silence, they were more likely to peck as if there was a difference between an intact and an interrupted motif when silence filled the gap. This observation suggests that the starlings restored the missing bits of motifs when noise was inserted into the gap. Not all starling songs are equal, however. In the same set of experiments, the researchers used bits of song that were either familiar to the starling in the experiment (the bird's own song or the song of a cage mate) or unfamiliar (from starlings the subject had never heard). Just like humans listening to familiar and unfamiliar melodies, starlings are more likely to restore missing bits of a familiar song. Restoration is not limited to music. R. M. Warren and Obusek (1971) played the sentence "the state governors met with their respective legi*latures convening in the capital city" with the first s in legislatures removed and replaced by silence, a cough, a burst of noise, or any of a few other sounds. Despite the missing s, listeners heard the sentence as if it were intact and complete. Even when listeners were explicitly warned that a small part of the sentence had been removed and replaced with silence or another sound, they were unable to accurately report where the sentence had been changed, except when the missing s was replaced with silence. Listening to familiar melodies and to real speech sentences, as opposed to simple sounds such as sine waves and tonal glides, permits listeners to use more than just auditory processing to fill in missing information. Clearly, these "higher-order" sources of information are used for listening to sources in the face of other acoustic clutter. In a particularly compelling example, listeners restored a missing sound on the basis of linguistic information that followed the deletion (R. M. Warren and Sherman, 1974). For example, Warren and Sherman played utterances such as "the *eel fell off the car" and "the *eel fell off the table" (the asterisk here represents a patch of noise). Listeners were much more likely to report hearing "wheel" in the former utterance and "meal" in the latter, even though the context information came well after the missing phoneme in the sentences. As this result hints, meaningful sentences actually become more intelligible when gaps are filled with intense noise than when gaps are left silent. Think about this: adding noise improves comprehension!

Types of Hearing Impairment

Conduction Deafness - any damage to the middle ear which impairs hearing. Nerve Deafness - Affects high frequencies - Less elasticity in the basilar membrane - Loss of nutrients to the cochlea - Cumulative effects of noise. When related to old age, this is called presbycusis.

pinna and head cues

Another reason why cones of confusion are not major practical problems for the auditory system is that time and intensity differences are not the only cues for pinpointing the location of sound sources. Remember that the pinnae funnel sound energy into the ear canal. Because of their complex shapes, the pinnae funnel certain sound frequencies more efficiently than others. In addition to pinnae, the size and shape of the rest of the body, especially the upper torso, affect which frequencies reach the ear most easily. Because of these effects, the intensity of each frequency varies slightly according to the direction of the sound. This variation provides us with another auditory localization cue. Suppose you're in an anechoic room, a room in which the walls are padded so that very little sound enters from the outside and very little sound bounces (reverberates) off the walls. The room is full of speakers at many locations—up, down, and all around you. Tiny microphones are inserted inside your auditory canals, right next to your eardrums. Now you can measure just how much energy from different frequencies actually reaches your eardrums from different locations. Figure 10.10a shows the measurements at the eardrum in a similar experimental setup—in this case for sounds played over a speaker 30 degrees to the left of a listener and 12 degrees up from the listener's head. Although the amounts of energy at all frequencies were equally intense coming from the speaker, you can see that the amounts of energy were not equally intense at the eardrum. Some frequencies (e.g., 5000 Hz) had higher intensity when they arrived at the eardrum; others (e.g., 800 Hz) had less intensity. As you can see, the relative intensities of different frequencies continuously change with changes in elevation as well as azimuth. The sum total of these intensity shifts can be measured and combined to determine the directional transfer function (usually referred to by its acronym, DTF) for an individual.

when sounds become familiar

In addition to the simple Gestalt principles that we've already discussed, listeners make use of experience and familiarity to separate different sound sources. When you know what you're listening for, it's easier to pick out sounds from a background of other sounds. An obvious example is how quickly you recognize someone saying your name even though there are many other sounds, including other voices, in the room. To test how much experience listeners need in order to benefit from familiarity, McDermott, Wroblewski, and Oxenham (2011) created complex novel sounds by combining natural sound characteristics in ways that listeners had never heard before. They repeatedly played these sounds at the same time and intensity as a background of other novel sounds that did not repeat, as shown in Figure 10.23. Although listeners could not segregate a sound from its background when they listened to a single instance, they could nonetheless segregate and identify the sound when it repeated. Listeners needed only a few repetitions to perform well above chance, even though they had never heard the complex sounds before they came to the laboratory.

grouping by onset

In addition to timbre, sound components that begin at the same time, or nearly the same time, such as the harmonics of a music or speech sound, will also tend to be heard as coming from the same sound source. One way this phenomenon helps us is by grouping different harmonics into a single complex sound. Frequency components with different onset times are less likely to be grouped. For example, if a single harmonic of a vowel sound begins before the rest of the harmonics of the vowel, that lone harmonic is less likely to be perceived as part of the vowel than if the onsets are simultaneous. R. A. Rasch (1978) showed that it is much easier to distinguish two notes from one another when the onset of one precedes the onset of the other by at least 30 ms. He noted that musicians playing together in an ensemble such as a string quartet do not begin playing notes at exactly the same time, even when the musical score instructs them to do so. Instead, they begin notes slightly before or after one another, and this staggered start probably helps listeners pick out the individual instruments in the group. Part of the signature style of the Rolling Stones has been to carry this practice to an extreme. Members of the group sometimes begin the same beat with such widely varying onsets that it is unclear whether they are playing together or not. Grouping of sounds with common onsets is consistent with the Gestalt law of common fate. Consider recordings of dropped glass bottles that either bounce intact or break into shards. Onsets of the overtones for different shards will differ because the pieces bounce independently until they bounce no more. Thus, even when the initial burst of noise is removed from the sound of the bouncing and breaking bottles, listeners can use patterns of onsets to accurately determine whether the bottle broke.

Sound Localization

Interaural Intensity - Slight differences in loudness at each ear - the head occludes some sound. Phase Disparity - The peaks (compressions) and valleys of the sound waves arrive at different times for the two ears - good for continuous sound sources - Barn owls use this to catch running mice in total darkness. Binaural Neurons (A1) compare information differences from the two ears. Interaural Time Difference - Slight delay in the arrival of a signal at the farther ear.

harmonics

Many environmental sounds, including the human voice and the sounds of musical instruments, have harmonic structure. -harmonic sounds are the most common type of sound in the environment. With natural vibratory sources (as opposed to pure tones produced in the laboratory), there is also energy at frequencies that are integer multiples of the fundamental frequency. For example, a female speaker may produce a vowel sound with a fundamental frequency of 250 Hz. The vocal cords will produce the greatest energy at 250 Hz, less energy at 500 Hz, less still at 750 Hz, even less at 1000 Hz, and so on. In this case, 500 Hz is the second harmonic, 750 is the third, and 1000 is the fourth. For harmonic complexes, the perceived pitch of the complex is determined by the fundamental frequency, and the harmonics (often called "overtones" by musicians) add to the perceived richness of the sound. The auditory system is acutely sensitive to the natural relationships between harmonics. In fact, if the first harmonic (fundamental frequency) is removed from a series of harmonics and only the others (second, third, fourth, and so on) are presented, the pitch that listeners hear corresponds to the fundamental frequency, even though it is not part of the sound. Listeners hear the missing fundamental. It is not even necessary to have all the other harmonics present in order to hear the missing fundamental; just a few will do. One thing that all harmonics of a fundamental have in common is fluctuations in sound pressure at regular intervals corresponding to the fundamental frequency. For example, the waveform for a 500-Hz tone has a peak every 2.0 milliseconds (ms) (Figure 10.14b). The waveforms for 750- and 1000-Hz tones have peaks every 1.3 and 1.0 ms, respectively (Figures 10.14c and d). As shown in Figure 10.14e, these three waveforms come into alignment every 4 ms, which, conveniently, happens to be the period of the fundamental frequency for these three harmonics: 250 Hz. Indeed, every harmonic of 250 Hz will have an energy peak every 4 ms. Some neurons in the auditory nerve and cochlear nucleus will fire action potentials every 4 ms to the collection of waves shown in Figure 10.14e, providing an elegant mechanism to explain why listeners perceive the pitch of this complex tone to be 250 Hz, even though the tone has no 250-Hz component.
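The 4-ms alignment described above is easy to verify numerically. The NumPy sketch below (illustrative only, not from the text) sums just the 500-, 750-, and 1000-Hz harmonics, leaving out the 250-Hz fundamental, and shows that the summed waveform still peaks every 4 ms, the period of the missing fundamental.

import numpy as np

f0 = 250.0                          # missing fundamental (Hz); period = 4 ms
t = np.arange(0.0, 0.012, 1e-5)     # 12 ms of time in 0.01-ms steps

# Sum only the upper harmonics; the fundamental itself is absent.
wave = sum(np.cos(2 * np.pi * f0 * k * t) for k in (2, 3, 4))

# The summed waveform reaches its maxima where all three harmonics align,
# which happens every 4 ms (the period of the missing 250-Hz fundamental).
peak_times_ms = t[wave > 2.9] * 1000
print(np.round(peak_times_ms, 2))   # clusters around 0, 4, and 8 ms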

the physiology of ILDS

Neurons that are sensitive to intensity differences between the two ears can be found in the lateral superior olives (LSOs) (see Figure 10.5), which receive both excitatory and inhibitory inputs. Excitatory connections to the LSOs come from the ipsilateral ear (that is, excitatory connections to the left LSO originate in the left cochlea, and excitatory connections to the right LSO come from the right cochlea). Inhibitory inputs come from the contralateral ear (the ear on the opposite side of the head) via the medial nucleus of the trapezoid body (MNTB). Neurons in the LSOs are very sensitive to differences in intensity across the two ears because excitatory inputs from one ear (ipsilateral) and inhibitory inputs from the other ear (contralateral) are wired to compete. When the sound is more intense at one ear, connections from that ear are better both at exciting LSO neurons on that side and at inhibiting LSO neurons on the other side.

Common Causes of Hearing Impairment

Noise Exposure - Affects high frequencies - Both sudden and prolonged exposure. When high-frequency hearing is impaired, speech perception becomes increasingly difficult.

attention

Once target sounds are determined and segmented, the brain amplifies those and diminishes others - The Cocktail Party Effect

Causes of Type I Nerve Deafness

Presbycusis - "Old Ear" -- High-frequency hearing loss. Noise-Induced Hearing Loss: Also affects high frequencies - Damage to cilia or basilar membrane -- Tinnitus. Infection of Cochlea - Damage to cilia. Ménière's Disease: Excessive fluid pressure in the cochlea damages the Organ of Corti.

auditory event perception

Quantity, Length, Volume, Energy, Velocity, Time Course

cochlear implant

Recall that the cochlea is coiled and that the basilar membrane selects frequencies. The electrode has multiple stimulation points that selectively activate nerves at the appropriate places.

spatial, spectral, and temporal segregation

The auditory system uses a number of strategies to segregate sound sources. One of the most obvious strategies is spatial separation between sounds. Sounds that emanate from the same location in space can typically be treated as if they arose from the same source. Moreover, in a natural environment in which sound sources move, a sound that is perceived to move in space can more easily be separated from background sounds that are relatively stationary.

the physiology of ITDS

The portion of the auditory system responsible for calculating ITDs obviously needs to receive input from both ears. As we saw in Chapter 9, binaural input enters almost every stage of the auditory nervous system after the auditory nerve. As information moves upward through the system, however, with every additional synapse the timing between the two ears is likely to become less precise. The medial superior olives (MSOs) are the first places in the auditory system where inputs from both ears converge (Figure 10.5), and sure enough, firing rates of neurons in the MSOs increase in response to very brief time differences between inputs from the two ears of the cat. ITD detectors form their connections from inputs coming in from the two ears during the first few months of life, and developing the ability to use ITDs to localize sounds depends critically on having experience with separate sounds coming from different places in space. The developmental sequence is similar to the formation of binocular neurons in the visual cortex, and it probably has a similar cause. Interpretation of ITDs is critically dependent on the size of the head. If babies came out of the womb with their ITD mechanisms prewired to the size of their infant heads, their sound localization abilities would steadily worsen as their heads grew during childhood and adolescence.

cones of confusion

Using normal cues, all you can detect is the time difference between your two ears. This specifies an azimuth, not a position. If you examine Figures 10.4 and 10.6 for a bit, you should see a potential problem with using ITDs and ILDs for sound localization: An ITD of -480 μs arises from a sound source that is located at either an angle of -60 degrees from the line of sight (ten o'clock in Figure 10.4) or an angle of -120 degrees (eight o'clock). Adding information from intensity differences does not help us here, because the ILDs for these two angles are also identical. If we also consider the elevation of a sound source (how far above or below our head the sound source is, a factor we've been ignoring up to now), we find that a given ITD or ILD could arise from any point on the surface of a cone of confusion that extends perpendicularly from the left or right ear. Although many books speak of a single cone of confusion, actually an infinite number of cones are nested inside one another. In fact, the widest "cone" is really a disk extending from directly in front of you, up to directly over your head, back to directly behind your head, and continuing to directly below you. As strange as it may seem, this disk is the most confusing of the cones. Cones of confusion are real perceptual phenomena, not just theoretical problems for the auditory system. In the real world, thankfully, we need not hold our heads in a fixed position. As soon as you move your head, the ITD and ILD of a sound source shift, and only one spatial location will be consistent with the ITDs and ILDs perceived before and after you move your head.

grouping by timbre

When a sequence of tones that have increasing and decreasing frequencies is presented (Figure 10.21a), tones that deviate from the rising/falling pattern are heard to "pop out" of the sequence (Heise and Miller, 1951). What happens when two patterns overlap in frequency—one increasing and then decreasing in frequency, and one decreasing and then increasing in frequency? If the tones are simple sine waves, two streams of sound are heard without overlapping pitches (Figure 10.21b); one stream includes all the high tones, and one includes all the low tones. However, if harmonics are added to one of the sequences, thus creating a richer timbre, two overlapping patterns are heard as distinct. Auditory stream segregation based on groups of notes with similar timbres can be seen as another example of the principle of similarity at work. As Figure 10.16 illustrated, timbres are very different for pianos, trombones, and saxophones. Grouping by timbre is particularly robust because sounds with similar timbres usually arise from the same sound source. This principle is the reason why listeners are able to pick out the melody played on a single instrument—a trombone, for example—even when another instrument, such as a saxophone, plays a different or even opposite sequence of notes. Neural processes that give rise to stream segregation can be found throughout the auditory system, from the first stages of auditory processing to the primary auditory cortex (A1) to secondary areas of the auditory cortex, such as the belt and parabelt areas (J. S. Snyder and Alain, 2007). The brain stem shows neural evidence of stream segregation based on simple cues such as frequencies of tones, but segregation based on more sophisticated perceptual properties of sounds is more likely to take place in the cortex.

directional transfer function (DTF)

a measure that describes how the pinna, ear canal, head, and torso change the intensity of sounds with different frequencies that arrive at each ear from different locations in space (azimuth and elevation). The importance of the DTF in sound localization is easily understood if we consider the difference between hearing a concert live and listening to music through a set of headphones. In person, we perceive the sound of the French horns as coming from one side of the orchestra and the sound of the flutes as coming from the other side. But when we wear headphones (especially the type inserted directly inside the auditory canals), sounds are delivered directly to the eardrums, bypassing the pinnae. Auditory engineers can use multiple microphones to simulate the ITDs and ILDs that result from the musicians' different locations (the Beatles were early users of this type of technology), but DTFs are not simulated. As a result, you may be able to get some sense of direction when listening to a concert through headphones, but the sounds will seem to come from inside your skull, rather than from out in the world. The situation is akin to that of visual depth perception. Pictorial cues can give a limited sense of depth, but to get a true perception of three-dimensionality, we really need binocular-disparity information that we normally get only when we're seeing real objects. Actually, just as stereoscopes can be designed to simulate binocular disparity, it is possible to simulate DTFs. Instead of using two camera lenses in place of two eyes, two microphones are placed near the eardrums as described earlier. Then the sound source, such as a concert, is recorded from these two microphones. When this special stereo recording, called a "binaural recording," is played over headphones, the listener experiences the sounds as if they were back out in the world where they belong. Unfortunately, however, every set of pinnae is different (see Figure 10.9), so for this simulation to work, all listeners need their own individual recordings. Just as heads (and their corresponding ITDs and ILDs) grow to be larger, ears grow and change during development and are often subject to mutilations of varying degrees (e.g., piercings). Research suggests that listeners learn about the way DTFs relate to places in the environment through their extensive experience listening to sounds, while other sources of information, such as vision, provide feedback about location. -This learning through experience suggests that children may update the way they use DTF information during development, and it appears that such learning can continue into adulthood. Hofman, Van Riswick, and Van Opstal (1998) inserted plastic molds into folds of adults' pinnae. As expected, listeners immediately became much poorer at localizing sounds. But after 6 weeks of living with these molds in their ears, the subjects' localization abilities had greatly improved. Somewhat surprisingly, these listeners also remained quite good at localizing with their "old ears" when the molds were removed. It would be interesting to know how well Leonard Nimoy (who played Spock in the original Star Trek series) could localize sounds with his Vulcan ear molds. The experience of listeners in this study suggests that switching between human and Vulcan pinnae every day may have become just a normal part of Nimoy's auditory life. Unfortunately, there are some limits to the ability to adjust to growing or remolded pinnae.
Babies grow larger heads, and pinnae really do get larger in old age. These larger ears do help older adults use lower-frequency cues; however, this improved ability to use low frequencies is insufficient to offset the effects of age-related hearing loss, and older individuals are poorer at localizing elevation.

inverse square law

a principle stating that as distance from a source increases, intensity decreases such that the decrease in intensity is proportional to the distance squared. This general law also applies to optics and other forms of energy. As it happens, reverberations appear to be important for judging the loudness of sounds. You do not judge that a coyote is howling softly just because it is far away, so how do you estimate how loud the howls really are? Recall that the sound to which you are listening rapidly decreases in energy, following the inverse-square law. However, reverberations do not fall off so quickly, because the surfaces that sounds bounce off do not move when the sound source becomes closer or farther away.

cone of confusion

a region of positions in space where all sounds produce the same time and level (intensity) differences (ITDs and ILDs)

medial superior olive (MSO)

a relay station in the brain stem where inputs from both ears contribute to detection of the interaural time difference.

lateral superior olive (LSO)

a relay station in the brain stem where inputs from both ears contribute to detection of the interaural level difference

conduction deafness

anything up to but not including the cochlea: 1. obstructions 2. damage. Treatment: Remove obstruction, repair eardrum, repair ossicles, open Eustachian tube.

timbre

loudness and pitch are relatively easy to describe, because they correspond fairly well to simple acoustic dimensions (amplitude and frequency, respectively). But the richness of complex sounds like those found in our world depends on more than simple sensations of loudness and pitch. For example, a trombone and a tenor saxophone might play the same note (that is, their two notes will have the exact same fundamental frequency) at exactly the same loudness (the sound waves will have identical intensities), but we would have no trouble discerning that two different instruments were being played. The perceptual quality that differs between these two musical instruments, as well as between vowel sounds such as those in the words hot, heat, and hoot, is known as timbre. The official definition of timbre is "the quality that makes listeners hear two different sounds even though both sounds have the same pitch and loudness" -However, differences in timbre between musical instruments or vowel sounds can be estimated closely by comparison of the extent to which the overall spectra of two sounds overlap (Plomp, 1976). Perception of visual color depends on the relative levels of energy at different wavelengths (see Chapter 5), and very similarly, perception of timbre is related to the relative energies of different acoustic spectral components (Figure 10.15). For example, the trombone and tenor saxophone notes whose spectra are plotted in Figure 10.15a share the same fundamental frequency (middle C, 262 Hz). However, notice that the trombone's third (786-Hz) component is stronger than its fourth (1048-Hz) component, whereas for the saxophone, the relationship between the energies of these two components is reversed.

source segregation or auditory scene analysis

processing an auditory scene consisting of multiple sound sources into separate sound images. Our perception is typically of a world with easily separable sounds. We can understand the conversation of a dance partner at a party, and we can pick out a favorite instrument in the band. This distinction of auditory events or objects in the broader auditory environment is commonly referred to as source segregation or auditory scene analysis.

azimuth

the angle of a sound source on the horizontal plane relative to a point in the center of the head between the ears. Azimuth is measured in degrees, with 0 degrees being straight ahead. The angle increases clockwise toward the right, with 180 degrees being directly behind.

hearing in the environment

the auditory system's ability to transform tiny air pressure changes into a rich perceptual world is an amazing wonder of bioengineering. From the funneling of sound waves by the pinnae, to the mechanics of middle-ear bones, to the tiny perturbations of the basilar partition and hair cells, to the sophisticated neural encoding in the brain stem and cerebral cortex—some remarkable mechanisms have evolved to interpret acoustic information about the world around us. Research that reveals these inner workings of the auditory system typically uses very simple stimuli under constrained situations—often isolated pure tones heard through headphones by listeners sitting in an otherwise perfectly quiet laboratory. Although these methods are invaluable for understanding how the auditory system functions, this is obviously not the way we experience sounds in our daily lives. In this chapter we get "outside the head" to investigate how hearing helps us learn about the real world. In many respects, sound localization parallels visual depth perception, which you learned about in Chapter 6. We then turn from where to what: how perceptual aspects of complex sounds are composed of simpler sounds in much the same way that visual representations of objects are built up from simple features (see Chapter 4). The third part of the chapter deals with auditory scene analysis, where we will see why some sounds group together, while separating from others, in ways that resemble the Gestalt principles introduced in Chapter 4. We see how the auditory system seamlessly fills in gaps to form a complete and coherent "picture" of our auditory environment in an auditory analog to visual object occlusion (see Chapter 6). Finally, we explore how auditory attention has much in common with the visual attention that you learned about in Chapter 7, while also serving a special role in keeping us ever vigilant for surprises in the world.

interaural level difference (ILD)

the difference in level (intensity) between a sound arriving at one ear versus the other. The second cue to sound localization is the interaural level difference, or ILD, in sound intensity. Sounds are more intense at the ear closer to the sound source because the head partially blocks the sound pressure wave from reaching the opposite ear. The properties of the ILD relevant for auditory localization are similar to those of the ITD: -Sounds are more intense at the ear that is closer to the sound source, and they are less intense at the ear farther away from the source. -The ILD is largest at 90 and -90 degrees, and it is nonexistent at 0 degrees (directly in front) and 180 degrees (directly behind). -Between these two extremes, the ILD correlates with the angle of the sound source, but because of the irregular shape of the head, the correlation is less precise than it is with ITDs. Although the general relationship between ILD and sound source angle is almost identical to the relationship between ITD and angle, there is an important difference between the two cues: the head blocks high-frequency sounds much more effectively than it does low-frequency sounds. This is because the long wavelengths of low-frequency sounds "bend around" the head in much the same way that a large ocean wave crashes over a piling near the shore. ILDs are greatest for high-frequency tones, and ILD cues work really well to determine location as long as the sounds have higher-frequency energy. ILDs are greatly reduced for low frequencies, becoming almost nonexistent below 1000 hertz (Hz). Our inability to localize low frequencies is the reason it does not matter where in a room you place the low-frequency subwoofer of your stereo system.

interaural time difference

the difference in time between a sound arriving at one ear versus the other. If the source is to the left, the sound will reach the left ear first. If it's to the right, it will reach the right ear first. Thus, we can tell whether a sound is coming from our right or left by determining which ear receives the sound first. The term that is used to describe locations on an imaginary circle extending around us in a horizontal plane—front, back, left, and right—is azimuth. The ITDs for sounds coming from various angles are represented by colored circles. Red circles indicate positions from which a sound will reach the right ear before the left ear; blue circles show positions from which a sound will reach the left ear first. The size and brightness of each circle represent the magnitude of the ITD. ITDs are largest, about 640 microseconds (millionths of a second, abbreviated μs), when sounds come directly from the left or directly from the right, although this value varies somewhat depending on the size of your head. A sound coming from directly in front of or directly behind the listener produces an ITD of 0; the sound reaches both ears simultaneously. For intermediate locations, the ITD will be somewhere between these two values. Thus, a sound source located at an angle of 60 degrees will always produce an ITD of 480 μs, and a sound coming from -20 degrees will always produce an ITD of -200 μs. That might not seem like much of a time difference, but listeners can actually detect interaural delays of as little as 10 μs (Klumpp and Eady, 1956), which is good enough to detect the angle of a sound source to within 1 degree.
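The ITD values quoted above can be approximated with a simple spherical-head model (often credited to Woodworth). This is an illustrative sketch, not the textbook's formula, and the head radius and speed of sound are assumed values.

import math

def itd_microseconds(azimuth_deg, head_radius_m=0.0875, speed_of_sound_m_s=343.0):
    # Spherical-head approximation for a distant source:
    # ITD ~ (r / c) * (theta + sin(theta)), with theta the azimuth in radians.
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound_m_s) * (theta + math.sin(theta)) * 1e6

for az in (0, 60, 90):
    # Prints roughly 0, 490, and 660 microseconds, close to the
    # 0, 480, and ~640 microsecond values quoted above.
    print(az, round(itd_microseconds(az)))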

fundamental frequency

the lowest frequency component of a complex periodic sound

sound localization

the owl's hoots enter your ears in exactly the same place (funneled through the pinnae into the middle and inner ear) regardless of where the owl is. The auditory system uses a similar approach to determine the location in space from which a sound is coming. Just as having two eyes turned out to be one of the keys to determining visual depth relations, having two ears is crucial for determining auditory locations. For most positions in space, the sound source will be closer to one ear than to the other. Thus, there are two potential types of information for determining the source of a sound. First, even though sound travels fast, the pressure waves do not arrive at both ears at the same time. Sounds arrive sooner, albeit very slightly, at the ear closer to the source. Second, the intensity of a sound is greater at the ear closer to the source. These are our first two auditory localization cues.

decay

the part of a sound during which amplitude decreases (offset)

attack

the part of a sound during which amplitude increases (onset)

auditory stream segregation

the perceptual organization of a complex acoustic signal into separate auditory events, for which each stream is heard as a separate event. Listeners move too, so if a sound stays in the same place relative to the path of a listener, it will be easier for that sound to be sorted out from other sounds. In addition to location, sounds can be segregated on the basis of their spectral or temporal qualities. For example, sounds with the same pitch or similar pitches are more likely to be treated as coming from the same source and to be segregated from other sounds. Sounds that are perceived to emanate from the same source are often described as being part of the same "auditory stream," and dividing the auditory world into separate auditory objects is known as auditory stream segregation. A very simple example of auditory stream segregation involves two tones with similar frequencies that are alternated. This sequence sounds like a single coherent stream of tones that warble up and down in frequency. But if the alternating tones are markedly different in frequency (Figure 10.19b), two streams of tones are heard—one higher in pitch than the other. Auditory stream segregation is a powerful perceptual phenomenon that is not limited to simple tones in the laboratory. Before stream segregation was "discovered" by auditory scientists, the composer Johann Sebastian Bach exploited these auditory effects in his compositions (Figure 10.20). The same instrument, such as a pipe organ, would rapidly play interleaved sequences of low and high notes. Even though the musician played a sequence in the order H1-L1-H2-L2-H3-L3, listeners heard two melodies—one high (H1-H2-H3) and one low (L1-L2-L3). The examples of auditory stream segregation presented in this section can be described as applications of the Gestalt principle of similarity (see Figure 4.18a in Chapter 4): sounds that are similar to each other tend to be grouped together into streams.
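A minimal NumPy sketch of the alternating-tone demonstration (illustrative only; frequencies, durations, and function names are assumptions, not from the text). Played back, the small frequency separation tends to be heard as one warbling stream, while the large separation tends to split into separate high and low streams.

import numpy as np

def alternating_tones(f_low, f_high, tone_ms=100, n_pairs=5, sr=44100):
    # Build an A-B-A-B... sequence of short sine tones at two frequencies.
    n = int(sr * tone_ms / 1000)
    t = np.arange(n) / sr
    segments = []
    for _ in range(n_pairs):
        segments.append(np.sin(2 * np.pi * f_low * t))
        segments.append(np.sin(2 * np.pi * f_high * t))
    return np.concatenate(segments)

one_stream = alternating_tones(400, 420)    # similar frequencies: heard as a single warbling stream
two_streams = alternating_tones(400, 1200)  # very different frequencies: splits into two streams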

acoustic startle reflex

the very rapid motor response to a sudden sound. Very few neurons are involved in the basic startle reflex, which can also be affected by emotional state. The acoustic startle reflex is unselective—almost any loud sound will do. In other cases, auditory attention is selective—it is picking one sound source out of several. Many sounds occur at the same time in natural environments, with all of the sounds becoming merged at the ears. This makes listening to only one sound among many a serious challenge. Effects of attending to a particular sound source can be so strong that we completely miss out on hearing other sounds in a kind of inattentional deafness. Skilled musicians were no better than untrained listeners at noticing an electronic guitar improvisation that was mixed in with Richard Strauss's Thus Spoke Zarathustra (theme music from the sci-fi classic 2001: A Space Odyssey) when both groups were asked to count (and, thus, to attend to) the number of timpani beats (Koreimann, Gula, and Vitouch, 2014). This inattentional deafness has its limits, because listeners had less trouble noticing the guitar when it was made sufficiently loud. While inattentional deafness might seem to be a bad thing, it really represents an extreme example of auditory processes that help us to listen in our acoustically crowded world. Imagine yourself in a room full of people who are speed dating in their searches for true love (Figure 10.27). You earnestly focus on the person who you are sizing up, but many other voices compete with the one voice you are trying to understand. Listeners can use the acoustic characteristics of a talker to track what that voice is saying despite the clutter of other voices. While we might think that this relies on being familiar with a talker or actively concentrating on that voice, we would be wrong (Bressler et al., 2014). Instead, it appears that the brain does this automatically, following principles like those for auditory stream segregation described earlier in the "Spatial, Spectral, and Temporal Segregation" section. Listeners become less accurate in understanding what they hear when they have to switch between talkers. In addition, your ability to look at one person while attending to a different conversation illustrates the flexibility of your attentional apparatus. Finally, if you are attending to another conversation, really embarrassing things can happen when you realize that your partner has stopped speaking and is waiting for a response to a question to which you have not attended. You can switch attention back and forth between streams, but you cannot fully process two streams of speech any more than you can read two sentences at the same time.

