Speech Science

Characteristics of an Approximant

- Limited articulatory constrictions that alter resonant frequencies - Classification based on syllable position - Formant transitions typically faster than for vowels

What two articulatory behaviors help lower the spectral peak for SH?

- Lip rounding - curling of the tongue up

liquids

- Retroflex /r/ & lingual-alveolar /l/

Characteristics of a Sonorant

- Similar to vowels - Free airflow; articulation shapes the vocal tract - Characterized mainly by formant frequencies - Periodic laryngeal source (all voiced)

In English what is the approximate dividing line between VOT values for voiced and voiceless stops?

- Under 25 ms for voiced stops - Over 30 ms for voiceless stops

What happens to formants surrounding nasal consonants? What are the consequences for intelligibility?

- Vowels affected by nasalization tend to have formants lower in intensity with wider bandwidths - Consequence: Formants are less distinct (hard to identify vowels)

What are the features of a simple periodic sound (pure tone)?

- a sine-shaped waveform - a single frequency - a certain amplitude - a certain wavelength

What Determines Center Frequency?

-Mass reduces high-frequency energy - Stiffness reduces low-frequency energy - The combination determines the optimal frequency

Three Categories of VOT

-Voicing lead (prevoicing) -Zero onset/short-lag -Long-lag VOT

Resonance of Cavities depends on...

-Volume of cavity -Area of any aperture -Surface characteristics of the cavity -Coupling factors between diff cavities

Sampling

-You're measuring the voltage. Math that guarantees you'll get the same back that you put in

affricates

-combination of stop and fricative

sonorant or resonant

-consonant is vowel-like -voiced

fricatives

-continuants -produced with continuous airflow through the oral cavity -audible turbulence, without voicing

resonance

-every object has a natural resonant frequency influenced by its shape and size -depends on shape of vocal tract

antiformants

-frequencies of sound energy loss -frequencies at which sound is not transmitted through the vocal tract -sound is 'trapped' in the back cavity

fricative cues

-frication -transitions -voicing

Glide cues

-gradual formant transition that is quicker than that of diphthongs

obstruent

-has a supraglottic sound source -the vocal tract is constricted tightly enough to produce turbulence noise

s, z

-high energy noise with most of the energy lying in the high frequencies (above 4,000 Hz) -Front cavity is long enough to introduce a significant resonance

Time Domain

-instantaneous amplitude across time -amplitude is Y axis (ordinate) -Time is X axis (abscissa) (Easy to tell if it's square or sine or whatever)

ʃ,ʒ

-intense noise spectra with most of the energy lying in the mid to high frequencies (above 2,000 Hz) -Front cavity has significant resonance effect

liquids

-lateral and rhotic -steady state -transition

approximant

-liquids and glides are the two types of non-nasal sonorants -liquids: /l/ /r/ -glides: /j/ /w/

h

-low energy, flat, diffuse spectrum -the whole vocal tract filters the noise, so vowel-like formant patterns often are evident in the radiated noise

nasals

-murmur -transitions

strident/sibilant fricatives

-produced with an especially loud noise -tongue forms narrow channel -stream of air strikes incisors -/s/ and /z/

assimilation

-rubbing off of properties in other areas where they don't necessarily belong -NOT THE SAME AS COARTICULATION

consonant

-sound where produced with major obstruction in the vocal tract -one or more sound sources

vowel

-sound where vocal tract mostly open -always voiced

affricate cues

-stop gap -frication

/v/

Place: labiodental Manner: fricative Voice: voiced Aperiodic Turbulent, & Periodic Laryngeal Source, Muscles: Orbicularis Oris Inferior, Manner-presence of aperiodic noise, Place- low, Voice- Presence of Phonation

/j/

Place: palatal Manner: glide Voice: voiced Periodic Laryngeal Source, Muscles: Genioglossus, Damping, Formant transitions

/r/

Place: palatal Manner: liquid Voice: voiced, Periodic Laryngeal Source, Muscles: Superior Longitudinal Muscle, Orbicularis Oris, Rapid formant changing in F2 and F3 Formant changes and damping

liquids

/r, l/ semivowels characterized by rapid movements, formant structure, F3 distinguishes the phonemes (F3 and F4 are closer for /r/); antiformants for /l/

Consonants: Liquids

/r/ /l/

What are some examples of liquids

/r/, /l/

Liquids

/r/, /l/ Transitions similar to vowels

sibilants

/s, z, ʃ, ʒ/ intense noise; differentiated among themselves by voicing and noise spectrum: voicing pulses for /z ʒ/, no pulses for /s ʃ/

What are sibilants and how do they look on a spectrogram?

/s, z, ʃ, ʒ/; concentrate energy above F3, with alveolars even higher than palatals

Which phonemes have steep, high-frequency spectral peaks

/s/ /z/ /ʃ/ /ʒ/ sibilants; posterior

Difference between /s/ and /z/

/s/ is voiceless & longer, and /z/ is voiced & shorter. -Both have concentration 3500+, and dark noise.

/n/ has a very similar formant patterning to

/t/ and /d/

What phonemes are non-resonant Affricates

/tʃ/ /dʒ/

Consonants: Affricates

/tʃ/ /dʒ/

/w/ shares similar formant structure to which vowel?

/u/

High, back, rounded vowel

/u/

low F1 and low F2

/u/ has...?

glides

/w, j/ also called approximants and semivowels, gradual articulatory motion. narrow but not closed vocal tract, have formants

Consonants: Glides

/w/ /j/

Semivowel Glides

/w/ and /j/ -/w/ = /b/ + /u/ + transition to vowel -/j/ = /d/ + /i/ + transition to vowel

What are some examples of glides

/w/, /j/

Examples of tense vowels

/ɝ/, /u/, /o/, /i/, /e/

Examples of lax vowels

/ɪ/, /ə/, /ʊ/, /ɛ/, /æ/, /ɚ/

Difference between the strong, voiceless fricatives /ʃ/ and /s/

/ʃ/ starts at a lower frequency of 2000Hz, /s/ starts around 3500Hz. -Both dark noise, long durations.

central

/ʌ/ is long

Difference between /ʒ/ (voiced) and /ʃ/ (voiceless)

/ʒ/ has shorter duration, /ʃ/ has longer duration -both have greatest energy 2000Hz+ and dark noise.

/w/

Place: velar Manner: glide Voice: voiced Periodic Laryngeal Source, Muscles: Styloglossus, Orbicularis Oris, Damping, Formant transitions

affricates

/ʧ ʤ/ described as a combo of stop and fricative complete obstruction in the vocal tract, intraoral pressure builds up, release to generate fricative noise (distinguishes them from stops)

/ŋ/

Place: velar Manner: nasal Voice: voiced Periodic Laryngeal Source, Muscles: Levator Palatini, Palatoglossus, highest/longest variable,

Difference between /θ/ and /ð/ (ex: Thaw and That)

/θ/ has a trailing tail and longer duration, /ð/ has shorter duration -both have light noise and are 2000+

4 Ways a System Loses Energy/Why is there damping or decrease in amplitude?

1) Friction: air molecules rub against each other as well as the walls of the vocal tract resonator (vocal tract tissues). 2) Absorption: energy is lost/transferred to another structure (vocal tract tissues). 3) Radiation: air molecules escape from the tube and are lost. Escape from nose/mouth. 4) Gravity: exerts a force on the air molecules that opposes the inherent vibratory forces (inertia).

Characteristics of the Vocal Tract Resonator

1) Quarter-Wave resonator. Open at one end (mouth/nose) and closed at other (vocal folds/glottis). Driving frequency could be vocal folds or in tract. Responds with greatest amplitude to some frequencies while it attenuates to other frequencies. 2) Series of connected air-filled containers. 3) Irregular shape: acts like Nonuniform resonator-broadly tuned and responds to a wide range of frequencies around resonant frequency. 4) Variable resonator: frequency of response changes when vocal tract changes shape, which depends on the sound being made. Different resonant frequencies as vocal tract changes shape/configuration.

Acoustic correlates of a stop

1) silent gap 2) release burst 3) voice onset time 4) formant transition

Calculate Formants

1) Take the length of the vocal tract (e.g., 17 cm). 2) Find the wavelength of the lowest resonant frequency (lambda = 4L for a quarter-wave resonator; 4(17) = 68 cm). 3) Use this to find the frequency (lambda = c/f, so f = c/lambda; 34000/68 = 500 Hz). 4) Calculate the higher resonant frequencies (for a tube closed at one end, only odd multiples occur, so multiply by 3 for the 2nd formant: 1500 Hz).
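
The four steps above can be sketched in a few lines of Python. This is only an illustration, assuming a uniform tube closed at one end and a speed of sound of 34,000 cm/s as used in the card; the function name `quarter_wave_formants` and its parameters are made up for this example.

```python
def quarter_wave_formants(length_cm, n_formants=3, speed_cm_per_s=34000.0):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other."""
    # Steps 1-3: the lowest resonance has wavelength 4L, and f = c / lambda.
    lowest = speed_cm_per_s / (4.0 * length_cm)
    # Step 4: only odd multiples of the lowest resonance occur (x1, x3, x5, ...).
    return [lowest * (2 * k - 1) for k in range(1, n_formants + 1)]


print(quarter_wave_formants(17.0))   # 17 cm vocal tract
print(quarter_wave_formants(17.5))   # 17.5 cm vocal tract
```

A real vocal tract is not a uniform tube, so these values only approximate the formants of a neutral vowel.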

The lowest resonant freq (or 1R) has a wavelength that is __ times the length of the vocal tract, while the ____ resonant freq (or 3R) has a frequency that is 3 times the lowest frequency. Finally, the ____ resonant frequency (or 5R) has a frequency that is five times the lowest frequency

4, second, third

F2 equation

485.7 x 3 = 1,457 Hz

F3 equation

485.7 x 5 = 2,428 Hz

wavelength

4 x 17.5 cm = 70 cm

If 3 pure tones at 50 Hz, 150 Hz, and 250 Hz were added together to make a complex wave, the frequency of the complex wave would be:

50 Hz
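
One way to check this answer: when the components are harmonically related, the repetition rate of the complex wave is the greatest common divisor of the component frequencies. The sketch below assumes whole-number frequencies in Hz; `fundamental_hz` is a name invented for this example.

```python
from functools import reduce
from math import gcd


def fundamental_hz(components_hz):
    """Greatest common divisor of integer component frequencies (Hz)."""
    return reduce(gcd, components_hz)


print(fundamental_hz([50, 150, 250]))
```

The same rule explains why a wave built from 200 Hz and 300 Hz repeats at 100 Hz even though no 100 Hz component is present.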

What are the Hz for Formants I-III?

500 Hz for formant I, 1500 Hz for formant II, 2500 Hz for formant III, 3500 Hz for formant IV (high pitch)

Pinna resonates at what freq?

5kHz

Radiation from lips (R); gain

6 dB/octave gain

Radiated acoustic pressure wave (P); loss

6 dB/octave roll-off

/g/

Place: velar Manner: stop Voice: voiced Periodic Laryngeal Source, Muscles: Styloglossus, Palatoglossus, Mylohyoid, Levator Palatini, Manner- silent or near silent closure interval; transient release burst, Place- F2 transitions, frequency of most intense portion of release burst, Voice- +phonation, -phonation, presence or absence of aspiration, VOT/F1 onset, closure duration (medial position), preceding vowel duration (final position),

Assimilation

A sound becomes like its neighbor; one articulator is involved

Assimilation

A sound becomes like its neighbor; one articulator is involved. Partial Assimilation: no change in phonemic category. EX: see powerpoint. Complete Assimilation: phonemic class changes. EX: velarization of /n/ before /k/ in "ten cards". Can be seen in acoustics, speech movements, and muscle activity. Examples of types of assimilation (look at powerpoint)

Speech production system is comprised of ____.

A sound source and filter (resonator). Source excites the resonator (there are many different sources)

Consonant

A speech sound produced with one or more areas of the vocal tract narrowed by some degree of constriction (partial or complete); less energy, greater meaning, some are voiced

/k/

Place: velar Manner: stop Voice: voiceless Aperiodic Laryngeal Source, Muscles: Styloglossus, Palatoglossus, Mylohyoid, Levator Palatini, Manner- silent or near silent closure interval; transient release burst, Place- F2 transitions, frequency of most intense portion of release burst, Voice- +phonation,-phonation, presence or absence of aspiration, VOT/F1 onset, closure duration (medial position), preceding vowel duration (final position)

F2 transitions

Plosives- EX: /b/- frequency is low; the formant structure is like /a/ low & low. EX: /d/- starting at a mid frequency. Which means the F1 frequency is about the same as the F2. EX: /g/- frequency is high The frequency of the F2 transition of where it starts and ends; tells us about place of articulation. Burst release (concentration of energy)

Average Duration

Plosives: 10 msec. Affricates: 75-130 msec. Fricatives: 130 msec.

on a spectrum display, a voiced fricative shows... -vertical striations due to the opening and closing of the vocal folds -a voice bar at low frequencies - frication noise with a frequency range comparable to that of its voiceless cognate - frication noise somewhat weaker than that of its voiceless cognate - all of the above - none of the above

all of the above

oscilloscope

electronic test instrument allows you to see how voltages vary over time (amplitude measure)

in oral speech sounds, soft palate is ___

elevated against posterior pharyngeal wall

the levator veli palatini muscle is involved in the - of the soft palate

elevation

analyze and give feedback for consonantal and vocalic /r/. demonstration of RTGram. and you can analyze with Praat

how to use acoustics in tx:

What are the spectrographic differences between open and close vowels?

http://www.u.arizona.edu/~ohalad/Phonetics/notes/Formants%20Spectrograms%20and%20Vowels.PDF

How do different sound classes appear on a spectrogram?

https://home.cc.umanitoba.ca/~krussll/phonetics/acoustic/spectrogram-sounds.html

Which item below is not used for measuring the glottal waveform

Spectrogram (transillumination, inverse filter, pseudo-infinite length tube, and electroglottograph all do)

These 2 instruments used together would provide you with measurements of fundamental frequency and vocal fold contact patterns

Speech Analysis using a microphone, PRAAT/ Multi-Speech and EGG

speech rate vs articulation rate

Speech rate includes pauses, whereas articulation rate excludes them
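
The distinction can be made concrete with a small calculation. This sketch assumes both rates are expressed in syllables per second, which the card does not specify; the function names and the sample numbers are hypothetical.

```python
def speech_rate(n_syllables, total_time_s):
    """Syllables per second over the whole utterance, pauses included."""
    return n_syllables / total_time_s


def articulation_rate(n_syllables, total_time_s, pause_time_s):
    """Syllables per second over speaking time only, pauses excluded."""
    return n_syllables / (total_time_s - pause_time_s)


# Hypothetical sample: 40 syllables in 10 s, of which 2 s are pauses.
print(speech_rate(40, 10.0))
print(articulation_rate(40, 10.0, 2.0))
```

Because pause time is removed from the denominator, articulation rate is always at least as high as speech rate for the same sample.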

Assimilation

Speech sounds become like neighboring sounds, involves an alteration in the movement of a single articulation

Obstruents

Speech sounds produced with an obstruction of some sort in the airway. Stops, fricatives, affricates.

Resonants/Sonorants

Speech sounds produced with continuous, non-turbulent airflow. Nasals, liquids, glides, vowels. Typically voiced.

Continuants

Speech sounds without complete obstruction, but a continuous airflow. Fricatives, nasals, liquids, glides. Voiced or nonvoiced.

This portable device is used to measure vital capacity

Spiropet/ spirometer

The vowel /ɑ/ modeled as two tubes.

Start of /ɑ/ is thin tube and end is fatter tube because the tongue is in pharyngeal cavity, taking up space so it is thin at one end.

Stops: Aspiration

breathy noise generated following the release of a voiceless stop consonant as air passes between the vocal folds as they begin to adduct for following vowel

What sound is considered a high, unrounded front vowel? what are the articulatory configurations associated with this vowel

i, Low F1, High F2

consonants

identified as continuant, sonorant, strident, based upon closure and distribution of energy at higher frequencies

phonemic analysis of vowels

if 2 words are identical except for single vowel or consonant, the vowels are said to be different phonemes ex-bed/bad, beet/bit

What is delayed auditory feedback?

if a speaker listens to their own speech under a slight time delay, fluent speech can become disfluent, with syllables repeated or prolonged (disfluent speakers can become fluent as well)

When are VOT values positive?

if onset of phonation follows stop release ex. if phonation begins 75 ms after release of stop, VOT value = +75
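
The sign convention can be written out as a short sketch. It assumes the release and voicing-onset times are measured in ms from the same starting point; the category cutoffs reuse the approximate English values from the VOT card above (under 25 ms, over 30 ms), and the "ambiguous" label for the gap between them is an assumption of this example, as are the function names.

```python
def vot_ms(release_ms, voicing_onset_ms):
    """VOT = time of voicing onset minus time of stop release."""
    return voicing_onset_ms - release_ms


def vot_category(vot):
    """Rough category for a VOT value in ms (approximate English cutoffs)."""
    if vot < 0:
        return "voicing lead"   # voicing starts before the release
    if vot < 25:
        return "short lag"      # zero onset or short delay
    if vot > 30:
        return "long lag"       # long delay, typical of voiceless stops
    return "ambiguous"          # assumed label for the 25-30 ms gap


vot = vot_ms(release_ms=100, voicing_onset_ms=175)
print(vot, vot_category(vot))
```

With voicing beginning 75 ms after the release, the value is +75 ms, matching the example in the card.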

/θ/

Place: interdental Manner: fricative Voice: voiceless Aperiodic Turbulent, Muscles: Superior Longitudinal, Manner- presence of aperiodic noise, Place- low, Voice- Absence of Phonation

during a non-resonant affricate, what happens

airflow is interrupted by a stop closure; the articulator then pulls away slowly from the closure, leaving a constriction that produces turbulent noise

egressive

airflow is...what?

open glottis = ?

airstream is only audible at the point of constriction = voiceless fricative

effects of narrowing the vocal tract at different locations

all formants are lowered by labial constriction

intonation ___ -is the rise and fall of fundamental frequency in an utterance -is perceived as the melody of speech -conveys meaning (declarative vs interrogative utterances) -conveys non-linguistic information about the state of the speaker's mind and/or emotions - all of the above

all of the above

in general, the first formant F1 is ___ in consonants than in vowels, because consonants involve ___

lower; more oral constriction

For formant 1, narrowing or rounding the lips will cause it to ______ in frequency and opening it will cause it to ____ in frequency

lower; raise. Narrowing or rounding the lips causes it to lower in frequency; opening raises it. lower tongue = raises formant raise tongue = lowers formant

F3

lowered by lip or mid tongue constriction, raised by anterior or posterior constriction

F2

lowered by lip or posterior constriction, raised by anterior constriction....the more front the sound, the higher and the more back, the lower

Sonorants

nasals, glides, and liquids, Characteristics:Similar to vowels, Free airflow; articulation shapes the vocal tract, Characterized mainly by formant frequencies, Periodic laryngeal source (all voiced)

Sonorants

nasals, glides, liquids

Resonant consonants are:

nasals, liquids, glides- similar to vowels - aka sonorants- more open vocal tract allowing resonant energy

Phonation Threshold pressure

minimum amount of transglottal pressure needed to set VF into Vibration ~In healthy adults this number is low

Node

minimum vibratory amplitude; formant frequency is raised by constriction; minimum volume velocity or maximum pressure

Sonorants: Semivowels/ glides

more constricted than vowels gradual articulator movement to following vowel good formant structure requires movement

What are sounds in English more susceptible to?

more susceptible to duration fluctuations than sounds in other languages (learned vs. psychologic difference? no one is sure what explanation for this is true)

during the close phase, a nasal has ___ energy than a stop because ___

more; air and sound may exit via the nostrils

where does perception for hearing occur

mostly in brain

where does sensation for hearing occur

mostly in inner ear

dysarthria

motor speech disorders due to neurological damage; rate changes, articulatory adjustments are neglected, diminished acoustic contrasts, incomplete stop closures, timing and sequencing are interrupted

Stops: Formant transitions

movement from stop to following vowel (or from following vowel to stop)

different sized resonating cavities

movement of the articulators creates what?

spatiotemporal dynamics

movement of the articulators relative to some frame of reference

What unit is VOT measured in?

ms

flaps

much shorter in duration than regular stops and voicing may not be distinguished

what is meant when said speech is redundant

multiple cues for same thing

manner

murmur is a cue for ______ of articulation

Semi vowel

named because their articulations and acoustic features resemble those of vowels. The articulators used in their production form only minimum constrictions in the vocal tract, they are characterized by formant structures similar to those of vowels and diphthongs. The _____ are subdivided into two manners of articulation glides and liquids.

Fricatives

narrow constriction; pushing air through generates aperiodic noise; longer duration than stops; voiceless = just aperiodic noise; voiced = aperiodic noise plus periodic vocal fold vibration

a tuning fork is a

narrowly tuned and has very low damping

the consonant manner class(es) that have phonation as their only sound source are

nasal and semivowels

nasals involve resonance within the

nasal cavity

500

nasal murmur is dominated by low-frequency energy -often below ____ Hz

sonorants

nasal, liquids and glides tend to look both vowel and consonant like

velum

nasals are produced by lowering the _____, allowing a coupling of the oral and nasal cavities

vowels

nasals have highly damped formants, meaning broad bandwidths; nasal bandwidths are wider than those of ___

Why is coordination of the articulatory and laryngeal system necessary?

necessary to create voicing differences that are reflected in VOT

If change is required, feedback will be what?

negative

the fact that no word in English can start with "tl" is an example of a - sequence constraint

negative

Prevoicing

negative VOT: voicing starts before the release of the stop

Is there one set of accepted distinctive features?

no

do labiodental, dental, glottal fricatives have a narrow spectra?

no

Partial Assimilation

no change in phonemic category

Is accurate tactile feedback essential to accurate speech production?

no one knows for sure

voiceless stop equals

no periodic vocal fold vibration

What is partial assimilation?

no phonemic change occurs in the sound, only a phonetic change

voiceless

no voice bar at the bottom

What does aspiration diffuse?

noise energy generated at larynx (or lower)

photoglottography

bright light placed against the neck just below the cricoid cartilage; a probe passed into the pharynx acts as a photosensor; the amount of light passing through the glottis is proportional to glottal area

How are non resonant consonants characterized?

characterized by more restricted airflow than for semivowels or nasals

True

children under 12 years demonstrate spatiotemporal patterns similar to, but not as stable as, those of adults. True or False?

tracheostomies

children with this have limited interaction to hear lots of speech and limited ability to produce speech

front vowels

clearly separated F1 and F2

back vowels

close F1 and F2, but separated F2 and F3

Frication

noise energy is generated as air is forced through a narrow constriction

peaked spectrum

noise for stop /k/ generated in the mid-frequency range of 1.5 - 4 kHz

falling spectrum

noise for stop /p/ generated in the low frequency range of 500-1500 Hz

rising spectrum

noise for stop /t/ generated in the high frequency range above 4 kHz

What makes sounds audible in nonresonants?

noise in the speech signal makes the sounds audible regardless of whether phonation accompanies articulator or not, thus a single articulation can produce 2 different speech sounds, one phonated and one unphonated

What is frication?

noise random vibrations

cue to fricative

noise; transition to next vowel

2 types of sounds that are not perceived categorically

non speech sounds and vowels in isolation

rate

normal, fast or slow

sound pressure level in dB

not an accurate reflection of the sensitivity of the human ear (our perception of intensity - the loudness of a sound)

initial cluster with stops

not as much aspiration

How are speech sounds produced in reality?

not produced one at a time, independently of each other. They are produced in a context and are altered by neighboring sounds

stable

relationship between the vowel formants is stable or unstable?

stress

relative syllable weight based on F0, duration, intensity, amplitude and formants

rapid articulatory movements

relatively fast formant transitions (mostly F1)

What are the formant configurations for cardinal vowel [a]?

relatively high first formant; second formant just over it and slightly higher

What are the formant configurations for cardinal vowel [i]?

relatively low first formant; second formant is high.

what are the formant configurations for cardinal vowel [u]?

relatively low first formant; second formant is low as well

During nasal sounds, the levator palatini muscle is

relaxed

Stops: burst release

release of pressure behind the constriction; articulatory release should be very fast and the burst is short (5-40 msec); more intense for voiceless than voiced stops; spectrum of burst varies with place of articulation; burst spectrum also influenced by following vowel (coarticulation); if release is too slow, it could sound like a fricative (fast = stop)

F2 in fricatives

relevant for distinguishing labio-dental and dental fricatives

Wide Bandwidth

resolves time information well (formant structure) but frequency information poorly

formant

resonance of vocal tract

tube length and location/degree of constriction effect

resonances

the vocal tract naturally resonates at these frequencies

resonant (formant) frequencies

two (general) consonant types include

resonant and non-resonant

nasal cavities form a ______ during nasal sounds

resonant chamber

The vocal tract naturally resonates at certain frequencies. They are termed _____ or _____ frequencies

resonant or formant

when the styloglossus muscle contracts the tongue is

retracted

language specific

rising does not always mean question, other languages have tone on syllable as a contrastive feature

stop release

rising pressure behind the obstruction is rapidly released producing a 20-30ms transient three types: transient, frication, and aspiration

in active theories what happens if there is a large error signal

runs it through the process again and uses context

high amplitude fricatives

s, z, ʃ, ʒ

sibilant/strident

s, z

Strong Fricatives

s, z, ʃ, ʒ -All have dark random energy -Place of articulation determines frequency range -Voicing determines the duration and existence of voice bars or striations

one of the functions of the orbicularis oris is to

close the lips, push them forward, and help produce bilabial sounds and rounded sounds like "ooh"

Is the VP port open or closed during stops

closed

glottis

closed end of the tube (vocal tract)

the three phases of the glottal cycle are

closed, opening, closing (order matters)

stop consonants involve these three distinct components

closure, burst and transition

Consonants can be affected by

coarticulation

productive correlate of parallel processing of sounds is

coarticulation

two articulators are moving at the same time for different phonemes

coarticulation

What creates syllables?

coarticulation of vowels and consonants binding together

interaction of cues

combination of burst and vowel contributes to perception of place of articulation

Voiced obstruents...

combine periodic and aperiodic sources

the stop release burst of /k/ would be expected to have a ___ spectrum because there is ___ cavity anterior to the closure point.

compact; a large

the velar stops have a release burst with a ___ spectrum, and labial stops have a release burst with a ___ spectrum.

compact; diffuse falling

The type of assimilation while saying 'ten cards' Be specific in describing what happens to /n/

complete assimilation: /n/ becomes /ŋ/ due to the following velar stop

Stops: Stop Gap

complete constriction of the supralaryngeal vocal tract (VP port closed); voiced

Stops

completely blocking air behind articulators (vocal tract is completely closed) open tract then burst of air comes out ALL ORAL NOT NASAL

speech

complex sequencing of phonemes in rapid succession

turbulence

complex, unpredictable air flow

one of the functions of the buccinator muscle is to

compress the cheeks

How are models of speech production often generated?

computer or mechanically generated

Resonances are dependent on the ____ of the resonator.

configuration/shape

fricatives

consonant produced with a narrow constriction through which air escapes with a continuous noise

This sound class involves a constriction within the vocal tract

consonants

2 types of sounds that perceived categorically

consonants and vowels in context

to provide for constant airflow during speech production we must maintain - in the lungs

constant pressure

Consonants are produced with more _____________ in the vocal tract than vowels

constriction

Upstream

constriction near the glottis

Downstream

constriction near the mouth

Speech sounds vary in duration due to what?

context

rate

contextual timing characteristics of speech production (not individual sounds)

What sounds are longer than stops?

continuant consonants (fricatives, nasals, semivowels) are longer, including the duration of stop closure

after the corticospinal tract of the pyramidal system crosses over, innervation is considered to be mainly ipsi-/contra-lateral

contra-lateral

Consonant production involves a ____ within the vocal tract

constriction

the pyramidal system is comprised of three tracts which are..

corticospinal, corticobulbar, and corticopontine

murmur

coupling of the oral and nasal cavities causes a nasal _____

segmental shortening due to rate of speech

same pattern but reduced strength

different source frequency but same filter

same vowel at different pitch

implication of independence of source and filter

same vowel can be produced at different fundamental frequencies and different vowels produced at same fundamental frequencies

____ formants change due to the size of the oral cavity

second

place cues for stops 2

second formant (F2) transitions (and even F3)

some people produce sounds correctly even if they can't perceive them correctly. Who?

second language learners

rate of speech

segments compress as rate increases

What makes suprasegmental features different from the segmental features of speech?

segments of speech are vowels and consonants, suprasegmentals are prosodic features of speech that tend to occur simultaneously with 2 or more segmental phonemes

give two of the types of information that a person has stored for every word they know

semantics (meaning of words) and phonetics (sequencing of sounds)

/w/ /j/ /r/ and /l/ are generally classified as

semivowels

which consonant manner class is most intense acoustically?

semivowels

Range of Human Hearing

sensitivity to sounds depends on both the amplitude and frequency of a sound.

DURATION: Syllabification

separate every syllable

tongue, jaw, and velum

set the filter for how you want to resonate a particular vowel-- set it by using what 3 things?

in terms of speech perception which is perceptual invariance

several sets of cues may be heard as same consonant

stress

creating meaning thru emphasis: F0 contour, intensity, duration many people use a combo of the three, used to change meaning of sentences based on stress of specific word

Voiced stops beginning a word are usually produced with a long/short delay of voicing onset for the next vowel

short

voiceless stops that follow an (s) in the same word (e.g. spy) are usually produced with a long/short delay of voicing onset for the next vowel

short

affricates

short stop + extra short fricative

fricatives

should show air and may look like a stop

sound spectrum

shows bands of harmonics over time: the narrowband and the wideband

any one given person can change the fundamental frequency of their voice by the relative contraction of which muscles?

crico-thyroid, vocalis

DURATION: Ataxia

damage to the cerebellum that affects prosody

parallel processing of sounds

decoding more than one sound at a time

2 cues to nasality

decrease in intensity; weakness of upper formants

What does falling intonation result from?

decreased cricothyroid activity or from decreased subglottal pressure at the end of the breath for this utterance

in switching from breathing for life to breathing for speech, the number of breaths per minute increases/decreases

decreases

Increasing lung volume..

decreases lung pressure

Speed of sound is influenced by

density and temperature

direct realist theory of speech perception

derives from the visual perception theory. • Perception: what the listener hears ("the object"), not the actual acoustic event • Perception consists of a single step from acoustic signal to perception

what is the general function of the extrinsic muscles of the tongue

determines the position of the tongue in the oral cavity

what is the general function of the intrinsic muscles of the tongue

determines the shape of the tongue in the oral cavity

The primary muscle of inspiration

diaphragm

final consonants with stops

difference is seen in the duration of the vowel

F2 transitions are moving in __________directions

different

fricative spectra shows

diffuse energy in non sibilants and concentrated energy in sibilants

the stop release burst of /t/ would be expected to have a ___ spectrum because there is ___ cavity anterior to the closure point.

diffuse rising; a small

Manipulating rate of change differentiates semi-vowels and ____

diphthongs.

What sounds are intrinsically long?

diphthongs and tense vowels

What is proprioceptive feedback?

direct feedback from the muscles; senses velocity and direction of movement and position of articulators and other speech organs

Glottal source - F0 + harmonics Vocal tract - formants

do not confuse this:

when the rectus abdominis contracts, the rib cage is pulled down/up

down

when the hyoglossus muscle contracts the tongue is pulled

downward

when the risorius muscle contracts

draws back the angle of the mouth

An aperiodic sound can be created

due to partial adduction of the vocal folds, at various locations along the supraglottal vocal tract, and through forcing the airstream through a constriction

cue to place of production for nasals

duration of F2 transition to adjacent vowel

2 cues to voicing in affricates

duration of closure, duration of preceding vowel

partially voiced

during closure

When is there a groove formed along the tongue midline?

during sibilants

When does aspiration occur

during the release burst, but it is not the same thing as the release burst. It contributes to voiceless stop bursts seeming more intense; the turbulent airflow occurs at the larynx, not at the articulators. *Release burst and aspiration occur at the same time.

Spirantization

during the stop gap, when sound occurs due to incomplete closure of the articulators; it sounds like a fricative. Another way for air to leak through is through the port.

diphthongs

each of these has a characteristic F1-F2 pattern. the actual value of formant frequencies is variable across individuals and within individuals across speaking contexts

standing wave patterns

each resonant pattern is a (blank)=formant

syllable timed

each syllable has about the same duration and vowels do not get reduced (e.g., Spanish)

cite three characteristics of muscle that are used in the evaluation of muscle function

strength, range of motion, motor control

Any syllable can be spoken with greater/lesser ______ depending on the meaning demanded by context

stress

What are the suprasegmental features of speech?

stress, intonation (pitch) , and duration (length of time)

the suprasegmental features of speech are

stress, intonation, duration

Suprasegmentals include:

stress, intonation, rhythm and juncture

in heteronym pairs, the different meanings are indicated by ___

stressing the first syllable in nouns and the second syllable in verbs

the strident/sibilant fricatives are ____ than the nonstrident fricatives because ___

stronger; the teeth form an obstacle to the airflow and there is a resonating cavity

What speech disorders involve a breakdown in the rhythm of speech

stuttering/dysarthria

clear speech

effort to be highly intelligible: slower rate, avoidance of articulatory modification, greater intensity of consonants, greater F0 variability, precise timing

the external muscles of the larynx are innervated by the - branch of CN X (vagus)

superior

This task requires a client to say a sound for as long as possible at a comfortable pitch and loudness level

sustained phonation/ max phonation time

No VOT when a stop is in the ___________________ position. It uses the preceding vowel duration instead

syllable-final, postvocalic (VC)

liquids may function as _______, while glides never do

syllable nuclei

What are suprasegmental features overlaid on?

syllables, words, phrases, and sentences

videokymography

television technology, limits scanning of the endoscopic image to rapid repetition of a single line: drawing the glottis in the horizontal plane over time. research tool, identifies glottal configuration

F2

tells you how fronted or how backed the tongue is in the mouth while producing vowel

F1

tells you how high or how low the tongue is in the mouth while producing vowel

when several impulses travel down the same axon towards the synapse with another neuron the effect is - summation

temporal

coarticulation

temporal overlap of articulatory movements for different phonemes

Co-Articulation

temporal overlap of articulatory movements for different phones.

Which are longer- tense or lax vowels

tense

longer

tense vowels are what?

what do spatial target models say?

that a speaker can still produce a sound accurately even in the face of disruption of underlying muscle activity : "motor equivalence"

What has further evidence said about DAF?

that audition does operate as a feedback system for speech control, but no one knows if the feedback is necessary for speech

What is the overall conclusion regarding the vocal tract normalization theory and simple target theory?

that even though a comprehensive and tested theory is not available, we can assume for the moment that the formant frequencies are the most important cues to vowel perception from the sound signal. That is, the simple target theory (though imprecise at times) will lead to the best and most practical results when we try to interpret spectrographic images clinically.

What is problem of the simple target theory?

that formants 1 and 2 are not as reliable and consistent as would seem at first sight. (graphs of vowels circled)

What does rise-fall intonation curve mean?

that most often the pitch rises during the first part of the utterance and falls at the end, signaling to the other person that it's their turn to talk

What do acoustic-auditory models tell us?

that there can be variation in the articulation of sounds, like vowel formants, but the listener will still recognize the sound accurately

lower

the acoustic energy for /l/ is primarily in the ______ frequencies - resembles a nasal

Murmur

the acoustic pattern associated with nasal radiation of acoustic energy

Voice Onset Time (VOT)

the amount of time between the burst release and the onset of voicing.

the weakening of the spectrum of nasals above 300 Hz is due to

the antiresonances of the nasal passageways, sinuses, and blocked oral cavity

When does aspiration occur in english

the beginning of a word and the beginning of stressed syllables

oropharynx

the bend at this, doesn't matter acoustically

Why is partial assimilation okay?

the brain is flexible and thus you don't have to hit every exact spot

formants

the characteristic resonances

perseveratory (progressive) coarticulation

the current speech sound is influenced by the properties of a sound realized previously -ex. dogs or cats

What is VOT

the duration of the period of time between the release of a stop and the beginning of vocal fold vibration

1000

the first antiformant for /m/ occurs at around ______ Hz.

F1, F2, F3, F4, F5

the first four or five are relevant for speech, and for specification of a vowel, only the first three are relevant

If an area of maximum velocity (v) is penetrated, the formant moves ______ in frequency.

the formant moves DOWNWARD in frequency.

F3

the frequency of this is quite low, making it difficult to distinguish between the second and third formants

acoustic targets

the goal of articulatory movement may be a specific acoustic event. Supporting this theory: limitations in acoustic feedback (hearing impairment) negatively affect speech production

VF are touching

the highest amplitude is when?

What is primary stress?

the highest level of stress, usually seen on the second syllable in a word

How is a greater intensity for the heavily stressed syllable attained?

the increased vocal fold tension that yields a higher F0 value also leads to greater excursion of the vocal folds from rest, causing greater amplitude of the stressed syllable

VOT (voice onset time)

the interval between the release of the stop and the onset of vocal fold vibration

The [l] sound is fairly comparable to the [r] in most respects except?

the l sound is fairly comparable to the r in most respects but fails to reveal the drop in F3 characteristic for the [r]. the l has relatively closed sound, appears softer on spectrograms.

15cm

the length of a female vocal tract (then F1 = 34,000 / (4 x 15) = 566.67 Hz)

In general, the longer a tube...

the lower its lowest resonant frequency

What is fundamental frequency

the lowest frequency of pattern repetition

note under vocal tract transfer function

the more widely spaced harmonics of the higher -F0 sound of the female voice compared to the male voice

aBduction is

the movement of the vocal folds away from the midline

aDduction is

the movement of the vocal folds towards the midline

doesn't

the nasal cavity does or doesn't change its articulatory posture?

What is lexical stress?

the pattern of stress within words; can count the number of syllable nuclei in an utterance to determine the number of syllables in a word (also differentiates nouns from verbs, like PERmit vs perMIT and EXtract vs exTRACT)

Changing F0 is the what?

the pitch pattern or intonation contour of a sentence

Why does /sh/ have lower overall frequencies than /s/?

the point of articulation is further back in the mouth than for /s/, giving a longer resonating cavity anterior to the constriction and therefore lower overall frequencies (2000 Hz and above) - there is also lip rounding and protrusion in production of this sound, causing a longer oral cavity and lower frequencies (think back vowels)

What are the cues for voicing of stop plosives?

the presence of a voicing bar (phonation itself). the duration of VOT, VTT (or stopgap if it happens in the middle of a word). presence of aspiration (VL consonants only). duration of preceding vowels.

Describe voicing of fricatives

the presence of phonation during fricative noise; relative duration of the noise segment and bordering vowels. The vowel is prolonged prior to /z/; /z/ itself is shorter, however, than the /s/ phoneme.

When are vowels longer?

when they occur before voiced consonants ("leave") than they are when they occur before voiceless consonants ("leaf") - they are also longer before continuants than stops ("leaf" vs "leap")

Place of Articulation

• Bilabial • Tongue + fixed point of articulation • Pharynx/glottis

harmonics

these are in between (not the same amplitude or intensity), lots of frequencies that have a roll-off in intensity

six tense

these are long vowels: /i/, /e/, /ɝ/, /u/, /o/, /ɚ/

harmonics

these are related to fundamental frequency because they are whole number multiples of the fundamental
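
The whole-number relationship between harmonics and the fundamental can be sketched in a few lines of Python. This is a minimal illustration only: the function name `harmonic_series` and the F0 values of 100 Hz and 200 Hz are assumptions chosen for the example, not values from these notes.

```python
def harmonic_series(f0_hz, count):
    """Return the first `count` harmonics of a fundamental (H1 is F0 itself)."""
    return [n * f0_hz for n in range(1, count + 1)]


# Hypothetical F0 values, chosen only for illustration.
lower_f0 = harmonic_series(100, 5)   # [100, 200, 300, 400, 500]
higher_f0 = harmonic_series(200, 5)  # [200, 400, 600, 800, 1000]

# Neighboring harmonics are always one F0 apart, so a higher F0
# gives more widely spaced harmonics.
lower_spacing = lower_f0[1] - lower_f0[0]     # 100
higher_spacing = higher_f0[1] - higher_f0[0]  # 200
```

The spacing result matches the note under the vocal tract transfer function: a higher-F0 voice has more widely spaced harmonics.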

three lax

these are short vowels: /ɪ/, /ə/, /ɛ/, /ʊ/ and /ɔ/

What is a non-phonemic diphthong?

these diphthongs are simply stressed versions of existing pure vowels; for ex: (oU eI). diphthongization, an allophonic change rather than a contrastive variation.

formants

these do not have set relationship because they are different for the different vowels.

lips

these gain 6db in terms of amplitude in the low or the high frequencies---get boost so you can hear them better

vowels

these have formants and sonorants

obstruents

these include stops, fricative and affricates

acoustic filters

these let through (pass) energy or reduce (attenuate) energy

intrinsic muscles of the tongue

the origin of these muscles is inside the oral cavity

formants

these particular areas are strengthened/amplified--need 3 because of /r/: we have to see F3 dipping down to F2 to confirm /r/

Why are fricatives considered continuants?

these sounds can be prolonged

1. pitch, 2. loudness, and 3. duration

these three things contribute to the perception of stress, helping to differentiate the meaning of similar words

1. nasal airflow and 2. nasalance

these two things are highly dependent upon the stimulus material (whether it contains nasal consonants)

What are semi-vowels?

they are consonants that reveal similarities with vowels. They are entirely dependent on resonance. they have in common with diphthongs that they contain some kind of change or transformation in the first 2 formants ([r] sound has change in 3rd formant). However, as a rule, they can't occur independently without some bordering vowel; the element of change occurs more quickly than those characteristics of diphthongs. manipulating the rate of change differentiates semi vowels and diphthongs.

What do suprasegmentals do?

they play a role in the process of understanding speech- they enable listeners to interpret a speakers intention

What is resonance quality (voice quality)?

thin/oral resonance, muffled, or "back in the throat" resonance.

Give other examples of complete assimilation.

think, bank, anger

stop gap

this 50-150 ms event corresponds to the complete closure of the vocal tract (silence), minimum radiated acoustic energy; neck acts like a low pass filter

What is an appropriate response to the following question? That's NOT your green book?

this IS my green book

disadvantage

this artificially constrains theories to a specific approach or outlook

wideband

this band shows formants better

narrowband

this band shows harmonics better

frication

this comes after the burst

boundary condition

this exists at the lips between the vocal tract and the atmosphere

consonant

this has a locus that vowels move to or from

voice bar

this has a range- for women: 250-under 60

advantage

this helps to understand theories

What is an appropriate response to the following question? WHOSE green book is this?

this is MY green book

Perception of nasality

this is a complex phenomenon that is difficult to measure

palatoglossus (glossopalatine)

this is an extrinsic muscle of the tongue that contracts to raise the root of the tongue or with SG or GG, create groove in back of tongue

genioglossus

this is an extrinsic muscle of the tongue that has posterior fibers contract to push tongue out or against front teeth and anterior fibers that contract to retract tongue. the anterior + posterior fibers contract to pull tongue downward

styloglossus

this is an extrinsic muscle of the tongue that is antagonist of genioglossus. It contracts to pull tongue up and back and maybe pull tongue side up

complex wave

this is contained within the pressure wave

cut off frequency

this is defined as the frequency at which the amplitude of the frequency component is decreased by 3 dB (half of its power)
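
To see why "3 dB down" and "half power" are the same statement, here is a small Python sketch. It assumes an idealized first-order low-pass filter, which is a modeling choice for illustration and not something these notes specify; the function name and the 1000 Hz cutoff are likewise hypothetical.

```python
import math


def first_order_lowpass_gain_db(freq_hz, cutoff_hz):
    """Gain in dB of an idealized first-order low-pass filter (assumed model)."""
    magnitude = 1.0 / math.sqrt(1.0 + (freq_hz / cutoff_hz) ** 2)
    return 20.0 * math.log10(magnitude)


cutoff_hz = 1000.0  # hypothetical cutoff frequency

# At the cutoff the amplitude is 1/sqrt(2) of the passband value,
# which is half the power, i.e. about -3 dB.
gain_at_cutoff = first_order_lowpass_gain_db(cutoff_hz, cutoff_hz)

# Well below the cutoff the signal passes almost unchanged;
# well above it the signal is strongly attenuated.
gain_in_passband = first_order_lowpass_gain_db(100.0, cutoff_hz)
gain_in_stopband = first_order_lowpass_gain_db(10000.0, cutoff_hz)
```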

WHAT is that?

this is my green BOOK

17.5 cm

this is the average length of the male vocal tract

acoustic analysis

this is the cheapest way to see what the tongue is doing (but the rest are fun too..)

fundamental frequency

this is the lowest (most natural) frequency

mouth opening

this is when the jaw rotates downward and translates downward and forward

vertical

this muscle of the tongue contracts to flatten tongue

transverse

this muscle of the tongue contracts to narrow and/or elongate tongue

inferior longitudinal

this muscle of the tongue contracts to shorten tongue and bring apex and lateral margins DOWNWARD

superior longitudinal

this muscle of the tongue contracts to shorten tongue, bring apex and lateral margins UPWARD

vocal tract

this resonates the source signal by allowing certain frequencies to pass through the filter with greater amplitude than other frequencies

vocal tract

this runs from above the larynx to the lips and/or the nose

transfer function

this specifies the vowel

What are the most effective resonators of aperiodic noise?

those immediately anterior to the constrictions and occlusions in the oral cavity

1. presence or absence of voicing, 2. place of articulation, 3. manner of articulation

three features of phonetic description of consonants

stop

three phases of a ___: closure release transition

1. vocalic, 2. glide, and 3. consonantal

three sets of landmarks

lowest (dotted) line/0 dBHL

threshold of hearing/Minimal Audible Field

the external thyroarytenoid (or muscularis) muscles connect from the - to the -

thyroid, muscular process of the arytenoids

the vocalic (internal thyroarytenoid) muscles connect from the - to the -

thyroid, vocalic process of arytenoids

waveform is

time and amplitude

the voice onset time is the

time between stop release and the beginning of phonation

VOT

time from release of stop closure to onset of voicing

intrinsic

timing of movements of articulators is an (intrinsic or extrinsic) characteristic of the relationship among different muscles for a given movement

juncture

timing we give a group of phonemes to relay a message. where we put in breaks changes the meaning "an aim" and "a name" have same phonemes but different break

place cues for stops 3

to a very limited extent the duration of the VOT for initial stops

Why does the VP port close during non-resonant consonants

to ensure all airflow is directed at the oral cavity

What is the goal of target models?

to reach the spatial target, the brain's internalized spatial representation of vocal tract areas in which the articulators move

What is Delayed Auditory Feedback seen as?

to some, it's seen as proof that speech is a servomechanism and auditory feedback is the main control

the hypoglossal nerve innervates all the muscles of the

tongue

F2 corresponds to

tongue advancement/ size of oral cavity

F1 corresponds to

tongue height/ mouth opening

This is how vowels are classified

tongue height/advancement/tension

[u]

tongue high - low first formant tongue back - low second formant

[a]

tongue low - high first formant tongue central - second formant slightly higher

explain what is happening with the articulators during the production of /tu/

tongue moving back, while lips are moving forward

Describe basilar membrane

tonotopic response (from thin and stiff at base to wide and floppy at apex)

how is /l/ produced?

tongue-tip contact with alveolar ridge, sides of tongue down: lateral

the contraction of the lateral crico-arytenoid muscles has the effect of drawing the vocalic muscles toward/away from the midline

toward

sensory feedback

transfer of a portion of the system's output back to the input for regulation and error correction

place

transitions are cues for ___ of articulation

t/f voiced fricatives often lose their voicing when they are produced in a word final position

true

t/f on side view the articulation of (L) looks like the articulation of (d)

true

t/f the articulation of the consonant (r) as produce in various dialects and by various

true

t/f the presence of voicing during the production of voiced stops is highly variable

true

t/f upper motor neurons exist entirely within the CNS

true

When is the rise-fall intonation curve true?

true of declarative sentences and those that do not have yes/no answers

the source for voiceless fricatives involves ___, while the filter involves___

turbulent air flow through a constriction; the cavity in front of the constriction

noise

turbulent airflow is what?

For formant II there are _____ the numbers of (p) points and (v) points.

twice the number. This formant is said to be mostly responsive to tongue front to back positions.

Coarticulation

two articulators are moving at the same time for different phonemes. Occurs due to the temporal overlap between articulatory gestures for vowels and consonants. Example: 'two' /tu/ - the /t/ is moved back in the mouth and the lips are protruded during /t/

What is coarticulation?

two articulators are moving at the same time for different phonemes

1. executive 2. effector

two levels of motor programs

1. open, 2. closed

two types of feedback control loops for sensory feedback

1. front. 2. back

two types of tongue advancement (see book for examples of vowels in vocal tract)

Diphthongs

two vowels within the same syllabic nuclei, smooth glide from one vowel to the next; onglide/offglide; each have a characteristic F1-F2 pattern

sinewave speech

type of speech where waves track the center frequencies of F1, F2, and F3 of a naturally produced sentence. no consonants except spaces.

release bursts are

typically stronger for voiceless than voiced stops

What are the limits for human pitch detection and discrimination?

uncertain and unreliable above about 5 kHz

short lag

under 20 ms

when the external intercostals contract the rib cage is pulled down/up

up

how are fricatives both periodic and aperiodic

upper vocal tract is aperiodic and lower vocal tract is periodic

a strong release burst is typical of __ stops because the ___ at the time of stop release

voiceless; vocal folds are apart

During closure the only possible source of voicing is shown as a

voicing bar

Zero onset/short-lag

voicing begins at or very shortly after burst release. Vocal folds are adducted by the time the stop is released. Closure is silent; phonation begins at release or just after.

Zero onset/short-lag VOT

voicing begins at or very shortly after burst release. Vocal folds are adducted by the time the stop is released. Closure is silent; phonation begins at release or just after.

Voice Leading

voicing begins before burst release. Vocal folds approximated throughout stop closure, and phonation occurs during stop closure.

pre-voicing

voicing begins just before release

simultaneous

voicing begins upon release

Long-lag VOT

voicing begins well after release. Vocal folds adduct after the stop is released. Voicing is delayed; the stop is aspirated

fully voiced

voicing no stopping

Eddies

volumes of air that perform rotation of aperiodic, high frequency fluctuations in pressure and velocity

two cues to juncture

vowel lengthening; silence

What provides information regarding the changes that occur due to tongue position and oral cavity size

vowel quadrilateral

the F2 "transition locus" of a stop is the hypothetical starting point of F2 of a ___ following a stop, and it gives information about ____

vowel; place of articulation of the stop

relative to one another

vowels are perceived as... what?

vowel quadrilateral

vowels form approximate shape of (blank)

What are segments

vowels, consonants, semivowels - of which speech is composed

The majority of our prosody is what we do with ____.

vowels. largely determined what we do with vowels and phonation of them.

Give examples of semi-vowels

w, j, r, l

Which two semi-vowels are considered glides? the tongue positions for these are almost identical to what two vowels?

w, j; [u] and [i] vowels respectively. when they occur close to these vowels, acoustically there appears to be very little change (although perceptually there is). only 1F and 2F matter for perception here.

strongly aspirated

way after release

always

we almost always or never devoice final affricates and fricatives in real life

evidence that categorical perception is learned

we best identify categories in our language

air flow

we need (blank) to initiate phonation when the vocal folds are adducted to create air pressure changes that flow into the oral and nasal cavities or just to send air into the cavities without phonation where modifications of the pressure wave occur

VOT 20 ms or less (english)

we perceive voiced stops /b/

VOT 25 ms or more (english)

we perceive voiceless stops /p/
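
The two perceptual boundaries above can be turned into a small decision rule. This is a sketch under stated assumptions: the function name `perceived_voicing`, its parameter names, and the "ambiguous" label for the gap between 20 and 25 ms are illustrative choices, and other cards in this set quote slightly different boundary values.

```python
def perceived_voicing(vot_ms, voiced_max_ms=20.0, voiceless_min_ms=25.0):
    """Map a voice onset time (ms) to the category an English listener tends to hear.

    The default thresholds follow the two cards above; the region between
    them is labeled "ambiguous" here as an assumption for illustration.
    """
    if vot_ms <= voiced_max_ms:
        return "voiced"      # e.g. heard as /b/ (includes voicing lead, negative VOT)
    if vot_ms >= voiceless_min_ms:
        return "voiceless"   # e.g. heard as /p/
    return "ambiguous"


# Hypothetical VOT measurements in milliseconds.
examples = {vot: perceived_voicing(vot) for vot in (-80, 10, 22, 60)}
# {-80: 'voiced', 10: 'voiced', 22: 'ambiguous', 60: 'voiceless'}
```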

Nasals have ____ formant patterns

weak

jaw movement from mandibular teeth

what do the bottom two pairs of paths represent under x-ray microbeam?

1. motion of the tongue tip, 2. blade, and 3. dorsum

what are the top three pairs of paths under x-ray microbeam

consonants

what has less energy and may have 2 sources of sound? vowels or consonants

find stopgap in the picture of the spectrogram

what is /a/?

Manner of Articulation

• Degree of constriction and its effect on the airflow • Complete or transient cessation of airflow • Constriction with continuous airflow

Frication vs Aspiration

• Frication noise - vocal tract • Aspiration noise - vocal folds

contact quotient

what percentage of the cycle are the VF closed? CQ = (contact phase / vibratory cycle) x 100%. Normal is 40-60%; varies with voice quality (a louder voice will be closed longer)
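
A worked example of the contact quotient formula, as a minimal Python sketch. The function name `contact_quotient` and the timing values (a 5 ms cycle, as for a 200 Hz voice, with 2.5 ms of contact) are hypothetical numbers chosen for illustration.

```python
def contact_quotient(contact_phase_ms, cycle_ms):
    """CQ = (contact phase / vibratory cycle) x 100, returned as a percentage."""
    if cycle_ms <= 0:
        raise ValueError("cycle duration must be positive")
    return contact_phase_ms / cycle_ms * 100.0


# Hypothetical measurement: 5 ms vibratory cycle, folds in contact for 2.5 ms.
cq = contact_quotient(2.5, 5.0)      # 50.0
in_typical_range = 40.0 <= cq <= 60.0  # True for this example
```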

vowels, diphthongs, approximants

what sounds have formants?

true

what we hear is linear, including syllables, sounds, and words? true or false?

What is anticipatory assimilation?

when a sound is influenced by a following sound

What is carry-over assimilation?

when a sound is influenced by a preceding sound

A formant is defined as:

A resonance of the vocal tract

shorter

lax vowels are what?

Acoustic Characteristics of Affricates

Silent Gap Release burst

Acoustic correlates of affricates

Silent gap and frication.

assess reaction to auditory stimuli

how to test infants?

What sounds are intrinsically short?

lax vowels (/a/ vs /i/)

Turning the treble control down on a radio is a type of

low-pass filtering

The higher the harmonic, the ___ the energy (dB)

lower

the motor branches of cranial nerves are considered upper/lower motor neurons

lower

men's speech

lower F0, more closed, shallower spectral tilt, more power/amplitude

amplitude

lower frequencies have highest energy

a motor unit is comprised of

lower motor neuron, muscle fibers, neuromuscular junction

What are the 2 types of assimilation

partial and complete

What are the two types of assimilation?

partial and complete assimilation

exhalation for life is a passive/active process

passive

Immobile Articulators

• Alveolar ridge • Hard palate • Teeth

Compare speech intelligibility of speakers who are deaf from birth to those individuals who acquire deafness as adults.

"children learn languages easier"

What is an example of assimilation of manner of production?

"educate" used to be articulated with a stop-glide sequence; now the more common movement of the tongue back toward the palate as the stop is released generates the affricate /dʒ/ (palatalization) (stop goes to affricate)

In regards to limitations of the simple target theory, explain how "ideal frequency" targets for vowels often aren't achieved.

"ideal frequency" targets for vowel formants, often, aren't achieved in natural speech production (yet, these incomplete productions do not seem to affect speech perception).

place cues for fricatives (spectrum) s

"s" has a relatively high peak (4500-8000Hz); the peak energy for fricatives for males tend to be lower than for females.

place cues for fricatives (spectrum) sh

"sh" has a peak lower than "s" (around 2500-4500Hz) males typically lower than females

What is example of coarticulation?

"two": tongue is reaching alveolar ridge for /t/ at the same time the lips are rounding for /u/

Assimilation and Coarticulation are differentiated in terms of:

# of articulators and # of speech sounds involved in each effect

Formant/Resonant Frequency Equation

F(n) = (2n - 1) x c / (4 x L), where n = resonance number, c = velocity of sound (34,000 cm/s), L = length of tube
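
The resonance equation can be checked numerically with a short Python sketch. It treats the vocal tract as a uniform tube closed at one end, using the speed of sound and tract lengths quoted elsewhere in these notes (17.5 cm male, 15 cm female); the function and constant names are assumptions made for the example.

```python
SPEED_OF_SOUND_CM_S = 34000.0  # value used throughout these notes


def tube_resonance_hz(n, length_cm, c=SPEED_OF_SOUND_CM_S):
    """n-th resonance of a tube closed at one end: (2n - 1) * c / (4 * L)."""
    return (2 * n - 1) * c / (4.0 * length_cm)


# First three resonances for a 17.5 cm tract.
male_formants = [round(tube_resonance_hz(n, 17.5), 1) for n in (1, 2, 3)]
# [485.7, 1457.1, 2428.6]

# First resonance for a 15 cm tract.
female_f1 = round(tube_resonance_hz(1, 15.0), 2)  # 566.67
```

The shorter tube gives the higher first resonance, which is the "longer tube, lower resonant frequency" rule stated in another card.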

Silence

(Exception is Brownian Motion) As long as the air pressure is steady at the atmospheric level on both sides of the ear drum, a listener hears nothing.

Diffuse falling spectrum?

(High amplitude/low frequency or low amplitude/high frequency) - Bilabial

Pitch Contour

(Intonation) Change in Fundamental Frequency over time

/f/

(Nonsibilant), Place: labiodental Manner: fricative Voice: voiceless, Aperiodic Turbulent Friction, Muscles: Orbicularis Oris Inferior, Manner- presence of aperiodic noise, Place- low, Voice- Absence of Phonation

/z/

(Sibilant) Place: alveolar Manner: fricative Voice: voiced Aperiodic + Periodic Laryngeal Source, Muscles: Superior Longitudinal, LCA is active, Manner- presence of aperiodic noise, Place- high, Voice- Presence of Phonation

/s/

(Sibilant) Place: alveolar Manner: fricative Voice: voiceless Aperiodic Turbulent Friction, Muscles: Superior Longitudinal, Manner-presence of aperiodic noise, Place- high, Voice- Absence of Phonation

/ʒ/

(Sibilant) Place: palatal Manner: fricative Voice: voiced Aperiodic Turbulent Friction & Periodic Laryngeal Source, Muscles: Intrinsic Laryngeal Muscle, Manner- presence of aperiodic noise, Place- high, Voice- Presence of Phonation

Digital Resonance

-Based on arithmetic -Moving average filter is good example of low-pass filter
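
To illustrate why a moving average behaves as a low-pass filter, here is a minimal Python sketch. The function name, the window size of 2, and the two test signals are assumptions chosen so the effect is easy to see: a steady (very low frequency) signal passes unchanged, while a signal that flips sign on every sample (the highest frequency a sampled signal can hold) is cancelled.

```python
def moving_average(samples, window):
    """Average each run of `window` consecutive samples (fully overlapping positions only)."""
    if window < 1 or window > len(samples):
        raise ValueError("window must be between 1 and len(samples)")
    return [
        sum(samples[i:i + window]) / window
        for i in range(len(samples) - window + 1)
    ]


steady = [1.0] * 8             # no change over time: lowest possible frequency
alternating = [1.0, -1.0] * 4  # flips every sample: highest possible frequency

passed = moving_average(steady, 2)        # all 1.0: low frequency kept
removed = moving_average(alternating, 2)  # all 0.0: high frequency removed
```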

Electrical Resonance

-Based on capacitance, inductance, & resistance -Traditional bass and treble controls

/ʃ/

(Sibilant) Place: palatal Manner: fricative Voice: voiceless Aperiodic Turbulent Friction, Muscles: Intrinsic Laryngeal Muscle, Manner- presence of aperiodic noise Place- high, Voice- Absence of Phonation

/tʃ/

(Stop & Fricative) Place: palatal Manner: affricate Voice: voiceless Transient Aperiodic & Continuous Aperiodic, Muscles: Styloglossus, Superior Longitudinal, Manner- presence of a silent closure interval, transient release burst; rapid rise/fall time; presence of non-transient fricative noise, Place- F2 transition to/from neighboring sounds, Voice- presence vs. absence of phonation; duration of fricative noise; duration of preceding vowel

/dʒ/

(Stop & Fricative), Place: palatal Manner: Affricate Voice: Voiced Transient Aperiodic, &, Continuous Aperiodic & Periodic Laryngeal Source, Muscles: Styloglossus, Superior Longitudinal, Muscles for Phonation, Manner-presence of a silent closure interval, transient release burst; rapid rise/fall time; presence of non-transient fricative noise, Place- F2 transition to/from neighboring sounds, Voice- Presence versus absence of phonation; duration of fricative noise; duration of preceding vowel.

Which wave to the right has energy at only one frequency?

(The sine wave pic)

Which wave to the right has the largest RMS amplitude?

(The square wave)

Which wave to the right is aperiodic?

(The white noise wave)

Carry-over assimilation

(left to right)- sounds influenced by preceding sounds. Examples Cats> ends in /s/ Dogs > ends in /z/

Diffuse rising burst spectrum?

(low frequency/amplitude or high frequency/amplitude) - Alveolar

What is the Vocal tract Normalization hypothesis?

(re: how perception overcomes inconsistencies of vowel production.) This theory states that our perceptual system pays particular attention to the occurrence of the so called "point vowels" (cardinal vowels) which represent the most extreme configurations for a given individual vocal tract and uses them as anchoring points for judging other vowels. In a sense, the listener "calibrates" perception for the individual variations of formant patterns for vowel production. perhaps, also, visual cues or non speech vocal tract sounds are taken into consideration.

Anticipatory assimilation

(right to left)- sounds influenced by following sounds. Example 'avec' /avek/ vs. 'avec vous' /aveg

Frication

*turbulent noise of a sound* The hissing element of a speech sound, such as an affricate.

Double Incoherent Pressure (dB SPL)

+3

Double Intensity (IL)

+3

Double Coherent Pressure (SPL)

+6
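
The +3 dB and +6 dB figures on these three cards follow from the decibel formulas, as the Python sketch below shows. The function names are assumptions for illustration. The key idea: intensity ratios use 10 x log10, pressure ratios use 20 x log10, and two incoherent sources add in intensity rather than in pressure.

```python
import math


def db_from_intensity_ratio(ratio):
    """Level change in dB for a ratio of intensities (or powers)."""
    return 10.0 * math.log10(ratio)


def db_from_pressure_ratio(ratio):
    """Level change in dB for a ratio of sound pressures."""
    return 20.0 * math.log10(ratio)


# Doubling intensity: about +3 dB.
double_intensity_db = db_from_intensity_ratio(2.0)

# Two coherent (in-phase) sources: pressures add, so pressure doubles: about +6 dB.
double_coherent_db = db_from_pressure_ratio(2.0)

# Two incoherent sources: intensities add, so it is an intensity doubling: about +3 dB.
double_incoherent_db = db_from_intensity_ratio(2.0)
```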

spectrography

- /b/ is mostly in the formant transition - depends on adjoining sounds - depends on position in word - silent interval - coarticulation spread across many phonemes

How many sources does a voiced fricative have? What are they?

- 2 - Upper vocal tract and quasi periodic vibrations of the vocal folds

Burst release (concentration of energy)

- Also known as a stop release/stop production. Oral release yields a transient noise source. - Is a concentration of energy which appears in spectrograms as a vertical spike following the silent gap, is somewhat more intense and thus more conspicuous for the voiceless than for the voiced stops. - These are very brief but often cover a broad range of frequencies with varying intensity.

Semi-vowel

- Articulations and acoustic features resemble those of vowels. - Used in their production form only minimum constrictions in the vocal tract. - Characterized by formant structures similar to those of vowels and diphthongs. - Subdivided into two manners of articulation glides and liquids.

Characteristics of an Obstruent

- Blocked or restricted airflow - Aperiodic sound sources in upper vocal tract - May be voiced or voiceless

Describe the touch receptors on the tongue.

- Can feel touch on 2 separate points on the tongue tip which are only 1-2 mm apart - need to be 1 cm apart on the back or lateral margins of the tongue - superior surface of the tongue is more sensitive

Upper Airway/Vocal Tract

- Closed at one end (glottis) and open at the other end (lips) - average male = 17.5 cm in length

Voiceless fricative the source is described anatomically as? Acoustically as?

- Constriction formed by supra glottal articulators (tongue, palate, lips, teeth) - Continuous aperiodic sound

How do consonants compare to vowels in frequency and amplitude? What are the implications of that for speech perception?

- Consonants are higher in frequency and lower in amplitude - Because consonants carry much of the information in speech, high-frequency noise-induced hearing loss (NIHL) can impede speech perception more than low-frequency losses

Resonances of frication noise vs. aspiration noise?

- Frication: Higher resonances - Aspiration: Lower resonances

F1 equation

34,000 cm/s ÷ 70 cm = 485.7 Hz (speed of sound divided by 4 x the 17.5 cm vocal tract length)

Spectral roll-off (tilt)

- amplitude decreases 12 dB for every octave increase in frequency (causes the slope in the harmonic series) - steepness of closing phase reflects how rapidly the vocal folds close (slope) - Spectral roll-off is a function of the speed of vocal fold closure

evidence for motor theory

- categorical perception - VOT experiments - place of articulation experiments - strict dividing line

Formant

- characteristic resonance of a particular vocal tract - peaks, frequencies with greatest amplitudes

Consonants

- constricted vocal tract - may have alternative sound source - voiced and voiceless

Glottal Source Characteristics

- from vocal fold vibration - consists of harmonics

low frequencies

- human ear is less sensitive - lack of sensitivity becomes more pronounced for softer sounds

Acoustic Characteristics of Radiated Sound

- max gain (increase in intensity) from lip radiation is 6 dB - 6 dB per octave roll-off of radiated acoustic spectrum

What are the features of complex sounds?

- natural sounds are usually complex - every complex sound = composed of simple periodic sounds (Fourier analysis) - Complex periodic - frequencies of the contributing simple periodic sounds are always whole number multiples of the lowest frequency - these whole number multiples are often called harmonics

cues for voiced vs. voiceless

- position matters - duration of the preceding vowel - aspiration (ex: big/bic)

Vowels

- produced by relatively open vocal tract - nucleus of a syllable - all vowels are voiced

wave-surfer identification for VOT

- silence before the release - the release (burst) - the onset of voicing (good F2) - the duration from the release to the onset of voicing

stress timed

- stressed syllables last longer - unstressed syllables show vowel reduction - ex: English - similar to mora timed

What is the frequency composition of aperiodic sounds?

- the 'whole number multiples' does not apply - instead, random distribution of frequency components - there is no f0

Tube/Formant Resonances

- the tube will resonate best (the natural resonant frequency) at a frequency that has a wavelength that is 4x the length of the tube

What is the upper airway (a tube-like structure) responsible for?

- transforming the source sound into speech - the transfer function specifies different vowels

Place of articulation

- Bilabial • Tongue + fixed point of articulation (e.g., lingua-alveolar) • Pharynx/glottis (/h/ is a glottal sound, but it differs from breathing because the glottis is partially open, whereas in breathing it is completely open)

highlighted equation under maximal gain

- 12 dB per octave roll-off at the source (larynx) - 6 dB per octave gain from radiation at the lips - Net result: 6 dB per octave roll-off of the radiated acoustic spectrum

Sampling Theorem

-A signal may be represented exactly if it is sampled at at least twice its highest frequency; half the sampling rate is called the Nyquist frequency -It's guaranteed that, upon playback, all distortion will be above the Nyquist frequency -So, if upon playback the signal is low-pass filtered at the Nyquist frequency, the signal will be reproduced perfectly (in terms of frequency, not always amplitude)
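To see what goes wrong when the sampling rate is too low, the sketch below computes the apparent (aliased) frequency of a sampled pure tone. It assumes an ideal sampler with no anti-aliasing filter; the function names are illustrative and not from the source:

```python
def nyquist_frequency(sample_rate_hz):
    """Highest frequency that can be represented at a given sampling rate."""
    return sample_rate_hz / 2

def alias_frequency(signal_hz, sample_rate_hz):
    """Apparent frequency of a pure tone after ideal sampling.

    Tones at or below the Nyquist frequency are unchanged; tones above it
    fold back into the range 0..Nyquist.
    """
    folded = signal_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

print(nyquist_frequency(10000))      # 5000.0
print(alias_frequency(3000, 10000))  # 3000 (below Nyquist, preserved)
print(alias_frequency(7000, 10000))  # 3000 (above Nyquist, aliased)
```

This is why the analog signal is low-pass filtered before sampling: a 7000 Hz component sampled at 10,000 Hz would be indistinguishable from a 3000 Hz one.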

In what way is a sine wave simple?

-All energy is at one frequency!

Complex wave

-Anything that's not a sine wave! -Expressed as a sum of sine waves, each with its own amplitude, frequency, and phase -Fourier analysis converts a time-domain complex wave to the frequency domain

articulatory sequence

-Articulators move to produce consonants -Movements overlap (coarticulation) -Usually V and C movements -Sometimes C and C

Description of Pitch Contours

-Average F0 = Male 120 Hz, Female 225 Hz -Sentence level variation because you run out of air -Linguistic variations used to express linguistic meaning and intent -F0 to express emotional states

Moving Average Filter

-Based on a sliding window of averaged samples. -Original signal is a list n samples long -Filtered signal is a list n minus w (window length) samples long -Each sample in the filtered signal is the mean of the previous w samples from the original list.
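The description above can be sketched in a few lines of Python. This follows the card literally: each output value is the mean of the previous w input samples, so the output is n minus w samples long. The function name is illustrative; note that many practical implementations also include the current sample in the window and return n - w + 1 values instead.

```python
def moving_average(samples, w):
    """Return a list in which each value is the mean of the previous w samples."""
    if w <= 0:
        raise ValueError("window length must be positive")
    # Output index i corresponds to the mean of samples[i - w] .. samples[i - 1]
    return [sum(samples[i - w:i]) / w for i in range(w, len(samples))]

print(moving_average([1, 2, 3, 4, 5, 6], 2))  # [1.5, 2.5, 3.5, 4.5]
```

Averaging neighboring samples smooths rapid fluctuations, which is why this acts as a simple low-pass filter.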

dB HL

-Based on threshold of audibility of typical individual at particular frequencies. -Basis of the audiogram

Analog to Digital conversion and Processing of Speech

-Beginning: Going from air pressure, to digits, back to air pressure -Low-pass filter guarantees that nothing can get through that's higher in frequency than the system can deal with -Analog to Digital: Nyquist/sampling theorem -Optional Processing: Speeding up, slowing down. Commercials that speed up fine print at the end do this.

Primary Spectral Energy

-Bilabial: 500-1500 Hz -Alveolar: above 4000 Hz -Velar: 1500-4000 Hz

Spectrum of Release Burst + Aspiration

-Bilabials (p,b): energy broadly distributed across all frequencies or concentrated in lower frequencies -Alveolars (t, d): rising spectral envelope -Velars (k,g): Mid-frequency range contains the most energy -Try saying the plosives in a whisper to hear the frequency of the burst. •Pitch of /t/ is highest and /p/ is lowest. •Variable intensity with /t/ loudest

Utterance Levels

-Breath Group (F0 will go down, 1 curve) -Phrase Level (2 curves, up-down, up-down) -Word Level (multiple curves) -Phoneme/Syllable Level (breaks in curve)

RMS level is the foundation of other amplitude measures

-Common in speech and hearing is the decibel -Decibel is logarithmic scale -We hear vast range of levels -Decibel is a ratio or comparison -Based on 20 micro pascals RMS

Analog Representation of Speech

-Continuous (every time has a value) -Simple Equipment -Trouble with noise and distortion -Difficult to maintain -Inflexible

Physical Characteristics of Cavity Resonators

-Direct relationship between cavity length and its resonant frequencies -Basis of relationship is the cavity length and wavelength of sound -Wavelength ( λ ) is defined as the distance traveled by a periodic wave during one repetition of the fundamental frequency

Digital Representation of Speech

-Discrete (only specific times have values) -Complex Equipment -Noise and distortion as low as desired -Easy to Maintain -Flexible

Pascal and Micropascals

-Ear can hear air pressure vibrations between 20 micropascals and 20 pascals -Analogous to Alternating Current (AC) -Vibration measure is RMS

VOT

-Exists on a continuum -F0 is also an acoustic cue -F0 tends to go down in anticipation of closure for voiced and voiceless stops -After voiceless stops, F0 is elevated momentarily. F0 remains flat after voiced stops

Acoustic Analysis

-F0 is produced by the vocal folds -F0 is the lowest frequency -VFs also produce harmonics

/l/

-F1 - 360 Hz -F2 - 1300 Hz -F3 - 2700 Hz

Sound Pressure dB SPL

-Force on a surface area perpendicular to the direction of the sound -Standard reference level is 20 µPa (or 2 x 10^-5 Pa; or .00002 Pa) -Most meaningful for SLPs and Audiologists -Easiest to measure (just a microphone is required) -dB SPL = 20 log10(P/Pref), where Pref = 20 µPa

Lingual-Alveolars (s,z)

-Greater degree of constriction (narrower channel) yields higher energy and higher frequency noise. -Energy concentrated above speaker's F4

Human tolerance of sound

- Intensity ratio between the faintest sound we can hear and the loudest we can tolerate is 1 to a trillion (1x10^12) - 120 dB is the approximate range of intensity that human hearing can perceive and tolerate - The eardrum could rupture if exposed to 160 dB of sound!

Simple-Complex Waveform

-It is relatively easy to add sine waves to make a complex wave -more difficult to extract component sine waves from a complex -Basically what we do when we listen to speech

Spectrogram Info

-Low vowels ( ɑ, ɔ ) have a high F1 -High vowels ( i, u ) have a low F1 -Front vowels ( i, ɪ, ɛ ) have a large distance between F1 & F2 -Back vowels ( ɑ, ɔ, o, u ) have a close F1 and F2.

The external auditory meatus resonates at what frequency?

3450 Hz (3000-4000 Hz).

Digital representation of Speech

-Most accessible representation of speech is the air-pressure waveform -most powerful tools for speech analysis, synthesis, manipulation, and training are computers -First step=convert an air-pressure waveform to computer readable digital waveform

Lingual-Palatal (ʃ, ʒ)

-Narrow constriction (further back than lingual-alveolars) -High frequency noise (not as high as alveolars) -Energy concentrated above speaker's F3

silence (stop gap)

-Occlusion to release -Voiceless stops: complete silence -Voiced stops • varying amount of silence (depending upon transglottal flow) • Voicing is low amplitude due to damping. • Seen as voice bar on spectrogram

Tube closed at one end and open at the other (/ə/)

-One quarter wavelength fits -So does three quarters, and five quarters... (pattern) -Formula: F_n = (2n - 1) x c / (4L), where c is the speed of sound and L is the tube length
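The quarter-wavelength pattern (1/4, 3/4, 5/4, ...) can be turned into resonance frequencies for a uniform tube closed at one end. A minimal Python sketch, assuming the 17.5 cm average male vocal tract and the 34,000 cm/s speed of sound used elsewhere in this set; the function name is illustrative:

```python
def quarter_wave_resonances(length_cm, count=3, speed_cm_per_s=34000):
    """Resonances of a uniform tube closed at one end and open at the other.

    The n-th resonance is (2n - 1) * c / (4 * L): odd multiples of the
    frequency whose wavelength is four times the tube length.
    """
    return [(2 * n - 1) * speed_cm_per_s / (4 * length_cm)
            for n in range(1, count + 1)]

formants = quarter_wave_resonances(17.5)
print([round(f, 1) for f in formants])  # [485.7, 1457.1, 2428.6]
```

These values are close to the roughly 500, 1500, and 2500 Hz formants usually quoted for schwa in an adult male vocal tract.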

De-Constructing Speech Sounds

-Primary de-construction tool is Fourier analysis -Any complex, periodic wave can be broken down into a sum of simple sinusoids with appropriate frequencies, amplitudes and phases. -These component waves may be combined and displayed as a spectrum

Coherent Sounds

-Pure tones or sounds that are mostly composed of a pure tone -Speakers wired in parallel -Amplifiers -Noise-canceling headphones

RMS measures and Electricity

-RMS corresponds to the equivalent DC amplitude (the DC level that would deliver the same power) -120 volts AC household voltage = 120 volts RMS, which is the same as 170 volts peak.
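The 120 V RMS and 170 V peak figures are related by a factor of the square root of 2, which holds for sine waves only. A short Python sketch, with illustrative function names, that checks the relationship both analytically and by computing RMS from samples:

```python
import math

def rms(samples):
    """Root-mean-square: square each sample, average, then take the square root."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def sine_peak_from_rms(rms_value):
    """Peak amplitude of a sine wave with the given RMS value."""
    return rms_value * math.sqrt(2)

print(round(sine_peak_from_rms(120), 1))  # 169.7

# One full cycle of a unit-peak sine, sampled at 1000 evenly spaced points
cycle = [math.sin(2 * math.pi * k / 1000) for k in range(1000)]
print(round(rms(cycle), 4))  # 0.7071
```

For speech, which has occasional high peaks, the ratio of peak to RMS is larger than for a sine wave.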

White Noise Wave

-Random waveform -Equal energy at all frequencies -Often created by air turbulence -No fundamental frequency, no harmonics, aperiodic -In the time domain it's a bunch of squiggles; in the frequency domain it's a flat line at the amplitude

Crest Factor

-Ratio of peak value to RMS -RMS is lower for speech than sine given the same peaks

Frequency Domain

-Requires two graphs: Amplitude and Phase -Amplitude most important, phase required for completeness (to recreate wave) -Time is discarded (infinitely repeating signal assumed)

How do you know whether the sampling rate is fast enough to represent all the "bumps" in the waveform?

-Sample at More than twice the highest frequency in the signal

Digitization Parameters

-Sampling Rate: determines the range of frequencies that can be represented (up to half the sampling rate) -Bits of Quantization: determines amplitude resolution

Tube Models of the Vocal Tract

-Simple tubes may be used to model the vocal tract -Bends in the vocal tract have little effect on resonance -Simplest tubes provide a reasonable model of the schwa vowel

Complex Waves Summary

-Simplest unit of sound is the sine wave. Sine waves may be combined to create any periodic sound, such as the glottal source. -White noise is aperiodic and can be generated by air turbulence in the mouth. -Source-filter theory of speech production: models speech as a combination of a sound source and a filter

Acoustic Cue for manner

-Slow formant transition (75-250 ms) •compared to diphthongs 350 ms

Square wave

-Some complex waves are typified by the combination of F0 and specific harmonics. -Formed by F0 PLUS ODD HARMONICS
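To illustrate how F0 plus odd harmonics builds a square wave, the sketch below sums odd harmonics with amplitudes proportional to 1/k. The 1/k weighting is the standard Fourier series for a square wave and is not stated on the card; the function name is illustrative:

```python
import math

def square_wave_partial_sum(t, f0, n_harmonics):
    """Approximate a unit square wave at time t (seconds) from odd harmonics.

    Uses harmonics k = 1, 3, 5, ... of f0, each with amplitude 1/k,
    scaled by 4/pi so that the sum approaches +1 and -1.
    """
    total = 0.0
    for i in range(n_harmonics):
        k = 2 * i + 1  # odd harmonic number
        total += math.sin(2 * math.pi * k * f0 * t) / k
    return 4 / math.pi * total

f0 = 100.0
quarter_period = 1 / (4 * f0)
for n in (1, 5, 500):
    # With more odd harmonics the value at the quarter period approaches 1
    print(n, round(square_wave_partial_sum(quarter_period, f0, n), 3))
```

With only one harmonic the result is a plain sine wave; adding more odd harmonics flattens the tops and steepens the edges.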

Sound Power

-Sound energy per unit time (usually in watts) generated by a sound source -Standard reference level is 10^-12 Watts -Rate of flow of energy

Sound Intensity dB IL

-Sound power per unit area (usually in Watts/m^2) -Rate of flow of energy -dB IL = 10 log10(I/Iref), where Iref = 10^-12 Watts/m^2

Turbulence noise production

-The constriction functions as a nozzle -air exiting the constriction forms a jet -as the jet mixes with surrounding air, turbulence is generated -turbulence is associated with eddies

Compression & Rarefaction

-The force of an impulse or vibration pushes out against air molecules. -This carries compression away from the source- leaving rarefaction in its wake. -If air had no elasticity, the effect of an impulse would stop at its point of initiation.

Why do objects make the sounds that they make?

-They are mechanical systems. -They have properties that create vibration (Periodic & aperiodic) -Periodic vibration usually involves three properties of the object (Mass, resistance, stiffness).

Traditional Description of Vowels

-Tongue Height (High/Mid/Low) -Tongue Advancement (Front/Back) -indistinct articulation

covert contrast

-the speaker produces a measurable, reliable distinction between two sounds, but listeners do not readily perceive the contrast the child produces -speaker has more knowledge of sound contrast than we think based on transcription alone--shown by instrumental analysis -need to be measured multiple times

Affricate

-tʃ, dʒ -have a stop gap followed by intense frication

When you strike a mechanical system it creates....

-velocity, acceleration and displacement (which are based on the physical properties mass, resistance, and stiffness.)

nasals

-velum is lowered so that sound and air can pass into the nasal cavity, creating this sound -all voiced

v, ð

-voiced -low energy, diffuse spectrum -voicing feature adds a prominent concentration of energy in the low frequency region.

f, θ

-voiceless -low energy, diffuse spectrum -high frequency

Glide

-w, j -F1 starts very low and then rises to the F1 of the following sound -F2 and F3 also begin similar to those of /i/ and /u/ and shift toward the F2 and F3 values of the following sound

five lax vowels

/ɪ/, /ɛ/, /ʊ/, /ə/, and /ɝ/

r

/_/ has the lowest F3 of any English sounds

Low, back vowel

/a/

high F1 and low F2

/a/ has ...?

high F1 and high F2

/ae/ has...?

three neutral

/ae/, /a/, and /ɑ/

nonsibilants

/f v θ ð h/ less noise energy than sibilants; formant transitions are the primary acoustic cue; voiced nonsibilants will have quasi-periodic pulses; noise spectra are fairly flat and diffuse and are expected to have antiformants due to the narrow constriction

fricatives

/f, v, s, z, ʃ, ʒ, θ, ð, h/ lower intensity than vowels, aperiodic features, wide frequency range, no clear formant frequencies, voiced fricatives have vertical striations

What phonemes are non-sibilant

/f/ /v/ /θ/ /ð/

Which phonemes have flat, low-frequency spectral peaks

/f/ /v/ /θ/ /ð/ (nonsibilants; anterior)

Consonants: Fricatives

/f/ /v/ /θ/ /ð/ /s/ /z/ /ʃ/

Give an example of how a single articulation can produce 2 different speech sounds.

/f/ and /v/ are articulated the same way, but there is phonation present during the production of /v/ and not during the production of /f/

Difference between the weak, voiceless fricatives /θ/ and /f/

/f/ energy concentration is at 500+ Hz; /θ/ energy is at 2000+ Hz with a trailing tail. -Both have light noise and longer duration.

Consonants: Glottals

/h/

what is the glottal fricative

/h/

what is the voiceless version of preceding or following vowel

/h/

Glottal Fricatives

/h/ - vocal folds are narrowed but don't vibrate - Frequency range: depends on the surrounding sounds - low-amplitude acoustic energy on its own - can be completely coarticulated; it depends on the vowel you are producing - acoustic energy in the vowel range of /i/ or /u/ - place is the glottis

/j/ has similar formant structure to which vowel?

/i/

High, front unrounded vowel

/i/

The glide /j/ is initiated as a vowel-like sound that is similar to the vowel...

/i/

low F1 and high F2

/i/ has...?

Name the point vowels and include a description of their general formant pattern in relation to this front/high/tense vowel

/i/, /a/, /u/ - Low F1, High F2 for /i/, higher F1 lower F2 for /a/; Higher F1, Lower F2 for /u/

seven tense vowels

/i/, /e/, /ae/, /u/, /o/, /ɔ/, and /ɑ/

/ng/ has a very similar formant patterning to

/k/ and /g/

nasals

/m n ŋ/ low intensity; frequency is about 300 Hz; short duration; voiced, with vertical striations; low-intensity formants (antiformants)

Which nasal is the lowest in frequency and shortest in duration?

/m/

Consonants: Nasals

/m/ /n/ /ŋ/

nasals

/n, m, ŋ/ •Occlude oral cavity & open velopharyngeal port; all nasals = voiced •Formant structure & can be syllabic, like vowels, but have significant constriction. •Acoustic evidence for manner: - Nasal murmur = very low F1 (250-500 Hz) (large nasal resonating space & narrow opening) - Low energy of all formants (high damping) - F2, F3 vary

Which nasal is higher in frequency and slightly longer in duration

/n/

Which nasal is highest in frequency and longest in duration?

/ng/

voicing

/p t k/ with a VOT of 25-80ms (mean of 45ms) are phonetically distinguished from /b d g/ with a VOT of -20 to 20ms (mean of 10ms) by what?

Consonants: Stops

/p/ /b/ /g/ /t/ /k/ /d/

On which phonemes does aspiration occur in English?

/p/ /t/ /k/

/m/ has a very similar formant patterning to

/p/ and /b/

The brief cessation of airflow emitted from the vocal tract underlies the acoustic period of silence characteristic of

/p/, /t/, and /k/.

Language Specific for English Stops

/p/,/t/,/k/

What are the three acoustic characteristics of stressed syllables?

1. Higher F0 for the heavily stressed syllable (higher pitch) 2. Greater duration of the stressed syllable (longer) 3. Greater intensity for the heavily stressed syllable (louder)

Acoustic Features of Suprasegmentals

1. Pitch (Fo) and Pitch Variation = Intonation 2. Loudness (intensity) and Loudness Variation = Stress 3. Duration (length) 4. Pausing (rhythm perception of the brain; makes you uncomfortable when disrupted) 5. Patterns -Tonation -Intonation (change in pitch) -Stress/Emphasis -Duration -Rate

Acoustic Production/Perception Cues

1. Place of Valving: where the constriction is 2. Degree of Valving: size of the space air goes through 3. Duration of Valving: how long air is pushed through 4. Voicing Overlay: combination of aperiodic and periodic sounds 5. Formant Transitions 6. Rise Time

Four Acoustic characteristics in temporal order?

1. Silent closure 2. Transient aperiodic release burst 3. Continuous aperiodic frication 4. Continuous aperiodic aspiration

Stop Features

1. Stop Gap 2. Noise Burst 3. Aspiration 4. Voice Onset Time 5. Transitions 6. Stop Gap 7. Release (+/-)

Perception of Transition Duration

1. Stop: 40-60 msec 2. Glide: 60-100 msec. 3. Vowel + Vowel >100 msec.

What Three forces act on the vocal folds for vibration

1. Stress 2. Strain 3. Shearing

Cues to Nasal Manner (Nasal Murmur)

1. Voiced 2. Low intensity (softer than the neighboring vowels; meaning lower amplitude) 3. Relatively steady state formants 4. Low frequency resonance (usually below 500 Hz but often below 300 Hz)

Sound Visualization

1. Waveform (amplitude/time) 2. Spectrum (amplitude/frequency) 3. Spectrogram (frequency/time/intensity)

Nasal Acoustic Information

1. Weak Formants -Anti-Resonances -Nasal Cavity Damping (absorption of sound makes the bandwidths wider and less distinct) 2. Nasal Murmur -Extremely low formants (below 500 Hz F1) 3. Place of Articulation: F2 and F3 -Wide formants; blend into each other -F2 transition similar to stops 4. Vowel Coloring: blend of a vowel and the consonant after it (one sound takes on the characteristics of another) 5. Nasality (resonance)= sound coming out of nasal cavity 6. Nasal Emission (flow) = air coming out of nasal cavity

What are the several categories that models fall into?

1. a strong linguistic basis and emphasis 2. the goal of speech production is to attain one or more targets 3. a focus on the role of timing in speech 4. a focus on the role of feedback in speech

Explain one famous experiment by Liberman et al. that tested categorical perception. Two separate procedures were tested "in counterbalanced fashion":

1. An identification task, using either forced-choice (multiple-choice) or open response sets. 2. A discrimination task: which is most similar to X, A or B? The results showed critical (categorical) boundaries only between certain stimuli in the sets, not between the majority of them, suggesting crucial points where perception "flip-flops" between different speech sounds. This result was the same regardless of the order of the tasks, prior knowledge of the stimuli, or forced-choice/open-set format. Other possible variations in categorical perception are distinctions in manner and voicing.

What are the other classifications of assimilation?

1. anticipatory (or right-to-left) assimilation 2. carry over (or left-to-right) assimilation 3. assimilation of manner of production

What 4 kinds of info are available to a speaker for feedback?

1. auditory 2. tactile 3. proprioceptive 4. central neural

What does changing F0 do in terms of intonation?

1. expresses differences in attitude ("that's a pretty picture.") 2. use of rising intonation can turn a sentence into a question ("today is tuesday")

manner cues for fricatives

1. have aperiodic noise component 2. duration of the noise is relatively longer than aspiration noise after voiceless stops or the fricative noise in affricates

What are the 4 places at which constrictions are created for fricatives?

1. labiodental 2. linguadental 3. alveolar 4. palatalveolar

Describe different examples of Linguistically Oriented Models.

1. model of speech production using phonetic and physiologic data together 2. examined spectrograms and derived a set of 12 features of speech (voiced and voiceless) and assigned each speech sound one specific feature - a "distinctive feature analysis" 3. redesigned the distinctive feature system in articulatory terms: sounds were labeled "rounded", "high tongue", etc.; they generated a set of 27 features based on cavity, manner of articulation, and source 4. Linking speech perception to production: speech sounds are encoded in the acoustic signal due to how they are produced (first to really propose this)

What are the 2 differences between non resonant consonants vs resonant consonants?

1. nonresonants lack formant structure and openness of resonants 2. audible noise is present in nonresonants

place cues for affricates

1. palatal (post alveolar) 2. stop burst and formant transition (F2)

What are the essential components of the motor theory of speech production?

1. perception makes use of articulatory knowledge of speech production 2. speech makes use of special perceptual properties. for example, categorical perception (speech sounds are based on critical perception boundaries; here we see a significant parallel with the quantal theory). another special property would be to produce speech.

What are the contributing cues for manner in stop plosives?

1. presence of a spike (across the entire frequency domain); by itself perceived as a "pop" or "click" - aperiodic, very short noise. "pop talk" 2. presence of a burst following the spike (which in the case of a [p] sound is a minimal form of aspiration that resonates right between the lips when they open). 3. presence of a VOT, VTT, or stop gap, which is the use of a stop in the middle of a sound sequence (either long or short based on plus or minus voicing). 4. short duration of F2 adjustments in vowels; the abruptness of the plosion causes F2 adjustments also to be brief and quick in nature.

Spectrogram

3D visualization of sound (frequency, time, intensity)

Speech production systems may be decomposed into what?

1. sound source and 2. filter

manner cues for affricates

1. stop gap (silence) followed by a burst, then a sharply rising fricative noise 2. both the rise time (amplitude) and duration (length) of a full fricative are about double those of the fricative portion of an affricate.

voicing cues for fricatives

1. voiced fricatives have a periodic component (F0) fundamental frequency 2. voiceless fricatives are not associated with voicing.

voicing cues for final stops

1. voicing during closure is the most salient cue for stop voicing in the final position; voiced stops have voicing during the stop gap 2. duration of the stop gap; final voiced stops have a longer stop gap than their voiceless counterparts 3. length of the preceding vowel; vowels are shorter before a voiceless stop than before a voiced stop at the same speaking rate 4. F1 falls at the end of the vocalic portion of a voiced stop

voicing cues for medial stops

1. voicing during closure is the most salient cue for stop voicing in medial position; voiced stops have voicing during the stop gap 2. duration of the stop gap; unstressed medial voiced stops have shorter stop gaps than their voiceless counterparts 3. length of the preceding vowel; typically vowels are shorter before a voiceless stop than before a voiced stop at the same speaking rate 4. F1 transition for voiced stops

voicing cues for initial stops

1. initial voiceless stops have longer VOT than voiced stops 2. low vs high starting position of F1: initial voiceless stops have a higher starting position of F1 than voiced stops 3. relatively large vs relatively small F1 change: initial voiceless stops have a smaller F1 change than voiced stops 4. voicing during the stop gap can also be a cue; initial voiced stops can have voicing during the stop gap

in breathing for speech, the breathing in cycle takes about - percent of the duration of the whole cycle

10

If you have the harmonics 200, 300, and 400 Hz, then the fundamental frequency is...

100 Hz (take the difference between adjacent harmonics)
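The same answer can be computed for any set of harmonics. The sketch below swaps in a slightly different technique from the card: instead of subtracting adjacent harmonics, it takes the greatest common divisor, which also works when the listed harmonics are not adjacent. It assumes whole-number frequencies in Hz, and the function name is illustrative:

```python
from functools import reduce
from math import gcd

def fundamental_from_harmonics(harmonics_hz):
    """Largest frequency of which every listed harmonic is a whole-number multiple."""
    return reduce(gcd, harmonics_hz)

print(fundamental_from_harmonics([200, 300, 400]))  # 100
print(fundamental_from_harmonics([300, 500, 700]))  # 100
```

Note that F0 itself (100 Hz) does not have to be present in the list for the harmonics to imply it.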

general roll-off

12 dB per octave

Triangular wave source (Sound From Larynx) (U); loss

12 dB/octave roll-off

doubling of frequency

12 dB/octave roll-off, so we lose 12 dB per octave

F2 locus for /d/ (average adult male)

1800 Hz F2 transition direction depends on the F2 *F2 locus- when you have a release gap and it goes to the following vowel

When was Delayed Auditory Feedback discovered?

1950

1. bite-block, 2. artificial palate, 3. sudden occlusion of the airway, and 4. sudden mechanical perturbation to the jaw or lip

4 examples of perturbation studies

Syllable initial prevocalic stop CV

1st Closure (Stop gap)- vocal tract closes (no sound) (voiced may have vocal fold vibration) (exhaling) 2nd Release of articulatory constriction- air pressure comes out fast (release quickly), perceived as a noise burst (aspirated or unaspirated) 3rd transition- transition to the vowel (formant transition)

Syllable final postvocalic stop VC

1st transition- vocal tract is moving constriction 2nd closure- (stop gap)- may release stop or not 3rd release (noise burst) or no release noise burst

1. path, 2. trajectory

2 components of spatial dimension

electroglottography

2 electrodes, one on each side of the thyroid cartilage; 1 electrode emits a low current that is transmitted to the other electrode when the VFs are in contact; the signal reflects VF contact area; air is an insulator, so current won't pass

Coarticulation

2 or more articulators move at the same time to produce 2 or more phonemes

1. coarticulation, 2. suprasegmental factors

2 reasons there is not a one-to-one relationship between the acoustic features and the perceived consonant?

closed glottis = ?

2 sources of sound: periodic sound of phonation and aperiodic sound of airstream passing through constriction = voiced fricative

Closed Glottis

2 sources of sound: periodic sound of phonation and aperiodic sound of airstream passing through constriction

What is the reference value of dB SPL?

20 micro Pascals.

What is the frequency of a nasal murmur?

200-300 Hz

what are our most sensitive frequencies?

2000-5000 Hz

The eardrum strengthens the signal by about ____ dB

25 dB

What is the amount of time for voice onset?

25 ms

1. acoustic targets, 2. articulatory gestures, 3. aerodynamic pressures

3 theorized output targets

Overall the middle ear provides a boost of perceived sound level by about ____ dB

30 dB

F2 locus for /g/ (average adult male)

3000 Hz, F2 transition always falls on the vowel's F2

The speed of sound in air is approximately...

34,000 cm per second, 1125 Feet per Second, 767 Miles per Hour (Not the vacuum option)

A classroom had two identical window air conditioners. The sound pressure level from each of them measured alone was 73 dB. What is the level with both running?

76 dB (two equal incoherent sources add 3 dB)
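The answer comes from adding the two sources in intensity, not in decibels. A minimal Python sketch, assuming the sources are incoherent (unrelated noise) so that their powers add; the function name is illustrative:

```python
import math

def combine_incoherent_levels(levels_db):
    """Total level in dB of several incoherent sources.

    Each level is converted to a relative intensity, the intensities are
    summed, and the sum is converted back to decibels.
    """
    total_intensity = sum(10 ** (level / 10) for level in levels_db)
    return 10 * math.log10(total_intensity)

print(round(combine_incoherent_levels([73, 73]), 2))  # 76.01
```

The same function shows that a much quieter source adds almost nothing: combining 73 dB with 53 dB gives a total only slightly above 73 dB.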

F2 locus for /b/ (average adult male)

800 Hz, F2 transitions always rises to the vowel's F2

How many dB SPL is a .25 Pa sound?

81.94 dB SPL. Math: 20 x log10(.25/.00002) = 20 x log10(12500) = 20 x 4.0969 = 81.94 dB SPL
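The same calculation as a small Python helper, using the 20 micropascal reference given elsewhere in this set. The function and constant names are illustrative:

```python
import math

P_REF_PA = 20e-6  # reference pressure: 20 micropascals

def pascals_to_db_spl(pressure_pa):
    """Convert an RMS sound pressure in pascals to dB SPL."""
    return 20 * math.log10(pressure_pa / P_REF_PA)

print(round(pascals_to_db_spl(0.25), 2))   # 81.94
print(round(pascals_to_db_spl(20e-6), 2))  # 0.0  (threshold of hearing)
print(round(pascals_to_db_spl(20), 2))     # 120.0 (upper end of the range)
```

The last two lines reproduce the 20 micropascal to 20 pascal hearing range as 0 to 120 dB SPL.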

Voiced VOT

< 20 msec (may be between 1 and 19 msec) - may be 0 - may be negative: voicing begins before the articulatory release (prevoicing) - may not be able to measure VOT if voicing is continuous throughout the stop gap - Manner and voicing cue

Timing for VOT

< 20 msec voiced > 20 msec voiceless
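As a rough illustration of the 20 msec rule, the sketch below labels a measured VOT using the voicing lead / short lag / long lag categories listed earlier in this set. The boundary is a parameter because other cards quote 25 or 30 msec, and assigning a value exactly at the boundary to the voiceless category is an arbitrary choice made here; the function name is illustrative:

```python
def classify_vot(vot_ms, boundary_ms=20.0):
    """Label an English stop as voiced or voiceless from its VOT in milliseconds."""
    if vot_ms < 0:
        # Voicing starts before the release (prevoicing)
        return "voiced (voicing lead)"
    if vot_ms < boundary_ms:
        return "voiced (short lag)"
    return "voiceless (long lag)"

for vot in (-15, 10, 45):
    print(vot, classify_vot(vot))
```

Real perception is not this abrupt for every listener or context, but categorical perception experiments show a fairly sharp shift near the boundary.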

voiced

<20 ms

energy=

=amplitude

voiceless

>25 ms

Fourier Analysis (prism)

A prism can subdivide light into its component frequencies.

3dB Down Point/Half-Power Point

A reduction in intensity of one half is equal to a decrease of about 3dB. The frequency at which the intensity is 3 dB less than the peak intensity of the resonant frequency.
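The half-power points can be located on a measured resonance curve by finding where the level falls 3 dB below the peak. A coarse Python sketch, assuming a single-peaked curve sampled at discrete frequencies and no interpolation between them; the function name and the example data are hypothetical:

```python
def half_power_bandwidth(freqs_hz, levels_db):
    """Return (lower, upper, bandwidth) of the region within 3 dB of the peak."""
    peak = max(levels_db)
    inside = [f for f, level in zip(freqs_hz, levels_db) if level >= peak - 3]
    return min(inside), max(inside), max(inside) - min(inside)

# Hypothetical resonance curve peaking at 500 Hz
freqs = [400, 450, 500, 550, 600]
levels = [50, 58, 60, 57.5, 49]
print(half_power_bandwidth(freqs, levels))  # (450, 550, 100)
```

A heavily damped resonator has a wide bandwidth by this measure, and a lightly damped one has a narrow bandwidth.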

Stop on Spectrogram

A Complete STOP in airflow (Complete Obstruction of Airflow) Obstruction is followed by a large wide-band burst of energy as the obstruction is released

An acoustic tube model of the vowel /i/ has formant frequencies determined by the resonances of what type of tubes?

A Helmholtz resonator plus one tube open at both ends plus one tube closed at both ends

Aspiration

A brief hiss of air following the burst - Present for voiceless stops (pie), not for voiced stops (buy) - No aspiration in s‐clusters, likely due to persistence of lack of voicing (stop) - Release of a voiceless stop may occur with or without aspiration (pie vs apple)

Anti resonance

A filtering effect of the vocal tract characterized by loss of acoustic energy in a particular frequency region.

What is a spectrum?

A graph of amplitude over frequency - it provides an analysis for a single point in time (line spectrum)

What is a limitation to the Peterson and Barney study?

A limitation of the Peterson and Barney study is that formant frequencies were calculated from isolated and deliberate productions. It also showed that even in these unnatural conditions, some vowels still have overlapping formant configurations.

How are thresholds of hearing and equal loudness scales determined?

A logarithmic scale compresses the range

Sound intensity (dB IL)

A measure of power, doesn't depend on size or shape of environment, (dB IL equation)

How to produce a fricative

A narrowing of the vocal tract. Air moves faster through narrow areas because of the constriction.

What is a quasi-periodic tone?

A pattern that repeats itself at almost regular intervals

What is periodicity?

A pattern that repeats itself.

Aspiration

A period of voicelessness after stop release

How does perception overcome the challenges of these inconsistencies of vowel production ( re: limitations of simple target theory)?

A possible explanation for the successful consistent perception of otherwise variable stimuli is the "vocal tract normalization" hypothesis by Liberman.

Unreleased stop

A stop without the release burst, usually at the end of a word

Speech models fall into 4 categories:

A strong linguistic basis and emphasis The goal of speech production is to attain one or more targets A focus on the role of timing in speech A focus on the role of feedback in speech

Spectrogram

A type of short-term running spectrum in which sounds are analyzed in a 3D pattern of time, frequency, and amplitude. Shows the pattern of energy in phonemes. Intensity is shown in different shades of gray. Darkest areas = higher intensity or amplitude; region of energy. Shows the acoustic correlates of information sources in speech.

In the vowel perception theory, what is the only certainty?

A vowel identity is somehow coded in its formant configuration (relative frequency positions) of formants 1 and 2 (much of the following applies to semivowels, as well). precisely how we extract vowel perception from these formants (through hearing or in a neurological sense) is not entirely known. there are many discrepancies with respect to this so called "Simple target theory"

Voiceless consonants have...

APERIODIC laryngeal source; supraglottal noise sources

What is the threshold of hearing in dB SPL?

About 0 dB SPL

Affricates

Acoustic Cues: 1. Duration: 75-130 msec. 2. Rise Time: 33 msec. -Transitions: 73-150 msec. -Spectrograms: looks like a fricative, but shorter -Pre-vocalic is shorter than post-vocalic (e.g. "judge")

Vocal Tract Length

Adult male: 17-18cm Adult female: 14-15cm Young child: 6-8cm and shape differs

Offglide

After transition-relatively steady state formants

Vocal Tract Transfer Function

Air particles vibrate most effectively at the open end of the tube (air moves freely), and least effectively at the closed end of the tube • The open end will have a velocity maximum (pressure minimum) • The closed end will have a velocity minimum (pressure maximum)

Egressive

Airflow is an outward flow from the lungs

Open Glottis

Airstream is only audible at the point of constriction

Output Spectrum

All spectrums put together

Coarticulation

Simultaneously articulating more than one phoneme. Anticipatory (forward) & retentive (backward)

Air Pressure

Alternating pressure transfers energy much better than static pressure

Which places of articulation of fricatives have high energy formants

Alveolar and postalveolar

Which places of articulation of fricatives have narrower bands of high frequency

Alveolar and postalveolar

Describe the sensory receptors on the alveolar ridge.

Alveolar ridge has more reception than posterior part of the palate

In which stop is the front cavity short, giving high-frequency energy?

Alveolar stops

What stop has a high F2 (1800 Hz) and F3

Alveolar stops

What is the Y axis on a spectrum?

Amplitude

What is the Y axis on a waveform?

Amplitude

Describe the movement of an affricate.

An alveolar closure is made (like for /t/ or for /d/), then the closure is released and the tongue retracted to the postalveolar position on the palate, with the same shaping as for production of /sh/ and /zh/

What are acoustic cues?

An aspect of the acoustic signal that has a role in distinguishing between one phoneme and another.

Compression, pathologies, loudness during adduction...

affect resistance * In a healthy cycle this is predictable

Name given to sharp reduction in amplitude; when sound is absorbed by the shunt?

Anti-Formants

How is resonance influenced by our "variable resonator?"

Any cavity has the tendency to produce a standing longitudinal waveform as its resonance. This waveform (and its odd multiples) possesses a number of "critical points" for resonance; resonance is changed when a structure is deliberately moved into one of those spots. These points are those where there is "maximum pressure" or "maximum velocity of movement". If something blocks or narrows a point of maximum pressure (p), this will raise that particular formant frequency. If an area of maximum velocity (v) is penetrated, the formant moves downward in frequency.

Release Burst

Aperiodic sound following silent gap

Lateral Liquid

Area behind tongue acts as shunt resonator = antiformants. (Whenever there's an extra cavity, you get antiformants) Energy largely in lower frequencies. Great deal of variability in formants, F2 influenced by surrounding vowels.

Diphthongs: Offglide

Articulatory ending point of the diphthong

Diphthongs: Onglide

Articulatory starting point of the diphthong

Steady-State Formants

Articulatorily, there is little or no movement.

formants

As pitch changes, the harmonics move through the (blank)

Speaking Rate

As you speak slower, the duration of your sounds will get longer (vowels more than consonants) 1. Segmental Duration: you're just making sounds longer but the rhythmic pattern is maintained 2. Pause Duration: also slows down the speech, but not fluidly, doesn't sound natural 3. Syllable Deletion: drop sounds to speed up speech, less intelligible 4. Undershoot: when you don't get the full sound but you approximate

aspirated

after release

Aspiration during VOT

Audible release of air between noise burst (plosive) and the following vowel -VOT for aspirated consonants is typically longer; sounds breathy

4 Kind of Info available for Feedback

Auditory Tactile Proprioceptive Central Neural

~9 cm

Average length of a child's vocal tract

~15 cm

Average length of the female vocal tract

~17.5 cm

Average length of the male vocal tract

good

BETWEEN categories of sounds, discrimination is good or poor?

Antiformants

Bands of frequencies with damped acoustic energy

Frequency Domain (Spectrum)

Based on Fourier Transform

Mechanical Resonance

Based on Mass, Stiffness, and Resistance Mass & stiffness results in a "natural frequency" Resistance results in a decay of amplitude over time called "damping" Tuning fork has excellent resonance characteristics It has one natural frequency and slow decay Energy imparted to a tuning fork results in free vibratory motion at the natural frequency or resonant frequency Other objects have varying quality resonance curves

your practice

Based on theory: if you think speech is special, would you use non-speech sounds? / if you think general auditory skills are the most important, maybe you would use non-speech sounds Based on evidence: new info on mirror neurons / new info on McGurk effects in autism Based on clinical experiences: integrate theory, evidence and practice

Onglide

Before transition-relatively steady state formants

Release burst/stop burst

Brief, transient aperiodic noise burst following the silent gap. 10-30ms (longer for vl). Vertical line extending into high frequencies. Usually seen in positions other than final. Bursts of voiceless stops are longer than voiced and include aspiration (noise generated by turbulence as air moves through the glottis during the time in which the folds are starting to close for the following voiced sound).

Rhotic Liquid

Bunched or retroflex, retroflex production results in a slight lowering of F3, bringing it closer to F2.

4 months

By what age can infants discriminate basic contrasts of their native language?

CAP vs CAB

CAB is longer because of continuous vocal fold vibration, but no VOT for either *previous letter helps determine what comes next

What is happening: 'cats' is produced with a word-final /s/, but 'dogs' is produced with a word-final /z/?

Carry-over (left to right) assimilation

Categorical vs. Continuous Speech Perception

Categorical Perception of Consonants: there are limits to their acoustic boundaries -VOT could vary between -20 and 20 and we wouldn't hear a difference Continuous Perception of Vowels: no given place that is one absolute vowel sound, it is a continuum that you can blend from one category into another

2nd Acoustic Cue of Stops: Release of Burst

Caused by turbulent air. Observed in spectrogram as gray area.

Complete Assimilation

Change from an allophone of one phoneme to an allophone of another phoneme

Formant transition

Changes in formant frequencies that occur during the transition from one speech sound to another

Liquids

Clear tongue positions, F3 is low compared to other sounds

How do you change from a narrow band to a wide band in Praat?

Click Spectrum, then Spectrogram settings, and change the window length from 0.05 s (narrow-band) to 0.005 s (wide-band)

Two Occlusions of Stops

Closure of the VP port, closure of tongue/lips (bilabial, lingua-alveolar, lingua-palatal)

Affricates

Combination of stops and fricatives

Most of the sounds we hear come from multiple incoherent sources. This means that they:

Come from independent sources

Most of the sounds we hear come from multiple incoherent sources. This means that they: - Are binaural - Come from independent sources - Do not add logarithmically - Are samples of "frozen noise" - Are the result of turning up a volume control

Come from independent sources

How to produce a liquid

L: tongue may contact alveolar ridge, air comes out laterally around tongue, vf vibrate. R: tongue bunched back or pointed backward and air flows around the tongue. Look similar to vowels. Formant transitions to surrounding vowels.

How to produce a stop

Complete blockage of the vocal tract, pressure builds up, then it bursts and is released. Voiced=vocal fold vibrations

How to produce affricates

Complete obstruction of the airway. Pressure builds then longer burst of energy/noise.

Stop Gap

Completely obstructing the flow of air that is coming through the vocal tract -Produced in syllable initial position, final position, or continuous speech/syllables -Articulator is pushed up against another one to constrict the air (and sound) from coming out -50-150 msec. -Voiced Plosives: voice carries over from the voiced consonants around it (Voice Bar present) -Voiceless Plosives: "buttercup" would have no voice bar, but we're lazy and say "buddercup"

Voicing?

Complex periodic

What are acoustic speech sound characteristics

Complex periodic - vowels Random - voiceless fricatives Complex periodic + random - voiced fricatives Transient - plosives (burst) Quiescent - plosives (closure)

For which types of sounds are harmonics present?

Complex periodic sound

What are non-resonant consonants

Consonants that have restricted air flow

Features of Consonants

Constriction of airflow (valve) = turbulence -Aperiodic: disturbed airflow or noise -Anterior to the point of constriction determines the acoustic spectrum of that sound -Characteristics dependent on acoustic production and perception cues

Perturbation

Constriction of vocal tract

Noise?

Continuous aperiodic

What are formants?

Resonance peaks created by the filter (the vocal tract). For a uniform tube closed at one end they fall at odd-number multiples of the lowest resonant frequency, not of the fundamental frequency.

Oral Constriction

Created by the position of the tongue relative to the oral cavity space, often the hard palate

How do you make a diphthong sound like a pure vowel?

Decrease the transition time between vowels

place

Depending on the ______ of articulation, certain frequency ranges will be given more amplitude and others will be attenuated.

Neural Processing and Suprasegmentals

Depends on language and task: what is the listener trying to process? -Left Hemisphere = syntactic information for the right hem. -Right Hemisphere = prosodic information at phrase and sentence level -Superior Temporal and Inferior Frontal Cortex -Tonal languages (e.g. Mandarin) have suprasegmentals at the word, phrase and sentence level that carry significant syntactic information

Tense Vowels are produced with greater muscle constriction; produced at the extremes of articulatory posture, with tongue higher in oral cavity; tense vowels are longer; lax vowels are shorter

Difference between tense and lax vowels

Males have lines closer together for their harmonics because the harmonics have a lower Fo Females have wider spaced harmonics (wider gap between lines) Females have a higher Fo Females have shorter vocal folds

Differences between male & female harmonic structures

Contrastive Stress

Differentiate between two words that differ only by a syllable

Stridents

Directing air flow against a surface, more intense acoustic energy

Shearing

Displacement along the vocal folds when they come back together (Displacement is both lateral/medial and anterior/posterior)

Harmonic Spacing

Distance between harmonics on a spectrum

Which item below is not an advantage of digital over analog representation of sound -Flexibility, once the speech has been encoded -Absolutely perfect copies of original recording -Virtually perfect recordings -Ease of cataloging (tagging, etc) -Distortion is impossible

Distortion is impossible

Antiformants/Nasals

Divergence of air into the oral cavity introduces ______ into the picture, as some of the sound energy is trapped within the oral cavity. Opposite of formants. ______ act as stop-band filters, damping the harmonic frequencies. Look like weak-intensity formants. Frequencies of the _____ depend on where the oral blockage occurs.

Acoustic features of Voiceless Stops

Do not have continued phonation throughout the period of closure. In the pre-vocalic position, this continued phonation feature is not usually present

Transfer Function

Doesn't represent sound. Represents frequency response of vocal tract. Shows formants. It's the filter.

What is the X axis on a spectrogram?

Duration

What is the X axis on a waveform?

Duration

What is the definition of a period?

Duration of a cycle/ time it takes for a cycle to complete

Formant transition

During the formation (closing phase) of a stop occlusion and just after an occlusion is released, the rapid movements of the articulators cause sudden changes in the resonance peaks of the vocal tract. These changes occur during the transition from one speech sound to another. The rapid change in frequency of a formant for a vowel immediately before or after a consonant. The F2 transition is a very important acoustic cue to the *place of articulation* of a consonant. The F1 transition signals information about the *manner of articulation* of a consonant. Changes in formant frequencies that occur during the transition from one speech sound to another (that's the simpler definition from the book)

How is a complex sound displayed in a spectrum?

Each line represents a harmonic. horizontal axis = frequency vertical axis = amplitude (dB)

3 Connected Tubes

Each tube has its own natural/resonant frequency and, therefore, responds better to a different range of frequencies. The resonant frequency of the entire system is different from each of the separate tubes.

What differentiates diphthongs from monophthongs on a spectrogram?

Even where vowels are perceived as being steady-state monophthongs, the acoustic representation often indicates some articulatory movement. Although diphthongs typically show marked shifts in the spectrographic patterns, this varies depending on lots of things. Therefore, it's not always easy to make a categorical distinction between monophthongs and diphthongs.

-Introducing antiformants and dampening acoustic energy -Introduction of noise from turbulent nasal airflow emissions -Decreasing intra-oral air pressure, thereby decreasing clarity of consonant production

Excessive nasal resonance can decrease intelligibility by:

Fricative at 1200 HZ?

F & TH

Diphthongs and semivowels are characterized by some form of change in formants

F 1, 2, 3

Difference between /f/ and /v/

F's duration is long, v's duration is shorter. -Both have light noise, and start anywhere above 500Hz

What is the difference between F0 and the formants?

F0 is a source (glottis) characteristic Formant frequencies are contributions of the filter (vocal tract)

Independence

F0 is the rate of vocal fold vibration. The source function and the transfer function are relatively independent of one another: source and filter aren't necessarily connected; you can change one without changing the other. Harmonic spacing will change with varying F0.

intonation

F0 varies over longer stretches (global vs local)

What is the equation to calculate fundamental frequency?

F0=1/p
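As a quick worked illustration of F0 = 1/p, here is a minimal sketch that converts a period in seconds to a fundamental frequency in Hz. The function name is hypothetical and only illustrates the arithmetic on this card:

```python
def f0_from_period(period_s):
    """Return fundamental frequency in Hz for a period given in seconds."""
    if period_s <= 0:
        raise ValueError("period must be positive")
    return 1.0 / period_s

# A 10 ms glottal cycle and a 5 ms glottal cycle
f0_long_period = f0_from_period(0.010)   # about 100 Hz
f0_short_period = f0_from_period(0.005)  # about 200 Hz
```

The shorter the period, the higher the frequency, which is the inverse relationship described elsewhere in this set.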

vocal tract transfer function for male

F0=100, F1=200, then 400 and etc. this is a smaller range since F=0

vocal tract transfer function for female

F0=200, then 400, and so on

What acoustic information is necessary to perceive vowels

F1 and F2

What formant frequencies are essential for the acoustic analysis of vowels

F1 and F2

front vowels

F1 and F2 are far apart and F2 and F3 are close together

Semivowel Liquids

F1 and F2 are similar -Formant Structure /r/ 1. F3 Transition: rapidly falls and then rises for a VCV 2. Dark /r/ - CV "root" = Posterior Tongue 3. Light /r/ - VC "early" = Palatal Tongue -Formant Structure /l/ 1. Complex (lateral emission of air) 2. Similar to /r/ without lowering of F3 because you are splitting airflow: creates anti-resonants, wider bandwidths and a murmur

VC for /w/

F1 decreases and F2 decreases

VC for /j/

F1 decreases and F2 increases

When vocal tract closes in preparation for final stop production...

F1 falls

manner

F1 formant transition tells about blank?

How are vowels distinguished by the frequency position of mainly F1 and F2?

F1 frequency rises with increasing openness of the vowel - the higher the vowel, the lower the first formant F2 frequency rises with increasing frontness of the vowel F1: opener vowel = higher frequency F2: fronter vowel = higher frequency

CV for /j/

F1 increases and F2 decreases

CV for /w/

F1 increases and F2 increases

glides

F1 profile and reduction in amplitude

When vocal tract is opened after initial stop production...

F1 rises

CV: formant transitions

F1 transition always moves from low frequency up to the vowel's F1. Direction of the F2 transition is sensitive to place of articulation for the stop consonant.

rises

F1 usually ___ for stops

Vowels

F1, F2, & F3 peaks shift. Every person has them in a different place for the same sound. We know which vowels are which because of the relationship between F1, F2, and F3. We don't look at the value of the frequency but at the spectral envelope, or relationship between formants.

F1, F2, and F3 placement characteristics

F1: Oral Cavity F2: Tongue Shape F3: Tongue Tip

/i/ typical formant pattern

F1 = 270 Hz, F2 = 2290 Hz

Typical formant pattern for /u/

F1 = 300 Hz, F2 = 870 Hz

Typical formant pattern for /a/

F1 = 730 Hz, F2 = 1090 Hz

This vowel formant is influenced by tongue advancement/location of constriction

F2

bilabials (CV)

F2 and F3 don't rise as steeply

velars (CV)

F2 and F3 start together

place

F2 plus F3 formant transitions tell about?

changing the place of articulation of a nasal changes the ____

F2 transition locus

The lateral [l] and rhotic [r] are characterized by a relatively neutral position of formant ____

F2.

The defining formant characteristic of /l/ and /r/ includes

F3

The acoustic difference between /r/ and /l/ resides mainly in the...

F3 formant

What is the difference between axes on line spectra and waveforms?

Line spectra show amplitude by freq, waveforms show amp by time

What differentiates the [l] and [r] is the presence or absence of a sharp drop/rise in formant _______. explain the kind of movements that need to occur for r-coloring.

F3. [r] is the only sound/phoneme for which this drop is clearly seen. It is established by moving the tongue tip or back of the tongue into critical spots for affecting the third formant. There are two such spots, resulting in retroflex or bunched production of this sound. It is not enough to simply hold the aforementioned parts of the tongue in these spots; there has to be a movement to or from them (or both) for the F3 drop to occur and produce a subsequent 'r-coloring'.

the liquids /l/ and /r/ differ on the basis of ___, which is low in ___ and high in ____.

F3; /r/; /l/

T or F: Resonances (formants) do not characterize the acoustic signal for any consonant sounds...

FALSE

T or F: Voice onset time is the time elapsed between the onset of articulatory closure and the release of a stop...

FALSE

Center Frequency

FC. Greatest vibratory response/natural frequency. Depends on characteristics of resonator.

Lower Cutoff Frequency

FL. Frequency below FC. Noted by line; left of line = unresponsive

Upper Cutoff Frequency

FU. Frequency above FC. Noted by line; right of line = unresponsive

What is the passive theory (sensory only) by Fant?

Fant: specialized innate (built-in) sensory filtering mechanisms are responsible for perception, based mostly on acoustic distinctive feature theory. A common pool of distinctive features for production and perception of speech is presumed to play a role. These features are thought to be located in the linguistic center of the brain. Presumably, there are innate (acoustic) templates for matching, and feature detectors. The perceptual features of speech are detected more or less automatically; there is no need to refer to production to perceive speech.

Development of Suprasegmentals

Fetus: responds to sound stimuli and prosody during 3rd trimester 0-6 months: response to biologically driven needs -2-3 months: linguistic discrimination emerges based on adult prosody -6 months: production of wide range of suprasegmentals -6-12 months: learned prosodic patterns of pitch, rhythm and pausing ->12 months: integration into adult like patterns

constriction

Filtering is imposed by the cavity in front of the _________ and in certain conditions by the cavity behind the _______.

Wide-Band Filter

Filters more frequencies at once than narrow filter

___ formants change due to mouth openings and tongue position

First

Diphthongs are characterized by what?

First 3 formants

Fricatives Production

Narrow constriction but not complete occlusion.

Fricative on Spectrogram

Narrowing, obstructed (not completely) Tend to Have Higher Frequency Variation

subglottal pressure can increase ____

Fo

Research indicates that the greatest cue for stress is _____ followed by _____ and then ______

Fo, duration, amplitude

4

For a tube closed at one end and open at the other, the tube will resonate best (the natural resonant frequency) at a frequency that has a wavelength that is how many times the length of the tube?

F1 increases and relationship to F2 is unclear

For vowel height, what happens as the back vowels become more open (low)?

F1 increases and F2 decreases

For vowel height, what happens as the front vowels become more open (low)?

Antinode

For what is formant frequency lowered by constriction?

Node

For what is formant frequency raised by constriction?

Node

For what is minimum volume velocity or maximum pressure?

Antinode

For what is volume velocity maximum or pressure minimum

Stress

Force when the vocal folds come back together. The actual contraction of the muscle

For formant 1, where is its critical pressure point and its critical max velocity point?

Formant 1 has its critical pressure point at the glottis; and its critical max velocity point at the oral opening.

Which formant is said to be mostly responsive to tongue front to back positions?

Formant 2. Tongue front makes f2 go up; tongue back makes f2 go down.

Which formant results in the perception of the presence of the [r] sound?

Formant 3. Formant 3 identifies that an [r] happened. In a retroflex [r], the tongue tip curls up behind the alveolar ridge; in a bunched [r], the back of the tongue moves close to the velum.

2 Rules of Perturbation

Formant Frequency is raised by constriction at the nodes & Oral Cavity. Formant Frequency is lowered by constriction at the antinodes & Pharyngeal cavities

Which is the loudest formant?

Formant I (lower frequency)

Cues to Nasal Place

Formant transitions (mainly F2) provide the place cue for nasals (as demonstrated by Malécot's 1956 experiment)

Nonresonants lack...

Formant structure and openness of resonants

Shift

Formant transition

Stop-Glide-Diphthong Series

Formant transitions go from short to medium to long.

There is no end of combinations of formants ___ & ___ in vowels in the mouth and vocal tract.

Formants I and II.

Formants 1, 2, 3

Formants responsible for differentiating vowels

What is the X axis on a spectrum?

Frequency

What is the Y axis on a spectrogram?

Frequency

Spectrum

Frequency (x) by Amplitude (y)

The sound spectrum shows..

Frequency and amplitude DOES NOT SHOW:wavelength, phase, period

What is meant by place theory?

Frequency is encoded in the ear according to location

Spectrograms

Frequency, Time, Amplitude

Frication

Fricative noise. Has a narrow spectrum. It's concentrated at different frequencies depending on the specific consonant.

In what consonant do articulators form constrictions and occlusions within the vocal tract that generate aperiodicity ( as in noise) when air flows through them?

Fricatives

The nonresonant consonants of English are the ____, the ____, and the _____.

Fricatives, affricates, stops

Fricative Spectral Energy

Front cavity will act as an amplifier-filter; as you change the shape and length, it will amplify different frequencies (acoustic features of the front cavity make the difference in sound) 1. Strident (Concentrated Spectrum) -Large front cavity (s, z, sh, dz) -Those that have concentrated energy; smaller frequency range -Greater power 2. Non-Strident (Diffuse Spectrum) -Small front cavity (f, v, th) -Energy is spread out over a much wider band -Less Power

What is the difference between fundamental frequency and pitch?

Fundamental Frequency is the rate at which a system vibrates, while pitch is the perception of fundamental frequency.

What changes the harmonic series?

Fundamental frequency

What is the difference between fundamental frequency and formant frequency?

Fundamental frequency is the rate at which the vocal folds vibrate, whereas formant frequency is determined by the size and shape of the resonating cavities

Period of glottal wave (area over time) depends on ___.

Gender and age

normative data

Given that formant frequency depends upon vocal tract length & resonating cavity size, no absolute values for F1, F2, & F3 exist.

Formant Transitions

Going from a plosive shape to a vowel shape -Voicing starts, vocal folds start vibrating, bending of formants to get to steady state Acoustic Features (Consonant-to-Vowel formant blending) F1= Manner of Production/Degree of Constriction F2= Place of Articulation -Tell you what consonant and vowel are being produced -Stops will always bend up when looking left to right -The starting point for F2 will be pointing to the spectral energy for that consonant (e.g. /b/ will be between 500-1500 Hz)

Front vowels have a ___ f2

High

Where do release bursts occur?

Initial and medial stops; bursts are longer for voiceless and shorter for voiced sounds

Primary resonance for nasal consonants?

Nasal Murmur (250 Hz)

White noise has all the following characteristics except:

Harmonic-Based

Voiceless consonants

Have supraglottal noise sources; aperiodic laryngeal source (noise, aspiration)

Stops

Have the greatest amount of breath stream obstruction

Combined Tubes: /i/

Helmholtz resonator plus a tube open at both ends and a tube closed at both ends

When Does Subglottal Pressure Affect Intensity?

High Frequencies only

High vowels have a ___ tongue body or a ___ F1

High, low

Acoustic Characteristics of Stressed Syllables

Higher F0 Greater duration Greater intensity

Properties of Stress

Higher Fundamental Frequency Intensity (Louder) Longer in Duration

Primary Stress

Highest level of stress, usually seen on the second syllable

The second phase of stop production is called the...

Hold or closure/stop gap

Maximum Flow Declination Rate

How quickly the vocal folds close * the slope down ~People who have problems with this are more likely to have a higher declination rate ~It closes faster because of Bernoulli and Mass Model

What are feedback mechanisms of speech important for?

How a speaker controls production of speech, like how much does the speaker monitor his actions or how does a speaker produce speech with little or no feedback regarding speech output?

Lateral X-ray Image, Axial CAT Scan, Ultrasound, Computed tomography (CT), Magnetic resonance imaging (MRI)

Instruments used to visualize the vocal tract

Harmonics

Integer multiples of the fundamental frequency H1 = F0 H2 = 2F0 H3 = 3F0
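The harmonic series can be sketched in a few lines. This is an illustrative helper (the name `harmonic_series` is hypothetical) that lists H1 through Hn for a given F0:

```python
def harmonic_series(f0_hz, count):
    """Return the first `count` harmonics (H1 = F0, H2 = 2*F0, ...) in Hz."""
    return [k * f0_hz for k in range(1, count + 1)]

# Spacing between adjacent harmonics equals F0, so a lower F0
# packs more harmonics into the same frequency range.
low_voice = harmonic_series(100, 5)
high_voice = harmonic_series(200, 5)
```

This is why a voice with a low fundamental frequency shows more closely spaced harmonics on a spectrum than a voice with a high fundamental frequency.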

Attenuation Rate/Roll-Off Rate/Rejection Rate/Slope

How rapidly the resonator decreases in its intensity of response to different frequencies. Measured in dB/octave. Less than 18dB shallow. +90dB deep.

25 ms

If VOT is 25 ms or greater= the plosive is voiceless. If VOT is less than 25 ms, the plosive is voiced.
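The 25 ms rule on this card can be written as a one-line decision. A minimal sketch, assuming the single fixed boundary stated here; real boundaries shift with place of articulation, speaker, and language, and the function name is illustrative:

```python
# Boundary taken from this card; treated as a simplifying assumption.
VOT_BOUNDARY_MS = 25.0

def classify_stop_voicing(vot_ms, boundary_ms=VOT_BOUNDARY_MS):
    """Label a stop as 'voiced' or 'voiceless' from its voice onset time in ms.

    Negative values (voicing lead) fall below the boundary and count as voiced.
    """
    return "voiceless" if vot_ms >= boundary_ms else "voiced"
```

For example, a VOT of 60 ms (long lag) is labeled voiceless, while 10 ms (short lag) or -20 ms (voicing lead) is labeled voiced.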

What produces nasal resonance?

Nasal cavities

Aperiodic

If a wave does not repeat, it is represented like any other data. The x axis is usually time or distance, can only draw specific conclusions about the part of the wave we measured

Periodic

If a wave repeats, it's more efficient to show just one cycle of the wave and indicate that it repeats, x-axis is degrees or radians, draw bigger conclusions about the wave

If something blocks or narrows maximum pressure (p), this will _____ that particular formant frequency.

If something blocks or narrows maximum pressure (p) this will RAISE that particular formant frequency.

What is the difference between conductive and sensorineural hearing loss?

If the hearing loss is just a conductive loss, then bone conduction thresholds should be the same as for normal hearing. If bone conduction thresholds are raised, there is impairment of the cochlea, auditory nerve or the higher auditory nervous system.

Nasals Description

Nasal consonants are produced with nasal radiation of sound

The tube will resonate best (the natural resonant frequency) at a frequency that has a wavelength that is 4x the length of the tube

Important "rule" for a tube closed at one end (when the tube will resonate best)
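The quarter-wavelength rule can be turned into numbers. A minimal sketch, assuming a uniform tube, a speed of sound of 350 m/s (an assumed round value, not given in this set), and the standard result that higher resonances fall at odd multiples of the lowest one; the function name is hypothetical:

```python
def quarter_wave_resonances(tube_length_m, count=3, speed_of_sound=350.0):
    """Resonances (Hz) of a uniform tube closed at one end, open at the other.

    The lowest resonance has a wavelength 4x the tube length, so
    f1 = c / (4 * L); higher resonances are odd multiples of f1.
    """
    lowest = speed_of_sound / (4.0 * tube_length_m)
    return [(2 * k - 1) * lowest for k in range(1, count + 1)]

# A 17.5 cm tube, the male vocal tract length quoted in this set
neutral_vowel_formants = quarter_wave_resonances(0.175)
```

Under these assumptions the 17.5 cm tube resonates near 500, 1500, and 2500 Hz; a shorter tube gives proportionally higher values.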

damped

In nasals, the acoustic transmission tends to be heavily _____ due to increased length and absorptive nature of nasal cavity

What are the physical properties of a sound?

Intensity/amplitude Duration/time Frequency

Place Cues for Liquids

In order to distinguish /r/ from /l/ the first 3 formants are needed, but F3 can separate /r/ from /l/ (/r/ has a low F3 of about 1500 Hz)

Extrinsic Muscle Activity

Indicates glottal inefficiency

What is the relationship between period and frequency?

Inverse relationship- the shorter the period the higher the frequency

Relationship of Volume and Pressure

Inversely related

no

Is every /d/ the same? (yes or no)

The above plot shows the sum of two sine waves, one at 100 Hz and the second at 200 Hz. The resulting wave:

Is periodic
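One way to check this answer numerically: because 200 Hz is an integer multiple of 100 Hz, the sum should repeat every 10 ms (the period of the 100 Hz component). The sketch below is illustrative only; the function names and sample count are assumptions:

```python
import math

def two_tone(t, f1=100.0, f2=200.0):
    """Sum of two unit-amplitude sine waves at f1 and f2 (Hz), at time t (s)."""
    return math.sin(2 * math.pi * f1 * t) + math.sin(2 * math.pi * f2 * t)

PERIOD_S = 1.0 / 100.0  # period of the 100 Hz component

def max_repeat_error(n_samples=1000):
    """Largest difference between the wave and itself one period later."""
    worst = 0.0
    for i in range(n_samples):
        t = i * PERIOD_S / n_samples
        worst = max(worst, abs(two_tone(t) - two_tone(t + PERIOD_S)))
    return worst
```

The error stays at floating-point rounding level, which is consistent with a complex periodic wave whose fundamental is 100 Hz.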

What is descriptive research?

It describes variables of importance, describes their differences, or describes their relationships (spectrography very helpful)

Describe the pascal to dB SPL conversion formula.

It is a logarithmic scale of relative amplitude.
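The conversion itself is not spelled out on this card. A minimal sketch, assuming the standard definition dB SPL = 20 * log10(p / p_ref) with the 20 micropascal reference mentioned elsewhere in this set; the function name is illustrative:

```python
import math

# Reference pressure: 20 micropascals, the nominal threshold of hearing.
P_REF_PA = 20e-6

def spl_db(pressure_pa, ref=P_REF_PA):
    """Return sound pressure level in dB SPL for an RMS pressure in pascals."""
    return 20.0 * math.log10(pressure_pa / ref)
```

A pressure equal to the reference gives about 0 dB SPL, and every tenfold increase in pressure adds 20 dB.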

What was another study about VOT by Eimas et al on categorical perception?

It was a study conducted on very young babies who do not know about speech yet; it was to prove that such babies are biologically ready to have categorical perception for voiced voiceless (with VOT around 20ms).

Why was 20 micro Pascals chosen?

It was estimated that a sound of this size was at the threshold of normal human hearing.

Why is aspiration useful?

It's a way to make voiced/voiceless distinctions when examining the acoustics of stops

What suprasegmental feature is demonstrated: A name vs. An aim

Juncture

When given a pair of phonemes (e.g., /p/ and /b/) be able to identify the acoustic cue that differentiates the two.

KNOW THE CONSONANT CHART!

VOT and Age

Kids produce distinct VOTs around 11 years old. Before then, short lag. Elderly have more variability.

What stop has a lower F2 (800 Hz) and F3

Labial

Which places of articulation of fricatives have broad frequency bands

Labio- and linguadentals

Which places of articulation of fricatives have very low energy formants

Labio- and linguadentals

Constrictions in Fricatives

Labiodental Linguadental Alveolar Palatal

what places of articulation make up non-sibilants

Labiodentals and linguadentals

Lip movement Jaw movement

Later descriptions

If the vocal folds close faster, slope: energy:

Less steep of a slope Greater energy

glides (semivowels)

Lingua-palatal /j/ & bilabial (labio-velar) /w/

Approximants

Liquids and glides, Characteristics: Limited articulatory constrictions that alter resonant frequencies, Classification based on syllable position, Formant transitions typically faster than for vowels

In order to properly model the vocal tract with acoustic tubes, the wavelength must be _____ with respect to the length of the tubes.

Long

Frication

Long duration of energy, continuous sound, complex aperiodic waves

Vowel on Spectrogram

Look for the formants Vowels are always voiced

Diphthong on Spectrogram

Looks like Pac-Man, kind of

Nasal Formant

Low energy from nose. A very low frequency, high intensity component of a nasal. Nasals have additional formants above this (called nasal formants, N1...) but the antiformants are more important in characterizing a nasal.

Pharyngeal (Glottal) /h/

Low energy, broad spectrum

What are the formant values for /i/ and /j/

Low f1, high f2

Which voice will be richer in harmonics? One with a low or high fundamental frequency.

Low fundamental frequency

Sonorant

Nasals, liquids, and glides which are similar to vowels. They are characterized by free airflow, articulation shapes vocal-tract cavities, formant frequencies, and periodic laryngeal source meaning they are all voiced.

• Source=vocal fold vibration; all vowels, many consonants

Nearly periodic complex waves source and examples

To produce a vowel..

Need relatively open vocal tract, vocal folds must vibrate, and tongue is in a certain position in the oral cavity/may or may not have lip rounding.

Types of Filters

Low-pass: lets in low frequencies, amplitude is high in lower area High-pass: Lets higher frequencies through, amplitude is higher in higher area Band-pass: Lets middle frequencies get through, amplitude lower at lower/higher ends Band-stop: Lets low and high through and not middle frequencies, amplitude is highest at lower/higher ends (Amplifies the ones let through and weakens the ones not let through)

Higher the vowel

Lower F1

Increased oral cavity length

Lower F2

Which voice is richer in harmonics? Male, female, or a child.

Male

How do you tell the difference between male and female /i/

Male and female speakers will have the same spectral envelope for the same sound, just different frequencies

Simple Harmonic Motion creates sounds that:

May be plotted as a sine wave

Glides

May involve a gliding motion from a partly constricted state to a more open state -Palatal Glide /j/ -Labio-Velar Glide /w/

Sound Pressure (dB SPL)

Measure of pressure at a location: the level at the receiver, i.e., the level the ear hears or the microphone transducer picks up. (dB SPL Equation)
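The card above names the dB SPL equation without writing it out. A minimal sketch of the standard form, dB SPL = 20 * log10(P / Pref), assuming the 20 micropascal reference mentioned earlier in these notes; the function name `db_spl` is a hypothetical helper, not something from the course materials.

```python
import math

# Reference pressure: 20 micropascals, roughly the threshold of normal hearing.
P_REF_PA = 20e-6

def db_spl(pressure_pa: float) -> float:
    """Sound pressure level in dB SPL for an RMS pressure given in pascals."""
    if pressure_pa <= 0:
        raise ValueError("pressure must be positive")
    return 20.0 * math.log10(pressure_pa / P_REF_PA)

if __name__ == "__main__":
    print(db_spl(20e-6))   # pressure equal to the reference
    print(db_spl(200e-6))  # ten times the reference pressure
```

Each tenfold increase in pressure adds 20 dB, so a pressure equal to the reference sits at 0 dB SPL.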

-Airflow as it is emitted from the nasal cavity -Nasalance

Measuring factors contributing to nasality:

The amplitude of a sound wave is typically measured using the RMS or Root Mean Square method. This has the effect of:

Measuring the equivalent of static pressure

Elastic Medium

Medium must be able to keep the pressure disturbance going beyond the initial point of change

Stops (/p/, /b/, /t/, /d/, /k/, /g/)

Place: bilabial, alveolar, and velar Voicing: voiced and voiceless Manner: complete blockage of airflow, rapid pressure change

/h/

Place: glottal Manner: fricative Voice: voiceless Aperiodic Turbulent Friction, Muscles: Lateral Cricoarytenoid, Manner- presence of aperiodic noise, Place- low, Voice- No voice cue

F3

Most important in distinguishing rounding of the lips. If American English has a round/unround pair at a certain height, this might be more important.

Nasal formants

Most intense, lowest frequency

Diphthongs

Move from one steady state to another. Movement from the characteristic formants of one pure vowel to another. (Resonance characteristics change during production.) Longer in duration than monophthongs/glides. Onglide/offglide: relatively steady state formants of the onglide, then transition.

Does resonance change from one nasal to the next?

No because the nose is made out of cartilage that can not be moved

Partial Assimilation

No phonemic change occurs in the sound, only a phonetic change

/r/

No tongue tip contact with alveolar ridge, often retroflexed, often has lip rounding

Is there pressure build up for vowels?

No, relatively open vocal tract positions so there is no pressure build up around a closure (no distinctive elements of noise, like consonants). Air just goes through.

Can consonants stand by themselves and be meaningful?

No, they nearly never can stand by themselves and be meaningful.

Aspiration

Noise generated by turbulence as air moves through the glottis during the time in which the folds are starting to close for the following voiced sound

What is a burst?

Noise produced at the place of articulation (when the closure is released)

Given that formant frequency depends upon vocal tract length and resonating cavity size, no absolute values for F1, F2, and F3 exist

Normative data on formant frequencies

Suprasegmentals

Not defined by individual speech sounds Duration: juncture and length of phonemes

What is juncture?

Spacing in speech (we don't speak with a lot of breaks)/ the relationship between sounds within words or between words within continuous speech

Turbulent airflow

Obstacle disturbs the flow of air.

Acoustic effect of introducing an obstruction into a turbulent airstream, as when we produce fricatives?

The obstruction increases the degree of turbulence and increases the intensity of the sound

Nasals

Occluded oral cavity, the nasal cavity has very different acoustic features -Built up pressure behind the point of occlusion, the sound and pressure will back up and be diverted through the nasal cavity -Split airflow and sound= Anti-Resonances --suppress harmonics -Total constriction in oral cavity -Can serve as the syllable nucleus (e.g. "button")

Formant Transitions

Occur from a voiced sound preceding a stop or from a voiced sound post-stop, or both

Tube Resonance/Resonant Frequency of a Tube Characteristics

Occurs in any tube/pipe that contains air. A tube's resonant frequency is related to its physical characteristics: tube length, tube geometry, status of the tube ends.

frication

On a spectrogram, ______ looks like a wide band of energy distributed over a broad range of frequencies.

weak

On spectrograms, antiformants look like extremely ___ intensity formants

filtering

Once turbulence is generated, the noise energy is subjected to ______ by the vocal tract

Consonants

One or more areas of constriction of vocal tract Source of sound • Voicing • Turbulent airflow • Or both • Less energy, greater meaning Consonants differ greatly • Degree of constriction • Presence or absence of noise • Nasality

Consonant production

One or more areas of the vocal tract are narrowed by some degree of constriction. All consonants have a manner, place, and voice.

An acoustic tube model of the vowel /a/ has formant frequencies determined by the resonances of what types of tubes?

One or more tube(s) open only at one end.

Tubes Closed at Both Ends

One-half wavelength fits nicely So does a full wavelength And so does one and a half Formula: (picture)

The velopharyngeal port must be ___ during the production of nasal consonants.

Open

Open Tube

Open at both ends. The end pressures are atmospheric/ambient pressure. Half a wave fits inside the tube = Half-Wave Resonator. Areas of greatest pressure are somewhere in the tube, never at the ends. A pressure node will always be at the ends. The lowest resonance fits half a wave (1/2) in the tube; each higher resonance adds another 1/2: 2nd resonance = one whole wave (2/2), 3rd resonance = 1.5 waves (3/2) and so on.

Closed Tube

Open at one end and closed at the other. Closed end contains the antinode/greatest pressure. Pressure at the open end is ambient pressure. 1/4 wave fits inside the tube = Quarter-Wave Resonator. The lowest resonance fits 1/4 wave in the tube; each higher resonance adds 2/4 wave: 2nd resonance = 3/4 wave, 3rd resonance = 5/4 wave and so on. The lowest resonance has a wavelength 4x the length of the tube. Higher resonant frequencies are odd-number multiples of the lowest resonance.
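The two tube cards can be turned into a short calculation. This is a sketch under stated assumptions: an idealized uniform, lossless tube, with the speed of sound (34,000 cm/sec) and tube length (17.5 cm) taken from the standing-wave card elsewhere in these notes. The function names are hypothetical.

```python
SPEED_OF_SOUND_CM_S = 34000.0  # value used elsewhere in these notes

def half_wave_resonances(length_cm, count=3, c=SPEED_OF_SOUND_CM_S):
    """Tube with matching ends (e.g. open at both ends): F_n = n * c / (2 * L)."""
    return [n * c / (2.0 * length_cm) for n in range(1, count + 1)]

def quarter_wave_resonances(length_cm, count=3, c=SPEED_OF_SOUND_CM_S):
    """Tube closed at one end, open at the other: F_n = (2n - 1) * c / (4 * L)."""
    return [(2 * n - 1) * c / (4.0 * length_cm) for n in range(1, count + 1)]

if __name__ == "__main__":
    # A 17.5 cm tube, roughly an adult male vocal tract.
    print([round(f) for f in quarter_wave_resonances(17.5)])
    print([round(f) for f in half_wave_resonances(17.5)])
```

The quarter-wave resonances come out in a 1 : 3 : 5 ratio (odd multiples), while the half-wave resonances come out in a 1 : 2 : 3 ratio.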

Vertical Phase difference

Open inferior to superior

Differences Between Vowels and Consonants

Open vs. Closed VT Aperiodic noise source for most consonants Consonants (generally) cannot serve as the syllable nucleus

PRTU

Order of the graphs for acoustic characteristics

/ð/

Place: interdental Manner: fricative Voice: voiced Periodic Laryngeal Source, Muscles: Superior Longitudinal, Manner- presence of aperiodic noise Place- low, Voice- Presence of Phonation

Open Loop Feedback

Output is preprogrammed, no feedback needed

lip rounding

Over-simplified categorization of place of articulation

Fricative place of articulation

Oversimplified categorization of place of articulation. Acoustic evidence for place of articulation: fricative noise spectrum, formant transitions

Source filter theory

P(f) = U(f) · T(f) · R(f) P=spectrum of the sound pressure wave exiting the lips. U=glottal volume velocity T= transfer function of the vocal tract R=radiation characteristics at the lips

• P=spectrum of the sound pressure wave exiting the lips • U=glottal volume velocity • T= transfer function of the vocal tract • R=radiation characteristics at the lips

P(f)=U(f)*T(f)*R(f) means...
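Because P(f) is a product of three terms, the levels simply add when each term is expressed in dB. The sketch below illustrates that. It assumes the 12 dB per octave source roll-off given on the glottal spectrum card in these notes and a radiation boost of about 6 dB per octave (a common textbook approximation, not stated as a number in the notes). The `transfer_db` argument is a hypothetical placeholder for the vocal tract transfer function and defaults to flat (0 dB), so no formants are modeled.

```python
import math

def octaves_above(f_hz, ref_hz):
    """Number of octaves between ref_hz and f_hz."""
    return math.log2(f_hz / ref_hz)

def output_level_db(f_hz, f0_hz, transfer_db=lambda f: 0.0,
                    source_slope_db=-12.0, radiation_slope_db=6.0):
    """Level of P(f) in dB relative to the fundamental: U + T + R, all in dB."""
    octs = octaves_above(f_hz, f0_hz)
    source = source_slope_db * octs        # U(f): glottal source roll-off
    radiation = radiation_slope_db * octs  # R(f): radiation at the lips (assumed slope)
    return source + transfer_db(f_hz) + radiation

if __name__ == "__main__":
    for f in (100.0, 200.0, 400.0, 800.0):
        print(f, output_level_db(f, 100.0))
```

With a flat transfer function the net slope is about -6 dB per octave; a realistic T(f) would add formant peaks on top of that slope.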

Relationship between frequency and period

P=1/F (reciprocal relationship) e.g. 0.005 seconds = 1/200 cycles per second
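A quick check of the reciprocal relationship, as a minimal sketch (the helper names are hypothetical): a 200 Hz tone has a period of 0.005 seconds, and a 2000 Hz tone has a period of 0.0005 seconds.

```python
def period_from_frequency(frequency_hz: float) -> float:
    """Period in seconds: P = 1 / F."""
    return 1.0 / frequency_hz

def frequency_from_period(period_seconds: float) -> float:
    """Frequency in hertz: F = 1 / P."""
    return 1.0 / period_seconds

if __name__ == "__main__":
    print(period_from_frequency(200.0))   # seconds per cycle at 200 Hz
    print(period_from_frequency(2000.0))  # seconds per cycle at 2000 Hz
    print(frequency_from_period(0.005))   # cycles per second for a 5 ms period
```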

Voiced consonants have...

PERIODIC laryngeal source

Lexical Stress

Pattern of stress within words

In the plot shown, the solid line labeled "A" indicates the _______ pressure

Peak

Closed Loop Feedback

Performance of system is fed back in for check

3rd Acoustic Cue of Stops: Voice Onset Time

Period of time between release of stop and the onset of voicing

1st Acoustic Cue of stop: Silence

Period of time between stopping of airflow and continuation with phonating

Semivowels

Periodic sound wave; open voicing and semi-constricted -Subdivisions: glides and liquids -Characteristics: 1. Constriction interval <100 msec. or 40-50 msec. for less carefully articulated speech 2. Initial articulatory position (can be in front or have a vowel in front of it) 3. Rapid transition to vowel (60-100 msec.) similar to a diphthong -Perception: 1. Formant Transitions (F2, sometimes F3) 2. Location of semi-constriction

/b/

Place: bilabial Manner: stop Voice: voiced Periodic Laryngeal Source, Muscles: Orbicularis Oris, Levator Palatini, Manner-silent or near silent closure interval; transient release burst, Place- F2 transitions, frequency of most intense portion of release burst, Voice- +phonation, -phonation, presence or absence of aspiration, VOT/F1 onset, closure duration (medial position), preceding vowel duration (final position)

/p/

Place: bilabial Manner: stop Voice: voiceless Transient Aperiodic Laryngeal Source, Muscles: Orbicularis Oris, Levator Palatini, Manner- silent or near silent closure interval; transient release burst, Place- F2 transitions, frequency of most intense portion of release burst, Voice- +phonation, -phonation, presence or absence of aspiration, VOT/F1 onset, closure duration (medial position), preceding vowel duration (final position)

Source Spectrum

Phonation from the Larynx

The _______ is minimum amount of pressure needed to sustain vocal fold vibration

Phonation threshold pressure (PTP)

What are the 4 different ways to measure fundamental frequency?

Pitch Contour Analysis Voice Report NarrowBand Spectrogram Waveform

Syllable stress as a result of these 3 things

Pitch, loudness, and duration

Cognates

Place and manner of articulation are the same. Voicing varies.

/l/

Place: alveolar Manner: liquid Voice: voiced Periodic Laryngeal Source, Muscles: Levator Palatini, Palatoglossus, Rapid formant changes; damping

/n/

Place: alveolar Manner: nasal Voice: voiced Periodic Laryngeal Source, Muscles: Levator Palatini, Palatoglossus, Place- high average duration

/d/

Place: alveolar Manner: stop Voice: Voiced Periodic Laryngeal Source, Muscles: Superior Longitudinal Muscle, Levator Palatini, Manner- silent or near silent closure interval; transient release burst, Place- F2 transitions, frequency of most intense portion of release burst, Voice +phonation, -phonation, presence or absence of aspiration, VOT/F1 onset, closure duration (medial position), preceding vowel duration (final position)

/t/

Place: alveolar Manner: stop Voice: voiceless Aperiodic Laryngeal Source, Muscles: Superior Longitudinal Muscle, Vertical muscle fibers, Levator Palatini, Manner- silent or near silent closure interval; transient release burst, Place- F2 transitions, frequency of most intense portion of release burst, Voice- +phonation, -phonation, presence or absence of aspiration, VOT/F1 onset, closure duration (medial position), preceding vowel duration (final position)

/m/

Place: bilabial Manner: nasal Voice: voiced Periodic Laryngeal Source, Muscles: Levator Palatini, Palatoglossus, Anti-resonance; nasal murmur; F2 transitions, lowest and highest

Formant Transitions

Movement of the formants from the onglide to the offglide

Descriptors of Consonants

Presence or absence of voicing Place of Articulation Manner of Articulation

The Pascal (Pa) is a unit of

Pressure

Noise Burst

Pressure release escaping the point of constriction 1. Manner Cues: -Duration: 5-40 msec. (average 10 msec.) -Rise Time: 10 msec. 2. Place Cues: Primary Spectral Energy (varies with vowel context) -Frequencies determined by where they are produced in the oral cavity

What are harmonics?

Produced by the source, create whole number multiples of the fundamental frequency

Tense Vowels

Produced with greater muscle contraction and produced more according to the vowel quadrilateral

Vowel Production

Produced with more open vocal tract

Monophthongs

Produced with relatively constant tongue positions. Steady state vowels. Wide, dark stripes. Concentrated intense energy around harmonics that are amplified near resonances. Corner vowels: tongue far from neutral position as possible.

Sound Spectrography

Provides a spectral picture of the acoustic wave

Linguistic Stress

Putting more emphasis (meaning) on certain portions of the utterance than others -Needs more physiological emphasis (more power) -Acoustic Features of stressed syllables/words: 1. Higher F0 2. Greater Amplitude 3. Longer Duration -Levels of Stress: syllable level or sentence/phrase level -Pauses, Duration and Utterance Junctures: --Occlusion duration signals syllable, word or utterance juncture

What is the speech perception theory that is related to it all but may not be a theory of speech perception per se?

Quantal Theory by Kenneth Stevens. He looked at all languages in the world and what they have in common. Languages of the world use acoustic features that are associated with certain critical regions where we form consonants (apparently this hypothesis does not effectively apply to vowels, however). Certain articulatory changes produce little acoustic change, whereas other minimal adjustments have major acoustic consequences. The quantal theory predicts that the languages of the world have largely formed around these so-called "quantal changes" in acoustic speech signals. This theory doesn't carefully separate acoustic and perceptual features; it tends to focus on acoustic changes. Our hearing system seems to work better at the frequency ranges where these quantal changes occur, which provides evidence for the theory.

Digital Recording Problems

Sampling Rate -Aliasing, Reduced Bandwidth Quantization -Peak clipping, reduced dynamic range, quantization noise.
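Two of the problems listed above can be put into numbers. This is a sketch under idealized assumptions: a pure tone, no anti-aliasing filter, and an ideal linear quantizer. It shows where a tone above half the sampling rate (the Nyquist frequency) appears after sampling, and the approximate dynamic range available for a given bit depth. The function names are hypothetical.

```python
import math

def alias_frequency(tone_hz: float, sample_rate_hz: float) -> float:
    """Apparent frequency of a pure tone after sampling at sample_rate_hz."""
    folded = tone_hz % sample_rate_hz
    # Anything above the Nyquist frequency folds back below it.
    return min(folded, sample_rate_hz - folded)

def dynamic_range_db(bits: int) -> float:
    """Approximate dynamic range of an ideal quantizer: 20 * log10(2 ** bits)."""
    return 20.0 * math.log10(2 ** bits)

if __name__ == "__main__":
    print(alias_frequency(3000.0, 10000.0))  # below Nyquist: unchanged
    print(alias_frequency(7000.0, 10000.0))  # above Nyquist: folds down
    print(round(dynamic_range_db(16), 2))    # 16-bit quantization
```

Each added bit contributes roughly 6 dB of dynamic range, which is why too few bits show up as reduced dynamic range and audible quantization noise.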

subglottal pressure

air pressure immediately below the glottis

Glides

Quick tongue movements, short duration, characterized by formant transitions

How can you identify vowels on a spectrogram?

RATIO of formants

In the plot shown, the dashed line labeled "B" indicates the _________ pressure.

RMS

Transient Aperiodic Waves

Rapid pressure change some consonants (/p/, /b/)

Source spectrum and resonant spectrum are NOT related

Reason formants DO NOT change on spectrogram when the fundamental frequency changes

consonant clusters

Spanish does not have these, while English does (it depends on the language)

equation under standing wave patterns

Recall that λ = c/f. Rearrange to Fn = c/λ, where Fn = frequency of formant number n, c = velocity of sound (34,000 cm/sec), and the wavelength λ is set by L = vocal tract length (17.5 cm)
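A worked version of the equation above, as a sketch. It assumes the neutral vocal tract is modeled as a uniform tube closed at the glottis and open at the lips (the quarter-wave resonator described elsewhere in these notes), so that the n-th resonance fits (2n - 1) quarter wavelengths into the tube length L.

```latex
\lambda_n = \frac{4L}{2n-1}, \qquad
F_n = \frac{c}{\lambda_n} = \frac{(2n-1)\,c}{4L}

% With c = 34{,}000 cm/sec and L = 17.5 cm:
F_1 = \frac{34{,}000}{4 \times 17.5} \approx 486\ \text{Hz}, \qquad
F_2 = 3F_1 \approx 1457\ \text{Hz}, \qquad
F_3 = 5F_1 \approx 2429\ \text{Hz}
```

These values are close to the rounded 500, 1500, 2500 Hz figures often quoted for a neutral vowel; a shorter vocal tract raises all three.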

Voice bar

Reflects the energy of the F0 of voicing. A dark band at the bottom of a spectrogram.

F2

Related to the length of the oral cavity. If the lips are rounded, the oral cavity is extended. F2 is higher when the oral cavity is shorter. Where tongue is in mouth is where the split is. Tongue advancement: higher frequency when tongue is front and lower frequency when tongue is back.

F1

Related to the volume of the pharyngeal cavity as well as how tightly the vocal tract is constricted. Tongue height: lower frequency when tongue is high and higher frequency when tongue is low.

Transglottal pressure

Relative difference between the pressure above and below the vocal folds ~The "driving" pressure ~needed to keep VF in Vibration

S to Z Reliability

Reliability is questionable because you are measuring how you elicit voice, not the quality of voicing

Output Function

Represents the sound as it leaves your lips. The output spectrum falls about 6 dB per octave: the 12 dB per octave roll-off of the glottal source is partly offset by the radiation characteristic at the lips, which favors higher frequencies. The original glottal spectrum has been filtered (transfer function applied). Formants stay the same; the amplitudes of the frequency components differ because the filter attenuates some of them. Shows how sound generated by the vocal folds is modified by the resonances of the vocal tract.

Why are fricatives thought to be the most precisely articulated sounds, and the ones most often distorted in disorders like dysarthria?

Require a very small degree of constriction to be properly articulated and may require a finer degree of motor control

Silent Gap/stop gap

Silence prior to release. Voiceless stop- initial position=cannot be seen on spectrogram. Other positions=visible as a blank space between the preceding sound and stop. Voiced stop-voice bar is sometimes apparent.

Graphical representation of the frequency and intensity of the sound pressure wave as a function of time

Spectrogram

What is one obvious practical problem with the vocal tract normalization theory?

Research has demonstrated that we can hear vowels correctly even if a speaker has not used any point vowels up to that point. Of course, it may still be possible that the so-called speaker normalization process draws on parts of speech other than the point vowels: for example, coughing, throat clearing, or consonants. The physical appearance of a speaker could also be a factor.

What Determines Damping?

Resistance!

What are formants?

Resonances of the vocal tract - the peaks on a spectrum. The lowest-frequency peak = F1. The next higher peak = F2

To which anatomical structure do we ascribe formant frequency?

Resonating cavities in the vocal tract

Source= vocal folds and vocal tract, what is: resonator, sound, manner, and examples

Resonator= vocal tract, sound= mixed periodic and aperiodic, manner= voiced stops, voiced fricatives, and voiced affricate, examples= /b/ /g/ /z/ /v/

Source= Vocal folds what is: Resonator, sound, manner, and examples

Resonator= vocal tract, sound= periodic, manner= vowel, diphthongs, semivowels, nasals, examples= /i/ /u/ /ai/ /ou/ /w/ /j/ /m/ /n/

Source= vocal tract, what is: resonator, sound, manner, and examples

Resonators= vocal tract, sound= aperiodic, manner= stops, fricatives, affricates, examples= /p/, /s/, /k/, /f/

Falling Intonation

Results from decreased cricothyroid activity; seen as product of running speech

Rising Intonation

Results from increased vocal fold tension, which is the result of increased cricothyroid muscle activity

In general, the male glottal waveform:

Results in a spectrum with more high-frequency energy than that of the female

What involves numerous accented or stressed portions that "... occur with some regularity, regardless of tempo (fast or slow) or tempo changes within the pattern (accelerate, retard)"?

Rhythm

During normal inhalation your...(anatomy)

Rib cage goes up and out while your abdomen pushes out

RMS

Root mean square -Most meaningful measure of sound amplitude -for a sine wave it equals .707 of the peak value -for any other wave it must be calculated either sample-by-sample or symbolically
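The sample-by-sample calculation mentioned above can be sketched as follows: square every sample, average the squares, and take the square root. The example assumes one full cycle of a sampled sine wave and a square wave for comparison; the function name `rms` and the sample counts are illustrative choices.

```python
import math

def rms(samples):
    """Root mean square: square each sample, take the mean, then the square root."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

if __name__ == "__main__":
    peak = 2.0
    n = 1000
    # One full cycle of a sine wave with the given peak amplitude.
    sine = [peak * math.sin(2 * math.pi * k / n) for k in range(n)]
    # A square wave spends all of its time at +peak or -peak.
    square = [peak if k < n // 2 else -peak for k in range(n)]
    print(round(rms(sine) / peak, 3))    # about 0.707 for a sine wave
    print(round(rms(square) / peak, 3))  # 1.0 for a square wave
```

The sine wave comes out near 0.707 of its peak, while a wave with a different shape gives a different ratio, which is why it has to be computed rather than assumed.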

Fricative at 4,000 Hz?

S

When given a phoneme be able to identify the acoustic cues for manner of articulation

SEE ACOUSTIC CUES IN CONSONANT CHART

When given a phoneme be able to identify the acoustic cues for place of articulation

SEE ACOUSTIC CUES IN CONSONANT CHART

When given a phoneme be able to identify the acoustic cues for voicing

SEE ACOUSTIC CUES IN CONSONANT CHART

Fricative at 2,000 Hz?

SH

Cul-De-Sac Resonance

Same as nasal resonance but in oral cavity

Spatial Target Models

Say that a speaker can still produce a sound accurately even in the face of disruption

Glides

Semivowels. Different from diphthongs: more rapid Formant transitions than diphthongs. No steady-state portion. Transitions are very short and really just look like movement from one sound to another. Lips typically rounded for the labiovelars, lengthens vocal tract, increases volume, which lowers all formants. F1 for both glides starts very low. /j/ is like /i/ and /w/ is like /u/.

Uniform/Symmetric Resonator

Sharply/narrowly tuned: transmits (responds to) a narrow range of frequencies. A narrowly tuned resonator responds slowly to driving frequencies (amplitude grows slowly until it reaches its greatest level). A sharply tuned resonator is lightly damped (once forced into vibration, it takes a long time to fade away).

Resonance Curve/Filter Curve/Transfer Function

Shows the response of the resonator at different frequencies. Response is greatest at/near the object's natural frequency.

The above plot:

Shows three periods of the wave

Acoustic Features of Stops

Silent Gap Noise burst at moment of release Rise time and fall time First formant frequency changes as a result of articulation and coarticulation

Fricatives

Sound Source: Spectral Characteristics of the airflow past the point of constriction -Manner of Production: produced through aperiodic random airflow -Acoustic Cues: 1. Duration: average 130 msec. 2. Rise Time: approx. 76 msec.

Nasal

Sounds produced with an open velopharyngeal port, hence nasal emission of the airstream.

Radiant Spectrum

Sounds tend to take on more energy with higher frequencies at the lips

Glide

Sounds whose production requires the tongue to move quickly from one relatively open position to another in the vocal tract; one of the manners of consonant articulation. EX: /w/ or /j/

Transient aperiodic waves

Source: rapid pressure change Some consonants, such as /p/ and /b/ (stops)

Continuous aperiodic waves

Source: turbulent flow through a supraglottal constriction (noise) Many consonants, such as /s/ and /f/ (fricatives)

Nearly Periodic Complex Waves

Source: vocal fold vibration All vowels, many consonants

Sound Propagation

Sound Propagation. Sound propagates through air as a longitudinal wave. The speed of sound is determined by the properties of the air, and not by the frequency or amplitude of the sound.

R-Colored Vowels

Some of the /r/ sound is heard in the following vowels

tensed vowels

Some vowels are longer or shorter than other vowels depending on context, but without context, some vowels are intrinsically longer, which ones?

If the vocal folds close slower, slope: energy:

Steeper slope Less energy (because you lose air faster, can't talk loud because of the lower amplitude)

The places of articulation of the nasal consonants are identical to those of the...

Stop Consonants

Supraglottal noise sources include...

Stop bursts and frication

What articulatory feature is cued by differences in the VOT (special parameter)?

Stop voicing Long VOT = [ -v ] Short VOT = [ +v ]

Consonants

Stops, fricatives, nasals, affricates, liquids, glides

Suprasegmental Features of Speech

Stress, Intonation, and Duration

Articulators

Structures used to produce sounds of speech

Continuous Aperiodic Waves

Turbulent flow through a supraglottal constriction (noise) Many consonants (/s/, /f/)

Glottal (Source) Spectrum

Successive harmonics lose amplitude at rate of 12 dB per octave. At the level of the larynx. Glottis/vocal folds --> source

true

T/F at any one instant in time, the vocal tract shows adjustments for more than one sound

false (transitions are not important for sibilant perception, but they are important for nonsibilants)

T/F formant transition location depend on the articulation, so the transitions are important perceptually for sibilants

true

T/F there is no fixed transition pattern for perception

What is an appropriate response to the following question? WHICH one of those green books is yours?

THIS is my green book

T or F: A phonemic change results from a complete assimilation

TRUE

T or F: Affricates are characterized by the acoustic features of both stops and fricatives...

TRUE

T or F: All fricatives are characterized by the use of an aperiodic source of sound

TRUE

T or F: Consonants can be produced using periodic sources of sound, aperiodic sources of sound, or a combination of both types of sources...

TRUE

T or F: The glide consonants of English are articulated with rapid articulatory movements that cause changes in formant frequencies.

TRUE

T or F: The second formant transition to or from neighboring vowels provides information about stop place of articulation...

TRUE

Glottal Volume Velocity

Takes a while for the vocal folds to open; once open, air flows quickly through before they snap shut. ~Airflow through the glottis

Tense/Lax Duration

Tense- longer in duration Lax- shorter in duration

Where has it been suggested that the feedback system for speech is housed?

The CNS

Density

The amount of mass per unit volume

How does tongue position interact with F1 and F2?

Using [i], [ɑ], and [u] for reference, approximate positions of F1 and F2 can be estimated for other vowels.

The open end (lips), in terms of air particle movement; pressure variation is smallest there

The antinode is what end of the tube?

Antinode

The area of largest amplitude of vibration of a sound wave. Maximum pressure.

Node

The area of smallest amplitude of vibration of a sound wave. Minimum pressure.

The narrow-band spectrogram below shows the following information regarding the utterance:

The articulators were changing position and the fundamental frequency was falling.

What is wavelength?

The distance a sound wave travels during a full cycle

In regards to limitations of the simple target theory, explain dynamic nature of vowel sounds.

The dynamic nature of vowel sounds in context (beginnings and endings of vowels reveal transitions typical for preceding and following consonants; yet these transitions do not throw off perception). in other words, what moment during a vowel production defines its identity? are we are responding to the center frequencies? where would we measure formant frequencies?

2000

The first antiformant for /n/ occurs at around ______ Hz

3000

The first antiformant for /ŋ/ occurs at around _______ Hz

Cutoff Frequencies

The frequency at which a resonant system is considered unresponsive. The point where the intensity of the transmission is reduced by one half (the half-power, or 3 dB down, point).

How does aspiration happen?

The glottis is open at the moment of stop release, allowing the breath stream to flow freely into the upper vocal tract without phonation

Voice Onset Time (VOT)

The interval of time between the release of a stop consonant and the onset of voicing

The "length rule"

The longer the length of the tract, the lower all the average formants will be

What is fundamental frequency (f0)?

The lowest frequency contained in a complex periodic sound

spectral roll-off

The lowest harmonics have the higher amplitude, while the higher harmonics have the lower amplitude.

The first formant of a vowel is always created by:

The lowest resonance of any tube or combination of tubes used to model the vocal tract

What is acoustic reflex?

The middle ear provides protection in the form of the acoustic reflex: the stapedius muscle stiffens the movement of the stapes (the eustachian tube furthermore equalizes pressure). Reduction of about 10 dB below 1000 Hz (if the level reaches 85-90 dB).

What are the spectrographic differences between front and back vowels?

The more front the vowel, the higher the second formant front vowels - higher F2 back vowels - lower F2 The closer F1 and F2 are to each other, the more back a vowel is

The closed end (glottis), in terms of air particle movement; pressure variation is greatest there

The node is what end of the tube?

odd quarter

Using the ____ ________ wavelength relationship, we can determine the lowest resonance for the /s/

What is Rhythm?

The pattern of stress on a series of syllables / speaking w/o pauses in speech, occurring with some regularity

The place of fricatives

The presence of a dominant, relatively high-frequency spectral peak (sibilants) versus its absence (non-sibilants). They are differentiated by the use of single or double constrictions; their placements also differ between anterior and posterior. In volume, the sibilants are louder than non-sibilants.

Suprasegmental Features of Speech

The prosody/melodies of natural speech -Segments: individual sound categories of our language (phonemes) -Combine phonemes through blending and transitions to create meaningful utterances 1. Linguistic information and structure: -Question vs. Statement (intonation) -Noun vs. Verb (stress) 2. Psychosocial: intent and affect 3. Speech Rhythm: based on variations in stress locations and pauses

Vocal fundamental frequency?

The rate at which the vocal folds vibrate

Transducers

Used to measure and move energy. Transforms signals from one form of energy to another. -Microphone: A transducer that changes a sound signal into an electrical signal. (Or, converts air pressure into voltage variation) -Loudspeaker: A transducer that changes an electrical signal into a sound signal

Relationship of formants and vowels

The relationship among formants helps you distinguish one vowel from another. High front vowel /i/ has the most space between F1 and F2 and low back vowel /a/ has the least amount of space between F1 and F2.

How are Formants and Tract Length related?

The resonance of a tube are related to the length of that tube. Longer tubes respond to lower frequencies. Shorter tubes respond to higher frequencies.

Formants

The resonant frequencies of the vocal tract

Formants

The resonant frequencies of the vocal tract. The first 2-3 are most important and occur below 5000Hz. They're related to the volumes of the oral and pharyngeal spaces. Containers with large volumes respond to lower driving frequencies & vice versa.

Hooke's Law

The restoring force is proportional to the magnitude of displacement/distance

Nasal Murmur

The sound generated while the oral cavity is closed at the place of articulation and the nasal sound is being radiated exclusively from the nostrils. Happens during entire sound/spectrogram-entire area of nasal energy.

What does a spectrogram display?

The spectrum of frequencies in a sound or other signal as they vary with time

Fall Time

The speed with which the acoustic signal falls to minimum intensity for a syllable-final stop

Rise Time

The speed at which the maximum intensity of the acoustic signal is achieved for a syllable-initial stop

Utterance Level Declination in F0

The tendency for F0 to decrease over the course of an utterance 1. Statement = Falling F0 (can have different pitch contours based on emotionality and/or intent) 2. Question = Falling + Rising F0 (doesn't always have to be rising, can override the intonation pattern) 3. Within-utterance F0 Contour Variations -Individual syllables may receive a slight upward inflection, whereas the overall pitch of the utterance declines or remains relatively flat -Typically associated with the length and complexity of the utterance as well as intent

Elasticity

The tendency of a volume of mass to return to its original volume following compression

Voice Onset Time

The time between the release of articulatory blockage to the beginning of vocal fold vibration of the following vowel; coordination between laryngeal and articulatory systems

Voice Onset Time

The time between the release of the plosive (burst) and the onset of the following vowel -Primary acoustic cue for perception of voicing 1. Voiceless: 40 to 80 msec. 2. Voiced: -10 to 20 msec. -Influenced by rate and phonetic context -Development: perception at birth, production 15-20 months

Frication

The turbulent noise of a sound. The hissing element of a speech sound, such as an affricate. Frication noise is stronger in sibilants than in nonsibilants.

is not

There (IS or IS NOT?) a one-to-one relationship between the acoustic features and the perceived consonant?

1. tongue height, 2. tongue advancement

What are the two ways to describe vocal tract articulatory posture?

Acoustic-Auditory Target Models

There can be variations on the articulation of sounds, like vowel formants, but the listener can still recognize the sound accurately

In regards to limitations of the simple target theory, explain interpersonal variability.

There is interpersonal variability in vocal tract size and shape (male/female/children's vocal tracts are physically different which leads to different formant frequencies for vowels). Often all formant frequencies are shifted because of a vocal tract size difference; also, proportional differences are possible, depending on the relative size of the oral cavity and the length of the pharyngeal cavity.

How does pitch relate to frequency?

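Pitch is the perceptual correlate of frequency: higher frequencies are heard as higher pitch, but the two are not identical.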

Voicing Cues for Liquids and Glides

They both have f0 (because both are voiced)

F1 & F2 Plots for men, women, and children

They get bigger as frequency gets higher. (Men->women->child)

Are semi-vowels considered consonants or vowels?

They have some aspect of vowels, but they are considered consonants.

While producing a vowel, if a speaker raises his or her fundamental frequency, what happens to the formant frequencies?

They stay the same.

If two spectra with equally-spaced harmonics were measured from two different vowels, and the first spectrum tilted down less than the second one, which one would have the higher pitch?

They would have the same pitch

What is an appropriate response to the following question? Is that your RED book?

This is my GREEN book

o Nearly periodic complex waves o Continuous aperiodic waves o Transient aperiodic waves

Three sources of speech sounds

Waveform axis

Time (x) by Amplitude (y)

Voice Onset Time

Time between burst and when vocal folds start vibrating for vowel following stop. 4 possible VOT values: 1) Prevoicing (VOT lead): voicing begins before the release burst. 2) Simultaneous voicing: voicing onset and release burst occur at the same time (for voiced stops). 3) Short lag: vocal fold vibration begins just after the release burst (for voiced stops). 4) Long lag: voiceless stops; larger VOT.

Voice Onset Time

Time from release of stop closure (marked by the burst) to onset of voicing. Burst + frication + aspiration. Longer for voiceless stops.

Why is the dBA measurement used?

To take the fact that we have different sensitivities at different frequencies into account

/g/

Tongue dorsum mid back (Formants), moves to high back, stops air (Stop gap). Build up air pressure, explode (Burst), short air flow before tongue gets far from dorsum (frication), air continues to flow (aspiration). Tongue dorsum moves to low back (Transition), and vowel space resonates source (Formants).

/n/

Tongue front high front (Coarticulation), tongue root into pharynx (Vowel formants). Tongue apex moves to alveolar ridge stopping oral air flow, tongue dorsum moves a bit forward (Transition). Velum opens and resonates source (Damping of signal). Tongue dorsum moves back and root moves into pharynx (Transition). Vowel space resonates source (Formants).

[i]

Tongue high - low first formant; tongue front - high second formant

/s/

Tongue high in front, low in back (Formants). Tongue apex rises to constrict air at alveolar ridge (Noise). Tongue apex lowers a little (Transition) to high front. Vowel space resonates source (Formants).

/ʧ/

Tongue high in front, low in back (Formants). Tongue apex rises to stop air at alveolar ridge (Stop gap). Tongue taps a bit releasing air (Multiple Bursts). Apex drops and constricts air (space resonates Frication Noise). Tongue apex barely moves down (not much Transition) to high front. Vowel space resonates source (Formants)

How do changes in the Vocal Tract affect vowels?

Tongue position changes for each vowel and filter characteristics change accordingly. Change vocal tract, that changes filter, which changes the output.

Tongue height and tongue advancement

Traditional description of vowel formation

Transfer Function

Transfers energy to frequencies. Another name for resonance

A transducer is a device that:

Transforms energy from one form to another, converts electrical vibration to air pressure vibration, and converts air pressure vibration to electrical vibration.

• Source=rapid pressure change; some consonants such as /p/, /b/; stop consonants

Transient aperiodic waves source

Vocal Folds don't have to close completely to vibrate

True

True or false: vowels are the loudest sounds with longest durations

True. vowels are the loudest sounds with the longest durations. vowels function as nuclei of syllables. phonation energy comes out easy and efficiently for vowels.

Diameter of Tubes

Tubes may vary in diameter along their length - tube resonances change (F1-F3) - some frequencies are boosted or reduced

The resonance of a tube is based on one-quarter, three-quarter, five-quarter (and further odd-numbered) wavelength resonances for which types of tubes?

Tubes open at one end and closed at the other
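
The odd-quarter-wavelength pattern can be sketched numerically. This is a minimal illustration, assuming a uniform tube, a speed of sound of 35,000 cm/s, and a 17.5 cm tube as a stand-in for an adult vocal tract; those numbers and the function name are assumptions for the example, not values from this study set.

```python
def quarter_wave_resonances(length_cm, n_resonances=3, speed_of_sound_cm_s=35000.0):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    The nth resonance is (2n - 1) * c / (4 * L): odd multiples of the lowest one.
    """
    return [
        (2 * n - 1) * speed_of_sound_cm_s / (4.0 * length_cm)
        for n in range(1, n_resonances + 1)
    ]


# Assumed 17.5 cm tube (hypothetical adult vocal tract length)
print(quarter_wave_resonances(17.5))
# A tube half as long: every resonance doubles
print(quarter_wave_resonances(8.75))
```

Halving the length doubles every resonance, which matches the cards stating that shorter tubes respond to higher frequencies.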

3 phases of stop articulation

Two occlusions; intraoral pressure build-up; release of pent-up air pressure in oral cavity

Incoherent Sounds

Two or more independent sources: -Blenders, lawnmowers, window air conditioners, cars, etc... -Most common sound sources

Liquid

Two semi-vowels produced with relatively prominent sonority and with some degree of lateral emission of air; one of the consonant manners of articulation. Also called approximants or laterals.

Underlying Waveforms

Two sine waves extracted from complex wave

Combined tubes: /a/

Two tubes closed at one end and open at the other

o Air particles vibrate most effectively at the open end of the tube, and least effectively at the closed end of the tube

Uniform tube model- at what ends do air particles vibrate most and least effectively?

o The open end of the tube (lips) will have a velocity maximum (pressure minimum) o The closed end of the tube (glottis) will have a velocity minimum (pressure maximum)

Uniform tube model- what happens to velocity and pressure at the open end (lips) and closed end (glottis)

Decibel

Unit for comparing the intensity of two different sounds; not a unit of absolute measurement. -Often forced to be an absolute by comparing to (a barely audible) .00002 Pa -This is dB SPL -Logarithmic scale, ratio measure.

What are some methods used to interfere with articulatory movements?

Use of: bite blocks to interfere with jaw movements, metal plates to interfere with labial closure, palatal prostheses to alter alveolar ridge

Place Cues for Glides

Usually the first two formants are sufficient to differentiate between /w/ and /j/ (/j/ has a high F2, similar to the one characteristic of /i/, while /w/ has a low F2 like /u/)

Mass Model

VF are compressed and spring back

Special name of period from stop release burst to the onset of voicing for a following vowel?

VOT - Voice onset time

What is simultaneous voicing?

VOT = Zero, Voiced Stops in English (B, D, and G)

If onset precedes stop release...

VOT is negative

If onset of phonation follows stop release...

VOT is positive

What is prevoicing VOT lead?

VOT measure is negative; vocal folds are vibrating before articulatory release

auditory cues

VOT, F1 cutback

Stops: VOT and Voicing

VOT- time interval from release of articulatory constriction to start of vocal fold vibration for the following vowel VOT represents articulator- laryngeal coordination

High, front, tense, unrounded

VPM for /i/

High, back, tense, rounded

VPM for /u/

Low, front, lax, unrounded

VPM for /æ/

Low, back, tense, unrounded

VPM for /ɑ/

Low-mid, front, lax, unrounded

VPM for /ɛ/

High, front, lax, unrounded

VPM for /ɪ/

High, back, lax, rounded

VPM for /ʊ/

Compact spectrum with a peak about 1.5-2 KHz?

Velar

Which fricative has a nearly flat spectrum?

Velar fricatives

What stops have a converging F2 (3500Hz) and F3?

Velar stops (F3 is velar pinch)

Nasals

Velum is lowered, nasal cavity is coupled to vocal tract

Voicing Bars

Vertical bars on spectrogram that represent voicing

glottal pulses

Vertical lines in the voiced section that go up on the spectrogram, which are the fundamental frequency that you can measure in how many cycles per second

How to produce glides

Vf vibrations, moving from one vocal tract position to another.

Source

Vibrating object, impulse, or other means of causing the initial condensation or rarefaction

Compare wide vs narrow-band

Wide: dark, easy to find bands of energy. Narrow: more faded, shows component harmonics, time resolution isn't good

Bandwidth

Width of dark frequency area on spectrogram

Nearly Periodic Complex Waves

Vocal Fold Vibration all vowels, many consonants

Stop Gaps

Vocal folds are abducted (open) • Total or near‐total absence of energy • Voiceless stops: complete silence • Voiced stops (shorter gap): varying amount of silence (depending on transglottal flow); voicing is low amplitude due to damping; seen as "voice bar" on spectrogram

Which anatomical structure do we ascribe fundamental frequency?

Vocal folds

Fricative /h/

Vocal folds are brought close together such that sufficient airflow will generate an aperiodic noise. Similar to state of glottis for whispering. Spectrum is highly dependent on the following vowel.

Glottal Area Function

Vocal folds are typically adducted for about 60% of each cycle. A _____ shows the state of the glottis from cycle to cycle: opening, open, closing, closed. It shows not acoustic energy but the opening of the glottis over time. Closing happens longer. Use this to see an atypical open-to-close ratio. Can find frequency of the vocal folds.

How to produce nasals

Vocal folds vibrate and vp port must be open. Obstruction in the oral cavity as well. Oral and nasal cavity resonance, only exits through nasal cavity. /m/-obstruction at lips /n/ obstruction at alveolar ridge /ng/-obstruction at velum

Vowels

Vocal sound produced by relatively free passage of the air-stream through the larynx and oral cavity; the nucleus of a syllable; voiced, greater energy and less meaning, always voiced

Resonant Spectrum

Vocal tract configuration that allows for resonance. NOT SOUND

Filter Characteristics

Vocal tract filter is frequency dependent - Allows certain frequencies to pass through the filter with greater amplitude than other frequencies - Frequencies can get intensified

voiceless fricatives

Vocal tract source may be the sole sound source.

What is VOT?

Voice Onset Time - the delay in onset of voicing (relevant in aspiration)

Voice onset time in terms of consonants

Voiced: less than 20 ms. Voiceless: more than 25 ms. The time between the release of the consonant and the onset of voicing.

If VOT is less than 25 ms, the plosive is...

Voiced

Fricatives

Voiced: /ð/ /v/ /ʒ/ /z/ Voiceless: /θ/ /f/ /ʃ/ /s/ weak-----strong

If VOT is 25 ms or greater, the plosive is...

Voiceless
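
The VOT cards above can be summarized as a small decision rule. This is a sketch only, assuming the 25 ms boundary from these two cards (other cards in this set quote slightly different cutoffs) and treating negative values as voicing lead; the function name and labels are made up for illustration.

```python
def classify_vot(vot_ms, boundary_ms=25.0):
    """Label an English stop from its voice onset time in milliseconds.

    boundary_ms is an assumed dividing line; real boundaries vary with
    place of articulation, speaking rate, and phonetic context.
    """
    if vot_ms < 0:
        return "voiced (voicing lead)"   # voicing starts before the release
    if vot_ms < boundary_ms:
        return "voiced (short lag)"      # voicing starts at or just after release
    return "voiceless (long lag)"        # aspirated voiceless stop


for vot in (-60, 0, 10, 25, 70):
    print(vot, "ms ->", classify_vot(vot))
```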

Which sounds have greater intraoral pressure?

Voiceless greater than voiced

The job of a microphone is to convert air pressure variation into:

Voltage variation

Dialect cues

Vowel Duration and diphthongization

Formant Transitions

Vowel production begins while the stop is being released. Superimposed on transient noise. About 50ms. Usually easier to detect for voiced stops because of the continuity of voicing energy between the stop and vowel. The slope depends on the place of articulation and vocal tract positioning for the following sound.

vowels form the approximate shape; tongue height and advancement are not completely accurate

Vowel quadrilateral (what form its approximate shape and what are not completely accurate?)

Diphthongs

Vowels that change resonance characteristics during production

What are the only sounds that can occur alone by themselves and still be meaningful (syllabic) in some contexts?

Vowels. Vowels in isolation can occur; hard to do with consonants.

Sources of energy

Vowels: vocal tract is relatively open. Periodic. Sound source is the vocal folds. Consonants: voiceless-if aperiodic-sound source is vocal tract. Voiced-if periodic and aperiodic-vocal folds and vocal tract.

poor

WITHIN categories of sounds, discrimination is good or poor?

Coordinating & Sequencing Articulator Movements

Waveform and kinematic data (x-ray microbeam) segmented into "units," demonstrating underlying kinematic activity of just a single point on the tongue, and how the movement does not closely correspond to the segmentation of the waveform

Waveform vs Wideband Spectrogram

Waveform: time and amplitude. Wideband spectrogram: time and frequency.

How are vibrations in the air transmitted to the cochlea?

Waves in air --> oscillation of bones --> waves in fluid --> neural impulses to the brain

1. nearly periodic complex waves, 2. continuous aperiodic waves, 3. transient aperiodic waves

What are the three sources of speech sounds?

Wide bandwidth

What bandwidth is formant structure?

Narrow

What bandwidth is harmonic structure

Where the different harmonics and formants are

What can a spectrogram show you?

-Provides detailed characteristics of the harmonics or formants -Provides spectral picture of the acoustic wave

What does a spectrogram do?

lower

What does lip rounding do to all the formants? lower or higher them? It makes the vocal tract longer

Strain

What happens when vocal folds are stretched. Produces a length change of tissue in direction of the force

The first

What harmonic has the lowest frequency?

ending point

What is an off glide

starting point

What is an onglide

The glottis (node)

What is the closed end of the tube

The lips (antinode)

What is the open end of the tube

Front to backness (is the tongue in front, middle, back of oral cavity) (horizontal on quadrilateral)

What is tongue advancement?

If vowels are open (low) or high (closed) (vertical on quadrilateral)

What is tongue height?

Wide bandwidth

With what bandwidth do you see a number of harmonics?

Narrow bandwidth

With what bandwidth do you see one harmonic?

Give a second example of partial assimilation.

When a front vowel follows /k/ (as in "key"), the tongue is usually farther forward on the palate than it would be when a back vowel follows /k/ (as in "caught")

Anticipatory

When a sound is influenced by a following sound

Carry-Over

When a sound is influenced by a preceding sound

What's an example of carry-over assimilation?

When a voiceless sound following a voiceless sound remains voiceless ("cats"), but a voiceless sound preceded by a voiced sound becomes voiced ("dogs")

Lombard Effect

When in a noisy setting, people will speak louder so they can hear themselves and -as a result- can be heard by others. ~is used to treat Parkinson's speech

What is categorical perception?

When listeners are: - able to categorise the stimuli consistently - unable to discriminate between stimuli in the same category

Retroflex

When the tip of the tongue is curled up and back

Shunt/side-branch resonator

When the vp port is opened, the nasal cavities are acoustically coupled to the rest of the tract. Energy in the oral cavity is at a dead end. Longest for /m/ first antiformant~1000Hz, shorter for /n/ ~2000Hz, shortest for /ng/ ~3000-5000Hz. The oral cavity becomes this. Contributes to antiformants.

Does tube length affect formant/resonant frequencies?

When tube length changes, formant frequency changes (frequency at the peaks)

lips- the open end of the tube

Which part of the tube will have a velocity maximum (pressure minimum)?

glottis- the closed end of the tube

Which part of the tube will have a velocity minimum (pressure maximum)?

What are the frequencies contained in a complex periodic wave?

Whole-number multiples of a lowest fundamental frequency.
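
As a quick illustration of that rule, the harmonic series can be listed directly. The 120 Hz fundamental below is an assumed example value, and the function name is hypothetical.

```python
def harmonic_frequencies(f0_hz, n_harmonics):
    """Return the first n harmonics: whole-number multiples of the fundamental."""
    return [k * f0_hz for k in range(1, n_harmonics + 1)]


# Assumed fundamental of 120 Hz; the first harmonic is the fundamental itself
print(harmonic_frequencies(120, 5))
```

Because every harmonic is tied to the fundamental, raising F0 spreads the harmonics farther apart, while the spacing between neighbors always equals F0.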

Measuring Amplitude

Why? -Equipment: Avoid distortion and to calibrate. -Humans: Hearing damage, comfort, and diagnosis. How? -Static Pressure -Dynamic Pressure -Microphone & Meter

Spectrogram: Filters Wide-Band & Narrow-Band

Wide-band/broad-band filter: filter bandwidth 300-500 Hz. Energy from several adjacent harmonics is added together. F1 & F2 visible as dark concentrations of energy. Analyzes a short stretch of time = 3-5 ms. Can determine F0 by counting the number of individual vertical lines per unit time. Excellent time resolution. Narrow-band filter: filter bandwidth 45-50 Hz = frequency resolution is much finer; can see harmonic components active/amplified. Cannot isolate each cycle of vibration. Identifies individual harmonics. Analyzes a little longer stretch of time = 20 ms. Excellent harmonic resolution.
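
The trade-off between time and frequency resolution can be approximated with a reciprocal rule of thumb: the analysis window lasts roughly 1 / bandwidth. The sketch below assumes that simple relation (exact values depend on the window shape a given analysis program uses), and the function name is hypothetical.

```python
def approx_window_ms(bandwidth_hz):
    """Rough analysis-window duration in ms for a given filter bandwidth.

    Assumes duration ~= 1 / bandwidth, which is only an approximation.
    """
    return 1000.0 / bandwidth_hz


for label, bw in (("wide-band", 300), ("narrow-band", 45)):
    print(f"{label}: {bw} Hz -> about {approx_window_ms(bw):.1f} ms")
```

Under this assumption a 300 Hz filter looks at only a few milliseconds at a time (short enough to separate glottal pulses), while a 45 Hz filter needs about 20 ms (long enough to resolve individual harmonics).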

Are vowels carriers of prosody?

Yes, (f0-frequency, intensity, duration)

Can formants occur simultaneously? Do they mix well at the same points?

Yes, all formants occur simultaneously, while a number of their (p) and (v) points overlap with their own kind respectively. However, they do not mix well at the same point which is the reason why formants occur only as odd multiples in the acoustic resonator of the vocal tract; most other resonators occur in simple multiples.

Are vowels necessary for consonants?

Yes, consonants are superimposed onto vowels at the beginning or ending of a vowel

Are resonant poles different than harmonics?

Yes, they go up in odd multiples, whereas harmonics/overtones are simple multiples.

Is resonance present regardless of whether you voice or whisper?

Yes, voice just makes more energy.

Are vowels always voiced?

Yes; non-nasal (unless nasal assimilation)

Are vowels dominant speech sounds perceptually and physically/physiologically?

Yes; they contain much sound energy.

For [l] and [r] what is formant configuration

[l] - f2 is middle; f3 is high [r] - f2 is middle; f3 dips

affricate

___ acoustic features: rise time, duration of frication, relative amplitude in third formant region, stop gap

nasal

___ acoustics: many spectral peaks, but most have low amplitude. antiformants. nasal formant. highly damped formants

fricative

___ articulation: narrow constriction in the vocal tract, when air flow rate is high, turbulence results, perceived as turbulent noise, relatively long duration compared to stops

voiced

___ stop shows vertical striations during the period of closure and remnants of formant frequencies

voiceless

___ stops (highest intraoral pressure) follow the stop release with frication (aspiration)

coupling

______ of the oral and nasal cavities also causes antiresonances or antiformants

energy

______ will be longer in duration because fricatives are continuous sounds.

antiformants

______ can occur when the vocal tract is bifurcated or radically constricted

nasalance

a ratio of the nasal energy to the overall combined nasal and oral energy as measured from the acoustic pressure waveform
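
That ratio can be written out as a short calculation. This is a sketch assuming the two energy values have already been measured separately and that the result is reported as a percentage; the function name and the sample numbers are illustrative only.

```python
def nasalance_percent(nasal_energy, oral_energy):
    """Nasalance = nasal / (nasal + oral), expressed here as a percentage."""
    total = nasal_energy + oral_energy
    if total <= 0:
        raise ValueError("total acoustic energy must be positive")
    return 100.0 * nasal_energy / total


# Made-up measurements: one unit of nasal energy to three units of oral energy
print(nasalance_percent(1.0, 3.0))
```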

What are sibilants characterized as

a "distinctive" hissing noise

Stress serves as what?

a "pointer" telling the listener which information is most important in an utterance

low

a ___ F3 is a distinctive property of rhotic sounds - both consonant /r/ and r-colored central vowels

aspiration (coarticulation effect)

a brief hiss that sometimes occurs after voiceless stops, never after voiced stops. This is likely a function of the transition of the vocal folds from no voicing to voicing (vocal folds moving back to phonation position)

vowels

a category of speech sounds produced with unobstructed vocal tract, usually produced with vocal excitation but not always (whisper), excitation due to glottal vibrations

Why is feedback so important in children?

a child trying to say a certain word tries it out, senses articulator movements and positions, gets tactile and acoustic results, and compares the output of the word with the stored sound pattern of the adult production

resonance

a cold will disrupt (blank) and soft palate problems could affect it.

cleft palate

a craniofacial abnormality that arises when the palatine bones fail to fuse completely during gestation -results in abnormal airflow and resonances (hypernasalization) -bilateral is most traumatic and requires the greatest surgical repair -multiple surgeries are needed because as the child grows the seam falls apart -have to wait at least 3 months to let structures settle

Frequency range in a noise burst release

a cue to place of articulation

the nasal murmur is due to

a formant resonance of the vocal tract from larynx to nostrils

How is a higher F0 for the heavily stressed syllable attained?

a function of increased vocal fold tension, increased expiratory effort leads to increased subglottal pressure and then to extra effort in the larynx

Opening the VP port creates:

a large resonating cavity

effector

a level of motor programs for enactment

executive

a level of motor programs for information processing

voiceless stops typically have

a long voice onset time and a strong release burst

mandible

a major regulator of oral cavity opening during speech

Diphthongs are characterized by what? and what are the 3 stages?

a more or less gradual change in resonance due to fairly slow changes in tongue position (or shape) and mouth opening (rounding). There appear to be 3 stages in a diphthong. On-glide: a relatively steady period during which there is no change. Glide: a gliding/sliding pattern of formants F1 and F2. Off-glide: a relatively steady state during which there is no change; the off-glide is shorter in duration than the on-glide.

Falling intonation is seen as what?

a natural product of running speech

What is aspiration?

a period of voicelessness after stop release, seen in the three English voiceless stops

Feedback seems nonessential for who?

adolescents and adult speakers, speech has already fully developed

What is a servomechanism?

a self-regulation machine where the device output is fed back into the system

vocalic

a set of landmarks that has the frequency and amplitude of F1

voiced stops typically have

a short voice onset time and a weak release burst

Fricative

a sound source is created by a severe constriction within the vocal tract, rather than just at one end of the vocal tract.

nearly periodic complex waves

a source of speech sounds that has vocal fold vibration--all vowels and many consonants

transient aperiodic waves

a source of speech sounds that has rapid pressure change--some consonants such as stops like /p/, /b/

continuous aperiodic waves

a source of speech sounds that has turbulent flow through a supraglottal constriction (noise)--many consonants, such as fricatives like /s/, /f/

What is an affricate?

a stop with a fricative release through a narrow constriction

unit of analysis

a theoretical issue composed of sound, syllable, word, and gesture

dynamic systems

a theoretical issue: explain the mechanism that constrains the potentially infinite number of degrees of freedom of speech production system to a few useful degrees of freedom.

output targets

a theoretical issue: perhaps the CNS has some goal or target output for which it controls muscle activity during speech

coarticulation

a theoretical issue: the adjustment of articulator movements to target more than one speech sound simultaneously. - temporal coordination of multiple articulators

motor programs

a theoretical issue: a pre-structured set of central commands capable of carrying out a movement. sensory feedback is an integral part

Oral release of a stop yields

a transient noise source/ release-burst

Vowels

a vocal sound produced by relatively free passage of the air stream through the larynx and oral cavity; these have an open vocal tract; the nucleus of a syllable (unless word has syllabic consonant)

SVS coarticulation (silence)

a vowel in isolation shows this?

What does pitch rise at the end of an utterance signal?

a yes/no question

What sound is considered a low, back vowel? What are the articulatory configurations associated with this vowel?

a, High F1, Low F2

What sound is considered a high, back rounded vowel? What are the articulatory configurations associated with this vowel?

u, Low F1, Low F2

The loudest vowel that we have is

a, c (backwards c)

What are the cardinal vowels?

a, i, u

What three point vowels assist in understanding the extreme positions in articulatory and vowel formant frequency data

a, u, i

VOT for voiceless stops

Vocal folds are abducted and not vibrating during the voiceless stop gap. They must adduct and build up pressure, then release it; it takes at least 20 msec to do this before vibration begins.

What is the dB SPL of .1 Pa?

about 74 dB SPL
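
The 74 dB figure follows from the standard sound pressure level formula, 20 × log10(p / p_ref), with the 0.00002 Pa reference quoted on the Decibel card. A minimal sketch (the function and constant names are just illustrative):

```python
import math

REFERENCE_PRESSURE_PA = 20e-6  # 0.00002 Pa, the dB SPL reference pressure


def db_spl(pressure_pa):
    """Sound pressure level in dB SPL for a pressure given in pascals."""
    return 20.0 * math.log10(pressure_pa / REFERENCE_PRESSURE_PA)


print(round(db_spl(0.1), 2))   # 0.1 Pa is 5000 times the reference pressure
print(round(db_spl(20e-6), 2)) # the reference pressure itself sits at 0 dB SPL
```

Because the scale is logarithmic, each tenfold increase in pressure adds 20 dB.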

What are we trying to explain in speech perception theories?

about speech perception in general, about the perception of specific sounds (vowels, consonants, etc), and about speech in naturally occurring speech or contrived controlled situations.

What is the frequency for most of the energy in /s/?

above 4000 Hz

suprasegmentals

above the segments (prosody)

harmonic doubling

abrupt appearance of a harmonic series at 1/2 F0 (in between the harmonics)

What do the walls of the nasal cavity do to sound?

absorb sound

Venturi Effect

acceleration of air through a narrow channel

which cranial nerve innervates the sternocleidomastoid muscle

accessory

assimilation

acoustic effect of coarticulation; change in a speech sound's acoustic feature because of context (e.g., the s in "dogs" = /z/)

Nasal phonemes have a strong resonance around 200-300 Hz called a

nasal murmur

duration

acoustic result of tense-lax dimension

What do active speech perception theories have in common?

active theories have in common that they somehow use info about features of speech-production process in explaining speech perception.

During nasal sounds, the Palatoglossus muscle may _____

actively lower velum

When two waveforms are played at the same time they create a new waveform by

adding the waveforms
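
Adding waveforms means summing their amplitudes at each instant. The sketch below is a hedged illustration using only the standard library; the component frequencies and amplitudes are assumed values chosen so that the components are harmonics of 100 Hz.

```python
import math


def sine(freq_hz, amplitude, t_s):
    """Instantaneous value of a pure tone at time t (seconds)."""
    return amplitude * math.sin(2.0 * math.pi * freq_hz * t_s)


def complex_wave(components, t_s):
    """Sum the pure tones in `components` (freq_hz, amplitude) at time t."""
    return sum(sine(f, a, t_s) for f, a in components)


# Assumed components: 100 Hz fundamental plus its 2nd and 3rd harmonics
components = [(100.0, 1.0), (200.0, 0.5), (300.0, 0.25)]

t = 0.0013
period = 1.0 / 100.0  # period of the fundamental, in seconds
print(complex_wave(components, t))
print(complex_wave(components, t + period))  # same value one period later
```

Since every component is a whole-number multiple of 100 Hz, the summed wave repeats every 10 ms, so the result is a complex periodic wave with the pitch of its fundamental.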

roundness

adds to length of vocal tract

formant transitions may vary on....

adjacent vowels

the typical falling intonation contour near the end of a declarative statement - results from the economy of expiratory effort - is due to falling subglottal pressure as air is exhaled - may be accompanied by a switch to the pulse register - is described by "breath group" theory - all of the above

all of the above

Formant 3 has a lot more (p) and (v) points that offer more opportunities to manipulate them which allows what?

allows for more opportunity to make those effects.

What is assimilation?

alteration in the movement of a single articulator

higher (4 kHz - 12 kHz compared to 3 kHz)

alveolar sibilants have ___ frequency energy range than palatal sibilants. but spectral irregularities aren't important in perception

the affricates in english consist of a(n)

alveolar stop followed by a short palatal fricative

What places of articulation make up sibilants?

alveolars and postalveolars

place cues for fricatives (amplitude)

amplitude relative to the vowel is greater for stridents (such as "sh, s, z, zh") than for other fricatives

What are stop-plosives?

an abrupt sudden release of air.

What kind of movement is VOT?

an acoustic movement that describes coordination between laryngeal and articulatory systems

biphonation

a second set of F0 and its harmonics, giving a double series

some fricatives have just

aperiodic, supraglottal noise with no periodic source

maximum sensitivity of the human ear for sound

approximately 3,000 to 4,000 Hz

Axial CAT Scan

approximately the level of the 4th cervical vertebrae; bone (high density) is white; soft tissue (low density) is shades of gray; air is black

What are the acoustic features of affricates?

are a combo of stop and fricative features 1. silent gap, with and without phonation 2. release burst as in stops with extended duration of aperiodic frication noise (as seen in fricatives)

stop consonants

are characterized by a closure in the vocal tract that is released rapidly

The sound at the lips contains the same harmonics as the glottal source, but...

is shaped by the transfer function, so the harmonics have modified amplitudes

What is assimilation of manner of production?

articulators are placed in a different location resulting in a different manner of sound

offglide

articulatory ending point of the diphthong

formant transitions

articulatory movement from stop to vowel entails a formant movement, important for perception, 50ms in duration *as vocal tract changes, formant freq change

onglide

articulatory starting point of the diphthong

F1 increases and F2 decreases/relationship to F2 is unclear

as the front vowels become more open (low) then what?

What describes how speech sounds become like neighboring sounds

assimilation

What are the two context effects?

assimilation and coarticulation

The movement of one articulator is characteristic of ____, whereas the simultaneous movement of two articulators is characteristic of _____.

assimilation; coarticulation

place of articulation

associated with formant transitions (especially F2) but can be difficult to see on spectrogram

nasal murmur

associated with lower amplitudes and resonant frequencies compared to surrounding vowels

voiceless unaspirated

at release

Where does fricative noise originate?

at the articulatory constriction

the voice bar in voiced stops arises because

at the time of stop closure; the pressure in the vocal tract is 0 cm H2O, so phonation can continue until the pressure builds up

What is central neural/internal feedback?

audition and action are external feedback systems, their information is delivered to external receptors

the contraction of the posterior crico-arytenoid muscles has the effect of drawing the vocalic muscles toward/away from the midline

away

average VOT for initial stops (voiced)

/b/ 1 msec, /d/ 5 msec, /g/ 21 msec

evidence that categorical perception is innate

babies and monkeys have it

glides w and j are distinguished

based on F2, j has high F2 and w has low F2

liquids r and l are distinguished

based on F3, r has lower F3 and l has higher F3

When given a phoneme (i.e. /b/ or /p/)

be able to specify place, voice, and manner. LOOK AT CONSONANT CHART BELOW!

When given a phoneme (i.e. /p/, /t/, /k/)

be able to specify the associated acoustic cues.

Why are semivowels and nasals resonants?

because they are characterized by a relatively free flow of air and thus formant structure (nonresonants have little or no formant structure)

VOT (begins and ends)

begins at the release of articulatory constriction; ends at the start of vocal fold vibration. No VOT when stop is in syllable-final postvocalic position (VC), so we use the preceding vowel duration instead.

tri-top theory

Berkowitz. Audition, tactile, visual (tri); top-down: using theory to drive clinical practice. If theory can't explain what is seen in clinic, it's not perfect. Special populations are special.

100 - 8,000 Hz

best part of curve for speech and hearing

what are the points of constriction for /s/ and /z/

between the alveolar ridge and the tongue, and the opening between the upper and lower incisors

Lateral X-ray Image

bone (high density) is white; soft tissue (low density) is gray; air is black; there is a lack of depth in the image, all structures appear on the same plane

What does the end of a declarative sentence mean?

both a decrease in F0 and intensity - heavy sigh, pitch falls as lung volume decreases

The eardrum is a ______ band resonator

broadband resonator. during low freq it moves entirely; only smaller sections vibrate with higher freq.

vocal tract is a

broadly tuned filtered resonator that is highly damped

which of the following is not true? nasals are characterized by ____ - strong, low-frequency nasal murmur (formant) of the nasal passages - build-up of pressure and release burst - complete blockage of the oral cavity - weak formants (antiformants) due to the absorption of sound by the oral and nasal passages - formant transitions as cues for place of articulation - all of the above

build-up of pressure and release burst

vocal fold vibration

builds air pressure up below them to blow them apart, come back together because of the Bernoulli Effect and elastic recoil

two major tongue constrictions

bunched (tongue bunched up towards hard palate) and retroflex (tongue curling up to palate); ALSO need pharyngeal constriction

final stops may or may not have

burst

burst spectra for alveolar

burst has most of the energy around 3000-4000 Hz for a male speaker and an uptilt

burst spectra for labial

burst has most of the energy under 600 Hz and it has an overall downtilt

burst spectra for velar

burst is linked to the F2 of the following vowel and is usually a few hundred hertz higher than the F2 of the following vowel (tend to have narrow spectral peaks)

what is the analysis by synthesis theory

by Kenneth Stevens. sensory only (s-s). in speech perception, auditory patterns that are recognized are compared/matched with self-generated auditory models (of how the listener would produce these same acoustic patterns). for example, the listener hears "beatcha" and applies their own phonological knowledge to infer that the speaker means "beat you". in contrast to the motor theory of speech perception, which holds that the patterns are motor in nature, in this theory the patterns used in perception are entirely acoustic. Stevens later abandoned this theory in favor of his quantal theory.

How is restricted airflow in the nonresonants created?

by articulators forming constrictions in the vocal tract, thus aperiodic noise is created as the airflow passes through

How has proprioception for speech been explored?

by interfering with articulator positions and movements and studying compensatory strategies

Fn=

c/λ
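
As a quick check of the relation above, the sketch below divides the speed of sound by a wavelength. The function name is hypothetical, and the speed of sound (34,400 cm/s) is the value used on another card in this set.

```python
SPEED_OF_SOUND_CM_PER_S = 34_400.0  # value assumed from another card in this set


def frequency_hz(wavelength_cm: float,
                 speed_cm_per_s: float = SPEED_OF_SOUND_CM_PER_S) -> float:
    """Frequency = speed of sound / wavelength."""
    return speed_cm_per_s / wavelength_cm
```

A wavelength of 344 cm works out to 100 Hz; halving the wavelength doubles the frequency.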

consonant vowel formant transitions

can be hard to see depending on vowel, and rate of transition

turbulence

can occur alone or in addition to vocal fold vibration

With regard to speech, define categorical perception.

categorical perception is defined as the phenomenon that we can discriminate speech sound differences only as well as we can identify them (= give them a speech sound label). the prediction of abrupt/categorical perception shifts was tested through use of "pattern playback stimuli"; stimuli could be presented this way with one characteristic systematically varying across dimension.

consonants

category of speech sounds usually produced with vocal tract obstruction, produced with or without excitation

the release burst of stops is filtered by the ___

cavity anterior to the closure point

What is the difference between central neural/internal feedback and proprioceptive feedback?

central feedback is generated by muscle activity, but does not tell about muscle activity itself, as proprioceptive feedback does

the non-muscular part of the diaphragm is called the

central tendon

Vocal folds..

change aerodynamic to acoustic energy

What is complete assimilation?

change from an allophone of one phoneme to an allophone of another phoneme

assimilation

change in articulation of speech sound that makes it more similar to articulation of neighboring sound

most important cue for the perception of stress/intonation

changes in frequency

VOT very well studied

changes with prosody; development: VOT shorter in children; speech disorders: VOT often affected, often continuous voicing

independence of source and filter

characteristics of source and filter can vary independently without affecting the characteristics of the other

stops

characterized by a complete closure somewhere in the vocal tract; /p, b, t, d, k, g/; relatively short in duration, low intensity, wide frequency range (high, mid, or low); 2 acoustical cues: gap and release

Vowels have more sound ___________ than consonants

energy

place cues for stops 1

energy peak in the spectrum of the burst applies mostly to non-final stops

actual glottal volume velocity

energy roll-off is a function of the speed and completeness of vocal fold closure. In general, the low frequency harmonics dominate.

Fletcher & Munson

equal loudness in phons (graph); two dots on the same line are equally loud; highest line = dangerously loud (IRB issues); curves represent equivalent loudness level (intensity as perceived by the average ear); always have frequency, intensity, phons

place cues for fricatives (formant transitions)

formant transitions, especially F2, are especially relevant for distinguishing labiodental fricatives from dental ones (F2 generally starts lower for labiodental fricatives than for dental ones)

shape

every vowel has a different (blank)

aerodynamic targets

evidence from studies of individuals with velopharyngeal incompetence or hearing impairment. aerodynamic stability is an important regulating factor

perturbation studies

examine effect of disturbance to speech production system. It can be anticipated/unanticipated, transient/static, biomechanical/acoustic/aerodynamic alteration

speech science

explore timing and contact patterns for sounds -not amount of pressure -not sounds with no contact

The manner of fricatives is the presences of a relatively _____ period of noise.

extended period of noise (frication).

1. genioglossus, 2. styloglossus, 3. palatoglossus, 4. hyoglossus

extrinsic muscles of the tongue (4)

low amplitude fricatives

f, v, θ, ð, h

nonsibilants/non stridents

f, th, v

Weak Fricatives

f, v, θ, ð -All have medium-gray to nearly invisible random energy -Place of articulation determines frequency range -Voicing determines the duration and existence of voice bars or striations

diphthongs acoustic properties

f1 and f2 are touching but near end start to separate

f2 decreases in frequency as

f1 increases

the glides /w/ and /j/ differ on the basis of ___, which is high in ___ and low in ___.

f2; /j/; /w/

different types of intonation

falling: statement, declarative sentence; rising: yes/no question; level: unfinished sentence, to be continued

t/f sibilant fricatives have less energy than non-sibilant fricatives

false

t/f the left and right lung have the same number of lobes

false

t/f the trigeminal cranial nerve has only sensory branches

false

anticipatory

features of a sound appear earlier than the sound itself; forward coarticulation; in /æm/ the vowel is nasal

retentive

features of a sound carry over to the next one; backward coarticulation; in /no/ the vowel is nasal

A wide-band spectrogram is best for showing the...

filter characteristics of the speech

the vocal tract acts like a

filter reshaping the spectrum of the source

in what part of a syllable is the /l/ considered dark

final

the many-to-one configuration of neurons to muscle affords us

fine motor control

transitions

following the steady state, formant frequencies change direction according to the following sound

p b

for /__ __/ F2 and F3 rise slightly

k g

for /__ __/ F2 and F3 separate steeply and rapidly

t d

for /__ __/ F2 falls and F3 rises slightly

Give an example of short lag.

for /b/, vocal folds begin adduction at the same time of labial occlusion - adduction continues during the hold phase, and at the same time of the release phase, the folds are still adducted, ready for phonation - for /b/, there is a short (0-10 ms) positive VOT value

When is fricative energy very low?

for /f/, /v/ and voiced and voiceless /th/, due to the lack of a resonating cavity anterior to the point of constriction

how loud a sound feels depends on

how much intensity it has (dB SPL), on the y axis; what frequency it is (Hz), on the x axis

Give an example of long lag.

for /p/, the vocal folds begin adduction at some point during the hold phase, so the glottis is still open at the moment of release - VF adduction is not complete until sometime after the stop release, during articulation of the following vowel - for /p/, there is a long (between 60-70 ms) positive VOT value

When is fricative energy high?

for /s/ and /z/, there is a high frequency, high energy noise

closed end of the tube (glottis)

for our uniform tube model, air particles vibrate least effectively where?

open end of the tube (lips)

for our uniform tube model, air particles vibrate most effectively where?

Why don't you hear aspiration in voiced stops?

for the voiced stops, /b,d,g/, the glottis is closed at the moment of stop release, forcing the breath stream to set the vocal folds into vibration, sending vibrating air (phonation) into the upper vocal tract

/i ɪ e æ ɛ/.

for these vowels, many people use the genioglossus to lower the tongue, others lower the jaw

"schwa" (upside down e)

for this neutral vowel sound we model the vocal tract like a uniform tube

/i/

for this sound, the genioglossus pulls the back and root of the tongue toward the front, because the tongue is a muscular hydrostat

/u/

for this sound, the styloglossus pulls the tongue up and back. The hyoglossus pulls the tongue down and back. So the tongue squishes up and back.

short

for voiceless fricatives the front cavity is so ______ it has little filtering effect on the noise energy

concentrations

for ð, there are other __________ of energy at 1500, 2500, and 4000 Hz.

strain gauge

force of lip movement

transitions

formant ______ occur much as they do for other consonants -when oral articulation changes, bends in the formant patterns are seen

children's speech

formant frequency will lower with age due to the changing length of the vocal tract, most dramatic change at puberty, productions become faster and more reliable, nasalization is reduced.

major cue to tell you which diphthong was produced

formant glide, especially the rate of change

What are glides characterized by

formant structures similar to vowels

shorter

formant transitions are often _____ for /l/ than for /r/

/b/, /d/, /g/, /p/, /t/, /k/

formant transitions in the high and low vowels preceding and following what? (6)

1. silence, 2. burst noise, 3. voice onset time 4. post-stop vowel formant transition

four acoustic cues

What are the cues for place on stop plosives?

frequency locus of the burst (a brief moment during aspiration); best visible when the initial plosive is VL. burst is high: alveolar placement (t, d); burst is middle/spread out: velar (k, g); burst is low: bilabial (p, b)

which is stronger frication or non-sibilants

frication (noise)

Voiceless fricatives

frication noise is sole source

on a spectrum displayed, a voiceless fricative shows

frication noise with a frequency range comparable to that of its voiced cognate

initial affricate may look like

fricative

____ have little to no formant structure and often are the result of turbulent airflow through constrictions within the vocal tract

fricatives

The nonresonant consonants of English are the

fricatives, the affricates, and the stops.

consonants range

from vowel-like sounds with relatively open vocal tract EX: glides [w][j] to sounds produced with severe vocal tract constriction [s][t][f]

F2

front vowels have high F2 and back vowels have low F2

the precentral gyrus (motor strip) is located on the - lobe

frontal

tongue muscles

functionally: apex, lamina, dorsum, root. intrinsic and extrinsic muscles

What is the vibrating frequency of the vocal folds

fundamental frequency

intonation

fundamental frequency across the phrase/sentence; tendency for declination; rises for yes/no questions; emotions

Wideband (or Broadband)

generates a display of formants; vertical striations indicate intermittent measurement; broad bands of energy show formants; center of each band of energy is the estimated frequency of the formant; blank spaces indicate silence; filters set at 300-500 Hz. DOES NOT resolve energy to show individual harmonics. Obtains information about timing of changes in vocal tract (VOT, center frequency, multiple harmonics adjacent to one another)

Narrowband

generates a display of harmonics; narrow, horizontal bands represent harmonics of the glottal source; darker bands represent harmonics closest to the peaks of resonance in the vocal tract; blank spaces indicate aperiodicity; filters set between 30 and 50 Hz. Used to measure fundamental frequency and intonation, NOT used for making temporal measurements (duration and VOT)
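
The wideband/narrowband contrast can be illustrated with two rules of thumb, both stated here as assumptions rather than exact properties of any analyzer: a filter narrower than F0 can separate individual harmonics, and time resolution is roughly the reciprocal of the filter bandwidth. The function names are hypothetical.

```python
def resolves_harmonics(filter_bandwidth_hz: float, f0_hz: float) -> bool:
    """Rule of thumb: harmonics are spaced F0 apart, so a filter narrower
    than F0 can show them as separate horizontal bands."""
    return filter_bandwidth_hz < f0_hz


def approx_time_resolution_ms(filter_bandwidth_hz: float) -> float:
    """Rule of thumb: time resolution is about 1 / bandwidth (in msec here)."""
    return 1000.0 / filter_bandwidth_hz
```

With an F0 of 120 Hz, a 45 Hz filter resolves harmonics but smears timing over roughly 22 msec, while a 300 Hz filter blurs the harmonics into formant bands and resolves timing to about 3 msec, which is why wideband displays suit VOT and duration measurements.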

a common name for consonants (j) as in you and (w) as in we is...

glide

/w/ and /j/ are

glides

F2 transitions

glides

semi-vowels

glides /j w/ function as onglides to vowels, function as consonants but have open vocal tract- so sometimes V and sometimes C

F2 and F3

glides acoustics: distinguished among themselves by ___ /w/: F1 and F2 are low /j/: low F1 and high F2

Sonorants

glides or semivowels /w/ and /j/ liquids /r/ and /l/ voiced similar to vowels

F2 = ____ ; F3 = ____

glides, liquids

What type of phonation onset has the shortest vocal rise time?

glottal attack

high frequencies

gradually disappear with age and go away first

glottogram

graph of laryngeal waveform, PGG or EGG, useful info about VF closure events, little info about open phase, periodicity, amp and shape of waveform measured qualitatively

greater energy

greater displacement of air particles =

women's speech

greater harmonic spacing: formant frequencies may be more difficult to estimate, H1 and H2 amplitudes are 6 dB stronger, higher F0 and more open, steeper spectral tilt, lower power

voiceless stops in English typically have voice onset time values of

greater than 25 msec

what are cilia?

hair cells

A spectrum of sound provides information about individual

harmonics

vocal tract resonates more strongly toward

harmonics within tube bandwidths

Voiced consonants

has a periodic laryngeal source

What does F1 do after stop

has a rising transition

stop consonants

have a brief stop gap, complete occlusion of airflow

what are non-sibilants

have a relatively flat spectrum

manner

how the airstream is modified as it passes through the vocal tract

What is auditory feedback?

hearing one's own speech, air and bone conduction

Ultrasound

help visualize tongue movement

For [j], f2 is ____

high

Large resonator yields ____ damping

high

When /l/ is syllable-final: tongue dorsum is:

high

vowel in meat

high F1, close to F2, gap between F3

vowel in a lot

high f1, close f2, gap f3

vowel in cat

high f1, mid f2 and mid F3

F1

high for low vowels and low for high vowels

/i/ distinctive feature

high frequency energy from resonance within oral cavity- small oral cavity, large pharyngeal cavity

Sibilant

high frequency fricative speech sounds; to be a ___ , it needs more energy.

Sibilant

high frequency fricative speech sounds; to be a sibilant, it needs more energy. These are /s/, /z/, /ʃ/ and /ʒ/

What are sibilants and what is the volume

high frequency spectral peak. the sibilants are louder than non-sibilants.

VP port closure is moderate for

high vowels

low

high vowels have a (blank) F1

how is /w/ produced

high, back tongue position, rounded lips

How is /j/ produced

high, front tongue position

Which stop plosives are high and which ones are low when initial plosive is VL?

high- t, d middle- k,g low- p,b

in vowels that follow the plosive...

high- velar k,g mid- alveolar t,d low- bilabial p,b

infant's speech

higher F0 and formant frequencies; intonation is rise-fall, flat, fall (not consistent); phonation types: harmonic doubling, biphonation, vocal tremor, noise components, nasalization

spectral roll-off

higher frequencies lose energy

the /s/ has a ____ - frequency spectral peak than the /sh/ because the /s/ is produced with a ___ cavity anterior to the constriction

higher; smaller

thyroarytenoids; use LCA and interarytenoids to close vf, use PCA to open vf

how do we stop air from coming out?

four

how many effective degrees of freedom during speech are there?

The sampling theorem (also known as the Nyquist theorem) is fundamental to digital audio. It proves that...

if the sampling rate is at least twice the highest frequency in a sound, and if proper low-pass filtering is done, the output sound will be identical to the input sound
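
The sampling condition can be sketched in a few lines. The function names are hypothetical, and the aliasing helper assumes an ideal sampler with no low-pass filtering, to show what goes wrong when the condition is not met.

```python
def minimum_sampling_rate_hz(highest_frequency_hz: float) -> float:
    """Sampling rate must be at least twice the highest frequency present."""
    return 2.0 * highest_frequency_hz


def alias_frequency_hz(signal_hz: float, sampling_rate_hz: float) -> float:
    """Apparent frequency of a sampled pure tone. Tones above half the
    sampling rate fold back to a lower frequency (aliasing)."""
    folded = signal_hz % sampling_rate_hz
    return min(folded, sampling_rate_hz - folded)
```

A sound with energy up to 8,000 Hz needs a sampling rate of at least 16,000 Hz. Sampled at 10,000 Hz without filtering, a 7,000 Hz tone would be indistinguishable from a 3,000 Hz tone.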

large

if the tongue is high front, such as eee- tiny front, (blank) open space in back of tongue

smaller

if the tongue is low back, such as ahhh- makes pharynx what?

close

if tongue height is high it is (blank)

open

if tongue height is low it is (blank)

When are VOT values negative?

if voicing onset precedes stop release

tense

if you can end a word with a vowel it is what?

low pass

if you want to focus on hearing the low frequencies

broad band spectrograms are shown to

illustrate change in vocal duration that depends on voicing feature of postvocalic consonant

retentive (backward)

in "sweet" start lip rounding in /s/

Speech sounds vary intrinsically in what?

in duration

In what is VOT measured?

in initial stops, with 4 categories of values

Where can intonation patterns be seen?

in phrase, word, or sentences

When does tactile feedback occur?

in speech via the articulators contacting one another, or air pressure changes in the glottis or subglottal region

How do speech production models tend to be expressed?

in terms of natural language: verbal descriptions, charts, definitions, and rules

How are assimilation and coarticulation differentiated?

in terms of: - number of articulators involved in each effect - number of speech sounds involved in each effect

anticipatory (forward)

in the "am" start lowering velum in /a/

Give an example of partial assimilation.

in the phrase "eat the cake", the 't' is produced in the lingua dental, rather than the alveolar position due to the influence of the following voiced /th/, thus the new /t/ is an allophone of the original phoneme, thus the change is phonetic not phonemic

Give an example of complete assimilation.

in the phrase "ten cards", /n/ is produced with the tongue dorsum on the velum, in preparation for the velar sound /k/ - this tongue position will actually produce the nasal /ng/ which has a lingua-velar place of articulation, thus there is complete assimilation of /n/ to /ng/

Where is aperiodic noise created?

in the vocal tract

Where is pre voicing VOT lead usually seen?

in voiced stops in Spanish, French, and Italian

maximal gain

increase in intensity = 6 dB (10 log10 4 ≈ 6)
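
The arithmetic on this card can be checked directly. The function name is hypothetical; the formula is the standard decibel definition for an intensity ratio.

```python
import math


def intensity_gain_db(intensity_ratio: float) -> float:
    """Gain in decibels for a ratio of two intensities: 10 * log10(ratio)."""
    return 10.0 * math.log10(intensity_ratio)
```

A fourfold increase in intensity comes out at just over 6 dB, and a tenfold increase at 10 dB.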

Decreasing the opening of the oral cavity results in..

increase pitch

stress

increased effort during the production of a syllable, increased intensity, fundamental frequency, and duration emphasis on syllable or word in a sentence/ compound word

rate

increased speech rate tends to have a direct correlation with movement velocity of orofacial structures

Amplitude of harmonics decreases as frequency ___.

increases

contracting the diaphragm increases/decreases the volume of the chest cavity

increases

F1

increases as the jaw opens; as the mouth gets wider, this increases

What does rising intonation result from?

increases in vocal tension that makes folds vibrate faster, which is a result of increased cricothyroid muscle activity

the relationship between volume and pressure is such that when volume - , pressure - and vice versa

increases, decreases

What increases pitch?

increasing length, decreasing mass, increasing tension NOT: Increasing mass

spatiotemporal index

index of consistency of movement across 10 repetitions of an utterance. sum of the SD for displacement of 10 data points at 50 equally spaced points
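
A minimal sketch of the index described above, assuming each repetition is a list of displacement samples. The time normalization (linear interpolation onto 50 points) and amplitude normalization (z-scores) are assumptions added here so that repetitions of different length and size can be compared; the card itself only mentions the sum of standard deviations. All function names are hypothetical.

```python
from statistics import mean, pstdev, stdev


def resample(trace, n_points=50):
    """Linearly interpolate a displacement trace onto n_points equally spaced points."""
    last = len(trace) - 1
    out = []
    for i in range(n_points):
        pos = i * last / (n_points - 1)
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(trace[lo] * (1 - frac) + trace[hi] * frac)
    return out


def z_normalize(trace):
    """Amplitude-normalize a trace (assumes the trace is not constant)."""
    m = mean(trace)
    sd = pstdev(trace)
    return [(x - m) / sd for x in trace]


def spatiotemporal_index(repetitions, n_points=50):
    """Sum, over the normalized time points, of the SD across repetitions."""
    normalized = [z_normalize(resample(rep, n_points)) for rep in repetitions]
    return sum(stdev(values) for values in zip(*normalized))
```

Identical repetitions give an index of 0, and the value grows as the movement pattern becomes less consistent from one repetition to the next.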

False

infant perceptual system is the same as the core perceptual system of adults. True or False

Feedback is incredibly important for who?

infants and younger children

The number of ways of producing vowel like sounds is virtually _____.

infinite.

What is tactile feedback?

information received from touch, stimulation of touch receptors

What is internal feedback?

information within the brain about motor commands before the motor response (brain telling muscles what to do) - information loop is entirely in the brain

the term "relaxation pressure" refers to the pressure that is created when we relax the muscles used for

inhalation

In what part of a syllable is the /l/ considered light

initial

auditory templates and feature detectors

input/cue is matched to templates

The oral and pharyngeal cavity during the phoneme /a/ is larger or smaller than for /i/?

larger

Mirror, flexible and rigid are types of

laryngoscopes

The fundamental and its harmonics decrease in ____ as they increase in _____

intensity/frequency

What are the 3 limitations of the simple target theory?

interpersonal variability, dynamic nature of vowel sounds, and ideal frequency target for vowel formants

What suprasegmental deals with pitch change

intonation

1. superior longitudinal, 2. inferior longitudinal, 3. transverse, 4. vertical

intrinsic muscles of the tongue (4)

quantal theory

invariance in the signal that we attend to; grew out of distinctive features; when you cross a quantal boundary the sound changes (ex: changes from vowel to fricative to stop), and this is what we are attuned to

cite two components of a sound system

inventory, constraints, phonological rules

Complete assimilation

involves a change outside of phonemic category. Example: "ten cards" - the /n/ in "ten" is articulated with the dorsum of the tongue on the velum in anticipation of /k/. Produces velar nasal /ng/, which is a different phoneme than /n/.

Partial assimilation

involves a phonetic change (one allophone to another). Example: "eat" vs. "eat the cake". The tongue tip typically makes contact with the alveolar ridge for /t/ in "eat". In "eat the cake" the tongue is held at the place of articulation of the /ð/ in "the". Therefore /t/ has been assimilated to the place of articulation of /ð/

what is the motor theory of speech perception? (liberman et al)

involves a sensory and motor component. s-m this theory is a kind of an umbrella theory with multiple components. the first one is the most important. there does not appear to be a 1-1 relationship between acoustic signals, and our speech perception processes. cues are determined by their specific production context. we are aware of how the speaker's vocal tract is used in such contexts. this knowledge helps in comprehension. so we encode and decode audible speech rather than encipher/decipher it. speech is an acoustic code

The cerebellum is/is not part of the CNS

is

the spinal cord is/is not part of the CNS

is

What is diphthongization?

is a process that occurs on any pure vowel if it is spoken in a stressed diphthong like manner. it represents an example of allophonic, rather than contrastive variation.

F0 Declination

is the Tendency for F0 to Decrease Over the Course of the Utterance

what happens to the intraoral pressure during release of a stop

it drops

when /h/ is voiced -

it is embedded between voiced segments

What happens to the intraoral pressure during closure of a stop

it rises

How is Rhythm defined

it's defined according to the timing of syllables and the timing of the space between them

durational cues

juncture cues (breaks/no breaks), ex: keeps talking/keeps stalking; pre-boundary lengthening: at semantic/phrase boundaries, longer duration of final (few) syllables than if they occurred in mid-sentence

Liquids

l (lateral) and r (rhotic); have resonant frequencies (formants) that change fairly rapidly (faster than for diphthongs)

what stop has no constriction in front, so all frequencies

labial stops

what is the place of articulation that stops can be made?

labial, alveolar, velar

What are the four places of articulation

labiodental, linguadental, alveolar, and post alveolar

The vowel bounds are strongly subject to:

languages, dialectical variations, phonetic contexts, and even individual speaking styles (oral vs throaty resonance). Perception of vowels is able to see through this variability and interpret them correctly.

Does /u/ have a large or small oral and pharyngeal cavity

large

Active (motor) theories of speech perception

leans toward motor theory of speech perception. active theories in general have the best opportunity to explain the phenomenon of invariable perception, despite the apparent lack of stable acoustic characteristics in the actual speech signal, like allophonic variation and coarticulation. sounds that are perceived as the same may differ depending on speaker, phonetic context, and specific multiple influences between sounds. a BLUR characteristic.

primary fricative energy can be calculated based on

length of cavity anterior to constriction: f = 34,400 cm/s ÷ (4 × x cm)
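
The formula on this card can be tried with a few cavity lengths. The function name and the example lengths are hypothetical; the speed of sound is the 34,400 cm/s given on the card.

```python
SPEED_OF_SOUND_CM_PER_S = 34_400.0  # value given on the card


def front_cavity_resonance_hz(cavity_length_cm: float) -> float:
    """Primary fricative energy: f = c / (4 * L), where L is the length of
    the cavity anterior to the constriction."""
    return SPEED_OF_SOUND_CM_PER_S / (4.0 * cavity_length_cm)
```

A 2 cm front cavity gives 4,300 Hz and a 4 cm front cavity gives 2,150 Hz, so a shorter cavity in front of the constriction raises the spectral peak.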

2 cues to voicing in fricatives

length of noise, duration of preceding vowel, length of voicing bar

Pitch is determined by (on the level of the vocal folds)

length, mass and tension of vocal folds

Consonants have_____, but _____ than vowels

less energy, more meaning

Nonstridents

less noisy, have acoustic energy across a very broad range of frequencies (can't say specific range) difficult to distinguish non stridents not because of voicing, but because the filters are so similar in length (interdental and labiodental)

What muscle is active during oral speech sounds

levator palatini

F3 is ____ for /l/

level

when the superior longitudinal muscle of the tongue contracts the tip of the tongue

lifts upward

To what extent can the Central Neural/Internal Feedback Hypothesis be tested?

limited by current investigation techniques, which do not allow for safe investigation (unethical bc must interrupt a person's brain processes)

H&H theory (hyper & hypo)

Lindblom; explains variability in production; speaker and listener work together (signal, linguistic knowledge, contextual knowledge); hyper and hypo: if your signal is too messy, clean it up; vice versa

What do linguistically oriented models use?

linguistic and phonetic analysis to describe speech, such as the use of the International Phonetic Alphabet

Vocal tract during /u/ is lengthened due to

lip rounding and protrusion

What's an example of articulatory blockage?

lips closed

What areas have very large amounts of touch receptors?

lips, alveolar ridge, and especially the tongue

/r/ and /l/ are

liquids

F3 transitions

liquids

Approximates

liquids and glides

What is the formant pattern of fricatives

little or no formant structure compared to vowels/semivowels

major cue that tells you which vowel was produced

location of formants

major cue to identifying which semivowel was produced

location of formants; formant transition to/from vowels

cue to place of production for stops

location of most intense frequencies in burst

cue to place for fricatives

location of the most intense frequencies

voiceless stops beginning a word are usually produced with a long/short delay of voicing onset for the next vowel

long

everything else being equal, the difference between male and female frequencies of glottal cycles is due to the fact that men have shorter/longer vocal folds, resulting in higher/lower frequencies

longer, lower

Speech is what type of wave

longitudinal

post vocalic nasals

lose intensity during nasal

Back vowels have a ___ F2

low

For [w], f2 is ____

low

Is the amplitude high or low for a nasal murmur?

low

when /l/ is syllable-initial: tongue dorsum is

low

f3 is ____ for /r/

low (will drop)

vowel in sit

low F1, gap high F2 and close F3

high damping

low energy of all formants

describe the formants of /w/ and /u/

low f1 and low f2

vowel in bought

low f1 close f2 and gap high f3

vowel in ooze

low f1, close f2, gap with high F3

vowel in bet

low f1, gap high F2 and close f3

vowel in book

low f1, small gap f2 and high f3

vowel in got

low first formant, large gap to F2 and close to F3

nasal formant

low frequency around 300 Hz, but the highest energy. consonant energy is reduced because of boogers, higher formants have reduced energy, location changes with place of articulation

During stop gap for voiced

low in amplitude can be voiced all the way through or partially

VP port is looser for

low vowels

high

low vowels have a (blank) F1

Low vowels have a ____ tongue body, or a ___ F1

low, high

which of the following are examples of coarticulation? - lowering the velum a bit early when saying /an/ - tapping the tongue on the palate during production of /l/ - lip-rounding during the /s/ in /su/ - lowering the velum a bit early when saying /an/ and tapping the tongue on the palate during production of /l/ - tapping the tongue on the palate during production of /l/ and lip-rounding during the /s/ in /su/ - lowering the velum a bit early when saying /an/ and lip-rounding during the /s/ in /su/ - all of the above

lowering the velum a bit when saying /an/ and lip-rounding during the /s/ in /su/

oral cavity during /a/ is increased by

lowering the tongue passively by lowering the jaw, and lowering the tongue actively by depressing it

Lip protrusion ____ formants- why?

lowers, it elongates the vocal tract

quarter wave resonator

lowest resonance of neutral vocal tract has a wavelength that is 4 times the length of the tube

F1

lowest resonant frequency of the vocal tract

What are the nasals?

m, n, ng

aerodynamic stability

maintenance of stable air pressure and airflow

What are the 3 ossicles and what dB boost do they give?

malleus, incus, and stapes + 5 dB

What is experimental research?

manipulate experimental variables. (e.g. independent variable or experimental treatment) while controlling the conditions of the study. (pattern playback strategies are very helpful).

What is a phonemic change?

manipulating formants I and II: vowels, diphthongs, semivowels; behaviors: tongue height, tongue advancement, lip opening (rounding, unrounding). manipulating formant 3: [r] sound as opposed to [l]. nasality: couple or decouple the nasal cavity; cul-de-sac resonance

semivowels are divided into

manner classes

Plosive

manner of consonant articulation made by sudden release of air impounded behind an occlusion in the vocal tract. Used synonymously with "stop".

Assimilation can occur according to

manner, place, and voicing

What has been found about /u/?

many have found that the lips begin to round for /u/ well before the actual vowel is to be produced

What does (p) stand for?

maximum pressure

What does (v) stand for?

maximum velocity

Antinode

maximum vibratory amplitude; formant frequency is lowered by constriction; maximum volume velocity or minimum pressure

Stops: Stop Gap voiced

may have voicing through all or part of the stop; will be low in amplitude; can see periodic vocal fold vibration in both the waveform and the wideband spectrogram

phons

measure of human loudness/sensation

nasalance

measurement is ratio of nasal and oral sound pressure, trace looks like air pressure/flow waveform , different norms for different passages
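
The card describes nasalance only as a ratio of nasal and oral sound pressure. One common way to express it, assumed here rather than taken from the card, is nasal energy as a percentage of nasal plus oral energy. The function name is hypothetical.

```python
def nasalance_percent(nasal_energy: float, oral_energy: float) -> float:
    """Nasal energy as a percentage of total (nasal + oral) energy.
    This particular formula is an assumption for illustration."""
    total = nasal_energy + oral_energy
    if total == 0:
        raise ValueError("no acoustic energy in either channel")
    return 100.0 * nasal_energy / total
```

Equal energy in both channels gives 50 percent; a passage loaded with nasal consonants would be expected to score higher than one with none, which is why norms differ between passages.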

electromyography

measurement of electrical activity in muscles caused by synaptic transmission from a motorneuron, measures muscle action potential (MAP), performed by comparing the electrical signal as it passes from one electrode to another electrode

nasometer

measures nasal resonance via nasal and oral microphones partitioned by a sound-separating plate

What are the average fundamental frequencies (f0) for men, women, and children?

men- 138; women- 270; children- ~ 403 for F1

direct realism

not really a theory; you perceive changes in your environment when they are relevant to you; the actual objects of perception are directly perceived; we are active perceivers, constantly learning; still gesture based

schemata

novel sound production learned through motor programs, which are enhanced through repetition

Semivowels are considered consonants because they occur on the _____ of words

periphery

semivowels never act as the ____ of the syllable, on the ____

nucleus/ periphery example: you - open and resonant, but next to vowel.

Frequency refers to

number of cycles per second

Suprasegmentals depend on:

numerous physical changes, covariation of several acoustic variables, and degree of contrast between variables across several syllables

obstruents

obstruct vocal tract - Tongue moves from vowel to obstruent - Tongue moves from obstruent to vowel - Locus (target) frequencies of obstruent

Which of the following, if any, is not true about vowel neutralization or "vowel reduction"? Vowel reduction ___ - involves keeping the articulators in a more central position than would normally be assumed in producing that vowel - results in formant values that are closer to those of the schwa - can be done without loss of intelligibility, as long as the speaker is careful to meet the perceptual needs of the listener - occurs more commonly on stressed syllables - all of the above - none of the above

occurs more commonly on stressed syllables

The vocal tract resonates at odd or even multiples

odd

triangular wave

odd harmonics; 12dB per octave roll-off. We know that the source is not this, and it has odd and even harmonics

higher resonances are at

odd numbered multiples of the lower resonance

sensory integration

often missed in these theories

consonants

one or more areas of relative constriction of the vocal tract. Source of sound: ± voiced and ± turbulent airflow (/s/)

consonant

one or more areas of vocal tract narrowing by some degree of constriction (partial or complete)

why aren't place cues for affricates discussed?

only 1 place for production of English affricates

Nasals

the only phonemes for which sound exits through the nasal cavity. The oral cavity is occluded and the velopharyngeal port is open. ALL NASALS ARE VOICED. Nasal resonance is constant.

F2 transitions and stops

only thing that changes is the start

What is short lag?

onset of vocal fold vibration follows shortly after release burst, voiced stops in English, range from -20 ms to +20 ms

Nasal sounds require a ____ VP port

open (lowered velum)

What are the two kinds of feedback systems?

open and closed loop

lips

open end of the tube (vocal tract)

vowels vs consonants

open versus closed vocal tract is seen as amount of energy passing through vocal tract ex-vowels have more energy

Most speech sounds are

oral

The vowel quadrilateral can be visualized within the

oral cavity

VP port needs to be tighter for

oral obstruents (require airtight seal)

during the production of nasal consonants the airflow is blocked in the - cavity but allowed to go through the - cavity

oral, nasal

What condition may result if eustachian tube fails to permit pressure equalization when opened

otitis media

name 3 parts of the ear

outer, middle, inner

What is an open loop system?

output is preprogrammed, no feedback needed

long lag

over 40 ms

Rising intonation can do what?

override the natural inclination toward falling pitch to express excitement, ask a question, etc

average VOT for initial stops (voiceless)

p 58 msec t 70 msec k 80 msec

the postcentral gyrus (sensory strip) is located in the - lobe

parietal

What is a burst filtered by

part of vocal tract in front of constriction

Aperiodic sound sources can be generated by

partial adduction of the VFs, Various locations along supraglottal VT, Forcing the airstream through a constriction

What do the passive theories of speech perception say?

passive theories assume perception remains in the sensory processing domain entirely, somewhat like perception "falls into place automatically" without our active participation. don't need to refer to production to perceive speech- Fant.

x-ray microbeam

paths of tongue and jaw movements: /kaek/ in normal and loud speech. The upper trace represents the midsagittal contour of the palate.

intonation

pattern of fundamental frequency change in the production of an utterance, production of a statement or question, pitch rise and fall plus stress

juncture

pauses in speech stream

formant

peaks of energy on spectral slice

landmark detections using points of minimal and maximal change

perception is based on what?

when learning one's first language, what comes first?

perception of phoneme production of others

What is a closed loop system?

performance of system is fed back in for check

simple harmonic motion

periodic movement where a proportional amount of movement occurs during vibration pattern

vowels

periodic source, which means they are voiced and produced by VF

vocal tremor

periodic variation of frequency and amplitude

Voiced stops can cause small periodic sound because of

periodic vocal fold vibration

the spinal and cranial nerves are part of the - nervous system

peripheral

formants

peaks in the spectral slice are (blank) because they are the ones that have the highest energy

How is aperiodic noise created?

phonated or unphonated breath stream is sent through constrictions formed in the vocal tract and the combination of strong airflow and narrow constriction makes the airflow turbulent and creates frication

temporal complexity

phonemes become shorter when syllable length increases, speech rate becomes a primary consideration, acoustic cues are not tightly bound, blending occurs so rate is met

What are the 2 kinds of diphthongs?

phonemic and non-phonemic

Complete Assimilation

phonemic class changes. Example: velarization of /n/ before /k/ in "ten cards"

What 3 ways can vowels be analyzed?

phonemic distinction, articulatory properties, and acoustic characteristics

What is a phonemic diphthong?

phonemic or contrastive diphthongs are unique and independent sounds; they are relatively long in duration; for example, "cow", "boy", "I". Some acknowledge the "Iu" diphthong (typical for UK English and some words on the East Coast) as well.

Degree of VP closure varies with _____

phonetic context

What is the perceptual correlate of frequency

pitch

We hear increases in ______, _____ and _____ with stress

pitch (Fo), intensity (Amplitude) and length (duration)

Suprasegmentals involve variations in

pitch, loudness, and duration

Increases in this can also increase frequency

pitch/ subglottal pressure

Direction of F2 transition is a cue for

place of articulation

Consonant classification occurs along the following three dimensions

place of articulation, manner of articulation, voicing

phonetic cues

place, manner, and voicing specific to language

Consonants defining features

place, manner, articulation, voicing

This is how consonants are classified

place, manner, voicing

what aspects of consonants are perceived categorically

place, manner, voicing

Audibly released stops are also called

plosives

semivowels are considered consonants due to their

position of occurrence

If everything is okay, feedback will be what?

positive

What are the points of constriction for /ʃ/ and /ʒ/

posterior to the alveolar region; lips are rounded and protruded

in unreleased final stops, the ___ will be shorter if the stop is ____.

preceding vowel; voiceless

On perception of fricatives, list some important aspects that affect it

presence of fricative noise (manner) intensity (sibilants vs non sibilants) spectral cues for placement no reliable info about the remaining necessary distinctions.

cue to affricate

presence of silence; followed by burst, followed by noise

3 cues to voicing in stops

presence/absence of voicing bar

gating task

present listeners with word fragments of progressively increasing length of 50 ms each

aerodynamic measures

pressure and flow measured in the oral cavity to estimate the pressure/flow at the level of the vocal folds. PAS= phonatory aerodynamic system

Nasals

primary resonator is the pharynx-nasal cavity, whose shape cannot be altered; the oral cavity is a dead-end resonator; antiformants: reduced energy in a frequency range, whose location is a result of place of articulation in the dead-end resonator; voiced and low in amplitude; nasal murmur: low-frequency formant (approx. 300 Hz), with higher formants significantly dampened; formant transitions similar to stops; surrounding vowel nasalized; complete closure of the oral cavity

vocal fold vibration

primary source of sound for speech is the.....

1. lack of invariance, 2. relevant unit of perceptual analysis, 3. lack of segmentation, 4. perceptual normalization, 5. specialization of speech perception, 6. contextual effects

problems in speech perception (6)?

normalization

process of simplification by smoothing out "noise": variability that we can ignore without loss of information

plosives

produced with a period of complete contact between two articulators that briefly stops airflow

tense vowels

produced with greater muscle contraction and are produced at the extremes of articulatory posture, with tongue higher in oral cavity

F3

productions of /r/ are often conspicuous in spectrograms by virtue of the marked changes in ___ between /r/ segments and adjacent sounds.

one way that speakers signal the end of a phrase is by

prolonging the final phonemes

suprasegmentals

prosody; overlays the sequence of connected speech; expresses subtle differences in meaning: stress, intonation, rate

when the inferior longitudinal muscle of the tongue contracts, the tip of the tongue

pulls downward

passive theories of speech perception is

purely sensory

when the posterior part of the genioglossus muscle contracts the tongue is

pushed forward

Rising intonation often implies a

question

Sonorants: liquids

quick articulator movement good formant structure sustainable

The elements of change in semi-vowels occur more ____ than those characteristic for diphthongs.

quickly.

rapid opening or closing gesture

rapid rise/fall in intensity

dendrites are the part of a neuron that - impulses from other neurons

receive

the internal muscles of the larynx are innervated by the - branch of CN X (vagus)

recurrent

PAS (phonatory aerodynamic system)

red lines show when VFs are closed (valleys on diagram) orange show pressure (peaked when VF are valley)

VOT and rate of speech

reduced contrast at faster rates

definition for active theories of speech

refers to own production when trying to perceive

For diphthongs, research suggests that it is the direction and steepness of the change that matters most, and that it is not very important that the on glides and off glides actually hit perceptual target freq.

shows that our system for speech perception is very flexible in coding what is intended rather than what occurs in a physical sense; support the motor theory of speech perception.

The sounds /s/ and /z/ are sibilant or non sibilant

sibilant

are /ʃ/ and /ʒ/ sibilant or non-sibilant

sibilant

A ____ fricative noise is stronger than in _____.

sibilant; non-sibilants

fricative energy varies between

sibilants vs non sibilants

Speech Banana

significant link between hearing and speech

The brief cessation of airflow emitted from the vocal tract underlies the acoustic period of ____ characteristics of /p/, /t/, and /k/.

silence

2 cues to 'stop'

silence of closed phase; burst after silence

the stages in a voiceless stop consonant followed by a vowel, in sequence, are

silence, release burst, frication, aspiration, phonation

What can account for perceptual differences in juncture

silence, vowel-lengthening, presence/absence of phonation or aspiration

closure may be completely

silent

coarticulation

simultaneously articulating more than one phoneme. this is important in the perception of certain consonants

What are complex waves composed of?

sine waves of different frequencies

nasal stops

some linguists call- /b, m/, /d, n/, /g, ŋ/ what?

what are anti-resonances

sound absorbed

what is measured in decibels

sound intensity

what occurs in the inner ear

sound is changed from mechanical energy to vibrations in fluid then to electric impulses

what is the difference between sensation and perception in terms of hearing

sound is received vs meaningful awareness

What is meant by sound magnitude?

sound loudness

nasal murmur

sound of a nasal, acoustic waveform of nasal consonants

Fricative Sound

sound produced by forcing the airstream through a narrow articulatory constriction; one of the consonant manners of articulation.

what occurs in the middle ear

sound waves are changed from acoustic to mechanical energy

what generally occurs in the outer ear

sound waves are conducted to the middle ear

categorical perception

sounds perceived with abrupt shifts between groups

What are the basics of source filter theory?

source = the role of sound from vocal folds filter = the role of the vocal tract in modifying this sound The vocal tract is a fleshy tube, the shape of which can be altered by actions of the speech organs. A sound is created by the vibrating vocal folds and it is then modified by passing through the tube.

Source is to ___ as filter is to ___.

source = vocal fold vibration filter = vocal tract

/w/

source for this is from the vocal vibration

/z/

source for this is same but with vocal fold vibration

when several impulses travel down different axons at the same time towards the synapse with another neuron the effect is - summation

spatial

articulatory gestures

speaker has internal cognitive map of spatial targets of the vocal tract that directs articulator movement.

What are target models?

speakers attempt to hit a series of targets to correspond to the sound they are trying to produce

What was found when looking into compensatory strategies of proprioceptive intervention?

speakers were found to compensate immediately when disrupted, and speech often can continue on normally ex. can still produce /u/ while holding lips back from rounding

association areas are areas that are not dedicated to any

specific function

What are the two acoustic cues to place of articulation of fricatives

spectrum and intensity

False. maturation occurs well into adolescence

speech motor control is mature by age 12. True or False?

Speech production is useless without what?

speech perception

general auditory theory

speech perception is a generic listening skill that we practice a lot and get really good at; nothing special about it perception drives production

constriction

speech production is an aerodynamic phenomenon based on airflow

coarticulation

speech sounds are not produced in isolation but in context of syllables, words, and phrases, individual sounds lose distinctiveness, vocal tract adjusting for more than one sound

Consonants

speech sounds characterized by obstruction of the vocal tract compared to vowel.

the speed of formant transitions is related to the ___ and depends on ___ of articulation

speed of the movements of the articulators; manner

What muscle stiffens in acoustic reflex?

stapedius muscle.

falling intonation often implies a

statement

the speech sound(s) that involve(s) a transient aperiodic sound source is/are/ the

stop and affricate

what do F2 and F3 depend on

stop and vowel

release of the closure

stop burst

vocal tract closure

stop gap

affricates have

stop gap followed by frication

Affricates involve a sequence of articulatory shapes. This sequence is the articulation of a - followed by the articulation of a -

stop, fricative

Affricates

stop-fricative sequence; approx. palatal place of articulation (noise energy in same range as palatal fricatives); short rise time of acoustic energy; may or may not see a distinct release; if the release is not distinct, may be perceptually confused with a fricative

stops/plosives

stopping of airflow and then an explosion/burst; we have pairs of voiced/voiceless stops in English; we have the glottal stop, but it is not a phoneme, it is an allophonic variation; bilabials: /p/ /b/; alveolars: /t/ /d/; velars: /k/ /g/

In what manner of articulation is there complete articulatory closure in the oral cavity?

stops

Nasal sounds have similar constriction sites as ____ within the oral cavity

stops

which manner class(es) involve(s) the build-up of pressure within the vocal tract?

stops and fricatives

which manner class(es) has/have the fastest formant transitions?

stops and nasals

Non-resonant are

stops, fricatives, affricates- dissimilar to vowels

Obstruents

stops, fricatives, and affricates

What are the non resonant consonants?

stops, fricatives, and affricates (all have more restricted airflow)

Obstruent

stops, fricatives, and affricates which have blocked or restricted airflow, have aperiodic sound sources in the upper vocal tracts, and can be voiced or voiceless.

Obstruents

stops, fricatives, and affricates, Characteristics: Blocked or restricted airflow, Aperiodic sound sources in upper vocal tract, May be voiced or voiceless, Supraglottal noise sources, Stop bursts, Frication

coarticulation

the process by which adjacent sounds influence each other's articulatory and acoustic properties -overlapping/simultaneous production of more than one speech gesture -undershoot of the ideal articulatory target for a sound in isolation -normal speech approx 5 syllables per second -typical/normal/expected -NOT THE SAME AS ASSIMILATION

anticipatory (regressive) coarticulation

the properties of an upcoming target influence the realization of the current speech sound -ex. key and coo

A watt is:

the rate energy is transmitted (A measure of sound intensity)

What is the perceptual/ Acoustic cue for a diphthong

the rate of change between formant 1 and formant 2

cavity

the resonating space

Delayed Auditory Feedback may also be a result of what?

the result of forcing a speaker to attend to auditory feedback info which can conflict with articulatory movement info

path

the sequence of positions in space occupied by the articulator

Steady state

the set of formants that characterize a prolonged /l/ or /r/ -may not be evident in all productions

higher the voice

the shorter the vocal tract, the (blank)

What is the Simple Target Theory?

the simple target theory states that vowels are identified perceptually by their formant frequencies (norms for formants 1 and 2 were determined for all vowels by Peterson and Barney in 1952). The theory implies that for a vowel to be decoded, all we need to be aware of is the formant frequencies (I and II).

/s/

the source for this is turbulent airflow off of the alveolar ridge. the tongue tip is up.

Give an example of assimilation.

the speaker takes a "shortcut" and does not hit every articulatory position - one sound is produced in another, similar location to make articulation more efficient

What is the unit of stress?

the syllable

What is coarticulation due to?

the temporal overlap in articulatory gestures for vowels and consonants ("stoop" - lip rounding starts at the /s/)

What is voice onset time?

the time between the release of the articulatory blockage and the beginning of vocal fold vibration for the following vowel

Voice Onset Time

the time from the release of the stop to the onset of voicing (vocal fold vibration); important cue to the identity of stop consonants; requires a burst, so it applies only to stops; useful for pre-stress syllable position (initial stops); VOT for voiced stops is shorter than for voiceless; can also measure closure duration

VOT definition

the time interval between the burst and the onset of voicing

trajectory

the timing of the sequence of positions

Vowel Transitions

the transition between the vowel and consonant and the consonant and vowel. Used to identify the place of articulation for the consonant.

vowel transitions

the transition between the vowel and the consonant (VC) or the consonant and the vowel (CV)--this is an example of coarticulation

the frication noise of the /h/ is produced by ___ and is filtered by ___

the vocal folds held near the midline; the entire vocal tract

formant

the vocal tract has an infinite number of (blank)

rule of standing wave patterns

the vocal tract will resonate only at odd-numbered multiples of the lowest frequency

3 and 3

there are 6 possible degrees of freedom. how many rotational and how many translational?

Describe the place feature of nasals

there are two essential rules: direction and duration. Direction deals with adjustments of F2 to or from the vowel; what is high or low depends on the vowel. Duration is how long it takes to make the F2 adjustments: longest for /ŋ/ (hardest to make are lingua-alveolar so transitions take longer), medium for /n/, shortest for /m/ (the bilabial transition is the shortest)

Describe the manner feature of a nasal acoustically

there is a presence of a nasal murmur and an occurrence of antiresonance (a perception of muffling, or damping). a nasal consonant means that the nasal cavity is coupled to the vocal tract; there is a cul de sac resonance in the back portion of the oral cavity behind the closure. the entire oral cavity is closed and creates a cul de sac resonance

When the jaw is more open...

there is more pharyngeal constriction

formants

these are concentrations of acoustic energy and act like band pass filters.

What following procedures would help you optimize results in interpreting spectrographic images clinically re: vowels?

Use norms that are appropriate for the individual client (consider gender and age when judging a production). When possible, use the client as his/her own control for comparison (use a lucky, chance, good production as a model in therapy). Measure formant frequencies during "steady states", that is, in cases where vowels last long enough for steady states to be available. Train generalization of targets to a variety of contexts (don't assume a production should be identical across phonetic contexts); for example, in some contexts it is normal that vowels "neutralize". Basically, in all applications use information about F1 and F2 as the primary source for vowel feedback.

What is contrastive stress?

used to differentiate between two words that differ only by a syllable ("I told you to REceive the guests, not DEceive them.") - contrast may only be implied ("This is my red bike.") - weakly stressed syllables are as important as strong ones for contrast

How are descriptive and experimental studies related?

usually a new field of study initially engages in descriptive studies, which produce the first insights and hypotheses for experimental studies.

1. differences in physical properties of the larynx and vocal tract, 2. age, 3. gender, 4. habits of articulation, 5. suprasegmental features/speaking rate

variability among speakers may be due to what? (5)

In what stop is the front cavity further back and has a longer front cavity and mid- frequency energy?

velar stops

In oral sounds, what port is closed?

velopharyngeal port

hypernasality

velopharyngeal incompetence, increase in formant bandwidths, decrease in overall vowel energy, introduction of a nasal formant, rise in F1 and lowering of F2 and F3, presence of antiformants

When analyzing Waveforms, this provides information related to amplitude (___ displacement)

vertical displacement

nasal murmur

very low F1 (250-500 Hz) large nasal resonating space and narrow opening

This measure is obtained by having a client maximally exhale following a maximal inhalation

vital capacity

Harmonics

vocal fold resonances

What is long lag?

vocal fold vibration is delayed for a long time after articulatory release, voiceless stops, range from 25-100 ms

Why is aperiodic noise heard when glottis is closed?

the vocal folds are vibrating (they vibrate because they are closed), and the airstream is passing through a constriction

noise components

vocal folds don't close all the way,

the space called "glottis" is anatomically defined by

vocal folds in the front, arytenoids and cricoid in the back

Slope of the source spectrum (spectral roll off) varies by ___.

vocal intensity (talk louder, vocal folds close faster)

frequency dependent

vocal tract filter is what?

formant frequency

vocal tract length affects (blank)?

9 cm

vocal tract length of a child (then F1=34,000/(9x4) = 944.44 Hz)
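
The arithmetic on this card follows the quarter-wave rule for a tube closed at one end, combined with the card stating that higher resonances fall at odd-numbered multiples of the lowest one. A minimal sketch, assuming a uniform tube and the speed of sound of 34,000 cm/s used on the card; the function name and the 17 cm adult length are illustrative assumptions, not values from this set:

```python
def quarter_wave_resonances(length_cm, speed_cm_per_s=34000.0, count=3):
    """Return the first `count` resonances (Hz) of a uniform tube
    closed at one end (glottis) and open at the other (lips)."""
    if length_cm <= 0:
        raise ValueError("tube length must be positive")
    # Resonances sit at odd multiples (1, 3, 5, ...) of c / (4 * L).
    return [(2 * n - 1) * speed_cm_per_s / (4 * length_cm)
            for n in range(1, count + 1)]


# 9 cm child vocal tract from the card: lowest resonance is about 944.44 Hz.
child = quarter_wave_resonances(9)

# 17 cm is an assumed adult length, used here only for comparison.
adult = quarter_wave_resonances(17)

print([round(f, 2) for f in child])
print([round(f, 2) for f in adult])
```

The shorter tube gives higher resonances at every position, which matches the card pairing a shorter vocal tract with a higher voice.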

Formants

vocal tract resonances

voiced fricatives

vocal tract source may be the secondary sound source

each arytenoid cartilage has two processes called...

vocalic and muscular

syllable- final /r/ is often _____ or realized as an __________

vocalized/ extension of the preceding vowel

This is the duration of the period of time between the release of a stop and the beginning of vocal fold vibration

voice onset time

Voiced Sounds

voice onset time is less than 50 ms

Voiceless sounds

voice onset time is 50 ms or more

Consonants produced with a periodic glottal tone are

voiced

glottal sound

voiced

the speech sound in english that involves a combination of complex periodic and continuous aperiodic sources is the ___

voiced fricative

Periodic source + Continuous noise=

voiced fricatives

Periodic source + transient aperiodic source at release =

voiced stops and voiced affricate

Burst releases are stronger for _____ stops; intraoral pressure during closure is greater because the glottis is open

voiceless

Consonants produced with no periodic glottal tone are

voiceless

supraglottic sound

voiceless

a sound that has sustained noise and no "voice bar" in the spectrogram is likely to be a ___

voiceless fricative

Stops: Stop Gap voiceless

voiceless should be silent

Burst release is more intense for voiceless stops than voiced stops why?

voiceless stops: vocal folds abducted (glottis open, more air); voiced stops: vocal folds adducted

Stridents

voiceless: just aperiodic noise as the source; voiced: aperiodic noise and periodic vocal fold vibration; the difference between alveolar and palatal is where the noise energy is: alveolar: 4-8 kHz range of acoustic energy; palatal: 2.5-8 kHz; noise energy goes up to 8000 Hz

a long interval of frication and aspiration occurs in ___ stops because the ___ at the time of stop release.

voiceless; vocal folds are apart

What happens when pitch rise is used with an incomplete utterance?

when used with an incomplete utterance like "let me see...", the conversational partner is less likely to interrupt than if pitch fell at the end

favorite frequencies

when we change the space between jaw, lips, etc. we are changing the what? these are also called formants

bottom

where do you look for dampening--bottom or top? It holds nasal murmur and such

Restoring Force produces changes in the Momentum of the system:

where the restoring force is lowest the momentum (velocity) of the system is highest. Where the restoring force is highest the momentum (velocity) is lowest

vowel transition

where vowel overlaps with previous and following consonant

Narrow Bandwidth

which bandwidth resolves frequency information well (harmonic structure) but time information poorly

lower frequencies

which frequencies have greater energy?

the higher frequency harmonics

which frequency harmonics are resonated more?

because going from totally closed to open

why does f1 rise for all the stops?

wide band vs narrow band frequencies

wide band allows 300 hz (3 harmonics at a time) and narrow band only allows 50 hz (1 harmonic at a time)

bandwidth

wideness of the band is called a (blank)

In the process of speech production, the lexicon supplies the

words

sequence of segments (each of which consists of a bundle of binary distinctive features)

words are represented in memory as what?

higher because the center frequencies for /S/ are higher

would the high pass filter for /s/ be lower or higher for /S/?

Do voiced obstruents combine periodic and aperiodic sources?

yes

Can we produce a retroflex back R (instead of bunched)?

yes, by hitting right spot with something.

what are the 3 types of VOT

zero VOT, Positive VOT, Negative VOT

Norms of Conversational Speech

~ Average: 7 cm H2O ~Range: 3-12 cm H2O ~7-10 cm H2O is normative for speech

S to Z

~ Duration of how long a person can hold /s/ as opposed to /z/ ~ Should be equal durations (ratio of 1). If greater than 1, the glottis is not completely closed ~ 1.4 = pathology
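
The ratio on this card is a simple division of the two sustained durations. A minimal sketch, assuming both durations are measured in seconds and taking the card's 1.4 as the cutoff; the function names and the wording of the returned labels are illustrative assumptions:

```python
def sz_ratio(s_seconds, z_seconds):
    """Ratio of maximum /s/ duration to maximum /z/ duration."""
    if s_seconds <= 0 or z_seconds <= 0:
        raise ValueError("durations must be positive")
    return s_seconds / z_seconds


def interpret_sz(ratio, cutoff=1.4):
    """Label a ratio using the 1.4 cutoff from the card (assumed inclusive)."""
    if ratio >= cutoff:
        return "possible pathology"
    if ratio > 1.0:
        return "above 1: glottal closure may be incomplete"
    return "durations roughly equal"


# Hypothetical measurements, not data from the source.
equal = sz_ratio(20, 20)      # /s/ and /z/ held equally long
elevated = sz_ratio(24, 15)   # /z/ noticeably shorter than /s/

print(equal, interpret_sz(equal))
print(elevated, interpret_sz(elevated))
```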

Vocal Fold Histology/ Viscosity

~ Layers are: 1. Epithelial 2. Superficial 3. Intermediate 4. Deep 5. Thyroarytenoid muscle. As you go down, it becomes harder to vibrate. The superficial layer is the most susceptible to damage because the easier the vibration, the more susceptible to damage. Lower viscosity means it moves more, so there is more damage.

Smitheran and Hixon

~ Tests glottal resistance ~ Measured in cm H2O/ L/ Sec. ~ Have client elicit a syllable train ~ Not exact or specific but can tell you when there is a problem

Laryngeal Airway Resistance

~ measure of the amount of resistance the VF offer airflow ~Measures glottal efficiency
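
The units given above for glottal resistance (cm H2O per L/sec) imply a pressure divided by a flow. A minimal sketch of that division; the function name is an illustrative assumption, the 7 cm H2O pressure is the conversational average listed in this set, and the 0.2 L/sec flow is a hypothetical value chosen only to show the calculation:

```python
def laryngeal_airway_resistance(pressure_cm_h2o, flow_l_per_s):
    """Resistance in cm H2O/(L/sec): driving pressure divided by airflow."""
    if flow_l_per_s <= 0:
        raise ValueError("flow must be positive")
    return pressure_cm_h2o / flow_l_per_s


# 7 cm H2O comes from the norms card; 0.2 L/sec is an assumed flow.
resistance = laryngeal_airway_resistance(7.0, 0.2)
print(round(resistance, 1), "cm H2O/(L/sec)")
```

For a fixed pressure, a lower flow yields a higher resistance value, which is why the measure is read as an index of how much the vocal folds impede airflow.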

Transglottal flow

~How much air goes past a point in a certain amount of time

Horizontal Phase Difference

~Open posterior to anterior ~Close anterior to posterior

Vocal Rise Time

~Where to take acoustic measure ~ When you get a sound to reach a steady state from onset to sound.

Phonemes we need to know

·/s/ 3500 Hz+, dark noise, longer duration. ·/z/ 3500 Hz+, dark noise, shorter duration. ·/ʃ/ 2000 Hz+, dark noise, longer duration. ·/ʒ/ 2000 Hz+, dark noise, short duration. ·/f/ 500 Hz+, light noise, longer duration. ·/v/ 500 Hz+, light noise, short duration. ·/θ/ 2000 Hz+, trailing tail, light noise, short duration. ·/ð/ 2000 Hz+, light noise, short duration. ·/r/ (or the other two) F3 below 2000 Hz, clear formants and striations. ·/i/ F1 & F2 very far apart, clear formants and striations. ·/ə/ Short in duration, clear formants and striations.

voicing cues for affricates

ʤ will be the only one voiced showing F0 and periodicity

What is the formula for wavelength?

λ = c/f
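
A short worked example of the formula, where λ is wavelength, c is the speed of sound, and f is frequency. This is a sketch assuming c = 340 m/s (the same value as the 34,000 cm/s used elsewhere in this set); the function name is illustrative:

```python
def wavelength_m(frequency_hz, speed_m_per_s=340.0):
    """Wavelength in meters from lambda = c / f."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return speed_m_per_s / frequency_hz


low = wavelength_m(100)    # a 100 Hz tone
high = wavelength_m(1000)  # a 1000 Hz tone

print(low, high)
```

Raising the frequency tenfold shortens the wavelength tenfold, since the two are inversely related at a fixed speed of sound.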

Review of terms.....

• Power = Total output of sound source in all directions • Intensity = Rate of energy flow through a unit area • Pressure = Force applied perpendicular to the surface of an object • Watt = Rate energy is transmitted • Pascal = Force divided by area, 1 Pa = 1 newton/meter²

Variability in Voice Onset Time

• Prevoicing: voicing begins just before release (negative) • Simultaneous: voicing begins on release (near 0) • Voicing begins after air is released (positive)

Acoustic Cues for Stops

• Stop gap • Burst • Voice onset time • Post‐stop vowel formant transition

Mobile Articulators

• Tongue • Mandible • Velum • Lips • Pharynx • Larynx

Release Burst

• Transient burst noise on release of the stop gap and impounded air • Duration approximately 10-30 ms for voiced stops and slightly longer for voiceless cognates • Observed in waveform as sudden change in amplitude • Observed in spectrogram as gray broadband

Resonances required for vocal tract model

• Tube open at one end closed at other - Quarter wave resonance • Tube open at both ends -Half wave resonance • Tube closed at both ends -Half wave resonance • Helmholtz resonance - "Jug" resonance

Why is digitizing speech SO NEAT

• Virtually perfect recordings • Absolutely perfect copies • Ease of cataloging and retrieving sound • Flexibility -- once speech is digitally encoded

formant transitions

•Articulation is seldom steady state for long •Articulators moving from one sound to another - coarticulation happens

vowel nasalization

•Coarticulatory effect •Portion of vowel closest to nasal consonant becomes nasalized •Antiformants and formants (zeros and poles) •Acoustic evidence for nasalization - Visible lack of harmonic energy - F1 raised - Dampening of energy (lower spectral peaks) for F1, F2, F3

manner of articulation

•Complete, transient cessation of airflow •Constriction with continuous airflow •Fricatives constrict the air all the way from the glottis with continuous airflow

liquids

•Formant transition similar to vowels, with a steady state portion (depending upon context) •/l/ complex due to lateral emission - Formants & antiformants - Antiformants (zeros) dampen energy - Arise from division of airflow in vocal tract (similar to homorganic /n/) •/r/ - F3 decreases (mid-palatal constriction) (VC)

Suprasegmental

•Frequency, duration, amplitude •Frequency of voicing •Duration of voicing •Amplitude of voicing

Periodic Complex Waves

•Fundamental frequency plus some combination of harmonics •Harmonics are frequencies related to the fundamental by a ratio of whole numbers •For example: 1:2, 1:3, 1:4... •If this is the case, the frequencies are said to be in harmonic relationship with one another
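
The whole-number relationship can be illustrated with a short sketch. The function names and the 100 Hz fundamental are assumptions made for this example only.

```python
def harmonic_series(f0_hz, count):
    """Return the first `count` harmonics of f0 (the fundamental is harmonic 1)."""
    return [n * f0_hz for n in range(1, count + 1)]


def is_harmonic_of(freq_hz, f0_hz, tolerance=1e-6):
    """True if freq_hz is a whole-number multiple of f0_hz (ratio 1:1, 1:2, 1:3, ...)."""
    ratio = freq_hz / f0_hz
    return ratio >= 1 and abs(ratio - round(ratio)) <= tolerance


if __name__ == "__main__":
    print(harmonic_series(100, 4))   # [100, 200, 300, 400]
    print(is_harmonic_of(300, 100))  # True  (ratio 1:3)
    print(is_harmonic_of(250, 100))  # False (ratio 1:2.5 is not a whole number)
```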

approximants

•Glides (Semivowels) - Lingua-palatal /j/ & bilabial /w/ •Liquids - Retroflex /r/ & lingual-alveolar /l/ •Constriction insufficient for Venturi effect & frication noise •Are consonants despite being relatively unconstricted & having formants, because they cannot be syllabic nuclei •Central stream of airflow except /l/ •Like fricatives, lip rounding/protrusion is an important feature for some (not just /w/ but also /j/ & sometimes /r/ & /l/)

Dimensions and Common Units:

•Magnitude/Amplitude: Pressure (Pascals, Pa) •Frequency: Repetitions per second of wave period (Hertz, Hz) •Period: Duration of single repetitions (Seconds, Milliseconds) •Phase: Location in cycle expressed in circular scale (Degrees) •Wavelength: Length of one repetition in a medium
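
Frequency, period, and wavelength are tied together by two simple relations: period = 1 / frequency, and wavelength = speed of sound / frequency. The sketch below shows both; the function names and the 343 m/s speed of sound in air are assumptions for illustration.

```python
def period_s(frequency_hz):
    """Duration of one repetition, in seconds: T = 1 / f."""
    return 1.0 / frequency_hz


def wavelength_m(frequency_hz, speed_m_per_s=343.0):
    """Length of one repetition in the medium, in meters: wavelength = c / f.

    The default speed (343 m/s) is an assumed value for air at room temperature.
    """
    return speed_m_per_s / frequency_hz


if __name__ == "__main__":
    f = 100.0  # hypothetical 100 Hz tone
    print(period_s(f) * 1000.0, "ms per cycle")
    print(wavelength_m(f), "m per cycle")
```

For the assumed 100 Hz tone this gives a period of 10 ms and a wavelength of about 3.43 m in air.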

nasal airflow and acoustics

•Regulated by the velopharyngeal port •Excessive nasal resonance in the acoustic signal is perceived as hypernasality

intra-oral air pressure

•The air pressure within the oral cavity • Dependent upon - Degree of constriction of the phoneme - Intensity

Burst and Aspiration Noise of Voiceless Stop

•The voiced/voiceless categorization is not always straightforward •VOT = time from release of stop closure to onset of voicing; can be variable - Prevoicing: voicing begins just before release - Simultaneous: voicing begins upon release - Lag: voicing begins after air is released •<20 ms = voiced; >25 ms = voiceless •Categories: prevoiced, short lag, long lag - Under 20 ms is short lag, while over 40 ms is long lag
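
The three-way VOT categorization can be sketched as a small classifier. This is only an illustration: the function name is hypothetical, the 20 ms and 40 ms cutoffs are taken from the card above, and the label for values between the two cutoffs is an assumption, since the card does not say how to treat them.

```python
def classify_vot(vot_ms, short_lag_max_ms=20.0, long_lag_min_ms=40.0):
    """Label a voice onset time (in ms, relative to stop release).

    Negative values mean voicing started before the release.
    The cutoffs default to the 20 ms / 40 ms values quoted on the card.
    """
    if vot_ms < 0:
        return "prevoiced"
    if vot_ms < short_lag_max_ms:
        return "short lag"
    if vot_ms > long_lag_min_ms:
        return "long lag"
    # Assumption: the card gives no label for the region between the cutoffs.
    return "intermediate"


if __name__ == "__main__":
    for vot in (-80, 5, 30, 65):
        print(vot, "ms ->", classify_vot(vot))
```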

Labio-dental /f,v/ & lingual-dental /θ, ð/

•Small anterior resonating cavity •Broad constriction •Low energy, broad spectrum

