COMD 5070 Final Exam Study Guide

Palatometry and what it shows us (which sounds show up best? What are its limitations?)

(AKA electropalatography, EPG, or dynamic electropalatography.) It shows the sounds that involve contact with the palate (many consonants, some vowels). A general answer is what we need here, and it's about what you would expect: sounds that make palatal contact (lingual consonants, plus some vowels) show up well, while sounds like bilabials or low vowels won't have much contact. -Useful in assessing articulation problems. -Provides biofeedback during therapy to show right vs. wrong tongue placement. -Looks at contact patterns of the tongue; abnormal articulation is revealed by unusual contact patterns. Limitations: -Influence of motor equivalence: you can achieve the same acoustic result in a few ways, so it isn't realistic to expect a client to perfectly match the clinician's model; a client learning to articulate doesn't necessarily do it the same way as the clinician. -Coarticulation: sounds adjacent to the target sound can influence its production. -Shows tongue contact patterns but does not reveal tongue movements. -Often combined with acoustics. Also used as a research tool to study what the tongue is doing during speech.

Coordinating speaking and breathing (how do speakers time respiration and talking relative to each other?)

---The question refers to starting speech at the top of an exhalation, rather than partway through one. ---It's really about managing speech breathing so that we begin speaking at the end of an inhalation, rather than halfway or even later through an exhalation. ---We also have to have the appropriate respiratory phase for phonation, suitable vocal fold adduction, well-timed voice onsets and offsets, and moving vocal tract structures; timing and coordination have to be precise for speech.

McGurk Effect (what does listener experience?)

--Occurs when there is a mismatch between what is seen and what is heard. --We are used to integrating inputs that match; some people are resistant to this effect. --Visual input influences how we perceive speech sounds. Example: a person on video produces the series /ga ga ga/; the researcher records audio of the person saying /ba ba ba/, then combines the /ba/ audio with the /ga/ video, so you are hearing /ba/ but seeing /ga/. With closed eyes you still hear /ba/, but when you watch the mixed-up video you actually hear /da da da/. Most people hear the difference: visual input influences sound perception. The illusion occurs when the auditory component of one sound is paired with the visual component of another sound, leading to perception of a third sound. The visual information a listener gets from seeing the person speak changes the way they hear the sound.

Controlling speech in movements vs. Sounds vs. Syllables (what do theories suggest about how brain plans speech production?)

--There are several gestures for a single phoneme, and they're NOT separate and isolated: movements are fluid and continuous = coarticulation. --Segmentation: we can identify segments, BUT the movements are nearly continuous. --Sensitivity to sound differences is more refined near the phonetic boundaries; our psychoacoustic skills are sharper there. --Small changes close to the phonetic boundary are detected, while differences distant from the boundary are LESS clear. --We have better sensitivity where it makes a difference to our ability to perceive speech and distinguish categories of phonemes. --Some say that natural abilities to perceive psychophysical boundaries may account for the categories: language may take advantage of natural boundaries, with meaningful speech differences located on either side of a natural, existing boundary.

Perceptual Assimilation (definition and examples)

--We hear an unfamiliar foreign sound and typically put it into one of our own categories, because that is how our brains have become wired from speaking our native language for as long as we have. As English speakers exposed to unusual Thai VOT categories, we experience PERCEPTUAL assimilation. --This can be a problem for native Japanese speakers learning English: they struggle to differentiate /r/ and /l/, since both are variants of a single Japanese phoneme category. With training, they can improve performance and become proficient at distinguishing the sounds and different VOT categories. --BUT some sounds are non-assimilable: clicks aren't English speech sounds, so we have no phonetic place to put them; there is no English category for a click phoneme, which is troublesome for a new listener trying to wrap their head around a new language. --We DON'T need language competence to perceive phonetic differences!

Perceptual magnet effect (what is it? How does it relate to our experience as listeners?)

--Perceptual magnet: acoustic variants CLOSE to the prototype are harder to distinguish from it; the prototype pulls them toward it. --If a variant is close, it is drawn in and perceived to be the prototype. --Acoustic variants far from the prototype are easier to distinguish from each other; there is no magnet effect there. --Acoustic variants in a foreign language are easier to distinguish: the magnet effect relies on experience with a language, so for a native speaker the magnet is stronger. --There is a great deal of variability in the phonemes we produce; how do we recognize what sound it is if every person is different? A sound close to the ideal target tends to be drawn toward that target, so 'close enough' counts: the sound is assigned to a category without needing to be a perfect match. --We each hold a long-term memory of a prototypic sound; sounds that don't fit are dissimilar from the prototype.

Motor program theories (definition and limitations)

--Plans are made before movements begin. --Planning involves movement selection and sequencing: selecting the right movements and placing them in the right order for intelligible speech. --Production is like playing out a written musical score: the composer writes the music and the orchestra plays it, except we are both composer and orchestra. --Reaction-time data support this theory: reaction time is longer before a longer utterance than a short one; in theory, more planning is needed to prepare the longer utterance. --Criticism: can we store all possible movement patterns for every sound or syllable we speak? (a storage and retrieval issue). --Criticism: can a program be flexibly adapted to meet circumstances, such as different rate or loudness conditions?

Fourier Transform = Frequency domain display

-A line SPECTRUM shows the frequency components of a periodic sound. -It is a frequency-domain description of the signal. -It has harmonics that are multiples of the fundamental. -There is nothing between the lines (the lines represent the harmonic frequencies). x = FREQUENCY, y = AMPLITUDE. The analysis splits each individual component out of the total so you can see each one individually. Benefit: it shows us what those individual components are and their relative proportions; the height of each line indicates the strength of that component. Human speech is like this, being 'NEARLY PERIODIC'; the upper harmonics get progressively weaker as you go up in frequency.

Sampling Rate (How does it relate to playback quality, file size, frequencies saved in recording, Nyquist, etc.?)

-Sampling rate (snapshots, specified in Hz): how frequently numbers are stored/written to represent the sampled analog signal: discrete snapshots in rapid succession. Linked with quality level. The numbers represent amplitude values; the more samples, the better we can represent the original signal. A rapidly changing signal needs more samples to reflect all the changes in the original; with too low a sampling rate, changes are missed. Think of a movie shot at 30 fps: it looks continuous, but anything much lower would look jerky. -Higher sampling rate: gives better fidelity, BUT files use more disk space, memory, processing time, and computational power. -Nyquist frequency = half the sample rate: the highest frequency that can be recorded and played back accurately. You must sample at 2x the rate of the highest frequency you want to capture. Ex: CD sample rate = 44,100 samples (snapshots) per second, which stores signal content up to 22,050 Hz (corresponding to the upper limit of most people's hearing). To record up to 100 Hz, you need a sample rate of 200 Hz (1/2 sample rate = Nyquist frequency = 100 Hz, the original amount).
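
A minimal sketch of the sample-rate arithmetic above (plain Python; the function names are just for illustration):

    def nyquist(sample_rate_hz):
        # Highest frequency that can be recorded and played back accurately
        return sample_rate_hz / 2.0

    def min_sample_rate(highest_freq_hz):
        # Must sample at 2x the highest frequency you want to capture
        return 2.0 * highest_freq_hz

    print(nyquist(44_100))       # 22050.0 Hz -- the CD example
    print(min_sample_rate(100))  # 200.0 Hz -- to record up to 100 Hz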

Acoustic Nasometry (how does it work conceptually?)

-A headset straps around the person's head. -Two microphones, with a divider plate between them. -The upper mic records the nasal acoustic signal; the lower mic records the oral acoustic signal. -The proportion of energy from the two mics = NASALANCE; the display shows the degree of nasalance, i.e., the relative amount of nasal energy to total energy. -Useful in clinical assessment, because you can tell whether a person falls within normal limits for nasalance during production of a given utterance. -Provides a BIOFEEDBACK signal during speech, so the person can see how well they're managing the oral/nasal balance in their speech production. -Nasals /m/ and /n/ would measure HIGH; vowels measure low for nasality because they are mostly oral. -Sold commercially by Kay Elemetrics: the headset attaches to a PC, and a screen display reveals in real time how much nasalance there was; for /m/ or /n/ the reading would be HIGH, with almost nothing from the oral cavity, but for an oral sound like a vowel the nasality percentage would be low. -The manual that comes with the device includes passages tested with numerous people to establish clinical norms, indicating whether a person's production falls within the normal range or exceeds it for nasality. That can help with treatment or medical intervention, possibly involving a prosthetic or surgery.
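
A minimal sketch of the nasalance ratio described above, using RMS energy from each microphone (a toy computation under stated assumptions, not any real Nasometer's algorithm; the signal names are illustrative):

    import numpy as np

    def nasalance_percent(nasal_mic, oral_mic):
        # Nasalance = nasal energy / (nasal + oral energy), as a percentage
        nasal_rms = np.sqrt(np.mean(np.square(nasal_mic)))
        oral_rms = np.sqrt(np.mean(np.square(oral_mic)))
        return 100.0 * nasal_rms / (nasal_rms + oral_rms)

    # Toy example: strong nasal signal, weak oral signal (as during /m/)
    t = np.linspace(0, 1, 8000)
    nasal = 1.0 * np.sin(2 * np.pi * 120 * t)
    oral = 0.2 * np.sin(2 * np.pi * 120 * t)
    print(round(nasalance_percent(nasal, oral)))  # ~83 -> high nasalance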

Types of Spectra: Line, FFT, LPC (they reveal different features of speech; what is each one best suited for?)

-Waveform: time-domain view of sound. -Spectrum: frequency-domain view of sound; the individual ingredients at a single point in time (x = frequency; y = intensity). Spectra is the plural form of spectrum. -Line spectrum (a type of frequency-domain display): shows the frequency components of a periodic sound. A single vertical line = a sine wave; each vertical line is a single frequency.

Time Domain Data (TDD)

-A waveform represents sound directly: (air pressure) changes over time. x = TIME, y = AMPLITUDE

Spectrum-Line

1. A line spectrum is a snapshot in time. 2. A line spectrum shows the FREQUENCY COMPONENTS of a periodic sound. 3. A pure tone has a single vertical line on a spectrum. 4. It separates the components individually, with heights showing the strength of each; the lines represent the harmonic frequencies. x = frequency, y = amplitude

Acoustic goal theories of speech (what do these theories say about how speech is controlled?)

1. Acoustic goal: ---The specifics of movement control are less important than the acoustic/perceptual result. ---MOTOR EQUIVALENCE allows flexibility: there is more than one way to achieve the goal (accomplish the same thing a few different ways). ---Even with a bite block, what comes out matches the goal.
2. Articulatory gesture: ---Vocal tract configurations correspond to sounds. ---Movement patterns are stored for retrieval and use. ---The jaw and tongue movements are important and are the goal of the system: we work for these to be accurately produced, and they are stored somewhere for us to retrieve for use.
3. Aerodynamic targets: ---Another view is that we don't have acoustic or articulatory targets, but aerodynamic targets. ---We have to accurately manage the pressure in the vocal tract by whatever means necessary; ultimately, if we have the right pressures and flows at the right time and context, we will be able to produce speech accurately. ---Maintaining pressure accurately depends on the valves available in the speech production mechanism: 1. the larynx (abducting or adducting more), 2. the velopharyngeal port (open or closed), 3. the constrictions we make with the tongue or lips further downstream in the vocal tract. ---Speech sounds involve regulating pressure & flow; pressure stability relies on correct valving.

Spectrum -(FFT) Fast Fourier Transform

1. Clearly shows us the range of HARMONIC energy. 2. Each peak is a harmonic. 3. BUT it is less clear at showing formants. 4. More revealing of the SOURCE (= voice problems).

Empirical

1. Empirical: Based on data

What is Science?

1. Empirical: based on data. 2. Deterministic: obeys physical laws of cause and effect (not random). 3. Predictive: if you do this, then that will happen; one variable can influence another. 4. Parsimonious: uses the simplest explanation possible rather than making things unnecessarily complicated - boils it down to its essence, without dumbing it down.

Filter types (What do high pass, low pass, band pass, or band reject filters do?)

1. High-pass filter: allows high frequencies through but removes or attenuates lower frequencies. 2. Low-pass filter: allows lower frequencies through but removes or attenuates higher frequencies. 3. Band-pass filter: allows a band of frequencies through but removes or attenuates the frequencies above and below it. 4. Band-reject filter: allows both higher and lower frequencies through but removes or attenuates a band of frequencies.
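
A minimal sketch of the four filter types using SciPy Butterworth designs (the cutoffs and sample rate are arbitrary illustration values):

    import numpy as np
    from scipy import signal

    fs = 16_000  # assumed sample rate (Hz)

    # 4th-order Butterworth designs, one per filter type
    lowpass = signal.butter(4, 1000, btype="lowpass", fs=fs, output="sos")
    highpass = signal.butter(4, 1000, btype="highpass", fs=fs, output="sos")
    bandpass = signal.butter(4, [300, 3000], btype="bandpass", fs=fs, output="sos")
    bandreject = signal.butter(4, [300, 3000], btype="bandstop", fs=fs, output="sos")

    x = np.random.randn(fs)         # 1 second of white noise (all frequencies)
    y = signal.sosfilt(lowpass, x)  # keeps content below ~1000 Hz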

Open-loop vs. Closed-loop control of skilled movements

1. Open-loop: we learn over trials; if successful, we continue to use the pattern confidently. ---predictive ---based on experience ---e.g., archery (once the arrow is launched, no further input can be given to it). 2. Closed-loop: we rely on incoming signals to guide our ongoing actions; the adjustments we make help us correct or improve the results. Constant feedback. ---ongoing ---leads to corrective adjustments ---e.g., driving along a twisty road. For speech we largely use open-loop control, because we don't have enough time to make feedback-based adjustments.

Simplifying Complex EMG Signals (signal processing approaches that help us recognize patterns in what starts out as complex, even messy, signal)

1. Rectification: all negative values are switched to positive. 2. Smoothing: often uses a low-pass filter. 3. Averaging: done across many repetitions of the same behavior, to get a composite picture of the kinds of things happening in the muscles when activated for a given purpose.
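
A minimal sketch of the three steps (NumPy/SciPy; the sample rate, cutoff, and toy data are assumptions for illustration):

    import numpy as np
    from scipy import signal

    fs = 2000  # assumed EMG sample rate (Hz)

    def simplify_emg(trials):
        # trials: shape (n_repetitions, n_samples) of raw EMG
        rectified = np.abs(trials)  # 1. rectification: negatives -> positive
        sos = signal.butter(4, 20, btype="lowpass", fs=fs, output="sos")
        smoothed = signal.sosfiltfilt(sos, rectified, axis=-1)  # 2. smoothing
        return smoothed.mean(axis=0)  # 3. averaging across repetitions

    trials = np.random.randn(50, fs)  # toy data: 50 repetitions, 1 s each
    envelope = simplify_emg(trials)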

Spectrum - (LPC) Linear predictive coding

1. Shows the SPECTRAL ENVELOPE. 2. Good at revealing formants. 3. Doesn't show harmonics. 4. More revealing of the filter (= articulation problems). 5. ***Shows you what the vocal tract is doing: where its resonant frequencies are relative to one another.

What do time vs. frequency domain displays show us about sound?

1. Time domain: shows us the time view. x = time, y = amplitude. 2. Frequency domain: shows us the frequency view. -Sine wave: a single line on a spectrum. -Complex periodic signals: multiple lines. x = frequency, y = amplitude. 3. What would noise look like in a spectrum? -All frequencies, -equal amplitude, -random phase.

Within- vs. across-category changes in sounds (how do small acoustic changes result in a change in the identity of a phoneme?)

1. Within-category change: heard as the same. If small adjustments to VOT are made well below the boundary (e.g., around 10 ms VOT) or well above it (e.g., 50-60 ms), listeners are likely to say that nothing has changed. 2. Across-category change (crossing the phoneme boundary): heard as different. Changes that cross the boundary will be detected; the boundary itself makes a big difference. If you cross the boundary, you perceive a different sound being spoken.

3D Spectrogram features (3 dimensions)

1. x = TIME. 2. y = FREQUENCY. 3. INTENSITY: darker indicates more energy, lighter shows less. A hybrid display showing both time and frequency, evolving over time. -The display reflects the contribution of many structures and movements. -Much detail is present for even simple utterances, so you need to be selective and specific in interpreting the display. -Wide-band spectrogram: good temporal detail but poor frequency detail; shows vertical striations from the glottal pulses. -Narrow-band spectrogram: good frequency detail but poor time detail; shows horizontal bands (a view of the harmonics).
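
A minimal sketch of the wide-band vs. narrow-band tradeoff with SciPy (window lengths of ~3 ms and ~30 ms are typical illustrative choices; the test signal is a stand-in for voiced speech):

    import numpy as np
    from scipy import signal

    fs = 16_000
    t = np.arange(fs) / fs
    x = np.sin(2 * np.pi * 120 * t)  # stand-in for a voiced speech signal

    # Wide-band: short window (~3 ms) -> fine time detail, glottal striations
    f_wb, t_wb, s_wb = signal.spectrogram(x, fs=fs, nperseg=48)

    # Narrow-band: long window (~30 ms) -> fine frequency detail, harmonics
    f_nb, t_nb, s_nb = signal.spectrogram(x, fs=fs, nperseg=480)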

What does the pattern playback machine do?

A 1950s device that creates sounds. -Researchers draw patterns graphically on a display, and the device converts those traces/contours into sounds: roughly the opposite of the spectrographic display that results from acoustic analysis. -Benefit: allows manipulation of isolated, specific components of sound; researchers could make tiny, subtle adjustments to parameters of speech and then study how people responded to those changes. -Why go through all this trouble just to adjust tiny parameters of speech? This is the work of a speech perception researcher. -Computers have since replaced pattern playback devices.

Hypothesis

A TENTATIVE PREDICTION on a specific topic -Should be testable -e.g. "faster movements will be less accurate"

Model

A simulation used to EXPLAIN or TEST or PREDICT what's going to happen. -Computational: weather forecast. -Animal: excised canine larynx. -Synthetic: silicone vocal folds.

Nasal Air Flow (during which sounds would it be high or low?) Aerodynamic Measure

Aerodynamic measures:
1. Measure NASAL air flow: -Nasal sounds have HIGH nasal flow. -Vowels should be LOW. -Pressure consonants should be near ZERO, because the velopharyngeal (VP) port should be closed. -If you do measure nasal FLOW during oral consonants, it indicates leakage.
2. Measure ORAL air pressure: should be high for STOPS/pressure consonants (the low pressure reference is atmospheric pressure around the head; the high pressure source is the lungs).
3. Calculate VP port resistance: divide the ORAL PRESSURE by the NASAL FLOW. VP port resistance is a measure of how well the VP port is closing. In a normal system you would expect the resistance for oral sounds to be extremely high, but in a pathological case this resistance may be somewhat reduced because of LEAKAGE around the edge of the velum as it tries to approximate the pharyngeal wall constriction.
Flow balance: measures nasal & oral air flow and examines the relative proportion of each, as well as the timing of each flow component, relative to the specific phoneme the patient is speaking. Uses a small split-flow mask (like a modified anesthesia mask) with a divider plate in the middle; holes in the mask let air flow through when the patient speaks, and attached instrumentation is specifically designed to measure how much air flows through the holes during speech. Easy to use: it hooks up to a PC sound card and comes with software to quantify the oral/nasal flow proportions for test sentences spoken by the patient. The OroNasal mask from Glottal Enterprises is one such device for measuring the balance of flow.

Tools for transducing articulator movement (how do magnetic or electromagnetic systems track movement? Which tools reveal only lip or jaw (not tongue) movements?)

Measuring articulatory movement. Lip & jaw (not tongue) movements: 1. Head-mounted strain gauge systems. 2. Optical movement detection.
3. Magnetic tracking: tracks position in 3D space with a PERMANENT magnet, BUT it only tracks one magnet, so it can follow only one point in space as it moves during speech production (the magnet can be on any articulator). Used in a study of bilingual speakers. Cons: ---only tracks one magnet ---accuracy depends on keeping the headset/sensors steady ---mild sound distortion from some magnet positions ---interference from metal objects. Pros: ---quick set-up ---gives simple, straightforward speech production measures without a lot of prep compared to other systems.
4. Electromagnetic articulography: uses several electromagnetic (EM) field transmitters to track several structures at the same time, NOT a single permanent magnet. ---Each transmitter has its own frequency, so multiple channels are possible. ---Individual EM sensing coils attached to the articulators pick up signals from the transmitters; tracks a dozen or more sensors at once. ---Allows reference points for stability. ---Wires run to each sensor, so there is some articulatory interference.

What bite block shows us about speech (how does speech change if you fix jaw position?)

Bite block example: you can put a bite block between the teeth and still talk intelligibly, despite adapted movements; it changes the dynamics, but the result is still fine. The system is ADAPTABLE. Motor equivalence (ME) works well in this theory, because it doesn't matter too much how the specific movements are made; all the brain really wants is a good result at the end of the effort. Static perturbation studies: a bite block is placed between the molars to separate the upper and lower teeth by a few mm, and the patient's jaw position is fixed because they hold the block in place by keeping the jaw closed against it. Normal speech with vowels, consonants, etc. involves the jaw bobbing up and down to produce them all, but fixing the jaw with a block takes the jaw out of the equation. It is interesting how quickly and well the lips and tongue adapt to the fixed jaw without any special training or practice: lip movements become slightly bigger to compensate for the jaw no longer carrying the lower lip, and tongue movements become slightly bigger as well. It's not really noticeable to the participant, since it happens so quickly. This shows how adaptable the speech planning and production system is, and suggests that control signals are not rigidly fixed but adaptable to the speaker's circumstances.

Fourier Transforms/Analysis (What is result of Fourier transform of sound? What does it reveal about complex sound?)

Fourier came up with a way of analyzing complex signals and decomposing, or splitting, them into a series of individual components. 1. All periodic sounds are made of a combination of sine waves. 2. You can break them up: -amplitudes vary (how big they are) -phase angles vary (where they are in the cycle) -frequencies vary (many different frequencies). Fourier taught us that even complex sounds can be broken down into their individual sinusoidal components. FOURIER TRANSFORM: we take a time-domain waveform (like a microphone signal), analyze it, and create a spectrum from the time-domain waveform, like analyzing a cake to learn its ingredients. Time domain: x = time, y = amplitude. THEN the Fourier transform gives the frequency domain (spectrum: a slice in time): x = frequency, y = amplitude.
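
A minimal sketch of the transform on a synthetic, nearly periodic signal (NumPy; the 100 Hz fundamental and the harmonic amplitudes are made-up illustration values):

    import numpy as np

    fs = 8000                 # sample rate (Hz)
    t = np.arange(fs) / fs    # 1 second of time
    # Complex periodic signal: 100 Hz fundamental plus two weaker harmonics
    x = (1.00 * np.sin(2 * np.pi * 100 * t)
         + 0.50 * np.sin(2 * np.pi * 200 * t)
         + 0.25 * np.sin(2 * np.pi * 300 * t))

    spectrum = np.abs(np.fft.rfft(x))          # time domain -> frequency domain
    freqs = np.fft.rfftfreq(len(x), d=1 / fs)  # frequency axis of the spectrum
    print(freqs[spectrum > 100])               # the lines: [100. 200. 300.]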

Two types of Coarticulation (Anticipatory vs. Retentive)

Coarticulation: the way sounds influence each other; happens all the time. Children must learn to coarticulate. Each segment has specifications (voiced/voiceless, fricative vs. stop). Individual sound features influence neighboring sounds in 2 directions. Forward (anticipatory): an earlier sound is influenced by a later sound; there is anticipation of an upcoming sound. /spoon/ has lip rounding at the /s/, which doesn't happen in isolation; the /oo/ influence is strong, so lip rounding happens two sounds earlier, during the /s/. Backward (retentive): a later sound is influenced by an earlier sound. In /no/, the /o/ is nasalized due to the preceding nasal /n/.

What EMG shows us (Where does signal come from? What does it mean?)

EMG: the study of electrical activity in muscles. -Motor neurons innervate muscles. -Motor units consist of a motor neuron plus a few or many muscle fibers. Where does the signal come from? Muscle contraction follows neural stimulation: the brain sends an impulse down the axon to the muscle fibers, causing them to contract when stimulated. The electrical activity measured during contraction with EMG reflects the activity of the muscle; electrodes detect the activity of many motor units. You can't always determine what motion occurred directly from the EMG signal, because there may be co-active muscles on the other side of the joint: electrical activation causes a muscle to contract, but an antagonist muscle may oppose the movement, so the muscle may not shorten as much as expected. Thus the link between EMG and movement is not simple, and researchers use SIMULTANEOUS MEASURES: recording the motion of the limb or other structure at the same time as the EMG activity, since movement is not easily predictable from the EMG activity itself. What does the signal mean? Not straightforward, but the electrical activity represents signals sent to the muscles from the CNS, and so reveals details of neural control: we learn what the brain has told the muscles to do, because the activity measured in the muscles reflects the CNS signals that came down the axon. We learn about the strategies and details of neural control of muscle activation. EMG data can be complicated.

EMG and Electrode Type (Surface vs. Intramuscular Recordings)

EMG is an indirect way of studying movements: the study of electrical activity in muscles. -Motor neurons innervate muscles. -Motor units consist of a motor neuron plus a few or many muscle fibers. -Muscle contraction follows neural stimulation, and the electrical activity is measured during contraction. -The signals are small, so a powerful amplifier is needed (gain often between 1,000 and 10,000). -An antagonist muscle may oppose the movement, so movement is not easily predictable from EMG: the link between EMG & movement is not simple, and many researchers measure EMG & movement (motion of limbs) simultaneously; timing can be very valuable. -We can LEARN what the brain (CNS, via the axon) has told the muscle to do. -EMG data can be complicated: it shows muscles getting different goals from different control areas.
Electrode types: 1. Intramuscular electrode (invasive): for FINE detail of muscle activity; inserted into the muscle so it can measure electrical activity. When laryngeal EMG measures are wanted, you need to record the activity of very small and very inaccessible muscles, like the thyroarytenoid, cricothyroid, and lateral cricoarytenoid, so you have to insert an electrode: usually a tiny hook-wire electrode inserted by hypodermic needle through the skin into the larynx and into these muscles, in order to measure their activity during different speech production tasks. 2. Surface electrode: for overall activity measures of larger muscles (triceps, biceps, muscles of the leg), where you can measure electrical activity through the skin because so many muscle fibers are firing at the same time.
For the velum, EMG involves inserting tiny electrodes into the velar muscles: not comfortable, and not used a lot clinically, but used by some researchers to evaluate activation patterns of muscles in the velopharyngeal mechanism and learn more about the brain's signals sent to these muscles; it helps us understand their coordination better. Clinicians do use EMG for biofeedback, so patients can see the tension they are causing.

VOT & Categorical perception (how does categorical perception differ from continuous perception?)

English has TWO VOT categories: /b/ (voiced) vs. /p/ (voiceless), with a category boundary around +25 ms VOT; in Spanish, the /b/-/p/ boundary sits around -4 ms. 1. VOT perception is categorical: ---an abrupt change with a crossover point ---listeners respond in a nonlinear, very abrupt way ---vary the acoustics along a continuum, with gradual change from start to end ---at one point, the perceived sound abruptly changes ---discontinuous perception of a continuum: a category boundary has been crossed ---within-category change: heard as the same ---across-category change: heard as different. 2. Continuous perception would instead be a gradual change.

Auditory feedback (how does it help with learning to speak vs. How we control speech as adults?)

Feedback is essential for learning: babies experiment with their vocal tracts while making sounds (babbling), listen to the results, and make adaptations over time so that the sounds they produce start to mimic and match those produced by their caregivers. It is also important for quality control, e.g., adventitious deafness: sudden hearing loss later in life after previously normal hearing. As time goes by, such speakers' speech becomes less accurate and less precise: in people with post-lingual deafness, some vowel formants change and fricative spectra are altered, because they can no longer monitor their speech as they used to. When such patients later receive a cochlear implant and some hearing is restored, their speech becomes more accurate again. During early learning of speech and language, auditory feedback is very important for an infant because it helps them learn the association between movements and the sounds that result from those movements; when we become more PROFICIENT speakers, we use auditory feedback in a more general sense, to maintain the quality of what we are doing, and it isn't necessary for moment-by-moment monitoring. Feedback's role comes up in all sorts of contexts (e.g., buying something online: the merchant wants you to leave feedback, and others' feedback can help you make a purchase decision). It is also important for speech production, where there are a couple of different kinds: open-loop and closed-loop control.

Non-speech oral motor exercises (what are some problems with these approaches?)

If you want better speech, then speech should be practiced. Some clinicians believe that strengthening the tongue, through a greater range of motion, will improve a patient's ability to speak, BUT NO EVIDENCE SUPPORTS THIS, because of a principle called EXERCISE SPECIFICITY. It boils down to: you get what you train for. If you are training as an athlete to run a marathon, you practice running longer and longer distances. Speech production movements are rapid, accurate, and highly coordinated, but they do not rely on strength, so it doesn't help for a clinician to have a patient force the tongue against resistance, chew gum, or do similar exercises. There is value in testing non-speech behaviors in order to evaluate for disorder: tasks performed to look for confirmatory signs. Ex: a patient comes in with a diagnosis from a neurologist suggesting a condition; when the clinician examines them with an oral peripheral exam (OPE), it helps the clinician learn whether the features observed in those tasks are consistent with the doctor's diagnosis.

Semitone Standard Deviation (STSD) (How do numbers reflect our perception of intonation in speech?)

The mean fundamental frequency (F0) is what we perceive as pitch, but what can be more interesting from a clinical standpoint is F0 variability, because it reflects the speaker's intonation. From statistics, you'll remember that the standard deviation (SD) is a measure of variability around the mean: a small SD in F0 values would reflect a flat, monotone voice; a large SD would mean the voice went well above and below the average, with more rising and falling pitch contours and thus better intonation than a flat, monotone pattern. As mean F0 rises, the SD in Hz tends to rise too, so we convert to semitones: they are scaled so that they reflect proportional changes in intonation around a given average F0. This matches our human perceptual system (pitch perception is not linear) and also allows us to compare speakers whose mean F0 differs. -People tend to have similar standard deviations in semitones but not in Hz. -A semitone is always 1/12 of an octave. -SD in Hz is hard to compare across males and females; semitone standard deviation (STSD) makes values comparable for high or low mean F0.
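
A minimal sketch of STSD under a common formulation (convert each F0 value to semitones relative to a reference, then take the SD; the 100 Hz reference is an arbitrary choice and cancels out of the SD):

    import numpy as np

    def stsd(f0_hz, ref_hz=100.0):
        # 12 semitones per octave; an octave is a doubling of frequency
        semitones = 12 * np.log2(np.asarray(f0_hz) / ref_hz)
        return float(np.std(semitones))

    # Same intonation contour at two mean F0s (e.g., lower vs. higher voice)
    low = np.array([100, 110, 120, 110, 100], dtype=float)
    high = low * 2  # one octave higher; SD in Hz doubles
    print(stsd(low), stsd(high))  # identical STSD values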

Measuring Nasalance (What type of instrument Quantifies this?)

Nasalance: the relative amount of nasal energy to total energy; a representation of the relative proportion of energy from the 2 mics. Useful for clinical assessments because: 1. It determines whether a patient falls within normal limits for nasalance during production of a given utterance. 2. It provides a biofeedback signal during speech: feedback during therapy lets the patient see their own ability to manage the oral/nasal balance in their speech production. Nasalance increases and decreases with different phonemes. Nasometer (Kay Elemetrics): a commercially available device; relies on ACOUSTIC NASOMETRY. Represents degree of nasalance over time: left to right = time; nasalance rises and falls along the y-axis with the production of different phonemes. The manual has a number of passages for the patient to read, tested extensively with large numbers of people to collect clinical norms; it indicates whether the patient's production falls within the normal range or exceeds normal limits for nasality. OroNasal mask system (Glottal Enterprises): designed to measure the relative amount of air flow coming from the mouth and nose during speech, i.e., how much air flows through the little holes in the mask when the person speaks. The pointed part at the top rests on the bridge of the nose. Software quantifies the oral/nasal proportions for test sentences that you have the person speak.

Hypernasality (How perceptual severity relates to physiology)

Nasalization (not necessarily disordered): the oral and nasal cavities are linked; whether it's normal depends on consonant context. Ex: nasalization of a vowel can occur between 2 nasal consonants: in 'moon', the /u/ vowel is between 2 nasals, so it has a normal nasalized sound. But in 'food', where the vowel is between 2 oral consonants, there shouldn't be nasalization. Depending on context, there are cases of the oral & nasal cavities linking where it does not affect overall speech quality.
Nasal emission (abnormal): occurs in patients with fairly severe VPI; air escapes through the nose when oral pressure is high, such as when producing a voiceless fricative or stop.
Hypernasality (abnormal): results in vowel distortion, because the nasal and oral cavities are linked. The nasal cavities have a damping effect and influence the clarity of the vowel produced, since they act as an antiresonance that substantially damps the formants. -Vowel distortion. -Nasal cavities act as an antiresonance. -Formants substantially damped.
Nasality problems are caused by velopharyngeal problems, and different mechanisms can be responsible: velopharyngeal incompetence or poor timing of muscle movements. It is not always obvious from the patient's speech whether nasality issues are due to VPI or out-of-sync muscles, so fairly thorough clinical evaluations are done to test a variety of issues regarding structural integrity & velopharyngeal movement in both speech and non-speech contexts. An ENT may perform flexible endoscopy to evaluate the function of the structures in a variety of different speaking tasks.
Functional complexity: there is no perfect link between our perception of nasality & the size of the velopharyngeal port opening. In normal speech there may be a very modest VP opening that does NOT result in any perceivable nasality, and a patient may be able to make adjustments to compensate for a minor leak, like a NASAL GRIMACE: you may see a scrunched-up face as the patient constricts the nose a bit to increase resistance in the nasal cavities. Non-speech behavior may be very different from speech activity: the behavior of the VP port during speech can differ from the way it behaves when not speaking.
Larger urban centers may have cleft palate/craniofacial clinics set up to evaluate velopharyngeal function for cleft palate patients or others with functional or structural causes of velopharyngeal problems; such clinics treat patients for a variety of issues related to craniofacial deformity. For functional problems with nasality, the same professionals (SLP & ENT surgeon) work together to perform clinical assessments and use various instrumentation to document very detailed aspects of performance that are NOT immediately perceptually obvious to a listener.
Velopharyngeal incompetence (VPI): a lack of good closure of the velopharyngeal port, likely due to weakness in the constrictor muscles that close around the stiffened velum, or to problems raising & stiffening the velum itself. -Velum fails to raise and stiffen adequately. -Pharyngeal constrictors weak.
Poor timing of velar movements: -Muscle strength not impaired. -Timing inappropriate. -Inter-articulator coordination reduced.

Deterministic

Obeys physical laws of cause and effect (not random)

Semitones (how many semitones in one octave, two octaves, etc.)

Octave: a mathematical relationship between frequencies; a doubling (up) or halving (down) of frequency. From 400 Hz: 800 Hz is one octave up & 200 Hz is one octave down. -12 semitones = 1 octave. -200-400 Hz = 1 octave = 12 semitones. -200-800 Hz = 2 octaves = 24 semitones.

Identification vs. discrimination experiments in speech perception (how do these two experimental approaches differ? What does research participant do differently for each one?)

People who conduct research into human speech perception use complex synthesis software and hardware to produce their speech stimuli (synthetic speech stimuli) rather than natural speech, so they can control exactly which acoustic parameters vary. Two response types: 1. Identification: the person reports, by saying, writing down, or pushing a button, the sound that they perceived. ---Listens to stimuli one at a time. ---Listener reports or labels what was heard. ---Open response set (say anything they want) or forced choice (multiple choice). ---No right or wrong answers, just response patterns. 2. Discrimination: the listener determines whether two stimuli were the same or different. ---No need to label what was heard. ---Just: were the two stimuli the same or different? ---Responses can be correct or incorrect.

Acoustic features of stressed speech (acoustic parameters that may change when word is stressed - Review Project 7 on Suprasegmentals)

Project 7 covered changes in duration, F0, and amplitude in stressed vs. unstressed speech: 1. Duration. 2. F0. 3. Amplitude (dB). 'Invariant' would mean a sound didn't vary or change, but sounds ARE variant: how a sound is made depends on context. Coarticulation (coproduction) is how sounds influence one another. Speech rate matters because it affects all of the sounds we produce in an utterance, and not as a linear stretching or compression: sounds are not all affected equally in their duration. Vowels and fricatives can be made longer, but duration changes do little for affricates (cha, ja). Slowing speech may help intelligibility: separating out the words gives listeners additional help, and adding pauses makes it a little easier. Pacing boards (pushing a finger up and down through a slot to produce each syllable in isolation) force a dramatic slow-down, or a metronome can be used if speech is badly disordered. From the professor: when we emphasize a word, the general notion is that duration, F0, and amplitude increase. However, you may have found when you did this project that one or two of these changed, and the other didn't so much, or may even in some cases have gone down. There's nothing wrong with this; we just tend to emphasize words in our own special way :)

Types of Prosody (Emphasizing meaning Vs. Expressing emotion)

Prosody is not restricted to a single segment: it spans syllables, words, and phrases. Intonation has 3 key elements, in a physiologic sense: 1. Fundamental frequency (perceived as pitch). 2. Intensity (perceived as loudness). 3. Duration. If you want to add emphasis to a particular word or syllable, you increase INTENSITY with simultaneous increases in FUNDAMENTAL FREQUENCY and DURATION; this is primarily driven by the respiratory system. 2 classes of prosody: 1. Linguistic prosody (pitch/pause-based): helps the listener understand the GRAMMAR of what was said (i.e., question or statement). A powerful tool for language acquisition & for expressing meaning (parents use child-directed speech, which has exaggerated intonational contours). -Pitch-based: is it a question or a statement? 'It's in the mail.' vs. 'It's in the mail?' -Pause-based: pauses can change meaning. Ex: 'A woman, without her man, is nothing' vs. 'A woman: without her, man is nothing.' 2. Affective prosody (AKA emotional prosody): has to do with the emotive EXPRESSION in a person's voice, like anger, happiness, surprise, etc., that helps the listener determine the emotive state of the speaker (even over phone lines). The voice is extremely expressive of emotional state, which is conveyed through prosodic acoustic cues. Ex: 'Don't talk to me in that tone of voice!' Kids acquire speech and language through parental interaction, using acoustic cues to express meaning in day-to-day interaction. Some patients show prosodic insufficiency because they are unable to adjust respiratory output, or because laryngeal problems make them unable to adjust fundamental frequency. Speech rate: not a linear stretching/compression.

Relaxed Vs. Clear speech (how do they differ?)

Relaxed vs. clear speech: in relaxed speech we speak more naturally, with smaller articulatory movements; clear speech requires greater articulatory movement and more subglottal pressure. Clarity: -Segment durations increase (longer). -We release our stops ('haT', with a very clear burst release at the end). Many people tend to experience articulatory undershoot: the articulators don't necessarily meet the targets we would predict them to meet. For /a/, /e/, /u/ you would not move the tongue as far as you would when producing them in isolation; when you speak more naturally, you make smaller articulatory movements, and this is especially true as speech rate increases. Ex: a skier going down a run: to go quicker, you cut the corners as much as you can, going minimally around the gates to gain speed. Lindblom's H & H (Hypo & Hyper) hypothesis: speakers adjust articulatory effort according to circumstances. In a very quiet environment without noise distractions, energy expenditure may be minimized by speaking casually; whereas in a noisy environment, or when speaking with someone who is hard of hearing, the effort level is adjusted upward and articulation becomes more forceful, with greater muscular effort put into speech production (which also increases metabolic expenditure). Impact of loudness on speech: loud speech is easier to elicit than a deliberate focus on articulation, and loudness gives more consistent and similar articulatory movements. Increasing loudness is a simple target for therapy and brings different articulatory movements along with it.

Phonetic memory (definition; how is it acquired?)

Sound memory: the ability we have to remember the sounds we hear. IMMEDIATE memory: short-term, acts as a buffer. --We can repeat back unfamiliar sounds. --This immediate memory decays rapidly. PHONETIC MEMORY IS LASTING: --Gained through repeated exposure to sounds. --Forms a long-term sound template: an ideal form of a sound against which we can compare the sounds we hear. --The template is a master form of the sound as it should be produced, in our opinion; we compare the sounds we hear against it to see if there is a good match. Psychoacoustic abilities are NOT learned; they are innate, but language is acquired through learning and experience.

Direct measures vs. Estimates of subglottal pressure (How can we get definite measures vs. How can we get reasonable estimate?)

Subglottal pressure (Ps or Psub) is a key contributor to adjusting loudness (vocal intensity); it comes from the respiratory system.
Direct measures: 1. Tracheal puncture: a medical procedure in which a physician punctures the trachea and inserts a miniaturized pressure transducer that directly measures pressure. Direct, accurate measures during speech, but hard to attract volunteers (invasive). 2. Esophageal pressure (balloon): used much more in the past. The patient swallows a balloon-mounted transducer part way into the esophagus, and the sensor measures pressure on the shared wall between the trachea and esophagus (posterior trachea, anterior esophagus). The measured pressure is lower than lung pressure; not a very practical or common procedure today.
Estimating Psub (indirect): measure intra-oral pressure at a particular time, i.e., oral pressure during /p/ closure. The idea is to measure pressure at a specific point in time in speech production where intra-oral pressure equals subglottal pressure (the pressure just below the larynx). This briefly occurs during production of a /p/ plus a vowel (bilabial plosive): the person produces a series of CV syllables (/pa pa pa/), and at the point of lip closure for the /p/ the vocal folds abduct (the laryngeal devoicing gesture). This directly links the trachea and the mouth, and pressure equalizes very quickly between the two spaces, so there is no pressure drop at the larynx because the sound is voiceless. The pressure you measure in the mouth is therefore the pressure that has been driving phonation during the vowel portions of the syllable series. Is the estimate accurate? Estimates of subglottal pressure from oral pressure measurements correspond very closely to pressure measured directly from the trachea.
If we also measure the mean air flow during the vowel, we can divide the pressure by the flow to get a good estimate of laryngeal resistance, i.e., how much the larynx resists the flow of air, which we regulate by the level of vocal fold adduction. (You don't need to do any calculations of this on the test.) Oral air flow: during vowels there is more airflow; during consonant constriction the airflow ceases. A wide-band airflow signal is very sensitive to rapid changes in airflow as the vocal folds open and close.

Nyquist Frequency

The Nyquist frequency comes into play during the recording or digitization. If you select a sample rate of, say, 16,000 Hz, then the Nyquist is half of that, or 8,000 Hz. This means in practical terms that frequencies up to 8,000 Hz in your incoming analog signal (from the microphone) will be correctly recorded and can be played back. But frequencies above 8,000 Hz will not be saved. Most modern recorders and computers will automatically low-pass filter your incoming signal at the Nyquist frequency to prevent aliasing, which would contaminate your recording with incorrectly saved frequencies that are above the Nyquist. So it's during recording that we are interested in the Nyquist.

Stop Consonant Burst Features (Where is this sound made? Acoustic characteristics?)

From the professor: the main thing to be clear on is that the burst is produced at the place of articulation as the stop is released, and that its spectral features change with place of articulation (bilabial, alveolar, velar). Also know that aspiration arises at the level of the vocal folds, and that it occurs following the burst in prevocalic voiceless stops.
• Stops are consonants that involve a complete closure of the vocal tract with an articulatory constriction: air pressure builds up behind the point of constriction, and as the closure is released this air escapes as sound (frication energy), with a larger flow peak for a voiceless consonant (p, t, k).
• Place: bilabial, alveolar, velar. Manner: brief closure of the vocal tract. Voicing: voiced or voiceless.
• Silence during closure (minor voicing energy possible): a brief instant when no sound leaves the vocal tract, reflected in the microphone signal as a brief silent event (flat on the microphone signal, white on the spectrogram); sometimes a small amount of energy sneaks through for voiced stops. Air pressure is impounded behind the constriction.
• Stop gap (stop closure): total or near-total absence of energy while the articulatory constriction is being formed. Most easily seen between vowels (APA, ATA), where the energy stops; hard to see if the stop is phrase-initial (TA, KA). Typically 50 to 150 ms in duration (about a tenth of a second); these moments of silence are not really perceptually obvious. Stop gap duration is shorter for voiced stops; voiceless stops have longer durations.
• Vocal fold oscillation may continue into a voiced stop: a visible 'voice bar', a gray trace indicating very low frequency energy on the spectrogram.
• Clinical notes: in dysarthria, the tongue and lips may not fully approximate, so stop closures may be shorter and speech may sound slurred and imprecise; people who stutter may have longer stop closures/stop gaps. Making measurements before and after intervention lets us document changes, using stop closure duration as an index of articulatory adequacy.
• TRANSIENT AND FRICATION: air pressure builds behind the constriction (bilabial stop: the constriction is at the lips, P/B; alveolar stop: the constriction forms behind the tongue at the alveolar ridge, T/D). Once the constriction is released, the pressure that has been building up behind it forces air out through the gradually widening opening. Transient: the first sharp, BRIEF burst of air (like opening a soda can). Frication: air flowing through the narrow gap becomes turbulent as the opening widens (the airflow is still high because the pressure hasn't yet decreased to match the atmosphere around the speaker); LOWER amplitude but much LONGER duration. It is often hard to separate the TRANSIENT from the FRICATION; just know they're both occurring. After the stop closure, the burst appears on the spectrogram as a tall, narrow, dark band of noise: the acoustic representation of the aerodynamic event.
• Sequence (from the professor): first is the stop closure, also often called the stop gap. Then, as the stop closure opens, we get a burst of frication (noise on a spectrogram, or a flow peak in the air flow signal). For voiceless stops, aspiration follows the burst. VOT is the time between the end of the stop closure/start of frication and the onset of voicing for the following vowel. Transitions are changes in the formants over time in moving from the closed vocal tract of the consonant to the open tract of the vowel that follows; you may see evidence of formants as early as the aspiration, but they are much more obvious once phonation begins for the following vowel.

Electromyography

The measurement of electrical activity in muscles as they are activated.

Source vs. filter changes (source-filter theory; know examples of a change to the source only or the filter only)

The source is the laryngeal output (voice or whisper, both of which are generated by the vocal folds) and the filter is the vocal tract, i.e., all structures above the vocal folds (pharynx, mouth, nose) that shape the sound provided by the larynx. Larynx = sound source. Vocal tract = filter. Lungs = pressure source. The filter doesn't just remove things; it actually resonates, enhancing and strengthening some of the frequency components. The larynx needs the lungs to provide the air pressure it needs to function. In theory, the vocal tract and larynx should be able to act independently, but they also interact. In this model, pressure from the lungs interacts with the larynx to generate voicing as the sound source, which is then sent to the vocal tract acting as a filter.
Source changes: i. Loudness (louder or softer): to speak softly you provide very low pressure from the lungs to the larynx; to speak very loudly you need to provide greater pressure, and vocal fold movements get bigger. ii. Pitch: you can stretch the vocal folds to increase their tension and raise the frequency at which they vibrate. You do this by contracting the cricothyroid muscle, which rocks the thyroid cartilage forward relative to the cricoid cartilage and pulls on the vocal folds so they become stretched. iii. Voice quality (breathy vs. tight/pressed): subtly adjust the space between the vocal folds, especially around the arytenoids. For a breathier voice, make the space larger so more air escapes between the folds as they vibrate; or press the folds more firmly for a tighter-sounding voice, by increasing the level of contraction in the muscles of adduction (primarily the lateral cricoarytenoid and thyroarytenoid). iv. Phonation vs. whispering: even if the vocal folds are not vibrating and you are just pushing air between them to cause turbulence, as in a whisper, you are still generating a sound source; this involves the lungs providing pressure and the vocal folds forming a constriction to make the turbulence happen. PHONATION involves vibration of the vocal folds as they oscillate in and out, driven by the same pressure source from the lungs.
-In theory, the glottal source and vocal tract filter are independent. -F0 can change (raise or lower your pitch), changing the harmonics, while the tract configuration remains constant. -The tongue can move, changing the filter characteristics, while F0 remains the same. -You can adjust one without changing the other.

Tongue movements during typical speech (how acoustically identified phoneme boundaries relate to movements of tongue)

Tongue speed suggests that the tongue doesn't stand still very long; it keeps moving dynamically through all sounds. Given that tongue movement is extremely dynamic, even across reliably identified phoneme boundaries, the question arises of how the brain plans these movements: they are NOT discrete, individual, separable movements but a continuous flow of tongue motion through all the sounds, even as different phonemes are produced.

Pressure vs. flow vs. resistance - how they relate (Ohm's Law and the voice; what determines how much air flows through the larynx? How can we compute an estimate of laryngeal resistance?)

Two things determine the flow: 1. PRESSURE and 2. LARYNGEAL RESISTANCE. We control pressure by how hard we exhale or pump air with our lungs. We regulate resistance by how tightly we adduct the vocal folds: if the folds are loose, we have a breathy voice with high flow; if they close tightly, we have a more pressed vocal quality with lower air flow.
OHM'S LAW: the equation for managing flows and pressures in speech production. It describes the relationship between voltage, current & resistance: E = IR, i.e., Pressure = Flow x Resistance; rearranged, R = E/I, i.e., Resistance = Pressure/Flow. E = voltage (analogous to pressure, the pushing force); I = current (analogous to flow, like the flow of electrons through a wire); R = resistance (the amount resisting the flow).
Estimating laryngeal resistance: • You can measure flow (with a pneumotachograph mask). • You can estimate subglottal pressure, Psub (from oral pressure during the closure for /p/). • You can calculate estimated laryngeal resistance from flow and Psub: divide the pressure measured in the mouth by the flow measured in the vowel. • Laryngeal airway resistance (Rlaw) = Psub (cmH2O) divided by flow (L/s). • Pressure drops across the glottis (stronger below the larynx, lower above it); the difference in pressure is the transglottal pressure: pressure below minus pressure above. • For a given driving pressure: higher resistance means lower flow; lower resistance means higher flow. • Flowing air makes the vocal folds move. • Disordered voices are often aerodynamically different: i. low flows in vocal hyperfunction (strained, pressed voice); ii. high flow in vocal fold paralysis (breathy, where the folds don't meet at midline). • Air flowing through constrictions forms fricatives: we push airflow through a small constriction. • Flow peaks occur at stop release: when we release a stop, a burst of air comes out because pressure has been building up behind it. • Both the source (larynx) and the filter (vocal tract) involve management of air flow, so speaking has a great deal to do with how air flows through the vocal tract. Subglottal pressure from the lungs drives the voice and the flow through the larynx during phonation; resistance is created by vocal fold adduction.
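
A minimal sketch of the Rlaw estimate described above (the numeric inputs are made-up but plausible illustration values):

    def laryngeal_resistance(psub_cmh2o, flow_l_per_s):
        # Ohm's law for the larynx: R = pressure / flow, in cmH2O/(L/s)
        return psub_cmh2o / flow_l_per_s

    psub = 8.0  # estimated from oral pressure during /p/ closure (cmH2O)
    flow = 0.2  # mean air flow during the vowel (L/s)
    print(laryngeal_resistance(psub, flow))  # 40.0 cmH2O/(L/s)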

What determines how much air flows through larynx?

Two things determine the flow: 1. PRESSURE 2. LARYNGEAL RESISTANCE of the vocal folds. We control pressure by how hard we exhale or pump air with our lungs; we regulate resistance by how tightly we adduct the vocal folds. If the folds are loose we have a breathy voice with high flow; if they close tightly we have a more pressed vocal quality with lower airflow. Driving pressure: for any given driving pressure making the voice work, higher resistance at the vocal folds (more tightly adducted) means flow necessarily drops, according to Ohm's law; lower resistance for the same driving pressure means flow will increase. Ohm's law states that pressure equals flow times resistance, so for a fixed resistance, flow scales linearly with driving pressure (see the sketch below). Glottal volume velocity: the volume of air flowing through the glottis as a function of time during phonation; AKA transglottal airflow.
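A tiny Python sketch of that linear relationship (units and numbers are illustrative only):

```python
def glottal_flow(pressure_cmH2O, resistance):
    """Ohm's law rearranged: flow = pressure / resistance."""
    return pressure_cmH2O / resistance

R = 40.0                    # fixed vocal fold adduction (illustrative)
for p in (4.0, 8.0, 16.0):  # doubling the driving pressure...
    print(p, glottal_flow(p, R))  # ...doubles the flow: 0.1, 0.2, 0.4 L/s
```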

How we parse words from stream of sounds (how can we know where one word ends and next begins?)

We segment words from the sound stream. How do you decide where one word ends and the next begins?
--Some "top-down" processing is essential
-----Word candidates: linguistic knowledge about which words are potential candidates to be segmented; we apply that knowledge to the heard sound stream.
-----Phonotactics help narrow the options: the rules governing which sounds are allowed to occur next to each other in a given language.
-----Short-list the possibilities: we tend to make a short list of words that might be present, and then
-----Select the most logical in the context (a toy sketch of this follows below).
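A toy Python sketch of the "word candidates, then short-list" idea; the lexicon here is a made-up stand-in for the listener's vocabulary knowledge:

```python
# Hypothetical mini-lexicon, standing in for the listener's vocabulary
LEXICON = {"the", "dog", "dogs", "sat", "at", "a"}

def segmentations(stream, parse=()):
    """Yield every way to carve 'stream' into known words:
    top-down knowledge supplies the candidate word boundaries."""
    if not stream:
        yield parse
        return
    for i in range(1, len(stream) + 1):
        word = stream[:i]
        if word in LEXICON:
            yield from segmentations(stream[i:], parse + (word,))

# An unbroken stream, like continuous speech with no pauses:
for parse in segmentations("thedogsat"):
    print(parse)
# ('the', 'dog', 'sat') and ('the', 'dogs', 'at') -- the short-list;
# context then selects the most logical parse.
```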

Calculate Average Air Flow (dividing volume used by time)

Average flow is the volume of air that flows in a given time (volume divided by time). Example: if you use 1 liter of air and phonation lasts 5 seconds, average flow = 1/5 liter per second = 0.2 liters/second = 200 cc/second. If you phonate for 15 seconds and use 5 liters of air, then each liter lasted 3 (= 15/5) seconds, so the flow rate is 1/3 liter/second. (See the sketch below.)
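The same arithmetic as a one-function Python sketch, using the two worked examples above:

```python
def average_flow(volume_liters, duration_seconds):
    """Average airflow = volume used / phonation time, in L/s."""
    return volume_liters / duration_seconds

print(average_flow(1, 5))    # 0.2 L/s (= 200 cc/s)
print(average_flow(5, 15))   # 0.333... L/s (= 1/3 L/s)
```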

Infant studies of speech perception (what have we learned about categorical perception? How do we know baby hears differences?)

We DON'T need language competence to perceive phonetic differences.
--Infant heart rate dips for a novel stimulus (one that stands out) = the infant discriminates a difference.
--Infant heart rate stabilizes with identical stimuli.
--This reveals discrimination of a difference:
-- (-20 ms / 0 ms VOT) not different (/b/ and /b/)
-- (20 ms / 40 ms VOT) different (/b/ and /p/)
-- (60 ms / 80 ms VOT) not different (/p/ and /p/)
--VOT boundaries match phonetic categories (see the sketch below)
--Seen in infants as young as 1 month
Are our brains hard-wired for these sound categories?
--Initial openness to more categories
--4-6 month olds respond to non-native phonetic contrasts that their parents can't detect
--The ability diminishes by 1 year
--Psychoacoustic ability changes and narrows to fit the local language
--Experience shapes perception: infants hone in on sounds that are meaningful in their language and ignore those that are not important
--Language experience alters how we perceive
--The underlying psychoacoustic abilities are not learned but innate; the language-specific categories come from experience
--Kids perceive their own misarticulated sounds less well
--Perception leads production
--Studies show that some children have low phonological awareness, which affects their literacy skills
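A Python sketch of the categorical pattern in the VOT data above (the boundary value of +25 ms is an illustrative assumption, chosen only because it falls between the 20 ms and 40 ms stimuli):

```python
BOUNDARY_MS = 25  # assumed /b/-/p/ VOT boundary (illustrative)

def category(vot_ms):
    return "/b/" if vot_ms < BOUNDARY_MS else "/p/"

for a, b in [(-20, 0), (20, 40), (60, 80)]:
    same = category(a) == category(b)
    print(f"{a} vs {b} ms VOT:", "not different" if same else "different")
# -20 vs 0:  not different (both /b/)
# 20 vs 40:  different (the pair straddles the boundary)
# 60 vs 80:  not different (both /p/)
```

Equal-sized 20 ms steps are discriminated only when they cross the category boundary, which is exactly what the infant heart-rate data show.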

Perturbation vs. modulation - Acoustic definitions; what does each one sound like alone or in combination? How are these phenomena different? What would be an example of each?

You can have both perturbation AND modulation.
Perturbation sounds hoarse/rough: a random, rapid change in either period or amplitude from cycle to cycle. Jitter (frequency/period perturbation) and shimmer (AMPLITUDE perturbation) typically co-occur.
Modulation sounds shaky: it is a more rhythmic, slower, gradual pattern. FM = frequency modulation; AM = amplitude modulation. FM and AM also typically co-occur.
Example: perturbation alone = a hoarse voice; modulation alone = a shaky/tremulous voice; combined perturbation and modulation = a voice that is both hoarse AND shaky/tremulous. (See the sketch below.)
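A minimal sketch of the cycle-to-cycle perturbation measures (Python with NumPy). Real voice-analysis packages compute several variants of jitter and shimmer; this is one common relative form, and the cycle data are made up and exaggerated for illustration:

```python
import numpy as np

def jitter_percent(periods_ms):
    """Mean absolute cycle-to-cycle period difference,
    as a percentage of the mean period."""
    p = np.asarray(periods_ms, dtype=float)
    return 100 * np.mean(np.abs(np.diff(p))) / np.mean(p)

def shimmer_percent(amplitudes):
    """The same computation applied to cycle peak amplitudes."""
    a = np.asarray(amplitudes, dtype=float)
    return 100 * np.mean(np.abs(np.diff(a))) / np.mean(a)

# Random cycle-to-cycle wobble (perturbation), around 200 Hz:
periods = [5.0, 5.2, 4.9, 5.1, 4.8, 5.2]  # ms
print(jitter_percent(periods))            # roughly 5.6 percent
# A slow, rhythmic drift in period would instead be frequency
# modulation (tremor), which these cycle-to-cycle measures are
# not designed to capture.
```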

Theory

A BROADER conceptual explanation based on many OBSERVATIONS - e.g., theoretical explanations of multitasking.

Predictive

If you do this... then that will happen: a good theory makes testable predictions about how changing one thing will influence another.

Parsimonious

Uses the simplest explanation possible rather than making it unnecessarily complicated - boils the explanation down to its essence, without dumbing it down.

Recognizing Emotional tone in speech (which listener group performed best?)

Women and EMT Polyglots performed best, perhaps because learning a second language sensitizes listeners to other sounds and makes them more effective listeners.
Emotional Tone Recognition: The voice is very reflective of a person's emotional state. The limbic system helps regulate emotional responses and can have a strong influence on laryngeal function: it has connections to the larynx that influence how the voice sounds. Emotional stress takes a toll on the voice. Dysphonia can be due to emotional factors without any health-related issues.
Affective prosody & mother tongue - Dromey study: 142 listeners, 21 languages, all fluent in English. 3 groups: 1. EMT Monoglots (English Mother Tongue): grew up speaking English with no other languages. 2. EMT Polyglots: grew up speaking English but later learned another language. 3. ESL/OMT (Other Mother Tongue). Listeners in a sound booth heard randomized recordings of an actor speaking English in neutral or angry tones and marked which emotion they perceived the speaker to be using. Choices included Angry, Neutral, and Foils (false choices, like happy or surprised). Question: determine the accuracy of identifying the correct emotion and learn whether there were any differences between the language-acquisition groups.
Results: women from all groups were better at identifying the emotional content of speech. Polyglots outperformed the other groups, with no statistical difference between monoglots and ESL/OMT.
Why does a person who learns another language have an advantage? Likely because the process of learning another language sensitizes you to sounds and to human communication in general, making you a more effective listener and better able to identify emotion in speech correctly. But why not for the ESL group, since they also had a second language? Because the study was done in English, and the ESL listeners didn't grow up listening to English prosodic patterns the way native English speakers did.

F1 and F2 - how they reflect tongue movement (vertical, vs. front/back)

• Just know that F1 rises as the tongue moves from high to low vowels and F2 increases as you move from back to front vowels. Don't worry about consonants beyond remembering that F1 rises from C to V because tongue/jaw height decreases from C to V. Please don't memorize the chart with all the C-V transitions; it's there as an illustration of the concept in general, but you do not need to commit it to memory.
Decreasing formant frequency: all formants decrease in frequency with larynx lowering or lip rounding, because both lengthen the vocal tract. Lip rounding lowers the frequencies because it lengthens the resonating tube (from the glottis to the outside world through the lips); changing the tube's dimensions this way lowers its resonant frequencies. Lowering the larynx in the neck also lowers the frequencies because it likewise increases tube length, the way a longer organ pipe resonates at lower frequencies.
High vowels: lower F1. Low vowels: higher F1. Back vowels: lower F2. Front vowels: higher F2.
Formant change: a wider mouth = higher F1; stop closure = lower F1; the C-to-V transition = F1 increase; F2 and F3 are influenced by place of constriction.
• F1 and F2 identify the specific vowel: they help the listener identify which vowel they are hearing.
• Absolute formant values vary (men, women, and children have similarly shaped vowel quadrilaterals):
i. Differences with dialect
ii. Men's vocal tracts are largest, giving the lowest formant frequencies
iii. Women have medium formant frequencies; children have the highest formants
iv. Graphing F1 versus F2 shows the "vowel space" (see the sketch below)
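A small Python sketch of the F1/F2 pattern; the formant values are approximate adult-male averages in the spirit of Peterson & Barney (1952) and should be treated as illustrative:

```python
# Approximate average F1/F2 in Hz for adult male speakers
# (illustrative values, Peterson & Barney-style averages)
VOWELS = {
    "i (high front)": (270, 2290),
    "ae (low front)": (660, 1720),
    "a (low back)":   (730, 1090),
    "u (high back)":  (300,  870),
}

for vowel, (f1, f2) in VOWELS.items():
    print(f"{vowel:15s} F1={f1:4d}  F2={f2:4d}")
# High vowels have low F1, low vowels high F1 (vertical dimension);
# front vowels have high F2, back vowels low F2 (front/back dimension).
# Plotting F1 against F2 for these points traces the vowel space.
```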

