Sensation & Perception Exam 2 (Michel)
Ambiguity and "perceptual committees"
- A metaphor for how perception works - Committees must integrate conflicting opinions and reach a consensus. - Many different and sometimes competing principles are involved in perception. - Perception results from the consensus that emerges.
Characterizing power spectra (mixtures of wavelengths)
- These indistinguishable power spectra are called color metamers - Any perceivable color can be reproduced using only three "colors" of light; one to stimulate each of the three cone types - Ideally, you would use three distinct wavelengths of light, selected to independently stimulate each of the three cone types
Size and position cues
- relative size - relative height - texture gradient - familiar size
Color mixing and color spaces
-Additive color mixing: A mixture of lights with different spectra. - If light A and light B are both reflected from a surface to the eye, the effects of those two lights add together in the perception of color. -Subtractive color mixing: A mixture of pigments. - If pigments A and B mix, some of the light shining on the surface will be subtracted by A and some by B. Only the remainder contributes to the perception of color. -A color space is a three-dimensional space that describes all colors. There are many possible color spaces. - Most of these separate color into one intensity component and two chromaticity components. - A commonly pictured example is a slice through the CIE 1931 color space, which is commonly used by artists and engineers. - The gamut describes what colors any set of three lights can reproduce. - Other color spaces, like the RGB and HSB ones used in PowerPoint, are device-specific. - These are defined in terms of the intensity of each sub-pixel of the device.
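A minimal Python sketch of the two kinds of mixing, assuming spectra are sampled at just three wavelength bands. The function names and sample values are hypothetical, and treating pigment mixing as a per-wavelength product of reflectances is a simplifying assumption rather than a full pigment model.

```python
def additive_mix(spectrum_a, spectrum_b):
    """Lights add: the power at each wavelength is the sum of the two lights."""
    return [a + b for a, b in zip(spectrum_a, spectrum_b)]


def subtractive_mix(illuminant, reflectance_a, reflectance_b):
    """Pigments subtract: each one removes part of the light; only the remainder is left."""
    return [i * ra * rb for i, ra, rb in zip(illuminant, reflectance_a, reflectance_b)]


# Toy spectra sampled at [short, middle, long] wavelengths (made-up numbers).
red_light = [0.0, 0.0, 1.0]
green_light = [0.0, 1.0, 0.0]
print(additive_mix(red_light, green_light))  # [0.0, 1.0, 1.0]

white_light = [1.0, 1.0, 1.0]
yellow_pigment = [0.0, 1.0, 1.0]  # absorbs short wavelengths
cyan_pigment = [1.0, 1.0, 0.0]    # absorbs long wavelengths
print(subtractive_mix(white_light, yellow_pigment, cyan_pigment))  # [0.0, 1.0, 0.0]
```

In this toy case, adding red and green lights leaves power in both the middle and long bands, while mixing yellow and cyan pigments leaves only the middle band.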
Real vs. apparent motion
-Apparent Motion: A motion percept that results from the rapid display of stationary images in different locations. (Flip books) -Real Motion:
Color Blindness
-Color vision deficiency resulting in anomalous color vision -About 8% of male population; 0.5% of female population -Mostly due to missing M or L cones, which are sex-linked -Anomalous trichromacy: one cone has abnormal cone pigment with absorption shifted closer to another -Dichromacy: protanopia (missing L cone), deuteranopia (missing M cone), tritanopia (missing S cone) -Monochromacy (true color blindness): Cone monochromacy (only one cone type); Rod monochromacy (marked by very poor vision, photophobia, and involuntary eye oscillations)
The History of Trichromatic Theory: Helmholtz
-Discovered the trichromatic nature of color perception -Experimented with mixing colored lights
The History of Trichromatic Theory: Maxwell's color matching experiments
-Given any "test" light (Bluish), you can match it by adjusting the intensities of just three other lights (Blue, green, red).
Hue cancellation experiments and unique hues
-Hurvich and Jameson (1957) -Attempts to quantify the amounts needed for each color -Start with a color such as bluish green -Goal: end up with pure blue -Shine some red light to cancel out the green light -Adjust the intensity of the red light until there is no sign of either green or red in the blue patch -Unique hue: Any of four colors that can be described with only a single color term: red, yellow, green, blue. -Unique blue is a blue that has no red or green tint. -Red has no unique wavelength locus, because all very long wavelengths of spectral light look red.
illusory contours and occlusions
-Illusory contours are perceived edges or boundaries that do not actually exist in the physical image. These contours are often perceived as a result of the visual system's tendency to fill in missing or incomplete information in the visual field. -occlusions occur when one object partially or completely covers another object, resulting in the occluded object being partially or completely hidden from view.
Properties of visual neurons beyond V1
-Increasing receptive field size & position invariance -sensitivity to boundary ownership area (V2) -Tuning for complex shapes -Object or part-specific tuning with viewpoint invariance
The Physiological Basis of Stereopsis
-Input from two eyes must converge onto the same cell. -Many binocular neurons respond best when the same image falls on corresponding points in the two retinas.
What does this tell us about the neural locus of MAE?
-The MAE arises in low-level neural circuitry -Perception of objects is relatively slow -Motion based on explicit position tracking would compound position errors, making it inaccurate
Displays interocular transfer (MAE)
-Motion must be coded in neurons that respond to both eyes (V1 or later) -fMRI suggests that adaptation in middle temporal area (MT or V5) is responsible for MAE
The problem with visual object templates
-Real-world objects vary in appearance -Real-world objects can be viewed from many different angles and distances.
1. Why have two eyes?
-Redundancy: same reason you have two kidneys -Expanded FOV: more so in rabbits, but also true for us -Binocular summation: pool twice as many samples of light in the overlapping region of the FOV; this is analogous to doubling your sample size in a research study, and it works for exactly the same reason -Depth perception: stereopsis via binocular disparity, and depth from vergence
Object Recognition Models
-Selfridge's "Pandemonium" model of letter recognition: Recognize objects by pooling together feature decisions made by "feature demons". Represents the idea of multiple layers of neurons that react more "loudly" to specific images.
Development of stereopsis
-Takes time, but is not very gradual
Color Constancy
-The tendency of a surface to appear the same color under a wide range of illuminants. -The visual system uses a variety of tricks to make sure that things look the same color, regardless of the illuminant. Ex: Purpose of color vision is to tell us about objects; "is this tomato ripe?" -Visual system tries to discount the effects of the illuminant. It cares about the properties of the surface, not the illuminant. -Still unknown how this is exactly done, but believed to take place in the cortex beyond V1. -Possible to fool the system by using a light source with an unusual spectrum (blue and black dress)
Motion perception vs. Object position tracking
-There is a lot of evidence that motion is a low-level process. -Motion-direction selectivity in V1 neurons (from Hubel & Wiesel film in Lecture 6)
Viewpoint-invariant models
-Viewpoint invariance: many objects can be easily recognized from a wide variety of different angles/conditions -Propose representations of objects that don't change as a function of the observer's viewpoint.
The History of Trichromatic Theory: Newton
-White light could be broken up into 7 constituent colors. -Realized colors weren't an inherent property of light, but subjective. -Suggested that we might have seven types of sensors sensitive to light of differing energy
2. How can information from just one eye provide a percept of depth?
-You can get a lot of info. from a single eye.
Three Steps to Color Perception
1. Detection: Wavelengths of light must be detected in the first place 2. Discrimination: We must be able to tell the difference between one wavelength (or mixture of wavelengths) and another 3. Appearance: We want to assign perceived colors to lights and surfaces in the world and have those perceived colors be stable over time, regardless of different lighting conditions
Binocular Disparity & Stereopsis
A cue for depth perception that depends on the fact that the distance between the eyes provides two slightly disparate views of the world that, when combined, give us a perception of depth. Stereopsis: The depth perception that results from binocular disparity under normal viewing conditions.
What is light?
A form of electromagnetic radiation that is visible to the human eye.
Mid-Level Vision
A loosely defined stage of visual processing that comes after basic features have been extracted from the image (low-level vision) and before object recognition and scene understanding (high-level vision). Involves the perception of edges and surfaces Determines which regions of an image should be grouped together into objects
Linear perspective
A monocular cue for perceiving depth; the more parallel lines converge, the greater their perceived distance.
Why a single receptor can't give you wavelength/spectrum information
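A single cone can't tell you anything about the color/wavelength of light. Its response depends on both wavelength and intensity, so a single type of photoreceptor cannot make color discriminations: different wavelengths can produce the same response if their intensities are adjusted (principle of univariance).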
The moon illusion
A visual illusion involving the misperception that the moon is larger when it is on the horizon than when it is directly overhead.
Accidental and non-accidental viewpoints
Accidental: Viewing position that produces some regularity in the image that is not present in the world. Perceptual system will not adopt interpretations that assume an accidental viewpoint. (The "small" woman on the chair picture) Non-Accidental: "Typical" viewpoint. Your interpretation of it won't change if you move the camera a little bit.
Color Afterimages
After staring at one color for a long time we will see an "afterimage" of the "opposite" color
Waterfall illusion
An aftereffect of movement that occurs after viewing a stimulus moving in one direction, such as a waterfall. Viewing the waterfall makes other objects appear to move in the opposite direction.
Motion Aftereffects
An illusion that occurs after a person views a moving stimulus and then sees movement in the opposite direction when viewing a stationary stimulus immediately afterward. -Neural adaptation caused by mechanisms similar to spatial frequency adaptation (tilt-aftereffect) and color adaptation (negative afterimages)
What is motion?
An object's change in position over time.
Dichoptic stimuli and binocular rivalry
Binocular rivalry: occurs when different and incompatible images are presented to the two eyes. Dichoptic stimuli: a different image is presented to each eye.
Depth from shadows
Cast shadows convey a sense of relative depth
Viewpoint-dependency and face recognition
Changing the orientation of an object according to POV. Ex: Mona Lisa painting flipped upside down except for her facial features. -viewpoint dependency refers to the fact that recognizing a face can be more difficult when it is viewed from an unfamiliar or unusual angle. -Object recognition is somewhat but not entirely viewpoint invariant.
Depth perception and the fundamental ambiguity between size and distance
Depth Perception: Refers to our ability to figure out how far away different things are in the world -We have to use a variety of cues to decide the distance between things.
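The size-distance ambiguity named in this card's title can be sketched with the standard visual-angle formula. This is an illustrative Python sketch; the function name and the example sizes and distances are hypothetical.

```python
import math


def visual_angle_deg(size, distance):
    """Visual angle (in degrees) subtended by an object of a given size at a given distance."""
    return math.degrees(2 * math.atan(size / (2 * distance)))


# A small, near object and a large, far object (same units for size and distance).
near_small = visual_angle_deg(size=1.0, distance=10.0)
far_large = visual_angle_deg(size=2.0, distance=20.0)

# Both subtend the same angle, so the retinal image alone cannot tell them apart.
print(math.isclose(near_small, far_large))  # True
```

Because only the ratio of size to distance matters, other cues are needed to decide between "small and near" and "large and far".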
Surface Reflectance Functions
Describe how much light an object reflects as a function of the wavelength of the light -The fraction of the incoming light that is reflected back
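As a rough Python sketch, the light reaching the eye at each wavelength is the illuminant's power multiplied by the fraction the surface reflects. The spectra are sampled at three wavelength bands, and all names and numbers here are made up for illustration.

```python
def light_reaching_eye(illuminant, reflectance):
    """Per-wavelength product of incoming power and the fraction reflected back."""
    return [power * fraction for power, fraction in zip(illuminant, reflectance)]


# Hypothetical surface reflectance at [short, middle, long] wavelengths.
surface = [0.25, 0.5, 0.75]

bluish_light = [2.0, 1.0, 1.0]
reddish_light = [1.0, 1.0, 2.0]

print(light_reaching_eye(bluish_light, surface))   # [0.5, 0.5, 0.75]
print(light_reaching_eye(reddish_light, surface))  # [0.25, 0.5, 1.5]
```

The same surface sends different spectra to the eye under different illuminants, which is why the visual system has to discount the illuminant to judge surface color.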
Absorption spectra for human cone photoreceptors
Describes photoreceptor response as a function of light wavelength. -A single photoreceptor doesn't "see" color; it just gives a greater response to some wavelengths than to others.
Power Spectrum
Description of the amount of energy (or power) at each wavelength.
Color Metamers
Different mixtures of wavelengths that look identical in spite of physical differences.
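A toy Python sketch of a metameric pair. The cone sensitivities below are hypothetical, drastically simplified numbers (real absorption spectra are smooth curves), chosen only so the arithmetic is easy to check by hand.

```python
# Hypothetical cone sensitivities at four sample wavelengths (not real data).
CONES = {
    "S": [1.0, 0.0, 0.0, 0.0],
    "M": [0.0, 1.0, 1.0, 0.0],
    "L": [0.0, 0.0, 1.0, 1.0],
}


def cone_responses(spectrum):
    """Each cone type sums the light it absorbs across all wavelengths."""
    return {
        name: sum(s * p for s, p in zip(sensitivity, spectrum))
        for name, sensitivity in CONES.items()
    }


# Two physically different power spectra.
spectrum_a = [1.0, 1.0, 2.0, 1.0]
spectrum_b = [1.0, 1.5, 1.5, 1.5]

print(cone_responses(spectrum_a))  # {'S': 1.0, 'M': 3.0, 'L': 3.0}
print(cone_responses(spectrum_b))  # {'S': 1.0, 'M': 3.0, 'L': 3.0}
```

Since the three cone outputs match, these two spectra would look identical to this toy observer even though they differ physically.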
Disparity-tuned neurons in V1
Disparity-tuned neurons in V1 are specialized neurons that respond to differences in the position of visual stimuli between the two eyes, which is known as binocular disparity. Binocular disparity is an important cue for depth perception, as it provides information about the relative distance of objects in the visual field.
Dorsal (where) vs. ventral (what) pathways
Dorsal: Location, spatial. Starts in V1 and goes to the posterior parietal cortex. Ventral: Object recognition. Starts in V1 and goes to the inferior temporal cortex.
Theoretical vs. empirical horopters
Empirical: less curved (flatter) than the theoretical horopter; with increasing fixation distance it eventually becomes convex rather than concave.
The problem of finding edges
Finding edges in sensation and perception is a fundamental problem that the visual system must solve in order to create a meaningful representation of the visual world. One of the key challenges in finding edges is that the image projected onto the retina is not a perfect representation of the visual scene. -To combat this visual system uses: Top-down and bottom-up processing.
finding just the "right" edges
Humans can see illusory contours because they are the visual system's best guess about where edges lie when parts of a contour are not visible in the image.
Grouping and Gestalt principles
Idea that humans perceive the world as organized patterns & objects; organizing/categorizing stimuli into groups w/ meaning. -"The whole is greater than the sum of its parts" -Good continuation: two elements will tend to group together when they lie on the same contour. -Similarity: similar objects will group together. -Proximity: objects near each other will group together. -Common region: objects will group together if they appear to be part of the same larger region. -Common fate: objects moving together tend to be grouped. -Synchrony: objects that change at the same time tend to be grouped.
Types of Eye Movements Know the characteristics and function/purpose of each type
Involuntary eye movements - Fixational Eye Movements: Fixational eye movements occur when observers try to maintain fixation on an object. They fall into three categories: - Tremor: Noisy oscillating movement with a frequency of about 90 Hz and an amplitude of about the diameter of a cone (around 1 arcmin of visual angle) - Drift: Slow motions of the eye that occur between microsaccades. Seem to be important for reducing information redundancy in the retinal image - Microsaccades: Fast, jerky eye movements that carry the retinal image over a range of many dozens to hundreds of photoreceptor widths. Critical for forestalling Troxler fading. - Optokinetic Nystagmus (OKN): Movement triggered by tracking of a moving field (as opposed to tracking a single target). Nystagmus is a jerky-looking oscillatory movement of the eyes. Example 1: looking out of the window of a moving car, without focusing on anything in particular. Example 2: trying to track a moving train while standing on the train platform. - Vestibulo-Ocular Reflex (VOR): Maintains vergence and line of sight when the head is moved during fixation or smooth pursuit. Works as a reflex that depends on input from the vestibular system. Necessary for stabilizing vision during body movements. Voluntary eye movements - Vergence - These are eye movements meant to bring a common point into fixation (focus) in both eyes - Convergent eye movements turn the eyes inward - Divergent eye movements turn the eyes outward - Smooth Pursuit - Smooth pursuit movements are voluntary eye movements in which the eyes move smoothly to follow a moving object - Useful for tracking objects and for discriminating or identifying physical details of moving objects - Normally can only be executed while tracking a moving object.
If you try to move your eyes smoothly without tracking anything, you will actually end up making a series of saccades (jerky eye movements) - Saccades - A saccade is a type of eye movement, made voluntarily or elicited involuntarily, in which the eyes rapidly change fixation from one object or location to another - Saccades are probably the most important and certainly the most widely studied form of eye movement. They are made primarily to gain information, by moving the high-acuity fovea to regions of interest in the scene
Opponent channels
L - M (red-green) • S - (L+M) (blue-yellow) • (L+M) - (L+M) (black-white)
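A small Python sketch of how opponent signals could be computed from cone outputs. The function name and input values are hypothetical, and treating the black-white channel as the summed L+M signal is an assumption of this sketch, not something the card states.

```python
def opponent_channels(l, m, s):
    """Combine L, M, and S cone outputs into three opponent-style signals."""
    return {
        "red_green": l - m,          # positive leans red, negative leans green
        "blue_yellow": s - (l + m),  # positive leans blue, negative leans yellow
        "luminance": l + m,          # assumed stand-in for the black-white channel
    }


# Equal L and M output with weak S output (made-up numbers).
print(opponent_channels(l=2.0, m=2.0, s=1.0))
# {'red_green': 0.0, 'blue_yellow': -3.0, 'luminance': 4.0}
```

With balanced L and M, the red-green signal is zero while the blue-yellow signal is negative, which in this scheme corresponds to a yellowish light.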
3. How does the brain combine information from the two eyes to get a percept of depth?
Monocular Cues to Three-Dimensional Space Except in extremely impoverished perceptual environments (such as what you might find in a perceptual lab experiment) every view of the world provides multiple depth cues.
Panum's fusional area and diplopia
Panum's fusional area refers to the region in space where the two slightly disparate images from each eye can be fused into a single, three-dimensional perception. This area is also called the zone of binocular single vision. Diplopia: Diplopia, also known as double vision, is a condition in which a person sees two images of a single object. These images may appear side by side, one on top of the other, or at an angle to each other.
RF size
RFs refer to the specific region of the visual field that will cause the neuron to respond when stimulated. -In general, neurons in the early visual areas, such as V1, have relatively small RFs that respond to visual stimuli within a limited region of the visual field. -As visual information is processed in higher visual areas, the RF size of neurons tends to increase. -The size of a neuron's RF can also be modulated by factors such as attention and context.
Selfridge's Pandemonium model
Recognize objects by pooling together feature decisions made by "feature demons". Represents the idea of multiple layers of neurons that react more "loudly" to specific images. -Bottom level: data or feature demons -Middle level: cognitive demons -Top level: Decision demons
Color vision under photopic vs. scotopic conditions
Scotopic (very dim): -Only rods are active -All rods have the same sensitivity to various wavelengths of light -Per the principle of univariance, rods cannot sense differences in color -We are effectively colorblind in very dim conditions. Photopic: -S, M, and L cones each have a different photopigment and absorption spectrum -Different wavelengths of light can be discriminated based on differences in the responses they elicit among the three receptor types
Physiology of Eye Movements
Six muscles are attached to each eye, arranged in three pairs. -Eye movements are controlled by an extensive network of structures in the brain (lateral intraparietal area, frontal eye fields, and superior colliculus)
Cone-Opponent Cells in the Retina and LGN
Some retinal ganglion cells have center-surround receptive fields with color opponency • Negative color afterimages are believed to be caused by the neural adaptation of these cells • The perceived polarity is the opposite of the original stimulus, as defined by the color-opponent circuits - Red produces green afterimages - Blue produces yellow afterimages - Light stimuli produce dark negative afterimages - Ewald Hering (1834-1918) noticed that some color combinations are "legal" while others are "illegal." • We can have bluish green (cyan), reddish yellow (orange), or bluish red (purple). • We cannot have reddish green or bluish yellow.
Basic Principles of Color Perception
Step 1: Light Detection Step 2: Wavelength Discrimination Step 3: Color Appearance -Most of the light we see is reflected -We see only a small part of the electromagnetic spectrum
Stereoscopes and Stereograms
Stereoscope: A device for presenting one image to one eye and another image to the other eye. (Viewmasters) Stereograms: Stereograms are two-dimensional images that are designed to create the illusion of a three-dimensional scene.
Stimulus invariances
Stimulus invariances refer to the ability of the visual system to recognize and respond to visual stimuli that vary in certain properties, such as orientation, size, position, or lighting conditions. -Hierarchical processing -Tuning to multiple features -Experience-dependent plasticity -Attention and context
Structural descriptions and Biederman's geons
T junctions: indicate occlusion. Y junctions: indicate corners facing the observer. Arrow junctions: indicate corners facing away from the observer. Each of these features is still present if the object is scaled, shifted, or rotated by a small amount. -Geons ("geometric ions"): taking any real-world object and describing it as an arrangement of simple 3D shapes (kinda like those 3D shapes used in elementary school when you were learning shapes)
Compression and texture gradients
Texture gradients: provide depth information indirectly by telling us about the orientation of a surface. Compression: Elements of the texture get smaller and closer together.
How to build a motion sensitive cell? : The Reichardt Detector
The Reichardt detector consists of two photoreceptor cells that are separated by a short distance in the visual field. The signals from these cells are then fed into two separate channels, each of which consists of a delay line and a multiplication stage. The delay line introduces a time delay in the signals from one channel relative to the other, which allows the signals to be compared at a later time. Must incorporate: -connection to at least two cells representing different retinal positions -delay between response from "first" and "second" cells -mechanism that only responds when the (delayed) response from the first cell and the response from the second cell occur simultaneously -Responds best only to motion in a particular direction -Responds best only to motion at a particular velocity (determined by delay)
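The delay-and-multiply idea can be sketched in Python as one half of a Reichardt detector working in discrete time steps. The function names, the single-spot stimulus, and the choice of a delay of 2 steps are assumptions made for illustration; this is not a model of any specific neuron.

```python
def reichardt_half_detector(first, second, delay):
    """Multiply the delayed signal from the first cell with the current signal
    from the second cell, and sum the product over time."""
    total = 0.0
    for t in range(delay, len(second)):
        total += first[t - delay] * second[t]
    return total


def moving_spot(n_steps, time_at_a, time_at_b):
    """Responses of two cells (A and B) to a spot that passes A and B at the given times."""
    a = [0.0] * n_steps
    b = [0.0] * n_steps
    a[time_at_a] = 1.0
    b[time_at_b] = 1.0
    return a, b


DELAY = 2  # hypothetical delay, in time steps

# Preferred direction and speed: the spot reaches B exactly DELAY steps after A.
a, b = moving_spot(10, time_at_a=2, time_at_b=4)
print(reichardt_half_detector(a, b, DELAY))  # 1.0

# Opposite direction: the spot reaches B before A.
a, b = moving_spot(10, time_at_a=4, time_at_b=2)
print(reichardt_half_detector(a, b, DELAY))  # 0.0

# Same direction but too slow: the travel time does not match the delay.
a, b = moving_spot(10, time_at_a=2, time_at_b=7)
print(reichardt_half_detector(a, b, DELAY))  # 0.0
```

The unit responds only when the delayed first signal and the second signal coincide, so it is selective for both the direction and the speed set by the delay.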
Optic Flow: Using Motion Information to Navigate
The changing angular position of points in a perspective image that we experience as we move through the world.
The correspondence problem in stereopsis
The correspondence problem in stereopsis refers to the challenge faced by the visual system in matching corresponding features between the images received by each eye in order to create a three-dimensional perception of the world.
The horopter (locations with zero disparity)
The horopter is an imaginary surface or line in space that corresponds to the set of points in the visual field that are seen with zero disparity. In other words, when an object lies on the horopter, it is seen at the same location by both eyes, and its images fall on the corresponding points of each retina.
Eye movement regions of the brain (LIP, FEF, SC)
The lateral intraparietal area (LIP) in the parietal cortex The frontal eye fields (FEF) in the frontal cortex The superior colliculus (SC) in the midbrain
Illuminant
The light source
Correspondence problems in motion
The problem of determining correspondence between features in successive frames. -The correspondence problem in motion is analogous to that in stereopsis -Motion viewed through a narrow aperture is ambiguous
Saccadic suppression
The reduction of visual sensitivity that occurs when we make saccadic eye movements
Opponent Color Theory
The theory that perception of color is based on the output of three mechanisms, each of them resulting from an opponency between two colors: red-green, blue-yellow, and black-white. -Some retinal ganglion cells have center-surround receptive fields with color opponency
The Trichromatic Solution
The theory that the color of any light is defined in our visual system by the relationships of three numbers, the outputs of the three receptor (cones) types. -Note: this is NOT a general property of light itself or of the visual systems of all animals
Triangulation Cues to Three-Dimensional Space
Triangulation cues are visual cues that our brains use to perceive the depth and three-dimensional shape of objects in our environment.
anamorphosis in art
Use of the rules of linear perspective to create a distorted 2D image that only looks correct when viewed from a particular viewpoint.
The visual cortical hierarchy
V1 is at the bottom of the hierarchy (V1 neurons have tiny receptive fields and relatively primitive tuning). The further up you go, the more complex and selective the tuning gets and the larger the receptive fields become (higher-order cortical areas). Two routes: 1. Dorsal/parietal: mostly the "where" and the "how" (space and guiding motor planning). 2. Ventral/temporal: the "what" (recognition). V4 has much larger receptive fields than V1 (integrating more information over a larger area of space). Receptive fields in the highest-order areas are very complex and can cover almost the entire visual field. Higher visual cortical areas show functional specialization (e.g., for color stimuli and motion stimuli).
Main visual areas: V1, V2, V4, & IT/LOC
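V1: Tuning for bars (oriented edges), small receptive fields V2: Sensitivity to boundary ownership V4: Tuning for complex shapes IT/LOC: Object- or part-specific tuning with viewpoint invariance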
Other areas beyond V1 (V2, V4, IT/LOC)
V2: -Integration of visual information -Stereopsis and depth perception -Color processing -Object recognition -Attention and perception V4: -Ventral -Color processing -Object recognition -Attention and perception -Visual memory IT/LOC: -Object recognition -Visual memory -Holistic processing (overall configuration of visual features perceived instead of individual features) -Integration of visual, touch, and audition
Can we build an object-selective receptive field from the RFs of V1 neurons? What are the difficulties in accomplishing this?
We could build a simple, template-like RF for one specific view of an object, but it would not work in general, because we need to recognize the object across different angles, sizes, colors, etc.
The Problem of Object Recognition
We do not know exactly how the brain accomplishes object recognition. For example, we don't know how receptive fields selective for houses are built from simple RFs like those in the LGN.
Objects in the Brain
We recognize objects by: -Retinal Ganglion Cells and LGN: Spots -Primary Visual Cortex: Bars The extrastriate cortex (region of cortex bordering the primary visual cortex and containing multiple areas involved in visual processing) turns the spots and bars into objects and surfaces.
The Principle of Univariance
With regard to cones, the principle that absorption of a photon of light results in the same response regardless of the wavelength of the light
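A minimal Python sketch of univariance, assuming a cone's response is simply its sensitivity at a wavelength multiplied by the light's intensity. The sensitivity values and the function name are made up for illustration.

```python
# Hypothetical sensitivities of one cone type at two wavelengths (nm).
SENSITIVITY = {450: 0.25, 550: 0.5}


def cone_response(wavelength_nm, intensity):
    """The response reflects only how much light is absorbed, not which wavelength it was."""
    return SENSITIVITY[wavelength_nm] * intensity


dim_preferred = cone_response(550, intensity=4.0)
bright_nonpreferred = cone_response(450, intensity=8.0)

print(dim_preferred, bright_nonpreferred)  # 2.0 2.0
```

The two lights differ in wavelength, yet the cone's output is identical, so one receptor type alone cannot separate wavelength from intensity.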
Why code motion as a fundamental perceptual dimension?
You can build motion perception off of existing systems (systems for recognizing objects and determining relative position).
Motion parallax
a depth cue in which the relative movement of elements in a scene gives depth information when the observer moves relative to the scene
Nonaccidental features
a feature of an object that is not dependent on the exact (or accidental) viewing position of the observer
Relative size
a monocular cue for perceiving depth; if we assume that two objects are similar in size, we perceive the one that casts the smaller retinal image as farther away
Deep Neural Nets and their connection to:
a type of artificial neural network that is designed to model complex nonlinear relationships between inputs and outputs. They are called "deep" because they have multiple layers of neurons, which allow them to learn hierarchical representations of data. DNNs are inspired by the structure and function of the human brain, which also has multiple layers of neurons that process information in a hierarchical fashion.
Binocular Depth Cues
clues about distance based on the differing views of the two eyes
Pictorial Depth Cues
clues about distance that can be given in a flat picture
"Grandmother" cells
hypothetical neuron that represents a complex but specific concept or object. It activates when a person "sees, hears, or otherwise sensibly discriminates" a specific entity, such as his or her grandmother.
Akinetopsia
inability to see objects in motion caused by disruptions to area MT
Motion Cues
information that specifies the distance of an object on the basis of its movement
Accommodation & blur
intraocular muscles are responsible for adjusting the shape of the lens; brings near or far objects into focus on the retina.
Computation of Visual Motion
motion coded along opponent channels
Spectral
referring to the wavelength of light
Oculomotor musculature
six muscles are attached to each eye, arranged in three pairs
Random Dot Stereograms
stereograms in which the images consist of a randomly arranged set of black and white dots, with the left-eye and right-eye images arranged identically except that some of the dots are moved to the left or the right in one of the images, creating either a crossed or an uncrossed disparity
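One possible way to generate such a pair is sketched below in Python, using only the standard library. The function name, the box coordinates, and the decision to refill the uncovered strip with fresh random dots are assumptions for illustration.

```python
import random


def random_dot_stereogram(width, height, box, shift, seed=0):
    """Return (left, right) dot images that are identical except that the dots
    inside `box` = (x0, y0, x1, y1) are moved `shift` columns to the left in the
    right-eye image. Assumes 0 < shift <= x0."""
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(width)] for _ in range(height)]
    right = [row[:] for row in left]
    x0, y0, x1, y1 = box
    for y in range(y0, y1):
        # Copy the boxed dots from the left image into shifted positions.
        for x in range(x0, x1):
            right[y][x - shift] = left[y][x]
        # Fill the strip uncovered by the shift with new random dots.
        for x in range(x1 - shift, x1):
            right[y][x] = rng.randint(0, 1)
    return left, right


left, right = random_dot_stereogram(width=20, height=10, box=(8, 3, 14, 7), shift=2)
for left_row, right_row in zip(left, right):
    # Print the two images side by side: 1 = black dot, 0 = white dot.
    print("".join(map(str, left_row)), "".join(map(str, right_row)))
```

Neither image contains a visible square on its own; the shifted region is defined only by the disparity between the two images.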
Optic flow
the complex motion of points in the visual field caused by relative movement between the observer and environment; provides information about the relative distance of objects from the observer and of the relative direction of movement
Aerial perspective
the haziness that surrounds objects that are farther away from the viewer, causing the distance to be perceived as greater
Second-Order Motion
the motion of an object that is defined by changes in contrast or texture, but not by luminance
First-Order Motion
the motion of an object that is defined by changes in luminance
What is Color
the property possessed by an object of producing different sensations on the eye as a result of the way the object reflects or emits light.
Vergence angle
the relative angle between your eyes when you are looking at an object provides information about how far away the object is
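The geometry can be sketched in Python, assuming the object lies straight ahead so the two eyes rotate symmetrically. The function names and the default interpupillary distance of 0.063 m are illustrative assumptions.

```python
import math


def vergence_angle_deg(distance_m, ipd_m=0.063):
    """Angle between the two lines of sight when fixating a point straight ahead."""
    return math.degrees(2 * math.atan(ipd_m / (2 * distance_m)))


def distance_from_vergence(angle_deg, ipd_m=0.063):
    """Invert the relationship: recover fixation distance from the vergence angle."""
    return ipd_m / (2 * math.tan(math.radians(angle_deg) / 2))


for distance in (0.25, 1.0, 4.0):
    print(distance, round(vergence_angle_deg(distance), 2))
```

The angle shrinks quickly as distance grows, so under these assumptions vergence is most informative for nearby objects.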
Involuntary eye movements
unavoidable small eye movements that occur during fixation
Occlusion
when one object partially covers another