Final Exam - Cognition
One of GOFAI - Logical Theorist (Newell and Simon 1955)
- Discovers proofs for theorems in symbolic logic using two things: a list of axioms and previously proved theorems
- The Logical Theorist (machine) was able to prove 38 out of the 52 theorems of Principia Mathematica, one proof being even more elegant than the original one
- Newell and Simon compared the LT's operation with transcripts of humans solving the same problems, to show that the program demonstrated human-type thinking
What are the limitations of the proposal for the Case Study: inferring multimodal shape representations?
- The prior distribution over shapes p(h) was quite limited: the model was given the parts of Fribbles (e.g., central body) and only had to learn the choice of parts for each object and each category
What would a more realistic prior over object shape look like?
- What are the parts into which all objects can be decomposed? Which parts are more probable (occur more frequently) than others?
- What are the rules for combining parts to make objects? Are certain arrangements of parts more probable?
- How do humans learn to decompose/recompose object shapes?
- Is knowledge of object shape in general innate or learned from experience? (E.g., how would one learn an inventory of object parts?)
What are the components of a Turing machine?
- Read/write head
- Machine table containing instructions for reading/writing and moving the head
- Memory for the current state
- Infinitely long tape
The head can read and write a single symbol in a cell of the tape, and move to the left or right along the tape.
One of the GOFAI - SHRDLU (Winograd 1970)
- Responds to questions about its block world
- Can carry out commands on the simulated objects in its block world, and report the results
- It can even ask for clarifications (about the commands) and give explanations and answers
- The organization of SHRDLU includes monitor, input, grammar, semantics, dictionary, semantic features, PROGRAMMAR (the parser), blocks, answer, planner, mover, and data, with arrows between these different parts
- Example SHRDLU dialogue:
P: When did you pick up [the green pyramid]?
C: While I was stacking up the red cube, a large red block, and a large green cube.
P: Why? (asking for an explanation)
C: To get rid of it.
P: Why did you do that?
C: To clean off the red cube.
P: Thank you.
C: You're welcome.
The functionalist philosophy that underlies Logical Theorist:
- Functional equivalence (the ability to think and prove theorems) does not imply structural equivalence (neurons are not circuits)
- Figuring out what neural mechanisms carry out the brain's information-processing functions is a completely separate task, for another level of theory construction, that the Logical Theorist does not cover
- Their theory is about the information processes involved in problem-solving, and NOT a theory about the neural or electronic (circuitry) mechanisms for information processing
- In short: functional equivalence does not equal structural equivalence. The Logical Theorist is not trying to explain the mechanisms behind the information processing; it is simply a theory of the actual problem-solving process.
Case study: vowel systems (the numbers)
- 37 vowels in the standard vowel chart
- If n-vowel systems were created by choosing 'n' vowels at random, we would expect there to be 7,770 different possible 3-vowel systems, 66,045 4-vowel systems, etc.
- C(37, n) = 37! / (n!(37-n)!) = the number of possible systems of size 'n'
- The number of logically-possible systems is already large when n = 3, and grows extremely fast with 'n' (a quick check is sketched below)
Do we find an enormous range of variation in our typological surveys of many languages?
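A minimal sanity check of these counts (my own sketch, Python standard library only):

from math import comb

# Number of n-vowel systems drawn from an inventory of 37 vowels: C(37, n).
for n in (3, 4, 5):
    print(n, comb(37, n))   # 3 -> 7770, 4 -> 66045, 5 -> 435897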
What are some typical sources of noise? (MENCH)
- Memory noise
- Environmental and experimentally-induced noise
- Neural firing noise [S, pp. 360-361]
- Corrupted input (e.g., for language: false starts, speech errors, etc.)
- Higher cognition: "everybody lies" (Dr. House)
Some new (or renewed) areas of the field
-Moral psychology -Cultural universals -Consciousness -Aesthetic psychology
How is UMG understood as one of many mental information processing systems?
- UMG converts stimuli into symbolic moral representations by a series of domain-specific transformations
- The visual system converts patterns of light (the proximal stimulus) into symbolic representations (of edges, surfaces, geons, objects, faces, etc., and their relative locations) by a series of domain-specific transformations. Ex) detection of an oriented edge at a particular location in the visual field
- The linguistic system converts patterns of acoustic energy into symbolic representations of speech sounds, words, sentences, and ultimately meanings by a series of domain-specific transformations. Ex) categorization of a speech signal as an instance of [ka] or [ga]
Turing's discoveries + other developments in mathematics and computer science led to a wave of Artificial Intelligence research, which included which 5 discoveries aka GOFAI (good old fashion artificial intelligence)?
1. Logical Theorist (Newell & Simon 1955) 2. DENDRAL (Feigenbaum, Buchanan & Lederberg 1971) 3. ELIZA (Weizenbaum 1966) 4. SHRDLU (Winograd 1970) 5. Scripts (Schank 1972) and Frames (Minsky 1975)
What are the two components to any Bayesian account?
1. Prior beliefs p(h) - made up of innate predispositions and previous experience
2. Likelihood p(d|h) - made up of a generative/causal model of observations (the model should include sources of noise) - the probability of observing data d if hypothesis h were true = the likelihood of hypothesis h given observed data d
What are the plausible types of prior knowledge that could lead to these deviations? (variation from presented values)
Ambiguity in vision: 3D -> 2D
- A major source of uncertainty in visual perception is due to the mapping from 3D space to 2D retinal images
- While measurement noise can (in principle) be mitigated/reduced by repeated observations, the ambiguity inherent in the 3D-to-2D projection can only be overcome by PRIORS on possible percepts (also called 'constraints' or 'heuristics')
What is the input to a Turing machine?
An initial string of symbols on the tape (and the position of the machine head on the tape)
Turing's claim, following his invention of the Turing Machine was:
Any explicitly stated computational task could be performed by a machine with an appropriate set of finite instructions
Case study I: co-sleeping across cultures - Who ought to sleep by whom in the human family?
The historically evolved behavioral script that calls for nighttime separation of children from parents is applied on a nightly basis in middle-class American families. But for adults in Asia, Africa, and Central America, this separation is a form of "child neglect." Japanese think of Americans as "merciless" for forcing children to be on their own, isolated in a dark room throughout the night. People in Orissa, India have similar moral concerns about the practice, which they see as parental irresponsibility.
What are the functions of the different components of a Turing machine?
The machine table, the set of machine states, and the set of symbols (the alphabet) of a Turing machine are finite; only the tape is infinite. The machine table and set of machine states of a Turing machine can be thought of as the program that the machine is running.
- Ex. if in state q0 and reading symbol 1 on the tape, write a blank ('_'), go to state q1, and move right one cell
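A minimal sketch of such a machine in Python (my own illustration; the table below is a hypothetical example in the style of the rule above, not one from the lecture):

def run_turing_machine(table, tape, state="q0", halt="halt", max_steps=1000):
    """Run a Turing machine; table maps (state, symbol) -> (write, move, next_state)."""
    cells = dict(enumerate(tape))                 # sparse tape: position -> symbol
    head = 0
    for _ in range(max_steps):
        if state == halt:
            break
        symbol = cells.get(head, "_")             # blank cells read as '_'
        write, move, state = table[(state, symbol)]
        cells[head] = write                       # write a symbol in the current cell
        head += 1 if move == "R" else -1          # move one cell right or left
    span = sorted(cells)
    return "".join(cells[i] for i in range(span[0], span[-1] + 1))

# Hypothetical table: erase a run of 1s, then halt on the first blank.
table = {("q0", "1"): ("_", "R", "q0"),
         ("q0", "_"): ("_", "R", "halt")}
print(run_turing_machine(table, "111_"))          # -> '____'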
Responses to the Chinese Room argument - the systems reply
The systems reply: an individual person locked in a room may not understand the story, but he is merely part of an entire system, and the system does understand the story. Searle's response:
- If strong AI is to be a branch of psychology, then it needs to be able to distinguish the systems that are genuinely mental (that have beliefs and understanding) from those that are not
- The study of the mind starts with the fact that humans have beliefs while thermostats, cellphones, and adding machines don't. If you get a theory that denies this point, you have produced a counterexample to the theory, and the theory is false.
What principles could explain these perceptual effects? (greater variability, bias)
They propose that perceivers have 'prior knowledge' that is adapted to the environment and favors cardinal (90/180 degree) orientations, AND that non-cardinal orientations carry more perceptual noise. Key: in the figure, brighter values have higher probability.
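A minimal sketch of how a cardinal-favoring prior produces bias (my own toy model with assumed numbers, not the paper's fitted model):

import numpy as np

theta = np.linspace(0, 180, 361)                    # candidate orientations h (degrees)
prior = 1.0 + 0.8 * np.cos(np.deg2rad(4 * theta))   # assumed prior p(h), peaked at cardinals
measurement, noise_sd = 80.0, 6.0                   # one noisy reading near (not at) 90
likelihood = np.exp(-0.5 * ((theta - measurement) / noise_sd) ** 2)   # p(d|h)
posterior = prior * likelihood                      # Bayes (unnormalized)
print(theta[np.argmax(posterior)])                  # > 80: estimate pulled toward the cardinal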
Noisy measurements in vision: orientation perception (Girshick, Landy, and Simoncelli 2011)
They reported systematic error functions and biases in comparisons of edge orientation, due to these perceptual effects:
- 'Greater variability' in perception of non-cardinal values (orientations other than 90 and 180 degrees)
- 'Bias' to perceive values closer to cardinals (closer to 90 and 180 degrees than the true values)
What principles could explain these perceptual effects?
Example question about noisy measurements from polls - margin of error, inference
A quick calculation for a poll with n respondents and outcome proportion p (e.g., proportion REP): the standard error is √(p(1-p)/n), and the 95% margin of error is ±1.96 × √(p(1-p)/n).
If Poll1 has 500 respondents and p1 = 0.48, while Poll2 has 100 respondents and p2 = 0.52, intuitively how should we combine the evidence from the two polls to arrive at an INFERENCE about the probability of the election outcome?
a) Give the two polls equal weight because each was based on a random sample from the population of voters
b) Combine the polling data giving greater weight to Poll1
c) Combine the polling data giving greater weight to Poll2
d) The evidence from the polls cannot be combined because their outcomes are not the same (p1 ≠ p2)
Poll1: 48% (±2.23%) vs. Poll2: 52% (±5.00%) (these ± values are one standard error)
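A sketch of the arithmetic, plus one standard way to combine the polls (inverse-variance weighting; my illustration of why answer (b) is right, not a method given on the slides):

import math

def std_err(p, n):
    return math.sqrt(p * (1 - p) / n)        # one standard error of a proportion

polls = [(0.48, 500), (0.52, 100)]           # (proportion REP, respondents)
for p, n in polls:
    print(f"p={p:.2f}  SE={std_err(p, n):.4f}  95% MOE={1.96 * std_err(p, n):.4f}")

# Weight each poll by 1/SE^2, so the larger poll (Poll1) counts more.
weights = [1 / std_err(p, n) ** 2 for p, n in polls]
combined = sum(w * p for w, (p, n) in zip(weights, polls)) / sum(weights)
print(f"combined estimate = {combined:.4f}")  # ~0.487, much closer to Poll1's 0.48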
What principle or principles determine the vowels that are likely/possible in a system of size 'n'?
ADAPTIVE DISPERSION: speakers/listeners have a universal preference for systems in which the vowels are more distinct (i.e., more dispersed). Intuitively: which types of language would be easier to comprehend, if perceived similarity is represented by proximity in the chart? (A toy version of this idea is sketched below.)
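A toy sketch in the spirit of dispersion models (e.g., Liljencrants & Lindblom); the (F1, F2) values are rough assumptions of mine, not data from the lecture:

import itertools

vowels = {"i": (280, 2250), "e": (450, 2000), "a": (750, 1300),
          "o": (450, 900),  "u": (300, 850)}   # assumed (F1, F2) in Hz

def energy(system):
    """Sum of inverse squared distances between vowel pairs: lower = more dispersed."""
    return sum(1 / ((vowels[v][0] - vowels[w][0]) ** 2 +
                    (vowels[v][1] - vowels[w][1]) ** 2)
               for v, w in itertools.combinations(system, 2))

# The least 'crowded' 3-vowel system from this inventory is [i a u],
# matching the near-universal three-vowel system (e.g., Haida).
print(min(itertools.combinations(vowels, 3), key=energy))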
Defining computation with Turing machines
Alan Turing, British mathematician, computer scientist, WWII code breaker
What is the relation to Marr's Three levels of description? (2 of 3)
Algorithmic
The algorithmic description of a computational system states the explicit procedure (algorithm) by which the system computes its function (i.e., specifying the particular sequence of events for each input).
Ex.) Two of the Turing machines discussed earlier compute the same function, but with different algorithms (machine states and tables).
Algorithms can differ in many ways, for example:
- Correctness/soundness (is the desired function computed?)
- Efficiency (speed: how much time is required by the algorithm in the worst case? Space: what are the memory requirements of the algorithm in the worst case?)
- Intermediate representations (created between input and output)
(See the sketch below for one function computed by two different algorithms.)
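A small illustration of "one function, two algorithms" (my own example, not from the lecture): both procedures compute addition of non-negative integers.

def add_by_counting(a, b):
    """Increment a, b times - like a unary Turing machine shuttling along its tape."""
    for _ in range(b):
        a += 1
    return a

def add_by_bits(a, b):
    """Ripple-carry addition on bits - a different procedure, same input-output map."""
    while b:
        carry = a & b       # positions where both numbers have a 1
        a = a ^ b           # sum without the carries
        b = carry << 1      # carries, shifted into place
    return a

assert add_by_counting(23, 19) == add_by_bits(23, 19) == 42   # same function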
What are noisy measurements?
All of our sensory evidence about the world, and the internal workings of the mind/brain themselves, are corrupted by random fluctuations (noise). The main consequence of this noise: since it corrupts them, sensory measurements and other neural representations are not perfectly VERIDICAL indicators of external reality.
What are some contemporary analogues of the Turing test?
Bronze medal: most human-like computer
Silver: fool 2+ judges with text input
Gold: fool half of the judges with multi-modal input
Survey of evidence (claim 3)
Claim: the 'Socratic method' reveals stable, systematic moral intuitions that are common across various demographics and independent of explicit justifications.
If this is correct, it would suggest a certain degree of universality of moral knowledge, and a certain type of independence from other knowledge (modularity) of the 'moral faculty' (the moral faculty would be independent from other knowledge).
Mikhail 2007 reviews a number of 'trolley problems' that are claimed to evoke such intuitions: stable, systematic, universal across demographics, and not dependent on explicit justifications.
Survey of Evidence (Claim 1)
Claim: All languages have words for the basic deontic concepts (relating to duty or obligation): obligatory, permissible, forbidden.
If correct, this would be parallel to other linguistic universals and might suggest the existence of a Universal Moral Grammar.
Survey of Evidence (Claim 2)
Claim: Young (3-4 yr old) children can distinguish 'genuine' moral violations from 'social convention' violations Example testing scenario: a. Timmy wore pj's to school, is that okay? b. Timmy hit the little girl next to him for no reason, is that okay? What if the teacher said it was okay, would it then be ok? a. yes b. NO
Case Study I: co-sleeping across cultures (different regions)
Cleveland, Ohio
- Only 3% of children regularly slept in their parents' bedroom during the first year of life, and only 1% did so after their first birthday
Appalachian Kentucky
- 71% of children between 2 months and 2 years, and 47% of children between 2 years and 4 years, co-slept with a parent
Japan
- 50% of 11-15 year old urban Japanese boys and girls slept in the same room as their mother or father or both; only 14% of 11-15 year olds slept alone
Whiting
- Survey of customary sleeping arrangements in 134 societies: infants and mothers co-sleep most of the time
Cognitive universals: color terms Potential problem
The color system of Berinmo (spoken in Papua New Guinea) differs from the color categories of English.
What is the Universal Turing machine?
It is possible to define a Turing machine that is universal, in that it can simulate the computation of any other Turing machine. Simulation of a Turing machine T by a UTM is performed by placing a description of T on the tape. My desktop computer is also a UTM (or would be, given an infinite tape).
Do we find an enormous range of variation in our typological surveys of many languages?
NO! There are many fewer attested vowel systems than logically-possible ones.
n = 3: [i a u] (e.g., Haida). Nearly all languages have at least these three vowels.
n = 4: [i e a u] (e.g., Bandjalang), + 5 other systems, all rare.
n = 5: [i e a o u] (e.g., Spanish), [i ɜ a ɔ u] (e.g., Nootka), + 10 other systems, all rare.
What is the Church-Turing thesis?
No method of computation carried out by a mechanical process can be more powerful than a Turing machine. In other words, no mechanical process can compute anything that cannot be computed by a universal Turing machine.
Noisy measurements in vision: color perception
On each trial of a simple visual working memory experiment (Bae et al. 2014), participants were shown a single colored square. Then, after a brief memory delay, they were asked to report its color by clicking on a color wheel.
a. A procedure of delayed estimation
b. Color spaces used in Experiment 1 (LAB or HSV color wheel)
Results:
- The reported colors varied (sometimes a lot) from the presented values
- The error observed in the reported values is consistent with color-specific measurement noise distributions
What are plausible types of prior knowledge that could lead to these deviations (variation from presented values)?
What do the symbols stand for in the CIE L*a*b* color space?
One 'slice' through the CIE L*a*b* color space:
L* dimension: lightness (here fixed at L* = 70)
a* dimension: green-red
b* dimension: blue-yellow
What distinguishes the 'trolley' and 'bystander' scenarios from the 'footbridge' and 'transplant' scenarios according to this theory ?
Our explanation should use the concepts and representations of the theory -- 'intended vs. unintended' ('ends' vs. 'side effects'), chains of causation, the deontic rules -- and the process of converting the scenario into an act tree.
Lecture 12/01/16 - Cognitive Universals
Outline: some empirical domains in which cognitive universals have been studied
- vowel systems
- color terms
- noun phrase word order
- kinship systems (Kemp and Regier 2012)
- morality (Mikhail 2007)
Lecture 11/10 - Computational theory of mind
Outline
Hypothesis: Cognition is computation
- Turing machines
- The Church-Turing thesis
- The Turing test
Three levels at which information processing systems can be described: functional (aka computational), algorithmic, implementational
Future of artificial intelligence
Lecture 11/17: Probabilistic Models of Cognition
Outline:
- Noisy and ambiguous measurements of the world (dating, politics, vision, ...)
- Bayes' Theorem (rationally combining noisy and ambiguous measurements WITH prior beliefs: posterior, likelihood, prior)
- Case study: inferring multimodal shape representations
- Case study: inferring physical stability
- General issues for probabilistic modeling of cognition
According to Searle, who has a mind?
"Could a machine think?" My own view is that ONLY a machine could think, and only very special kinds of machines--namely brains and machines that have the same causal properties as brains. -That is the main reason strong AI has not told us much about thinking--it has nothing to tell us about machines. By it's own definition, strong AI is about programs, and programs are not machines. - Intentionality [aboutness] is a biological phenomenon, causally dependent on the specific biochemistry of its origins just as lactation, photosynthesis, or any other biological phenomena (so, computers can't have it?) - No one would suppose that we could make milk and sugar by running a computer simulation of the formal sequences in lactation and photosynthesis, (but why do we suppose that we could creating thinking and minds by running a computer simulation of the formal sequences of thinking?)
Examples of noisy measurements (OkCupid)
"The Big Lies People Tell in Online Dating" - people are 2 inches shorter in real life - attractive pictures are more out-of-date ***!!! OkInference: If a person's profile reports a height of 6 feet, what is the probability of each possible true height (height given data: p(hId)?)
What are some of the Universal Turing Machines?
(Each of which requires a particular encoding of other Turing machines in terms of its symbol set.)
Penrose (1989): a Universal Turing Machine described as a single binary number that stretches across 2.5 pages
Minsky (1962): a 7-state, 4-symbol UTM
Wolfram (1985): a 2-state, 5-symbol UTM
UTMs with small numbers of states and symbols are expected to still have large machine tables. Nevertheless, these existence proofs are remarkable for their compactness.
Criticisms against the Logical Theorist (and its descendant, the General Problem Solver)
- "It only does what it has been programmed to do" - "It cannot learn" (from mistakes or previously encountered blind spots) - "It can only solve puzzles or problems that are easily translated in symbolic form" - "It does not have common sense"
Summary of Lecture 11/15
- Artificial Intelligence (AI): can computers think (like humans)? - Classic AI Models (GOFAI) - Strong vs Weak AI
One of GOFAI - DENDRAL (Feigenbaum, Buchanan, and Lederberg 1971)
- Attempted to figure out what kind of organic compound was being analyzed from mass spectrography data
- This makes it not a general problem solver, but a highly specialized 'module'
- It utilizes a massive store of knowledge about chemicals (to figure out the organic compounds)
- It is one of the first EXPERT SYSTEMS, which are used to imitate/aid the work of domain experts
- Others soon followed: MYCIN (helped physicians select antibiotics for patients with severe bacterial infections), MOLGEN, MACSYMA, and more
- One of the 500 rules in the database of MYCIN: "if the infection is meningitis, and these organisms were not seen in the culture, and the type of infection may be bacterial, and the patient has been burned, there is suggestive evidence that Pseudomonas aeruginosa is one of the organisms that might be causing the infection"
One of the GOFAI - ELIZA (Weizenbaum 1966)
- ELIZA tried to behave like a Rogerian psychotherapist
- "In what way?" "Can you think of a specific example?" "Your boyfriend made you come here?" "I'm sorry to hear that you are depressed."
- It can restate the user's statements as empathetic responses and ask some questions that are relevant and probing
- But only at a surface level
Questions about cognitive universals (nature vs nurture)
- NURTURE: Do culture and environment impose concepts and categories on the human mind?
- NATURE: Or do our mental concepts and categories reflect our innate systems of perception, classification, grouping, etc.?
*Recall the discussion of Universal Grammar proposed by Chomsky and the possibility of Universal Moral Grammar in the Mikhail reading
Interim summary for lecture on 11/17
- Noise is unavoidable in measurement by the senses (and in other representations of the mind/brain). This applies to color, orientation, luminance, position, length, ...
- The inherent ambiguity in the mapping of sensory data (e.g., retinal images) to states of the world (e.g., objects) makes the problem of perception IMPOSSIBLE or ILL-POSED without 'priors/constraints/heuristics'. Infinitely many realities could have given rise to any retinal image.
Searle's Chinese Room argument (1980) - thought experiment as argument
- Suppose that someone who has no knowledge of Chinese is put into a room containing Chinese symbols.
- Then the person is given a set of rules, or a program, in English that explains how to arrange the Chinese symbols (to form an answer) whenever a question (a sequence of Chinese symbols) is presented.
- Suppose that the program becomes so good, and the person so proficient at following it, that his 'answers' mirror those of native Chinese speakers.
- But in the Chinese case, unlike the English case, the person produces answers by manipulating uninterpreted formal symbols. As far as the Chinese is concerned, the person is simply behaving like a computer: he/she is performing computational operations (receiving questions and giving answers) on formally specified elements (Chinese symbols and the program). For the purposes of the Chinese, he/she is an instantiation of the computer program.
What are the methods for combining noisy measurements data from polls?
- Adjustment (based on prior experience): likely vs. registered voters, pollster biases
- Weighting (evidential value): sample size, recency, pollster rating
OkPollster: Do we know who will win (why or why not)? Given all of the polling data (d), what is the probability of each Republican winning, p(h = REP | d)?
What are the three main components of Mikhail's formal analysis of moral intuitions?
1. Tacit knowledge of (at least) the following deontic rules:
- Prohibition of intentional battery: purposefully or knowingly causing harmful or offensive contact with another individual, or otherwise invading another individual's physical integrity without his/her consent
- Principle of double effect: an otherwise prohibited action may be permissible if a) the prohibited act is not directly intended, b) the good but not the bad effects are directly intended, c) the good effects outweigh the bad effects, and d) no morally preferable alternative is available
2. Symbolic representations (structural descriptions, act trees) that distinguish and relate means, ends, and side effects
3. Conversion rules that map the stimulus (e.g., a description of a scenario, or a real-life scenario) via a series of rules into a symbolic representation of the above type:
scenario (stimulus) -> temporal/causal structure -> moral structure (good vs. bad effects) -> intentional structure: if an act (a 'means') has both good and bad effects, the good effects are the intended ones ('ends') and the bad effects are the unintended ones ('side effects')
(A toy version of how such rules could classify scenarios is sketched below.)
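A highly simplified sketch (my own toy encoding, not Mikhail's formalism) of how the deontic rules plus an act-tree-style summary could classify trolley-type scenarios. Each scenario records the act's good/bad effects and whether the harm is a means (directly intended) or a side effect:

scenarios = {
    "bystander":  {"lives_saved": 5, "lives_lost": 1, "harm_is_means": False},
    "footbridge": {"lives_saved": 5, "lives_lost": 1, "harm_is_means": True},
    "transplant": {"lives_saved": 5, "lives_lost": 1, "harm_is_means": True},
}

def permissible(s):
    """Toy double-effect test: harm must be a side effect, and good must outweigh bad."""
    if s["harm_is_means"]:                       # prohibition of intentional battery
        return False
    return s["lives_saved"] > s["lives_lost"]    # good effects outweigh bad effects

for name, s in scenarios.items():
    print(name, "->", "permissible" if permissible(s) else "impermissible")
# bystander -> permissible; footbridge/transplant -> impermissible, matching the
# direction of the reported survey results (bystander ~90%, footbridge/transplant ~10%).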
What are the five main questions relevant to Universal Moral Grammar? CAPIE
1. What constitutes moral knowledge? (Competence)
2. How is moral knowledge acquired? (Acquisition)
3. How is moral knowledge put to use? (Performance)
4. How is moral knowledge physically realized in the brain? (Implementation)
5. How did moral knowledge evolve in the species? (Evolution)
These are analogous to questions that have been asked about Universal Grammar for language, and similar questions could apply to other cognitive domains.
What are the four steps of testing these cognitive universal proposals? (cycle)
1. Gather parallel data from many individuals in many groups
2. Identify common properties and observed types of variation
3. Propose cognitive universals that account for the restricted variation
4. Go to step 1
(get data -> identify properties/variation -> propose universals -> repeat)
What are the two purposes of the infinite tape?
1. It provides the machine with access to the (simulated) environment, or outside world
2. It provides the machine with unlimited memory (because it can both read and write to the tape)
What are the cognitive universals?
Aspects of cognition that are (approximately) invariant across individuals, cultures, languages, geographic locations, and other variables. They play a central role in the 'nature' vs. 'nurture' debate.
Defining computation with Turing machines - what is a turing machine?
A Turing machine is a simple mathematical construct that can be used to define computation in a precise, rigorous way.
Case study: Vowel systems (basic background)
A fairly standard vowel chart. Basic data: What vowels are present in each language in the typological survey?
- Trubetzkoy (1939): 34 vowel systems for different languages
- Joseph Greenberg's Language Universals project
- The Stanford Phonology Archive
- UPSID (the UCLA Phonological Segment Inventory Database; Maddieson and Precoda), with vowel systems
- The World Atlas of Language Structures, a joint project with maps of language features
What is a specific Turing machine defined by?
A finite machine table, a finite set of machine states, and a finite alphabet of symbols. (One of the states is typically designated as the start state.)
One of the GOFAI - Scripts (Roger Schank 1972) and Frames (Marvin Minsky 1975)
A hallmark of human intelligence is that we understand more than is literally stated or shown.
- An example of this is human language comprehension. An example conversation: A: What time is it? B: Wendy Williams is on. From this exchange, we understand more than what is stated.
"Cody went to Olivia's fourth birthday party. After she blew out the candles and everyone sang, he gave her a new truck."
- Why did Cody give Olivia a truck?
- Name a song that was sung at the party.
- Is it likely that any kid got jealous or grabby?
We can use 'common sense' to answer these questions, even though the information is not explicitly given in the statement above. How can we program computers to answer them as well?
Case study: Color terms
A Munsell color chart. The Munsell color system (value goes up and down, hue goes around, chroma is the radius of the circle):
- White is at the top, black at the bottom
- The colors with maximal chroma or saturation are on the 'outer skin' of the space
Responses to the Church-Turing thesis
Based on the work of Alonzo Church and others, it remains a conjecture (a thesis, not a law). All attempts to disprove it so far have failed: known computational systems that look and act very different from Turing machines -- such as ordinary computers -- are nevertheless no more 'powerful'.
McCulloch and Pitts proved that a simple type of artificial neural network, in which units compute operations like AND, OR, and NOT, could be used to construct a universal Turing machine. This might mean that real neural networks are no more powerful than Turing machines -- in which case the Church-Turing thesis is correct for brain-like mechanical processes, which are what cognitive science studies.
Case study: color terms - cognitive universals for color terms
Basic data: how are colors (chips) grouped into categories by the languages in the survey? Tasks: a) assign color name to each chip b) select focal chip given a color name
General issues in Bayesian perception and cognition?
Bayesian computations have been successfully applied to many other domains of perception (color, shape) and cognition (learning, prediction). Many fundamental issues remain open:
- Does the mind/brain compute probabilities according to Bayes' Theorem exactly or approximately? (In all circumstances, or only for certain specialized cognitive abilities? Is the approximation affected by time pressure, attention, etc.?)
- How does the mind/brain compute probabilities (exactly or approximately)? (Does the mind represent probabilities with tables? What are the neural counterparts of digital Bayesian inference?)
- What is the origin of prior distributions in the mind/brain? (Are they due to genetics -- innate prior knowledge? Do they reflect statistical properties of the environment?)
Cognitive universals: color terms - Berlin and Kay experiment (Basic color terms: their universality and evolution)
Berlin and Kay, Basic Color Terms: Their Universality and Evolution
- Eleven elementary color categories: black, white, red, orange, yellow, brown, green, blue, purple, pink, gray
- All languages have color terms for white and black
- If a language has three color terms, one of them will be a term for 'red'
n=2: black, white
n=3: black, white, red
n=4: black, white, red, yellow/green
n=5: black, white, red, yellow, green
n=6: black, white, red, yellow, green, blue
Problem: green/blue ('grue') color terms in Mayan and other languages
Lecture 11/29/16 - Moral Cognition
Core areas of contemporary Cognitive Science: language, vision, neuroscience, computational modeling
Are there real world trolley problems?
Examples suggested by David Edmonds:
In June 1944, the Nazis unleashed a new type of flying bomb (the 'doodlebug', the V1) on London. The Nazis targeted the bombs at the densely-populated center of the capital. However, unknown to the German high command, the bombs were falling a few miles short of the center, in a more sparsely-populated area. An obvious plan presented itself to British military chiefs. If the Nazis could be persuaded their bombs were on target, they wouldn't alter their trajectory. Better still, if they could be convinced that the bombs were falling north of the capital, then they would readjust their aim so that the bombs fell further south -- perhaps ending up plopping harmlessly in the countryside. But this ploy meant that the bombs were more likely to land in south London. Churchill decided to let the deception go ahead, but this was a highly controversial decision within the cabinet.
Are our intuitions about these fictional scenarios, in which a fixed set of possible actions are presented and the consequences of those actions are completely known ahead of time, predictive of how we would act or think in real-life situations? (knowing the consequences ahead of time vs real-life situations)
Experimental results like those presented above are claimed to be similar across individuals and demographic groups. Participants find it difficult to provide explicit justifications for their tacit moral intuitions: "Very odd, I don't know why I chose differently in the two scenarios; the end result is the same. I just chose my gut response." "It's amazing that I would not throw a person but would throw a switch to kill a person." How does this claimed evidence bear on the possible existence of UMG?
What is the relation to Marr's Three levels of description? (1 of 3)
Functional
The functional description of a computational system describes the input-output mapping that it computes (what output it returns for each input in a particular set).
Ex.) Two of the Turing machines discussed earlier both compute the function of addition of integers, with numbers represented as strings of '1's preceded and followed by '0's.
What is the computation performed by a Turing machine?
Given a particular input, the computation is the sequence of read/write/move/change-state operations that the machine makes until it stops (halts) when presented with that input.
What is the output of a Turing machine?
Given a particular input, the output is the string of symbols that are on the tape after the computation for that input halts.
Case Study I: co-sleeping across cultures - Comparison of Orissa, India and Hyde Park Chicago
Hypothetical family of: father, mother, s15, s11, s8, d14, d3
- Co-sleeping arrangement task (sort family members into sleeping spaces under hypothetical resource constraints of 1-7 rooms)
- Preference conflict task (evaluate and rank various arrangements that deviate from culturally accepted ones)
Additional evidence: survey of who actually slept by whom in 160 Orissa households
Co-sleeping arrangement task:
- 877 logically-possible arrangements of the seven family members; 95% of them were uniformly ruled immoral (e.g., f d3 / m d14 / s11 s8 / s15)
- Two-room results: Orissa: father with sons and mother with daughters, or father with sons and mother with daughters + youngest son; Hyde Park: father and mother in one room, daughters and sons in separate rooms
Some determining factors hypothesized by Shweder:
- Orissa: 'protection of the vulnerable'; respect for hierarchy; female chastity anxiety
- Hyde Park: the sacred couple; autonomy
Does it seem plausible that there would be a universal set of factors for co-sleeping, differently prioritized across cultures or other groups?
What is the relation of Marr's Three Levels of description? (3 of 3)
Implementational
The implementational description of a computational system states the way in which the function and algorithm are realized in a physical way.
Ex.) The first Turing machine discussed above could be implemented with a literal read/write head and tape, or with a Java program and RAM. These two implementations would compute the same function with the same algorithm, but using very different physical stuff.
The allure of implementation-level descriptions is strong because they are so concrete. But this is often not the best level for understanding the function that a system computes, since implementation-level descriptions are typically quite complex (e.g., consider describing the synapses of every neuron in even a relatively small portion of the cortex).
Summary of Lecture 11/17
In many areas of cognition, the computations for learning and predicting are likely to involve generative/causal (forward) models of experience.
- Question: What 3D object 'h' is present, given the noisy 2D images on the retinas 'd'? Or the noisy haptic sensations on the hand?
- Answer: Compute the probability of the 2D images/haptic sensations given each possible 3D object. Select the one that has high posterior probability.
Bayes' Theorem provides a method of going 'backward' from noisy, ambiguous, limited observations to their underlying causes (shapes, forces, etc.). Generative/causal models of experience go 'forward', from CAUSE -> OBSERVATIONS.
Probabilistic approaches to cognition - How does the human mind go beyond the data of experience? (CLIVS)
In other words, how does the mind build rich, abstract, 'veridical' (truthful, coinciding with reality) models of the world when it is only given the sparse and noisy data that we observe through our senses? Models of cognition founded on probability theory have accounted for important empirical generalizations in many domains (CLIVS):
- Causal learning and inference
- Language acquisition and understanding
- Inductive learning and generalization
- Visual scene perception
- Social cognition
How can we program computers to answer questions that require 'common sense'? (use Script)
Input script for a child's birthday party:
- Go to party house
- Give presents to birthday child
- Play games
- Watch birthday cake being brought in
- Sing happy birthday
(How many scripts do we need, then? One for each type of party, event, situation, person, etc.?)
What are the recent trends in Artificial intelligence?
Probabilistic systems
- Probabilities are used in artificial intelligence to represent DEGREES OF BELIEF on the basis of evidence and prior knowledge
- Probabilistic inference is used in models of cognitive processes: natural language understanding and learning, object recognition & tracking, and causal reasoning
- In these probabilistic models (of cognitive processes), Bayes' Rule is used to relate the evidence/data (d) to possible cognitive hypotheses (h): p(h|d) ∝ p(d|h) × p(h)
- That is, the probability of hypothesis h given data d is proportional to the [probability of data d given hypothesis h] multiplied by the [prior probability of hypothesis h]
- Is Bayes' Rule a fundamental principle of human and artificial cognition? Since models of cognitive processes (understanding language, learning, object recognition) use probabilistic inference, and these probabilistic models use Bayes' Rule, the answer may be YES.
Criticisms about SHRDLU
Ray Kurzweil was not impressed: "SHRDLU could understand any meaningful English sentence, as long as you talked about colored blocks."
- SHRDLU is a domain-specific system that can only talk about these colored blocks. Dissatisfaction with these types of systems has led to the search for domain-independent Natural Language Processing (NLP) systems (i.e., systems not confined to talking about colored blocks, able to talk about other things).
Criticisms about Scripts/Frames
Searle: Suppose you are given a story about "a man who walked into a restaurant, ordered a hamburger, was pleased, and left the waiter a large tip before paying his bill," and you are then asked whether the man ate the hamburger; you would answer "yes." Schank's machines can answer questions about restaurants in this same fashion. Partisans of strong artificial intelligence claim that in this question-and-answer sequence the machine is not only simulating a human ability, but also that:
1. The machine can literally be said to understand the story and provide answers to the questions based on that understanding
2. Somehow, the way that the machine and its program work EXPLAINS how humans understand and answer questions about stories
BOTH claims seem to be totally unsupported by Schank's work.
Case Study I: co-sleeping across cultures
Shweder et al. (1995), "Who sleeps by whom revisited: a method for extracting the moral goods implicit in practice"
- The distribution of family members within available space varies across cultures, and is often associated with moral judgments
- But many logically-possible sleeping arrangements are not preferred by any of the cultures that have been studied
- Maybe there is a universal set of factors that determine these sleeping arrangements, but the prioritization (who sleeps by whom) differs across cultures/families
Case study: vowel systems - questions that come up with knowledge of dispersion
Suppose that the attested vowel systems (those for which there is clear evidence) are those that have maximal dispersion (given the number 'n' of vowels).
- Does this completely and unambiguously support an innate preference for vowel dispersion on the part of speakers/listeners?
- Does this support a specifically linguistic preference for dispersion? Do similar dispersion principles apply to parts of language other than vowels? Does dispersion apply to cognitive systems other than the language module?
- How else could the observed limits on vowel systems arise, in the absence of an innate preference (specifically linguistic or otherwise)?
The wave of AI research (the 5 main discoveries) led to criticisms and refutations, the most notable one being ?
The Chinese Room argument against artificial intelligence presented by Searle (1980)
Responses to the Chinese Room argument - the Robot reply
The Robot reply: suppose we put a computer inside a robot, and this computer (which is now part of the robot) would not just take in formal symbols as input and give formal symbols as output, but would actually operate the robot in such a way that the robot does something very much like seeing, walking, moving about, eating... (so the computer takes the formal symbols and controls the robot's actions with them).
- This kind of robot, unlike Schank's computer (which only answers questions about stories), would have genuine understanding and other mental states.
What is the Turing test?
The Turing test approaches the issue of how computers relate to the mind/brain from the opposite direction, proposing a definition of what it would mean for a computer to have a mind or be intelligent.
Details about the Turing test?
The Turing test is also known as the Imitation Game. It goes like this:
- A human judge interacts (through a computer terminal) with two systems, one of which is a human and the other of which is a non-human computer running a particular program
- After a period of time, the interaction ends and the judge must determine which of the systems was human and which was not
- If the judge cannot reliably discriminate the human from the computer, then the program has passed the Turing test and should be considered intelligent
If the Church-Turing thesis is correct, then it should (ultimately) be possible for a computer program to pass the Turing test, at which point humans and machines would be computationally identical according to the test.
What happened in the Dartmouth conference of 1956?
The basis of the idea that every aspect of learning, or any other feature of intelligence, can be so precisely described that a machine can be made to simulate it. The term Artificial Intelligence (AI) was coined by John McCarthy. Minsky: artificial intelligence is the science of making machines do things that would require intelligence if done by humans.
Responses to the Chinese Room argument - the brain simulator reply
The brain simulator reply: suppose we design a program that simulates the actual sequence of neuron firings at the synapses of a native Chinese speaker's brain when he understands stories in Chinese and gives answers to them.
- Then a machine running this program (mirroring the neural firing sequence of a person) is understanding the stories. If we say that this machine is not understanding the stories, then we would also have to say that native Chinese speakers do not understand the stories either (since the mechanism/sequence behind understanding and answering is the same).
- Searle: "I thought the whole idea of strong AI is that we don't need to know how the brain works to know how the mind works" (neural sequences vs. cognition).
- Searle: "Imagine that instead of a monolingual man in a room shuffling around symbols, we have the man operate an elaborate set of water pipes and valves connecting them."
What was the summary of the proposal for the Case Study: inferring multimodal shape representations?
To infer the shape 'h' of an object from visual or haptic experiences 'd':
- Compute the probability of your experience of the object given each possible shape, p(d|h). What 3D shape could the object have if this is what I saw/felt?
- Combine this with a prior probability distribution over shapes, p(h). What is a likely object shape in general (prior to seeing/feeling it)?
- Select a shape 'h' with high posterior probability: p(h|d) ∝ p(d|h) × p(h). Among plausible object shapes, which one fits the data for this object?
To infer shape categories (H), perform the same type of computation, but with object shapes (h) playing the role of the data: p(H|h) ∝ p(h|H) × p(H).
Generative/causal forward models p(d|h) predict experience in the same modality as learning, or in a different modality (e.g., the V-H condition).
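A minimal numeric sketch of this inference pattern (toy shapes and probabilities I made up, not the Yildirim & Jacobs model):

priors = {"blob": 0.5, "spiky": 0.5}          # p(h): prior over two toy shapes
likelihoods = {"blob": 0.2, "spiky": 0.7}     # p(d|h): how well each shape predicts d

unnormalized = {h: likelihoods[h] * priors[h] for h in priors}   # p(d|h) * p(h)
total = sum(unnormalized.values())
posterior = {h: p / total for h, p in unnormalized.items()}      # p(h|d)
print(posterior)   # ~{'blob': 0.22, 'spiky': 0.78}: select the high-posterior shape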
Case Study: inferring multimodal shape representations
Transfer of object category knowledge across the visual and haptic (touch) modalities: experimental and computational studies by Yildirim and Jacobs.
Questions asked:
- Does the human mind form shape representations of objects that are common to the two modalities of sight and touch?
- What mental generative/causal models relate hypothesized object shapes (h) to predicted visual and haptic data (d)?
Experiment:
- Participants learned four categories of nonsense objects ('Fribbles')
- Separate groups of participants were trained on visual examples and tested on haptic examples, or vice versa
- Cross-modality test performance was quite accurate (chance = 1/4)
Each Fribble is a complex 3D object with multiple parts and spatial relations among those parts. It can be experienced (seen, touched) in many ways. Each Fribble is a member (exemplar) of one of the four Fribble categories; one exemplar for each category (canonical projection, top view, front view, right view).
The human data can be explained by assuming that participants learned multimodal shape representations of the categories. Each category 'h' is a probabilistic generative/causal model for both visual and haptic experiences. In the figure, 'forward model' is the term used for the generative/causal model (haptic forward model vs. visual forward model).
A computational model, trained similarly to how the participants were trained, formed prototype representations of the four Fribble categories that predict the human data.
Explain the evidence bearing on Universal Moral Grammar: Trolley problems
Trolley problems were originally conceived by Philippa Foot and further studied by Judith Jarvis Thomson, along with other philosophers.
Trolley: A runaway trolley is about to run over and kill five people, but the driver can push a button that will turn the trolley onto a side track, where it will kill only one person. Is it permissible for the driver to push the button?
Bystander: A runaway trolley is about to run over and kill five people, but a bystander, Hank, can throw a switch that will turn the trolley onto a side track, where it will kill only one person. Is it permissible for Hank to throw the switch?
Footbridge: A runaway trolley is about to run over and kill five people, but a bystander who is standing on a footbridge can shove a man in front of the train, thereby stopping the train and saving the five people but killing the man. Is it permissible to shove the man?
Transplant: Five patients are dying from organ failure, but a doctor can save all five if she cuts up a sixth, healthy patient, removes his organs, and distributes them to the other five. Is it permissible to do this?
Sample results of applying the Socratic method (percent judging the act permissible):
Trolley 94%, Bystander 90%, Footbridge 10%, Transplant 8%
Searle's Chinese Room argument (1980) - background on what is weak/strong AI
Weak AI: Artificial intelligence can be used to better understand human cognition. Computers are tools for making existing cognitive theories more precise. AI can also be used for practical, non-scientific purposes.
Strong AI: An appropriately programmed computer IS a mind, in the sense that, given the right programs, computers can literally be said to understand and to have other cognitive states. In strong AI, the programmed computer has cognitive states, and so the programs are not just tools that allow us to test psychological explanations; the programs are themselves the explanations (of human cognition).
What is Universal Grammar? What is the main evidence for UG? (2 things)
It is innate knowledge of the structure of natural languages (phonology, morphology, syntax, semantics, ...).
- UG 'specifies the form of the grammar of a possible human language'
- It is unconscious knowledge about language in general that guides the process of learning/acquiring particular languages
- The main evidence for UG comes from 'language universals' (abstract properties shared by all languages) and 'learnability theorems' (proofs that learning a language in the absence of prior knowledge is impossible)
Standard example of Bayes' Theorem
look at desktop snapshot
Cognitive universals: color terms - World Color Survey and Kay et al (four principles for color systems)
World Color Survey (WCS)
- 111 languages, 2-hour in situ interviews with 25 native speakers of each language
Kay et al. (1991, 1999) proposed four principles for color systems:
1. The color terms partition the color space
2. Distinguish black and white
3. Distinguish warm colors (red and yellow) from cool colors (green and blue) - this accounts for the languages with 'grue' as a color term
4. Distinguish red from other colors
Is there a universal moral grammar (UMG) with properties parallel to those of the universal grammar (UG) for language?
A complex and (possibly) domain-specific set of rules, concepts, and principles that generates and relates various types of mental representations. This system allows individuals to determine the deontic status (relating to duty or obligation) of an infinite variety of acts and omissions of acts (i.e., it allows us to determine what we are obliged to do and not to do).
Explain the processing by the Turing machine
a) Input (initial tape): B011011001B
b) Output (final tape)
The programmer's question is the input in each case, and the machine's answer is the corresponding output.
Example of a machine table
Current machine state: 1, 2, 3, 4, ...
Symbol read from tape:
- B (blank): write 1, go to state 6
- 0: write B, go to state 2
- 1: move 1 cell left, go to state 1
What (partial) function does this Turing machine compute? Can we guess?
Future of AI
exponential growth of computing
What is Bayes' Theorem?
p(h|d) ∝ p(d|h) × p(h)
posterior ∝ likelihood × prior
- h is a variable over hypotheses (e.g., percepts - the mental result of perceiving - of color, orientation, etc.)
- d is a variable over data (e.g., noisy measurements of attribute values)
- The 'prior distribution' p(h) encodes beliefs about the relative probabilities of the possible hypotheses before ('prior to') observation of data d
- The 'likelihood function' p(d|h) has a dual interpretation. Forward: what would the probability of data d be if hypothesis h were true? Backward: what is the likelihood of hypothesis h given observed data d?
- The 'posterior distribution' p(h|d) encodes beliefs about the relative probabilities of the possible hypotheses after ('post') observation of data d
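A worked toy example (all numbers invented for illustration), in the spirit of the OkCupid card earlier: infer a poster's true height h from the datum d = "profile reports 6 feet".

priors = {"5ft10": 0.40, "5ft11": 0.35, "6ft0": 0.25}   # p(h): assumed base rates
likelihood = {"5ft10": 0.3, "5ft11": 0.6, "6ft0": 0.9}  # p(d|h): shorter people round up

unnormalized = {h: likelihood[h] * priors[h] for h in priors}   # p(d|h) * p(h)
total = sum(unnormalized.values())                              # p(d)
posterior = {h: round(v / total, 3) for h, v in unnormalized.items()}
print(posterior)   # {'5ft10': 0.216, '5ft11': 0.378, '6ft0': 0.405}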
Cognitive universals: color terms - New proposal of Regier et al. (universal well-formedness of a color system)
The universal 'well-formedness' of a color system with 'n' categories/terms is determined by the following principle:
- Members of the same color category should be maximally similar; members of different color categories should be maximally dissimilar
Mathematical formalization: the 'well-formedness' of a color system is the sum of [the total similarity of the colors within each category 'i'] + [the total dissimilarity between colors in different categories 'i' and 'j'].
Note: Regier et al. assume the 'partition' principle from the previous slide, so each color/chip belongs to exactly one category.
Color chip similarity is measured in the perceptually-motivated CIE L*a*b* color space.
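A toy sketch of this scoring scheme (my own simplification: made-up one-dimensional 'hues' and an assumed Gaussian similarity, instead of CIE L*a*b* distances over Munsell chips):

import itertools, math

chips = {0: "A", 1: "A", 2: "A", 5: "B", 6: "B", 7: "B"}   # hue value -> category

def similarity(x, y):
    return math.exp(-0.5 * (x - y) ** 2)        # assumed: similarity decays with distance

def well_formedness(assignment):
    score = 0.0
    for x, y in itertools.combinations(assignment, 2):
        if assignment[x] == assignment[y]:
            score += similarity(x, y)           # within-category similarity
        else:
            score += 1 - similarity(x, y)       # between-category dissimilarity
    return score

# The contiguous grouping scores higher than an interleaved one -- the sense in
# which attested, contiguous color categories are 'well formed'.
interleaved = {0: "A", 1: "B", 2: "A", 5: "B", 6: "A", 7: "B"}
print(well_formedness(chips) > well_formedness(interleaved))   # True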
Hypothesis: Cognition is computation
We need to define the terms 'computer' and 'computation'. What should the definitions of these terms be? NOT "whatever the mind/brain is" or "whatever the mind/brain can do".
