Midterm 2


Chinese Room argument

"The point of the argument is this: if the man in the room does not understand Chinese on the basis of implementing the appropriate program for understanding Chinese then neither does any other digital computer solely on that basis because no computer, qua computer, has anything the man does not have." Maybe we need to hook up the room to the system and have the person in the chinese room to learn from his environment (The robot reply) Maybe intelligence is an emergent property. Maybe the room only represents one neuron (Chinese Gymnasium)

Kim (subjects)

12 healthy multilingual volunteers (9 males, 3 females). 6 subjects ("early" bilinguals) were exposed to two languages during infancy; 6 subjects were exposed to a second language in early adulthood. Right-handed or ambidextrous. Mean age of initial second-language exposure: 11.2 (± 4.1) years.

Advances in NLP

The 1980s saw the rise of machine learning techniques for NLP: supervised learning and classification using decision trees.

How does Eliza Work?

ELIZA converts 2nd-person singular pronouns to 1st-person singular pronouns, and 1st-person singular pronouns to 2nd-person singular pronouns, e.g.: you --> me; my --> your; I'm --> you are. Input: "I am sick of you." Output: "How long have you been sick of me?"
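The substitution above can be sketched in a few lines of Python. The reflection table and the single "I am ..." rule are illustrative, not Weizenbaum's original script:

```python
import re

# Illustrative ELIZA-style pronoun reflections (not the original script)
REFLECTIONS = {
    "i am": "you are",
    "i'm": "you are",
    "you": "me",
    "me": "you",
    "my": "your",
    "your": "my",
    "i": "you",
}

def reflect(sentence):
    """Swap 1st- and 2nd-person pronouns, checking two-word phrases first."""
    words = sentence.lower().split()
    out, i = [], 0
    while i < len(words):
        two = " ".join(words[i:i + 2])
        if two in REFLECTIONS:
            out.append(REFLECTIONS[two])
            i += 2
        else:
            out.append(REFLECTIONS.get(words[i], words[i]))
            i += 1
    return " ".join(out)

def respond(user_input):
    # One illustrative rule: "I am X" -> "How long have you been X?"
    m = re.match(r"i am (.*)", user_input.lower())
    if m:
        return "How long have you been " + reflect(m.group(1)) + "?"
    return "Please go on."

print(respond("I am sick of you"))  # -> How long have you been sick of me?
```

A full ELIZA script is just a longer list of such pattern/response rules tried in priority order.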

Artificial Neural Network

ANN: a model of a biological neuron. Critique: not completely biologically realistic (it doesn't account for conduction speed or the different lengths of axons; it is simply an analogy to a neuron). Mimics the way brains absorb information. Arrays can be trained by repeated exposures (like Hebbian learning).

Why is NLP difficult?

Ambiguities:
•Phonetics and phonology: "I scream" vs. "ice cream"
•Morphology: unionized = union + ized? un + ionized?
•Semantics: Jack invited Mary to the Halloween ball. I went to the restaurant across the bank.
•Syntax: conversions between various word orders: VSO (Hebrew, Arabic), SOV (Hindi, Japanese), SVO (English, Mandarin)
•Inability to interpret emotions leads to undesired consequences, e.g., the Tay bot.

Neural Network

A neural network (NN) is an information-processing paradigm inspired by biological nervous systems, such as the brain.

DeepArt

Art created by computers raises the question: what is made by humans, and what is made by machines?

Radical behaviorism

B. F. Skinner. Dispenses with talk of mental states; aims to explain behavior in terms of the environment; recognizes no dividing line between man and brute.

Bias

Bias is the set of simplifying assumptions made by a model to make the target function easier to learn. Generally, parametric algorithms have a high bias, making them fast to learn and easier to understand but generally less flexible; in turn, they have lower predictive performance on complex problems that fail to meet the simplifying assumptions of the algorithm's bias. Low bias: fewer assumptions about the form of the target function. High bias: more assumptions about the form of the target function. Examples of low-bias machine learning algorithms: decision trees, k-nearest neighbors, and support vector machines. Examples of high-bias machine learning algorithms: linear regression, linear discriminant analysis, and logistic regression.

Named entity recognition

"Boeing" is parsed as a (company); "Wall Street" is parsed as a (location).

What CAN'T AI do

Buy a week's worth of groceries at Trader Joe's Converse successfully with another person for an hour Write an intentionally funny story

Understanding what systems learn

CLEVR: diagnostic dataset for compositional language and elementary visual reasoning

Consciousness and AI

Can a machine become conscious? According to the strong AI view, the answer is yes. According to the weak AI view, the answer is no.

Annual Loebner Prize Competition

Competitive Turing Test event

Kim et al. (Task)

Task: Subjects were told to silently describe events that occurred during a specific period of the previous day (morning, afternoon, night) • Graphical cues (of morning, afternoon, night) were shown in various orders for 10 sec during the 30-s task period • Subjects were instructed which language they were to imagine speaking.

Moderates

Moderates are able to hold two worldviews, but about different things. How is it possible to have two different circuits in the same brain that can't both be active at once? Mutual inhibition: if one is active, the other is inhibited. Are there places in your body where there is mutual inhibition? Muscles: every muscle in your body has an opposite muscle (flexors and extensors), and every single muscle works like that. There are similar circuits in your brain.

Reinforcement Learning

Reinforcement learning is an area of machine learning inspired by behaviorist psychology, concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward.
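The cumulative-reward idea can be sketched with tabular Q-learning on a toy 5-state corridor. The environment, reward, and hyperparameters below are illustrative assumptions, not from the course material:

```python
import random

# Toy environment: states 0..4, reward 1 for reaching the goal state 4.
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]                    # move left / move right
alpha, gamma, eps = 0.5, 0.9, 0.1     # learning rate, discount, exploration
Q = [[0.0, 0.0] for _ in range(N_STATES)]

random.seed(0)
for episode in range(500):
    s = 0
    while s != GOAL:
        # epsilon-greedy action selection
        if random.random() < eps:
            a = random.randrange(2)
        else:
            a = max((0, 1), key=lambda i: Q[s][i])
        s2 = min(max(s + ACTIONS[a], 0), N_STATES - 1)
        r = 1.0 if s2 == GOAL else 0.0
        # Q-learning update: move Q toward reward + discounted best future value
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# The learned greedy policy should be "always move right"
policy = [max((0, 1), key=lambda i: Q[s][i]) for s in range(N_STATES)]
print(policy[:GOAL])  # -> [1, 1, 1, 1]
```

The agent is never told the right answer, only rewarded; the policy emerges from maximizing expected cumulative reward, which is the behaviorist flavor the definition mentions.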

How AI fits into Cog Sci

Developing "intelligent" machines allow us to explore the principles that make minds work • Trying to build minds is a good way to test theories about their properties • Ideas from psychology, neuroscience, and linguistics can result in better AI systems

Recent AI Highlights

• Dialog / QA: EntNets (bAbI solved!), CommAI, Wikipedia (Key-Value MemNets, SQuAD), dialog-based language learning, CLEVR
• Reinforcement Learning: TorchCraft, DoomAI
• Translation: CNN NMT
• Causality: Neural Causal Coefficient
• Vision: weakly supervised learning, SharpMask

What is a GRAY AREA for AI

Drive safely along University Avenue Discover and prove a new mathematical theorem Perform a complex surgical operation

NLP in earlier days

ELIZA SHRDLU

SHRDLU

Early natural language understanding computer program, developed by Terry Winograd at MIT around 1970. Includes a dictionary containing information about syntax and semantics, remembers the discourse, and had the capability to understand and remember language. First instance of interactive fiction, but it only handled a specific domain language.

Understanding what systems learn

Take a multiple-choice visual question. Throw away the image and the question. Encode the answer using word2vec features (averaged over words). Train a binary classifier to predict whether or not the answer is correct. At test time, predict the answer with the highest score.

Metaphor

Example: "we are spinning our wheels" Is there an image that sticks in our mind A car stuck in the mud A car stuck in the ice Wheels spinning: cars are not moving You have a relationship; its not going anywhere; you are putting effort into it and it is frustrating What you know about the source domain is mapped on to the domain of love Inferences are mapped. That discovery is contradicted in classic theory of metaphor Aristotle "meaning comes from essences of the world and we grasp the essences of the world, as a result every idea is a list of essences; sufficient and necessary conditions" Is that metaphor out in the world? No it is in our mind; the opposite of what aristotle says. It's a mapping of the mind of a structure of one idea to the structure The names of those structures are FRAMES

TAY AI

Microsoft Twitter bot that quickly became extremely racist and pro-Nazi. How long did Tay live? Around 12 hours.

Generative Adversarial Net

Generative adversarial networks are a branch of unsupervised machine learning, implemented by a system of two neural networks competing against each other in a zero-sum game framework. They were first introduced by Ian Goodfellow et al. in 2014. This technique can generate photographs that look authentic to human observers. One network is generative and one is discriminative. Typically, the generative network is taught to map from a latent space to a particular data distribution of interest, and the discriminative network is simultaneously taught to discriminate between instances from the true data distribution and synthesized instances produced by the generator. The generative network's training objective is to increase the error rate of the discriminative network (i.e., "fool" the discriminator by producing novel synthesized instances that appear to have come from the true data distribution). These models are used for computer vision tasks.

Aphasia

Greek "a" (without/lacking) + "phasia" (speech) Wernicke's area and Brocas area

Philosophy in the Flesh

A history of philosophy from a cognitive science perspective. We need to find the set of metaphors used by all philosophers.

CRUM

Hypothesis: thinking is performed by computations operating on representations.
Concepts: schemas
Propositions: declarative knowledge
Rules: procedural knowledge
Analogies: reasoning, problem solving, decision making
Images: visual imagery

Relationships between entities

Input: "Boeing is located in Seattle. Alan Mulally is the CEO."
Output:
{Relationship = Company-Location; Company = Boeing; Location = Seattle}
{Relationship = Employer-Employee; Employer = Boeing; Employee = Alan Mulally}

Language deprivation

Is experience necessary to develop language? If linguistic experience is missing during the critical period, language ability is impaired. Case studies: Victor the "wild child" and Genie.

Unsupervised Learning

K-means clustering algorithm (an example of its output was originally shown as a figure).
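The algorithm itself can be sketched in pure Python: alternate between assigning each point to its nearest center and moving each center to the mean of its cluster. The toy data and choice of k below are illustrative:

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means sketch on 2-D points."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[j].append(p)
        # Update step: move each center to the mean of its cluster
        for j, cl in enumerate(clusters):
            if cl:
                centers[j] = (sum(p[0] for p in cl) / len(cl),
                              sum(p[1] for p in cl) / len(cl))
    return centers, clusters

# Two obvious blobs; k-means should find one center per blob
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, clusters = kmeans(points, k=2)
print(sorted(centers))
```

Note that k-means is unsupervised: no point carries a label, and the cluster structure is derived purely from distances between the inputs.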

Vision: instance segmentation

Kaiming He, Georgia Gkioxari, Piotr Dollar, Ross Girshick

Cognitive Reserve

Learn a new language

Machine Learning

Machine learning is the subfield of computer science that gives computers the ability to learn without being explicitly programmed.

ELIZA

NLP program created in 1965 at the MIT Artificial Intelligence Laboratory by Joseph Weizenbaum, acting in the manner of a psychotherapist.

MNIST - LeNET

Number recognition. Convolutional neural networks are a special kind of multi-layer neural network. Like almost every other neural network, they are trained with a version of the back-propagation algorithm; where they differ is in the architecture. Convolutional neural networks are designed to recognize visual patterns directly from pixel images with minimal preprocessing. They can recognize patterns with extreme variability (such as handwritten characters), and with robustness to distortions and simple geometric transformations. MNIST is a good benchmark for neural networks: it is challenging for a simple model to detect handwritten numbers. LeNet: a convolutional network structure, a way to deal with image data; a simple neural network with two convolutional layers that performs very well, developed in the late 1980s. LeNet-5 is a later convolutional network designed for handwritten and machine-printed character recognition.

Kim et al. (1997)

Objective: investigate how multiple languages are represented in the brain using fMRI. Focused on Broca's and Wernicke's areas.

Where Mathematics Comes From

The origin of mathematics: math is a human creation; it is not in the natural world.

Overfitting

Overfitting occurs when a statistical model or machine learning algorithm captures the noise of the data. Intuitively, overfitting occurs when the model or the algorithm fits the data too well. Specifically, overfitting occurs if the model or algorithm shows low bias but high variance. Overfitting is often a result of an excessively complicated model, and it can be prevented by fitting multiple models and using validation or cross-validation to compare their predictive accuracies on test data.

SHRDLU (Sample Dialogue)

Person: Pick up a big red block.
Computer: OK.
Person: Grasp the pyramid.
Computer: I DON'T UNDERSTAND WHICH PYRAMID YOU MEAN.
Person (changing their mind): Find a block which is taller than the one you are holding and put it into the box.
Computer: BY "IT", I ASSUME YOU MEAN THE BLOCK WHICH IS TALLER THAN THE ONE I AM HOLDING.
Person: What does the box contain?
Computer: THE BLUE PYRAMID AND THE BLUE BLOCK.
Person: What is the pyramid supported by?
Computer: THE BOX.
Person: How many blocks are not in the box?
Computer: FOUR OF THEM.
Person: Is at least one of them narrower than the one which I told you to pick up?
Computer: YES, THE RED CUBE.

George Lakoff's (Brain Concept article)

Proposes that the sensory-motor system has the right kind of structure to characterize both sensory-motor and more abstract concepts.

Frames

Related items: which word doesn't fit: Monday, Tuesday, elephant, Thursday? Look at lots of these and ask what kind of meaning structure brings them together: frame semantics.

Classic example: buy, sell, cost, price. A commercial event is a frame: buyer / seller / goods / money. There is a narrative: first the buyer wants the goods and has the money, and the seller has the goods and wants the money; they exchange goods for money; afterwards the buyer has the goods and the seller has the money. A frame has semantic roles and relationships among things. This is a simple-minded frame; some can be much more complicated, and all sorts of things use the frame but add to it.

Metaphor: taking one frame and mapping it to another. What carries out this mapping? Ideas do not float in the air; the structure of the brain has neural circuitry, and every idea lives within some neural circuit. What kinds of neural circuits are there in ideas, and how is it possible for neural circuits to give rise to ideas? Does neuroscience study this question? No: it studies what is going on at the level of neurons, medical questions, brain lesions. Could an fMRI pick out a metaphor, or a mapping? No. You need more than just neuroscience.

Information Extraction

Extracting relevant topics or key phrases: takes a large number of utterances and gives the key phrases that are borne out by the analysis.

Politics

Right now the country is divided into different worldviews. A worldview is made up of a complex of ideas. Can you understand something that the neural circuitry in your mind does not allow you to understand? Nope. You can only understand what the neural circuitry of your brain will allow you to understand. Neural filter may cause you to ignore, put down, etc.

NLP Tasks

SYNTAX. Tokenization is the process of breaking a stream of text up into words, phrases, symbols, and other meaningful elements called tokens.
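A tokenizer along these lines can be sketched with a regular expression; the pattern below is an illustrative choice, not a standard:

```python
import re

def tokenize(text):
    """Split text into word tokens (keeping internal apostrophes)
    and single punctuation-mark tokens."""
    return re.findall(r"\w+(?:'\w+)?|[^\w\s]", text)

print(tokenize("I'm sick of you, Eliza!"))
# -> ["I'm", 'sick', 'of', 'you', ',', 'Eliza', '!']
```

Real tokenizers need many more decisions (hyphenation, abbreviations, URLs), but the core idea is the same: carve the character stream into meaningful units.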

Rules

Rules are if-then structures that specify the relationships between propositions: "if it is raining, then I will bring my umbrella."

Strong AI

Strong AI: we can develop artificial systems that really think

Tensorflow Playground

TensorFlow is an open-source software library for machine learning across a range of tasks, developed by Google to meet its need for systems capable of building and training neural networks to detect and decipher patterns and correlations, analogous to the learning and reasoning that humans use. It is currently used for both research and production at Google.

Bias/Variance Tradeoff

The goal of any supervised machine learning algorithm is to achieve low bias and low variance; in turn, the algorithm should achieve good prediction performance. A general trend: parametric or linear machine learning algorithms often have a high bias but a low variance, while non-parametric or non-linear machine learning algorithms often have a low bias but a high variance. The parameterization of machine learning algorithms is often a battle to balance out bias and variance.

Two examples of configuring the bias-variance trade-off for specific algorithms: the k-nearest neighbors algorithm has low bias and high variance, but the trade-off can be changed by increasing the value of k, which increases the number of neighbors that contribute to the prediction and in turn increases the bias of the model. The support vector machine algorithm has low bias and high variance, but the trade-off can be changed by increasing the C parameter, which influences the number of margin violations allowed in the training data, increasing the bias but decreasing the variance.

There is no escaping the relationship between bias and variance: increasing the bias will decrease the variance, and increasing the variance will decrease the bias. There is a trade-off at play between these two concerns, and the algorithms you choose, and the way you configure them, find different balances in this trade-off for your problem. In reality, we cannot calculate the real bias and variance error terms because we do not know the actual underlying target function. Nevertheless, as a framework, bias and variance provide the tools to understand the behavior of machine learning algorithms in the pursuit of predictive performance.
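The k-nearest-neighbors knob described above can be sketched directly (the 1-D data are illustrative): k = 1 reproduces the training data exactly (low bias, high variance), while k equal to the dataset size averages everything (high bias, low variance):

```python
def knn_predict(train, x, k):
    """1-D k-nearest-neighbors regression: average the k closest targets."""
    nearest = sorted(train, key=lambda pt: abs(pt[0] - x))[:k]
    return sum(y for _, y in nearest) / k

# Roughly linear toy data (illustrative)
train = [(0, 0.0), (1, 1.2), (2, 1.9), (3, 3.1), (4, 4.0)]

print(knn_predict(train, 2, k=1))  # -> 1.9 (memorizes the training point)
print(knn_predict(train, 2, k=5))  # ≈ 2.04 (the global mean: high bias)
```

With k = 1 every wiggle of the training data shows up in the predictions (variance); with a large k the prediction barely responds to where x is (bias).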

What is AI

The science of making machines that Think like humans Think rationally Act like humans Act rationally

Perceptrons

The simplest kind of "feed-forward" neural network. Originally introduced by Frank Rosenblatt (1958).
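A minimal sketch of Rosenblatt's learning rule, training a single perceptron on the AND function; the learning rate and epoch count are illustrative choices:

```python
def train_perceptron(data, epochs=10, lr=0.1):
    """Train a single threshold unit with the perceptron learning rule."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in data:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - out
            # Perceptron rule: nudge weights in the direction that reduces error
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(AND)
preds = [1 if w[0] * x1 + w[1] * x2 + b > 0 else 0 for (x1, x2), _ in AND]
print(preds)  # -> [0, 0, 0, 1]
```

A single perceptron can only learn linearly separable functions like AND; XOR requires a hidden layer, which is where multi-layer networks and backprop come in.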

Example of Recursion

This is the house that Jack built. This is the cheese that lay in the house that Jack built.
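The embedding can be sketched as a recursive function: each call wraps another relative clause around the previous sentence, which is why recursion lets language produce potentially infinite novel sentences. The phrase list and function are illustrative:

```python
# Relative clauses from the nursery rhyme, deepest clause first
PHRASES = [
    "the house that Jack built",
    "the cheese that lay in",
    "the rat that ate",
]

def build(n):
    """Recursively embed the first n clauses around the base clause."""
    if n == 0:
        return PHRASES[0]
    return PHRASES[n] + " " + build(n - 1)

print("This is " + build(0))  # -> This is the house that Jack built
print("This is " + build(1))  # -> This is the cheese that lay in the house that Jack built
```

Adding another entry to the list yields a longer grammatical sentence with no change to the rule itself.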

Role of cognitive science in NLP

To program computers to process language we need to recognize how our brain interprets and understands language.

Intuitive Physics: Physnet

UETorch: an open-source environment combining Torch and a 3D game engine, simulating falling blocks. Task: predict whether a tower of blocks will fall or not, and the blocks' trajectories. PhysNet uses deep networks to learn qualitative physics.

Underfitting

Underfitting occurs when a statistical model or machine learning algorithm cannot capture the underlying trend of the data. Intuitively, underfitting occurs when the model or the algorithm does not fit the data well enough. Specifically, underfitting occurs if the model or algorithm shows low variance but high bias. Underfitting is often a result of an excessively simple model.

Noam Chomsky

Universal Grammar: proposes set of rules intended to explain language acquisition in child development

Multi-layer Feed forward Network

Use multiple layers of ANNs (input layer --> hidden layers --> output layer). Backprop (backward error propagation): the learning algorithm used for training, instrumental in the development of AI. It goes back and tweaks the weights so the error is reduced, analogous to taking the same exam over again. We have not found evidence that human brains do this.
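A minimal backprop sketch: a tiny 2-2-1 sigmoid network whose weights are repeatedly tweaked to reduce error on XOR. The architecture, learning rate, and epoch count are illustrative assumptions, not from the course material:

```python
import math
import random

random.seed(1)

def sig(x):
    return 1.0 / (1.0 + math.exp(-x))

# Randomly initialized weights: input->hidden (2x2 + biases), hidden->output
w_h = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
b_h = [random.uniform(-1, 1) for _ in range(2)]
w_o = [random.uniform(-1, 1) for _ in range(2)]
b_o = random.uniform(-1, 1)

data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]  # XOR
lr = 0.5

def forward(x):
    h = [sig(w_h[j][0] * x[0] + w_h[j][1] * x[1] + b_h[j]) for j in range(2)]
    return h, sig(w_o[0] * h[0] + w_o[1] * h[1] + b_o)

def total_error():
    return sum((forward(x)[1] - t) ** 2 for x, t in data)

initial_err = total_error()
for epoch in range(10000):
    for x, t in data:
        h, o = forward(x)
        d_o = (o - t) * o * (1 - o)                    # output-layer delta
        for j in range(2):
            d_h = d_o * w_o[j] * h[j] * (1 - h[j])     # hidden-layer delta
            w_o[j] -= lr * d_o * h[j]                  # tweak hidden->output
            for i in range(2):
                w_h[j][i] -= lr * d_h * x[i]           # tweak input->hidden
            b_h[j] -= lr * d_h
        b_o -= lr * d_o
final_err = total_error()

print(final_err < initial_err)  # -> True: repeated tweaking reduced the error
```

This is the "go back and tweak the weights" loop in miniature: each pass propagates the output error backward and adjusts every weight a little in the direction that shrinks it.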

Variance

Variance is the amount that the estimate of the target function will change if different training data were used. The target function is estimated from the training data by a machine learning algorithm, so we should expect the algorithm to have some variance. Ideally, it should not change too much from one training dataset to the next, meaning that the algorithm is good at picking out the hidden underlying mapping between the input and output variables. Machine learning algorithms that have a high variance are strongly influenced by the specifics of the training data: the specifics of the training data influence the number and types of parameters used to characterize the mapping function. Low variance: small changes to the estimate of the target function with changes to the training dataset. High variance: large changes to the estimate of the target function with changes to the training dataset. Generally, nonparametric machine learning algorithms that have a lot of flexibility have a high variance; for example, decision trees have a high variance, which is even higher if the trees are not pruned before use. Examples of low-variance machine learning algorithms: linear regression, linear discriminant analysis, and logistic regression. Examples of high-variance machine learning algorithms: decision trees, k-nearest neighbors, and support vector machines.

Connectionist Model of Word Recognition (Rumelhart and McClelland)

Vision scientists' ANN model (parallel constraint satisfaction) allows the network to inhibit some parts of itself while activating others.

Understanding what systems learn

Visual question answering on Visual7W dataset: What color is the jacket? How many cars are parked? What event is this? When is this scene taking place?

Associative Learning

Hebb's postulate (Donald Hebb, 1949): "When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased."
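The postulate is often summarized as "cells that fire together wire together," and the simplest mathematical form makes the weight grow in proportion to the product of the two cells' activities. The learning rate and activity traces below are illustrative:

```python
def hebbian_update(w, pre, post, lr=0.1):
    """Hebb's rule: strengthen the connection when pre and post fire together."""
    return w + lr * pre * post

w = 0.0
# Cell A repeatedly takes part in firing cell B on three of five trials
for pre, post in [(1, 1), (1, 1), (1, 0), (0, 1), (1, 1)]:
    w = hebbian_update(w, pre, post)

print(round(w, 2))  # -> 0.3 (only the three co-active trials strengthen w)
```

Note the rule is purely local: the update depends only on the two cells at that synapse, which is part of why it is considered biologically plausible.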

Part of speech tagging

Labeling individual words with their part-of-speech tags (e.g., nouns, adjectives, adverbs, verbs).
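A minimal lookup-based tagger sketch; the tiny lexicon and tagset are illustrative, and real taggers use context and statistics to resolve ambiguous words:

```python
# Illustrative word -> tag lexicon (real lexicons are far larger,
# and many words have more than one possible tag)
LEXICON = {
    "the": "DET", "a": "DET",
    "dog": "NOUN", "park": "NOUN",
    "runs": "VERB", "quickly": "ADV",
    "happy": "ADJ", "in": "PREP",
}

def tag(sentence):
    """Attach a tag to each word; unknown words get 'UNK'."""
    return [(w, LEXICON.get(w, "UNK")) for w in sentence.lower().split()]

print(tag("The happy dog runs quickly"))
# -> [('the', 'DET'), ('happy', 'ADJ'), ('dog', 'NOUN'), ('runs', 'VERB'), ('quickly', 'ADV')]
```

Pure lookup fails on words like "run" (noun or verb?), which is why statistical taggers condition on surrounding words.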

Brain Machine Interfaces

Using just your thoughts, you can control a robotic arm.

What CAN AI do?

Play a decent game of table tennis. Drive safely along a curving mountain road. Buy a week's worth of groceries on the web. Unload a dishwasher and put everything away. Translate spoken Chinese into spoken English in real time.

Language is special

Relies on mental representations. Particular strengths and weaknesses: great for things that aren't there. Uniquely human: suggests an innate basis, trouble for empiricism. Potentially infinite: can always produce novel sentences.

Weak AI

We can develop artificial systems that act intelligently. The Turing test will be passed by a weak AI; most AI research aims at intelligent action.

Beyond MNIST: The ImageNet Task

• 1,000 different object classes in 1.3 million high-resolution training images from the web. "ImageNet is an image database organized according to the WordNet hierarchy (currently only the nouns), in which each node of the hierarchy is depicted by hundreds and thousands of images. Currently we have an average of over five hundred images per node. We hope ImageNet will become a useful resource for researchers, educators, students and all of you who share our passion for pictures."
• Jitendra Malik (an eminent professor in UC Berkeley EECS) said that this competition is a good test of whether deep neural networks work well for object recognition. People had been hand-coding features into these models, and in 2010-2011 the best-performing model got half of these wrong. AlexNet blew all the others out of the water, with an error-rate drop of about 10 percent. People have kept trying new approaches, and the error rate keeps going down: by 2015 it was only 3.57%, outperforming humans at this point. Google's entry used 22 layers.

Visual System (AI)

• 1000 x 1000 visual map • For each location, encode: - orientation - direction of motion - speed - size - color - depth

Current challenges in AI

• Deep learning is impressive but very data hungry • Unsupervised learning: how do we learn without labels? • Transfer learning, low-shot learning: how do we leverage the labels we have? • Games, dialogue environments: how do we create interactive environments to generate supervision? • Efficiently scaling up for fast processing of large amounts of data • Do we know what is being learned and why systems work? • How do we get to common sense, intuitive physics, understanding?

Problems in machine learning

• Explaining the predictions made by machine learning model • Understanding why deep learning works • Democratizing (scalable) machine learning • Representing common-sense knowledge

Alan Turing

• His Turing machine help computer scientists understand the limits of mechanical computation. • Also explored how to judge success in making intelligent machines.

Research areas

• Improving state of the art in well established applications (e.g. in vision, natural language processing, speech) • Getting a better understanding of how and what machines learn (causality, theoretical and quantitative analyses of neural networks) • Devising new tasks, benchmarks and experimental frameworks for the research community (e.g. dialogue-based language learning, CommAI, Torch) • Scaling up (e.g. fast algorithms for nearest neighbor retrieval and text classification)

Supervised learning

• In supervised learning, we are given a data set and already know what our correct output should look like, having the idea that there is a relationship between the input and the output. • Supervised learning problems fall into different types, such as regression and classification.

ML vs AI

• Machine learning concerns the way we build machines: we train them to "score" high on a "test." That does not really make them "intelligent," which is the goal of artificial intelligence. • A numerical test for machine learning; the Turing test for artificial intelligence.

What we do at FAIR

• Open research in AI • Publish papers and code • Encourage external collaborations • Covering a wide range of research domains and approaches • Undertaking long term and ambitious goals (research does not need to be directly applied)

Neural networks are good at:

• Recognizing patterns: facial identities or facial expressions; spoken words; objects in real scenes • Recognizing anomalies: Spoken words; unusual transactions; unusual readings in nuclear power plant • Prediction: future stock prices; COIN

Robotics

• Robotics: part mechanical engineering, part AI; reality is much harder than simulations!
• Technologies: vehicles, rescue, soccer!, lots of automation...
• Problems: tracking, mapping, motion planning

Different machine learning methods

• Supervised Learning • Unsupervised Learning • Reinforcement Learning

Kim et al (conclusions)

• The anatomical separation of the two languages in Broca's area varies depending on the time the second language was acquired. • Suggests that age of language acquisition may be a significant factor in determining the functional organization of this area in the brain.

The Turing Test

• This way of evaluating artificial intelligence systems is known as the Turing test • Turing considered a lot of possible problems - machines lacking consciousness - machines missing certain human attributes: creativity, making mistakes, learning, etc. • But if a machine passes the test, can we really conclude that it is intelligent?

Unsupervised Learning

• Unsupervised learning allows us to approach problems with little or no idea what our results should look like. We can derive this structure by clustering the data based on relationships among the variables in the data. • With unsupervised learning there is no feedback based on the prediction results.

Eliza

•Eliza simulated conversation using pattern matching and substitution. •It uses a prescribed set of rules, so it is inflexible. •Lack of memory: no relation between the current response and any previous stimulus or response. •There is no knowledge of the structure or meaning of the input. •The program has no world knowledge.

Sentiment Analysis

•Extract subjective information, usually from a set of documents, often using online reviews to determine "polarity" about specific objects. •It is especially useful for identifying trends of public opinion in social media, for the purpose of marketing. When you are analyzing individual sentences, sentiment analysis is very difficult because of subjectivity.

Natural Language Processing

•Natural language processing (NLP) is the field of computer science that deals with the interactions between human language and computers, and how to get computers to fruitfully process language. •It is multidisciplinary: artificial intelligence, computational linguistics, cognitive science.

Applications of NLP today

•Speech recognition •Information extraction •Machine translation •Search engines •Sentiment analysis •Question Answering

Noam Chomsky

•Transformational grammar (TG) or transformational-generative grammar (TGG) •Study of linguistics, that considers grammar to be a system of rules that generate exactly those combinations of words which form grammatical sentences in a given language. •TG involves the use of defined operations called transformations to produce new sentences from existing ones.

