Final Exam

There are 10 possible digits (0-9), so there is one output node per digit.

Using a neural network, we saw how the MNIST data set (consisting of handwritten pictures of numbers 0-9) can be interpreted to predict which digit the picture contains. Why are there 10 output nodes?
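
A minimal sketch of how the 10 output nodes turn into a prediction, assuming the network has already produced one activation per digit. The activation values below are made up for illustration.

```python
# Hypothetical output activations for one image, one value per digit 0-9.
outputs = [0.01, 0.02, 0.05, 0.80, 0.01, 0.03, 0.02, 0.03, 0.02, 0.01]

def predict_digit(activations):
    """Return the index of the largest activation, i.e. the predicted digit."""
    return max(range(len(activations)), key=lambda i: activations[i])

print(predict_digit(outputs))  # prints 3
```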

We were but now we aren't.

Is Mississippi the last state in the union with regard to educational outcomes?

1000

About how many residents of the USA do I need to poll in order to get about a 3% margin of error?
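
A worked check of the 1000 figure, assuming the usual formula for a proportion at 95% confidence (z = 1.96) and the worst case p = 0.5. These parameter choices are assumptions, not something the card states.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Margin of error for a sample proportion from n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

def sample_size(moe, p=0.5, z=1.96):
    """Smallest n whose margin of error is at most moe."""
    return math.ceil(z ** 2 * p * (1 - p) / moe ** 2)

print(round(margin_of_error(1000), 3))  # prints 0.031, i.e. about 3%
print(sample_size(0.03))                # a little over 1000
```

Note that the size of the population (the whole USA) does not appear in the formula; only the sample size matters.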

gives to the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space

According to Tufte, graphical excellence is that which...

Time, Electricity/Energy(cooling and power), Hardware, Storage(needs space, hard drive), Memory(RAM)

Data Science relies on computing. What kind of resources does this computing require?

- Node: a neuron connected to other nodes; has an activation number and a bias - Edges: connect the nodes; each has a numerical weight - Weights: increase or decrease the strength of the signal on a connection - Layers: collections of nodes; signals pass through the layers from beginning to end, or back through the layers multiple times, depending upon the approach - Activation Function: each neuron has an activation value that is the result of a computed "activation function," y = σ(w1 a1 + w2 a2 + w3 a3 - C) - Bias value: sets the sensitivity of the node (the C term in the formula above); it is part of calculating the activation number of each node after the input layer - Input Layer: nodes that represent the input - Output Layer: nodes that represent the output - Hidden Layer: all the layers between input and output

Be able to identify the parts of an ANN: nodes, edges, weights, layers, activation function, bias value, input layer, output layer, hidden layer
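
The activation formula y = σ(w1 a1 + w2 a2 + w3 a3 - C) can be sketched for a single node as follows, assuming σ is the logistic sigmoid. The input, weight, and bias values are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic sigmoid: squashes any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_activation(inputs, weights, bias):
    """Weighted sum of incoming activations, minus the bias, passed through sigmoid."""
    weighted_sum = sum(w * a for w, a in zip(weights, inputs))
    return sigmoid(weighted_sum - bias)

# Three incoming activations (a1..a3), their edge weights (w1..w3), and a bias C.
print(neuron_activation([0.5, 0.9, 0.1], [1.0, -2.0, 0.5], bias=0.25))
```

A larger bias makes the node less sensitive: the same inputs produce a lower activation.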

- Show the data - Induce the viewer to think about the substance of the findings rather than the methodology, the graphical design, or other aspects - Avoid distorting what the data have to say - Present many numbers in a small space, i.e., efficiently - Make large data sets coherent - Encourage the eye to compare different pieces of data - Reveal the data at several levels of detail, from a broad overview to the fine structure - Serve a clear purpose: description, exploration, tabulation, or decoration - Be closely integrated with the statistical and verbal descriptions of the data set

Be familiar with Tufte's set of principles for excellence in statistical graphics.

- Overestimating and underestimating - "We tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run." Example: GPS (1978), whose long-term effect was underestimated (GPS is amazing in the long term, not the short term). - Imagining magic - "Any sufficiently advanced technology is indistinguishable from magic." - Performance vs competence - AI capabilities are narrow. - Suitcase words - Words that carry a variety of meanings, like "learning." - Exponentials - Moore's law (the number of components that could fit on a microchip would double every year) was good for 50 years but has slowed down, and AI is not advancing in that way. - Hollywood scenarios - We will not suddenly be surprised by the existence of super-intelligences. - Speed of deployment - Almost all innovations in robotics and AI take far, far longer to be really widely deployed than people in the field and outside the field imagine.

Be familiar with some of the "seven deadly sins of AI predictions" in the article by Rodney Brooks

Non-AI example: a guitar plugged into an amp. The guitar is the input, fixed rules process it, and the output is the sound out of the amp. With AI, we give the system data and want it to discover the rules that exist in the data, then use those rules to create output.

Compare a "non-AI" computing approach to the AI computing approach.

Correlation = x and y are highly correlated; things happen together (e.g., the sun rising and the rooster crowing). Causation = x causes y (e.g., smokers often get lung disease). The rooster doesn't cause the sun to come up.

Distinguish correlation from causation.

Linear regression is basically a line through the middle of the data. Logistic regression looks for a yes or no: how do the independent variables (a person's age, income, past purchases) relate to a yes/no outcome (whether they will like the advertisement)? Logistic regression is used when the dependent variable is binary in nature. In contrast, linear regression is used when the dependent variable is continuous and the nature of the regression line is linear.

Distinguish linear regression from logistic regression.
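
A minimal sketch of the contrast, assuming the slope and intercept have already been fitted (the values passed in below are hypothetical). The linear model returns a continuous number; the logistic model squashes the same kind of score into a probability and then a yes/no answer.

```python
import math

def linear_predict(x, slope, intercept):
    """Linear regression: the output is a continuous value on a straight line."""
    return slope * x + intercept

def logistic_predict(x, slope, intercept):
    """Logistic regression: the output is a probability, then a yes/no decision."""
    probability = 1.0 / (1.0 + math.exp(-(slope * x + intercept)))
    return probability, probability >= 0.5

print(linear_predict(2, slope=3, intercept=1))           # prints 7
print(logistic_predict(40, slope=0.1, intercept=-3.0))   # probability above 0.5, so "yes"
```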

Persistent storage is your hard drive: long-term storage that keeps data when the power is off. Memory (RAM) is short-term storage. You can put an entire database in RAM, and it takes longer to write to your hard drive than to RAM.

Distinguish persistent storage from system short-term memory (RAM).

For presenting information to laypeople - e.g., comparing a cheetah to a car, or young Justin Bieber to present Justin Bieber

For whom are infographics useful?

Yes. It fell from 13.9% in 2014 to 8.8% in 2021.

Has Mississippi's dropout rate improved?

- Classroom: Immediate instructional adjustments, Student report cards, Communication with families, Academic and behavioral interventions, Identification for additional supports - School / District: Professional development, Curricular selection, Building needs, Budget and planning (resource allocation), Program need, Policy development and implementation - State: Big "A" and little "a" accountability for schools and districts, Professional development, Policy and legislation, Funding, Program approval, Grant awards

How are data used by classroom teachers, school / district administrators, and state education policy makers?

Classical computers use Boolean logic, but quantum computers don't use only on and off. They use a quantum property of matter called indeterminacy (heads and tails, both yes and no at the same time). They are basically used for physics. If enough quantum bits (qubits) are put together, you can tackle problems like cryptography.

How are quantum computers different than classical computers with regard to logical operators?

Can model what police are doing and see if they are acting like a criminal. - Analyzes daily patrol patterns. - Manage patrol operations in real time. - Identify resource hotspots.

How can data science increase the accountability of law enforcement?

Use crime data (location and type) to train a model that predicts where and when a specific crime is likely to happen, then proactively patrol those places to help reduce crime rates and victimization.

How can using historical crime data be used to reduce crime in the future?

We have to use a proxy for fairness; fairness itself can't be modeled directly.

How can we measure abstract concepts like "fairness" using data?

Example: you might label each photo with what kind of animal it contains, or label an audio recording with what genre of music, or you might label x-rays with a "yes" or "no" depending upon whether they contain a broken bone, each transaction could be labeled with whether it is fraudulent or not, or a job opening text could be labeled with an occupation "code." With labeled data, split into training / experiment sets.

How do data generally get labeled?

Mercator projection distorts the relative size of landmasses, exaggerating the size of land near the poles as compared to areas near the equator.

How do projections of the earth onto maps distort the relative sizes of continents?

- Community development is the process of making the community a better place to live and work and occurs primarily in the public sector. - Economic development is the process of creating wealth, in which community benefits are created secondarily. Show the benefits of living in an area and persuade you to live there.

How does Community Development differ from Economic Development?

- Acquire a set of unlabeled data. - Decide how many groups (k) to break it into. - Plot the data (e.g., as a scatter plot). - Put a center point for each group somewhere on the plot and measure the distance from each data point to each center. - Assign each data point to its nearest center, move each center to the middle of its points, and keep moving the centers until they stop changing.

How does K-Means Clustering work?
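
The steps can be sketched as the standard assign-then-update loop (Lloyd's algorithm) on 2-D points. This is a minimal sketch: the points and starting centers are made up, and the starting centers are passed in by hand rather than chosen at random.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Run a fixed number of assign/update rounds from the given starting centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its points
        # (a centroid with no points stays where it is).
        centroids = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centers, groups = kmeans(points, centroids=[(0, 0), (10, 10)])
print(centers)
print(groups)
```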

- Originally used for policing to identify risks that come from features of a landscape and to model how they co-locate to create unique behavior settings for crime. - Now used for public safety, medicine, public health, epidemiology, social welfare, transportation, city planning, emergency management, disaster recovery, homeland security, retail asset protection, and maritime shipping - Claims: Focuses on places rather than people to reduce crime in order to be more objective about why focused policing is done; reduce chance of violating fourth amendment. - Strategy: Optimize resources by placing police where crimes are more likely to occur.

How does Risk Terrain Modeling work?

- We learn this way ourselves. - The machine executes an action. - Actions bring rewards and could transform the state of the environment. - The goal of reinforcement learning is to learn a policy. - Learning is reinforced by the wrong answer being penalized and the good answer being rewarded. - For example, it took 200K tries before the car was able to park. The right thing is parking; the bad thing is crashing.

How does reinforcement learning work?

Training PredPol on nuisance crimes sends more police to neighborhoods where they are more likely to arrest more people so that even if they are most interested in burglary and violence, they will naturally find more nuisance crimes during routine patrols. This creates a feedback loop spawning more policing in those areas.

How does the choice of training data (low-level crimes vs serious crimes) affect how a predictive crime model works?

- Stops rose by 600 percent (700,000 stops between 2000 and 2010). - Annual homicides were down from 2,245 in 1990 to 400 by 2014. - Only 0.1% (one of 1,000 people stopped) were linked to a violent crime; stops mostly captured drug possession, underage drinking, etc. - Many people felt harassed, were angry, or found themselves arrested for resisting arrest. - Verdict: no, it was not effective.

How effective was stop and frisk?

No. Crime prediction tends to unfairly focus on lower-income areas, and for economic reasons related to discrimination and lack of resources, these areas are also disproportionately home to people of color.

If a prediction system focuses on predicting where crime will take place instead of who is likely to commit it, does it achieve "color blindness"? Why or why not?

No

If data show a phenomenon could have happened by chance, can we reject the null hypothesis?

8

If your system has 128 GB of RAM available for processing data, how many gigabytes at a time can you process?

Alternative hypothesis

In hypothesis testing, which hypothesis has the burden of proof - the null hypothesis or the alternative hypothesis?

Cognitive Computing: Camgian delivers intelligent software services powered by data science and artificial intelligence that drive operational speed, scope, and scale.

Our guest speaker, Gary Butler, is CEO of Camgian. What does Camgian do?

A dependent variable (points scored by an NBA team) and one or more independent variables (average height of team, average number of points in previous game, average age of players, etc.)

Regression analysis helps us to discover relationships between what and what?

Some data is labeled and some data is unlabeled.

Semi-Supervised machine learning works with what kinds of data?

Labeled data

Supervised machine learning requires what kind of data?

unlabeled data

Unsupervised machine learning works with what kind of data?

It enables faster decision making.

What advantage does Artificial Intelligence provide with regard to the battlefield?

- Internet AI - Recommendation algorithms (e.g., YouTube) - based on data labeled by users as they use the service. - Business AI - Optimizations or automation based on massive amounts of business data labeled over time by business processes (e.g., using past insurance claims to train a fraud detector) - Perception AI - Extending power of AI to video, audio, and digitizing the world around us with sensors and smart devices. (e.g., face recognition). Makes contact between digital and real worlds seamless. - Autonomous AI - Integrating the three previous waves, combines machine optimization based on data with sensory input so that machines shape the world around themselves. (e.g., an autonomous strawberry picker)

What are Kai-Fu Lee's four waves of artificial intelligence?

1943- Computing approach inspired by biological neurons.

What are artificial neural networks (ANN)?

computer system that emulates the decision-making ability of a human expert. Expert systems solve problems through a series of "if-then" rules.

What are expert systems?

- Supervised - dataset is a collection of labeled examples. - Unsupervised - dataset is a collection of unlabeled examples. - Semi-Supervised - dataset contains both labeled and unlabeled examples. - Reinforcement - the machine lives in an environment and perceives the state of that environment as a collection of features.

What are four types of machine learning?

- Geographic: can't change location; Mississippi brain drain - Cultural: families see manufacturing in a negative way; lower education levels - Political: disagreement on national immigration policy

What are some challenges that Mississippi faces in attracting economic development projects?

- Course grades - Anecdotal records - Portfolio information - Discipline records - Demographic information - Attendance records - Enrollment - Formative assessment data - Summative assessment data

What are some data sources for K-12 education?

Data scientists could go over the heads of the strength trainers and talk directly to the coach about concerns. Data scientists must also know about the sport first so they can use the data properly.

What are some drawbacks to the introduction of data science to sports teams?

NY Weather for 1980, Weather records, economic indicators and patient health evolution metrics

What are some examples of time-series data?

- Feed-forward: image recognition (most basic type) - Radial Basis Function: the activation function is a radial basis function (value calculated based on a distance function); used for prioritization of repairs in large electrical networks and time-series patterns - Kohonen Self-Organizing: pattern recognition (e.g., medical diagnosis) - Recurrent: the activation of a layer is saved in memory to affect the next calculation, allowing prediction; e.g., text-to-speech, where memory of the previous word affects prediction of the next word. So if the first word is "mickey," there is a greater probability that the next word the speaker says will be "mouse." - Convolutional: connects each neuron only to chunks of filtered input to bring out certain features; face recognition, computer vision. "Moving window approach." - Modular: a collection of networks working independently but contributing toward the output. Used to reduce computational complexity, parallelize problem solving, etc. Biological inspiration. - Generative Adversarial Networks: two ANNs working against each other (one generates, one evaluates)

What are some popular types of neural networks used for machine learning?

Linear Regression, Logistic Regression, Nonlinear Regression, Support Vector Machine, Neural Networks, and Deep Learning

What are some supervised learning techniques?

Many processors work together in parallel to solve a single problem. Can do a specific task and math very fast. Great for research institutions and theoretical computing work. Usually shared resources, physics, fluid dynamics, nuclear simulations, etc.

What are supercomputers good for?

- Understanding - Trying to understand something yourself (visualizing data as part of exploratory data analysis) - Communication - Trying to help others understand data (visualizing data as a way to communicate meaning to others) - Persuasion - Attempting to change someone's perspective (win an argument)

What are the three basic purposes of data visualization?

- Perform analysis on a sample of the data - Process smaller amounts of data/ Batch process the data(bite off a little at a time) - Reduce resolution/precision of data

What are three ways to reduce the time, electricity, and costs of a data science computing task?
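
The three ways can be sketched on a toy data set. This is a minimal illustration with hypothetical function names; in practice the savings come from reading less data, holding less in memory at once, and using smaller data types.

```python
import random
from itertools import islice

def mean_of_sample(values, sample_size, seed=0):
    """Way 1: estimate the mean from a random sample instead of all the data."""
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)
    return sum(sample) / sample_size

def mean_in_batches(iterable, batch_size):
    """Way 2: process the data a small batch at a time (assumes it is not empty)."""
    it = iter(iterable)
    total, count = 0.0, 0
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            break
        total += sum(batch)
        count += len(batch)
    return total / count

def reduce_precision(values, digits=1):
    """Way 3: keep fewer decimal places per value."""
    return [round(v, digits) for v in values]

data = list(range(1, 101))
print(mean_in_batches(data, batch_size=10))  # prints 50.5
print(mean_of_sample(data, sample_size=20))  # an estimate near 50.5
print(reduce_precision([3.14159, 2.71828]))  # prints [3.1, 2.7]
```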

A good sampling method produces a sample that well represents the population for the purposes to which the sample will be put. Sampling cannot be a mindless process. For example, a sample of voters should include both young and old voters.

What characterizes a good sampling mechanism?

- greater computing power - cheaper - bigger and more data sets - back propagation(how to train neural network to get smarter) - train weights to get better answers

What contributed to the A.I. Spring?

They allow us to plot two numerical variables, as points, on a two dimensional graph. From these plots, we can understand if there is a relationship between the two variables, and what the strength of that relationship is.

What could I use a scatter plot to detect?

It depends on what data you have.

What determines which AI technique a data scientist should choose?

- Build pipelines for data and a way to process the data. - Make sure the data is ready to be used.

What do Data Engineers do?

An economic developer is responsible for planning, designing, and implementing economic development strategies, as well as acting as a key liaison between public and private sectors and the community. They also provide information on the community needed by local industries and the private and public sectors.

What do economic developers do?

finding the appropriate value for all the weights and biases in order to produce accurate outputs

What does "learning" really mean when we are thinking of neural networks?

Two neural networks: one generates candidates (the generator) and one evaluates the candidates (the discriminator). Train the discriminator on a dataset. Seed the generator with random activation. Iterate with backpropagation so that the generator gets better at generating and the discriminator gets better at discriminating. The result is realistic photos / art / music.

What does a GANN - Generative Adversarial Neural Network produce?

File system that allows access to files from multiple hosts sharing via a computer network. This makes it possible for multiple users on multiple machines to share files and storage resources. Google can string together many commodity hard drives into one giant storage volume.

What does a distributed file system allow us to do?

Values shown as colors, e.g., weather (the redder it is, the hotter; the more yellow, the colder) or wealth (shown by darkness of color).

What does a heat map show?

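How the values of a numerical variable are distributed, i.e., how many values fall into each range. Examples: height of trees over the year, height of people in a marching band.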

What does a histogram show?

The probability of observing an outcome at least as extreme as the one observed in the original data, assuming the null hypothesis is true.

What does a p-value measure?
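
A worked example with coin flips, assuming the null hypothesis is "the coin is fair" and computing the exact one-sided probability from the binomial distribution. The function name and the flip counts are illustrative.

```python
from math import comb

def p_value_at_least(k, n, p=0.5):
    """Probability of k or more heads in n flips if the null hypothesis (p) is true."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

print(p_value_at_least(9, 10))  # about 0.011: unlikely by chance alone
print(p_value_at_least(6, 10))  # about 0.377: could easily happen by chance
```

With 6 heads out of 10, the result could easily have happened by chance, so the null hypothesis cannot be rejected.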

How far the answer from the sample may be from the true value for the whole population.

What does margin of error represent?

How strong the linear relationship is between two variables (e.g., an independent and a dependent variable), and in which direction it runs. Example: how strong the linear relationship is between the number of chimpanzees and the percent likelihood of a successful hunt.

What does the correlation coefficient (r) measure?
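
A minimal sketch of computing Pearson's r by hand. The chimpanzee numbers below are made up for illustration, not taken from any study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient: ranges from -1 to 1."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Hypothetical data: number of chimpanzees in a hunt vs. percent of hunts that succeed.
chimps = [1, 2, 3, 4, 5, 6]
success = [20, 30, 45, 50, 65, 70]
print(round(pearson_r(chimps, success), 2))  # close to 1: a strong positive linear relationship
```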

recruitment and retention of skilled labor

What drives present-day economic development work?

Police went really hard at small things like litter and fixing windows, hoping the community stays true to standards and everything else fixes itself. Don't let the standards fall - help a neighborhood maintain its own order.

What is "broken windows policing?"

Used for detecting numerical fraud.

What is Benford's Law used for?
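
A minimal sketch of the idea, assuming the standard statement of Benford's Law: the leading digit d appears with probability log10(1 + 1/d). Fraud checks compare the leading digits of real amounts against these shares. The sample amounts are hypothetical and assumed to be positive whole numbers.

```python
import math
from collections import Counter

def benford_expected(digit):
    """Expected share of numbers whose leading digit is `digit` (1-9)."""
    return math.log10(1 + 1 / digit)

def leading_digit_shares(amounts):
    """Observed share of each leading digit among positive whole-number amounts."""
    counts = Counter(int(str(a)[0]) for a in amounts)
    return {d: counts.get(d, 0) / len(amounts) for d in range(1, 10)}

for d in range(1, 10):
    print(d, round(benford_expected(d), 3))  # digit 1 is expected about 30% of the time

print(leading_digit_shares([120, 15, 1999, 23]))
```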

A trend toward giving humans insight into what's happening in the "black box" of the hidden layers. It responds to the problem that we often don't understand how machine learning reaches its answers because it is so complex.

What is Explainable AI?

It's a large neural network that can model text: an AI that is good at creating content that has a language structure.

What is GPT-3?

Distributing a lot of data. It is a set of tools that allows data scientists to use a network of many computers to solve problems involving massive amounts of data and computation.

What is Hadoop for?

An analytics (math) package that lets analysis run on data in Hadoop. It is an open-source analytics engine for big datasets. Spark streams the data in from Hadoop and processes it.

What is Spark for?

Machine learning platform

What is Tensor Flow for?

It gives a business a dashboard, so it has something to measure progress with. It is a set of processes, architectures, and technologies that convert raw data into meaningful information that drives profitable business actions. It is a suite of software and services to transform data into actionable intelligence and knowledge.

What is a Business Intelligence System (BIS)?

Graphics Processing Unit: a processor that solves matrix multiplication quickly. It's used for gaming and processing data.

What is a GPU?

Tensor Processing Unit: a processor built to speed up machine learning math on tensors. A tensor is a container for multi-dimensional data.

What is a TPU?

function that can analyze the feature vector of a state and determine the optimal action to take in order to maximize the reward

What is a policy in reinforcement learning?

A graph showing data as dots, e.g., dots of height. When fitting a line to it, what we are seeking is a line where the differences between the line and each point are as small as possible. It can also reveal different groups, e.g., finding different groups of people by clustering different populations. Example: successful hunts by chimps.

What is a scatterplot?

A kind of mathematical model representing the process or processes by which data are generated. It is expressed in terms of the properties of the objects involved and always includes variables that are not fixed/determined but "distributed." For example, you can model the saltiness, temperature, and depth of the ocean: if you lower the temperature, will it increase salinity?

What is a statistical model?

An artificial way of doing human reasoning (doing what humans do). The definition of what is and what isn't an example of artificial intelligence changes over time.

What is artificial intelligence?

Method for training a neural network to map inputs to outputs by optimizing the internal settings of the network.

What is backpropagation?
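
A minimal sketch of the idea on the smallest possible network: one sigmoid neuron with one weight and one bias, trained with squared error. The chain rule carries the output error back to the internal settings. Real backpropagation repeats this layer by layer; the training data, learning rate, and epoch count here are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_neuron(samples, learning_rate=0.5, epochs=2000):
    """Adjust one weight and one bias so the neuron maps each input to its target."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, target in samples:
            y = sigmoid(w * x + b)                # forward pass
            # Backward pass (chain rule) for the loss 0.5 * (y - target)^2:
            # d(loss)/dz = (y - target) * y * (1 - y), where z = w * x + b.
            delta = (y - target) * y * (1 - y)
            w -= learning_rate * delta * x        # d(loss)/dw = delta * x
            b -= learning_rate * delta            # d(loss)/db = delta
    return w, b

# Toy task: input 0 should give output 0, input 1 should give output 1.
w, b = train_neuron([(0.0, 0.0), (1.0, 1.0)])
print(sigmoid(b), sigmoid(w + b))  # first value near 0, second near 1
```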

A type of AI that uses data to train a pattern replicator: building algorithms that rely on a collection of examples of some phenomenon. These examples can come from nature, be handcrafted by humans, or be generated by another algorithm. It is the process of solving a practical problem by 1) gathering a dataset, and 2) algorithmically building a statistical model based on that dataset. That statistical model is assumed to be used somehow to solve the practical problem.

What is machine learning?

Splitting results into pages and looking at one page at a time. As with Google result pages, the next page isn't retrieved until you go to that page.

What is pagination of a data set?
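
A minimal sketch of pagination over an in-memory list, with a hypothetical `get_page` helper. In a real system the slice would be a query against a database or API, so each page is only fetched when it is asked for.

```python
def get_page(records, page_number, page_size):
    """Return only the records for one page (pages are numbered from 1)."""
    start = (page_number - 1) * page_size
    return records[start:start + page_size]

results = list(range(1, 26))        # 25 hypothetical search results
print(get_page(results, 1, 10))     # prints results 1 through 10
print(get_page(results, 3, 10))     # prints [21, 22, 23, 24, 25]
```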

Development of procedures, methods, and theorems that allow us to extract meaning and information from data that has been generated by stochastic (random) processes.

What is statistical inference?

An SVM draws an imaginary line (a hyperplane, e.g., in a 20,000-dimensional space) that separates examples with positive labels from those with negative labels. - Find all the similarities in decisions and create a boundary through the data. - Analogy: in high school you have thousands of things in your head (dating, college, school), and yet you created a decision boundary in finding a school.

What is the "decision boundary" in a Support Vector Machine?
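
A minimal sketch of using a decision boundary once it has been found, assuming the usual linear form: an example is positive when w · x - b is at least 0 and negative otherwise. The weights `w` and offset `b` below are hypothetical, not learned from data, and the example uses 2 dimensions instead of 20,000.

```python
def classify(x, w, b):
    """Return +1 if x is on the positive side of the boundary w . x - b = 0, else -1."""
    score = sum(wi * xi for wi, xi in zip(w, x)) - b
    return 1 if score >= 0 else -1

w, b = (1.0, 1.0), 3.0           # boundary: x1 + x2 = 3
print(classify((2.0, 2.0), w, b))  # prints 1
print(classify((0.0, 1.0), w, b))  # prints -1
```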

Refers to a series of setbacks for AI: a period when AI didn't make progress and the government didn't fund AI.

What is the A.I. Winter?

Stochastic properties are distributed probabilistically. Stochastic changes by chance. Stochastic - How tall are the people born on January 1, 1973? Deterministic - How many years old are the people born on January 1, 1973?

What is the difference between a stochastic and a deterministic property?

Enhancing performance(winning) and reducing injuries

What is the goal of athlete engineering?

Color blind people are unable to distinguish the colors, so you need to combine patterns with color.

What is the problem with using only color to convey meaning in a graphic?

Census and Labor Data

What kind of data do economic developers use?

The straight line that minimizes the distance between the line and the dots (the optimal, best-fit line). Linear regression is the process of finding a line that best fits the data points available on the plot, so that we can use it to predict output values for given inputs.

What kind of line does linear regression draw relative to a scatterplot?
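
A minimal sketch of finding the best-fit line with ordinary least squares, which minimizes the squared vertical distances between the line and the dots. The data points are made up so that the answer is easy to check by eye.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)          # prints 2.0 1.0
print(slope * 5 + intercept)     # predicted output for input 5: prints 11.0
```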

Bitcoin mining uses an ASIC (Application-Specific Integrated Circuit), and it will smoke any GPU.

What kind of processor should I use for cryptocurrency mining?

Prescription Monitoring Program (PMP) Data - Prescriber Info (Doctor) - Dispenser Info (Pharmacy / Pharmacist) - Drug Type Info (Dosage, Days Supply) - Patient Info - Total Dosage -Patients Drive Time

What kinds of data objects can I model in order to detect financial fraud or drug abuse?

Hidden layers: the network has multiple hidden layers between the input and output layers.

What makes machine learning using neural networks "deep learning"?

To measure items across classes or to monitor changes over time.

What would I use a bar chart to represent?

Exploratory data analysis. Box plots visually show the distribution of numerical data and skewness through displaying the data quartiles (or percentiles) and averages.

What would I use a box plot to represent?

To organize data in order to see the size of components relative to the whole and showing percentage or proportional data.

What would I use a pie chart to represent?

To show how the parts of a whole change over time.

What would I use an area chart to represent?

When it is difficult or impossible to study the entire population (too big, too expensive, etc.), and when a sample that accurately represents the entire population can be acquired.

When should we sample a population in order to study it?

The fear that machines will take over from people.

Why does Bill Joy suggest that AI raises the possibility that "the future does not need us?"

It's cheaper because you only have to pay for the space you need to use, and actual GPUs are expensive.

Why might we want to use cloud processing in Data Science?

