Math 317 - Exam 1
uniform
where every outcome is equally likely
experimental probability
calculated by actually conducting the random experiment repeatedly (many trials) and observing the frequency at which the events occur
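As a rough illustration, experimental probability can be estimated by simulation. The sketch below assumes a fair six-sided die; the function name `experimental_probability` and the fixed seed are illustrative choices, not part of the course material.

```python
import random

def experimental_probability(event, trials, seed=0):
    """Estimate P(event) for one roll of a fair six-sided die by simulation."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    # Count the trials in which the event occurred, then divide by the number of trials.
    hits = sum(1 for _ in range(trials) if event(rng.randint(1, 6)))
    return hits / trials

# Event: the roll is even. The theoretical probability is 3/6 = 0.5.
estimate = experimental_probability(lambda roll: roll % 2 == 0, trials=10_000)
print(estimate)
```

With many trials the estimate should land close to, but usually not exactly on, the theoretical value.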
random
chosen by chance from some particular probability distribution
mean (of a population)
for a population consisting of numeric values, the mean is the weighted average value; essentially, this is the sum of all possible values, each multiplied by its probability of occurrence
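A minimal sketch of the weighted average described above, assuming the population is given as a mapping from each possible value to its probability. The helper `population_mean` and the fair-die example are hypothetical.

```python
from fractions import Fraction

def population_mean(distribution):
    """Weighted average: sum of value * probability over all possible values."""
    assert sum(distribution.values()) == 1, "probabilities must sum to 1"
    return sum(value * prob for value, prob in distribution.items())

# Hypothetical population: the outcomes of a fair six-sided die.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}
print(population_mean(fair_die))  # 7/2
```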
mean (of a sample)
for a sample of numeric data, the mean is the average value, that is, the sum of all the observed values divided by the number of observations
fair games
games with an expected value of zero
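One way to check whether a game is fair is to compute its expected value directly. The game below (and the helper `expected_value`) is a made-up example, assuming each payoff is the player's net winnings.

```python
from fractions import Fraction

def expected_value(payoffs):
    """payoffs: list of (net winnings, probability) pairs."""
    return sum(amount * prob for amount, prob in payoffs)

# Hypothetical game: roll a fair die; net +$5 on a six, net -$1 otherwise.
game = [(5, Fraction(1, 6)), (-1, Fraction(5, 6))]
ev = expected_value(game)
print(ev == 0)  # True, so this game is fair
```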
numerical (quantitative) data
numeric data for which it makes sense to perform arithmetic operations (such as averaging)
negative recency effect
the (mistaken) belief that after a string of losses, a win becomes more likely in order to balance things out
neglect of sample size
people will ignore the size of a sample when making a judgment about the likelihood of an event
descriptive statistics
statistics that describe and summarize the data in a sample
bar graph
the bars are labeled according to categories and the heights of the bars represent how frequently the various outcomes occurred
population
the largest group about which the researcher or surveyor would like information
probability of an event
the likelihood of that event occurring. Often this can be thought of as the percentage of times the event would occur if a random experiment were performed a very large number of times
least squares regression line
the line that minimizes the sum of the squares of the vertical distances of the points to the line on a scatterplot
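The sketch below uses the standard closed-form formulas for the slope and intercept of the least squares line. The function name `least_squares_line` and the four data points are illustrative assumptions.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the line minimizing the sum of squared vertical distances."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar  # the line passes through (x_bar, y_bar)
    return slope, intercept

# Made-up scatterplot data.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 8]
slope, intercept = least_squares_line(xs, ys)
print(slope, intercept)
```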
median
the median of a set of numeric data is the middle value when the data is placed in numerical order. In the case where there are two "middle" values, the median is any number between those two (often taken to be the average of the two)
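A short sketch of the definition, using the common convention of averaging the two middle values when the count is even. The helper name `median` is chosen here for illustration; Python's standard library also provides `statistics.median`.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([7, 1, 5]))     # 5
print(median([7, 1, 5, 3]))  # 4.0
```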
mode
the mode of a set of data is the data value that occurs most frequently
histogram
the real number line is partitioned into segments and the areas of the bars represent how frequently the various outcomes occurred (only used for quantitative data)
sample space
the sample space of a random experiment is the set of all possible outcomes
statistics
the study of data collected from a sample in order to make inferences about a larger population from which that sample was drawn
representativeness fallacy
the tendency to make decisions regarding chance based on the (mistaken) belief that even small samples should be representative of the population
availability fallacy
the tendency to make decisions regarding chance based on how easily certain events can be called to mind
base-rate fallacy
the tendency to make decisions regarding chance that neglect the rates at which relevant characteristics appear in a population
conjunction fallacy
the tendency to make decisions regarding chance without regard to the fact that A and B both occurring can never be more likely than A occurring alone
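The inequality behind this fallacy can be checked on a small example. The sketch assumes one roll of a fair die with equally likely outcomes; the events `A` and `B` and the helper `prob` are hypothetical.

```python
from fractions import Fraction

sample_space = set(range(1, 7))  # one roll of a fair six-sided die

def prob(event):
    """Probability of an event (a set of outcomes) when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}  # the roll is even
B = {4, 5, 6}  # the roll is greater than 3

print(prob(A & B), prob(A))  # 1/3 1/2
print(prob(A & B) <= prob(A))  # True
```

Since every outcome in "A and B" is also an outcome in A, the first probability cannot exceed the second.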
positive association
two variables measured on the same individuals have a positive association if increases in one variable tend to correspond to increases in the other variable
inferential statistics
using statistics from a sample to make inferences about the entire population
cause and effect relationship
variable X causes variable Y (or the other way around). The values taken by variable Y change because the values taken by variable X change
theoretical mean (expected value)
the mean of how we might expect our data to turn out, based on the underlying distribution; equivalently, the average of the values that could occur, weighted according to the underlying probability distribution
theoretical probability
based on what we know about the random experiment prior to any actual trials
scatterplot
a display that shows the relationship between two numerical variables measured on the same sample of individuals
categorical data
data for which it makes sense to place an individual observation into one of several groups (or categories)
mutually exclusive events
events which share no common outcomes (they are disjoint sets)
census
any data collection method in which data is collected from each and every member of the population
statistic
any numerical information computed from a sample (e.g. a sample mean is a statistic)
sample bias
biased (inaccurate) information that results from a poorly chosen sample
confounded relationship
X and Y are associated because there is a confounding variable Z that is associated with both X and Y, so that we cannot distinguish between the effect of X on Y and the effect of Z on Y.
biased
a procedure is biased if it systematically favors certain outcomes
parameter
a representative number computed from an entire population (e.g. a population mean is a parameter)
correlation coefficient (r)
a statistic that measures the strength and direction of the linear relationship between two numerical variables; it always lies between -1 and 1
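A minimal sketch of how r can be computed, using the usual Pearson formula. The function name `correlation` and the data are illustrative assumptions.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired numerical data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: points that lie close to a line with positive slope.
r = correlation([1, 2, 3, 4], [2, 4, 5, 8])
print(round(r, 3))
```

A value of r near 1 indicates that the points lie close to a line with positive slope; a value near -1 indicates a line with negative slope.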
sample
a subset of a population from which data is collected
event
a subset of the sample space of a random experiment
positive linear correlation
when the points on a scatterplot lie close to a straight line with positive slope