Exam 1 Review
What probability best matches the following statement: "This event will occur less often than not, but it is not extremely unlikely."
0.30
Probability is a measure of how likely an event is to occur. What probability best matches the following statement: "This event is impossible; it cannot occur."
0
What is the area to the right of a z-score of 1.69?
0.0455
What probability best matches the following statement: "This event is very unlikely, but it will occur once in a while in a long sequence of trials."
0.05
A manufacturing process produces bags of cookies that have Normally distributed weights with a mean of μ = 15.0 oz. and a standard deviation of σ = 0.4 oz. What is the probability that a randomly selected bag weighs more than 15.2 oz?
0.3085
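The cookie-bag answer can be checked numerically. This is a minimal sketch, assuming Python 3.8+ and the standard-library `statistics.NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

# Bag weights: Normal with mean 15.0 oz and standard deviation 0.4 oz (from the card).
weights = NormalDist(mu=15.0, sigma=0.4)

# P(X > 15.2) is the area to the right of 15.2, i.e. 1 minus the cumulative area to its left.
p_over = 1 - weights.cdf(15.2)
print(round(p_over, 4))
```

The same value comes from standardizing first: z = (15.2 - 15.0) / 0.4 = 0.5, then looking up the area to the right of 0.5.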
What probability best matches the following statement: "This event will occur slightly more often than not."
0.6
How many standard deviations above and below the mean do the quartiles of any Normal distribution lie? (Use the standard Normal distribution to find Q1 and Q3 for z-scores.)
0.67
Using the Standard Normal Table, what is the z-score with 0.25 to its right?
0.67
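A reverse table lookup (area to z-score) can be sketched the same way. This assumes Python 3.8+ and `statistics.NormalDist`; `q1_z` and `q3_z` are illustrative names.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard Normal: mean 0, standard deviation 1

# An area of 0.25 to the right is the same as an area of 0.75 to the left.
q3_z = std_normal.inv_cdf(0.75)
# By symmetry, the first quartile sits the same distance below the mean.
q1_z = std_normal.inv_cdf(0.25)
print(round(q1_z, 2), round(q3_z, 2))
```

This is why the quartiles of any Normal distribution lie about 0.67 standard deviations below and above the mean.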
Using the Standard Normal Table, what proportion of observations on the standard normal curve satisfies -1.65 < z < 1.75?
0.9104
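The area between two z-scores is the difference of two left-tail areas. The sketch below (assuming Python 3.8+ and `statistics.NormalDist`) computes it both without rounding and in "table style", where each entry is rounded to four decimals before subtracting, as a printed table does.

```python
from statistics import NormalDist

std_normal = NormalDist()

# Area between two z-scores = (area left of the upper z) - (area left of the lower z).
exact = std_normal.cdf(1.75) - std_normal.cdf(-1.65)

# A printed table rounds each entry to four decimals before the subtraction.
table_style = round(std_normal.cdf(1.75), 4) - round(std_normal.cdf(-1.65), 4)
print(round(exact, 4), round(table_style, 4))
```

The table-style result matches the card's 0.9104; the unrounded difference is about 0.9105, so a small mismatch in the fourth decimal place is only table rounding.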
What probability best matches the following statement: "This event is extremely likely, but occasionally it will not occur in a long sequence of trials."
0.95
Probability Rules
1) 0 ≤ P(A) ≤ 1 2) The probabilities of all outcomes sum to 1 3) Complement rule 4) Addition rule for disjoint events
Principles of Valid Experiments
1. Control/Comparison 2. Randomization 3. Replication 4. Double-Blinding
For sampling to work
1. explicitly describe population 2. explicitly describe variable 3. select representative sample
Probability is a measure of how likely an event is to occur. What is the probability of an event that is certain and will occur every time?
1.0
Using the Standard Normal Table, what is the z-score with 0.9345 to its left?
1.51
Using the Standard Normal Table, what is the z-score with 0.9671 to its left?
1.84
Suppose there are three freshmen, one sophomore, and two juniors in a study group. If you randomly select one student, what is the probability of selecting a sophomore?
1/6
Scores on a standard IQ test are approximately normally distributed with a mean of 100 and a standard deviation 15. If Joe's score is 133, how many standard deviations is Joe's score above the mean?
2.2
Rachel scored 670 on the analytic portion of the GRE (Graduate Record Exam). GRE scores are normally distributed with a mean of 600 and a standard deviation of 30. How many standard deviations is Rachel's score above the mean?
2.33
The time it takes for college freshmen to complete the Mason Basic Reasoning Test is normally distributed with a mean of 24 minutes and a standard deviation of 5 minutes. What is the range for the middle 99.7% of the time for the Reasoning Test completion of college freshmen?
30 minutes (from 9 to 39 minutes)
Corks for bottles in a certain manufacturing process are produced in such a way that the diameter of the corks has a Normal distribution with mean 3 cm. and standard deviation 0.1 cm. The specifications call for corks with diameters between 2.9 and 3.1 cm. What percentage of the produced corks do not meet specifications?
31.74%
The average height of American men (ages 20-30) is 69.5 inches with a standard deviation of 3 inches. Between what two heights are the middle 95% of American men?
63.5 to 75.5 inches
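The 68-95-99.7 interval questions all reduce to mean ± k standard deviations. A minimal sketch follows; the helper name `middle_interval` is hypothetical, not from the course.

```python
def middle_interval(mean, sd, k):
    """Return (low, high) for mean +/- k standard deviations."""
    return mean - k * sd, mean + k * sd

# 68-95-99.7 rule: k = 2 covers about the middle 95%, k = 3 about the middle 99.7%.
heights_95 = middle_interval(69.5, 3, 2)      # men's heights, inches
test_times_997 = middle_interval(24, 5, 3)    # Mason test times, minutes

print(heights_95)
print(test_times_997, test_times_997[1] - test_times_997[0])
```

The second line of output shows both the endpoints and the width of the interval, which is the "range" asked for in the Mason test card.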
Convenience Sample
A sample selected by taking the members of the population that are easiest to reach.
Voluntary Response Sample
A sample which involves only those who want to participate in the sampling. - Biased because people with strong opinions are more likely to respond.
Sample survey
A survey to collect data on a sample - Individuals report values themselves
Categorical Variable
A variable recorded as labels, names, or other non-numerical outcomes. - Gender, opinion...
Lurking Variables
A variable that is not among the explanatory or response variables in a study but that may influence the response variable.
Two of the following statements describe a correct probability model. Which statement does not?: There is a probability assigned to each event. All of the probabilities are between negative one and one. The sum of all the probabilities is equal to one.
All of the probabilities are between negative one and one.
What does the distribution of a random variable give us?
All possible values of the random variable and how often they occur.
Open Questions
Allow for almost unlimited responses.
Stemplot
Also called a stem-and-leaf plot. Data are separated into a stem and leaf by place value and organized in the form of a histogram. - not good for large data sets
Randomized Controlled Experiment
An experimental design where all subjects are randomly allocated to different treatments.
Event
An outcome or set of outcomes of a random phenomenon. -Subset of the sample space
Variables
Any characteristic of an individual. A variable can take different values for different individuals.
Symmetric
Being equal or the same in size, shape, and relative position
The length of human pregnancies from conception to birth varies according to a distribution that is approximately Normal with a mean of 266 days and a standard deviation of 16 days. Between what two values do the middle 95% of the lengths of all pregnancies fall?
Between 234 and 298 days
Normal curve G has a mean of 50 and a standard deviation of 5. Normal curve H has a mean of 50 and a standard deviation of 10. How do the shapes of these two Normal curves compare if they are drawn using the same scale?
Both are centered at 50, but curve G is taller and skinnier than curve H
Three children are in a room, ages 3, 4, and 5. A fourth child enters aged 6. What can we say about the mean and standard deviation of the ages?
Both the mean and standard deviation increase.
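This answer can be checked directly. The sketch assumes the sample standard deviation (divisor n - 1), which is what the standard-library `statistics.stdev` computes.

```python
from statistics import mean, stdev

before = [3, 4, 5]        # ages of the three children
after = before + [6]      # a 6-year-old enters

# stdev uses the n - 1 divisor (sample standard deviation).
print(mean(before), stdev(before))
print(mean(after), round(stdev(after), 3))
```

The new value lies above the old mean and farther from the center than the typical observation, so both the center and the spread go up.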
"Telescope" Events
Bringing forward events in the memory to more recent time periods. - "Have you visited the doctor in the past 6 months?"
What type of variable is "whether a driver entered the intersection when the light was red at his/her last stop light"?
Categorical
Process
Chain of activities that turns inputs into outputs
Suppose a researcher is interested in the average ACT score for high school students in Illinois. She randomly selects 150 high schools and then asks each student in the selected high schools what their ACT score was. What kind of sample is this?
Cluster Sample
High school students are selected by choosing at random several of the city's high schools and selecting all of the students from those selected high schools. What sampling technique is being used?
Cluster Sampling
Web Surveys
Collects large amounts of data at lower costs than traditional surveys -advantages: anonymous -disadvantages: people might fill them out in a hurry
Bad Sampling
Convenience sampling, voluntary response sampling, quota sampling - Bad due to bias, and it is impossible to assess uncertainty
Students at a particular university are able to evaluate professors on a five point scale (a score of 1 meaning poor teaching and a score of 5 meaning excellent teaching, and answers are limited to whole round numbers). What type of random variable is professor evaluation an example of?
Discrete
ACT scores are Normally distributed with a mean of 21 and a standard deviation of 6. SAT scores are Normally distributed with a mean of 510 and a standard deviation of 100. Emma scored 646 on the SAT and Tess scored 26 on the ACT. Who has the better score? (Hint: compare z-scores.)
Emma
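Following the hint, each score is standardized against its own distribution before comparing. A short sketch; `z_score` is an illustrative helper name.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

emma_z = z_score(646, 510, 100)  # SAT: mean 510, standard deviation 100
tess_z = z_score(26, 21, 6)      # ACT: mean 21, standard deviation 6

better = "Emma" if emma_z > tess_z else "Tess"
print(round(emma_z, 2), round(tess_z, 2), better)
```

Emma is about 1.36 standard deviations above the SAT mean, while Tess is about 0.83 above the ACT mean, so Emma's score is relatively higher.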
If the distribution of data is symmetric, then the mean of the data ___________ the median
Equal to
Observational studies cannot have control groups.
False
True or False Probability is the process of drawing conclusions about the sample based on population data.
False
True or False Taking a valid simple random sample eliminates all biases, including question wording bias and interviewer bias.
False
True or False The field of statistics ends once we have collected all of our data.
False
True or False We can establish causation whenever a random sample is taken.
False
True or False We can establish causation whenever a valid sample (simple random sample, stratified sample, multistage sample, etc.) is taken.
False
In studies of worker productivity, it has been noticed that any change in the work environment together with the knowledge that a study is underway will produce a short-term increase in productivity. This is known as
Hawthorne effect.
What type of graph would be best for displaying the final exam scores of 2,000 Stat 121 students?
Histogram
What is the advantage of a histogram over a stemplot or dotplot?
Histograms work well for very large data sets
Independent
If A occurs, it does not change the probability of B occurring.
Suspected outlier
If observation > Q3 + (1.5)(IQR) Or observation < Q1 - (1.5)(IQR)
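The 1.5 × IQR rule can be sketched as follows. Two assumptions are worth flagging: the quartiles are computed as medians of the lower and upper halves (software packages use slightly different conventions), and the data values are made up for illustration.

```python
from statistics import median

def quartiles(data):
    """Q1 and Q3 as medians of the lower and upper halves (needs at least 2 values)."""
    xs = sorted(data)
    half = len(xs) // 2
    lower = xs[:half]    # values below the median position
    upper = xs[-half:]   # values above it (the middle value is skipped when n is odd)
    return median(lower), median(upper)

def suspected_outliers(data):
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]

scores = [2, 4, 5, 6, 7, 8, 9, 30]  # hypothetical data
print(quartiles(scores), suspected_outliers(scores))
```

Here Q1 = 4.5 and Q3 = 8.5, so the IQR is 4 and the fences are -1.5 and 14.5; only 30 falls outside them.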
Experiments
Imposes some treatment on individuals in order to observe their responses. -Study whether the treatment causes a change in the response.
Outlier
Individual value that falls outside the overall pattern of a distribution.
What is the final step in the Big Picture of Statistics?
Inference
Which of the following statements correctly describes a normal curve? Its mean is not equal to its median. It is symmetric. Its standard deviation is always one.
It is symmetric.
Why use a sample instead of a census?
It's practical, cheaper, and can be more accurate
Four of the following statements are correct descriptions of a Normal curve. Which one is not a characteristic of a Normal curve? It is symmetric. Its spread increases as its mean increases. It is bell-shaped. It is single-peaked (unimodal). Its mean is equal to its median.
Its spread increases as its mean increases.
Closed Questions
Limits response options
Twenty-five right-handed men were tested to compare their right hand strength with their left hand strength using a bathroom scale. For each male, a coin was tossed. If it landed heads, the man first squeezed the scales with his right hand and then with his left hand. If the coin landed tails, the man squeezed the scales with his left hand first and then with his right. The weight registered on the scale is recorded for both hands. What type of study is this?
Matched pairs experiment
For right skewed data, which is bigger? Mean or Median
Mean
Data
Measurements for a set of individuals
For left skewed data, which is bigger? Mean or Median
Median
Data Analysis
Methods and strategies for looking at data
Educators in California are concerned about a recent newspaper article reporting that students in the United States are falling behind students in other nations in their math skills. They decide to sample 10th grade students throughout the state and test their mathematics skills. They first randomly select 10 school districts. From each of these 10 school districts they randomly select three high schools. From these 3 high schools they randomly select 10 students and test them. What type of sample is this?
Multistage Sample
Standardized Value
The z-score obtained from standardizing an x-value.
Researchers followed a group of 10,892 middle-aged adults over a period of nine years. They found that smokers who quit had a higher risk of diabetes within three years of quitting than either nonsmokers or continuing smokers. Does this show that stopping smoking causes the short-term risk for type 2 diabetes to increase?
This is an observational study; it is not reasonable to conclude any cause-and-effect relationship.
Why do we randomize in experiments?
To eliminate bias associated with lurking variables.
Why do we compare different treatment groups in experiments?
To enable the measurement of treatment differences
Why do we use replication in experiments?
To more precisely measure chance variation.
Samples can be biased due to poor interviewing and/or poorly worded questions
True
Samples of convenience are non-probability samples.
True
The existence of possible lurking variables is the main reason we say association does not imply causation.
True
True or False Standard deviation is a measure of variability.
True
True or False An experiment that doesn't incorporate randomization is not a valid experiment.
True
True or False Cluster sampling is a type of probability sampling design.
True
True or False Outliers "inflate" standard deviation.
True
True or False Probabilities on individuals in a population can be computed using the standard Normal table only if the population distribution is Normal.
True
True or False Probability is a way to measure or quantify uncertainty
True
True or False The Law of Large Numbers states that as the number of trials increases, the relative frequency of an event gets closer and closer to the theoretical probability.
True
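The Law of Large Numbers can be watched in a simulation. This is a sketch, not a proof: it assumes a fair coin (p = 0.5) and a fixed seed so runs are repeatable; the function name and checkpoints are illustrative.

```python
import random

def running_relative_frequency(n_trials, p=0.5, seed=121):
    """Simulate n_trials coin flips and record the relative frequency of heads
    at a few checkpoints along the way."""
    rng = random.Random(seed)
    successes = 0
    checkpoints = {}
    for trial in range(1, n_trials + 1):
        if rng.random() < p:
            successes += 1
        if trial in (10, 100, 1_000, 10_000, 100_000):
            checkpoints[trial] = successes / trial
    return checkpoints

freqs = running_relative_frequency(100_000)
for n, f in freqs.items():
    print(n, round(f, 4))
```

Early checkpoints can wander well away from 0.5, while the later ones settle close to it, which is the long-run behavior the card describes.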
True or False The distribution of a random variable shows all possible values the random variable could take and how often they occur.
True
True or False The purpose of statistics is to convert data into useful information.
True
True or False When sampling from the population of interest, it is a good idea to have a representative sample.
True
True or False When we use a randomized block design we want the subjects within each "block" to be similar but they should be different from block to block.
True
True or False Without random selection, we cannot appropriately apply the laws of probability to perform inference.
True
When we perform a proper experiment, we can
establish causation
Lack of Realism
When the treatments, the subjects, or the environment of an experiment are not realistic. Lack of realism can limit researchers' ability to apply the conclusions of an experiment to the settings of greatest interest.
An experiment was designed using school children as subjects to determine whether milk prevented their catching colds. The researcher randomly assigned 100 school children to two groups - one group of 50 to receive a cup of milk at school each day and the other group of 50 to receive no milk at school. What is the response variable? Remember: The response variable is measured on the individual.
Whether the child caught a cold
In the milk study, what is the explanatory variable?
Whether the child received a cup of milk at school each day
An experiment was designed using school children as subjects to determine whether drinking milk prevented their catching colds. The researcher randomly assigned 100 school children to the two groups - one group of 50 to receive a cup of milk at school each day and the other group of 50 to receive no milk at school. What is a potential lurking variable?
Whether the child was frail or robust at the beginning of the experiment
Suppose you have played a game many, many times—winning sometimes and losing sometimes. Can you use the results of playing the game to estimate your overall probability of winning the game?
Yes
An experiment was designed using school children to determine whether drinking milk prevented their catching colds. The researcher randomly assigned 100 school children to two groups - one group of 50 to receive a cup of milk at school each day and the other group of 50 to receive no milk at school. Does the study incorporate replication?
Yes, since there were 50 children in each treatment group.
Standard Deviation
a computed measure of how much scores vary around the mean score
A reporter for the university newspaper wants to find out the opinions that all BYU students have about the university health center. During a class break, he goes to the health center, contacts a few students as they exit, and asks them to fill out a survey. What type of sample is this?
a convenience sample
Matched Pair Design
a design in which one creates a set of two participants who are highly similar on a key trait and then randomly assigns individuals in the pair to different groups - Compares Two Treatments - Each subject receives both treatments in a random order.
Bar Graph
a graph that uses vertical or horizontal bars to show comparisons among two or more items
Dot Plot
a graphical device that summarizes data by the number of dots above each data value on the horizontal axis
Block
a group of experimental units that are known before the experiment to be similar in some way that is expected to affect the response to the treatments
Statistic
a numerical measurement describing some characteristic of a sample
Simple Random Sample
a particular type of probability sample in which every member of the population has an equal chance of being selected
Trend
a pattern of change over time; long-term upward or downward movement.
Explanatory Variable
a variable that we think explains or causes changes in the response variable
Quantitative Variable
a variable whose values can be recorded as meaningful numbers - cost, height.....
Which of the following is NOT a measure of center of the data? a. Interquartile range b. Median c. Mean
a. Interquartile range
The announcer on the radio tells listeners that the probability of snow tonight is 20%. We should interpret this to mean that
according to historical records when meteorological conditions were the same as today, it snowed twenty percent of the time
A study of religious practices among college students interviewed a sample of 127 students; 107 of the students said that they prayed at least once in a while. Which of the following best describes the population?
all college students
Process of statistics
collect data, summarize data, interpret data
Interviewer Bias
effects of interviewers on respondents that lead to biased answers
Standard deviation measures
how much the observations in a data set vary about their mean.
68-95-99.7 Rule
in a normal model, about 68% of values fall within 1 standard deviation of the mean, about 95% fall within 2 standard deviations of the mean, and about 99.7% fall within 3 standard deviations of the mean
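The three percentages in the rule are rounded versions of exact Normal areas. A quick check, assuming Python 3.8+ and `statistics.NormalDist`:

```python
from statistics import NormalDist

std_normal = NormalDist()

# Area within k standard deviations of the mean, for k = 1, 2, 3.
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}
for k, area in within.items():
    print(k, round(area, 4))
```

The areas come out near 0.6827, 0.9545, and 0.9973, which the rule rounds to 68%, 95%, and 99.7%.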
Stratified Random Samples
population is divided into subpopulations (called strata) and a random sample is obtained from each stratum
Replication
repeating the essence of a research study, usually with different participants in different situations, to see whether the basic finding extends to other participants and circumstances
Wording Effects
when a specific word used in a question affects how respondents answer the question or the order of the questions
Mean equation
x̄ = ¹⁄ₙ ∑ xᵢ
Which of the following intervals corresponds to the smallest area under a Normal curve?
μ to μ + 3σ
Principles of Data Ethics
• safety and well-being of the subjects must be protected • all individuals must give their informed consent before data are collected • individual data must be kept confidential
In statistics, how do we define the probability of an event?
The relative frequency with which the event occurs in a long series of trials.
A group of college students believed that herbal tea has remarkable restorative powers. To test their theory, they randomly selected residents at a local nursing home. Each week they visited these residents and served them herbal tea. After several months, many of these residents were more cheerful and healthy. Which of the following can be correctly concluded from this study?
The results are not convincing because the effect of herbal tea is confounded with the effect of visiting.
The standard deviation of a set of data is computed to be 8.2. If 10 is added to each data value, what can we say about the standard deviation of the new data set?
The standard deviation stays the same.
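A small demonstration of why the answer holds: adding a constant shifts every value and the mean by the same amount, so the deviations from the mean do not change. The data values below are made up.

```python
from statistics import mean, stdev

data = [12, 15, 19, 22, 30]         # hypothetical values
shifted = [x + 10 for x in data]    # add 10 to every observation

print(mean(data), mean(shifted))    # the center moves up by 10
print(stdev(data), stdev(shifted))  # the spread is unchanged
```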
What is a correct description of the median of a distribution that is described by a density curve?
The value that divides the area of the density curve in half.
Data Production
methods for producing data that can give clear answers to specific questions
The Wechsler Adult Intelligence Scale (WAIS) is the most common "IQ" test. The scale of scores is approximately Normal with mean 100 and standard deviation 15. Approximately what percentage of the IQ's are between 85 and 130? Draw and label the Normal curve and use the 68-95-99.7 rule.
81.5%
Twelve locations were selected in a county where a steel plant is accused of air pollution. Sulfate levels were measured at each of these locations, as well as the distance of the location from the steel plant. The data were analyzed to see whether the sulfate level decreases as distance from the plant increases. What type of study is this?
An observational study (no treatment was imposed; sulfate levels and distances were only measured)
normal distribution
A function that represents the distribution of variables as a symmetrical bell-shaped graph.
Histogram
A graph of vertical bars representing the frequency distribution of a set of data. - Too many classes = too noisy - Too few classes = too smooth
What is a distribution of a random variable?
A list of possible values of a variable together with how often each value occurs.
Table of Random Digits
A long string of the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 with these properties: 1. Each entry in the table is equally likely to be any of the 10 digits 0 through 9. 2. The entries are independent of each other. That is, knowledge of one part of the table gives no information about any other part.
Density Curve
A mathematical model used to describe the overall pattern of the distribution of a random variable.
Correlation
A measure of the extent to which two factors vary together, and thus of how well either factor predicts the other.
Standard Normal Distribution
A normal distribution with a mean of 0 and a standard deviation of 1.
Sample
A part of the population from which we actually collect information. - Draw conclusions about the entire population -Sample represents a larger whole.
Midpoint
A point that divides a segment into two congruent segments
Finite Probability Model
A probability model with a finite sample space. To assign probabilities in this model, list the probabilities of all the individual outcomes. These must be numbers between 0 and 1 that add to exactly 1. The probability of any event is the sum of the probabilities of the outcomes making up the event.
Five Number Summary
minimum, Q1, median, Q3, maximum
Suppose a survey is so long that many respondents refuse to complete it. What type of bias could result?
Non-response bias
Which one of these random variables is discrete? Height of an adult male Number of phone calls received in a day GPA
Number of phone calls received in a day
Parameter
Numerical fact about the variable in the population
To investigate the effects of the drug phen-fen, 200 women in the 30-40 age range who had used the drug for at least one year were located. 200 women of the same age group who had not used the drug were also located. The incidence of heart valve abnormality was compared between the two groups. What type of study is this?
Observational Study
Patients often show improvement even when they get a sugar pill from a doctor rather than the actual medication. What is this phenomenon called?
Placebo Effect
Which interval corresponds to the largest area under a Normal curve?
Q1 to μ+3σ
interquartile range
Q3-Q1 Difference between quartiles
Consider these two survey questions: 1. How often do you read a book for fun (outside of work or school)? 2. How often do you read a book for fun (outside of work or school): never, sometimes, or often? Are these examples of open or closed questions?
Question 1 is an open question and Question 2 is a closed question.
Which one of the following is a benefit of randomized block designs (RBD)? RBD eliminates all lurking variables. RBD eliminates the placebo effect. RBD reduces chance variation by removing variation associated with the blocking variable. RBD removes bias.
RBD reduces chance variation by removing variation associated with the blocking variable
Good Sampling
Random Sampling, Cluster Sampling, Multistage Sampling
Researchers want to compare the effectiveness of exercise and dieting compared to dieting alone for weight loss. They have 60 volunteers, 30 men and 30 women. They randomly assign half of the men to Group 1, exercise and diet, and the other half to Group 2, diet alone. They follow the same procedure for the women. Half of the women are assigned to Group 1 and the other half are assigned to Group 2. After 16 weeks, their weight loss was measured and compared. What type of study is this?
Randomized Block Experiment
A study was conducted to test the effectiveness of using an antidepressant called imipramine in treating bulimia, an eating disorder. Twenty patients were randomly assigned to one of two groups with ten in each. One group received imipramine and the other received a placebo. The response measured was binge frequency. What type of study is this?
Randomized Controlled Experiment
What is the major difference between an observational study and an experiment?
Researchers assign treatments to subjects in an experiment.
What characterizes a probability sample but not a sample of convenience?
Some type of random device is used to obtain a probability sample. Their probabilities can be computed. All possible probability samples can be listed. Inferences can appropriately be made from probability samples.
Which of the following is a false statement about statistics?: - Collecting data from the entire population may be impossible. - Statistical information is never misleading or misrepresented. -All statistical summaries and conclusions should be reported in context. -Publishing research results is not a part of the Big Picture of Statistics.
Statistical information is never misleading or misrepresented.
A popular magazine is interested in the average amount of time that their readers spend on the internet each day. They randomly survey 100 of their female readers and 100 of their male readers and ask them about their average internet use. What type of sample is this?
Stratified Sample
With the "Let's Make a Deal" game show, what strategy should contestants use?
Switch to the other closed door.
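The switching strategy can be checked by simulation. The sketch assumes the usual three-door setup: one prize, the host knows where it is, and the host always opens a losing door that the contestant did not pick. Function names and the seed are illustrative.

```python
import random

def play(switch, rng):
    """Play one game; return True if the contestant ends up with the prize."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the prize.
    opened = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        # Move to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize

def win_rate(switch, n_games=100_000, seed=121):
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(n_games)) / n_games

stay_rate = win_rate(switch=False)
switch_rate = win_rate(switch=True)
print(round(stay_rate, 3), round(switch_rate, 3))
```

Under these assumptions staying wins about one time in three and switching about two times in three, because switching wins whenever the first pick was wrong.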
Bias
Systematic errors in the way a sample represents the population.
Cumulative Proportions
The cumulative proportion for a value x in a distribution is the proportion of observations in the distribution that are less than or equal to x.
Population
The entire group of individuals that we want information about.
Three children are in a room, ages 3, 4, and 5. A fourth child enters aged 4. What can we say about the mean and standard deviation of the ages?
The mean stays the same, but the standard deviation decreases.
Individual
The objects described by a set of data - people, animals, or things
In a random digit telephone survey, homeless people or people with only cell phones do not have telephones that can be called with random digit dialing. What type of bias is this?
Undercoverage Bias
Measurement
Value of a variable for an individual - textbook cost for Nathan...
Quartiles
Values that divide a data set into four equal parts
What is the dogma of statistics?
Variation has to be dealt with
Non-Response
When an individual selected for the sample does not provide information. - hang ups, vacation...
Cluster Sampling
When the population is naturally divided into groups (called clusters), whole clusters are selected at random. - Each cluster should be representative of the population as a whole.
A student in STAT121 wants to estimate the mean score on the STAT121 final exam from last semester. They asked 200 students who took the class last semester what their final exam score was and recorded it. What is the population?
all students who took STAT121 last semester
Completely Randomized Experiment
all subjects are allocated at random to all treatments and roughly equal number of subjects are assigned to each treatment
Informed Consent
an ethical principle that research participants be told enough to enable them to choose whether they wish to participate
Randomized Comparative Experiment
an experiment that uses both comparison of two or more treatments and random assignment of subjects to treatments
Statistically Significant
an observed effect so large that it would rarely occur by chance
Treatment
any specific experimental condition applied to the subjects. - If an experiment has more than one factor, this is a combination of specific values of each factor.
Response Bias
anything in a survey design that influences responses
When you play solitaire, you either win or lose. Therefore, the probability of winning, according to its definition, is
approximated by playing lots of times and dividing the number of times you win by the number of times you play.
What type of discipline is statistics?
art, science, methodology
Continuous Probability Model
assigns probabilities as areas under a density curve
Which one of the following is NOT affected by outliers? a. Mean b. Median c. Standard deviation d. Range
b. Median
Which one of the following variables is categorical? a. Fuel efficiency of vehicles b. Type of cell phone service c. Number of phone calls made from hotel rooms d. Body temperature
b. Type of cell phone service
Control/Comparison
control lurking variables by including comparison treatments, using homogeneous subjects; used to measure placebo effect
When the data are arranged in ascending numerical order, the median is the value which
cuts the data in half.
Which one of the following variables is quantitative? a. Car color b. Brand of detergent c. ID number of a BYU student d. Weight of a football team member
d. Weight of a football team member
Which one of the following does NOT have the same units of measure as the data? a. Mean b. Standard deviation c. Interquartile range d. Z-score
d. Z-score
Data Set
data identified with contextual information table: rows = individuals, columns = variables
Uniform Density Curve
describes a variable that takes values that are uniformly spread between a range of values
Sampling Design
describes exactly how to choose a sample from the population
Boxplot
displays the 5-number summary as a central box with whiskers that extend to the non-outlying data values
Placebo Effect
experimental results caused by expectations alone; any effect on behavior caused by the administration of an inert substance or condition, which the recipient assumes is an active agent.
Clinical Trials
experiments that study the effectiveness of medical treatments on actual patients
Non-Compliance
failure to act in accordance with a wish or command
Why might collecting data from a sample be preferred over collecting data from a population?
less expensive, less time-consuming, and not impossible (a full census may be)
How do you find the standard deviation?
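Rough estimate: (max − min) / 6, since nearly all observations in a Normal distribution fall within 3 standard deviations of the mean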
Measures of Center
mean, median, mode
Response Variable
measures an outcome of a study
Standard Deviation equation
s = √( ∑ (xᵢ − x̄)² / (n − 1) ); measures how much variability (spread) there is in the data
Statistical Inference
methods for drawing conclusions about a population from sample data
Degrees of Freedom
n-1
Randomization
neutralize effects of lurking variables by assigning subjects to treatments randomly
Observational Studies
observes individuals and measures variables of interest but does not attempt to influence the responses. - To describe a group or situation
Undercoverage
occurs when some groups in the population are left out of the process of choosing the sample - homeless, phoneless...
Cycles
patterns in the data that occur, up and down movement.
Time Plot
plots each observation against the time at which it was measured
Multistage Sample
select successively smaller groups within the population in stages. - States-Counties-People...
Misleading Response
selected individuals lie or give inaccurate answers
Regression Lines
straight lines that best summarize a correlation; used to make predictions
To standardize means to
subtract the mean from a given value and then divide by the standard deviation.
Normal Curves
symmetric, bell-shaped curves
Distribution
tells us what values the variable takes and how often it takes these values
The report about water quality included information on the water quality at swimming beaches in California. Forty-five of these beaches were sampled to test whether they failed to meet water quality standards. What is the Sample?
the 45 beaches that were sampled
Double Blinding
the act of blinding both the subjects of an experiment and the researchers who work with the subjects - Used to prevent experimenter effect
Confidentiality
the act of holding information in confidence, not to be released to unauthorized individuals
Mean
the arithmetic average of a distribution, obtained by adding the scores and then dividing by the number of scores
Anonymity
the condition of being unknown
Factors
the explanatory variables in an experiment
The probability of an event can be defined as
the fraction of times the event will occur if the random phenomenon is repeated many times.
Resistant Measure
a measure that resists the influence of extreme observations - the median is resistant; the mean is not
Probability
the proportion of times the outcome would occur in a very long series of repetitions
Block Design
the random assignment of subjects to treatments is carried out separately within each block
Random Digit Dialing (RDD)
the random dialing by a machine of numbers within designated phone prefixes, which creates a random sample for phone surveys
Statistics (Course)
the science of collecting, organizing, analyzing, and interpreting data in order to make decisions
sample space S
the set of all possible outcomes
Variance
the square of the standard deviation
Skewed to the left
the tail to the left of the peak is longer than the tail to the right of the peak
Skewed to the right
the tail to the right of the peak is longer than the tail to the left of the peak
Question Order Bias
the tendency for earlier questions on a questionnaire to influence respondents' answers to later questions
Confounded
two variables are confounded when their effects on a response variable cannot be distinguished from each other
Chance Behavior
unpredictable in the short run but has a regular and predictable pattern in the long run
Exploratory Data Analysis
uses graphs and numerical summaries to describe the variables in a data set and the relations among them
Confidence Interval
uses sample data to estimate an unknown population parameter with an indication of how accurate the estimate is and of how confident we are that the result is correct - Two Parts: 1. An interval calculated from the data 2. Confidence level