Statistics Test 1
What is a frame?
A list of the individuals in the population being studied
Parameter
Numerical summary of a population
Statistic
Numerical summary of a sample
True or false: generally, the goal of an experiment is to determine the effect that the treatment will have on the response variable.
True
Five number summary
minimum, Q1, median, Q3, maximum
What is the probability of an event that is impossible? Suppose that a probability is approximated to be zero based on empirical results. Does this mean that the event is impossible?
0. No.
Which of the following numbers can be the probability of an event? 0.06, -0.54, 1.56, 1, 0.36, 0
0, 0.06, 0.36, 1. A probability can't be below 0 or above 1.
In a relative frequency distribution, what should the relative frequencies add up to?
1
Find the sample mean as indicated. Sample: 16, 8, 7, 10, 19
1) open in statcrunch-stat-summary stats-columns-compute-mean. x̄=12
What is a random variable?
A random variable is a numerical measure of the outcome of a probability experiment.
Discrete means
Countable
What is a lurking variable?
Explanatory variable that was not considered in a study, but that affects the value of the response variable in the study. In addition, lurking variables are typically related to explanatory variables in the study.
Stratified sample
Is obtained by dividing the population into homogeneous groups and randomly selecting individuals from each group.
y=808.8x+11,607 where y is median income and x is the percentage 25 years and older with at least a bachelor's degree. Explain why it does not make sense to interpret the y-intercept.
It does not make sense to interpret the y-intercept because an x-value of 0 is outside the scope of the model.
To help assess student learning in her music appreciation courses, a music professor at a community college implemented pre- and post-tests for her music appreciation students. A knowledge-gained score was obtained by taking the difference of the two test scores. What type of experimental design is this?
Matched pair
Use the side-by-side box plots shown. What is the median of variable x? What is the third quartile of variable y? Which variable has more dispersion? Why? Describe the shape of variable x. Describe the shape of variable y.
The median is the middle line in the box: 30. The third quartile is the right side of the box: 48. Variable y has more dispersion because its interquartile range is larger than that of variable x. Variable x is symmetric. Variable y is skewed right.
Which is the superior observational study? Case-control or cross-sectional?
Neither study is always superior to the other. Both have advantages and disadvantages that depend on the situation.
What is a case-control study?
Observational studies that are retrospective, meaning that they require individuals to look back in time or require the researcher to look at existing records.
What is a cross-sectional study?
Observational studies that collect information about individuals at a specific point in time or over a very short period of time.
What does it mean when sampling is done without replacement?
Once an individual is selected, the individual cannot be selected again.
The data in the table are based on the results of a survey comparing the commute time of adults to their score on a well-being test. Draw a scatter diagram of the data.
Open data in statcrunch-graph-scatterplot-compute. Choose graph
Find the probability of P(E^c) if P(E)=0.36
P(E^c)=1-P(E) P(E^c)=1-0.36 P(E^c)=0.64
Coefficient of determination
R^2 measures the proportion of total variation in the response variable that is explained by the least-squares regression line.
Apple wants to administer a satisfaction survey to its current customers. Using their customer database, the company randomly selects 40 customers and asks them about their level of satisfaction with the company. What type of sampling is used?
Simple random
Define statistics
Statistics is the science of collecting, organizing, summarizing, and analyzing information to draw a conclusion and answer questions. In addition, statistics is about providing a measure of confidence in any conclusions.
To estimate the percentage of defects in a recent manufacturing batch, a quality control manager at Ford selects every 14th truck that comes off the assembly line starting with the second until she obtains a sample of 30 trucks. What type of sampling is used?
Systematic
Which sampling method does not require a frame?
Systematic
A survey of 600 randomly selected high school students determined that 91 play organized sports. a) what is the probability that a randomly selected high school student plays organized sports? b) interpret this probability.
a) Take 91/600 = 0.152 b) Take 0.152 x 1,000 = 152. If 1,000 high school students were sampled, it would be expected that ABOUT 152 of them play organized sports.
Response variable
The quantitative or qualitative variable for which the experimenter wishes to determine how its value is affected by the explanatory variable.
y=808.8x+11,607 where y is median income and x is the percentage 25 years and older with at least a bachelor's degree. Interpret the slope.
The slope is 808.8. For every one percentage point increase in adults having at least a bachelor's degree, the median income increases by $808.80, on average.
Lower class limit
The smallest value within the class
One of the greatest baseball hitters of all time has a career batting average of 0.366. Is the value a parameter or a statistic?
The value is a parameter because the career at-bats of a baseball player are a population.
Refer to the table below. Is constructing a conditional distribution by level of education different from constructing a conditional distribution by employment status? If they are explain the difference.
They are different because constructing a conditional distribution by level of education computes the relative frequency for each employment status, given the individual's level of education. Constructing a conditional distribution by employment status computes the relative frequency for each level of education, given the individual's employment status.
Find the value of 5!
Type in 5 on calculator-hit math-PRB-down to !-enter. 120
Is there an association between party affiliation and gender? The accompanying data represents the gender and party affiliation of registered voters based on a random sample of 812 adults. a. Construct a FREQUENCY marginal distribution table. b. Construct a RELATIVE frequency marginal distribution.
a. add up columns and rows and fill in table. b. take totals of each column/row and divide by total. Will be a decimal
The standard deviation is used in conjunction with the ________ to numerically describe distributions that are bell shaped. The ________ measures the center of the distribution, while the standard deviation measures the ________ of the distribution.
mean, mean, spread
The factorial symbol n! is defined as n!=_______ and 0!=________.
n(n-1)•...•3•2•1 and 0!=1
P(E^c) formula
1-P(E)
Determine the original set of data
10, 11, 12, 21, 24, 24, 27, 28, 33, 35, 35, 35, 37, 38, 40, 41
Explain the meaning of the following percentile. The 90th percentile of the length of newborn females in a certain city is 54.3 cm.
90% of newborn females have a length of 54.3 cm or less, and 10% of newborn females have a length that is more than 54.3 cm.
What is meant by conditional distribution?
A conditional distribution lists the relative frequency of each category of the response variable, given a specific value of the explanatory variable in a contingency table.
Placebo
An innocuous medication, such as a sugar tablet, that looks, tastes, and smells like the experimental medication.
Descriptive statistics
Consists of organizing and summarizing information collected
Which allows the researcher to claim causation between an explanatory variable and a response variable?
Designed experiment
What is the formula for the expected number of successes in a binomial experiment with n trials and probability of success p?
E(X) = np
Explain what each point in the least-squares regression line represents.
Each point on the least-squares regression line represents the predicted y-value at the corresponding value of x.
A binomial experiment is performed a fixed number of times. What is each repetition of the experiment called?
Each repetition of the experiment is called a trial.
True or false: a data set will always have exactly one mode.
False
True or false: the shape of the distribution shown is best classified as skewed left.
False
True or false: when two events are disjoint, they are also independent.
False
True or false: correlation implies causation.
False.
What is a closed question?
Has fixed choices for answers, whereas an open question is a free-response question.
What does it mean if r=0?
No linear relationship exists between the variables.
Determine if the following probability experiment represents a binomial experiment. A random sample of 30 high school seniors is obtained, and the individuals selected are asked to state their heights.
No, this probability experiment does not represent a binomial experiment because the variable is continuous, and there are not two mutually exclusive outcomes.
Lower fence formula
Q1-1.5(IQR)
Upper fence formula
Q3+1.5(IQR)
A polling organization contacts 1512 adult women who are 30 to 70 years of age and live in the United States and asks whether or not they had received a mammogram during the last year. What is the sample in the study?
The 1512 adult women who are 30 to 70 years of age and live in the United States.
Class width
The difference between consecutive lower class limits.
Confounding
The effects of two factors (explanatory variables) on the same response variable cannot be distinguished.
What does it mean when part of the population is under-represented?
Part of a population is under-represented when it is proportionally smaller in a sample than in its population.
What does it mean to say that the linear correlation coefficient between two variables equals 1? What would the scattergram look like?
When the linear correlation coefficient is 1, there is a perfect positive linear relation between the two variables. The scatter diagram would contain points that all lie on a line with a positive slope.
z score formula
z = (x - μ)/σ (x-mean)/standard deviation
If r =_____ then a perfect negative linear relation exists between the two variables.
-1
Is the following a probability model? What do we call the outcome green?
1) Add up all the probabilities. No, because the probabilities do not sum to 1. An impossible event.
What is a confounding variable?
An explanatory variable that was considered in a study whose effect cannot be distinguished from a second explanatory variable in the study.
Advantages and disadvantages of closed and open question.
Closed questions are easier to analyze, but limit the responses. Open questions allow respondents to state exactly how they feel, but are harder to analyze due to the variety of answers and possible misinterpretation of answers.
Is the number of points scored during a basketball game continuous or discrete?
Discrete. The possible values are 0, 1, 2...
To determine customer opinion of their pricing, Home Depot randomly selects 100 check out lines during a certain week and surveys all customers in the check out lines. What type of sampling is used?
Cluster
Two events are _____ if the occurrence of event E in a probability experiment does not affect the probability of event F.
Independent
A(n)____is a person or object that is a member of the population being studied.
Individual
Suppose that events E and F are independent, P(E)=0.3 and P(F)=0.9. What is P(E and F)?
P(E and F) = P(E) • P(F) = 0.3•0.9 = 0.27
P(E and F) formula
P(E) • P(F)
If E and F are disjoint events, then P(E or F) =
P(E)+P(F)
If E and F are not disjoint events, then P(E or F)=
P(E)+P(F)-P(E and F)
P(E or F) formula
P(E)+P(F)-P(E and F)
IQR (interquartile range) formula
Q3-Q1
Classes
The categories by which data are grouped
Upper class limit
The largest value within the class.
What are the two requirements for a discrete probability distribution?
∑ P(x)=1 and 0<=P(x)<=1
What does it mean if a statistic is resistant?
Extreme values (very large or small) relative to the data do not affect its value substantially.
The U.S. Department of Housing and Urban Development (HUD) uses the median to report the average price of a home in the United States. Why do you think HUD uses the median?
HUD uses the median because the data are skewed to the right, and the median is better for skewed data.
Violent crimes include rape, robbery, assault, and homicide. The following is a summary of the violent crime rate (violent crimes per 100,000 population) for all states of a country in a certain year. Q1=272.8 Q2=387.4 Q3=528.3 Determine and interpret the interquartile range. IQR=Q3-Q1
IQR=Q3-Q1 528.3-272.8=255.5 The middle 50% of all the observations have a range of 255.5 crimes per 100,000 population.
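As a quick check of the arithmetic above, here is a minimal Python sketch that recomputes the IQR from the quartiles in the card. As an extra illustration that the card does not ask for, it also applies the lower and upper fence formulas from this set. The helper name `iqr_and_fences` is made up for this example.

```python
def iqr_and_fences(q1, q3):
    """Return (IQR, lower fence, upper fence) for the given quartiles."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr   # Q1 - 1.5(IQR)
    upper_fence = q3 + 1.5 * iqr   # Q3 + 1.5(IQR)
    return iqr, lower_fence, upper_fence

# Quartiles from the violent crime rate card.
iqr, lower, upper = iqr_and_fences(272.8, 528.3)
print(round(iqr, 1), round(lower, 2), round(upper, 2))
```

A negative lower fence simply means no state can be a low outlier here, since a crime rate cannot be below 0.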
Explain the difference between a single-blind and a double-blind experiment.
In a single-blind the subject doesn't know which treatment is received. In a double-blind neither the subject nor the researcher in contact with the subject knows which treatment is received.
A probability experiment is conducted in which the sample space of the experiment is S={3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14}. Let event E={5, 6, 7, 8, 9, 10}. Assume each outcome is equally likely. List the outcomes in E^c. Find P(E^c).
List all outcomes in S that are not in E. E^c={3, 4, 11, 12, 13, 14} E^c contains 6 outcomes. S contains 12 outcomes. Divide 6/12 to get P(E^c). P(E^c)=0.5
A frequency distribution lists the_________of occurrences of each category of data, while a relative frequency distribution lists the ________of occurrences of each category of the data.
Number , proportion
Let the sample space be S={1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. Suppose the outcomes are equally likely. Compute the probability of the event E={1, 2}. P(E)=?
Number of E/Number of S = P(E) E has 2 numbers. S has 10 numbers. 2/10=0.2 P(E)=0.2
Z-score
Represents the number of standard deviations an observation is from the mean.
y=808.8x+11,607 where y is median income and x is the percentage 25 years and older with at least a bachelor's degree. In a particular region, 26.8 percent of adults 25 years and older have at least a bachelor's degree. The median income of this region is 29,792. Is this income higher than what you would expect? Why?
Substitute 26.8 for x to find the predicted median income and then compare. y=808.8(26.8)+11,607=33,283. The observed income of 29,792 is lower than predicted.
y=808.8x+11,607 where y is median income and x is the percentage 25 years and older with at least a bachelor's degree. Predict the median income of a region in which 30% of adults 25 years and older have at least a bachelor's degree.
Substitute 30 for x to find median income. y=35,871
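The substitution steps in the regression cards above can be sketched in a few lines of Python. This is only an illustration; the function name `predicted_median_income` is hypothetical, and the coefficients are taken directly from the line y=808.8x+11,607.

```python
def predicted_median_income(pct_bachelors):
    """Least-squares line from the cards: y = 808.8x + 11,607."""
    return 808.8 * pct_bachelors + 11607

observed = 29792                              # median income reported for the region
predicted = predicted_median_income(26.8)     # prediction at x = 26.8
print(round(predicted, 2))
print(observed < predicted)                   # True means the observed income is lower than predicted
print(round(predicted_median_income(30), 2))  # prediction at x = 30
```

Plugging in x=26.8 gives roughly 33,283, so the observed 29,792 sits below the line; x=30 gives 35,871.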
A binomial probability experiment is conducted with the given parameters. Compute the probability of x success in the n independent trials of the experiment. n=10, p=0.45, x=8 P(8)=
Take 10C8 • (0.45)^8 • (1-0.45)^(10-8) = (45)(0.0016815)(0.3025) P(8)=0.0229
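For readers who want to check the binomial computation without a calculator, here is a small sketch using Python's standard `math.comb` (available in Python 3.8 and later). The function name `binomial_pmf` is an illustrative choice, not something from the course materials.

```python
from math import comb

def binomial_pmf(n, p, x):
    """P(X = x) = nCx * p^x * (1 - p)^(n - x) for a binomial experiment."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

print(comb(10, 8))                          # 45 ways to place 8 successes in 10 trials
print(round(binomial_pmf(10, 0.45, 8), 4))  # 0.0229
```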
A polling organization contacts 1512 adult women who are 30 to 70 years of age and live in the United States and asks whether or not they had received a mammogram during the last year. What is the sample in the study?
The 1512 adult women who are 30 to 70 years of age and live in the United States.
Describe the difference between classical and empirical probability.
The EMPIRICAL method obtains an APPROXIMATE empirical probability of an event by conducting a probability experiment. The classical method of computing probabilities does not require that a probability experiment actually be performed. Rather, it relies on counting techniques, and requires equally likely outcomes.
Determine whether the scatter diagram indicates that a linear relation may exist between the two variables. If the relation is linear, determine whether it indicates a positive or negative association between the variables. Do the two variables have a linear relationship? If the relationship is linear, do the variables have a positive or negative association?
The data points do not have a linear relationship because they do not lie mainly in a straight line. The relationship is not linear.
The data in the table are based on the results of a survey comparing the commute time of adults to their score on a well-being test. Which variable is likely the explanatory variable and which is the response?
The explanatory variable is commute time and the response variable is the well-being score because commute time affects the well-being score.
Explain the circumstances for which the interquartile range is the preferred measure of dispersion. What is an advantage that the standard deviation has over the interquartile range?
The interquartile range is preferred when the data are skewed or have outliers. An advantage of the standard deviation is that it increases as the dispersion of the data increases.
In a national survey of high school students (grades 9 to 12), 25% of the students who responded reported that someone had offered, sold, or given them an illegal drug on school property. Is the value a parameter or a statistic?
The value is a statistic because the respondents who were high school students (grades 9 to 12) are a sample.
Amount of water in a dog's bowl. Is the variable discrete or continuous?
The variable is continuous because it is not countable.
Number of students in a class. Is the variable discrete or continuous?
The variable is discrete because it is countable.
Number of brothers. Is the variable qualitative or quantitative?
The variable is quantitative because it is a numerical measure.
True or false: in the binomial probability distribution function, nCx represents the number of ways of obtaining x successes in n trials.
True
True or false: probability is the measure of the likelihood of a random phenomenon or chance behavior.
True
True or false: inferences based on voluntary response samples are generally not reliable
True because it is often the case that the individuals who volunteer do not accurately represent the population.
True or false: when comparing two populations, the larger the standard deviation, the more dispersion the distribution has, provided that the variable of interest from the populations has the same unit of measure.
True, because the standard deviation describes how far, on average, each observation is from the typical value. A larger standard deviation means that observations are more distant from the typical value, and therefore, more dispersed.
True or false: in a probability model, the sum of the probabilities of all outcomes must equal 1.
True.
True or false: the least-squares regression line always travels through the point (x̄, ȳ).
True.
True or false: Chebyshev's inequality applies to all distributions regardless of shape, but the empirical rule holds only for distributions that are bell shaped.
True. Chebyshev's inequality is less precise than the empirical rule, but will work for any distribution, while the empirical rule only works for bell-shaped distributions.
Suppose 24 cars start at a car race. In how many ways can the top 3 cars finish the race?
Type 24-math-PRB-nPr-enter-3 The number of different top three finishes possible for this race of 24 cars is 12144
Find the value of the permutation 8P7
Type in 8 on calculator-math-PRB-nPr-enter-7 40320
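The calculator keystrokes for factorials and permutations have direct counterparts in Python's standard library. The sketch below assumes Python 3.8 or later, where `math.perm` is available, and reproduces the three counting answers from this set.

```python
from math import factorial, perm

print(factorial(5))  # 5! = 120
print(perm(24, 3))   # 24P3 = 24 * 23 * 22 = 12144
print(perm(8, 7))    # 8P7 = 8! / 1! = 40320
```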
Inferential statistics
Uses methods that generalize results obtained from a sample to the population and measure the reliability of the results.
Suppose a life insurance company sells a $300,000 one year life insurance policy to a 21 year old female for $180. The probability that the female survives the year is 0.999472. Compute and interpret the expected value of this policy to the insurance company.
1) 1-0.999472=0.000528 2) take 180-300,000=-299,820 3) take 180 • 0.999472 = 179.90 4) take -299,820 • 0.000528 = -158.30 5) $179.90 + (-$158.30) = $21.60 The expected value is $21.60. The insurance company expects to make an average profit of $21.60 on every 21-year-old female it insures for 1 year.
In a game of roulette, a player can place a $7 bet on the number 15 and have a 1/38 probability of winning. If the metal ball lands on 15, the player gets to keep the $7 paid to play the game and the player is awarded an additional $245. Otherwise, the player is awarded nothing and the casino takes the player's $7. What is the expected value of the game to the player? If you played 1000 times, how much would you expect to lose?
1) Take 1/38 = 0.026316 for winning 2) Take 1-0.026316 = 0.973684 for losing 3) Take 245 • 0.026316 = 6.45 4) Take -7 • 0.973684 = -6.82 5) take 6.45 + (-6.82) = -0.37 The expected value is -$0.37 6) take 0.37 • 1000 = 370 The player would expect to lose about $370
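Both expected-value cards above follow the same pattern: multiply each payoff by its probability and add the results. Here is a minimal sketch of that pattern; the helper `expected_value` and the variable names are made up for illustration.

```python
def expected_value(outcomes):
    """Sum of value * probability over (value, probability) pairs."""
    return sum(value * prob for value, prob in outcomes)

# Insurance policy: the company keeps $180, or pays out $300,000 after collecting $180.
insurance = expected_value([(180, 0.999472), (180 - 300_000, 1 - 0.999472)])

# Roulette bet: win $245 with probability 1/38, lose $7 otherwise.
roulette = expected_value([(245, 1 / 38), (-7, 37 / 38)])

print(round(insurance, 2))        # 21.6
print(round(roulette, 2))         # -0.37
print(round(1000 * roulette, 2))  # -368.42
```

Note that the unrounded loss over 1000 plays is about $368; the card's $370 comes from rounding the expected value to $0.37 before multiplying.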
Suppose a doctor measures the height, x, and head circumference, y, of 11 children and obtains the data below. The correlation coefficient is .853 and the least squares regression line is y=.161x+13.003. Construct a residual plot to verify the requirements of the least square regression model.
1) open statcrunch-graph-scatterplot-compute
Suppose a doctor measures the height, x, and head circumference, y, of 11 children and obtains the data below. The correlation coefficient is .853 and the least squares regression line is y=.161x+13.003. Compute the coefficient of determination, R^2.
1) open statcrunch-stat-regression-simple linear regression-compute. 72.7%
An insurance company crashed four cars of the same model at 5 miles per hour. The cost of repair for each of the four crashes were $406, $419, $484, and $228. Compute the mean, median, and mode cost of repair.
1) open statcrunch-stat-summary stats-columns-compute. the mean cost of repair: $384.25 the median cost of repair: $412.50 the mode does not exist.
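The same summary can be reproduced outside StatCrunch with Python's standard `statistics` module. This is just a sketch for checking the answers; `multimode` (Python 3.8+) returns every most-frequent value, so getting back all four costs means no value repeats and there is no mode.

```python
from statistics import mean, median, multimode

costs = [406, 419, 484, 228]   # repair costs in dollars from the card

print(mean(costs))             # 384.25
print(median(costs))           # 412.5

modes = multimode(costs)
# Every value occurs exactly once, so the data set has no mode.
print(len(modes) == len(costs))
```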
Suppose you toss a coin 100 times and get 62 heads and 38 tails. Based on these results, what is the probability that the next flip results in a tail?
1) take 38/100. The probability that the next flip results in a tail is approximately .38.
For the month of December in a certain city, 95% of the days are cloudy. Also in the month of December in the same city, 37% of the days are cloudy and foggy. What is the probability that a randomly selected day in December will be foggy if it is cloudy?
1) take cloudy and foggy/cloudy 2) change percents to decimals 3) 0.37/0.95=0.389 The probability is approximately 0.389
For the month of October in a certain city, 79% of the days are cloudy. Also in the month of October in the same city, 72% of the days are cloudy and snowy. What is the probability that a randomly selected day in October will be snowy if it's cloudy?
1) take cloudy and snowy/cloudy 2) change percents to decimals 3) 0.72/0.79=0.911 The probability is approximately 0.911
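The two weather cards above both apply the conditional probability rule P(A | B) = P(A and B) / P(B). A short sketch follows; the function name `conditional_probability` is an illustrative choice.

```python
def conditional_probability(p_a_and_b, p_b):
    """P(A | B) = P(A and B) / P(B)."""
    return p_a_and_b / p_b

# December: P(cloudy and foggy) = 0.37, P(cloudy) = 0.95
print(round(conditional_probability(0.37, 0.95), 3))  # 0.389
# October: P(cloudy and snowy) = 0.72, P(cloudy) = 0.79
print(round(conditional_probability(0.72, 0.79), 3))  # 0.911
```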
Suppose that a single card is selected from a standard 52-card deck. What is the probability that the card drawn is a club? Now suppose that a single card is drawn from a standard 52-card deck, but it is told that the card is black. What is the probability that the card drawn is a club?
1) there are 13 clubs in a deck of 52 cards 2) 13/52=0.25 The probability that the card drawn from a standard 52-card deck is a club is 0.25. 3) there are 26 black cards in a deck of 52 cards. 4) take 13/26=0.5 The probability that the card drawn from a standard 52-card deck is a club, given that it is black, is 0.5.
Explain the meaning of the following percentile. The 10th percentile of the weight of males 36 months of age in a certain city is 11.0kg.
10% of 36 month old males weigh 11.0 kg or less, and 90% of 36 month old males weigh more than 11.0kg.
A bag of tulip bulbs purchased for a nursery contains 25 red tulip bulbs, 20 yellow tulip bulbs, and 55 purple tulip bulbs. a) what is the probability that a randomly selected tulip bulb is red? b) what is the probability that a randomly selected tulip bulb is purple? c) interpret these two probabilities.
a) 25/100: the probability that a randomly selected tulip bulb is red is .25 b) 55/100: the probability that a randomly selected tulip bulb is purple is .55 c) multiply both probabilities by 100: if 100 tulip bulbs were sampled with replacement, one could expect ABOUT 25 of the bulbs to be red and ABOUT 55 of the bulbs to be purple.
What is a designed experiment?
A designed experiment is when a researcher assigns individuals to a certain group, intentionally changing the value of an explanatory variable, and then recording the value of the response variable for each group.
What is meant by marginal distribution?
A marginal distribution is a frequency or relative frequency distribution of either the row or column variable in a contingency table.
Define simple random sampling
A sample of size n from a population of size N is obtained through simple random sampling if every possible sample of size n has an equally likely chance of occurring. The sample is then called a simple random sample.
Combination
An arrangement of r objects chosen from n distinct objects without repetition and without regard to order.
Describe what an unusual event is. Should the same cutoff always be used to identify unusual events? Why or why not?
An event is unusual if it has a LOW probability of occurring. The same cutoff should NOT always be used to identify unusual events. Selecting a cutoff is subjective and should take into account the consequences of incorrectly identifying an event as unusual.
What is meant by confounding?
Confounding in a study occurs when the effects of two or more explanatory variables are not separated. Therefore, any relation that may exist between an explanatory variable and the response variable may be due to some other variable or variables not accounted for in the study.
Is the time required to download a file from the internet discrete or continuous?
Continuous. The possible values are t>0
A radio station asks its listeners to call in their opinion regarding the use of pesticides in residential areas. What type of sampling is used?
Convenience
To help assess student learning in her music appreciation courses, a music professor at a community college implemented pre- and post-tests for her music appreciation students. A knowledge-gained score was obtained by taking the difference of the two test scores. What is the response variable in this experiment?
Difference in test scores
A university conducted a survey of 365 undergraduate students regarding satisfaction with student government. Results of the survey are shown in the table by class rank. a) If a survey participant is selected at random, what is the probability that he or she is satisfied with student government?
a) Divide total satisfied/total: 223/365. P(satisfied)= .611 b) Divide total juniors/total: 83/365. P(junior)= .227 c) Divide satisfied juniors/total: 61/365. P(satisfied and junior)= .167 d) Satisfied/total + junior/total - satisfied juniors/total: 223/365+83/365-61/365 = .611+.227-.167. P(satisfied or junior)= .671
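The last step in the survey card is the general addition rule, P(E or F) = P(E) + P(F) - P(E and F). The sketch below redoes it with exact fractions to avoid rounding drift; the counts are the ones used in the card's answer, and the variable names are only for illustration.

```python
from fractions import Fraction

total, satisfied, junior, both = 365, 223, 83, 61

p_satisfied = Fraction(satisfied, total)
p_junior = Fraction(junior, total)
p_both = Fraction(both, total)

# General addition rule: subtract the overlap so satisfied juniors are not counted twice.
p_or = p_satisfied + p_junior - p_both
print(p_or, round(float(p_or), 3))  # 49/73 0.671
```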
A data set is given. Draw a scatter diagram. Comment on the type of relation that appears to exist between x and y. Choose the correct graph. Given that x̄=3.5000, sx=2.3452, ȳ=3.9000, sy=1.7675, and r=-.9505, determine the least-squares regression line.
a. open in statcrunch-graph-scatter plot-compute. Choose graph. There appears to be a linear, negative relationship. b. open in statcrunch-stat-regression-simple linear-compute. y=-.716x+6.407 c. choose same graph but with a line.
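The regression line in part b can also be recovered by hand from the summary statistics, using slope = r(sy/sx) and intercept = ȳ - slope(x̄). The sketch below assumes the five numbers in the card are the sample means and standard deviations of x and y plus the correlation coefficient; `least_squares_line` is a hypothetical helper name.

```python
def least_squares_line(x_bar, s_x, y_bar, s_y, r):
    """Return (slope, intercept) of the least-squares line from summary statistics."""
    slope = r * s_y / s_x
    intercept = y_bar - slope * x_bar   # the line passes through (x_bar, y_bar)
    return slope, intercept

slope, intercept = least_squares_line(3.5, 2.3452, 3.9, 1.7675, -0.9505)
print(round(slope, 3), round(intercept, 3))  # -0.716 6.407
```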
Is there an association between party affiliation and gender? The accompanying data represents the gender and party affiliation of registered voters based on a random sample of 812 adults. c. What proportion of registered voters consider themselves to be independent? d. Construct a conditional distribution of party affiliation by gender.
c. the total in the independent column from b. d. take whole number in each column/row and divide by total of each column/row.
For the histogram on the right determine whether the mean is greater than, less than, or approximately equal to the median.
x̄<M because the histogram is skewed left
True or false: when obtaining a stratified sample, the number of individuals included within each stratum must be equal.
False. Within stratified samples, the number of individuals sampled from each stratum should be proportional to the size of the stratum in the population.
A probability experiment is conducted in which the sample space of the experiment is S={12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23}. Let event E={13, 14, 15, 16, 17, 18} and F={17, 18, 19, 20}. List the outcomes in E and F. Are E and F mutually exclusive?
List numbers that both E and F have in common. {17, 18} No, E and F have outcomes in common.
Outside a home, there is a 4-key keypad numbered 1 through 4. The correct four-digit code will open the garage door. The numbers can be repeated in the code. a) How many codes are possible? b) What is the probability of entering the correct code on the first try, assuming that the owner doesn't remember the code?
1) take keypad number to the power of the number of digit code 2) 4•4•4•4=256 a) the number of possible codes is 256 3) take 1/256 b) the probability that the correct code is given on the first try assuming that the owner doesn't remember the code is 1/256
Compute the range and sample standard deviation for strength of the concrete (in psi). 3980, 4120, 3500, 3000, 2960, 3830, 4120, 4040
1) open in statcrunch-stat-summary stats-columns-compute Range= 1160 S= 484.2
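For a check that does not depend on StatCrunch, the range and sample standard deviation can be computed with Python's standard library. `statistics.stdev` divides by n - 1, which matches the sample standard deviation asked for in the card.

```python
from statistics import stdev

strengths = [3980, 4120, 3500, 3000, 2960, 3830, 4120, 4040]  # psi

data_range = max(strengths) - min(strengths)   # largest minus smallest
s = stdev(strengths)                           # sample standard deviation (n - 1)

print(data_range)   # 1160
print(round(s, 1))  # 484.2
```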
Factor
A variable whose effect on the response variable is to be assessed by the experimenter.
In a recent survey, it was found that the median income of families in country A was $57,000. What is the probability that a randomly selected family has an income greater than $57,000?
0.5 the median is the middle so half of the families in country A earn more than $57,000 and half of the families earn $57,000 or less.
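What is the probability of obtaining ten heads in a row when flipping a coin? Interpret this probability.
1 head = 0.5 0.5•0.5•0.5•0.5•0.5•0.5•0.5•0.5•0.5•0.5= 0.00098 If ten coins were flipped over and over, all ten would come up heads about once in every 1,024 sets of flips.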
Suppose Nate loses 29% of all ping-pong games. a) What is the probability that Nate loses two ping-pong games? b) What is the probability that Nate loses 6 games in a row? c) when events are independent, their complements are independent as well. Use this result to determine the probability that Nate loses six ping-pong games in a row, but does not lose seven in a row?
1) 0.29•0.29=0.0841 a) 0.0841 2) 0.29•0.29•0.29•0.29•0.29•0.29=0.0006 b) 0.0006 3) 1-0.29=0.71 4) 0.71•0.0006=0.0004 c) 0.0004
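Because the games are treated as independent, each part of the ping-pong card is a product of probabilities. The sketch below recomputes all three parts without intermediate rounding; the variable names are only illustrative.

```python
p_lose = 0.29
p_win = 1 - p_lose

two_losses = p_lose ** 2              # a) loses two games
six_losses = p_lose ** 6              # b) loses six games in a row
six_not_seven = six_losses * p_win    # c) loses six in a row, then wins the seventh

print(round(two_losses, 4), round(six_losses, 4), round(six_not_seven, 4))
```

Rounded to four decimal places, the three results are 0.0841, 0.0006, and 0.0004.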
Suppose that E and F are two events and that P(E and F)=0.3 and P(E)=0.4. What is P(F| E)?
1) P(F| E) = P(E and F)/P(E) = 0.3/0.4 = 0.75 P(F| E)= 0.75
Among 42 to 47 year olds, 33% say they have driven a car while under the influence of alcohol. Suppose four 42 to 47 year olds are selected at random. a) what is the probability that all four have driven a car while under the influence of alcohol? b) what is the probability that at least one has not driven a car while under the influence of alcohol? c) what is the probability that none of the four have driven a car while under the influence of alcohol? d) what is the probability that at least one has driven a car while under the influence of alcohol?
1) 0.33 • 0.33 • 0.33 • 0.33 = 0.0119 a) 0.0119 2) 1-0.0119=0.9881 b) 0.9881 3) 1-0.33=0.67 4) 0.67•0.67•0.67•0.67=0.2015 c) 0.2015 5) 1-0.2015=0.7985 d) 0.7985
Suppose that E and F are two events and that P(E)=0.2 and P(F|E)=0.8. What is P(E and F)?
1) 0.8= P(E and F)/0.2. Solve for P(E and F) 2.) 0.8•0.2=0.16 P(E and F)=0.16
That data in the table are based on the results of a survey comparing the commute time of adults to their score on a well-being test. Determine the correlation coefficient. Does a linear relationship exist between the commute time and well-being index score?
1) Open in stat crunch-stat-regression-simple linear-commute. -.984 Yes, there appears to be a negative linear association because r is negative and is less than the negative of the critical value.
About 20% of the population of a large country is hopelessly romantic. a) If two people are randomly selected, what is the probability both are hopelessly romantic? b) What is the probability at least one is hopelessly romantic?
1) Change percent to decimal: 0.20. P(E)=0.20 P(F)=0.20 2) P(E and F)=0.20•0.20=0.04 a) 0.04 P(at least one)=1-P(neither). 1) 1-0.20=0.80 2) 0.80•0.80=0.64 3) 1-0.64=0.36 b) 0.36
Suppose you just received a shipment of 15 televisions. Four of the televisions are defective. If two televisions are randomly selected, compute the probability that both televisions work. What is the probability at least one of the two televisions does NOT work?
1) take working TVs/total TVs=P(first works) 2) 11/15 = one works 3) now there are 14 TVs left and 10 of them work (4 are still defective). 4) take number of working TVs/remaining TVs 5) 10/14 = another works 6) 11/15 • 10/14 = 0.524 The probability that both TVs work is 0.524 7) 1-0.524=0.476 The probability that at least one of the two TVs does not work is 0.476
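Because the televisions are drawn without replacement, the second probability depends on the first draw. A small Python sketch with exact fractions (names are illustrative, counts are from the question):

```python
from fractions import Fraction

# Sketch: multiplication rule for dependent events (sampling without replacement).
total, defective = 15, 4
working = total - defective   # 11 working TVs

# P(first works) * P(second works | first works)
p_both_work = Fraction(working, total) * Fraction(working - 1, total - 1)
p_at_least_one_defective = 1 - p_both_work   # complement rule

print(p_both_work, round(float(p_both_work), 3))
print(p_at_least_one_defective, round(float(p_at_least_one_defective), 3))
```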
Suppose that a single card is selected from a standard 52-card deck. What is the probability that the card drawn is an ace? Now suppose that a single card is drawn from a standard 52-card deck, but it is told that the card is plain (any card EXCEPT jack, queen, or king). What is the probability that the card drawn is an ace?
1) there are 4 aces in a deck of 52 cards 2) take 4/52=0.077 The probability that the card drawn from a standard 52-card deck is an ace is 0.077. 3) there are 12 courts in a deck of 52 cards 4) take 52-12=40 plain cards 5) there are 4 aces among the plain cards 6) take 4/40=0.1 The probability that the card drawn from a standard 52-card deck is an ace, given that this card is plain, is 0.1
Suppose that a single card is selected from a standard 52-card deck. What is the probability that the card drawn is a king? Now suppose that a single card is drawn from a standard 52-card deck, but it is told that the card is court (jack, queen, king). What is the probability that the card drawn is a king?
1) there are 4 kings in a deck of 52 cards 2) take 4/52=0.077 The probability that the card drawn from a standard 52-card deck is a king is 0.077. 3) there are 12 courts in a deck of 52 cards 4) there are 4 kings among the 12 courts 5) take 4/12=0.333 The probability that the card drawn from a standard 52-card deck is a king, given that this card is court, is 0.333.
A probability experiment is conducted in which the sample space of the experiment is S={3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14}, event E={4, 5, 6, 7} and event G={8, 9, 10, 11}. Assume that each outcome is equally likely. List the outcomes in E and G. Are E and G mutually exclusive?
E and G have no numbers in common so E and G={} Yes because E and G have no outcomes in common.
A standard deck of cards contains 52 cards. One card is selected from the deck. a) compute the probability of randomly selecting a club or spade. b) compute the probability of randomly selecting a club or spade or diamond. c) compute the probability of randomly selecting a four or a club.
13 spades and 13 clubs in a deck of cards. 13/52=.25 13/52=.25 a) P(spade or club)=.25+.25=.5 13 diamonds in a deck of cards. 13/52=.25 P(spade or club or diamond)=.25+.25+.25=.75 b) P(spade or club or diamond)=.75 4 fours in a deck, 1 four of clubs in a deck. 4/52=.077 1/52=.019 P(four or club)=.077+.25-.019=.308 c) P(four or club)=.308
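Parts a) and b) add probabilities of mutually exclusive events, while part c) needs the general addition rule because the four of clubs would otherwise be counted twice. A Python sketch with exact fractions (names are illustrative):

```python
from fractions import Fraction

# Sketch: addition rule for a standard 52-card deck.
deck = 52
p_suit = Fraction(13, deck)          # any single suit
p_four = Fraction(4, deck)           # any four
p_four_of_clubs = Fraction(1, deck)  # overlap between "four" and "club"

p_club_or_spade = p_suit + p_suit                   # mutually exclusive
p_club_spade_or_diamond = p_suit + p_suit + p_suit  # mutually exclusive
p_four_or_club = p_four + p_suit - p_four_of_clubs  # general addition rule

print(p_club_or_spade, p_club_spade_or_diamond, round(float(p_four_or_club), 3))
```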
Suppose a doctor measures the height, x, and head circumference, y, of 11 children and obtains the data below. The correlation coefficient is .853 and the least-squares regression line is y=.161x+13.003 Approximately ___% of the variation in __________ is explained by the least-squares regression model. According to the residual plot, the linear model appears to be _______
72.8 (coefficient of determination, r^2=.853^2=.728) Head circumference Appropriate
Experimental unit
A person, object, or some other well-defined item upon which a treatment is applied.
What is a residual? What does it mean when a residual is positive?
A residual is the difference between an observed value of the response variable y and the predicted value of y. If it is positive, then the observed value is greater than the predicted value.
Permutation
An ordered arrangement of r objects chosen from n distinct objects without repetition.
An experiment in probability is
Any process that can be repeated in which the results are uncertain.
Describe how the value of n affects the shape of the binomial probability histogram
As n increases, the binomial distribution becomes more bell shaped.
State the criteria for a binomial probability experiment
Each trial has two possible mutually exclusive outcomes: success and failure. The probability of success, p, remains constant for each trial of the experiment. The trials are independent. The experiment consists of a fixed number, n, of trials.
The notation P(F|E) means the probability of event ________ given event ________.
F, E
Quartiles
Divide data sets in fourths
Cluster sample
Is obtained by dividing the population into groups and selecting all individuals from within a random sample of the groups.
A man has 8 shirts and two ties. Assuming they all match, how many different shirt and tie combinations can he wear?
Multiply: 8•2=16
You suspect a 6-sided die to be loaded and conduct a probability experiment by rolling the die 400 times. The outcome of the experiment is listed in the following table. Do you think the die is loaded? Why?
No because each value has an approximately equal chance of occurring.
Suppose that two variables, X and Y, are negatively associated. Does this mean that above-average values of X will always be associated with below-average values of Y? Explain.
No, because association does not mean that every point fits the trend. The negative association only means that above-average values of X are generally associated with below-average values of Y.
According to a center for disease control, the probability that a randomly selected person has hearing problems is 0.148. The probability that a randomly selected person has vision problems is 0.099. Can we compute the probability of randomly selecting a person who has hearing problems or vision problems by adding these probabilities? Why or why not?
No, because hearing and vision problems are not mutually exclusive. So some people have both hearing and vision problems. These people would be included twice in the probability.
If P(E)=0.60, P(E or F)=0.75, and P(E and F)=0.20, find P(F).
P(E or F)=P(E)+P(F)-P(E and F) 0.75=0.60+P(F)-0.20. Solve for P(F) P(F)=0.35
Find the probability of the indicated event if P(E)=0.25 and P(F)=0.35. Find P(E or F) if P(E and F)=0.20
P(E or F)=P(E)+P(F)-P(E and F) P(E or F)=0.25+0.35-0.20 P(E or F)=0.40
Violent crimes include rape, robbery, assault, and homicide. The following is a summary of the violent crime rate (violent crimes per 100,000 population) for all states of a country in a certain year. Q1=272.8 Q2=387.4 Q3=528.3 Do you believe that the distribution of violent crime rates is skewed or symmetric?
The distribution of violent crime rates is skewed right.
Why is the median resistant, but the mean is not?
The mean is not resistant because when data are skewed, there are extreme values in the tail, which tend to pull the mean in the direction of the tail. The median is resistant because the median of a variable is the value that lies in the middle of the data when arranged in ascending order and does not depend on the extreme values of the data.
In a certain game, the probability that a player is dealt a particular hand is 0.27. Explain what this probability means. If you play this card game 100 times, will you be dealt this hand exactly 27 times? Why or why not?
The probability 0.27 means that approximately 27 out of every 100 dealt hands will be that particular hand. No, you will not be dealt this hand exactly 27 times since the probability refers to what is expected in the long term, not short term.
What are the advantages of having a presurvey with open questions to assist in constructing a questionnaire that has closed questions? An open question allows the respondent to choose his or her response: What is the most important problem facing America's youth today? A closed question requires the respondent to choose from a list of predetermined responses: What is the most important problem facing America's youth today? a) drugs b) violence c) single-parent homes d) promiscuity e) peer pressure
The researcher can learn common answers
What does it mean to say that two variables are negatively associated?
There is a linear relationship between the variables, and whenever the value of one variable increases, the value of the other variable decreases.
What does it mean to say that two variables are positively associated?
There is a linear relationship between the variables, and whenever the value of one variable increases, the value of the other variable increases.
Find the value of 10C4
Type 10 on calculator-math-PRB-nCr-enter-4 210
How many different simple random samples of size 4 can be obtained from a population whose size is 44?
Type 44-math-PRB-nCr-4-enter The number of simple random samples which can be obtained is 135751
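The calculator's nCr key counts combinations, nCr = n!/(r!(n-r)!). As a cross-check outside the calculator, Python's standard library offers `math.comb` (Python 3.8 or later):

```python
from math import comb

# Sketch: combinations nCr using the standard library.
c_10_4 = comb(10, 4)   # 10C4
c_44_4 = comb(44, 4)   # number of simple random samples of size 4 from 44

print(c_10_4, c_44_4)
```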
According to an airline, flights on a certain route are on time 80% of the time. Suppose 20 flights are randomly selected and the number of on-time flights is recorded. a) Explain why this is a binomial experiment.
a) The probability of success is the same for each trial. The experiment is performed a fixed number of times. The trials are independent. There are two mutually exclusive outcomes, success or failure.
In the probability distribution to the right, the random variable x represents the number of marriages an individual aged 15 years or older has been involved in.
a) This is a discrete probability distribution because all of the probabilities are between 0 and 1, inclusive, and the sum of the probabilities is 1. b) Open in statcrunch-graph-stat-calculators-custom-compute. Choose graph. The distribution has one mode and is skewed right. Mean is 0.92 c) If many individuals aged 15 years or older were surveyed, one would expect the mean number of marriages to be the mean of the random variable. d) Look at graph data. Standard deviation is 0.8 e) Look at original data. 0.119 f) Add P(2)+P(3)+P(4)+P(5): 0.119+0.031+0.004+0.001=0.155
Is there an association between party affiliation and gender? The accompanying data represent the gender and party affiliation of registered voters based on a random sample of 812 adults. e. Draw a graph. f. Is gender associated with party affiliation?
e. Choose graph f. Yes, gender is associated with party affiliation. Males are more likely to be independents and less likely to be democrats.
According to an airline, flights on a certain route are on time 80% of the time. Suppose 20 flights are randomly selected and the number of on-time flights is recorded. d) Find and interpret the probability that AT LEAST 14 flights are on time.
n=20 p=.80 x=14 At least means P(x>=14) Compute the probabilities for 14 through 20, then add. The probability that at least 14 flights are on time is 0.9133 Take .9133 • 100 In 100 trials of this experiment, it is expected about 91 to result in at least 14 flights being on time.
According to an airline, flights on a certain route are on time 80% of the time. Suppose 20 flights are randomly selected and the number of on-time flights is recorded. Find and interpret the probability that between 12 and 14 flights, inclusive, are on time.
n=20 p=.80 Between 12 and 14 means P(12<=x<=14) Compute the probabilities for 12 through 14, then add. The probability that between 12 and 14 flights, inclusive, are on time is 0.1858 Take .1858 • 100 In 100 trials of this experiment, it is expected about 19 to result in between 12 and 14 flights being on time.
According to an airline, flights on a certain route are on time 80% of the time. Suppose 20 flights are randomly selected and the number of on-time flights is recorded. b) Find and interpret the probability that EXACTLY 14 flights are on time. In 100 trials of this experiment, it is expected about ____ to result in exactly 14 flights being on time.
n=20 p=.80 x=14 Exactly means P(X=14) 20C14 (.80)^14 (1-.80)^(20-14) (38760)(.0439804651)(.000064)=0.1091 The probability that exactly 14 flights are on time is 0.1091 Take 0.1091 • 100 = 10.91 In 100 trials of this experiment, it is expected about 11 to result in exactly 14 flights being on time.
According to an airline, flights on a certain route are on time 80% of the time. Suppose 20 flights are randomly selected and the number of on-time flights is recorded. c) Find and interpret the probability that FEWER than 14 flights are on time.
n=20 p=.80 x=14 Fewer means P(x<14) Compute the probabilities for 0 through 13, then add. The probability that fewer than 14 flights are on time is 0.0867 Take .0867 • 100 In 100 trials of this experiment, it is expected about 9 to result in fewer than 14 flights being on time.
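The four airline cards all use the binomial formula P(X=k) = nCk • p^k • (1-p)^(n-k) with n=20 and p=0.80. A self-contained Python sketch that reproduces the four probabilities; `binom_pmf` is a helper written for this illustration, not a library function:

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 20, 0.80   # values from the question

p_exactly_14 = binom_pmf(14, n, p)
p_fewer_14 = sum(binom_pmf(k, n, p) for k in range(0, 14))   # k = 0..13
p_at_least_14 = 1 - p_fewer_14                               # complement of "fewer than 14"
p_12_to_14 = sum(binom_pmf(k, n, p) for k in range(12, 15))  # k = 12, 13, 14

for label, value in [("exactly 14", p_exactly_14), ("fewer than 14", p_fewer_14),
                     ("at least 14", p_at_least_14), ("12 to 14", p_12_to_14)]:
    print(label, round(value, 4))
```

Multiplying each probability by 100 gives the expected count in 100 repetitions of the experiment, as in the interpretations on the cards.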
In a certain city, the average 20 to 29 year old man is 69.8 inches tall, with a standard deviation of 2.2 inches, while the average 20 to 29 year old woman is 64.5 inches tall, with a standard deviation of 2.9 inches. Who is relatively taller, a 75 inch man or a 70 inch woman?
z=(x-μ)/σ Men x=75 μ=69.8 σ=2.2 (75-69.8)/2.2=2.36 Women x=70 μ=64.5 σ=2.9 (70-64.5)/2.9=1.90 The z-score for the man, 2.36, is larger than the z-score for the woman, 1.90, so he is relatively taller.
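A z-score measures how many standard deviations a value lies from its mean, z = (x - μ)/σ. A minimal Python sketch using the means and standard deviations stated in the question; the helper name `z_score` is an assumption for illustration:

```python
def z_score(x, mean, sd):
    """Number of standard deviations x lies above (or below) the mean."""
    return (x - mean) / sd

z_man = z_score(75, 69.8, 2.2)     # 75 inch man
z_woman = z_score(70, 64.5, 2.9)   # 70 inch woman

print(round(z_man, 2), round(z_woman, 2))
print("man" if z_man > z_woman else "woman", "is relatively taller")
```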
Violent crimes include rape, robbery, assault, and homicide. The following is a summary of the violent crime rate (violent crimes per 100,000 population) for all states of a country in a certain year. Q1=272.8 Q2=387.4 Q3=528.3 Provide interpretation of these results.
25% of the states have a violent crime rate that is 272.8 crimes per 100,000 population or less. 50% of the states have a violent crime rate that is 387.4 crimes per 100,000 population or less. 75% of the states have a violent crime rate that is 528.3 crimes per 100,000 population or less.
An experiment was conducted in which two fair dice were thrown 100 times. The sum of the pips showing on the dice was then recorded. The frequency histogram to the right gives the results. What was the most frequent outcome of the experiment? What was the least frequent? How many times did we observe 8? How many more 4's were observed than 10's? Determine the percentage of time a 6 was observed. Describe the shape of the distribution.
Most frequent: 7 Least frequent: 2 Times 8 was observed: 16 More 4's than 10's: 1 Percentage of 6's: 16/100=16% Shape: Bell shaped
To predict future enrollment in a school district, 50 households within the district were sampled, and asked to disclose the number of children under the age of five living in the household. The results of the survey are presented in the table. Construct a relative frequency distribution of the data.
1) Add up the total number of households: 17+13+14+4+2=50 2) Go to statcrunch-graph-bar plot-with summary. 3) Don't use decimals to answer percentage questions.
Violent crimes include rape, robbery, assault, and homicide. The following is a summary of the violent crime rate (violent crimes per 100,000 population) for all states of a country in a certain year. Q1=272.8 Q2=387.4 Q3=528.3 The violent crime rate in a certain state of the country in that year was 1,467. Would that be an outlier?
1) find upper fence: Q3+1.5(IQR) 528.3+1.5(255.5)=911.55 2) find lower fence: Q1-1.5(IQR) 272.8-1.5(255.5)=-110.45 1467 is greater than 911.55 so it's an outlier. Yes, because it is greater than the upper fence.
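The fence calculation can be sketched in Python. The quartiles and the observed rate come from the question; the variable names are illustrative:

```python
# Sketch: outlier check with the 1.5 * IQR fences.
q1, q3 = 272.8, 528.3
observed = 1467

iqr = q3 - q1                   # interquartile range
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# A value is flagged as an outlier when it falls outside the fences.
is_outlier = observed < lower_fence or observed > upper_fence

print(round(iqr, 2), round(lower_fence, 2), round(upper_fence, 2), is_outlier)
```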
The following data represent the dividend yields (in percent) of a random sample of 28 publicly traded stocks. Compute the five number summary. Draw a box plot of the data. Determine the shape of the distribution from the box plot.
1) open in statcrunch-graph-boxplot-var1-draw boxes horizontally-compute. 0, .24, .89, 2.39, 3.54
A polling organization contacts 1512 adult women who are 30 to 70 years of age and live in the United States and asks whether or not they had received a mammogram during the last year. What is the population in the study?
Adult women who are 30 to 70 years of age and live in the United States.
A quality control manager randomly selects 40 bottles of apple juice that were filled on December 22 to assess the calibration of the filling machine. What is the population in the study?
All the bottles of apple juice produced in the plant on December 22.
What is an observational study?
An observational study measures the value of the response variable without attempting to influence the value of either the response or explanatory variables.
Treatment
Any combination of the values of the factors (explanatory variables)
The following data represent the dividend yields (in percent) of a random sample of 28 publicly traded stocks. Five number summary: 0, .24, .89, 2.39, 3.54 Draw a box plot of the data. Determine the shape of the distribution from the box plot.
Make sure to count the horizontal line for boxplot. Skewed right.
To help assess student learning in her music appreciation courses, a music professor at a community college implemented pre- and post-tests for her music appreciation students. A knowledge-gained score was obtained by taking the difference of the two test scores. What is the treatment?
Music appreciation course
A histogram of a set of data indicates that the distribution of the data is skewed right. Which measure of central tendency will likely be larger, the mean or the median? Why?
The mean will likely be larger because the extreme values in the right tail tend to pull the mean in the direction of the tail.
The survey has bias. A polling organization conducts a study to estimate the percentage of households that place an emphasis on physical activity. It mails a questionnaire to 1709 randomly selected households across the country and asks the head of each household if he or she places an emphasis on physical activity. Of the 1709 households selected, 28 responded. Which bias is present?
Nonresponse
The survey has bias. A polling organization conducts a study to estimate the percentage of households that place an emphasis on physical activity. It mails a questionnaire to 1709 randomly selected households across the country and asks the head of each household if he or she places an emphasis on physical activity. Of the 1709 households selected, 28 responded. How can the bias be remedied?
The polling organization should try contacting households that do not respond by phone or face-to-face.