semester 1


how far a particular score is above or below the mean, measured in standard deviations

a standard score describes

when corn prices are above average, soybean prices also tend to be above average

an agricultural economist says that the correlation between corn prices and soybean prices is r = 0.7. this means that

to model a distribution with a continuous function so you can take any interval and the area under the curve will equal the area of the bars in a histogram

what is the purpose of a density curve

.159

birth weights at a local hospital have a normal distribution with a mean of 110 oz. and a standard deviation of 15 oz. the proportion of infants with birth weights under 95 oz is about
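A small sketch of the calculation behind this answer, using Python's standard library instead of a normal table. The variable names are illustrative only.

```python
from statistics import NormalDist

# Birth weights: mean 110 oz, standard deviation 15 oz (from the question).
weights = NormalDist(mu=110, sigma=15)

# Proportion of infants under 95 oz, i.e. P(X < 95), which is P(Z < -1).
p_under_95 = weights.cdf(95)
print(round(p_under_95, 3))
```

The same `cdf` call with `NormalDist()` (mean 0, standard deviation 1) can stand in for the standard normal table lookups elsewhere in this set.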

yes

can outliers in the horizontal direction that don't fall in the overall pattern of the plot be influential?

III only

Which of the following statements about outliers are true? I. an influential point always has a high residual II. outliers are always influential points III. removing an influential point always causes a marked change in either the conclusion, the regression equation, or both

any variables whose effect on the response variable cannot be separated from each other

confounding variables are

- independent - P(AnB) = 1/2 x 4/6 = 1/3 - P(AuB) = 1/2 + 4/6 - 1/3 = 5/6

suppose we roll two six-sided dice, one red and one green. let A be the event that the number of spots showing on the red die is three or less and B be the event that the number of spots showing on the green die is three or more - the events A and B are - P(AnB)= - P(AuB)=
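One way to check these answers is to list all 36 equally likely (red, green) outcomes and count. This is a sketch; the helper `prob` is a made-up name, not something from the course.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes as (red, green) pairs.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a true/false function of one outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def in_a(o):
    return o[0] <= 3   # red die shows three or less

def in_b(o):
    return o[1] >= 3   # green die shows three or more

p_a = prob(in_a)
p_b = prob(in_b)
p_a_and_b = prob(lambda o: in_a(o) and in_b(o))
p_a_or_b = prob(lambda o: in_a(o) or in_b(o))

# A and B are independent exactly when P(A and B) = P(A) * P(B).
independent = p_a_and_b == p_a * p_b
print(p_a_and_b, p_a_or_b, independent)
```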

eggs = -142.74 + 39.25 (length)

the equation of the lsrl is

skewed left, centered at 4.5, with some high outliers

the histogram below shows the length in minutes of 140 songs recorded by the band Wilco. the count for each bar is shown at the top of the bar. what description best fits this distribution? (shape, center, outliers??)

.8643

using the standard normal distribution tables, the area under the standard normal curve corresponding to Z < 1.1 is

P(A or B) = 1.0

Event A occurs with probability 0.2. Event B occurs with probability 0.8. If A and B are disjoint (mutually exclusive) then

7.5

the 5 number summary of the distribution of 316 scores on a statistics exam is: 0, 26, 31, 36, 50. the scores are approximately normal. the standard deviation of test scores must be about

- the median jump is between 80 and 85 inches - 82 inches - 55%

- for these data, the median is.. - the mean of this histogram is approximately - based on the histogram, the percentage of the winning jumps that were at least 80 inches is about

it must be 20

the ages of people in a college class are as follows: age 18 19 20 21 22 23 24 25 32 number of students 14 120 200 200 90 30 10 2 1 what is true about the median age?

- 3 minutes 45 seconds - 20% of the songs - 3 minutes - skewed right

- approximately what is the median rolling stones song length? - approximately what percent of rolling stones songs are over 5 minutes in length? - approximately what is the 20th percentile of rolling stones song lengths? - what is the shape of this distribution?

- z score = .42; the student who had a self concept score of 62 scored .42 standard deviations above the mean - skewed left

- calculate and interpret the z score for the student that had a self concept score of 62 - on the graph above, make a rough sketch of a density curve for these data. how would you describe the shape of this density curve?

neither the volunteers nor the psychiatrist knew which treatment any person had received

100 volunteers who suffer from severe depression are available for a study. 50 are selected at random and are given a new drug that is thought to be particularly effective in treating severe depression. the other 50 are given an existing drug for treating severe depression. a psychiatrist evaluates the symptoms of all volunteers after four weeks in order to determine if there has been substantial improvement in the severity of their depression. the study would be double blind if

let 01-20 = football players and 21-57 = all other students. read off 2 digit numbers, ignoring 00 and any numbers above 57, until 3 different numbers/students are chosen. repeat 30 times, keeping track of the proportion of times all 3 students are football players. record data and calculate. 2 out of 30 times resulted in all 3 students being football players. this is about 7% of the time. therefore, based on the simulation, there is only about a 7% chance that all 3 dorm rooms would go to football players.

57 students participated in a lottery for the good dorm rooms. 20 of them are football players. when all 3 winners were football players, the other students cried foul. use simulation to determine whether an all team outcome could reasonably be expected to happen. based on your simulation results what would be the probability that all 3 winners are football players?
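The simulation can also be sketched in code. This version swaps the random digit table for Python's pseudo-random generator and runs many more than 30 repetitions; the function name, the seed, and the number of trials are all illustrative choices. The exact probability is included for comparison.

```python
import random
from math import comb

def simulate_all_football(trials, seed=0):
    """Estimate P(all 3 winners are football players) by simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Students are labeled 1..57; labels 1..20 are the football players.
        winners = rng.sample(range(1, 58), 3)
        if all(w <= 20 for w in winners):
            hits += 1
    return hits / trials

estimate = simulate_all_football(100_000)

# Exact value: choose 3 of the 20 players out of all ways to choose 3 of 57.
exact = comb(20, 3) / comb(57, 3)
print(round(estimate, 3), round(exact, 3))
```

With only 30 repetitions the estimate bounces around a lot, which is why a hand simulation can land a few percentage points away from the exact value of about 4%.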

- 79.11 cm - 2.47 cm

For children between the ages of 18 months and 29 months, there is an approximately linear relationship between height and age. The relationship can be represented by yˆ = 64.93 + 0.63x, where y represents height (in centimeters) and x represents age (in months). Joseph is 22.5 months old. -What is his predicted height? -Loretta is 20 months old and is 80 cm tall. what is her residual?
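A short sketch of both calculations. The regression equation comes from the question; the function name is made up for illustration. A residual is the observed value minus the predicted value.

```python
def predicted_height(age_months):
    """Predicted height in cm from the given line: y-hat = 64.93 + 0.63x."""
    return 64.93 + 0.63 * age_months

# Joseph is 22.5 months old.
joseph_predicted = predicted_height(22.5)

# Loretta is 20 months old and 80 cm tall: residual = observed - predicted.
loretta_residual = 80 - predicted_height(20)

print(round(joseph_predicted, 2), round(loretta_residual, 2))
```

The residual is positive, so Loretta is taller than the line predicts for her age.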

5

IQs among undergraduates at mountain tech are approximately normally distributed. the mean undergraduate IQ is 110. about 95% of undergraduates have IQs between 100 and 120. the standard deviation of these IQs is about...

- increase by $3,000 - increase by $3,000 - be unchanged - be unchanged - be unchanged

a sample was taken of the salaries of 20 employees of a large company. suppose each employee in the company receives a $3,000 raise for next year (each employee's salary is increased by $3,000). the mean salary for the employees will... the median salary for the employees working for the company will... the standard deviation of the salaries for the employees will... the interquartile range of the salaries for the employees will... the z - scores of the salaries for the employees will...

- the range of the middle half of the salaries is about $20,000 - 28, 39, 48, 60.5, 77

a sample was taken of the salaries of 20 employees of a large company. the following box plot shows the salaries for this year. - what is the range - what is the 5 number summary

5.18

a soft drink machine can be regulated so that it discharges an average of μ oz per cup. if the ounces of fill are normally distributed with a standard deviation of .4 oz, what value should μ be set at so that 98% of 6 oz cups will not overflow?
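A sketch of the inverse-normal step, using the standard library in place of a z table. The idea is to find the z score with 98% of the area below it, then place the mean that many standard deviations below 6 oz. Variable names are illustrative.

```python
from statistics import NormalDist

cup_size = 6.0   # oz
fill_sd = 0.4    # oz

# z score with 98% of the standard normal area below it.
z_98 = NormalDist().inv_cdf(0.98)

# Put the cup size z_98 standard deviations above the mean fill.
mean_setting = cup_size - z_98 * fill_sd
print(round(mean_setting, 2))
```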

sampling frame (the "list" you're choosing from): the telephone directory. sample: the 850 people who respond

a candidate for mayor of Dallas calls 1,000 people chosen at random from the city telephone directory; 850 of them respond. What are the sampling frame and the sample in this example?

undercoverage and nonresponse are present: visits during weekday working hours reach only people who are home at that time, and those people tend to have more free time to volunteer, so the sample overestimates how many hours the community as a whole would be willing to volunteer

a church group interested in promoting volunteerism in a community chooses an SRS of 200 community addresses and sends members to visit these addresses during weekday working hours to inquire about the residents' attitudes toward volunteer work. sixty percent of all respondents say that they would be willing to donate at least an hour a week to some volunteer organization. bias is present in this sample design. identify the type of bias involved in this example by name and explain why you think the sample results obtained are different from the population.

population: high school seniors applying to college. sample: the randomly selected applicants to her college who responded

a college admission officer wants to know what the most important factors are that high school seniors consider when they choose where to apply to college. she conducts a telephone survey by taking a simple random sample of applicants to her college. identify the population and the sample

31.85

a company produces packets of soap powder that are labeled "giant size 32 ounces." the actual weight of soap powder in a box has a normal distribution with a mean of 33 oz and a standard deviation of .7 oz. 95% of packets actually contain more than x oz. of soap powder. what is x?

14 only

a lobster fisherman is keeping track of the productivity of a set of traps he has placed in a favorite location. below are the numbers of lobsters in these traps over the course of 12 different hauls. 0, 3, 3, 3, 4, 5, 5, 6, 7, 7, 12, 14. according to the 1.5 x IQR rule, which values in the above distribution are outliers?

mean and standard deviation

a policeman records the speeds of cars on a certain section of roadway with a radar gun. the histogram below shows the distribution of speed for 251 cars. which of the following measures of center and spread would be best to use when summarizing these data?

voluntary response

a poll conducted by the student newspaper asked, "who do you believe will win the Ohio State undergrad student government elections?" in order to vote, one had to access the student newspaper's web site and record one's vote. the results of the poll were summarized in a pie graph similar to the graph below. what type of response is this?

cluster sample

a public opinion poll in Ohio wants to determine whether or not registered voters in the state approve of a measure to ban smoking in all public areas. they select a random sample of five counties and poll all registered voters in each county by asking whether they approve or disapprove of the measure. this is an example of

fare = 2.5 + 2 (each mile) mean fare = 2.5 + 2(6.5) = $15.50 standard deviation = 2 x 2 = $4 the shape will not change

a taxi driver charged an initial fee of $2.50 plus $2 per mile. he calculated his mean distance and standard deviation at the end of the month ( mean = 6.5 miles, standard deviation = 2 miles). he wants to know what the average and standard deviation was for the fares that month in dollars. how do you calculate that? what happens to the shape?

- q + s + r - r / (q + r) - r

a venn diagram looks like this: the part of circle A outside B is q, the part of circle B outside A is s, the overlap is r, and the region outside both circles in the rectangle is p - P(AuB) - P(B|A) - the probability associated with the intersection of A and B

label the plants using 2 digit numbers from 01 - 24. reading left to right, read off 2 digit numbers until you have 12 unique numbers from 01 - 24, ignoring repeats and any numbers not from 01 - 24. assign the 12 corresponding plants to the newly developed fertilizer. assign the remaining 12 plants to the fertilizer they currently manufacture. then compare the weights of the tomatoes produced under the two fertilizers

agricultural scientists for a chemical company want to determine if a newly developed fertilizer produces heavier tomatoes than the fertilizer they currently manufacture. for their first pilot study, they have 24 healthy young tomato plants growing in individual pots. use the section from the random digit table to explain how to carry out the randomization required by your design

matched pairs design

an experiment compares the taste of a new spaghetti sauce with the taste of a commercially successful sauce readily available in grocery stores. each of a number of tasters tastes both sauces (in random order) and says which tastes better. this is called a

yes

an office supply catalog gives a description of bookshelves that includes the following variable. is the color of the bookshelf categorical?

- fods, the association between sbp and age of 42 - 67 year old men is positive/increasing, linear, and moderate. there is an outlier at about (46,203) - this point is an outlier because, compared to the overall pattern of the scatter plot, it has an unusually high systolic blood pressure for a younger age of about 46

below is a scatter plot relating systolic blood pressure and age for 14 men from 42 to 67 years old - describe the association - there is one unusual point on the graph. which one is it and explain what is unusual about this case in context

- the data point would weaken the correlation because it doesn't fit the overall pattern of the graph. it would have an unusually large number of hours of delay per year for the given smaller number of average vehicles per day - slope = .07822; the hours of delay per year are predicted to increase by .07822 for each additional average vehicle per day - S = 3899.57; the actual hours of delay per year typically differ by about 3899.57 hours from the hours predicted by the lsrl with x = average vehicles per day - while the relationship

below is data on the total number of hours of delay per year at 10 major highway intersections versus traffic volume (measured by avg # of vehicles per day that pass through intersection). constant coef: -3629, vehicles per day coef: 0.07822, S= 3899.57, R-sq= 48.6% - suppose another data point at (200000, 24000), that is 200,000 vehicles per day and 24,000 hours of delay per year, were added to the plot. What effect, if any, will this new point have on the correlation between these two variables? explain why - what is the slope of the regression line? interpret the slope in the context of the problem - determine the value of S and explain what it measures in the context of this problem

yes, commonly being skewed left, skewed right, and symmetric. a density curve can be symmetric and not normal

can density curves occur in other shapes?

- 0.160 - the events are not mutually exclusive, but they are independent, since P(walk) = 88/550 = 0.16 = P(walk | junior)

         car  bus  walk  totals
juniors  146  106   48    300
seniors  146   64   40    250
totals   292  170   88    550

you select one student from this group at random: - if the student says he is a junior, what is the probability that he walks to school? - the events typically walks to school and junior are or are not mutually exclusive and/or are or are not independent
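A sketch of how the two-way table answers both parts. The counts are copied from the question; the dictionary layout and variable names are just one convenient choice. Exact fractions avoid rounding issues when comparing probabilities.

```python
from fractions import Fraction

# Counts from the two-way table: class year by how students get to school.
counts = {
    "juniors": {"car": 146, "bus": 106, "walk": 48},
    "seniors": {"car": 146, "bus": 64, "walk": 40},
}

total = sum(sum(row.values()) for row in counts.values())
juniors_total = sum(counts["juniors"].values())
walk_total = sum(row["walk"] for row in counts.values())

# Conditional probability: walkers among juniors only.
p_walk_given_junior = Fraction(counts["juniors"]["walk"], juniors_total)

# Marginal probability: walkers among all students.
p_walk = Fraction(walk_total, total)

# Mutually exclusive would mean no student is both a junior and a walker.
mutually_exclusive = counts["juniors"]["walk"] == 0

# Independent means knowing "junior" does not change the chance of "walks".
independent = p_walk_given_junior == p_walk
print(float(p_walk_given_junior), mutually_exclusive, independent)
```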

the placebo effect

does caffeine improve exam performance? students in an 8:30 section of a course are randomly assigned to a treatment group (two cups of coffee), or a control group (two cups of decaffeinated coffee). the coffee is so bad that students cannot tell whether they are in the treatment or the control group. as it turns out, students in both groups do better on the exam than students in the 9:30 section, who weren't given anything. this could be the result of

exam performance

does caffeine improve exam performance? suppose all students in the 8:30 section of a course are given a "treatment" (2 cups of coffee) and all students in the 9:30 section are not permitted to have any caffeine before a mid-term exam. the response variable in this study is

block design

does caffeine improve exam performance? suppose half of the students from the 8:30 section of a course are randomly allocated to the treatment group (two cups of coffee) and the other half to the control group (two cups of decaf). in addition, half of the 9:30 students are randomly allocated to the treatment group, the other half to the control group. this is an example of a

310

entomologist heinz kaefer has a colony of bongo spiders in his lab. there are 1000 adult spiders in the colony, and their weights are normally distributed with mean 11 grams and standard deviation 2 grams. about how many spiders are there in a colony which weigh more than 12 grams?

71%

here is a list of exam scores for mr. williams calculus class: 60 61 61 65 72 75 75 78 81 81 85 89 91 98. what is the percentile of the person whose score was 85?

- .001(.99) + .999(.02) = .02097 - P(D | positive) = .001(.99)/.02097 = .047

the probability of a randomly selected adult having a particular rare disease is .001. the diagnostic test to detect the disease is not perfect. the probability the test will be positive (indicating that the person has the disease) is .99 for a person with the disease and .02 for a person without the disease. - the proportion of adults for which the test would be positive is - if a randomly selected person is tested and the result is positive, the probability the individual has the disease is approximately
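The two answers follow from the total probability rule and Bayes' rule. A minimal sketch, with made-up function and parameter names:

```python
def positive_rate(prevalence, sensitivity, false_positive_rate):
    """P(positive): true positives plus false positives."""
    return prevalence * sensitivity + (1 - prevalence) * false_positive_rate

def disease_given_positive(prevalence, sensitivity, false_positive_rate):
    """P(disease | positive) by Bayes' rule."""
    true_positives = prevalence * sensitivity
    return true_positives / positive_rate(prevalence, sensitivity, false_positive_rate)

# Numbers from the question: P(D) = .001, P(+ | D) = .99, P(+ | no D) = .02.
p_positive = positive_rate(0.001, 0.99, 0.02)
p_disease = disease_given_positive(0.001, 0.99, 0.02)
print(round(p_positive, 5), round(p_disease, 3))
```

Even with a positive result the chance of disease stays under 5%, because false positives from the large healthy group outnumber true positives from the tiny diseased group.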

using control, randomization, replication, and comparison. control: control sources of variation (i.e. variables) other than the factors we are testing, so we need to make the conditions as similar as possible for all the treatment groups. randomization: use a chance process to assign experimental units to treatments, so the treatment groups are roughly equivalent before the treatments are applied. replication: use enough experimental units in each group, because if we base our experiment on the outcomes of just one or a small number of subjects, the differences between treatment groups could be due to chance. comparison: experiments must compare two or more treatments to avoid confounding

how do you describe a completely randomized design for an experiment?

- i would use median and iqr because there is an outlier, and median and iqr are resistant to outliers, so 157 won't give a misleading description - yes, because after using the 1.5 x iqr rule i found that numbers below -10 or above 110 are outliers, so 157 is an outlier

how much oil the wells in a given field will produce is key information in deciding whether to drill more wells. below is the 5 number summary of the estimated total amounts of oil recovered from 38 wells, in thousands of barrels: 3, 35, 47, 65, 157 - which measures would you use to describe the center and spread? why? - are there any outliers?

they are all the same

how would the mean of the three distributions compare?

association: when specific values of one variable tend to occur in common with specific values of another inference: drawing conclusions beyond the data at hand

i can define association and inference

the mean = 0, standard deviation = 1, and the shape does not change

if you standardize every test score from Mr. Bowman's class what would the new mean and standard deviation be and how would the shape be affected?

observational study

in order to assess the effects of exercise on reducing cholesterol, a researcher took a random sample of fifty people from a local gym who exercised regularly and another random sample of fifty people from the surrounding community who did not exercise regularly. they all reported to a clinic to have their cholesterol measured. the subjects were unaware of the purpose of the study, and the technician measuring the cholesterol was not aware of whether or not subjects exercised regularly. this is an

he can make inferences about the population from which the samples were taken, but not about cause and effect

in order to assess the effects of exercise on reducing cholesterol, a researcher took a random sample of fifty people from a local gym who exercised regularly and another random sample of fifty people from the surrounding community who did not exercise regularly. they all reported to a clinic to have their cholesterol measured. the subjects were unaware of the purpose of the study, and the technician measuring the cholesterol was not aware of whether or not subjects exercised regularly. which of the following best describes the inferences the researcher can make based on his results?

yes, the residual plot shows no pattern and r, r^2 are both close to 1. S is small in context. ALTERNATIVELY: no, the residual plot has a sinusoidal pattern

is a line an appropriate model for these data? justify your answer

.3%

items produced by a manufacturing process are supposed to weigh 90 grams. the manufacturing process is such, however, that there is variability in the items produced and they do not all weigh exactly 90 grams. the distribution of weights can be approximated by a normal distribution with a mean of 90 grams and a standard deviation of 1 gram. about what percentage of items will either weigh less than 87 grams or more than 93 grams?

jack's z = 1.07; jill's z = 0.90; jack wins the contest

jack and jill are both enthusiastic players of a certain computer game. over the past year, jack's mean score when playing the game is 12,400 with a standard deviation of 1500. during the same period, jill's mean score is 14,200 with a standard deviation of 2000. they devise a fair contest: each one will play the game once, and they will compare z scores. jack gets a score of 14,000 and jill gets a score of 16,000. who won the contest, and what were each of their z scores
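A quick sketch of the comparison, using the means, standard deviations, and contest scores given in the question. The helper name is illustrative.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above the mean."""
    return (value - mean) / sd

# Each player's contest score is standardized against their own past results.
jack_z = z_score(14_000, mean=12_400, sd=1_500)
jill_z = z_score(16_000, mean=14_200, sd=2_000)

winner = "jack" if jack_z > jill_z else "jill"
print(round(jack_z, 2), round(jill_z, 2), winner)
```

Jill's raw score is higher, but Jack's score is further above his own average in standard deviation units, which is what the contest compares.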

- j: $32, $20; probability: .85, .15 - μJ = $30.20; on average, joe charges a randomly selected customer $30.20 - σJ = $4.28; the amount joe charges a randomly selected customer typically differs from the mean by about $4.28

joe the barber charges $32 for a shave and a haircut and $20 for just a haircut. he determines that the probability that a randomly selected customer comes in for a shave and haircut is 0.85, and that the rest of his customers come in for just a haircut. let j = what joe charges a randomly selected customer. - give the probability distribution for j - find and interpret μJ - find and interpret σJ
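A sketch of the mean and standard deviation of a discrete random variable, applied to Joe's charges. The function name and the list-of-pairs layout are illustrative choices.

```python
from math import sqrt

def mean_and_sd(distribution):
    """Mean and standard deviation of a discrete random variable.

    distribution is a list of (value, probability) pairs.
    """
    mean = sum(x * p for x, p in distribution)
    variance = sum((x - mean) ** 2 * p for x, p in distribution)
    return mean, sqrt(variance)

# $32 with probability .85 (shave and haircut), $20 with probability .15.
charges = [(32, 0.85), (20, 0.15)]
mu_j, sigma_j = mean_and_sd(charges)
print(round(mu_j, 2), round(sigma_j, 2))
```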

under coverage

just before the presidential election of 1936, a magazine incorrectly predicted that Alf Landon would defeat Roosevelt. Landon lost in a landslide. it turned out that the magazine only polled its own subscribers, plus others from a list of automobile owners and a list of people who had telephone service. all three groups had a higher typical income during the great depression. this is an example of

- 3 - 3

mr williams asked the 26 seniors in his stats class how many ap courses they took in high school. below is a dot plot summarizing the results of the survey. - the median number of ap courses taken by mr williams students is - the interquartile range for the number of ap courses is

- P(T^c | D) = P(T^c n D)/P(D) = .05/.20 = .25 - tree diagram (T = on time, T^c = late): D (.20) branches to T^c (.25) and T (.75); D^c (.80) branches to T^c (.30) and T (.70) - P(D | T^c) = P(D n T^c)/P(T^c) = .05/.29 = .17 - .75

some days Ramon drives his car to work; the rest of the time he rides his bike. suppose we choose a random work day. the table shows the probability of each event: drives to work = .20; drives and is late for work = .05; late for work given he rode his bike = .30 - find the probability that Ramon is late for work, given that he drives - draw a tree diagram - find the probability that Ramon drove to work, given that he is late - find the probability that Ramon is not late for work, given that he drives
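The tree diagram arithmetic can be sketched step by step. The three starting probabilities come from the question; everything else is derived from them, and the variable names are illustrative.

```python
# Given in the question.
p_drive = 0.20
p_drive_and_late = 0.05
p_late_given_bike = 0.30

# Late given he drives: conditional probability definition.
p_late_given_drive = p_drive_and_late / p_drive

# Overall chance of being late: add the two "late" branches of the tree.
p_bike_and_late = (1 - p_drive) * p_late_given_bike
p_late = p_drive_and_late + p_bike_and_late

# Drove given he is late: Bayes' rule.
p_drive_given_late = p_drive_and_late / p_late

# Not late given he drives: complement of late given he drives.
p_on_time_given_drive = 1 - p_late_given_drive

print(round(p_late_given_drive, 2), round(p_late, 2),
      round(p_drive_given_late, 2), round(p_on_time_given_drive, 2))
```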

conditional distribution of political party registration among males

the proportion of males that are registered as democrats is part of a

marginal distribution of political party registration

the proportion of registered democrats is part of a

29.16 (to find variance, square standard deviation)

the standard deviation of 16 peoples' weights (in pounds) is computed to be 5.40. the variance of these measurements is

- independence / constant probability of success, because of sampling without replacement - the sample size (50) is less than 10% of the population (2700) - 0.4286 - np = 14, n(1-p) = 36; yes, both are greater than or equal to 10 - 0.3783

suppose there are 2700 students at HCHS, and that 28% of them take spanish. you select a random sample of 50 students in the school, and you want to calculate the probability that 15 or more of the students in your sample take spanish. - which condition for the binomial setting has been violated here? explain - considering the size of the population and the size of the sample why is it still possible to use the binomial distribution to approximate the probability that 15 or more students in the sample take spanish? - use a binomial distribution to approximate the probability that 15 or more students in the sample take spanish. - is it appropriate to use a normal distribution to approximate the probability that 15 or more students take spanish? justify. - use the normal approximation to approximate the probability that 15 or more students take spanish.
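A sketch of both approximations with the standard library. The normal version here uses no continuity correction and an unrounded z score, so it comes out near 0.376; a table lookup with z rounded to 0.31 gives the 0.3783 quoted in the answer. Names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 50, 0.28

def binom_pmf(k, n, p):
    """P(X = k) for a binomial random variable."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Binomial: add up P(X = 15), P(X = 16), ..., P(X = 50).
p_binomial = sum(binom_pmf(k, n, p) for k in range(15, n + 1))

# Large counts condition for the normal approximation.
large_counts_ok = n * p >= 10 and n * (1 - p) >= 10

# Normal approximation with mean np and standard deviation sqrt(np(1-p)).
mu = n * p
sigma = sqrt(n * p * (1 - p))
p_normal = 1 - NormalDist(mu, sigma).cdf(15)

print(round(p_binomial, 4), large_counts_ok, round(p_normal, 4))
```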

0.0625

suppose you flip a coin until the first heads turns up. what is the probability that it takes four flips for the first heads to occur (that is, three tails followed by one heads)? in this example, x = 4 and p = 0.5. what is P(x=4)?

experimental units: each customer on hold factors: music and message treatments: IN SHORT the 6 combos of music and message

the 24 hour customer service call center for a major electronics manufacturer is trying to determine how to keep customers who are on hold as happy as possible. they want to examine whether the type of music they play while customers are on hold and whether or not there is a periodically repeated recorded message have an impact on customer satisfaction. they plan to randomly select customers who are on hold and play one of three different types of music (smooth jazz, classical, or show tunes) and either play the recorded message or not. after the entire call is over, they will ask the customers to rate their overall customer service experience. suppose the company plans to conduct a completely randomized design. list the experimental units, factors and treatments in this experimental design

using the random digit table, start at line 101 and read off single digits (odd = boy, even = girl) until an odd digit occurs. count how many digits/children it takes until a boy is born. repeat several (30) times. find the average number of children per family. if the average is above 2, the population will increase. if it is less than 2, it will decrease

the chinese gov wants to reduce population growth, suppose they initiate a program that allows families to have kids until they have a male child, then they must stop having kids. how will the population be affected by this? develop a simulation model using the random digit table for the population growth policy in China.
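The same simulation can be sketched in code. This version assumes boys and girls are equally likely and uses Python's pseudo-random generator in place of the random digit table; the function names, seed, and number of families are illustrative.

```python
import random

def children_until_boy(rng):
    """Simulate one family: keep having children until the first boy."""
    count = 0
    while True:
        count += 1
        if rng.random() < 0.5:   # assumed 50% chance that each child is a boy
            return count

def average_family_size(families, seed=0):
    """Average number of children per family over many simulated families."""
    rng = random.Random(seed)
    return sum(children_until_boy(rng) for _ in range(families)) / families

average = average_family_size(100_000)
print(round(average, 2))
```

Under these assumptions the long-run average is 2 children per family, so with many repetitions the estimate should land very close to 2.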

between 76 and 100

the five number summary of the distribution of scores on the final exam in Psych 001 last semester was 18, 39, 62, 76, 100. the 80th percentile was

16%

the following histogram represents the distribution of acceptance rates (percent accepted) among 15 business schools in 2007. what percent of the schools have an acceptance rate of under 20%?

about 502. about 34% of the department's applicants score below 502 on the quantitative GRE

the graduate record examination (GRE) are widely used to help predict the performance of applicants to graduate schools. the range of possible scores on the GRE is 200 to 900. the psychology department at a university finds that the scores of its applicants on the quantitative GRE are approximately normal with mean of 544 and standard deviation of 103. calculate and interpret the 34th percentile of the distribution of applicants' GRE scores

33.36%

the graduate record examination (GRE) are widely used to help predict the performance of applicants to graduate schools. the range of possible scores on the GRE is 200 to 900. the psychology department at a university finds that the scores of its applicants on the quantitative GRE are approximately normal with mean of 544 and standard deviation of 103. find the proportion of GRE scores that are below 500.

volume of lumber

the height and volume of usable lumber of 32 cherry trees are measured by a researcher. the goal is to determine if volume of usable lumber can be estimated from the height of a tree. - in this study, the response variable is

roughly symmetric, centered at about 150, range 80

the histogram below shows the distribution of heights for 100 randomly selected school children in great britain. which of the following descriptions best fits the distribution? (shape, center, range)

- μA = $283, σA = $53 - μT = $538, σT = $71.91 (assuming adult and children's ticket sales are independent) - μP = $238, σP = $71.91

the manager of a children's puppet theater has determined that the number of adult tickets he sells for a saturday afternoon show is a random variable with a mean of 28.3 tickets and a standard deviation of 5.3 tickets. the mean number of children's tickets he sells is 42.5, with a standard deviation of 8.1. - the adult tickets sell for $10. let a = the $ he collects from just adult tickets on a random saturday. find the mean and the standard deviation for a. - the children's tickets sell for $6. let t = the total income from all ticket sales on a random saturday. find the mean and standard deviation of t. - it costs $300 for the manager to put on each puppet show. let p = the profit from a random Saturdays show. find the mean and standard deviation of p
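A sketch of the rules used here: multiplying by a constant scales both the mean and the standard deviation, means always add, and standard deviations combine through variances only if the two ticket counts are independent (an assumption the question does not state outright). Subtracting the fixed $300 cost shifts the mean but not the spread. Variable names are illustrative.

```python
from math import hypot

# Adult ticket income: $10 times the number of adult tickets.
mean_a = 10 * 28.3
sd_a = 10 * 5.3

# Children's ticket income: $6 times the number of children's tickets.
mean_c = 6 * 42.5
sd_c = 6 * 8.1

# Total income: means add; variances add (assuming independence),
# so the standard deviation is sqrt(sd_a**2 + sd_c**2).
mean_t = mean_a + mean_c
sd_t = hypot(sd_a, sd_c)

# Profit: subtract the fixed $300 cost from the mean only.
mean_p = mean_t - 300
sd_p = sd_t

print(round(mean_a, 2), round(sd_a, 2), round(mean_t, 2),
      round(sd_t, 2), round(mean_p, 2), round(sd_p, 2))
```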

35

the mean age of 4 people in a room is 30 years. new person whose age is 55 years enters the room. the mean age of the 5 people in the room is

- 40 - 5.69 - μM = 42.78, σM = 6.09

the mp3 files on sharon's computer have a mean size of 4 megabytes and a standard deviation of 1.8 megabytes. she wants to create a mix of 10 of the songs for a friend. let the random variable t = the total size of 10 randomly selected songs from sharon's computer. - what is the expected value of t? - what is the standard deviation of t? - the formula 1.07(file size) - 0.02 provides a good estimate of the length of a song in minutes. if m = 1.07t - 0.02, what are the mean and standard deviation of m?

mean = 100; standard deviation = 65

the normal curve below describes the death rates per 100,000 people in developed countries in the 1990's. the mean and standard deviation of this distribution are approximately

convenience sample

to determine the proportion of each color of M&Ms, you buy 10 packs at your local grocery store and count how many there are of each color. this is an example of

.5764

using the standard normal distribution tables, the area under the standard normal curve corresponding to -0.5 < Z < 1.2 is

bechhofer, taylor, weiss

we wish to choose a simple random sample of size three from the following employees of a small company. to do this, we will use the numerical labels attached to the names below. 1. Bechhofer 2. Brown 3.Ito 4. Kesten 5. Kiefer 6. Spitzer 7. Taylor 8.Wald 9. Weiss. we will also use the following list of random digits, reading the list from left to right, starting at the beginning of the list 11793 20495 05907 11384 44982 20751 27498 the simple random sample is

if we use another list of random digits to select the sample, the result obtained with the list actually used would be just as likely to be selected as any other set of three names

what is true about a simple random sample using a random digit table?

skewed right

when a basketball player makes a pass to a teammate who then scores, he earns an assist. below is a normal probability plot for the number of assists earned by all players in the national basketball association during the 2010 regular season. what shape is the distribution?

because it only suggests symmetry, and not all symmetric distributions are normal. it would be better to enter the data into the calculator and make a normal probability plot; if the plot is roughly linear, the distribution is approximately normal

why can't you justify that a distribution is normal by saying that q1 and q3 are about the same distance from the mean?

the explanatory variable is parents' income and you expect to see a positive association

you have data for many families on the parents income and the years of education their eldest child completes. your initial examination of the data indicated that children from wealthier families tend to go to school for a longer period of time. when you make a scatter plot,

three; one categorical and two quantitative

you measure the age, marital status, and earned income of a sample of 1463 women. the number and type of variables you have measured is

stratified random sample

you plan to give a math achievement test to samples of 15-year-olds from both the US and Korea in order to compare mathematics knowledge in the two countries. in each country you will randomly choose 300 students from low income families, 400 students from middle income families, and 200 students from high income families. the sample from Korea is a

