STAT 1430

Types of Probability

"AND" = joint probability: the probability that both A and B occur. "OR" = the probability that A or B or both occur. Marginal = the probability of a single event. Conditional = the probability of A given that B has occurred.

Which type of probabilities are in each of the 4 cells of a two-way table of probabilities?

"And" probabilities

Box Plots notes

- Can't tell what the sample size is - Bigger boxes DO NOT mean more data - Boxplots can be horizontal or vertical - You CANNOT see the mean - There is always 25% of the data in each section of a boxplot; what differs is how concentrated the data are within each section - Strengths = good for comparing data sets, shows skewed vs. symmetric shapes, easy to determine center and variability - Weakness = can't tell the exact shape

P(D|M) P(M|D)

- P(D|M): out of males, the proportion who are Democrats - P(M|D): out of Democrats, the proportion who are male - P(A|B) is the probability of A given that B has occurred: A is what we want to know, B is what we know

standard deviation properties

- Same units as the original data - Never negative - Can equal zero - Is affected by outliers and skewness - Multiplying all values by the same number changes the standard deviation - Adding the same number to all values does not

If you switch X and Y, which of the following will change?

- The slope of the regression line - The Y-intercept of the regression line

Suppose you are using cereal price to predict milk price; they have a correlation of -.70. The average cereal price is $3.00 with standard deviation $0.50 and the average milk price (gallon) is $2.50 with standard deviation $0.25. What is the slope of the regression line?

-.35
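The arithmetic behind this answer can be sketched as follows, using the least-squares formulas slope = r x (s_y / s_x) and intercept = mean_y - slope x mean_x. The function name `regression_slope` and the variable names are illustrative assumptions, not part of the course notes.

```python
def regression_slope(r, sd_x, sd_y):
    """Slope of the least-squares line: r times (SD of Y over SD of X)."""
    return r * sd_y / sd_x

# X = cereal price, Y = milk price (values taken from the question).
r = -0.70
mean_x, sd_x = 3.00, 0.50
mean_y, sd_y = 2.50, 0.25

slope = regression_slope(r, sd_x, sd_y)
intercept = mean_y - slope * mean_x  # the line passes through (mean_x, mean_y)

print(round(slope, 2))      # -0.35
print(round(intercept, 2))  # 3.55
```

Because the slope depends on s_y / s_x, swapping the roles of X and Y changes the slope and intercept even though the correlation stays the same.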

A business has 3 branches, A, B, and C. Branch A gets 20% of the business, Branch B gets 50%, and Branch C gets 30%. We know the following information: Branch A: chance of running out of single dollars in a day is .15 Branch B: chance of running out of single dollars in a day is .05 Branch C: chance of running out of single dollars in a day is .10. What is the chance that you go to Branch A and they will have run out of single dollars? Choose the closest answer

.03 P(A and run out) = .20 x .15 = .03

A business has 3 branches, A, B, and C. Branch A gets 20% of the business, Branch B gets 50%, and Branch C gets 30%. We know the following information: Branch A: chance of running out of single dollars in a day is .15 Branch B: chance of running out of single dollars in a day is .05 Branch C: chance of running out of single dollars in a day is .10. What is the chance that you go to any branch of this business and they will have run out of single dollars? Choose the closest answer

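.10 P(run out) = (.20 x .15) + (.50 x .05) + (.30 x .10) = .03 + .025 + .03 = .085, and .10 is the closest choice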

A business has 3 branches, A, B, and C. Branch A gets 20% of the business, Branch B gets 50%, and Branch C gets 30%. We know the following information: Branch A: chance of running out of single dollars in a day is .15 Branch B: chance of running out of single dollars in a day is .05 Branch C: chance of running out of single dollars in a day is .1 Suppose a Branch has run out of single dollars and you get the phone call. Is it most likely to be Branch A, Branch B, or Branch C?

A or C (a tie): P(A and run out) = .03, P(B and run out) = .025, P(C and run out) = .03
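The three branch questions use the same numbers, so one short sketch can cover the joint probabilities, the law of total probability, and Bayes' rule. The dictionary names (`share`, `p_out_given`, `joint`, `posterior`) are illustrative assumptions.

```python
# P(branch) and P(run out | branch), taken from the question.
share = {"A": 0.20, "B": 0.50, "C": 0.30}
p_out_given = {"A": 0.15, "B": 0.05, "C": 0.10}

# Joint ("and") probabilities: P(branch and run out) = P(branch) * P(run out | branch).
joint = {b: share[b] * p_out_given[b] for b in share}

# Law of total probability: add the joint probabilities over all branches.
p_out = sum(joint.values())

# Bayes' rule: P(branch | run out) = P(branch and run out) / P(run out).
posterior = {b: joint[b] / p_out for b in share}

print({b: round(v, 3) for b, v in joint.items()})      # {'A': 0.03, 'B': 0.025, 'C': 0.03}
print(round(p_out, 3))                                 # 0.085
print({b: round(v, 3) for b, v in posterior.items()})  # {'A': 0.353, 'B': 0.294, 'C': 0.353}
```

Branches A and C tie for the most likely source of the phone call, and B is the least likely even though it gets the most business.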

Standard deviation has no units.

False

Suppose the correlation between X and Y is .3. If you double all the X values and double all the Y values, the correlation between 2X and 2Y is .6.

False
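A quick numerical check of this fact, on made-up data: rescaling X and Y leaves the correlation unchanged. The helper `pearson_r` and the sample values are assumptions for illustration only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, not from the course.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

r = pearson_r(xs, ys)
r_doubled = pearson_r([2 * x for x in xs], [2 * y for y in ys])

print(r, r_doubled)  # the two values agree
```

Doubling multiplies the numerator and the denominator of the formula by the same factor, so the ratio does not move.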

There can be different amounts of data in each section of a boxplot

False

Correlation measures the strength and direction of any relationship between X and Y.

False. Correlation measures the strength and direction of a linear relationship between two quantitative variables, not of any relationship.

If you add the same value to every single number in a data set, the standard deviation also changes by that same value.

False. Adding the same value to every number shifts the data but does not change the standard deviation.
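The effect of adding versus multiplying can be checked with Python's standard `statistics` module. The data values and the constants 10 and 3 are made up for illustration.

```python
from statistics import stdev

# Hypothetical data set, not from the course.
data = [2, 4, 4, 6, 9]

shifted = [x + 10 for x in data]  # add the same number to every value
scaled = [x * 3 for x in data]    # multiply every value by the same number

print(stdev(data))     # baseline spread
print(stdev(shifted))  # same as the baseline: shifting does not change spread
print(stdev(scaled))   # 3 times the baseline: scaling stretches the spread
```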

Undercoverage means you had a lot of nonresponse in your sample.

False. Undercoverage = a subgroup of the population is excluded from the very beginning. Example: you want OSU student opinion on tuition but take a random sample from the dorms only. Issue: the sample represents only the population that remains after the subgroup is excluded (here, all students who don't live in dorms are left out). To get all students' opinions you have to sample a different way.

The 4 cells of a two way table contain conditional probabilities.

False. They are joint ("and") probabilities.

The best fitting line has an SSE of Zero

False. The best-fitting line has the smallest possible SSE, but that SSE is zero only if every point lies exactly on the line.

Which of the following statistics can be negative?

If the data permit: correlation, slope of the regression line, Y-intercept of the regression line

If there is no relationship between two variables in a two-way table, then the two variables are said to be:

Independent

The five-number summary of a single data set of 100 numbers would be which of the following?

Min, Q1, Median, Q3, Max

Suppose 40% of the new employees at your company are males and 60% of the "old employees" are males. Are gender and type of employee (new/old) independent?

No

The equation of a regression line is Y = 20 + 5X where X = hours studied and Y = exam score. Study time data ranged from 8 to 15 hours. Should we interpret the Y-intercept here?

No. You should not interpret the Y-intercept in this situation.

Suppose 40% of the new employees at your company are males and 30% of the "old employees" are males. What percentage of ALL the employees are male?

Not enough information

Table

P(A and B)

P(B and A)

P(B) x P(A|B), which is the same as P(A and B)

"OR" Probability/ at least one

P(A or B)= P(A)+P(B)-P(A and B)

P(A and B) (dependent events)

P(A) x P(B|A)

Law of Total Probability

P(A) = P(A and B) + P(A and not B). On a two-way table, adding down a column or across a row is the law of total probability.

Tree

P(A)P(B|A)

Complement rule

P(A^C) = 1 - P(A). Probability of what you want = 1 - probability of what you do not want.

Bob is a telemarketer and he makes a sale 20% of the time. Suppose he makes ten calls. What is the chance he makes AT LEAST ONE sale?

P(at least 1) = 1 - P(none) = 1 - P(No No No ... No) = 1 - P(No)^10 = 1 - .8^10 = .8926
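The complement-rule calculation above can be sketched in a few lines, assuming (as the worked answer does) that the calls are independent. The function name `p_at_least_one` is illustrative.

```python
def p_at_least_one(p_success, n_trials):
    """P(at least one success) = 1 - P(no successes) for independent trials."""
    return 1 - (1 - p_success) ** n_trials

# Bob makes a sale 20% of the time and makes ten calls.
print(round(p_at_least_one(0.20, 10), 4))  # 0.8926
```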

Bayes Rule

P(A|B) = P(B|A)P(A)/P(B) We have P(B|A) but we want P(A|B)

Definition of Conditional Probability

P(B | A) = P(A and B) / P(A) probability of B given A has occurred

"AND" probability /joint distribution Example

P(F) = .6, P(M) = .4, P(Yes|F) = .25, P(Yes|M) = .30. P(F and Yes) = P(F) P(Yes|F) = .6 x .25 = .15. P(M and Yes) = P(M) P(Yes|M) = .4 x .3 = .12.

A confounding variable can cause the results of a two-way table to reverse when it is added to the data set.

True

A listing of all the possible values of a data set and how often they occur is called a distribution.

True

The median is not affected by outliers

True

What it means to be disjoint

Two events, say A and B, are defined as being disjoint if the occurrence of one precludes the occurrence of the other; that is, they have no common outcome.

Quartiles

Values that divide a data set into four equal parts. Q1 = 25th percentile. Q2 = 50th percentile = median. Q3 = 75th percentile. IQR = Q3 - Q1.

P(A) = .20, P(B) = .30, P(A and B) = .06. Are A and B independent?

Yes
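The check behind this answer is the product rule for independence: A and B are independent when P(A and B) = P(A) x P(B). A minimal sketch follows; the function name `is_independent` is an assumption, and `math.isclose` is used only to avoid floating-point rounding surprises.

```python
from math import isclose

def is_independent(p_a, p_b, p_a_and_b):
    """True when P(A and B) equals P(A) * P(B), up to rounding error."""
    return isclose(p_a_and_b, p_a * p_b)

print(is_independent(0.20, 0.30, 0.06))  # True: .20 x .30 = .06
print(is_independent(0.20, 0.30, 0.10))  # False: .10 is not .20 x .30
```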

An experiment gives 3 different dosage levels of a drug to 3 groups of people. The first dosage level is a fake pill (or placebo) for comparison. We measure the blood pressure of the participants before and after the study and write down the amount by which blood pressure changed. What is the response variable?

blood pressure change

Which is best if you want to compare several data sets regarding shape, center, and variability?

boxplot

Your company operates in 4 regions and your boss numbers them 1, 2, 3, 4. Is this variable quantitative or categorical?

categorical

The second level branches on a tree are marginal probabilities.

False. The first-level branches are the marginal probabilities; the second-level branches are conditional probabilities.

Which is better to use to see the most clear pattern in the data?

histogram

If the correlation is .2 what does that tell you about using a regression line to fit your data?

It's a weak positive linear relationship; do not proceed with a regression line.

Undercoverage

Leaving a group out: occurs when some groups in the population are left out of the process of choosing the sample.

self-selected sample

Members of a population can volunteer to be in the sample. Example: you post an ad and people reach out to you if they want to participate.

A correlation of -.6 is considered to be what?

Moderately strong. Guideline for the size of the correlation (ignoring the sign): .1 to .5 = weak, .7 to 1.0 = strong

You randomly choose 100 students from Stat 1350 to take a survey. 60 of them take the survey. What can occur with the other 40 people?

nonresponse bias

Convenience sample

Only members of a population who are easy to reach are selected. Example: catching people at the union because it's close to where you live.

response variable (dependent variable)

The result or change that occurs due to the experimental variable; what comes out of the experiment.

median greater than mean

skewed left

mean greater than median

skewed right

Which of the following is in the same units as the original data?

standard deviation, Q1, y-intercept of the regression line

Experiments are ___ than observational studies

stronger

A boxplot is a one-dimensional graph

true

If the median is closer to Q1 than it is to Q3 then the data is skewed right.

True. If the median were closer to Q3, the data would be skewed left.

A longer box in the boxplot means more variability in the data.

true variability = how spread out the data is

Confidentiality is ___ than anonymity

weaker

Simpson's Paradox

when averages are taken across different groups, they can appear to contradict the overall averages

IQR is affected by outliers.

False

If the correlation is 0 you know there is no relationship between X and Y.

False

Correlation is in the same units as X and Y.

False

P(A) = .3, P(B|A) = .4, P(B) = .5 What is P(A and B)?

.12 P(A and B)= P(A) P(B|A)

If A and B are independent events with P(A) = 0.20 and P(B) = 0.60, then P(A|B) is:

.20 P(A|B) = P(A and B)/P(B) = P(A)P(B)/P(B) = P(A)

P(A) = .2 and P(B) = .3. Suppose A and B are INDEPENDENT . What is P(A or B)?Choose the closest answer.

.4 1. P(A or B) = P(A) + P(B) - P(A and B) 2. Find P(A and B): for independent events, P(A and B) = P(A) x P(B) = .2 x .3 = .06 3. P(A or B) = .2 + .3 - .06 = .44
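The steps above can be sketched as a small function built on the general "or" rule, with P(A and B) supplied by the independence product rule. The function name `p_or` is illustrative.

```python
def p_or(p_a, p_b, p_a_and_b):
    """General addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b

p_a, p_b = 0.2, 0.3

# Independent events: P(A and B) = P(A) * P(B).
print(round(p_or(p_a, p_b, p_a * p_b), 2))  # 0.44

# Disjoint events: P(A and B) = 0, so the rule reduces to P(A) + P(B).
print(round(p_or(p_a, p_b, 0.0), 2))        # 0.5
```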

If the stock went up today, what is the chance it also went up yesterday? Choose the closest answer.

.4 P(A and B) / total for A

if r = -.7, what is the value of the coefficient of determination?

.49

What is the chance that they did the SAME THING on both days? Choose the closest answer.

.5 P(A and B) + P(Not A and Not B) / total for everything

P(A) = .2 and P(B) = .3. Suppose A and B are DISJOINT. What is P(A or B)?Choose the closest answer.

.5 P(A or B) disjoint = P(A)+P(B)

P(A and B) disjoint

0

A company owner has 12 sales representatives and she finds a .85 correlation between years in sales and number of sales. The regression analysis is above. What is the slope of the regression line?

49.41

What % of people were employed and 25 or over? 70/100 70/200 70/150

70/200. We're looking for the % of PEOPLE, so the denominator has to be the overall total (200).

The third quartile is the same thing as the _____________ percentile

75th

What % of employed people were 18-25? 80/200 80/150 80/100 None of the other choices is correct

80/150. We're looking for the % of EMPLOYED people, so the denominator has to be the employed total (150).

What is the conditional distribution of age for those who are employed? 80/150 and 70/150 80/100 and 20/100

80/150 and 70/150 conditional distribution= a distribution of values for one variable that exists when you specify the values of other variables.

What is statistical significance?

A result due to more than chance

A business has 3 branches, A, B, and C. Branch A gets 20% of the business, Branch B gets 50%, and Branch C gets 30%. We know the following information: Branch A: chance of running out of single dollars in a day is .15 Branch B: chance of running out of single dollars in a day is .05 Branch C: chance of running out of single dollars in a day is .10 Which Branch is most likely to run out of single dollars in a day?

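C. It gets a higher share of the customers than A, and P(C and run out) = .30 x .10 = .03 (a probability of .03, not .03%). Note that P(A and run out) = .20 x .15 is also .03.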

What is qualitative data?

Data based on categories rather than numbers, for example region or gender. (Height and age are quantitative, not qualitative.)

marginal distribution

Distribution of values of that variable among all individuals described by the table. P(A) P(B) P(A') P(B')

How to determine if A and B are independent

Events A and B are independent if any one of these holds: - P(A and B) = P(A) · P(B) - P(A|B) = P(A) - P(A|B) = P(A|not B) - Random sample

What is the statistical definition of a random sample?

Every sample of that same size has an equal chance of being selected.

What does it mean for a sample to be truly random, according to our notes?

Every sample of the same size has the same chance of being selected.

A confidential survey is one in which they cannot link you to your data.

False

A flat histogram indicates no variability in the data.

False

Bob picks a name from the phone book using a random number generator, and then takes the first 100 names that come after that to make a sample. Is Bob's sample random?

False

Extrapolation is what?

Plugging in X values outside the range of the data

Units of residuals

Same as the units of Y. Residual = observed value of Y - predicted value of Y.

You send out an email to all the students in Stat 1430 and you tell them to go to your website and do a survey. 100 students come forward. What kind of sample is this?

Self-selected sample

Response Bias

Someone answers inaccurately: a systematic pattern of incorrect responses in a sample survey.

Which measure of variability measures the concentration of the data around the mean?

Standard deviation

If you want to ask the question: "How is the view from your seat?" where your population is the OSU's football stadium, what kind of sample should you use?

Stratified Random Sample

Complement rule example At most

Telemarketer with P(yes) = .10 makes 3 calls. What is the chance of getting at most 1 yes? P(no) = 1 - P(yes) = 1 - .10 = .9. P(0 yeses) = (.9)^3 = .729, or 72.9%. P(exactly 1 yes) = 3 x (.1)(.9)(.9) = .243. P(at most 1 yes) = .729 + .243 = .972
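The "at most 1" calculation adds the chance of zero yeses to the chance of exactly one yes. A minimal sketch using binomial counts, assuming independent calls; the function names `p_exactly` and `p_at_most` are illustrative.

```python
from math import comb

def p_exactly(k, n, p):
    """Binomial probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def p_at_most(k_max, n, p):
    """P(at most k_max successes): sum the exact probabilities from 0 to k_max."""
    return sum(p_exactly(k, n, p) for k in range(k_max + 1))

# Telemarketer: P(yes) = .10, 3 calls.
print(round(p_exactly(0, 3, 0.10), 3))  # 0.729
print(round(p_exactly(1, 3, 0.10), 3))  # 0.243
print(round(p_at_most(1, 3, 0.10), 3))  # 0.972
```

The factor of 3 in the hand calculation is `comb(3, 1)`: the single yes can land on any one of the three calls.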

If you are predicting gas price using temperature, which is the X variable?

Temperature

SSE is equal to what?

The Sum of Squares for Error for any line going through the data.

conditional distribution

The distribution of one variable restricted to a single row (or column) of another variable in a two way table. A conditional distribution is found by dividing the values in the row (or column) by the row (or column) total. - Pie chart

When finding the correlation if you are given R-squared, you take the square root first. Then what do you look at to determine the sign for the correlation?

The sign on the slope

