stat exam 1
mean,
(sum of values) / (number of values)
Other forms of bias to watch out for in data collection:
- Question wording - Context - Inaccurate responses
Randomized Experiments: benefits and disadvantages
- Causal conclusions possible - Generally more costly - Generally more time-consuming - Sometimes impossible due to ethical concerns - Sometimes impossible if units aren't people/animals
Non-ideal methods of sampling
-Sampling cases based on something obviously related to the variable(s) you are studying -Letting your sample be comprised of whoever chooses to participate (volunteer bias)
Visualization via Side-by-Side Boxplots
1 categorical AND 1 quant
Visualization via Stacked Dotplots
1 categorical AND 1 quant
Correlation facts:
1. -1 ≤ r ≤ 1. 2. The sign indicates the direction of linear association. Positive association: r > 0. Negative association: r < 0. No linear association: r = 0. 3. The closer r is to ±1, the stronger the linear association. 4. r has no units and doesn't depend on the units of measurement. 5. Correlation between X and Y = correlation between Y and X. Everything here is true for ρ (rho) as well as r.
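A small sketch to check a few of these facts numerically. The data (house size and heating cost) are made up for illustration, and `correlation` is just a helper name chosen here, not something from the course materials.

```python
import math

def correlation(xs, ys):
    """Sample correlation r between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: house size (sq ft) and monthly heating cost (dollars)
size = [900, 1200, 1500, 2000, 2600]
cost = [70, 95, 110, 160, 190]

r = correlation(size, cost)
r_swapped = correlation(cost, size)                          # swap X and Y
r_rescaled = correlation([s / 10.764 for s in size], cost)   # sq ft -> sq m

print(r, r_swapped, r_rescaled)  # all three should agree up to rounding
```

Swapping the variables or changing the units should leave r unchanged, and r should land between -1 and 1 with a positive sign for this upward-trending data.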
Correlation Cautions
1. Correlation can be heavily affected by outliers. Always plot your data! 2. r = 0 means no linear association. The variables could still be otherwise associated. Always plot your data! 3. (Nonzero) correlation does not imply causation! Sometimes people refer to "spurious correlation": there's nothing spurious (false) about the correlation itself, though there's not always causation.
proportion/relative frequency
= number in category / total sample size
Indicate whether we should trust the results of the following study. Is the method of data collection biased? Take 12 apples off the top of a truckload of apples and measure the amount of bruising on those apples to estimate how much bruising there is, on average, in the whole truckload.
Biased
Discrete
Can only take a set, "countable," number of values. For example: • Classes missed last week • Puppies in a litter • Nearest highway mile marker (counts/integers)
Continuous
Can take any value within some range: • Height • Distance biked per day • Time it takes to walk from here to your next class. Measuring a continuous variable often requires some sort of tool, like a ruler, scale, odometer, or stopwatch.
Record whether or not the literacy rate is over 75% for each country in the world.
Categorical
A headline in June 2015 proclaims "Infections can lower IQ." The headline is based on a study in which scientists gave an IQ test to Danish men at age 19. They also analyzed the hospital records of the men and found that 35% of them had been in a hospital with an infection such as an STD or a urinary tract infection. The average IQ score was lower for the men who had an infection than for the men who hadn't.
Danish men
Regression Caution
• Do not use the regression equation or line to predict outside the range of x values available in your data (do not extrapolate!). • If none of the x values is anywhere near 0, then the intercept is not directly interpretable. • Computers will calculate a regression line for any two quantitative variables, even if they are not associated or if the association is not linear. ALWAYS PLOT YOUR DATA! • The regression line/equation should only be used if the association is approximately linear. • Outliers can have a strong effect on the regression line. Influential point: a point whose explanatory variable value is an outlier. • Higher values of x may correspond to higher or lower predicted values of y, but this does NOT mean that changing x will cause y to increase or decrease. Causation can only be determined if the values of the explanatory variable were determined randomly, which is rarely the case for a quantitative variable.
Side-by-Side Bar Chart (2 cat)
Height of each bar is the count from the corresponding cell in the two-way table. Height could also be expressed as (row) percentages.
95% Rule
In a normal distribution, about 95% of the cases lie within TWO standard deviations of the mean (between mean - 2 SD and mean + 2 SD)
The population is all trees in a forest. We walk through the forest and pick out trees that appear to be representative of all the trees in the forest. State whether or not the sampling method described produces a random sample from the given population.
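A rough way to see the 95% rule in action is to simulate roughly normal data and count how many values fall within two standard deviations of the mean. Everything here (the mean of 100, SD of 15, and the seed) is an arbitrary choice for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable
# Simulated, roughly normal data (hypothetical mean 100, SD 15)
values = [random.gauss(100, 15) for _ in range(10_000)]

x_bar = statistics.mean(values)
s = statistics.stdev(values)
low, high = x_bar - 2 * s, x_bar + 2 * s

# Share of simulated cases inside (x_bar - 2s, x_bar + 2s)
inside = sum(low <= v <= high for v in values) / len(values)
print(f"interval: ({low:.1f}, {high:.1f}), share inside: {inside:.3f}")
```

The share should come out close to 0.95, though the exact figure depends on the random draw.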
Not random
Interquartile Range (IQR) =
Q3 - Q1
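A minimal sketch of the IQR calculation, assuming the convention used elsewhere in this set (Q1 is the median of the values below the median, Q3 is the median of the values above it). Software packages often use slightly different quartile rules, so results may not match a calculator exactly. The data are made up.

```python
def median(sorted_vals):
    """Median of an already-sorted list."""
    n = len(sorted_vals)
    mid = n // 2
    if n % 2 == 1:
        return sorted_vals[mid]
    return (sorted_vals[mid - 1] + sorted_vals[mid]) / 2

def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves of the data."""
    data = sorted(values)
    half = len(data) // 2      # for odd n the middle value is left out
    lower = data[:half]
    upper = data[-half:]
    return median(lower), median(upper)

data = [2, 4, 5, 7, 8, 10, 12, 15, 40]   # hypothetical values
q1, q3 = quartiles(data)
iqr = q3 - q1
print(q1, q3, iqr)
```

Note that the large value 40 barely matters here: the IQR only looks at the middle half of the data.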
Is the following variable categorical or quantitative? Record the percentage change in the price of a stock for 100 stocks publicly traded on Wall Street.
Quantitative
Measure the shelf life of bunches of bananas (the number of days until the bananas go bad) for a large sample.
Quantitative
The population is incoming students at a particular university. The name of each incoming student is thrown into a hat, the names are mixed, and 25 names (each corresponding to a different student) are drawn from the hat. State whether or not the sampling method described produces a random sample from the given population.
Random
State whether the data are best described as a population or a sample. To estimate size of trout in a lake, an angler records the weight of 10 trout he catches over a weekend.
Sample
Several studies have been performed to examine the relationship between nut consumption and cholesterol levels. Here we consider two such studies. In Study 1, participants were randomly assigned into two groups: one group was given nuts to eat each day, and the other group was told to consume a diet without nuts. In Study 2, participants were free to follow their own diet, and reported how many nuts they consumed. Cholesterol levels were measured for all participants, and both studies found that nut consumption was associated with lower levels of LDL ("bad") cholesterol. Based on the information above, which study do you think provides better evidence that nut consumption reduces LDL cholesterol?
Study 1
Histogram (one quant)
The height of each bar corresponds to the number of cases within that range of the variable
A group of professors who each teach a large lecture class at different universities are curious about how their students think about studying for class. Suppose they interviewed the first 100 students to enroll in one of their courses at one university in southern Georgia, asking how much time they spent studying in a previous course and what letter grade they earned. Also, suppose they found that students who studied more are more likely to have received a higher letter grade.
This method of sampling is a simple random sample.: false This method of sampling has bias: true This is an observational study: true From these results, we can infer that students who study more will earn a higher grade.: false
Summary Statistics: correlation Visualization: scatterplot
Two quantitative variables
Association:
Two variables are associated if values of one variable tend to be related to values of the other
Causation:
Two variables are causally associated if changing the value of one variable influences the value of the other variable.
control group.
When determining whether a treatment is effective, it is important to have a comparison group. This group is treated exactly the same as the other group(s), except it is not given any active treatment.
correlation
a measure of the strength and direction of linear association between two quantitative variables
standard deviation
for a quantitative variable, measures the spread of its distribution. SD is, roughly, the average distance of any value from the mean.
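A quick sketch of the sample standard deviation computed by hand and with Python's `statistics.stdev`, which divides by n - 1. The ages are hypothetical.

```python
import math
import statistics

ages = [18, 19, 19, 20, 21, 23]          # hypothetical ages
x_bar = sum(ages) / len(ages)            # sample mean

# Squared distances from the mean, averaged with n - 1, then square-rooted
squared_distances = [(x - x_bar) ** 2 for x in ages]
s_by_hand = math.sqrt(sum(squared_distances) / (len(ages) - 1))

s_library = statistics.stdev(ages)       # same n - 1 formula
print(x_bar, s_by_hand, s_library)
```

The two results should match, and the SD is in the same units as the data (years here).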
Your LA collects the ages of the first fifteen students who walk in the lab classroom to find the average age of students in your lab.
a sample not random
observational study
a study in which the researcher does not actively control the value of any variable, but simply observes the values as they naturally exist. No causation can be assumed.
Multivariate Structure
add additional variables to a scatterplot beyond the two quantitative variables on the axes: - Point sizes - Point colors - Point symbols - Dynamic over time - Interactive
sample
all the cases that we have collected data on—a subset of the population.
outlier
an observed value that is notably distinct from the other values in a dataset.
variable
any characteristic that is recorded for each case.
Study A: A company has 200 employees and would like to select a sample of 40 of them for a study. The employees are numbered from 1 to 200 in random order. After sorted by numerical order, the first 40 names on the list are selected for the study. Study B: A local church wants to get its members' opinions on a new minister that is applying for a job. All of the church members listen to the applicant's sermon. Afterward, they all put their names in a big bowl and the hiring committee randomly chooses 15 names to interview. Study A ____________ use simple random sampling with ___________ egregious source of bias. Study B ______________ use simple random sampling with _____________ egregious source of bias.
did, no, did, no
Over the course of a year, do men spend more money on clothes than women?
difference in means, side by side boxplots, histograms, dotplots
when comparing a quantitative variable across two categories, we compute the
difference in means
Who makes more money, nurses or home care givers?
difference in means, side by side boxplots, histograms, dotplots
Are pregnant women with insomnia more likely to deliver prematurely when compared to women without insomnia?
difference in proportions, side by side bar charts
Are republicans more likely to support gun control laws than democrats?
difference in proportions, side by side bar charts
Categorical Variable
divides cases into categories. Its values are often words or symbols, but be careful because sometimes numbers can be categories too
matched pairs experiment
each case gets both treatments in random order (or cases are paired in some obvious way), and we examine individual differences in the response variable between the two treatments. E.g., each person receives both the treatment and the control.
dotplot (one quant)
each case is represented by a dot and dots are stacked. Easy way to see each case
A relationship between two variables is described below. We can think of one variable as helping to explain the other. Is the indicated variable the explanatory variable or the response variable? Amount of fertilizer used and the yield of a crop. Amount of fertilizer used is the:
explanatory variable.
A relationship between two variables is described below. We can think of one variable as helping to explain the other. Is the indicated variable the explanatory variable or the response variable? Year and the world record time in a marathon.
explanatory variable.
Activity 1: The "randomization" implemented in this experiment included _____________ . The purpose of using the randomization is to _______________
flipping a coin to form groups balance out potential confounding variables
-Interactive:
the user interacts with the graph either by mouse clicks, mouse hover, choosing what to display, or in other ways
not associated
if knowing the value of one variable does not give you any information about the value of the other variable
Slope:
increase in predicted y for every unit increase in x
One student in five reports abandoning safe-sex practices when drunk
individual risk
confounding or lurking variable
is associated with both the explanatory variable and the response variable
The Pth percentile
is the value that is greater than P% of the data. Q1 = 25th percentile, Q3 = 75th percentile
A statistic is resistant if
it is relatively unaffected by extreme values. ex. median
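To see what "resistant" means in practice, the sketch below adds one extreme value to a small made-up dataset and compares how far the mean and the median move. The income figures are hypothetical.

```python
import statistics

incomes = [32, 35, 38, 41, 44]            # hypothetical incomes, in thousands
with_outlier = incomes + [900]            # add one extreme value

# How much each statistic changes when the outlier is added
mean_shift = statistics.mean(with_outlier) - statistics.mean(incomes)
median_shift = statistics.median(with_outlier) - statistics.median(incomes)

print(f"mean moved by {mean_shift:.1f}, median moved by {median_shift:.1f}")
```

The mean jumps by well over 100 while the median moves by only 1.5, which is why the median counts as resistant and the mean does not.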
z score:
how many standard deviations a particular value is from the mean: z = (value - mean) / standard deviation
Maximum
largest data value
left skewed
long left tail
right skewed
long right tail
positive association
means that values of one variable tend to be higher when values of the other variable are higher.
negative association
means that values of one variable tend to be lower when values of the other variable are higher.
Quantitative Variable
measures a numerical quantity for each case. It consists of numerical measures or counts.
Q3
median of the values above m.
Q1
median of the values below m.
The relative risk of having a certain type of disease after the age of 55 when comparing men to women, with women as the baseline, is 3.9. This means that
men are 3.9 times as likely to have the disease as women
The sample size, the number of cases in the sample, is denoted by
n
A research study at McGill University compared 25 men to 25 women, all of which were habitually active, non-smoking, and non-obese people between the ages of 19 and 39 years. Each completed a maximum exercise test on a stationary bike. During this test, each participant rated their breathlessness on a rating scale of 0 to 10 while the researchers monitored the electrical activity level of the diaphragm using an electrode placed at the base of the esophagus by way of the nose. Not only did the women report greater shortness of breath, but their respiratory muscles were also working harder than the guys were. Women, when compared to men, needed greater electrical activation of the diaphragm, in order to compensate for their biologically smaller lungs, airways and breathing muscles. The described study is a(n)_________________ Breathlessness rating is the _____________ variable, and it is _________________ Biological sex is the explanatory variable, and it is _________________ Based on the results we ______________ make cause-and-effect conclusions. We __________________ use the results of this study to make generalizations about all men and women between 19 to 39 years of age?
observational study response quantitative explanatory categorical cannot cannot
In 2012, Washington State voters approved the legalization of marijuana by a margin of 56 to 44.
odds
In 2014 the Upshot calculated the odds of a Republican takeover of the Senate to be about 3 to 1.
odds
In the 2012 election, Hispanic voters voted for President Barack Obama over Mitt Romney by 71 to 27, according exit polls by the Pew Hispanic Center.
odds
difference in proportions =
one group - another group (groups determined by cat variable)
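A small sketch of a difference in proportions, using invented counts for two groups defined by a categorical variable (insomnia vs. no insomnia). The numbers are purely illustrative and do not come from any study.

```python
def proportion(count, total):
    """Proportion of cases in a category: count / total."""
    return count / total

# Hypothetical counts from a two-way table
premature_insomnia, n_insomnia = 30, 200
premature_no_insomnia, n_no_insomnia = 40, 400

p_hat_1 = proportion(premature_insomnia, n_insomnia)        # insomnia group
p_hat_2 = proportion(premature_no_insomnia, n_no_insomnia)  # no-insomnia group

diff = p_hat_1 - p_hat_2   # group 1 minus group 2; the order sets the sign
print(p_hat_1, p_hat_2, diff)
```

Always state which group is subtracted from which, since reversing the order flips the sign of the difference.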
Sampling bias or Selection bias
participants included are not representative of the entire population ( bad sampling methods)
Nonresponse bias
participants included in the sample do not participate in the survey or study
Response bias
participants included in the sample who do respond are motivated or pressured to respond in a certain way that may not be their true feelings
Your TA collects the ages of all the LAs on the teaching team to find the average age of the LAs on the teaching team.
population
Researchers examined all sports-related concussions reported to an emergency room from children ages 5 to 18 in the United States over the course of one year. The table below displays the number of concussions in each of the major activity categories. (a) Are these results from a population or a sample? (d) Can we conclude that, at least in terms of concussions, riding bicycles is more dangerous to children in the US than playing football?
population no
Two quantitative variables are described below. Do you expect a positive or negative association between the two variables? Size of a house and Cost to heat the house
positive.
Intercept
predicted y value when x = 0
Simple linear regression
predicts a response variable, y, as a linear function of one explanatory variable, x. Goal of simple linear regression: Find a straight line that best fits the data
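A minimal sketch of fitting a line, assuming "best fits" means the usual least-squares line (smallest sum of squared residuals). The function name and the data are made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line for y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x   # line passes through (mean_x, mean_y)
    return intercept, slope

# Hypothetical data: x = explanatory variable, y = response variable
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

intercept, slope = least_squares_line(x, y)
predicted_at_3 = intercept + slope * 3    # predicted y when x = 3
print(intercept, slope, predicted_at_3)
```

Here the slope is the change in predicted y per one-unit increase in x, and the intercept is the predicted y when x = 0. Only predict for x values inside the range of the data.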
A linear model
predicts a response variable, y, using a linear function of explanatory variables.
When women take birth control pills, some of the hormones found in the pills eventually make their way into lakes and waterways. In one study, a water sample was taken from various lakes. The data indicate that as the concentration of estrogen in the lake water goes up, the fertility level of fish in the lake goes down. The estrogen level is measured in parts per trillion (ppt) and the fertility level is recorded as the percent of eggs fertilized. One variable is concentration of estrogen in the water and another is fertility level of fish. Is estrogen concentration categorical or quantitative? Is fertility level categorical or quantitative?
quant, quant
Aboriginal Canadians are 1.5 times as likely to develop heart disease when compared to the general Canadian population.
relative risk
Fontham found increased risks of lung cancer with increasing exposure to secondhand smoke, whether it took place at home, work, or in a social setting. A person whose spouse smokes has 1.3 times the lung-cancer risk as that of someone whose spouse does not smoke.
relative risk
What they found was that women who smoked had a risk of getting lung cancer 27.9 times as likely as non-smoking women.
relative risk
relative risk =
risk in category 1/risk in category 2
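A short sketch of relative risk, with risk computed as (number in category) / (number in group). The counts are invented so that the ratio works out to 3.9; they are not real disease data.

```python
def risk(count_with_outcome, group_size):
    """Risk = number with the outcome / number in the group."""
    return count_with_outcome / group_size

def relative_risk(risk_group, risk_baseline):
    """Relative risk = risk in the group of interest / risk in the baseline group."""
    return risk_group / risk_baseline

# Hypothetical counts: 39 of 1000 men and 10 of 1000 women have the disease
risk_men = risk(39, 1000)
risk_women = risk(10, 1000)

rr = relative_risk(risk_men, risk_women)   # women are the baseline
print(f"relative risk = {rr:.1f}")
```

With women as the baseline, this reads as "men are about 3.9 times as likely to have the disease as women." Switching the baseline gives the reciprocal.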
Sample standard deviation:
s
frequency table
shows the number of cases that fall in each category.
Minimum
smallest data value
x or y
stand for any variable
A group of professors who each teach a large lecture class at different universities are curious about how their students think about studying for class. They interviewed a simple random sample of 100 students at one university in southern Georgia about how many hours they studied each week for a previous class and what letter grade they earned in that class. They found that students who studied more are more likely to have received a higher letter grade. What is the population to which we can infer their results?
students at that university in southern Georgia
statistic p hat (p has lil curl)
summary measure for a sample is called
parameter p (p has lil curl)
summary measure for the entire population
dataset
takes the form of a spreadsheet, comprised of variables measured on cases. This type of data is called raw data.
Z-scores
tells us how many standard deviations a particular value is from the mean. Allows comparison of observations from different datasets. z = (observed value - mean) / standard deviation
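A small sketch of using z-scores to compare values from two different scales. The exam means and standard deviations below are hypothetical.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd

# Hypothetical exam A: mean 1100, SD 200; a student scores 1300
z_exam_a = z_score(1300, 1100, 200)

# Hypothetical exam B: mean 21, SD 5; another student scores 28
z_exam_b = z_score(28, 21, 5)

print(z_exam_a, z_exam_b)
```

Even though the raw scores are on different scales, the larger z-score marks the result that is further above its own mean in standard deviation units.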
bar chart- one categorical
the height of each bar corresponds to the number of cases falling in each category
residuals
the difference between an observed value of the response variable and the value predicted by the regression line THE ACTUAL RESPONSE VALUE MINUS THE PREDICTED RESPONSE VALUE.
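A quick sketch of computing residuals as observed minus predicted. The line ŷ = 2.2 + 0.6x and the data points are hypothetical; the line happens to be the least-squares line for these five points.

```python
def predict(x, intercept=2.2, slope=0.6):
    """Predicted response from a hypothetical regression line."""
    return intercept + slope * x

xs = [1, 2, 3, 4, 5]   # explanatory values (made up)
ys = [2, 4, 5, 4, 5]   # observed responses (made up)

# Residual = actual response value minus predicted response value
residuals = [y - predict(x) for x, y in zip(xs, ys)]
print([round(e, 2) for e in residuals])
```

A positive residual means the point sits above the line (the line under-predicts), and a negative residual means the point sits below it.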
Dynamic:
the graph changes over time (also called animated)
scatterplot
the graph of the relationship between two quantitative variables.
For symmetric distributions,
the mean and the median will be about the same
For skewed distributions,
the mean will be more pulled towards the direction of skewness
median, m,
the middle value when the data are ordered. If there are an even number of values, the median is the average of the two middle values.
The larger the standard deviation,
the more variability there is in the data and the more spread out the data
Over 30,000 people participated in an online poll on cnn.com conducted in April 2012 asking "Have you ever driven with a pet on your lap?" We see that 34% of the participants answered yes and 66% answered no. The sample is ______________ . Based on the results of this survey, we ___________ conclude that 34% of all drivers have driven with a pet on their lap.
the over 30,000 people that participated in the poll cannot
Statistical Inference
the process of using data from a sample to gain information about the population. Only valid if the sample is representative of the population.
pie chart - one categorical
the relative areas correspond to the proportion in each category
experiment,
the researcher controls one or more explanatory variable(s).
Statistics
the science of collecting, describing, and analyzing data.
Cases or Units
the subjects or objects that we obtain information about.
randomized experiment
the value of the key explanatory variable is determined randomly for each unit, before the response variable is measured.
number values that are categorical
values must be meaningful as numbers in order to be quantitative. So these are categorical: - Phone number area code - Postal code - Social Security Number
If a randomized experiment yields an association between the explanatory and response variables,
we can establish causation!
randomized comparative experiment,
we randomly assign cases to different treatment groups and then compare results on the response variable(s).
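A minimal sketch of random assignment into two groups by shuffling the units and splitting the list in half. The unit labels, the group names, and the seed are all illustrative choices.

```python
import random

def assign_groups(units, seed=None):
    """Randomly split units into a treatment group and a control group."""
    rng = random.Random(seed)
    shuffled = list(units)     # copy so the original order is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

mice = [f"mouse_{i}" for i in range(1, 21)]   # 20 hypothetical units
treatment, control = assign_groups(mice, seed=42)

print("treatment:", treatment)
print("control:  ", control)
```

Because chance alone decides who lands in which group, potential confounding variables tend to balance out between the groups.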
explanatory variable =
x
sample mean
x̅
95% rule formula-sample
x̅-(2s) and x̅+(2s)
predicted response variable
ŷ
Was the sample randomly selected?
yes --> Possible to generalize to the population no --> Should not generalize to the population
Population mean
μ
95% rule formula-population
μ -(2σ) and μ+(2σ)
Population standard deviation:
σ, root-mean squared distance from μ
Want to be happy? Smile more!
causation
risk =
number in category/number in total group
sample correlation
r
Using a placebo is only helpful if participants are blinded
(that is, they do not know whether they are getting the placebo or the active treatment).
Observational Studies: benefits and disadvantages
- Causal conclusions impossible; confounding can't be eliminated - Generally cheaper - Generally faster
Inference
Confidence intervals Hypothesis tests
heat map
Geographic data are often displayed with color shading on a map, where color corresponds to a variable (categorical or quantitative)
Segmented Bar Chart (2 cat)
Like a side-by-side bar chart, but the bars are stacked instead of side-by-side.
boxplot -one quant
Lines ("whiskers") extend from each quartile to the most extreme value that is not an outlier
Inference with distributions
Many types of inference!
Range
Max - Min
The population is all employees at a company. All employees are emailed a link to a survey. State whether or not the sampling method described produces a random sample from the given population.
Not random
population correlation
ρ (rho)
State whether the data are best described as a population or a sample. The U.S. Department of Transportation announces that of the 250 million registered passenger vehicles in the US, 2.1% are electro-gas hybrids.
Population
Was the explanatory variable randomly assigned?
Yes --> Possible to make conclusions about causality no --> Cannot make conclusions about causality
explanatory variable
use this variable to help explain another
The data are from a personal experiment to compare commuting time based on a randomized selection between two bicycles made of different materials. The bike material, date of the ride, length of the commute in miles, total commute time in hours::minutes::seconds, time converted to minutes only, average speed during the ride in miles per hour, maximum speed in miles per hour, time converted to seconds, and date listed by month are recorded in the dataset.
cases: bike rider categorical variables: month and bike quantitative variables: seconds, time, distance, topspeed, minutes, avgspeed
Give the relevant proportion using correct notation. Of all 1,672,395 members of the high school class of 2014 who took the SAT (Scholastic Aptitude Test), 793,986 were minority students.
p = 793,986/1,672,395 ≈ 0.475
The White House plans to conduct a survey on the attitudes of 'typical Americans'. They send an email to everyone on the '1600 Daily' listserv (a mailing list you can subscribe to to get daily updates from whoever is in the White House). This email contains a link to the survey. Which two of the options below are the most likely sources of bias?
-They only contact people who have signed up for the 1600 Daily, and these people are probably not representative of the US population. -There might be non-response bias because not everyone contacted will complete the survey.
Climate Change: In July 2015, a poll asked a random sample of 1236 registered voters in Iowa whether they agree or disagree that the world needs to do more to combat climate change. The results show that 65% agree, while 25% disagree and 10% don't know. (a) Is the sample likely to be representative of all registered voters in Iowa? (b) Is it reasonable to generalize this result and estimate that 65% of all registered voters in Iowa agree that the world needs to do more to combat climate change?
yes, yes
ordinal
• Year in school • Quality of service (poor, fair, good, excellent). Categories have levels or are hierarchical.
What proportion of all children in the study were given antibiotics during their first year of life? Include proper notation.
𝑝̂ =438/616=0.71
Considering the activity you performed with your classmates at the beginning of this lab, if we (hypothetically) determined that visual aids do help memorization, could we make the claim that visual aids increase memorization score, based on this experiment?
Yes, because this is a randomized experiment
An organization promoting affordable daycare conducts a survey. They randomly select 1,000 mothers who chose to stay home with their children during the first year. After calculating how much income, retirement benefits, and wage growth each woman forfeited by staying home, the women are asked whether they regretted staying home for an entire year. Do you think there are possible sources of bias? If so, what do you think is the most likely source?
Yes, the question is asked in a way that is meant to influence the participant's answers.
You want to estimate the proportion of all Penn State students who have season football tickets by asking a sample of twenty students who are camping out in Nittanyville the night before the annual white-out game.
a sample not random
New research supports the idea that people who get a good night's sleep look more attractive. In the study, 23 subjects ages 18 to 31 were photographed twice, once after a good night's sleep and once after being kept awake for 31 hours. Hair, make-up, clothing, and lighting were the same for both photographs. Observers then rated the photographs for attractiveness, and the average rating under the two conditions was compared. The researchers report in the British Medical Journal that "Our findings show that sleep-deprived people appear less attractive compared with when they are well rested." (a) What is the explanatory variable? What is the response variable? (b) Is this an experiment or an observational study? If it is an experiment, is it a randomized comparative design or a matched pairs design? (c) Can we conclude that sleep deprivation causes people to look less attractive? Why or why not?
a) amount of rest, attractiveness rating b) Matched pairs experiment c) Yes, we can, because an association was found and the result comes from an experiment.
Male fertility rates and the death rate of men in car crashes are both decreasing.
association w/o causation
Indicate whether we should trust the results of the following study. Is the method of data collection biased? Ask a random sample of students at the library on a Friday night "How many hours a week do you study?" to collect data to estimate the average number of hours a week that all college students study.
biased
response variable
variable affected or explained by the explanatory variable
State whether the following claim is one of association and causation, association only, or neither association nor causation. Cat owners tend to be more educated than dog owners.
Association only
State whether the following claim is one of association and causation, association only, or neither association nor causation. Cell phone radiation leads to deaths in honey bees
Association with causation
State whether the following claim is one of association and causation, association only, or neither association nor causation. Daily exercise improves mental performance.
Association with causation
Data
Collecting Data Describing Data
Yes, the question is asked in a way that is meant to influence the participant's answers.
Even though this study incorporated random sampling, there may be a systematic bias because of only selecting the first ten eggs laid.
Is the following an experiment or an observational study? To examine whether planting trees reduces air pollution, we find a sample of city blocks with similar levels of air pollution and we then plant trees in half of the blocks in the sample. After waiting an appropriate amount of time, we measure air pollution levels.
Experiment
If we have learned to solve problems by one method, we often have difficulty bringing new insight to similar problems. However, electrical stimulation of the brain appears to help subjects come up with fresh insight. In a recent experiment conducted at the University of Sydney in Australia, 40 participants were trained to solve problems in a certain way and then asked to solve an unfamiliar problem that required fresh insight. Half of the participants were randomly assigned to receive noninvasive electrical stimulation of the brain while the other half (control group) received sham stimulation as a placebo. The participants did not know in which group they were. In the control group, 20% of the participants successfully solved the problem while 60% of the participants who received brain stimulation solved the problem. (a) Is this an experiment or an observational study? (b) From the description, does it appear that the study is double-blind, single-blind, or not blind? (f) Does electrical stimulation of the brain appear to help insight?
Experiment Single-blind Yes
A recent headline reads "Early Language Skills Reduce Preschool Tantrums, Study Finds", and the article offers a potential explanation for this: "Verbalizing their frustrations may help little ones cope." The article refers to a study that recorded the language skill level and the number of tantrums of a sample of preschoolers.
Observational Study
Is the following an experiment or an observational study? To examine whether eating brown rice affects metabolism, we ask a random sample of people whether they eat brown rice and we also measure their metabolism rate.
Observational study
A group of professors who each teach a large lecture class at different universities are curious about how their students think about studying for class. They interviewed a simple random sample of 100 students at one university in southern Georgia about how many hours they study each week for a previous class and what letter grade they earned in that class.
The response variable is: letter grade earned. The explanatory variable is: study hours. The cases are: students. If study hours is reported as time spent per week, then it's a: quantitative variable. If study hours is reported as a tier, such as tier 0 = no hours, tier 1 = 0.1 to 3 hours, tier 2 = 3.1 to 6 hours, tier 3 = 6.1 to 10 hours, tier 4 = 10.1 to 15 hours, tier 5 = more than 15 hours, then it's a: categorical variable. If letter grade is recorded as A, B, C, D, or F, then it's a: categorical variable. If letter grade is recorded as 4.0, 3.0, 2.0, 1.0, 0.0 and the professors want to look at an overall average value, then it's a: quantitative variable.
Can experiences of parents affect future children? New studies suggest that they can: Early life experiences of parents appear to cause permanent changes in sperm and eggs. In one study, some male rats were fed a high-fat diet with 43% of calories from fat (a typical American diet), while others were fed a normal healthy rat diet. Not surprisingly, the rats fed the high-fat diet were far more likely than the normal-diet rats to develop metabolic syndrome (characterized by such things as excess weight, excess fat, insulin resistance, and glucose intolerance.) What surprised the scientists was that the daughters of these rats were also far more likely to develop metabolic syndrome than the daughters of rats fed healthy diets. None of the daughters and none of the mothers ate a high-fat diet and the fathers did not have any contact with the daughters. The high-fat diet of the fathers appeared to cause negative effects for their daughters. One variable is whether or not the male was fed a high-fat diet or a normal diet and another variable is whether or not the daughters developed metabolic syndrome. Is the type of diet variable categorical or quantitative?
categorical, categorical, explanatory variable
Smaller hands lead to lower typing accuracy of young students. This study shows
causation
Is there a relationship between amount of sleep (hrs/day) and BMI for school-age children?
correlation value, scatterplots
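A minimal sketch of the "correlation value" half of this answer, using the standard formula for the sample correlation r. The sleep and BMI numbers are made up for illustration only (they are not from any study), and `pearson_r` is a hypothetical helper name. A scatterplot is still needed alongside r, since r only measures linear association.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation r between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours of sleep per day and BMI for six children
sleep_hours = [7.5, 8.0, 9.0, 9.5, 10.0, 10.5]
bmi = [21.0, 20.2, 19.5, 18.9, 18.0, 17.6]

r = pearson_r(sleep_hours, bmi)
print(f"r = {r:.3f}")  # negative here because BMI falls as sleep rises in the made-up data
```

Swapping the two lists gives the same r, which matches the fact that the correlation between X and Y equals the correlation between Y and X.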
To determine if ingesting caffeine helps mice learn the way through a maze faster, we randomly divide 20 mice into two groups of ten. Half the mice get caffeine in their food, while the other half get identical food without the caffeine. The researchers that interact with the mice don't know which mice are getting caffeine and which mice are not. After the mice eat, we measure the time it takes for the mice to learn the maze and compare the results between the two groups. This study is: _______________ and __________________ use a placebo. The randomization used in this study is ____________________.
double-blinded; did; random assignment into groups
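A short sketch of what "random assignment into groups" could look like for the 20 mice. The mouse labels, the seed, and the function name `randomly_assign` are assumptions for illustration; the card does not say how the randomization was actually carried out.

```python
import random

def randomly_assign(units, seed=None):
    """Shuffle the units and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(units)   # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical labels for the 20 mice
mice = [f"mouse_{i}" for i in range(1, 21)]
caffeine_group, control_group = randomly_assign(mice, seed=1)

print("caffeine:", caffeine_group)
print("control: ", control_group)
```

Note that this is random assignment to treatments, which is what supports a causal conclusion. It is a different step from random sampling, which is about who gets into the study in the first place.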
simple random sample
each unit of the population has the same chance of being selected, regardless of the other units chosen for the sample.
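One way to draw a simple random sample in Python is `random.sample`, which selects without replacement so that every unit has the same chance of ending up in the sample. The population of 1,000 ID numbers and the sample size of 100 below are hypothetical.

```python
import random

# Hypothetical sampling frame: ID numbers 1 through 1000
population = list(range(1, 1001))

rng = random.Random(0)                  # fixed seed only so the sketch is repeatable
sample = rng.sample(population, k=100)  # 100 distinct units, chosen without replacement

print(sorted(sample)[:10])  # first few sampled IDs
```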
Nominal
ex. • Political party • Eye color
A somewhat surprising fact about coffee is that the longer it is roasted, the less caffeine it has. Thus an "extra bold" dark roast coffee actually has less caffeine than a light roast coffee. What is the explanatory variable and what is the response variable? The roasting time is the __________________ variable. The amount of caffeine is the __________________ variable. Do the two variables have a negative association or a positive association?
explanatory; response; negative association (longer roasting goes with less caffeine)
population
includes all individuals or objects of interest.
Activity 2: Consider the below scenario from your lab: Two weight training regimens, Regimen A and Regimen B, are designed to improve arm strength. The goal is to compare the two different methods. Forty individuals will participate in the study. We pair individuals based on arm strength measured before the experiment begins so that each pair should have approximately the same beginning arm strength. Within each matched-pair, randomly assign one person to use Regimen A and the other person to use Regimen B. After the six-week regimens are completed, we measure and record the arm strength of each person. This is a _______________ . It ________________ placebos and is _________________
matched-pairs experiment; does not use; not blinded
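The pairing-then-randomizing procedure in this scenario can be sketched as follows. The baseline strength values are randomly generated placeholders, and `matched_pairs_assignment` is a hypothetical helper; the lab itself does not specify how the pairs were formed beyond "approximately the same beginning arm strength."

```python
import random

def matched_pairs_assignment(strength_by_person, seed=None):
    """Pair people with similar baseline strength, then randomize within each pair."""
    rng = random.Random(seed)
    # Rank everyone by baseline strength so neighbors in the list are similar
    ranked = sorted(strength_by_person, key=strength_by_person.get)
    assignment = {}
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)          # coin flip: who in the pair gets Regimen A
        assignment[pair[0]] = "A"
        assignment[pair[1]] = "B"
    return assignment

# Hypothetical baseline arm strength for the 40 participants
data_rng = random.Random(42)
strengths = {f"person_{i}": data_rng.uniform(20, 60) for i in range(1, 41)}

assignment = matched_pairs_assignment(strengths, seed=7)
print(sum(1 for regimen in assignment.values() if regimen == "A"), "people on Regimen A")
```

Because each pair contributes exactly one person to each regimen, the two groups start out balanced on arm strength by design.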
Two quantitative variables are described below. Do you expect a positive or negative association between the two variables? Outside temperature and amount of clothes worn
negative
Wearing a Uniform to Work: The website fox6now.com held an online poll in June 2015 asking "What do you think about the concept of having an everyday uniform for work, like Steve Jobs did?" Of the people who answered the question, 24% said they loved the idea, 58% said they hated the idea, and 18% said that they already wore a uniform to work. (a) Are the people who answered the poll likely to be representative of all adult workers? (b) Is it reasonable to generalize this result and estimate that 24% of all adult workers would like to wear a uniform to work?
no, no
Suppose Professor Smith feels that more students in his class earned an A on the common test compared to Professor Yang's class. To prove his claim, he uses letter grades for the test scores. He finds that his class has 16 students who earned an A out of 35 students whereas in Professor Yang's class, only 5 out of 22 students earned an A. What summary statistics and visual representation should Professor Smith use to display these findings?
two-way table and a side-by-side bar chart
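The two-way table for this card can be built directly from the counts in the prompt (16 of 35 for Smith, 5 of 22 for Yang). The sketch below is one possible layout; the dictionary structure and the name `prop_a` are illustrative choices, not part of the original problem.

```python
# Two-way table of counts: professor (rows) by grade category (columns)
counts = {
    "Smith": {"A": 16, "not A": 35 - 16},
    "Yang":  {"A": 5,  "not A": 22 - 5},
}

prop_a = {}
print(f"{'Professor':<10}{'A':>6}{'not A':>8}{'Total':>8}{'Prop. A':>10}")
for professor, row in counts.items():
    total = row["A"] + row["not A"]
    # Proportion = number in category / total in that class
    prop_a[professor] = row["A"] / total
    print(f"{professor:<10}{row['A']:>6}{row['not A']:>8}{total:>8}{prop_a[professor]:>10.3f}")
```

Comparing proportions rather than raw counts matters here because the two classes have different sizes; the side-by-side bar chart would plot these same two proportions.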
People who are evening types go to bed later and wake up later than morning types. They also tend to move around far less throughout the day, according to an interesting new study of how our innate body clocks may be linked to our physical activity habits. The study, one of the first to objectively track daily movements of a large sample of early birds and night owls, suggests that knowing our chronotype might be important for our health. So, for the new study, which was published in June in the Scandinavian Journal of Medicine & Science in Sports, researchers at the University of Oulu in Finland turned to some of their fellow Finns. Years before, more than 12,000 had become part of an ongoing study of the health of almost every child born in Oulu in 1966. Now, the researchers checked in with almost 6,000 of them still living in the Oulu area and willing to participate in a follow-up study. These men and women, all age 46, visited the university for an in-person exam, which included medical and other tests and a variety of questionnaires, including one designed to determine their chronotypes. The researchers also gave each volunteer an activity tracker and asked them to wear it for two weeks, providing objective data about their physical activities. Then the scientists compared how people moved with how their internal clocks chimed. The described study is a(n) _________________________. The explanatory variable is ___________________________. There are _______________. Based on the results of this study, we _________________ make a cause-and-effect conclusion. These results _________________ be generalized to young adults ages 18 to 30 years.
observational study; chronotype (early bird or night owl); more than two; cannot; cannot
Among smokers in Maine, what is the average number of cigarettes smoked per day?
one mean, boxplot, histogram, or dotplot
How many grams of protein does the average woman eat?
one mean, boxplot, histogram, or dotplot
What proportion of Americans approve of the job Congress is doing?
one proportion, bar chart or pie chart
The science department where professors Smith, Yang, Andrews and Davis work is ready to put an end to the debate, so the following semester, Professor Smith and Professor Yang are scheduled to teach Calculus I at the same time on the same days of the week. Students enroll in Calculus I and will be randomly assigned to sit in on either Smith's or Yang's class for the entire semester. This is an example of a(n) __________________ so we __________________ establish causality. We would ____________________ to have a confounding variable potentially responsible for an association between professor and grade because this situation _______________________ use random assignment.
randomized comparative experiment; can; not expect; does
Participants were next taken to a lab where they completed three distractor tasks which lasted about 30 minutes. After this, they were given a test which was based on the information provided in the videos and included both factual and conceptual questions. Each test was graded by a person who didn't know which method was used to take the notes. Three key findings from this study are: The described study is a(n) _____________. The explanatory variable is _________________. Based on the results of this study, we _______________ make a cause-and-effect conclusion.
randomized experiment; note-taking approach (longhand vs. laptop); can
A 2017 study about the opioid crisis found that opioid-use disorder was 40 times as likely in patients prescribed high doses of painkillers for a short period of time, compared with patients that were prescribed low doses for a long period of time.
relative risk
A recent study about the link between fitness and breast cancer found that rats with low natural fitness were about four times as likely to develop breast cancer as rats with high fitness were.
relative risk
1 in 5 Americans die of heart disease.
risk
In a recent 18-year study about the link between menopause hormones and early death, 27.1% of hormone users died.
risk
In the same recent 18-year study, 27.6% of participants who took a placebo died.
risk
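To tie the risk and relative risk cards together: a risk is a single proportion, while a relative risk is the ratio of two risks. The sketch below uses the figures quoted on the cards above (1 in 5, 27.1%, 27.6%); the function names are illustrative, and treating the placebo group as the comparison group is an assumption about how the ratio would be set up.

```python
def risk(events, total):
    """Risk = number with the outcome / number in the group."""
    return events / total

def relative_risk(risk_group, risk_comparison):
    """Relative risk = risk in one group / risk in the comparison group."""
    return risk_group / risk_comparison

heart_disease_risk = risk(1, 5)   # "1 in 5 Americans die of heart disease"
hormone_risk = 0.271              # 27.1% of hormone users died
placebo_risk = 0.276              # 27.6% of placebo participants died

rr = relative_risk(hormone_risk, placebo_risk)
print(f"risk of heart disease death: {heart_disease_risk:.2f}")
print(f"relative risk, hormone vs. placebo: {rr:.3f}")
```

A relative risk close to 1 means the two groups had nearly the same risk. Statements such as "40 times as likely" or "four times as likely" are relative risks of 40 and 4.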
Activity 2: Consider the first two scenarios described in your lab about determining whether exercise helps increase certain chemicals in the brain. Scenario 1: We contact a random sample of 100 people and record how much each person exercises. We also measure the chemicals in the brain for each person. Scenario 2: We randomly assign half of the 100 participants to participate in a regular exercise program for a six-week period while the other half makes no changes. At the end of the time period, we measure the brain chemicals. ____________________ is an observational study. The randomized experiment is ___________________. This scenario _____________ placebos, and it is _________________. We can claim causation from the results found with ______________________.
Scenario 1; Scenario 2; does not use; not blinded; Scenario 2
An experiment is conducted at a high school where 16 students are asked to complete a two-part exam: one part is written, while the other part is oral. When this two-part exam was given, 8 students were randomly assigned to do the written part first and the oral part second, while the other 8 students did the reverse order of the oral part first and the written part second. The case is the ______________. This study is a(n) _______________.
student; matched-pairs experiment