lec assignments
The recorded values for ID# ct-9 are: exposure time, 7 days; WBC count decrease, 535. Using this information and the regression equation, calculate the residual for this observation.
-2305.41
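A minimal sketch of the residual arithmetic, assuming the fitted line quoted in the cigarette smoke question (decrease in WBC = 1889.95 + 135.78 * days of exposure). The function names are illustrative only.

```python
# Fitted line quoted in the exposure question:
# decrease in WBC = 1889.95 + 135.78 * days of exposure
INTERCEPT = 1889.95
SLOPE = 135.78

def predict_decrease(days):
    """Predicted decrease in WBC count after `days` of exposure."""
    return INTERCEPT + SLOPE * days

def residual(days, observed_decrease):
    """Residual = observed value minus predicted value."""
    return observed_decrease - predict_decrease(days)

# Observation ID# ct-9: 7 days of exposure, observed decrease of 535
ct9_residual = residual(7, 535)
print(round(ct9_residual, 2))
```

A negative residual means the observed decrease sits below the regression line.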
Use this table to calculate the risk of illness among males.
0.0127
A disease that is often linked to over-dependence on corn as a staple food is known as Pellagra. The table below shows the number of cases in South Carolina by sex in the 1920s (columns: Yes, No, Row Total). Female: 46, 1436, 1482. Male: 18, 1401, 1419. Column Total: 64, 2837, 2901. Use this table to calculate the risk of illness among females.
0.031
Use this table to calculate the relative risk for females in comparison to males.
0.031/0.0127 = 2.441
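The risk and relative risk calculations can be sketched as follows, using the counts from the Pellagra table. Variable names are illustrative. Note that dividing the unrounded risks gives roughly 2.447, slightly different from the 2.441 obtained from the rounded risks 0.031 and 0.0127.

```python
# Counts from the Pellagra table: (number ill, row total) for each sex
female_ill, female_total = 46, 1482
male_ill, male_total = 18, 1419

# Risk = number ill / number in the group
risk_female = female_ill / female_total
risk_male = male_ill / male_total

# Relative risk of females compared with males
relative_risk = risk_female / risk_male

print(round(risk_female, 3), round(risk_male, 4), round(relative_risk, 3))
```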
Given R² ("R-squared" in the plot), calculate r. Make sure you give it the correct sign. You'll need a calculator.
0.4057
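A small sketch of recovering r from R²: take the square root and give it the sign of the regression slope. The R² value from the plot is not reproduced in these notes, so the value below is back-computed from the recorded answer and is an assumption; the slope 135.78 assumes the plot is the WBC regression plot.

```python
import math

def r_from_r_squared(r_squared, slope):
    """Correlation r: square root of R^2, carrying the sign of the slope."""
    return math.copysign(math.sqrt(r_squared), slope)

# Assumed R^2, back-computed from the recorded answer r = 0.4057
r_squared = 0.4057 ** 2

# Slope of the fitted line is positive, so r is positive
r = r_from_r_squared(r_squared, slope=135.78)
print(round(r, 4))
```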
The Human Resources (HR) department at UCI wants to start an investigation into the gender pay gap among its faculty members. HR is interested in whether female professors are being paid less than male professors. So, they decide to take a sample of 10 female and 10 male professors. Then, they see that the salaries for the 10 female professors are a bit higher than those of the 10 male professors. Hence, they conclude that there is no gender pay gap at UCI. The explanatory variable is (1) and the response variable is (2).
1) gender 2) pay/salary
The estimate that the previous question asks you to make is an example of (1) and the prediction (2)
1. extrapolation 2. cannot be trusted
Use your results from previous parts to calculate the chi squared test statistic. NOTE: if your answer does not match any of the options below, you should consider recalculating the previous two questions.
11.323
What would this linear regression model predict the average decrease in WBC counts to be after 15 days of exposure to cigarette smoke?
3926.65
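The prediction is a direct substitution into the fitted line; a minimal sketch, assuming the regression equation quoted in the cigarette smoke question (the function name is illustrative):

```python
# Fitted line: decrease in WBC = 1889.95 + 135.78 * days of exposure
INTERCEPT = 1889.95
SLOPE = 135.78

def predict_decrease(days):
    """Predicted average decrease in WBC count after `days` of exposure."""
    return INTERCEPT + SLOPE * days

prediction_15 = predict_decrease(15)
print(round(prediction_15, 2))
```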
We wish to test if there is a statistically significant difference between the two categories, so our null hypothesis is: sex and Pellagra illness are NOT related in the population. Choose the appropriate alternative hypothesis:
Alternative hypothesis: sex and Pellagra illness are related in the population
A researcher at UCI wants to study whether consuming energy drinks before an exam leads to an increase in performance. The researcher takes a random sample of UCI students who have similar GPAs and asks her assistant to randomly assign half of them to a control group and the rest to the experimental group. Before participants begin their exam, they are asked to drink an energy drink. All drinks have the same packaging, but the ones given to the control group are just flavored water. The real energy drinks contain high levels of caffeine. What type of study is this?
Experiment
Robert has a large garden in his backyard and is interested in studying the rate of plant growth with a new plant food he purchased. Since different parts of his garden have different amounts of sun exposure, he divides the backyard into regions based on similar sun exposure. He randomly selects a plant from each section and measures its height before feeding it the plant food and one month after administering this food. His sample is made up of the plants he selected from each section. True or False: Robert can use his results to talk about all plants outside of his garden.
False
Using the table above and the appropriate formula, calculate the expected count for each cell in the table.
Female and yes = 32.69, Female and no = 1449, Male and yes = 31.305, Male and no = 1387
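The expected counts follow from (row total * column total) / grand total for each cell. A minimal sketch using the observed counts from the Pellagra table (the dictionary layout is just one illustrative choice):

```python
# Observed counts from the Pellagra table, keyed by (sex, ill?)
observed = {
    ("Female", "Yes"): 46, ("Female", "No"): 1436,
    ("Male", "Yes"): 18, ("Male", "No"): 1401,
}

# Accumulate row (sex) and column (illness) totals
row_totals = {}
col_totals = {}
for (sex, ill), n in observed.items():
    row_totals[sex] = row_totals.get(sex, 0) + n
    col_totals[ill] = col_totals.get(ill, 0) + n
grand_total = sum(observed.values())

# Expected count under independence: row total * column total / grand total
expected = {
    (sex, ill): row_totals[sex] * col_totals[ill] / grand_total
    for (sex, ill) in observed
}

for cell, value in expected.items():
    print(cell, round(value, 3))
```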
Use the table above to calculate the "(observed - expected count)^2 / expected count" for each cell.
Female and yes = 5.41, Female and no = 0.1166, Male and yes = 5.655, Male and no = 0.141
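As a cross-check, the per-cell contributions and their sum can be computed without rounding the expected counts. This is a sketch, not the course's prescribed method. With unrounded expected counts the total comes out near 11.32; the recorded quiz answer of 11.323 appears to reflect expected counts rounded to 1449 and 1387 in the "No" column.

```python
# Observed counts: rows are Female, Male; columns are Yes, No
observed = [[46, 1436], [18, 1401]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

contributions = []
for i, row in enumerate(observed):
    contributions.append([])
    for j, obs in enumerate(row):
        # Expected count under independence
        exp = row_totals[i] * col_totals[j] / grand_total
        # (observed - expected)^2 / expected for this cell
        contributions[i].append((obs - exp) ** 2 / exp)

# Chi-squared statistic is the sum over all four cells
chi_squared = sum(sum(row) for row in contributions)
print([[round(c, 3) for c in row] for row in contributions])
print(round(chi_squared, 2))
```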
Choose the correct interpretation of the relative risk you calculated above. (Let's call the relative risk that you calculated in the previous question RR)
Females have a RR-fold greater risk for Pellagra than males
A dentist in Irvine, Dr. Parks, is interested in surveying patients to find out their pain level one hour after receiving a cavity filling. He asks the receptionist to call 10 patients and ask them to rate their pain level from 1 to 10 (1 = no pain, 10 = excruciating pain). However, he was only able to sample the patients who were interested in participating in this survey. Assuming that the receptionist is not causing bias, what would be the best approach to minimize the bias?
He should randomly select the 10 patients from all patients that receive a cavity filling.
Suppose that we are interested in studying the chest circumference of healthy newborn baby girls in Orange County. A group of 25 newborn girls were randomly sampled from different hospitals in Orange County. Which of the following is an example of inferential statistics based on this study?
Making a conclusion about the average chest circumference of newborn baby girls in Orange County.
A survey is mailed to a random sample of residents in a city asking whether or not they think the current mayor is doing an acceptable job. What type of bias do you think would most likely be introduced in this type of situation? You can argue for more than one answer but please select the answer that would be the main source of bias.
Nonresponse bias. People who feel strongly about the mayor are more likely to respond.
A large sample of teenagers between the ages of 12 and 16 were asked to keep track of the number of hours they spend on social media and then answer some questions that assess their sense of self-worth and self-esteem. The answers to these questions were then converted to a score that indicates their psychological well-being. Additional variables such as gender, age, and exercise (yes/no) were also collected for the analysis. What is the explanatory variable?
Number of hours spent on social media
A large sample of teenagers between the ages of 12 and 16 were asked to keep track of the number of hours they spend on social media and then answer some questions that assess their sense of self-worth and self-esteem. The answers to these questions were then converted to a score that indicates their psychological well-being. Additional variables such as gender, age, and exercise (yes/no) were also collected for the analysis. What type of study is this?
Observational study
Choose the correct interpretation of the SLOPE estimate:
On average, an increase of 1 day of exposure to cigarette smoke is associated with a decline in white blood cell counts of 135.78.
Students in a prestigious elementary school are given a skills test to measure their mental capabilities before they start school, and are followed till their graduation from high school to study the effectiveness of the curriculum in improving their skills. The goal of this study is to see if the curriculum had a significant effect in enhancing students' skills. This is an example of a
Prospective study
A large sample of teenagers between the ages of 12 and 16 were asked to keep track of the number of hours they spend on social media and then answer some questions that assess their sense of self-worth and self-esteem. The answers to these questions were then converted to a score that indicates their psychological well-being. Additional variables such as gender, age, and exercise (yes/no) were also collected for the analysis. What is the response variable?
Psychological well-being score
A dentist in Irvine, Dr. Parks, is interested in surveying patients to find out their pain level one hour after receiving a cavity filling. He asks the receptionist to call 10 patients and ask them to rate their pain level from 1 to 10 (1 = no pain, 10 = excruciating pain). However, he was only able to sample the patients who were interested in participating in this survey. Suppose Dr. Parks fears that the intimidating receptionist will make patients hesitant to complain about their pain. What kind of bias can result from this kind of survey study? (Don't worry. Dr. Parks is looking for a new receptionist...)
Response
Suppose that we are interested in studying the chest circumference of healthy newborn baby girls in Orange County. A group of 25 newborn girls were randomly sampled from different hospitals in Orange County. This is an example of what kind of sample design?
Simple Random Sample
Robert has a large garden in his backyard and is interested in studying the rate of plant growth with a new plant food he purchased. Since different parts of his garden have different amounts of sun exposure, he divides the backyard into regions based on similar sun exposure. He randomly selects a plant from each section and measures its height before feeding it the plant food and one month after administering this food. His sample is made up of the plants he selected from each section. What kind of sample design is this?
Stratified Random Sample
A large sample of teenagers between the ages of 12 and 16 were asked to keep track of the number of hours they spend on social media and then answer some questions that assess their sense of self-worth and self-esteem. The answers to these questions were then converted to a score that indicates their psychological well-being. Additional variables such as gender, age, and exercise (yes/no) were also collected for the analysis. What are the subjects/experimental units?
Teenagers between ages 12 and 16
The Irvine school district is interested in the average body mass index (BMI) of its middle school students to better understand their health status. They randomly select one middle school from all the schools in their district and record the BMI of all the students in that school. The mean BMI for the middle schoolers in that particular school was found to be 26. From this information, the Irvine school district concluded that the average BMI for all middle schoolers in the district was probably between 24 and 29. Which of the following is true about the above information?
The district's conclusion that the average BMI for middle schoolers in the district is between 24 and 29 is an example of inferential statistics. The mean BMI of 26 from the sample is an example of descriptive statistics. This is an example of a cluster sample.
Choose the correct interpretation of the INTERCEPT estimate
The estimated average decrease in white blood cell counts at 0 days of cigarette smoke exposure is 1889.95
Anscombe's quartet is a famous set of 4 datasets illustrating the need to visualize data before fitting models to it. Each dataset consists of 11 observations of an X variable and 11 observations of a Y variable. Summary statistics for the x and y variables of each dataset are in the table below. Note: the last two columns contain what we are calling in this class b0 (the intercept) and b1 (the slope coefficient, labeled "x_effect" here) when performing a linear regression of Y on X. Which of the following statements are TRUE? (Select all that apply)
The outlier in dataset 4 is also a high-leverage point and an influential point. The residuals from a linear regression for dataset 2 are likely to exhibit strong patterning. If we rounded to 2 decimal places, the linear regression formulas for all four datasets would be identical.
Select the answer that best completes this sentence: The proper interpretation of R² is _____________________.
The proportion of the variability in the decrease in WBC counts that is explained by the number of days of cigarette smoke exposure
The Sampling Frame is the set of all people you wish to sample from and each sample will select different people and yield different values, known as Sampling Variability.
True
A researcher at UCI wants to study whether consuming energy drinks before an exam leads to an increase in performance. The researcher takes a random sample of UCI students who have similar GPAs and asks her assistant to randomly assign half of them to a control group and the rest to the experimental group. Before participants begin their exam, they are asked to drink an energy drink. All drinks have the same packaging, but the ones given to the control group are just flavored water. The real energy drinks contain high levels of caffeine. What are the subjects/experimental units?
UCI students
A dentist in Irvine, Dr. Parks, is interested in surveying patients to find out their pain level one hour after receiving a cavity filling. He asks the receptionist to call 10 patients and ask them to rate their pain level from 1 to 10 (1 = no pain, 10 = excruciating pain). However, he was only able to sample the patients who were interested in participating in this survey. What kind of study design is this?
Voluntary Response Sampling
A researcher at UCI wants to study whether consuming energy drinks before an exam leads to an increase in performance. The researcher takes a random sample of UCI students who have similar GPAs and asks her assistant to randomly assign half of them to a control group and the rest to the experimental group. Before participants begin their exam, they are asked to drink an energy drink. All drinks have the same packaging, but the ones given to the control group are just flavored water. The real energy drinks contain high levels of caffeine. Is blinding used in this experiment?
Yes, the experiment is double-blinded, since neither the researcher nor the participants know which treatment each participant is assigned.
A group of researchers at UCI are interested in measuring the effect of length of exposure to cigarette smoke on white blood cell counts. As part of an experiment, they take blood samples from 12 rats before and after exposure to cigarette smoke for varying lengths of time, and measure the decrease in white blood cell (WBC) counts. Their data is plotted below, with a linear regression line. The formula for the linear regression line is: decrease in WBC = 1889.95 + 135.78 * (days of exposure). The point marked "ID# ct-9" is
an outlier and an influential point
The Human Resources (HR) department at UCI wants to start an investigation into the gender pay gap among its faculty members. HR is interested in whether female professors are being paid less than male professors. So, they decide to take a sample of 10 female and 10 male professors. Then, they see that the salaries for the 10 female professors are a bit higher than those of the 10 male professors. Hence, they conclude that there is no gender pay gap at UCI. This is an example of
observational study.
Based on the plot, is r (the correlation coefficient) going to be positive or negative? No calculations necessary here - just look at the plot.
positive
Based on the chi squared test statistic you calculated in the previous question: - None of the above - the chi squared we found ≥ 3.841, so we say the relationship is statistically significant and we reject the null hypothesis - the chi squared we found < 3.841, so we cannot say the relationship is statistically significant and we cannot reject the null hypothesis - we can not tell based on the information provided
the chi squared we found ≥ 3.841, so we say the relationship is statistically significant and we reject the null hypothesis
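The decision rule amounts to a single comparison against the critical value 3.841 quoted in the question (the chi-squared cutoff for 1 degree of freedom at the 0.05 level). A minimal sketch, with an illustrative function name:

```python
# Critical value quoted in the question (chi-squared, 1 df, 0.05 level)
CRITICAL_VALUE = 3.841

def is_significant(chi_squared, critical_value=CRITICAL_VALUE):
    """True when the test statistic meets or exceeds the critical value."""
    return chi_squared >= critical_value

# Test statistic recorded for the Pellagra table
reject_null = is_significant(11.323)
print(reject_null)
```

Since 11.323 is well above 3.841, the null hypothesis that sex and Pellagra illness are unrelated is rejected.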