STAT 107 Midterm 1
Suppose you have a histogram with four bins. The first three bins have a total area of 61%. What is the area of the fourth bin?
39
Check ALL (zero or more) of the rights you have under FERPA (unless it's an extenuating circumstance).
(a) Your mom doesn't get to see the D you got in calculus
Check ALL (zero or more) of the rights you have under HIPAA (unless it's an extenuating circumstance).
(a) Your parents don't get to know about your trip to McKinley (health services)
Which are the ethical principles upon which the "Common Rule", and therefore guidelines for human subject research, are based?
(b) Beneficence (c) Justice (d) Respect of persons
A recent study compared two different treatments for repairing a torn knee ligament. The subjects were 32 active, young adult volunteers who had acute knee ligament injuries. They were randomly divided into two groups: Group A received physical therapy and surgery, while Group B received only physical therapy. No group received a fake surgery. Evaluators who were aware of which patients were in which group rated the subjects on knee strength, stability, flexibility, etc., over a two-year period and found that both groups said that they felt better, but there were no significant differences on any measure between the two groups.
(c) It's a randomized controlled experiment without a placebo and without "blind" evaluators.
Researchers tracked 119,000 men and women over a 30-year period and found that those who ate nuts every day lived longer than those who didn't eat nuts every day. The people who ate nuts were less likely to develop heart disease and cancer as well. Is it appropriate to conclude that eating nuts causes people to live longer?
(c) We aren't sure. The study shows there is an association between eating nuts and living longer, but there could be a confounder.
Who does the General Data Protection Regulation (GDPR) protect?
(d) Citizens of the European Union
Prof. Karle has included one particular multiple-choice question on the final in large lectures for the past 2 years (approximately 1000 students in the class). She would like to determine whether or not a new phrasing of this question would improve student responses. Prof. Karle decide to include the question with the new phrasing on this semester's final and then compare the percentage of students who answered the question correctly this semester to the percentage of students who answered the question correctly last semester with the old phrasing. Which of the following represents the best improvement on the design of the experiment?
(d) Give half of this semester's students a version of the final with the old phrasing and half of this semester's students a version with the new phrasing. Use a random procedure (like flipping a fair coin) to decide who gets which version. Compare the results from these two groups.
A study tried to see whether men's mental abilities diminished when they think they are observed by a woman. Subjects were first given a test, then performed a lip-reading task, then given another test. Some subjects were told that a male was observing their lip-reading, and others were told a female did the observing. (Which sex was decided randomly.) The observers sent instant messages, but otherwise the subject never saw or heard from their observer, so didn't know for sure the observer's sex. Neither the subjects, nor their evaluators knew who was in the treatment group or the control group. Men who thought they were being observed by a female had worse scores on the second test than on the first. Men who thought they were being observed by a male had about the same scores on the first and second tests. (Women scored equally well on both tests no matter what the sex of their observer.) In this study, were there controls, and if so, were they randomized?
(f) The controls were the men who thought they were being observed by another guy. They were randomized.
A study published in the journal Mayo Clinic found that children who have more than one surgery with general anesthesia before their second birthday have a higher risk of developing ADHD than those who never had general anesthesia. The researchers examined the medical records of 341 children diagnosed with ADHD to find out who had undergone a surgical procedure with anesthesia before they were 2. Results: They found that 18 percent of children who had 2 or more surgeries with general anesthesia when they were babies eventually developed ADHD compared to only 7 percent of children who had no surgeries with general anesthesia as babies. Based only on the info below, state whether the following are confounders, causal links, or neither: Serious Medical Conditions- The babies who have serious medical conditions may be more likely to have surgeries which require general anesthesia and the serious medical conditions could put them at higher risk for ADHD.
Confounder
Example from Lecture Notes: One day, while scrolling through Facebook this summer, I came across an article entitled: "The Secret to a Long Marriage Is Drinking Together." As a newlywed, I quickly became interested and clicked on the article. It stated that couples who drink together, stay together. Over 2,000 couples were involved, and they found that couples who reported drinking alcohol together even just a few times each year were less likely to get on each others' nerves and more likely to have a positive outlook on their marriage. The study highlighted that what's most important isn't how much the couples drink, but whether they BOTH drink. If both partners drank, they were more likely to have a happier marriage than if just one of them drank. Based only on the info below, state whether the following are confounders, causal links, or neither: Shared Interests- Couples with shared interests are both more likely to drink together and to engage in other activities together that lead to happier marriages.
Confounder
Two studies were done comparing the effects of listening to classical music vs. pop music while studying. All students in both experimental designs were given identical 2-hour lessons and then allowed time to study for a short exam. In Design A, the students themselves chose whether or not they wanted to listen to classical or pop music while studying. In Design B, the students were randomly assigned to listen to classical or pop music while studying. Design A found that the classical music study group scored significantly higher on the exam than the pop music study group did. Design B found no significant difference in scores between the two groups. The overall exam average in both designs was the same. Which design had a control group?
Design A and B
A study was done to see whether regular exercise can help lower cholesterol. The subjects were 1000 adults with high cholesterol. Half the subjects were randomly assigned to treatment and half were randomly to control. In the treatment group, the subjects were offered the usual health care plus given full access to a gym and a personal trainer and encouraged to exercise on a treadmill for 30 minutes 4 times a week. The control group was offered just the usual health care (which had no gym access). All the subjects were followed for 1 years and their cholesterol levels were compared. Is this an observational study or a designed experiment?
Designed Experiment
What does beneficence mean?
Maximizing benefits and minimizing harm as much as possible
A study published in the journal Mayo Clinic found that children who have more than one surgery with general anesthesia before their second birthday have a higher risk of developing ADHD than those who never had general anesthesia. The researchers examined the medical records of 341 children diagnosed with ADHD to find out who had undergone a surgical procedure with anesthesia before they were 2. Results: They found that 18 percent of children who had 2 or more surgeries with general anesthesia when they were babies eventually developed ADHD compared to only 7 percent of children who had no surgeries with general anesthesia as babies. This study is an example of a...
Observational Study
Before the new semester begins, Harry received a letter from Hogwarts, which said the platform of the train station would temporarily change due to the road maintenance in King's Cross. The number of the new platform is enclosed in a question, only those who can successfully find the Magic Quantile using what they learned from Data Magic Course can get the ticket for the new semester. Harry's letter begins with a series of numbers: 4, 4, 6, 8, 10, 11, 11, 11, 15, 15, 16, 16 To find The Magic Quantile, given a sequence of numbers, the platform where we can get on the train to Hogwarts is the Q3 value of the data. (Make sure to calculate the best possible approximation of Q3 if it is not an exact number.)
Our data set has 12 data points, so the true location of is (#data points -1) * %-tile we want to find = 11 * 0.75 = 8.25. So we want to find the 8.25th number. Q3 = low + ((high-low) * fraction%) = 15+ (15-15) * 0.25 = 15.0.
Exam 1 scores of several students are shown below: 53, 72, 72, 72, 73, 74, 77, 78, 78, 82, 83, 91, 98 How many outliers exist in the data above?
Outliers are data points less than Q1 - 1.5 IQR or greater than Q3 + 1.5 IQR. First, we find that Q1 = 72 and Q3 = 82. (Q1 = 13 * 0.25, so 3.25th number in list) (Q3 = 13 * 0.75, so 9.75th number in list) Then, the IQR is 82 - 72 = 10. Then the outliers are any pointer greater than 97 or less than 57, so the answer is 2.
An academic transcript including only the courses you have taken and the grade you earned is an example of what kind of data?
Structured Data
The Washington Post recently reported on a study that links restricting screen time for kids to higher mental performance. The study assessed the behavior of 4,500 children, ages 8 to 11, by looking at their sleep schedules, how much time they spent on screens, their amount of exercise, and analyzed how those factors affected the children's mental abilities. They found that kids who spent less than 2 hours on screens had superior mental performance than those who spent more than 2 hours on screens. Which of the following conclusions are best?
The study does show an association but does not establish a definite causal link
A video of campus taken by a drone is an example of what kind of data?
Unstructured Data
On a command line, what command is used to change your current directory?
cd
Using a command line interface, what would you type to change your current directory to the Desktop directory?
cd Desktop
Write the Python code to find the percentage of Ds given in EVERY section of every course at Illinois.
df["Percent D"] = df["D"] / df["Count"]
Write the Python code to store all students in Data Science DISCOVERY who are currently freshman into the DataFrame df_answer.
df_answer = df[df["School Year"] == "Freshman"]
Write the Python code to store all students who were accepted into major D at UC-Berkeley.
df_applicants = df[(df["Admission"] == "Accepted") & (df["Major"] == "D")]
Write the Python code to store all students who were accepted into UC-Berkeley.
df_applicants = df[df["Admission"] == "Accepted"]
Write the Python code to store ALL graduate-level courses into df_courses. A graduate-level course is with a number including or between 400 and 699 (ex: 400, 401, ..., 699).
df_courses = df[(df["Number"] >= 400) & (df["Number"] <= 699)]
Write the Python code to store ALL lower-level courses into df_courses. A lower-level course is with a number including or between 100 and 299 (ex: 100, 101, ..., 299).
df_courses = df[(df["Number"] >=100) & (df["Number"] <=299)]
Write the Python code to store ALL courses that have a course subject of either ANTH or SPAN (ex: ANTH 100, SPAN 100, ANTH 101, SPAN 101, etc...).
df_courses = df[(df["Subject"] == "ANTH") | (df["Subject"] == "SPAN")]
Write the Python code to store ALL of the courses with the course number of exactly 107 (ex: CS 107, STAT 107, etc...) into the Python variable df_courses.
df_courses = df[df["Number"] == 107]
Write the Python code to store ALL courses with the course subject CHEM (ex: CHEM 100, CHEM 101, etc...) into the Python variable df_courses.
df_courses = df[df["Subject"] == "CHEM"]
Write the Python code to store all lower-level CS courses into df_courses. A lower-level CS course is a CS course with a course number including or between 100 and 299 (ex: CS 100, CS 101, ..., CS 299).
df_cs = df[df["Subject"] == "CS"] df_courses = df_cs[(df_cs["Number"] >= 100) & (df_cs["Number"] <= 299)]
Write the Python code to find the average number of siblings that Sophomore students in Data Science DISCOVERY have and store that number in average:
df_soph = df[df["School Year"] == "Sophomore"] average = df_soph["Siblings"].mean()
On a command line, what command is used to change to access files under version control?
git
Write the Python code to find the total number of C-s given in EVERY subject at Illinois.
group = df.groupby("Subject") df_grade_count = group.agg("sum").reset_index()
Write the Python code to find the total number of Fs given in EVERY subject at Illinois.
group = df.groupby("Subject") df_grade_count = group.agg("sum").reset_index()