Data Science DISCOVERY CBTF exam study
What does beneficence mean?
Maximizing benefits and minimizing harm as much as possible
What statement best describes the following line of Python? df = pd.read_csv("hello.csv")
Reads the file hello.csv into the DataFrame named df.
Write the Python code to find the average number of hours studying each week that Freshman students in Data Science DISCOVERY have and store that number in average:
freshies = df[df["School Year"] == "Freshman"]
average = freshies["Hours Studying"].mean()
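A minimal sketch of the filter-then-aggregate pattern from the card above. The sample rows are made up for illustration; only the column names come from the card.

```python
import pandas as pd

# Hypothetical survey rows; column names follow the card above.
df = pd.DataFrame({
    "School Year": ["Freshman", "Sophomore", "Freshman", "Junior"],
    "Hours Studying": [10, 12, 14, 8],
})

# Keep only the Freshman rows, then average a single column.
freshies = df[df["School Year"] == "Freshman"]
average = freshies["Hours Studying"].mean()
print(average)
```

The boolean mask `df["School Year"] == "Freshman"` selects rows; `.mean()` then runs on the one column that was picked out.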
On a command line, what command is used to access files under version control?
git
Write the Python code to store all senior-level IS courses into df_courses. A senior-level IS course is an IS course with a course number between 400 and 499, inclusive (ex: IS 400, IS 401, ..., IS 499).
hello = df[df["Subject"] == "IS"]
df_courses = hello[(hello["Number"] >= 400) & (hello["Number"] <= 499)]
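The inclusive range check can also be written with `Series.between`, which includes both endpoints by default. This is a sketch on made-up rows, not the course dataset; it shows that both forms select the same courses.

```python
import pandas as pd

# Hypothetical course rows; column names follow the card above.
df = pd.DataFrame({
    "Subject": ["IS", "IS", "IS", "MUS"],
    "Number": [399, 400, 499, 450],
})

hello = df[df["Subject"] == "IS"]

# Explicit comparison, as on the card. Each condition needs its own parentheses.
explicit = hello[(hello["Number"] >= 400) & (hello["Number"] <= 499)]

# Equivalent form: between() is inclusive on both ends by default.
df_courses = hello[hello["Number"].between(400, 499)]
print(df_courses)
```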
How do you find a low outlier and a high outlier?
Low outlier: any value below Q1 - 1.5 × IQR. High outlier: any value above Q3 + 1.5 × IQR, where IQR = Q3 - Q1.
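The outlier rule above can be sketched in pandas as follows. The numbers are invented, and the quartiles come from pandas' default linear interpolation, so a hand calculation that uses a different quartile convention may give slightly different fences.

```python
import pandas as pd

# Hypothetical data with one obviously large value.
values = pd.Series([2, 4, 4, 5, 6, 7, 8, 30])

q1 = values.quantile(0.25)
q3 = values.quantile(0.75)
iqr = q3 - q1  # interquartile range

# Anything outside these two fences counts as an outlier.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr

outliers = values[(values < low_fence) | (values > high_fence)]
print(outliers.tolist())
```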
Write the Python code to calculate the total number of applications for each Major and store it in the column Applicants.
major = df.groupby("Major").agg("count").reset_index()
major["Applicants"] = major["Admission"]
df_applicants = major
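A small sketch of the group-and-count step on made-up admissions rows (the column names follow the cards; the data is hypothetical). Note that `"count"` counts non-missing values in each column, so copying the `Admission` count assumes that column has no missing entries.

```python
import pandas as pd

# Hypothetical applications; column names follow the card above.
df = pd.DataFrame({
    "Major": ["A", "A", "B"],
    "Gender": ["M", "F", "M"],
    "Admission": ["Accepted", "Rejected", "Accepted"],
})

# One row per Major; every other column now holds a count of non-missing values.
major = df.groupby("Major").agg("count").reset_index()
major["Applicants"] = major["Admission"]
df_applicants = major
print(df_applicants[["Major", "Applicants"]])
```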
Write the Python code to store all men who were accepted into major A at UC-Berkeley.
males = df[df["Gender"] == "M"]
accepted = males[males["Admission"] == "Accepted"]
major_a = accepted[accepted["Major"] == "A"]
df_applicants = major_a
Write the Python code to store all junior-level MUS courses into df_courses. A junior-level MUS course is a MUS course with a course number between 300 and 399, inclusive (ex: MUS 300, MUS 301, ..., MUS 399).
musica = df[df["Subject"] == "MUS"]
df_courses = musica[(musica["Number"] >= 300) & (musica["Number"] <= 399)]
Write the Python code to find the 6 subjects that give the most C+s at Illinois.
subs = df.groupby("Subject").agg("sum").reset_index()
df_grade_count = subs.nlargest(6, "C+")
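A sketch of the sum-then-rank pattern on a tiny made-up grade table. It uses `nlargest(2, ...)` instead of 6 only because the sample has four subjects; the grade columns and values are hypothetical.

```python
import pandas as pd

# Hypothetical per-section grade counts; only "Subject" and "C+" mirror the card.
df = pd.DataFrame({
    "Subject": ["STAT", "CS", "IS", "STAT", "MUS", "IS"],
    "A": [20, 30, 15, 10, 25, 12],
    "C+": [5, 10, 1, 3, 4, 2],
})

# Add up every section within each subject.
subs = df.groupby("Subject").agg("sum").reset_index()

# Keep the subjects with the largest C+ totals, largest first.
df_grade_count = subs.nlargest(2, "C+")
print(df_grade_count[["Subject", "C+"]])
```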
Which are the ethical principles upon which the "Common Rule", and therefore guidelines for human subject research, are based?
(a) Beneficence (b) Justice (c) Respect for persons
Check ALL (zero or more) of the correct/true statements about a research study:
(b) You must state in informed consent that participation is voluntary (c) You must minimize the risks and maximize the benefits for your participants (d) You do not need consent if you prove that participants will face no more than minimal risk
Who does the General Data Protection Regulation (GDPR) protect?
Citizens of the European Union (EU)
Check ALL (zero or more) of the following that are considered Personally Identifiable Information (PII):
Data that allows you to contact someone; data that allows you to identify someone; data that allows you to locate someone
Write the Python code to calculate the total number of ACCEPTED applications for each Major and store it in the column Accepted
accepted = df[df["Admission"] == "Accepted"]
major_accepted = accepted.groupby("Major").agg("count").reset_index()
major_accepted["Accepted"] = major_accepted["Admission"]
df_applicants = major_accepted
Write the Python code to store all students who were accepted into major B at UC-Berkeley.
accepted = df[df["Admission"] == "Accepted"]
major_b = accepted[accepted["Major"] == "B"]
df_applicants = major_b
How do you find the quartiles of a set of numbers in Python?
df = pd.DataFrame({"a":[#,#,#]})
df.a.quantile([0.25, 0.50, 0.75])
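A runnable sketch of the quartile lookup with concrete numbers filled in (the values are made up). `quantile` takes either a single probability or a list of probabilities; passing several separate positional numbers does not work, because the second positional argument is the interpolation method.

```python
import pandas as pd

# Hypothetical numbers standing in for the placeholders on the card.
df = pd.DataFrame({"a": [1, 2, 3, 4, 5]})

# Pass the three probabilities as one list to get Q1, the median, and Q3 together.
quartiles = df["a"].quantile([0.25, 0.50, 0.75])
print(quartiles)
```

The result is a Series indexed by the requested probabilities, so `quartiles[0.25]` is Q1.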
Write the Python code to find the percentage of Ds given in EVERY section of every course at Illinois.
df["Percent D"] = df["D"]/df["Count"]
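A sketch of the row-wise calculation on made-up sections, assuming `Count` holds the total number of grades in each section (an assumption; the card does not define it). Dividing gives a fraction between 0 and 1; whether to multiply by 100 depends on how the question expects the percentage to be expressed, so the second column below is shown only for comparison and uses a hypothetical name.

```python
import pandas as pd

# Hypothetical sections; "Count" is assumed to be the total grades in the section.
df = pd.DataFrame({
    "D": [1, 0, 5],
    "Count": [10, 20, 50],
})

# Column arithmetic is applied row by row, so every section gets its own value.
df["Percent D"] = df["D"] / df["Count"]

# Same quantity on a 0-100 scale (hypothetical column name).
df["Percent D (0-100)"] = 100 * df["D"] / df["Count"]
print(df)
```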
Write the Python code to store all students who were accepted into UC-Berkeley.
df_applicants = df[df["Admission"] == "Accepted"]
Write the Python code to find the standard deviation in the number of hours of sleep that Junior students in Data Science DISCOVERY have and store that number in standard_deviation:
junior = df[df["School Year"] == "Junior"]
standard_deviation = junior["Hours of Sleep"].std()
The return addresses of all the mail you received last month are an example of what kind of data?
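A sketch of the standard deviation step on made-up rows. One detail worth knowing: pandas' `.std()` computes the sample standard deviation (it divides by n - 1) by default; passing `ddof=0` gives the population version instead.

```python
import pandas as pd

# Hypothetical survey rows; column names follow the card above.
df = pd.DataFrame({
    "School Year": ["Junior", "Junior", "Senior"],
    "Hours of Sleep": [6, 8, 7],
})

junior = df[df["School Year"] == "Junior"]

# Default: sample standard deviation (ddof=1).
standard_deviation = junior["Hours of Sleep"].std()

# Population standard deviation, shown only for comparison.
population_sd = junior["Hours of Sleep"].std(ddof=0)
print(standard_deviation, population_sd)
```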
structured data