BADM 211
Which of the following is appropriate syntax for a key-value pair in a dictionary?
"Location": "Urbana"
What will be the output of List1[-2] if List1 is defined as below? List1 = ['Hello', 3, 'BADM', 211, 'Professor', 3.14]
'Professor'
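A minimal sketch of negative indexing, using the list from the question. The names second_last and same_item are illustrative only.

```python
List1 = ['Hello', 3, 'BADM', 211, 'Professor', 3.14]

# Negative indices count from the end: -1 is the last item, -2 the one before it.
second_last = List1[-2]

# The same element reached with a non-negative index.
same_item = List1[len(List1) - 2]

print(second_last)
```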
What is the output of the following lines of code?
word = "BADM"
word = 211
print(word)
211
What is the output of the following lines of code?
val = 1
for i in range(3, 10):
    if i > 5:
        val += 1
    elif i != 4:
        val = val - 1
print(val)
3
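One way to check this answer is to record val after every pass through the loop. The sketch below assumes print(val) sits outside the loop, which is what a single output of 3 implies; the trace list is added here for illustration only.

```python
val = 1
trace = []  # (i, val) after each iteration, for illustration

for i in range(3, 10):      # i takes the values 3, 4, ..., 9
    if i > 5:
        val += 1            # runs for i = 6, 7, 8, 9
    elif i != 4:
        val = val - 1       # runs for i = 3 and i = 5
    trace.append((i, val))  # i = 4 matches neither branch, so val is unchanged

print(trace)
print(val)
```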
What would the following code output?
var = [1, 2, [3], [5], [4, 6, [7, 8]]]
print(len(var))
5
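A short sketch of why the answer is 5: len counts only the top-level elements, and each nested list counts as one element. The variable names other than var are illustrative.

```python
var = [1, 2, [3], [5], [4, 6, [7, 8]]]

top_level = len(var)          # 1, 2, [3], [5], [4, 6, [7, 8]]
last_item = len(var[4])       # 4, 6, [7, 8]
innermost = len(var[4][2])    # 7, 8

print(top_level, last_item, innermost)
```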
What will be the result of the following code? 2 * 3.0
6.0
In the below scatterplot, what is the relationship between the variables on the x-axis and y-axis?
A positive relationship
To filter a dataframe df such that a column, col1, has values only greater than 100, we run the following code: df[df['col1'] > 100] What will be the outcome if we only run the code within the outer set of square brackets, i.e., df['col1'] > 100?
A series of True and False will be printed.
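A minimal sketch of the two steps, assuming pandas is installed. The data values are made up for illustration.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"col1": [50, 150, 100, 250]})

# Inner expression: a boolean Series, one True/False per row.
mask = df["col1"] > 100

# Outer expression: keeps only the rows where the mask is True.
filtered = df[mask]

print(mask)
print(filtered)
```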
The dataframe df has 100 rows and 5 columns. After running df.duplicated().sum(), you find that the dataframe has 9 duplicate rows. A. How many rows will df have after running df.drop_duplicates()? B. How many rows will df have after running df.drop_duplicates(inplace=True)?
A. 100, the code will only return a copy of df with duplicates removed. It will not change the underlying dataframe. B. 91 (100 - 9)
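The difference between the two calls can be seen on a small example. This is a sketch with made-up data (5 rows, 2 of them duplicates), assuming pandas is installed.

```python
import pandas as pd

# Hypothetical data: rows 1 and 4 repeat rows 0 and 3.
df = pd.DataFrame({"a": [1, 1, 2, 3, 3], "b": ["x", "x", "y", "z", "z"]})

n_dupes = int(df.duplicated().sum())   # repeats, not counting first occurrences

deduped = df.drop_duplicates()         # returns a new dataframe
rows_after_copy = len(df)              # df itself is unchanged

df.drop_duplicates(inplace=True)       # modifies df directly
rows_after_inplace = len(df)

print(n_dupes, rows_after_copy, rows_after_inplace)
```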
Which of the following is not a valid variable type in Python?
DoubleString
How would you return the enrollment count in business from this dictionary? Univ = {'business': {'name':'gies', 'enrollment': 1500}, 'engineering': {'name': 'grainger', 'enrollment': 4500}}
Univ["business"]["enrollment"]
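A brief sketch of the lookup, done in two steps to show what each pair of brackets returns. The intermediate name business is illustrative.

```python
Univ = {
    'business': {'name': 'gies', 'enrollment': 1500},
    'engineering': {'name': 'grainger', 'enrollment': 4500},
}

# The first key returns the inner dictionary.
business = Univ["business"]

# The second key looks up a value inside that inner dictionary.
enrollment = Univ["business"]["enrollment"]

print(business)
print(enrollment)
```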
Which of the following pandas methods will report summary statistics of numeric columns?
describe()
Consider the dataframe df with categorical variable catvar and a numerical variable numvar. What is the correct code to find the mean of numvar for each value of catvar?
df.groupby("catvar")["numvar"].mean()
Consider the dataframe df with categorical variable catvar and two numerical variables numvar1 and numvar2. What is the correct code to find the mean of numvar1 and numvar2 for each value of catvar?
df.groupby("catvar")[["numvar1", "numvar2"]].mean()
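A small sketch of both groupby patterns, assuming pandas is installed. The data values are invented for illustration.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "catvar": ["a", "a", "b"],
    "numvar1": [1.0, 3.0, 10.0],
    "numvar2": [2.0, 4.0, 20.0],
})

# One column selected with single brackets: the result is a Series.
one_mean = df.groupby("catvar")["numvar1"].mean()

# Several columns selected with a list (double brackets): the result is a dataframe.
two_means = df.groupby("catvar")[["numvar1", "numvar2"]].mean()

print(one_mean)
print(two_means)
```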
Assuming dataframe df is properly defined, what line of code would output the first 10 rows of data?
df.head(10)
Which of the following code snippets will return the following subset? (Select all that apply.)
df.loc[:,["Col1", "Col3", "Col5"]]
df.iloc[:,0:3]
Which of the following code snippets will return the following subset? (Select all that apply.)
df.loc[[1, 3, 5],:]
df.loc[[1, 3, 5]]
df.iloc[0:3,:]
df.loc[1:5,:]
Which of the following code snippets will return the following subset? (Select all that apply.)
df.loc[[1,3,5],["Col1","Col3","Col5"]]
df.iloc[0:3,0:3]
df.loc[1:5,"Col1":"Col5"]
df.iloc[[0,1,2],[0,1,2]]
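The subset figures for the three questions above are not reproduced here, so the sketch below assumes a layout consistent with the listed answers: row labels 1, 3, 5, 7 and columns Col1, Col3, Col5, Col7, in that order. Under that assumption, label-based (loc) and position-based (iloc) selection return the same subset. Requires pandas.

```python
import pandas as pd

# Hypothetical dataframe: the layout is an assumption, not the original figure.
df = pd.DataFrame(
    [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]],
    index=[1, 3, 5, 7],
    columns=["Col1", "Col3", "Col5", "Col7"],
)

by_label_list = df.loc[[1, 3, 5], ["Col1", "Col3", "Col5"]]
by_position_slice = df.iloc[0:3, 0:3]         # end position 3 is excluded
by_label_slice = df.loc[1:5, "Col1":"Col5"]   # end labels 5 and "Col5" are included
by_position_list = df.iloc[[0, 1, 2], [0, 1, 2]]

print(by_label_list)
```

The key design difference is that loc slices include the end label, while iloc slices exclude the end position, as ordinary Python slices do.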
Suppose we have a data frame df, which includes a column animal. I would like to count the number of times each animal appears in the column, e.g. Elephant: 8, Zebra: 7, etc. Which of the following commands can I use to do that?
df["animal"].value_counts()
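A quick sketch of value_counts on a made-up animal column, assuming pandas is installed.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame(
    {"animal": ["Elephant", "Zebra", "Elephant", "Lion", "Zebra", "Elephant"]}
)

# Returns a Series indexed by the distinct values, sorted by count (largest first).
counts = df["animal"].value_counts()

print(counts)
```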
Which of these will help us select two columns, c1 and c2, from the dataframe df?
df[["c1", "c2"]]
Which code will filter the dataframe df for rows where the column "col1" has values that are NOT equal to 5?
df[df["col1"] != 5]
Consider the dataframe df with m numerical variables, and one categorical variable catvar which takes n possible values. How many variables will be in the dataframe created by the following code? pd.get_dummies(df, columns=['catvar'], drop_first = True)
m+n-1 variables
Consider the dataframe df with categorical variable catvar which takes n possible values. How many dummy variables will the following code create? pd.get_dummies(df, columns=['catvar'], drop_first = False)
n dummy variables
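The column counts in the two answers above can be checked on a small example. This sketch uses invented data with m = 2 numerical columns and a catvar with n = 3 distinct values; it assumes pandas is installed.

```python
import pandas as pd

# Hypothetical data: m = 2 numerical columns, n = 3 categories.
df = pd.DataFrame({
    "num1": [1.0, 2.0, 3.0, 4.0],
    "num2": [10, 20, 30, 40],
    "catvar": ["red", "green", "blue", "red"],
})

with_drop = pd.get_dummies(df, columns=["catvar"], drop_first=True)
without_drop = pd.get_dummies(df, columns=["catvar"], drop_first=False)

print(list(with_drop.columns))     # m + (n - 1) columns
print(list(without_drop.columns))  # m + n columns
```

With drop_first=True the first category level is dropped, which leaves n - 1 dummy columns next to the m numerical ones.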
What seaborn function is used to create a scatter plot?
sns.scatterplot()
What will be the output of this code?
student_gpa = {'Dan':3.9, 'Jason':4, 'Jasmine':4, 'Lori':3.85}
student_gpa["Dan"] = 3.8
student_gpa
{'Dan': 3.8, 'Jason': 4, 'Jasmine': 4, 'Lori': 3.85}
Which of the following measures cannot be inferred from a box plot? (Select all that apply.)
Count
Mean
The dataframe df has 100 rows and 5 columns. After running df.isnull().sum(), you find that the dataframe has 5 missing values in Column1 and 5 missing values in Column2. A. How many rows will df have after running df.dropna()? B. How many rows will df have after running df.dropna(inplace=True)? (Select all that apply.)
A. 100. The code only returns a copy of df with the rows containing missing values removed; it does not change the underlying dataframe. B. At least 90 rows (100 - 5 - 5, if the missing values are all in different rows) and at most 95 rows (100 - 5, if the missing values in Column1 are in the same rows as the missing values in Column2).
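The lower and upper bounds can be illustrated at a smaller scale. The sketch below uses two invented 6-row dataframes, each with 2 missing values per column: one where the missing values fall in different rows and one where they fall in the same rows. Requires pandas.

```python
import pandas as pd

nan = float("nan")

# Hypothetical data: missing values in different rows (rows 0-1 and rows 2-3).
separate = pd.DataFrame({
    "Column1": [nan, nan, 3.0, 4.0, 5.0, 6.0],
    "Column2": [1.0, 2.0, nan, nan, 5.0, 6.0],
})

# Hypothetical data: missing values in the same rows (rows 0-1 in both columns).
overlapping = pd.DataFrame({
    "Column1": [nan, nan, 3.0, 4.0, 5.0, 6.0],
    "Column2": [nan, nan, 3.0, 4.0, 5.0, 6.0],
})

rows_separate = len(separate.dropna())        # 6 - 2 - 2
rows_overlapping = len(overlapping.dropna())  # 6 - 2
rows_unchanged = len(separate)                # dropna() alone did not modify it

separate.dropna(inplace=True)                 # now the dataframe itself changes
rows_after_inplace = len(separate)

print(rows_separate, rows_overlapping, rows_unchanged, rows_after_inplace)
```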
Suppose you have a dataset of box office revenues for the top 1000 movies. The dataset includes the genre of each movie. Which chart could you use to show average revenues by genre? (Select all that apply.)
Bar chart
Which of the following plots can be created based on a single numeric variable? (Select all that apply.)
Box plot
Histogram
What purpose does a cross-tabulation (pd.crosstab) serve in data analysis?
Examine the relationship between two categorical variables.
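A minimal crosstab sketch, assuming pandas is installed. The genre and rating columns are hypothetical and chosen only to show two categorical variables being tabulated against each other.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "genre": ["Action", "Action", "Comedy", "Comedy", "Comedy"],
    "rating": ["PG", "R", "PG", "PG", "R"],
})

# Rows are genre values, columns are rating values, cells are counts.
table = pd.crosstab(df["genre"], df["rating"])

print(table)
```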
A scatterplot is a useful chart to visualize the distribution of a categorical variable.
False
Both "=" and "==" perform the same function in Python.
False
What does the groupby function in pandas do?
Group rows of a dataframe based on one or more columns.
Which option will help us select elements "a" and "b" in the list L = ["a", "b", "c"]? (Select all that apply.)
L[0:2]
L[-3:-1]
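Both slices can be checked directly. In this sketch the names from_start and from_end are illustrative.

```python
L = ["a", "b", "c"]

# Slices include the start index and exclude the end index.
from_start = L[0:2]    # positions 0 and 1
from_end = L[-3:-1]    # -3 is "a", -1 is "c" (excluded)

print(from_start, from_end)
```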
The following function is meant to calculate the area of a triangle. However, when I try to run it, I get an error. Why? (Select all that apply.)
def area_function{base=3, height=1};
area = 0.5 * base * height
return(area)
Lines 2 and 3 are not indented. Line 1 uses "{}" instead of "()". Line 1 ends with a semicolon instead of a colon.
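For comparison, here is one possible corrected version of the function with all three problems fixed: parentheses around the parameters, a colon at the end of the def line, and an indented body.

```python
def area_function(base=3, height=1):
    """Return the area of a triangle with the given base and height."""
    area = 0.5 * base * height
    return area


default_area = area_function()                 # uses base=3, height=1
custom_area = area_function(base=4, height=5)

print(default_area, custom_area)
```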
The following bar chart shows average rainfall by month. Based on the figure, which of the following is true?
None of the other options.
What is the value of the interquartile range (IQR)?
Q3 - Q1
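A short sketch of computing the IQR with pandas, using made-up values. Note that quantile uses linear interpolation by default, so other tools or textbook methods may give slightly different quartiles on some datasets.

```python
import pandas as pd

# Hypothetical data for illustration.
s = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9])

q1 = s.quantile(0.25)   # first quartile
q3 = s.quantile(0.75)   # third quartile
iqr = q3 - q1

print(q1, q3, iqr)
```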
Which of the following statements is TRUE? (Select all that apply.)
Seaborn is built on top of Matplotlib. Each column in a pandas dataframe is a Series.
The following boxplot shows the distribution of tip amounts by day of the week. Based on the figure, which of the following is true? (Select all that apply.)
The median tip on Sat is less than that on Sun. The first quartile for tip is similar across all four days. Friday and Sunday have no outliers.
What is the purpose of using the parameter hue in a Seaborn scatterplot?
To color-code observations by the value of a categorical variable
A list can hold multiple types of data.
True
