Week 3 quiz

You are creating a dashboard in Tableau to share with stakeholders. Why might you decide to pre-filter the dashboard? Select all that apply.

Pre-filtering is useful because it saves time and effort while directing stakeholders to the most important data.

Fill in the blank: The bias function compares the actual outcome of the data with the _____ outcome to determine whether or not the model is biased.

The bias function compares the actual outcome of the data with the predicted outcome to determine whether or not the model is biased.

A data analyst is cleaning their data in R. They want to be sure that their column names are unique and consistent to avoid any errors in their analysis. What R function can they use to do this automatically?

The clean_names() function will automatically make sure that column names are unique and consistent.
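
As a rough sketch of that answer, assuming the janitor package (which provides clean_names()) is installed: the data frame survey and its column names below are made up for illustration.

```r
library(janitor)

# Hypothetical data frame with inconsistent and duplicated column names.
# check.names = FALSE stops data.frame() from repairing the names itself.
survey <- data.frame(
  "First Name" = c("Ana", "Ben"),
  "LAST.NAME"  = c("Cruz", "Diaz"),
  "Age"        = c(31, 45),
  "Age"        = c(32, 46),
  check.names  = FALSE
)

# clean_names() rewrites the names into one consistent style
# and makes repeated names unique.
cleaned <- clean_names(survey)
names(cleaned)
```

Comparing names(survey) with names(cleaned) shows what changed before any analysis depends on those names.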

A data analyst is working with the penguins data. The variable species includes three penguin species: Adelie, Chinstrap, and Gentoo. The analyst wants to create a data frame that only includes the Adelie species. The analyst receives an error message when they run the following code:

The code chunk is filter(species == "Adelie"). The filter() function is used to specify which part of the data to view. Two equal signs in an argument mean "exactly equal to." Using this operator instead of the assignment operator <- returns only the rows for Adelie penguins.
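
A minimal sketch of the corrected filter, assuming dplyr is installed. The data frame penguins_sample is a small made-up stand-in for the penguins data, not the real dataset.

```r
library(dplyr)

# Made-up stand-in for the penguins data (values are illustrative only).
penguins_sample <- data.frame(
  species        = c("Adelie", "Chinstrap", "Gentoo", "Adelie"),
  bill_length_mm = c(39.1, 46.5, 50.0, 38.2)
)

# == tests each row for equality; only rows where the test is TRUE are kept.
adelie_only <- penguins_sample %>%
  filter(species == "Adelie")

adelie_only
```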

A data analyst is working with a data frame called salary_data. They want to create a new column named hourly_salary that includes data from the wages column divided by 40. What code chunk lets the analyst create the hourly_salary column?

The code chunk is mutate(salary_data, hourly_salary = wages / 40). The analyst can use the mutate() function to create a new column for wages divided by 40 called hourly_salary. The mutate() function can create a new column without affecting any existing columns.
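
The same call can be tried on a small example, assuming dplyr is installed. The wage values below are invented for illustration.

```r
library(dplyr)

# Hypothetical weekly wages for three employees.
salary_data <- data.frame(
  employee = c("A", "B", "C"),
  wages    = c(800, 1000, 1200)
)

# mutate() adds hourly_salary and leaves the existing columns untouched.
salary_data <- mutate(salary_data, hourly_salary = wages / 40)

salary_data
```

Note that mutate() returns a new data frame, so the result has to be assigned back to a name to be kept.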

A data analyst is working with a data frame named cars. The analyst notices that all the column names in the data frame are capitalized. What code chunk lets the analyst change all the column names to lowercase?

The code chunk is rename_with(cars, tolower). The rename_with() function will enable the analyst to easily change the case of the column names to lowercase. Including the tolower argument indicates that all column names will be changed to lowercase.
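
A short sketch of that call, assuming dplyr is installed. The columns MAKE, MODEL, and YEAR are hypothetical, and the data frame defined here shadows R's built-in cars dataset for the rest of the session.

```r
library(dplyr)

# Hypothetical data frame with capitalized column names.
# (This masks the built-in `cars` dataset in the current session.)
cars <- data.frame(
  MAKE  = c("Toyota", "Ford"),
  MODEL = c("Corolla", "Focus"),
  YEAR  = c(2019, 2017)
)

# rename_with() applies the function tolower to every column name.
cars_lower <- rename_with(cars, tolower)

names(cars_lower)
```

Passing toupper instead of tolower would convert the names to upper case in the same way.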

A data analyst is working with a data frame named retail. It has separate columns for dollars (price_dollars) and cents (price_cents). The analyst wants to combine the two columns into a single column named price, with the dollars and cents separated by a decimal point. For example, if the value in the price_dollars column is 10, and the value in the price_cents column is 50, the value in the price column will be 10.50. What code chunk lets the analyst create the price column?

The code chunk unite(retail, "price", price_dollars, price_cents, sep=".") lets the analyst create the price column. The unite() function lets the analyst combine the dollars and cents data into a single column. In the parentheses of the function, the analyst writes the name of the data frame, then the name of the new column in quotation marks, followed by the names of the two columns they want to combine. Finally, the argument sep="." places a decimal point between the dollars and cents data in the price column.
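
A minimal sketch of the unite() call described above, assuming the tidyr package is installed. The item names and prices are made up.

```r
library(tidyr)

# Hypothetical retail data with dollars and cents stored separately.
retail <- data.frame(
  item          = c("notebook", "pen"),
  price_dollars = c(10, 3),
  price_cents   = c(50, 99)
)

# unite() pastes the two columns together with "." between them.
# By default the two input columns are dropped from the result.
retail_price <- unite(retail, "price", price_dollars, price_cents, sep = ".")

retail_price
```

The new price column holds text rather than numbers, so it would need as.numeric() before any arithmetic.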

A data analyst is working with a large data frame. It contains so many columns that they don't all fit on the screen at once. The analyst wants a quick list of all of the column names to get a better idea of what is in their data. What function should they use?

The colnames() function returns the names of all the columns in a data frame for easy reference.

A data analyst inputs the following command: quartet %>% group_by(set) %>% summarize(mean(x), sd(x), mean(y), sd(y), cor(x, y)). Which of the functions in this command can help them determine how strongly related their variables are?

The cor() function returns the correlation between two variables. This determines how strong the relationship between those two variables is.
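
To see cor() at work inside a grouped summary, here is a rough sketch assuming dplyr is installed. The data frame quartet_sample is a made-up stand-in for quartet, built so that one set rises in a straight line and the other falls in a straight line. The output column names (mean_x, cor_xy, and so on) are also added here for readability.

```r
library(dplyr)

# Made-up stand-in for the quartet data: two sets of x/y pairs.
# Set A follows y = 2x + 1, set B follows y = 10 - 2x.
quartet_sample <- data.frame(
  set = rep(c("A", "B"), each = 4),
  x   = c(1, 2, 3, 4, 1, 2, 3, 4),
  y   = c(3, 5, 7, 9, 8, 6, 4, 2)
)

# One row of summary statistics per set.
stats <- quartet_sample %>%
  group_by(set) %>%
  summarize(
    mean_x = mean(x), sd_x = sd(x),
    mean_y = mean(y), sd_y = sd(y),
    cor_xy = cor(x, y)
  )

stats
```

A correlation near 1 or -1 signals a strong linear relationship, while a value near 0 signals a weak one.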

Fill in the blank: A data analyst is creating the title slide in a presentation. The data they are sharing is likely to change over time, so they include the _____ on the title slide. This adds important context.

The data analyst includes the date of the presentation on the title slide. Specifying the date lets people know when the data was last updated.

Which summary functions can you use to preview data frames in R? Select all that apply.

The head(), glimpse(), and str() summary functions allow you to preview data frames in R. The head() function returns the columns and the first several rows of data. The mutate() function lets you change the data frame, not preview it. Going forward, you can use summary functions to inspect the data frames you create in your career as a data analyst.
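
The preview functions can be compared side by side on a small example. This sketch assumes dplyr is installed (for glimpse()). The data frame scores is invented for illustration.

```r
library(dplyr)

# Hypothetical data frame with ten rows.
scores <- data.frame(
  id    = 1:10,
  score = seq(55, 100, by = 5)
)

first_rows <- head(scores)  # the columns plus the first six rows by default
first_rows

str(scores)       # compact structure: column types and the first few values
glimpse(scores)   # similar to str(), with one column per line
colnames(scores)  # only the column names
```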

A data analyst writes the following code chunk to return a statistical summary of their dataset: quartet %>% group_by(set) %>% summarize(mean(x), sd(x), mean(y), sd(y), cor(x, y)). Which function will return the average value of the y column?

The mean() function will return the average value of a specific variable. In this case, mean(y) will return the average value of y.

Which R function can be used to make changes to a data frame?

The mutate() function can be used to make changes to a data frame.

The rename_with() function can be used to reformat column names to be upper or lower case.

Which of the following functions can a data analyst use to get a statistical summary of their dataset? Select all that apply.

The sd(), cor(), and mean() functions can provide a statistical summary of the dataset using standard deviation, correlation, and mean.

Which of the following functions returns a summary of the data frame, including the number of columns and rows? Select all that apply.

The skim_without_charts() and glimpse() functions both return a summary of the data frame, including the number of columns and rows.

A data analyst is working with customer information from their company's sales data. The first and last names are in separate columns, but they want to create one column with both names instead. Which of the following functions can they use?

The unite() function can be used to combine columns.

Why are tibbles a useful variation of data frames?

Tibbles make printing easier. They also help you avoid overloading your console when working with large datasets. By default, a tibble prints only the first ten rows of a dataset and as many columns as fit on the screen.

A data analyst is working with a dataset in R that has more than 50,000 observations. Why might they choose to use a tibble instead of the standard data frame? Select all that apply.

Tibbles make printing in R easier. They won't accidentally overload the data analyst's console because they're automatically set to pull up only the first 10 rows and as many columns as fit on screen.
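
A small sketch of the difference, assuming the tibble package is installed. The data frame observations is generated here for illustration only.

```r
library(tibble)

# Hypothetical data frame with 50,000 observations.
observations <- data.frame(
  id    = 1:50000,
  value = (1:50000) * 2
)

# Converting to a tibble keeps the data but changes how it prints.
observations_tbl <- as_tibble(observations)

# Printing shows the first 10 rows plus a note about the remaining rows,
# instead of filling the console.
print(observations_tbl)
```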

Tidy data is a way of standardizing the organization of data within R.

Tidy data refers to the principles that make data structures meaningful and easy to understand. It's a way of standardizing the organization of data within R.

You are building a dashboard in Tableau. To create a single-layer grid that contains no overlapping elements, which layout should you choose?

To create a single-layer grid that contains no overlapping elements, you should choose a tiled layout. Tiled items are part of a single-layer grid that automatically resizes based on the overall dashboard size.

A data analyst wants to include a visual in their slideshow, then make some changes to it. Which of the following options will enable the analyst to edit the visual within the presentation without affecting its original file? Select all that apply.

To edit the visual without affecting its original file, the analyst should use copy and paste or embed the visual into the presentation.

Which of the following questions do data analysts ask to make sure they will engage their audience? Select all that apply.

To engage their audience, data analysts ask about what roles the people in the audience play, their stake in the project, and what they hope to do with the data insights.

Which of the following are best practices for creating data frames? Select all that apply.

When creating data frames, columns should be named and each column should contain the same number of data items.
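
Both practices can be illustrated in base R. In this sketch the names and ages are invented, and the second call is wrapped in tryCatch() only to show that mismatched column lengths are rejected.

```r
# Named columns, each holding the same number of items.
staff <- data.frame(
  name = c("Ana", "Ben", "Cy"),
  age  = c(31, 45, 28)
)

# Three names but only two ages: data.frame() stops with an error,
# which tryCatch() turns into a simple flag here.
mismatch <- tryCatch(
  data.frame(name = c("Ana", "Ben", "Cy"), age = c(31, 45)),
  error = function(e) "error"
)

staff
mismatch
```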

When designing a dashboard, how can data analysts ensure that charts and graphs are most effective? Select all that apply.

When designing a dashboard, data analysts can ensure that charts and graphs are most effective by placing them in a balanced layout and making good use of available space.

You are working with the penguins dataset. You want to use the arrange() function to sort the data for the column bill_length_mm in ascending order. You have already written the first line of code, penguins %>%, and need to add a code chunk to it. What code chunk do you add to sort the column bill_length_mm in ascending order?

You add the code chunk arrange(bill_length_mm) to sort the column bill_length_mm in ascending order. The correct code is penguins %>% arrange(bill_length_mm). Inside the parentheses of the arrange() function is the name of the variable you want to sort by. The code returns a tibble that displays the data for bill_length_mm from shortest to longest. The shortest bill length is 32.1 mm.
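
A minimal sketch of the same sort, assuming dplyr is installed. It uses a small made-up data frame rather than the real penguins data, so the values shown are not the ones from the quiz.

```r
library(dplyr)

# Made-up stand-in for the penguins data (values are illustrative only).
penguins_sample <- data.frame(
  species        = c("Gentoo", "Adelie", "Chinstrap", "Adelie"),
  bill_length_mm = c(50.0, 39.1, 46.5, 38.2)
)

# arrange() sorts rows by the named column, ascending by default.
sorted <- penguins_sample %>%
  arrange(bill_length_mm)

sorted
```

Wrapping the column in desc(), as in arrange(desc(bill_length_mm)), would sort from longest to shortest instead.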

A data analyst uses the bias() function to compare the actual outcome with the predicted outcome to determine if the model is biased. They get a score of 0.8. What does this mean?

A score of 0.8 indicates that the model is biased. The closer the score is to zero, the less likely it is that the model is biased.
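
The source does not say how the bias score is calculated. One common reading, assumed here, is the average difference between actual and predicted values. The helper mean_difference() below is a hypothetical base R stand-in written for illustration, not the bias() function itself, and the numbers are invented.

```r
# Hypothetical helper: average of (actual - predicted).
# This is an assumed definition, not the packaged bias() function.
mean_difference <- function(actual, predicted) {
  mean(actual - predicted)
}

actual_sales <- c(10, 12, 15, 11)

# Predictions that run low on average give a score away from zero.
low_predictions <- c(9, 12, 13, 11)
score_low <- mean_difference(actual_sales, low_predictions)

# Predictions whose errors cancel out give a score of zero.
balanced_predictions <- c(11, 11, 16, 10)
score_balanced <- mean_difference(actual_sales, balanced_predictions)

score_low
score_balanced
```

Under this assumed definition, a positive score means the predictions tend to be too low, and a score near zero means the errors roughly cancel out.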

Which of the following are appropriate uses for filters in Tableau? Select all that apply.

Appropriate uses for filters in Tableau include highlighting individual data points, limiting the number of rows or columns in view, and providing data to different users based on their particular needs.

