Google Data Analyst
The share phase of the data analysis process typically involves which of the following activities?
- Creating a slideshow to present to stakeholders - Communicating findings - Summarizing results using data visualizations
the share phase involves
- creating data visualizations - preparing your presentation - communicating your findings to stakeholders
spreadsheets are digital worksheets that enable data analysts to do which of the following tasks?
- organize data in columns and rows - store data - sort and filter data
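You are working with the ToothGrowth dataset. You want to use the skim_without_charts() function to get a comprehensive view of the dataset. Write the code chunk that will give you this view. How many rows does the dataset have?
skim_without_charts(ToothGrowth) shows that the dataset has 60 rows.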
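A minimal sketch of the ToothGrowth card, for reference. ToothGrowth ships with base R; skim_without_charts() is assumed here to come from the skimr package, so the sketch only calls it when skimr is installed and otherwise relies on base R for the row count.

```r
# ToothGrowth is a built-in dataset (datasets package), so no download is needed.
n_rows <- nrow(ToothGrowth)
n_cols <- ncol(ToothGrowth)
print(c(rows = n_rows, cols = n_cols))

# Assumption: skim_without_charts() is provided by skimr.
# Guarded so the sketch still runs when skimr is not installed.
if (requireNamespace("skimr", quietly = TRUE)) {
  print(skimr::skim_without_charts(ToothGrowth))
}
```

The row count reported by the summary should match nrow(ToothGrowth).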
In data analytics, what term describes a collection of elements that interact with one another?
A data ecosystem
Fill in the blank: In SQL databases, the _____ function can be used to convert data from one datatype to another.
CAST
A data analyst prepares to communicate to an audience about an analysis project. They consider what the audience members hope to do with the data insights. This describes establishing the setting.
False
You are in the ideate phase of the design process. What are you doing at this stage?
Generating visualization ideas
Fill in the blank: _____ code is code that can be inserted directly into a .rmd file.
Inline
Which of the following is a benefit of internal data?
Internal data is more reliable and easier to collect.
According to the McCandless Method, what is the most effective way to first present a data visualization to an audience?
Introduce the graphic by name
You are working with the World Happiness data in Tableau. What tool do you use to select the area on the map representing Central America?
Lasso
You want to include a visual in your slideshow that will update automatically when its original source file updates. Which of the following actions will enable you to do so?
Link the original visual within the presentation
A data analyst uses the Color tool in Tableau to apply a color scheme to a data visualization. In order to make the visualization accessible for people with color vision deficiencies, what should they do next?
Make sure the color scheme has contrast
What does the alpha aesthetic do to the appearance of the points on the plot?
Makes some points on the plot more transparent
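A small sketch of the alpha aesthetic, assuming ggplot2 is installed. The data frame points_df and its columns are hypothetical; mapping alpha to a numeric column should make points with lower values more transparent than the rest.

```r
library(ggplot2)

# Hypothetical data: 'weight' drives how transparent each point is.
points_df <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2, 4, 3, 5, 4),
  weight = c(0.1, 0.3, 0.5, 0.8, 1.0)
)

# alpha is mapped inside aes(), so transparency varies from point to point.
p <- ggplot(points_df, aes(x = x, y = y, alpha = weight)) +
  geom_point(size = 4)
```

Calling print(p) draws the plot. Setting alpha outside aes(), for example geom_point(alpha = 0.3), would instead apply one transparency level to every point.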
A company wants to make more informed decisions regarding next year's business strategy. An analyst uses data to help identify how things will likely work out in the future. This is an example of which problem type?
Making predictions
When programming in R, what is a pipe used as an alternative for?
Nested function
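To make the pipe card concrete, here is a sketch comparing a nested call with its piped equivalent. It uses the native pipe |> (assumes R 4.1 or later) to avoid extra packages; the %>% pipe seen elsewhere in these notes reads the same way for this case. The vector x is a made-up example.

```r
# Hypothetical input vector, chosen so the arithmetic is easy to follow.
x <- c(4, 9, 16)

# Nested form: read from the inside out (sqrt, then mean, then round).
nested <- round(mean(sqrt(x)), 1)

# Piped form: the same steps, read left to right.
piped <- x |> sqrt() |> mean() |> round(1)

print(identical(nested, piped))
```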
Fill in the blank: _____ code is freely available and may be modified and shared by the people who use it.
Open-source
A data analyst uses words and symbols to give instructions to a computer. What are the words and symbols known as?
Programming language
A data analyst is using the Pan tool in Tableau. What are they doing?
Rotating the perspective while keeping a certain object in view
To get the most out of data-driven decision-making, it's important to include insights from people very familiar with the business problem. Identify what these people are called.
Subject-matter experts
For a function to work properly, data analysts must follow each function's predetermined structure. What is this structure called?
Syntax
Which of the following functions automatically remove extra spaces when cleaning data?
TRIM
Tableau is used to create interactive and dynamic visualizations. A visualization is interactive when the audience can control what data they see. What does it mean for a visualization to be dynamic?
The visualization can change over time
A data analyst uses the Color tool in Tableau to apply a color scheme to a data visualization. Why do they make sure the color scheme has contrast?
To make the visualization accessible for people with color vision deficiencies
You are performing a calculation during your analysis of a dataset. Which phase of analysis are you in?
Transform data
Data design is how you organize information; data strategy is the management of the people, processes, and tools used in data analysis.
True
Sophisticated use of contrast helps separate the most important data from the rest using the visual context that our brains naturally respond to.
True
The V in VLOOKUP stands for what?
Vertical
dataset
a collection of data that can be manipulated or analyzed as one unit
Data
a collection of facts
data science
a field of study that uses raw data to create new ways of modeling and understanding the unknown
a characteristic or quality of data used to label a column in a table
attribute
In a dataset, a row is called an observation. An observation includes all of the _____ for something contained in the row.
attributes
What do data analysts use to label the type of data contained in each column in a spreadsheet?
attributes
Fill in the blank: Code added to an .rmd file is usually referred to as a code _____. This allows users to execute R code from within the .rmd file.
chunk
Assume the name of your data frame is flavors_df. What code chunk lets you review the column names in the data frame?
colnames(flavors_df)
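A quick sketch of what colnames() returns. The contents of flavors_df below are invented for illustration; only the data frame name comes from the card.

```r
# Hypothetical stand-in for flavors_df with three columns.
flavors_df <- data.frame(
  Company = c("A", "B"),
  Cocoa.Percent = c(70, 75),
  Rating = c(3.5, 4.0),
  stringsAsFactors = FALSE
)

# colnames() returns the column names as a character vector.
cols <- colnames(flavors_df)
print(cols)
```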
Fill in the blank: If a data analyst is using data that has been _____, the data will lack integrity and the analysis will be faulty.
compromised
Fill in the blank: A data analyst uses the CASE statement to consider one or more _____, then returns a value.
conditions
the condition in which something exists or happens
context
What code chunk do you add to the third line to create wrap around facets of the variable Rating?
facet_wrap(~Rating)
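The card refers to a plot whose first two lines are not shown, so the sketch below is only one plausible shape for it, assuming ggplot2 is installed. The data frame and the Cocoa.Percent column are hypothetical; only facet_wrap(~Rating) comes from the card.

```r
library(ggplot2)

# Hypothetical stand-in data; only the Rating column is named on the card.
flavors_df <- data.frame(
  Cocoa.Percent = c(70, 70, 75, 75, 80, 80),
  Rating = c(3.5, 4.0, 3.5, 4.0, 3.5, 4.0)
)

p <- ggplot(data = flavors_df) +
  geom_bar(mapping = aes(x = Cocoa.Percent)) +
  facet_wrap(~Rating)  # third line: one panel per distinct Rating value
```

Calling print(p) draws one bar chart panel for each Rating value.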
a quality of data analysis that does not create or reinforce bias
fairness
A magazine wants to understand why its subscribers have been increasing. A data analyst could help answer that question with a report that predicts the result of a half-price sale on future subscription rates.
false
Assume the name of your data frame is flavors_df. What code chunk lets you get a glimpse of the contents of the data frame?
glimpse(flavors_df)
A data visualization is the _____ representation of information.
graphical
In which step of the data analysis process would an analyst ask questions such as, "What data errors might get in the way of my analysis?" or "How can I clean my data so the information I have is consistent?"
process
Fill in the blank: The data life cycle has six stages, whereas data analysis has six _____.
process steps
If a data analyst is measuring qualities and characteristics, they are considering _____ data.
qualitative
a computer programming language used to communicate with a database
query language
Fairness is achieved when data analysis doesn't create or _____ bias.
reinforce
Fill in the blank: Once data is clean, a data analyst moves on to _____ and verification.
reporting
Data analysts answer questions and solve problems. These are called business tasks
true
What is involved in seeing the big picture when verifying data cleaning? Select all that apply.
- Consider the goal - Consider the business problem
A data analyst has entered the analyze step of the data analysis process. Identify the questions they might ask during this phase.
- how will my data help me solve this problem - what story is my data telling me
Which of the following actions might occur when transforming data? Select all that apply.
- make calculations based on your data - recognize relationships in your data - identify a pattern in your data
the destroy stage of the data life cycle might involve....
- using data-erasure software - shredding paper files
during planning, a business decides...
- what kind of data it needs - how it will be managed throughout its life cycle - who will be responsible for it - optimal outcomes
By default, all visualizations you create using Tableau Public are available to other users. What icon do you click to hide a visualization?
Eye
When writing a query, what term is used to indicate where the database should look for the desired data?
FROM
Data is a collection of _______
Facts
Data analysts ensure their analysis is fair for what reason?
Fairness helps them avoid biased conclusions.
A WHEN statement considers one or more conditions and returns a value as soon as that condition is met.
False
A data analyst wants to save stakeholders time and effort when working with a Tableau dashboard. They also want to direct stakeholders to the most important data. What process can they use to achieve both goals?
Pre-filtering
An INNER JOIN is a function that returns records with matching values in two or more tables. An OUTER JOIN is a function that combines RIGHT and LEFT JOIN to return all matching records in both tables.
True
A data analyst wants to be sure all of the numbers in a spreadsheet are numeric. What function should they use to convert text to numeric values?
VALUE
qualities and characteristics associated with using facts to solve problems
analytical skills
Fill in the blank: A changelog contains a _____ list of modifications made to a project.
chronological
A delimiter is a character that indicates the beginning or end of a data item. The split text to columns tool uses a delimiter to accomplish what task?
to specify where to split a text string
Categorizing things involves assigning items to categories. Identifying themes takes those categories a step further, grouping them into broader themes or classifications.
true
Reviewing version history is an effective way to view a changelog in SQL.
False
Sometimes during analysis, an analyst discovers that it's necessary to adjust the business objective. When this happens, the analyst should take the initiative to do so without involving others in order to be respectful of their time.
False
A data analyst creates a histogram to share in a presentation. What are histograms used to demonstrate?
How often data values fall into certain ranges
On a railway line, peak ridership occurs between 7:00 AM and 5:00 PM. The fairness of a passenger survey could be improved by over-sampling data from which group?
Nighttime riders
During which of the four phases of analysis do you gather the relevant datasets for a project?
Organize data
Fill in the blank: A data analyst is working with the World Happiness data in Tableau. To get a better view of Moldova, they use the _____ tool.
Pan
When working with the World Happiness data in Tableau, what could you use the Filter tool to do?
Show only countries with a World Happiness score of 3.5 or lower
In what circumstance might a data analyst choose not to use external data in their analysis?
The data cannot be confirmed to be reliable
A data analyst is working with customer data. The analyst includes the DISTINCT clause in their SELECT statement for the customer_id field. They get a list of customer_id without duplicates.
True
A data analytics team works to recognize the current problem. Then, they organize available information to reveal gaps and opportunities. Finally, they identify the available options. These steps are part of what process?
Using structured thinking
Organizing available information and revealing gaps and opportunities are part of what process?
Using structured thinking
A data analyst creates a table, but they realize this isn't the best visualization for their data. To fix the problem, they decide to use the _____ feature to change it to a column chart.
chart editor
A data analyst is using statistical measures to get a better understanding of their data. What function can they use to determine how strongly two of the variables are related?
cor()
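A short base R sketch of cor(). The vectors hours and score are made-up values; cor() returns Pearson's correlation coefficient by default, a number between -1 and 1.

```r
# Hypothetical paired measurements.
hours <- c(1, 2, 3, 4, 5)
score <- c(52, 58, 61, 70, 74)

# Close to 1: the two variables rise together.
r <- cor(hours, score)
print(r)

# An exact linear relationship gives a correlation of 1.
perfect <- cor(hours, 2 * hours + 1)
print(perfect)
```

A value near 0 would suggest little linear relationship, and a negative value would mean one variable tends to fall as the other rises.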
the analytical skill that involves managing the processes and tools used in data analysis
data strategy
graphs, maps, and charts are used for what
data visualization
what is the term for the graphical representation of data
data visualization
A company defines a problem it wants to solve. Then, a data analyst gathers relevant data, analyzes it, and uses it to draw conclusions. The analyst shares their analysis with subject-matter experts, who validate the findings. Finally, a plan is put into action. What does this scenario describe?
data-driven decision-making
What do subject-matter experts do to support data-driven decision-making? Select all that apply.
- Validate the choices made as a result of the data insights - Offer insights into the business problem - Review the results of data analysis and identify any inconsistencies
RStudio's integrated development environment includes which of the following? Select all that apply.
- a console for executing commands - an area to manage loaded data - an editor for writing code
Why would a data analyst create a template of their .rmd file? Select all that apply.
- to save time when creating the same kind of document - to customize the appearance of a final report
What is the main difference between a formula and a function?
A formula is a set of instructions used to perform a specified calculation; a function is a preset command that automatically performs a specified process.
A data analyst inputs asterisks before a word or phrase in R Markdown. How will this appear in the document?
As bullet points
You are presenting to a large audience and want to keep everyone engaged during your Q&A. What can you do to ensure your audience doesn't grow disinterested despite its size?
Ask your audience for insights
A data analyst wants to assign the value 50 to the variable daily_dosage. Which of the following types of operators will they need to write that code?
Assignment
What symbol can be used to add bullet points in R Markdown?
Asterisks
A healthcare company keeps copies of their data at several locations across the country. The data becomes compromised because each location creates a copy of the original at different times of day. Which of the following processes caused the compromise?
Data replication
You have just finished analyzing data for a marketing project. Before moving forward, you share your results with members of the marketing team to see if they might have additional insights into the business problem. What practice does this support?
Data-driven decision-making
A data analyst is working on a project about the global supply chain. They have a dataset with lots of relevant data from Europe and Asia. However, they decide to generate new data that represents all continents. What type of insufficient data does this scenario describe?
Data that's geographically limited
The data analysis process phases are ask, prepare, process, analyze, share, and act. What do data analysts do during the ask phase?
define the problem to be solved
A data analytics team uses _____ to indicate consistent naming conventions for a project. This is an example of using data about data.
metadata
the attributes that describe a piece of data contained in a row of a table
observation
what works well for processing and analyzing a small dataset
spreadsheet
Spreadsheet cell L6 contains the text string "Function". To return the substring "Fun", what is the correct syntax?
=LEFT(L6, 3)
A data analyst is working with a spreadsheet that has very long text strings. They use a function to count the number of characters in cell G11. What is the correct syntax?
=LEN(G11)
A data analyst sorts a spreadsheet range between cells F19 and G82. They sort in ascending order by the second column, Column G. What is the syntax they are using?
=SORT(F19:G82, 2, TRUE)
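As a cross-check on the three spreadsheet formulas above (LEFT, LEN, SORT), here is a rough base R sketch of the same operations. This is an analogy rather than spreadsheet code; the data frame rng and its column names are hypothetical stand-ins for a two-column range.

```r
# =LEFT(L6, 3) with "Function" in the cell: first three characters.
left3 <- substr("Function", 1, 3)

# =LEN(G11): number of characters in the string.
n_chars <- nchar("Function")

# =SORT(range, 2, TRUE): sort rows ascending by the second column.
rng <- data.frame(
  col_f = c("pear", "apple", "fig"),
  col_g = c(30, 10, 20),
  stringsAsFactors = FALSE
)
sorted <- rng[order(rng$col_g), ]

print(left3)
print(n_chars)
print(sorted)
```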
In the data analysis process, how does a sample relate to a population?
A sample is a part of a population that is representative of the population.
Select the best description of gut instinct.
An intuitive understanding of something with little or no explanation
Which phase of the data analysis process has the goal of identifying trends and relationships?
Analyze
Fill in the blank: The _____ function can be used to return non-null values in a list.
COALESCE
A data analyst wants to write a SQL query to combine data from two columns into a new column. What function can they use?
CONCAT
While verifying cleaned data, a data analyst encounters a misspelled name. Which function can they use to determine if the error is repeated throughout the dataset?
COUNTA
Who is the composer listed in row 4 of your query result?
Caetano Veloso
What is the first step in the verification process?
Compare the cleaned data with the original, uncleaned dataset.
A data analyst in human resources uses a spreadsheet to keep track of employees' work anniversaries. They add color to any employee who has worked for the company for more than 10 years. Which spreadsheet tool changes how cells appear when values equal 10 or more?
Conditional formatting
A data analyst notices that two variables in their data seem to rise and fall at the same time. They recognize that these variables are related somehow. What is this an example of?
Correlation
In R, which statistical measure demonstrates how strong the relationship is between two variables?
Correlation
A data analyst commits a query to the repository as a new and improved query. Then, they specify the changes they made and why they made them. This scenario is part of what process?
Creating a changelog
In data analytics, how are dashboards different from reports?
Dashboards monitor live, incoming data from multiple datasets and organize the information into one central location. Reports are static collections of data.
Your stakeholders are concerned about the source of your data. They are unfamiliar with the organization that ran the analyses you referenced in your presentation. Which kind of objection are they making?
Data
An airline collects, observes, and analyzes its customers' online behaviors. Then, it uses the insights gained to choose what new products and services to offer. What business process does this describe?
Data-driven decision-making
Billings Upholstery has defined a problem it needs to solve: Find a more environmentally friendly way to produce its furniture. A data analyst gathers relevant data, analyzes it, and uses it to draw conclusions. The analyst then shares their analysis with subject-matter experts from the manufacturing team, who validate the findings. Finally, a plan is put into action. This scenario describes what process?
Data-driven decision-making
Sharing the results of your analysis with colleagues who are very familiar with the business problem supports what practice?
Data-driven decision-making
An analyst working for a British school system just downloaded a dataset that was created in the United States. The data is formatted as U.S. dollars, but the analyst needs it to be in British pounds. What spreadsheet tool can help them select the right format?
Format as Currency
A data analyst wants to change their header to be one font size smaller. What should they add to their markdown syntax?
Hashtag
Finding patterns is one of the six problem types data analysts aim to solve. This type of problem might involve which of the following?
Identifying trends from historical data
A data analyst is trying to understand what data to use to help solve a business problem. They're asking questions such as, "What internal data is available in the database?" and "What outside facts do I need to research?" The data analyst is in which phase of the data analysis process?
Prepare
A data analyst wants to create a shareable report of their analysis with documentation of their process and notes explaining their code to stakeholders. What tool can they use to generate this?
R Markdown
Fill in the blank: When you execute code in the source editor, the code automatically also appears in the _____.
R console
A data analyst needs to quickly create a series of scatterplots to visualize a very large dataset. What should they use for the analysis?
R programming language
A team of data analysts is working on a complex analysis. The team needs to quickly process lots of data. They also need to easily reproduce and share every step of their analysis. What should they use for the analysis?
R programming language
What do correlation charts reveal about the data they contain?
Relationships
What does the asterisk (*) after SELECT tell the database to do in this query?
SELECT * tells the database to select all columns from the employee table.
Fill in the blank: Data analysts usually use _____ to deal with very large datasets.
SQL
A restaurant gathers data about a new dish by providing free samples to parties of six or more diners. What does this scenario describe?
Sampling bias
A data analyst is in the verification step. They consider the business problem, the goal, and the data involved in their analytics project. What scenario does this describe?
Seeing the big picture
Curiosity is an analytical skill that involves which of the following?
Seeking out new challenges and experiences
A data analyst writes the code summary(penguins) in order to show a summary of the penguins dataset. Where in RStudio can the analyst execute the code? Select all that apply.
- R console pane - Source editor pane
You are creating a presentation for stakeholders and are choosing whether to include static or dynamic visualizations. Describe the difference between static and dynamic visualizations.
Static visualizations do not change over time unless they're edited. Dynamic visualizations are interactive and can automatically change over time.
A data analyst is working with a spreadsheet that has very long text strings. Rather than counting the characters themselves to determine the number of characters they contain, what tool can they use?
The LEN function
Tableau is used to create dynamic and interactive visualizations. Dynamic visualizations can change over time. What does it mean for a visualization to be interactive?
The audience can control what data they see
Data analysis
The collection, transformation, and organization of data in order to draw conclusions, make predictions, and drive informed decision-making
When working with subqueries, which part of the query segment executes first?
The inner query
In data analytics, what is data aggregation?
The process of gathering data from multiple sources and combining it into a single, summarized collection.
A data analyst creates a data frame with data that has more than 50,000 observations in it. When they print their data frame, it slows down their console. To avoid this, they decide to switch to a tibble. Why would a tibble be more useful in this situation?
Tibbles won't overload the console because they automatically only print the first 10 rows of data and as many variables as will fit on the screen
In a spreadsheet, what is text wrapping used for?
To automatically change cell height in order to allow all of the text to fit inside
A data analyst makes changes to SQL queries and uses these comments to create a changelog. This involves specifying the changes they made and why they made them.
True
When using VLOOKUP, there are some common limitations that data analysts should be aware of. One of these limitations is that VLOOKUP only returns the first match it finds, even if there are many possible matches within the column.
True
A recycling center that sponsors a podcast about saving the environment is an example of what strategy?
Trying to reach a target audience
a set of instructions that performs a specific calculation using spreadsheet data is called
a formula
Fill in the blank: When creating a variable for use in R, your variable name should begin with _____.
a letter
describe the difference between a question and a problem
a question is designed to discover information, whereas a problem is an obstacle or complication that needs to be solved
A data analyst shares insights from their analysis during a formal presentation to stakeholders. In a slideshow, they make a data-driven recommendation for how to solve a business problem. What phase of the data analysis process would come next?
act
If an analyst creates the same kind of document over and over or customizes the appearance of a final report, they can use _____ to save them time.
a template
Fill in the blank: You want to record and share every step of your analysis, let teammates run your code, and display your visualizations. You decide to create _____ to document your work.
an R Markdown notebook
the primary goal of a data ______ is to find answers to existing questions by creating insights from data sources
analyst
Curiosity, understanding context, and having a technical mindset are all examples of _____ used in data-driven decision-making.
analytical skills
the process of identifying and defining a problem, then solving it by using data in an organized, step-by-step manner
analytical thinking
what practice involves identifying, defining, and solving a problem by using data in an organized, step-by-step manner?
analytical thinking
After opening the ice cream shop on her farm, the same dairy farmer then surveys the local community about people's favorite flavors. She uses the data she collected to determine that the top five flavors are strawberry, vanilla, chocolate, mint chip, and peanut butter. She feels confident in her decision to sell these flavors. This is part of which phase of the data life cycle?
analyze
During which phase of data analysis would a data analyst use spreadsheets or query languages to transform data in order to draw conclusions?
analyze
Fill in the blank: You can use the _____ function to put a text label on your plot to call out specific data points.
annotate()
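A minimal sketch of annotate(), assuming ggplot2 is installed. The data, the label text, and the label position are all invented for illustration.

```r
library(ggplot2)

# Hypothetical data with one point that stands out.
df <- data.frame(x = 1:5, y = c(2, 4, 3, 9, 5))

p <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  # Place a text label just above the point at x = 4, y = 9.
  annotate("text", x = 4, y = 9.5, label = "Outlier")
```

Calling print(p) draws the scatterplot with the label. annotate() adds the label as its own layer, so the underlying data frame is left unchanged.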
Fill in the blank: R Markdown notebooks can be converted into HTML, PDF, and Word documents, slide presentations, and _____.
dashboards
________________ is the collection, transformation, and organization of data in order to draw conclusions, make predictions, and drive informed decision-making
data analysis
Data governance is the process of ensuring that a company's _____ are managed in a formal manner.
data assets
the analytical skill that involves how you organize information
data design
in data analysis, which analytical skill involves the management of people, processes, and tools?
data strategy
Fill in the blank: In RStudio, the _____ is where you can find all the data you currently have loaded, and can easily organize and save it.
environment pane
Data _____ refers to well-founded standards of right and wrong that dictate how data is collected, shared, and used.
ethics
A dairy farmer decides to open an ice cream shop on her farm. After surveying the local community about people's favorite flavors, she takes the data they provided and stores it in a secure hard drive so it can be maintained safely on her computer. This is part of which phase of the data life cycle?
manage
use demographic data to target advertisements for a new consumer product for youths
marketing
quartet %>% group_by(set) %>% summarize(mean(x), sd(x), mean(y), sd(y), cor(x, y))
mean(y)
a metric goal is a _________ goal set by a company that is evaluated using metrics
measurable
A data analyst is working with a data frame called salary_data. They want to create a new column named total_wages that adds together data in the standard_wages and overtime_wages columns. What code chunk lets the analyst create the total_wages column?
mutate(salary_data, total_wages = standard_wages + overtime_wages)
A data analyst is working with a data frame named salary_data. They want to create a new column named wages that includes data from the rate column multiplied by 40. What code chunk lets the analyst create the wages column?
mutate(salary_data, wages = rate * 40)
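Both mutate() cards can be tried on a toy data frame, assuming dplyr is installed. The values in salary_data are made up; the column names come from the cards.

```r
library(dplyr)

# Hypothetical salary data with the columns named on the cards.
salary_data <- data.frame(
  rate = c(20, 25),
  standard_wages = c(800, 1000),
  overtime_wages = c(120, 0)
)

# New column as the sum of two existing columns.
with_total <- mutate(salary_data, total_wages = standard_wages + overtime_wages)

# New column as an existing column multiplied by a constant.
with_wages <- mutate(salary_data, wages = rate * 40)

print(with_total)
print(with_wages)
```

mutate() returns a new data frame and leaves salary_data untouched, so the result has to be assigned to a name to keep it.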
Fill in the blank: A data-storytelling narrative connects the data to the project _____.
objectives
Some of the most common symbols used in formulas include + (addition), - (subtraction), * (multiplication), and / (division). These are called _____.
operators
A doctor's office has discovered that patients are waiting 20 minutes longer for their appointments than in past years. To help solve this problem, a data analyst could investigate how many nurses are on staff at a given time compared to the number of _____.
patients with appointments
In which stage of the data life cycle does a business decide what kind of data it needs, how the data will be managed, and who will be responsible for it?
plan
During the _____ phase of the data life cycle, a business decides what kind of data it needs, how it will be managed, who will be responsible for it, and the optimal outcomes.
planning
Fill in the blank: In ggplot2, you use the _____ to add layers to your plot.
plus sign (+)
subject-matter experts look at the....
results of data analysis to identify any inconsistencies, make sense of gray areas, and eventually validate the choices being made
the reason why a problem occurs
root cause
Fill in the blank: You should distinguish elements of your data visualization by _____ the foreground and background and using contrasting colors and shapes. This makes the content more accessible.
separating
A metric is a single, quantifiable type of data that can be used for what task?
setting and evaluating goals
In which data analysis phase would a data analyst use visuals such as charts or graphs to simplify complex data for better understanding?
share
Data Analyst
someone who collects, transforms, and organizes data in order to drive informed decision-making
people who invest time and resources into a project and are interested in its outcome
stakeholders
Fill in the blank: A data analyst creates a presentation for stakeholders. They include _____ visualizations because they don't want the visualizations to change unless they choose to edit them.
static
in data analytics, SQL is an acronym meaning
structured query language
the people very familiar with a business problem are called _____. They are an important part of data-driven decision-making
subject-matter experts
Fill in the blank: A predetermined structure that includes a function's required information and its proper placement is called _____.
syntax
A data analyst works for an appliance manufacturer. Last year, the company's profits were down. Lower profits can be a result of fewer people buying appliances, higher costs to make appliances, or a combination of both. The analyst recognizes that those are big issues to solve, so they break down the problems into smaller pieces to analyze them in an orderly way. Which analytical skill are they using?
technical mindset
the ability to break things down into smaller steps or pieces and work with them in an orderly and logical way
technical mindset
use geographic data to power GPS technology in cars
technology
Fill in the blank: The benefits of using _____ for data analysis include the ability to quickly process lots of data and create high quality visualizations.
the R programming language
using spreadsheets to aggregate data would happen during....
the analyze stage
ownership is a key issue in data ethics. who owns data
the individual who originally generates the data
data ecosystem
the various elements that interact with one another in order to produce, manage, store, organize, analyze, and share data
Collaborating with a social scientist to provide insights into human bias and social contexts is an effective way to avoid bias in your data.
true
Data analysts use queries to request, retrieve, and update information within a database.
true
Data transformation can change the structure of the data. An example of this is taking data stored in one format and converting it to another.
true
Data-driven decision-making involves the five analytical skills: curiosity, understanding context, having a technical mindset, data design, and data strategy. Each plays a role in data-driven decision-making.
true
If you have a short time frame for data collection and need an answer immediately, you likely will have to use historical data
true
The columns in a spreadsheet are ordered by letter, and the rows are ordered by number.
true
To evaluate how well two or more data sources work together, data analysts use data mapping.
true
a database is a collection of data stored in a computer system
true
data analysts ask, "why?" five times in order to get to the root cause of a problem
true
during the process phase of data analysis, a data analyst cleans data to ensure it's complete and correct
true
gap analysis is used to examine and evaluate how a process currently works with the goal of getting to where you want to be in the future
true
in general, the usefulness of data decreases as time passes
true
surveying customers about their preferences and using that information to inform business strategy is an example of data-driven decision-making
true
A Boolean data type can have _____ possible values.
two
the analytical skill that has to do with how you group things into categories
understanding context
Fill in the blank: Documentation is the process of tracking _____ during data cleaning. Select all that apply.
- additions - deletions - changes
Which of the following are benefits of open-source code? Select all that apply.
- anyone can fix bugs in the code - anyone can use the code for free - anyone can create an add-on package for the code
Benefits to using dashboards to tell data stories
- being able to organize information from multiple datasets into one central location and enabling tracking and analysis of data - in addition, dashboards can simplify data visualization using tables, charts, and graphs
Using a programming language can help you with which aspects of data analysis? Select all that apply.
- clean your data - transform your data - visualize your data
In ggplot2, which of the following aesthetic attributes can you use to map variables to points? Select all that apply.
- color - size - shape
Which of the following are benefits of using ggplot2? Select all that apply.
- combine data manipulation and visualization - customize the look and feel of your plot - easily add layers to your plot
RStudio's integrated development environment lets you perform which of the following actions? Select all that apply.
- create data visualizations - import data from spreadsheets - install R packages
When working with data from an external source, what can metadata help data analysts do? Select all that apply.
- ensure data is clean and reliable - understand the contents of a database - combine data from more than one source
Relational databases illustrate relationships between tables. Which fields represent the connection between these tables? Select all that apply.
- foreign keys - primary keys
what actions might a data analytics team take in the act phase of the data analysis process
- putting a plan into action to help solve the business problem - validating insights provided by analysts - finalizing a strategy based on the analysis
What are the key elements of effective visualizations you should focus on when creating data visualizations? Select all that apply.
- refined execution - clear meaning
Structured query language (SQL) enables data analysts to _____ the information in a database. Select all that apply.
- request - retrieve - update
Which of the following are included in R packages? Select all that apply.
- tests for checking your code - reusable R functions
what are the key benefits of data visualizations
- they can clearly demonstrate patterns and trends - they can help stakeholders understand complex data more quickly - they can illustrate relationships between data points
A data analyst is considering using tibbles instead of basic data frames. What are some of the limitations of tibbles? Select all that apply.
- tibbles can never create row names - tibbles can never change the input type of the data
Many data analysts prefer to use a programming language for which of the following reasons? Select all that apply.
- to save time - to easily reproduce and share an analysis - to clarify the steps of an analysis
Fill in the blank: TRIM is a function that removes _____ spaces in data. Select all that apply.
- trailing - repeated - leading
what steps do data analysts take to ensure fairness when collecting data
- use an inclusive sample population - understand the social context - include data self reported by individuals
Asking questions including, "Does my analysis answer the original question?" and "Are there other angles I haven't considered?" enable data analysts to accomplish what tasks? Select all that apply.
- use data to get a solid conclusion - consider the best ways to share data with others - help team members make informed, data-driven decisions
An effective slideshow guides your audience through your main communication points. What are some best practices to use when writing text for a slideshow? Select all that apply.
- Choose a font size that audience members can read easily - avoid slang terms - define unfamiliar abbreviations
Understanding context is an analytical skill best described by which of the following? Select all that apply.
- Gathering additional information about data to understand the broader picture - identifying the motivation behind the collection of a dataset - adding descriptive heads to columns of data in a spreadsheet
You use spotlighting to help you identify the most important insights. Which of the following activities are involved with spotlighting? Select all that apply.
- Identifying connections or patterns - Finding ideas or concepts that keep arising
You decide to create an R Markdown notebook to document your work. What are your reasons for choosing an R Markdown notebook? Select all that apply.
- It allows users to run your code - It displays your data visualizations - it lets you record and share every step of your analysis
You read an interesting article in a magazine and want to share it in the discussion forum. What should you do when posting? Select all that apply.
- Make sure the article is relevant to data analytics. - Check your post for typos or grammatical errors.
Data-driven decision-making is using facts to guide business strategy. The benefits include which of the following?
- Using data analytics to find the best possible solution to a problem - getting a complete picture of a problem and its causes - combining observation with objective data
A data analyst adds specific characters before and after their code chunk to mark where the data item begins and ends in the .rmd file. What are these characters called?
Delimiters
A data analyst is inserting a line of code directly into their .rmd file. What will they use to mark the beginning and end of the code?
Delimiters
A data analyst makes sure that they approach problems in a user-centric way. What element of data analytics does this describe?
Design thinking
Interoperability is key to open data's success. Which of the following is an example of interoperability?
Different databases use common formats and terminology
What is the process of tracking changes, additions, deletions, and errors during data cleaning?
Documentation
A data analyst uses an absolute reference to lock a function array so rows and columns don't change if the function is copied. What symbol is used to create an absolute reference?
Dollar sign ($)
You are presenting your theory about the correlation between recent sales increases and a current pop culture trend. When is the best time to establish your presentation's hypothesis for the audience?
During the introduction
A teammate asks you about the benefits of using R for the project. You mention that R can quickly process lots of data and create high quality data visualizations. What is another benefit of using R for the project?
Easily reproduce and share an analysis
You are preparing to communicate to an audience about an analysis project. You consider the roles that your audience members play and their stake in the project. What aspect of data storytelling does this scenario describe?
Engagement
A data analyst is working with spreadsheet data. The analyst imports the data from the spreadsheet into RStudio. Where in RStudio can the analyst find the imported data?
Environment pane
You run a colleague test on your presentation before getting in front of an audience. Your coworker asks a question about a section of your analysis, but addressing their concern would mean adding information you didn't plan to include. How should you proceed with building your presentation?
Expand your presentation by including the information
The COUNT DISTINCT function includes repeating values when returning values in a specified range.
False
The VALUE function converts a numeric value into a text string in a spreadsheet.
False
Verification and reporting come directly before the data-cleaning process.
False
You are working with the World Happiness data in Tableau. Which tool will enable you to show certain data while hiding the rest?
Filter
Fill in the blank: When working with a spreadsheet, data analysts can use the _____ function to locate specific characters in a string.
Find
In SQL databases, what data type refers to a number that contains a decimal?
Float
Fill in the blank: An important part of dashboard design is ensuring that charts, graphs, and other visual elements are cohesive. This means that they are _____ and make good use of available space.
balanced
putting data into context helps data analysts eliminate
bias
A data analyst wants to find out how much the predicted outcome and the actual outcome of their data model differ. What function can they use to quickly measure this?
bias()
the question or problem data analysis resolves for a business
business task
You want to create a vector with the values 12, 23, 51, in that exact order. After specifying the variable, what R code chunk allows you to create the vector?
c(12, 23, 51)
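A minimal sketch of this answer in context, assuming a hypothetical variable name `my_vector` (the question does not name the variable):

```r
# Hypothetical variable name; c() combines the values into a vector in the given order
my_vector <- c(12, 23, 51)

# Inspect the contents and the number of elements
print(my_vector)
length(my_vector)
```

The order of the arguments to `c()` is the order of the elements in the resulting vector.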
in a data life cycle, which phase involves gathering data from various sources and bringing it into the organization?
capture
which aspect of analytical thinking involves being able to identify a relationship between two or more pieces of data?
correlation
What does the geom_jitter() function do to the points in the plot?
creates a scatterplot and then adds a small amount of random noise to each point in the plot to make the points easier to find.
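The effect can be sketched as follows, assuming the ggplot2 package is installed. The data frame `df_overlap` and the `width` and `height` values are hypothetical choices for illustration only:

```r
library(ggplot2)

# Hypothetical data in which many points share identical coordinates
df_overlap <- data.frame(
  group = rep(c(1, 2, 3), each = 20),
  score = rep(c(5, 6, 7), times = 20)
)

# geom_point() would draw identical points on top of each other;
# geom_jitter() shifts each point by a small random amount so they separate
p <- ggplot(df_overlap, aes(x = group, y = score)) +
  geom_jitter(width = 0.2, height = 0.2)

print(p)
```

Because the noise is random, the exact point positions differ each time the plot is drawn.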
Fill in the blank: Filtering involves showing only the data that meets a specific _____ while hiding the rest.
criteria
A real estate company needs to hire a human resources assistant. The owner asks a data analyst to help them decide where to advertise the job opening. The analyst learns that the majority of human resources professionals are women, validates this finding with research, and targets ads to a women's community college. This is fair because the analyst conducted research to make sure the information about gender breakdown of human resources professionals was accurate.
false
An employer accesses an employee's credit report without their consent. This is not a violation of the employee's privacy because they work at the company.
false
As part of the data-cleaning process, a data analyst creates a rule to highlight any empty cells in a bright blue color. This is an example of data visualization.
false
Correlation is the aspect of analytical thinking that involves figuring out the specifics that help you execute a plan.
false
Data analysis is the various elements that interact with one another in order to provide, manage, store, organize, analyze and share data
false
If a data analyst compares the cost of an investment to the net profit of that investment over a period of time, they're analyzing the investment scope.
false
In data analytics, a pattern is defined as a process or set of rules to be followed for a specific task.
false
VLOOKUP searches for a value in a row in order to return a corresponding piece of information.
false
When writing a query, it's necessary for the name of the dataset to be inside two backticks in order for the query to run properly.
false
a data analyst finishes using a dataset, so they erase or shred the files in order to protect private information. this is called archiving
false
data analysts use a process called encryption to organize folders into subfolders
false
during the capture stage of the data lifecycle, a data analyst may use spreadsheets to aggregate data
false
A data analyst is working with the penguins data. The variable species includes three penguin species: Adelie, Chinstrap, and Gentoo. The analyst wants to create a data frame that only includes the Adelie species. The analyst receives an error message when they run the following code:
filter(species == "Adelie")
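A minimal sketch of this call in a pipeline, assuming the dplyr package is installed. The data frame `penguins_demo` is a small hypothetical stand-in for the penguins data, not the real dataset:

```r
library(dplyr)

# Hypothetical stand-in for the penguins data (the real dataset has more columns)
penguins_demo <- data.frame(
  species = c("Adelie", "Chinstrap", "Gentoo", "Adelie"),
  body_mass_g = c(3750, 3800, 5000, 3650)
)

# == tests each row for equality; only rows where species is "Adelie" are kept
adelie_only <- penguins_demo %>%
  filter(species == "Adelie")

print(adelie_only)
```

The double equals sign `==` is a comparison operator, so `filter()` receives a TRUE or FALSE value for every row.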
a set of instructions used to perform a calculation using the data in a spreadsheet
formula
when writing a query, what word does a data analyst use to indicate the table from which the data will be retrieved?
from
a preset command that automatically performs a specified process or task using the data in a spreadsheet
function
What method involves examining and evaluating how a process works currently in order to get it where you want it to be in the future?
gap analysis
a method for examining and evaluating the current state of a process in order to identify opportunities for improvement in the future
gap analysis
the term ______ is defined as an intuitive understanding of something with little or no explanation
gut instinct
A data analyst is exploring their data to get more familiar with it. They want a preview of just the first six rows to get a better idea of how the data frame is laid out. What function should they use?
head()
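A short sketch of `head()` on a hypothetical ten-row data frame named `df_demo`:

```r
# Hypothetical ten-row data frame for illustration
df_demo <- data.frame(id = 1:10, value = seq(10, 100, by = 10))

# head() returns the first six rows by default
first_rows <- head(df_demo)
print(first_rows)

# The n argument changes how many rows are returned
first_three <- head(df_demo, n = 3)
print(first_three)
```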
Fill in the blank: During the _____ phase of the design process, you start to generate data visualization ideas.
ideate
Fill in the blank: Data ecosystems are made up of elements that _____ with each other. This makes it possible for them to produce, manage, store, organize, analyze, and share data.
interact
Which type of bias is the tendency to always construe ambiguous situations in a positive or negative way?
interpretation
Fill in the blank: When a data analyst notices a data point that is very different from the norm in a scatter plot, the best course of action is to _____ the outlier.
investigate
Fill in the blank: A data analyst can make their visualizations more accessible by adding _____, which are text explanations placed directly on the visualizations.
labels
Fill in the blank: In Tableau, a diverging palette displays two value ranges. It uses a color to show the range where a data point is from and color intensity to show its ______.
magnitude
A data analyst is working with a data frame named retail. It has separate columns for dollars (price_dollars) and cents (price_cents). The analyst wants to combine the two columns into a single column named price, with the dollars and cents separated by a decimal point. For example, if the value in the price_dollars column is 10, and the value in the price_cents column is 50, the value in the price column will be 10.50. What code chunk lets the analyst create the price column?
unite(retail, "price", price_dollars, price_cents, sep=".")
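A minimal sketch of this call, assuming the tidyr package is installed. The sample rows and the `item` column are hypothetical; only the `price_dollars` and `price_cents` column names come from the question:

```r
library(tidyr)

# Hypothetical sample of the retail data frame described in the question
retail <- data.frame(
  item = c("mug", "pen"),
  price_dollars = c(10, 2),
  price_cents = c(50, 99)
)

# unite() pastes the two columns together with "." between them
# and drops the original columns by default
retail_priced <- unite(retail, "price", price_dollars, price_cents, sep = ".")
print(retail_priced)
```

The new `price` column is a character column, so it would need converting (for example with `as.numeric()`) before being used in arithmetic.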
Fill in the blank: Design thinking is a process used to solve problems in a _____ way.
user-centric
data-driven decision-making
using facts to guide business strategy
In data analytics, the data ecosystem refers to the .....
various elements that interact with one another to produce, manage, store, organize, analyze and share data
Fill in the blank: While cleaning data, a data analyst can use a changelog to keep a chronological list of changes they make. They can refer to it during the _____ period if there are errors or questions.
verification