Quiz Questions

Which of the following principles are key elements of data integrity? Select all that apply. -Consistency -Accuracy -Selectivity -Trustworthiness

-Consistency -Accuracy -Trustworthiness

A data analyst runs a SQL query to extract some data from a database for further analysis. How can the analyst save the data? Select all that apply. -Use the UPDATE query to save the data. -Create a new table for the data. -Run a SQL query to automatically save the data. -Download the data as a spreadsheet.

-Create a new table for the data. -Download the data as a spreadsheet.

In a survey about a new cleaning product, 75% of respondents report they would buy the product again. The margin of error for the survey is 5%. Based on the margin of error, what percentage range reflects the population's true response?

70%-80%
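The arithmetic behind this answer can be sketched in a few lines of Python. The function name `margin_of_error_range` is a hypothetical helper, not something from the course material; it subtracts and adds the margin of error to the sample result.

```python
def margin_of_error_range(sample_pct, margin_pct):
    """Return the (low, high) percentage bounds implied by a margin of error."""
    return sample_pct - margin_pct, sample_pct + margin_pct


# 75% of respondents would buy again, with a 5% margin of error.
low, high = margin_of_error_range(75, 5)
print(f"{low}%-{high}%")  # prints 70%-80%
```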

A data analyst uses the COUNTIF function to count the number of times a value less than 5 occurs between spreadsheet cells A2 through A100. What is the correct syntax?

=COUNTIF(A2:A100,"<5")
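For comparison, here is a rough Python equivalent of what this COUNTIF formula computes. The list `values` stands in for cells A2 through A100 and its contents are made up for illustration.

```python
def countif_less_than(values, threshold):
    """Count how many values are strictly less than the threshold."""
    return sum(1 for v in values if v < threshold)


# Hypothetical stand-in for the spreadsheet range A2:A100.
values = [1, 7, 4, 5, 2, 9]
count = countif_less_than(values, 5)
print(count)  # prints 3 (the values 1, 4, and 2; 5 itself is not less than 5)
```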

TRUE or FALSE: To evaluate how well two or more data sources work together, data analysts use data mapping.

True

To correct a typo in a database column, where should you insert a CASE statement in a query? -As a GROUP BY clause -As a FROM clause -As a SELECT clause -As an ORDER BY clause

-As a SELECT clause

Which SQL tool considers one or more conditions, then returns a value as soon as a condition is met? -WHEN -CASE -ELSE -THEN

-CASE
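As an illustration of the two CASE questions above, the sketch below runs a CASE expression inside a SELECT clause to correct a typo. It uses Python's built-in `sqlite3` module so it is runnable anywhere; the table `customers`, its columns, and the misspelling `Tnoy` are assumptions made up for the example.

```python
import sqlite3

# In-memory database with a hypothetical table containing a typo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, first_name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Tnoy"), (2, "Maria"), (3, "Tnoy")],
)

# CASE checks each condition in order and returns a value as soon as one is met.
rows = conn.execute(
    """
    SELECT id,
           CASE
               WHEN first_name = 'Tnoy' THEN 'Tony'
               ELSE first_name
           END AS cleaned_name
    FROM customers
    ORDER BY id
    """
).fetchall()
conn.close()

print(rows)  # prints [(1, 'Tony'), (2, 'Maria'), (3, 'Tony')]
```

Note that the query only changes what is returned; the stored rows still contain the typo.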

Fill in the blank: To count the total number of spreadsheet values within a specified range, a data analyst uses the _____ function. -COUNTA -SUM -WHOLE -TOTAL

-COUNTA

Before analysis, a company collects data from countries that use different date formats. Which of the following updates would improve the data integrity? -Remove data in an unfamiliar date format -Leave the dates in their current formats -Organize the data by country -Change all of the dates to the same format

-Change all of the dates to the same format

What is the first step in the verification process? -Inform others of your data-cleaning effort -Determine the quality of the data -Create a chronological list of modifications made to the data -Compare the cleaned data with the original, uncleaned dataset

-Compare the cleaned data with the original, uncleaned dataset

Making sure data is properly verified is an important part of the data-cleaning process. Which of the following tasks are involved in this verification? Select all that apply. -Asking stakeholders to check and confirm the data is clean -Considering whether the data is credible and appropriate for the project -Manually fixing any errors found in the data -Rechecking the data-cleaning effort

-Considering whether the data is credible and appropriate for the project -Manually fixing any errors found in the data -Rechecking the data-cleaning effort

A data analyst at a nonprofit organization is working with a dataset about a summer fundraiser. Although they have a lot of useful data by the end of the month, they recognize that the data is insufficient. So, they decide to wait until the end of the season to begin working with the dataset. Which type of insufficient data does this example describe? -Data from only one source -Data that keeps updating -Geographically limited data -Outdated data

-Data that keeps updating

Which of the following are limitations that might lead to insufficient data? Select all that apply. -Data that updates continually -Data from a single source -Outdated data -Duplicate data

-Data that updates continually -Data from a single source -Outdated data

What are the most common processes and procedures handled by data warehousing specialists? Select all that apply. -Ensuring data is secure -Ensuring data is backed up to prevent loss -Ensuring data is properly cleaned -Ensuring data is available

-Ensuring data is secure -Ensuring data is available -Ensuring data is backed up to prevent loss

What should an analyst do if they do not have the data needed to meet a business objective? Select all that apply. -Continue with the analysis using data from less reliable sources. -Create and use hypothetical data that aligns with analysis predictions. -Gather related data on a small scale and request additional time to find more complete data. -Perform the analysis by finding and using proxy data from other datasets.

-Gather related data on a small scale and request additional time to find more complete data. -Perform the analysis by finding and using proxy data from other datasets.

Which of the following tasks can data analysts do using both spreadsheets and SQL? Select all that apply. -Process huge amounts of data efficiently -Join data -Perform arithmetic -Use formulas

-Join data -Perform arithmetic -Use formulas

Documenting data-cleaning makes it possible to achieve what goals? Select all that apply. -Keep team members on the same page -Demonstrate to project stakeholders that you are accountable -Visualize the results of your data analysis -Be transparent about your process

-Keep team members on the same page -Demonstrate to project stakeholders that you are accountable -Be transparent about your process

A data analyst is cleaning a dataset with inconsistent formats and repeated cases. They use the TRIM function to remove extra spaces from string variables. What other tools can they use for data cleaning? Select all that apply. -Remove duplicates -Import data -Protect sheet -Find and replace

-Remove duplicates -Find and replace

A research team runs an experiment to determine if a new security system is more effective than the previous version. What type of results are required for the experiment to be statistically significant? -Results that are unlikely to occur again -Results that are real and not caused by random chance -Results that are hypothetical and in need of more testing -Results that are inaccurate and should be ignored

-Results that are real and not caused by random chance

Which of the following are benefits of using SQL? Select all that apply. -SQL offers powerful tools for cleaning data. -SQL can be used to program microprocessors on database servers. -SQL can be adapted and used with multiple database programs. -SQL can handle huge amounts of data.

-SQL can be adapted and used with multiple database programs. -SQL can handle huge amounts of data. -SQL offers powerful tools for cleaning data.

A data analyst wants to find out how many people in Utah have swimming pools. It's unlikely that they can survey every Utah resident. Instead, they survey enough people to be representative of the population. This describes what data analytics concept? -Margin of error -Statistical significance -Confidence level -Sample

-Sample

Data and business objectives might not align for a number of reasons. Which of the following issues can prevent alignment? Select all that apply. -Data visualization -Sampling bias -Insufficient data -Data integrity

-Sampling bias -Insufficient data

SQL is a language used to communicate with databases. Like most languages, SQL has dialects. What are the advantages of learning and using standard SQL? Select all that apply. -Standard SQL is much easier to learn than other dialects. -Standard SQL is automatically translated by databases to other dialects. -Standard SQL requires a small number of syntax changes to adapt to other dialects. -Standard SQL works with a majority of databases.

-Standard SQL requires a small number of syntax changes to adapt to other dialects. -Standard SQL works with a majority of databases.

In order to have a high confidence level in a customer survey, what should the sample size accurately reflect? -The most valuable members of the population -The entire population -The trends from other customer surveys -The predictions of stakeholders

-The entire population

A data analyst uses the COUNTA function to count which of the following? -The total number of headers in a specific range -The specific numbers in a dataset -The total number of values within a specified range -The total number of entries in a changelog

-The total number of values within a specified range

Conditional formatting is a spreadsheet tool that changes how cells appear when values meet a specific condition. Data analysts can use conditional formatting to do which of the following tasks? Select all that apply. -To sort data in a series of cells into a meaningful order -To identify blank cells or missing information -To make cells stand out for more efficient analysis -To calculate mathematical equations

-To identify blank cells or missing information -To make cells stand out for more efficient analysis

Why is it important for a data analyst to document the evolution of a dataset? Select all that apply. -To recover data-cleaning errors -To determine the quality of the data -To inform other users of changes -To identify best practices in the collection of data

-To recover data-cleaning errors -To determine the quality of the data -To inform other users of changes

What are the most common processes and procedures handled by data engineers? Select all that apply. -Transforming data into a useful format for analysis -Developing, maintaining, and testing databases and related systems -Verifying results of data analysis -Giving data a reliable infrastructure

-Transforming data into a useful format for analysis -Developing, maintaining, and testing databases and related systems -Giving data a reliable infrastructure

A data analyst is given a dataset for analysis. It includes data about the total population of every country in the previous 20 years. Which of the following questions would the analyst need more data to address? -Which country had the smallest population in 2017? -What was the reason for the population increase in a certain country? -What was the population of a certain country in 2020? -Which country had the greatest population in 2015?

-What was the reason for the population increase in a certain country?

In which of the following situations would a data analyst use SQL instead of a spreadsheet? Select all that apply. -When quickly pulling information from many different sources in a database -When recording queries and changes throughout a project -When using the COUNTIF function to find a specific piece of information -When working with a huge amount of data

-When quickly pulling information from many different sources in a database -When recording queries and changes throughout a project -When working with a huge amount of data

Fill in the blank: Documentation is the process of tracking _____ during data cleaning. Select all that apply. -additions -changes -deletions -inactivity

-additions -changes -deletions

Fill in the blank: If a data analyst is using data that has been _____, the data will lack integrity and the analysis will be faulty. -public -compromised -wide -clean

-compromised

Fill in the blank: While cleaning data, documentation is used to track _____. Select all that apply. -errors -deletions -changes -bias

-errors -deletions -changes

Fill in the blank: While cleaning data, a data analyst can use a changelog to keep a chronological list of changes they make. They can refer to it during the _____ period if there are errors or questions. -visualization -documentation -presenting -verification

-verification

An analyst is cleaning a new dataset containing 500 rows. They want to make sure the data contained from cell B2 through cell B300 does not contain a number greater than 50. Which of the following COUNTIF function syntaxes could be used to answer this question? Select all that apply.

=COUNTIF(B2:B300,">50")

Cell B3 contains the address below. What is the correct function to extract the five-digit postal code from it? 8621 Glendale Dr. Burlington, MA 01803

=RIGHT(B3,5)
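RIGHT returns the last n characters of a string. A minimal Python sketch of the same idea follows; the helper name `right` is hypothetical, and the variable `b3` stands in for the spreadsheet cell holding the address.

```python
def right(text, num_chars):
    """Return the last num_chars characters of text, like a spreadsheet RIGHT."""
    return text[-num_chars:] if num_chars > 0 else ""


# Stand-in for the address cell in the question.
b3 = "8621 Glendale Dr. Burlington, MA 01803"
postal_code = right(b3, 5)
print(postal_code)  # prints 01803
```

This only works because the postal code sits at the very end of the string; trailing spaces would need to be removed first.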

Describe the difference between a null and a zero in a dataset.

A null indicates that a value does not exist. A zero is a numerical response.
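The distinction matters in calculations. In the sketch below, Python's `None` plays the role of a null, and the readings are made-up numbers: a missing value is skipped when averaging, while a zero counts as a real response and pulls the average down.

```python
def mean_ignoring_nulls(values):
    """Average only the values that exist, skipping None (null) entries."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present)


readings_with_null = [4, None, 8]  # the middle value does not exist
readings_with_zero = [4, 0, 8]     # the middle value is a real response of 0

mean_null = mean_ignoring_nulls(readings_with_null)
mean_zero = mean_ignoring_nulls(readings_with_zero)
print(mean_null)  # prints 6.0
print(mean_zero)  # prints 4.0
```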

A car manufacturer wants to learn more about the brand preferences of electric car owners. There are millions of electric car owners in the world. Who should the company survey?

A sample of all electric car owners

Describe the relationship between a text string and a substring.

A text string is a group of characters within a cell. A substring is a smaller subset of that text string.

A data analyst is working with product sales data. They import new data into a database. The database recognizes the data for product price as text strings. What SQL function can the analyst use to convert text strings to floats?

CAST
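A small runnable sketch of CAST, using Python's built-in `sqlite3` module. The `sales` table and its rows are invented for illustration. SQLite calls its floating-point type `REAL`; other dialects use different names, so treat the type keyword as dialect-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table where prices were imported as text strings.
conn.execute("CREATE TABLE sales (product TEXT, price TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("widget", "19.99"), ("gadget", "5.50")],
)

# CAST converts the text price to a floating-point number in the result set.
rows = conn.execute(
    "SELECT product, CAST(price AS REAL) FROM sales ORDER BY product"
).fetchall()
conn.close()

print(rows)  # prints [('gadget', 5.5), ('widget', 19.99)]
```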

What SQL function lets you add strings together to create new text strings that can be used as unique keys?

CONCAT

Fill in the blank: Conditional formatting is a spreadsheet tool that changes how _____ appear when values meet a specific condition.

Cells

Every database has its own formatting, which can cause the data to seem inconsistent. Data analysts use the _____ tool to create a clean and consistent visual appearance for their spreadsheets.

Clear formatting

Fill in the blank: In data analytics, _____ describes how well two or more datasets are able to work together.

Compatibility

Which process do data analysts use to make data more organized and easier to read?

Data Manipulation

What is the process of combining two or more datasets into a single dataset?

Data Merging

A financial analyst imports a dataset to their computer from a storage device. As it's being imported, the connection is interrupted, which compromises the data. Which of the following processes caused the compromise?

Data Transfer

A data analyst at a software company wants to learn more about industry competitors. Because the software industry has more mergers than any other field, the companies and their products are constantly evolving. The analyst has a dataset from three years ago, and they notice that many of the companies and products in the dataset have changed. What makes the analyst decide that the data is insufficient, so they should generate fresh data instead?

Data is outdated

A data analyst uses the SPLIT function to divide a text string around a specified character and put each fragment into a new, separate cell. What is the specified character separating each item called?

Delimiter
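Python's built-in `str.split` works the same way as the spreadsheet SPLIT function described above. In this sketch the comma is the delimiter and the sample string is made up.

```python
# The comma is the delimiter that separates each item in the text string.
text = "Burlington,MA,01803"
fragments = text.split(",")
print(fragments)  # prints ['Burlington', 'MA', '01803']
```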

TRUE or FALSE: A data analyst is given a dataset for analysis. It includes data about the total population of every country in the previous 20 years. Based on the available data, an analyst would be able to determine the reasons behind a certain country's population increase from 2016 to 2017.

FALSE

TRUE or FALSE: Sometimes during analysis, an analyst discovers that it's necessary to adjust the business objective. When this happens, the analyst should take the initiative to do so without involving others in order to be respectful of their time.

FALSE

TRUE or FALSE: When gathering data through a survey, companies can save money by surveying 100% of a population.

FALSE

TRUE or FALSE: A data analyst determines an appropriate sample size for a survey. They can check their work by making sure the confidence level percentage plus the margin of error percentage add up to 100%.

False

TRUE or FALSE: VLOOKUP searches for a value in a row in order to return a corresponding piece of information.

False
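The statement is false because VLOOKUP searches down a column, not across a row. The sketch below is a simplified Python imitation of an exact-match lookup, not the spreadsheet's actual implementation; the function name `vlookup` and the `price_table` data are assumptions for illustration.

```python
def vlookup(search_key, table, column_index):
    """Scan the first column from top to bottom and return the value at
    column_index (1-based) from the first row whose first cell matches."""
    for row in table:
        if row[0] == search_key:
            return row[column_index - 1]
    return None  # a spreadsheet would show an #N/A error here


# Hypothetical lookup table: product name in column 1, price in column 2.
price_table = [
    ["apple", 0.5],
    ["bread", 2.25],
    ["milk", 1.75],
]
bread_price = vlookup("bread", price_table, 2)
print(bread_price)  # prints 2.25
```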

TRUE or FALSE: Verification and reporting come directly before the data-cleaning process.

False

In SQL databases, what data type refers to a number that contains a decimal?

Float

Fill in the blank: Data _____ refers to the accuracy, completeness, consistency, and trustworthiness of data throughout its life cycle.

Integrity

Fill in the blank: Data mapping is the process of _____ fields from one data source to another.

Matching

Fill in the blank: Margin of error is the _____ amount that the sample results are expected to differ from those of the actual population.

Maximum

A data analyst is analyzing medical data for a health insurance company. The dataset contains billions of rows of data. Which of the following tools will handle the data most efficiently?

SQL

A delimiter is a character that indicates the beginning or end of a data item. The split text to columns tool uses a delimiter to accomplish what task?

Split one column into two

Fill in the blank: A predetermined structure that includes a function's required information and its proper placement is called _____.

Syntax

Fill in the blank: To remove leading, trailing, and repeated spaces in data, analysts use the ____ function.

TRIM

Which of the following functions automatically remove extra spaces when cleaning data? -SNIP -CLEAR -TRIM -REMOVE

TRIM
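A rough Python equivalent of what TRIM does to a text value. The helper name `trim` is hypothetical; it drops leading and trailing spaces and collapses repeated spaces between words into one.

```python
def trim(text):
    """Remove leading, trailing, and repeated spaces from text."""
    return " ".join(part for part in text.split(" ") if part)


cleaned = trim("  Burlington,   MA  ")
print(repr(cleaned))  # prints 'Burlington, MA'
```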

Fill in the blank: Sampling bias in data collection happens when a sample isn't representative of _____.

The population as a whole

TRUE or FALSE: A data analyst makes changes to SQL queries and uses comments to create a changelog. This involves specifying the changes they made and why they made them.

True

A data analyst is cleaning a dataset. They want to confirm that users entered five-digit zip codes correctly by checking the data in a certain spreadsheet column. What would be most helpful as the next step?

Using the field length tool to specify the number of characters in each cell in the column

The V in VLOOKUP stands for what?

Vertical

