D467 WGU Exploring Data

Which statistical power is typically considered the minimum for statistical significance?

0.8 (80%)

Fill in the blank: Typically, a data professional aims to achieve a statistical power of at least _____ to consider their results statistically significant.

0.8, or 80%
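
Study note: the 0.8 threshold can be checked numerically. The sketch below is an illustration, not part of the course material. It computes the power of a two-sided one-sample z-test for an assumed standardized effect size of 0.5 and an assumed significance level of 0.05; the function name `z_test_power` and both sample sizes are hypothetical.

```python
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05):
    """Power of a two-sided one-sample z-test (known standard deviation).

    effect_size is the true mean shift divided by the standard deviation.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # rejection threshold
    shift = effect_size * n ** 0.5              # how far the test statistic moves
    # Probability of landing in either rejection region when the effect is real.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


# Assumed scenario: same effect size, two candidate sample sizes.
power_small_sample = z_test_power(0.5, 16)
power_larger_sample = z_test_power(0.5, 32)

print(f"n=16: {power_small_sample:.3f}")
print(f"n=32: {power_larger_sample:.3f}")
```

Under these assumptions, only the larger sample clears the 0.8 target, which is why power is usually considered when choosing a sample size.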

To track people's online activities and interests, which method of data collection is most effective?

Cookies

What is the process of structuring folders broadly at the top, then breaking down those folders into more specific topics?

Creating a hierarchy

A data analyst removes personally identifying information from a dataset. What task are they performing?

Data anonymization

Which process do data analysts use to make data more organized and easier to read?

Data manipulation

Which process may restrict data analysis needs and should be balanced with data access needs?

Data security

Which tool is used by data analysts to store and organize data, making it easier for them to manage and access information?

Database

What is an example of conceptual data modeling that an analyst uses?

Defining the business requirements for a new database

Which of the following questions would enable a data professional to collect nominal qualitative data?

Did anyone recommend our music lessons to you?

What data-security measure uses a unique algorithm to alter data and make it inaccessible without the algorithm?

Encryption

In data analytics, what is the term for data that is generated from, and lives, outside of an organization?

External

When discussing structured databases, data analysts refer to the data contained in a row as a record. How do they refer to the data contained in a column?

Field

A data analyst is reviewing a national database of real estate sales. They are only interested in sales of condominiums. How can the analyst narrow their scope?

Filter out non-condominium sales

What is a feature of the filtering process when applied to spreadsheets?

Filtering hides the data temporarily.

If you have to complete your analysis with insufficient data, how should you address this limitation?

Identify trends with the available data

Bringing data from a .csv file into a spreadsheet is an example of what process?

Importing data

How does an analyst apply the principle of ownership to ethics and privacy in data collection?

Individuals who create the data should own it.

A manager in charge of selling a particular product interprets any ambiguous customer feedback about the product as being positive. What type of bias does this represent?

Interpretation

What is the general rule regarding the suggested length of each line in a query to maintain indentation best practices?

Less than or equal to 100 characters

A data analytics team uses data about data to indicate consistent naming conventions for a project. What type of data is involved in this scenario?

Metadata

What is the term for an identifier that references a database column in which each value is unique?

Primary key

Which file name follows formatting conventions?

SalesReport_2021

A university surveys its student-athletes about their experience in college sports. The survey only includes student-athletes with scholarships. What type of bias does this scenario describe?

Sampling

A grocery store chain purchases customer data from a credit card company. The grocer uses this data to identify its most loyal customers and offer them special promotions and discounts. What type of data is being used in this scenario?

Second-party

What are cookies?

Small files stored on computers that contain information about users

What is the process for arranging data into a meaningful order to make it easier to understand, analyze, and visualize?

Sorting

Which type of structured data does an analyst use?

Store inventory

A large company has several databases across its many departments. What kind of metadata describes how many locations contain a certain piece of data?

Structural

When using tokenization as a safety measure, what is replaced as a randomly generated token?

The data elements to be protected
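
Study note: a minimal sketch of tokenization, assuming an in-memory dictionary stands in for a secured token vault. The `tokenize` helper and the sample card number are hypothetical; real systems keep the vault in a separately protected store.

```python
import secrets


def tokenize(value, vault):
    """Swap a sensitive value for a random token and remember the mapping."""
    token = secrets.token_hex(16)  # random, carries no information about value
    vault[token] = value           # only the vault can map the token back
    return token


vault = {}                           # stand-in for a secured token vault
card_number = "4111 1111 1111 1111"  # made-up sample value
token = tokenize(card_number, vault)

print(token)         # safe to store or share in place of the card number
print(vault[token])  # original value, recoverable only through the vault
```

The token replaces the protected data element itself, so a dataset holding only tokens reveals nothing without access to the vault.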

In order to have a high confidence level in a customer survey, what should the sample size accurately reflect?

The entire population

What concept states that all data-processing activities and algorithms should be completely explainable and understood by the individual who provides their data?

Transaction transparency

What is the best practice for naming folders and subfolders to organize data?

Use descriptive names.

A data analyst at a retail company uses a tool to explore the data in its customer database. They learn the definition of each column, the data types contained, and the relationships between different tables. What does this scenario describe?

Using a metadata repository

A data analyst wants to find out how many middle school students in Helsinki have laptops. It is unlikely that they can survey every middle schooler in the city. Instead, they survey enough students to represent all middle schoolers. This describes what data analytics concept?

Using a sample

A political scientist needs to poll all voters in Seoul, South Korea, in order to predict the outcome of an election. Because it would be impossible to collect data from every single person in the city, the political scientist polls a part of the population that is representative of the whole. What does this scenario describe?

Using a sample

What data-security practice enables all collaborators within a file to track changes, such as who made what edits to the file, when they were made, and why?

Version control

Fill in the blank: When using SQL, the _____ clause can be used to filter a dataset of customers to only include people who have made a purchase in the past month.

WHERE
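
Study note: the WHERE answer can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the `customers` table, its columns, and the sample rows are made up, and a fixed reference date is used so the result is reproducible.

```python
import sqlite3

# Hypothetical table and sample rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, last_purchase_date TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Ana", "2024-05-20"), ("Ben", "2024-03-02"), ("Caro", "2024-05-31")],
)

# WHERE keeps only rows whose last purchase falls within the month
# before the reference date; all other rows are left out of the result.
reference_date = "2024-06-01"
rows = conn.execute(
    "SELECT name FROM customers "
    "WHERE last_purchase_date >= date(?, '-1 month') "
    "ORDER BY name",
    (reference_date,),
).fetchall()

recent_customers = [name for (name,) in rows]
print(recent_customers)
```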

What are data ethics?

Well-founded standards of right and wrong that dictate how data is collected, shared, and used

Fill in the blank: A data type is a specific kind of data _____ that tells what kind of value the data is.

attribute

Fill in the blank: Sampling _____ occurs when some members of a population are overrepresented or underrepresented in the data.

bias

Fill in the blank: Bias is a _____ preference in favor of or against a person, group of people, or thing.

conscious or subconscious

Fill in the blank: Naming _____ are consistent guidelines used to describe the content, date, or version of a file.

conventions

Fill in the blank: The number of stars awarded to a product review is an example of _____ data.

discrete

Fill in the blank: For data analytics projects, _____ data is typically preferred because users know it originated within the organization.

first-party

Fill in the blank: A data analyst uses _____ to organize multiple files for a given project so they can be found and accessed in an efficient manner.

foldering

Fill in the blank: Openness refers to _____ access, usage, and sharing of data.

free

Fill in the blank: To keep a header row at the top of a spreadsheet, highlight the row and select _____ from the View menu.

freeze

Fill in the blank: Data _____ is a process data professionals use to ensure the formal management of their organization's data assets.

governance

Fill in the blank: Data professionals use data _____ to handle issues related to internal and external data flows while ensuring data assets are formally managed.

governance

Fill in the blank: To keep files organized, use a logical _____ to organize folders and subfolders.

hierarchy

Fill in the blank: Data _____ involves the accuracy, completeness, consistency, and trustworthiness of data throughout its lifecycle.

integrity

Fill in the blank: Hypothesis testing is a way to see if a survey or experiment has _____ results.

meaningful

What is an acceptable syntax for the SELECT keyword in MySQL?

select

Fill in the blank: A relational database contains a series of _____ that can be connected to form relationships.

tables

You are in charge of your company's weekly accounting spreadsheet. It has 15 sheets, each containing a different employee's purchases. You add restrictions to the spreadsheet to make sure employees can only edit their own sheets. What practice does this scenario describe?

Data security

A large metropolitan high school gives each of its students an ID number to differentiate them in its database. What kind of metadata are the ID numbers?

Descriptive

Which element of a Notepad file would be considered data as opposed to underlying metadata?

File contents

What is an example of administrative metadata for a digital file?

File permission

A data team at a trade school is sending a text alert to all students who have fewer than 10 credits. What spreadsheet tool will enable them to display only the students who meet that condition?

Filter out students with 10 or more credits

Which process utilizes logical and descriptive names for files, making them easier to find and use?

Foldering

A data analyst works on an urgent traffic study. As a result of the short time frame, which type of data might yield the best results?

Historical

In Google Sheets, what function enables a data analyst to specify a range of cells in one spreadsheet to be duplicated in another?

IMPORTRANGE

How does an analyst ensure that a data source is reliable?

It is accurate, complete, and unbiased information.

An expert in query languages searched for month_name = using Vertica. The data set contains variations of the word December, such as dec, Dec, etc. What will the output of this search query be?

It will return all entries that match DEC only.
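
Study note: the exact-match behavior in the answer above can be illustrated with SQLite as a stand-in; this is not Vertica, and it assumes the query compared month_name to the literal 'DEC', as the answer suggests. The `sales` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (month_name TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?)",
    [("DEC",), ("dec",), ("Dec",), ("December",)],
)

# A plain equality test is case-sensitive here: only the exact string matches.
exact = [
    m for (m,) in conn.execute(
        "SELECT month_name FROM sales WHERE month_name = 'DEC'"
    )
]

# Normalizing case first picks up the other abbreviations as well.
normalized = [
    m for (m,) in conn.execute(
        "SELECT month_name FROM sales WHERE UPPER(month_name) = 'DEC'"
    )
]

print(exact)
print(sorted(normalized))
```

Cleaning the variations into one consistent spelling before querying avoids relying on either form of matching.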

What process do data professionals use to eliminate data redundancy, increase data integrity, and reduce complexity in a database?

Normalization

When using long data, each subject has data in multiple rows. This is because each row represents what?

One observation per subject

A financial institution publishes data about stock prices and market trends, which any business, nonprofit, or citizen can access, reuse, or redistribute through its online databases. What type of data is described in this scenario?

Open

What leads to confirmation bias in data collection?

People search to verify preexisting beliefs.

In data analytics, what term refers to all possible data values in a dataset?

Population

An analyst used a column of a table to uniquely identify each record within a table. Which tool did they use?

Primary key
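
Study note: a short sketch of how a primary key enforces uniqueness, using Python's built-in sqlite3 module. The `students` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# student_id is the primary key, so each value may appear only once.
conn.execute("CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO students VALUES (1, 'Ada')")

try:
    # Reusing an existing key value violates the uniqueness rule.
    conn.execute("INSERT INTO students VALUES (1, 'Grace')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
print(duplicate_rejected, row_count)
```

Because the database rejects the duplicate, every record stays identifiable by its key alone.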

Legal right to access your data is an element of which aspect of data ethics?

Privacy

What is the difference between raw data and information?

Raw data is unorganized, while information is structured.

A research team conducts an experiment to determine if a new cybersecurity tool is more effective than the previous version. What type of results are required for the experiment to be statistically significant?

Results that are real and not caused by random chance

Fill in the blank: A data model is used to organize _____ and how they relate to one another.

data elements

Fill in the blank: When using a relational database, data analysts write _____ to request data from the related tables.

queries

Fill in the blank: Data is considered _____ when it is accurate, complete, and unbiased information that has been vetted and proven fit for use.

reliable

Fill in the blank: Data security involves adopting _____ in order to protect data from unauthorized access or corruption.

safety measures

Fill in the blank: The data ethics principle of _____ states that an individual has the right to understand all of the data processing activities and algorithms used on their data.

transaction transparency

Which example shows the use of primary data?

A company's survey data of its customers' satisfaction

What is the preferred method for open data to be made available?

A convenient and modifiable internet download

Which of the following examples would be the most effective file name?

AirportCampaign_2013_10_09_V01

What can be removed from the following query without preventing it from running? SELECT * FROM `Uni_dataset.new_table` WHERE ID = 'Lawrence'

Backticks (`)

What should an analyst consider at the start of data collection to reduce errors?

Bias and fairness

A data scientist at a tech company records whether users have accepted their company's terms of service or not. What data type is being collected in this scenario?

Boolean

How does an analyst apply the principle of consent to ethics and privacy in data collection?

By disclosing how and why the data will be used before the survey

A data analyst works for a rental car company. They have a spreadsheet that lists car ID numbers and the dates cars were returned. How should they sort the spreadsheet to find the most recently returned cars?

By return date, in descending order
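
Study note: the same sort expressed in Python, as a sketch with made-up car IDs and return dates.

```python
from datetime import date

# Hypothetical rows: (car ID, return date).
returns = [
    ("CAR-101", date(2024, 4, 2)),
    ("CAR-102", date(2024, 4, 9)),
    ("CAR-103", date(2024, 3, 28)),
]

# Sort on the return date; reverse=True gives descending order,
# so the most recently returned car comes first.
most_recent_first = sorted(returns, key=lambda row: row[1], reverse=True)

for car_id, returned_on in most_recent_first:
    print(car_id, returned_on.isoformat())
```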

How does an analyst apply data ethics to privacy and collection?

By using the collected data responsibly

A database table is named WebTrafficAnalytics. What type of case is this?

Camel case

Before analysis, a company collects data from countries that use different date formats. Which of the following actions would improve the data integrity?

Change all of the dates to the same format
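
Study note: one way to standardize mixed date formats, sketched in Python. It assumes the source country of each record is known; the `SOURCE_FORMATS` mapping, the `to_iso` helper, and the sample records are hypothetical.

```python
from datetime import datetime

# Assumed date format used by each source country.
SOURCE_FORMATS = {
    "US": "%m/%d/%Y",  # month/day/year
    "DE": "%d.%m.%Y",  # day.month.year
    "JP": "%Y/%m/%d",  # year/month/day
}


def to_iso(raw, country):
    """Parse a date with its country's format and return YYYY-MM-DD."""
    return datetime.strptime(raw, SOURCE_FORMATS[country]).date().isoformat()


# The same calendar day written three different ways.
records = [("03/04/2024", "US"), ("04.03.2024", "DE"), ("2024/03/04", "JP")]
standardized = [to_iso(raw, country) for raw, country in records]
print(standardized)
```

Without the country information, a value such as 03/04/2024 is ambiguous, which is why the format is recorded per source rather than guessed.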

An analyst needs to show the geographic distribution of customers in the United States by region. Which visual representation should this analyst use?

Charts

A data team at a nature preserve researches the origin of a dataset to confirm it was created by a reputable source, such as a nonprofit research institution. Which aspect of good data are they prioritizing?

Cited

In a data table, where are fields contained?

Columns

What type of file saves data in a table format?

Comma-separated values (.csv)

What is the term for the tendency to search for or interpret information in a way that validates pre-existing beliefs?

Confirmation bias

Before completing a survey, a respondent learns more about how their data will be used. They understand why their data is being collected and how long it will be stored. What data ethics concept does this describe?

Consent

Before completing a survey, an individual acknowledges reading information about how and why the data they provide will be used. What is this concept called?

Consent

What type of data is the height of a skyscraper?

Continuous

Which role does an analyst have in collecting second-party data?

Contracts with an external entity

A junior data analyst learns that the dataset they have been given is six years old. After looking into this further, they also discovered that the age of the data is making the information irrelevant to their project. What good data source principle have they used to evaluate the dataset?

Current

