Week 2: Ensuring data integrity

Bias

A preference in favor of or against a person, group of people, or thing

Open data

Data that can be accessed, updated, reused, and exchanged by anyone and everyone.

Openness

Free access, usage, and sharing of data

GDPR (General Data Protection Regulation)

The European Union's regulation governing how personal data is collected, processed, and protected

Which of the following are usually good data sources? Select all that apply.

- Governmental agency data
- Academic papers
- Vetted public datasets

Ownership

Individuals own the raw data they provide and they have primary control over its usage, how it's processed, and how it's shared

What are the main benefits of open data? Select all that apply.

- Open data makes good data more widely available
- Open data combines data from different fields of knowledge

What aspect of data ethics promotes the free access, usage, and sharing of data?

Openness

Privacy

Preserving a data subject's information and activity any time a data transaction occurs

ROCCC

Reliable, Original, Comprehensive, Current, Cited

Kaggle's datasets and Data Explorer allow you to do which tasks?

- Search for datasets
- Upload your own datasets
- Access datasets

What is most often anonymized?

- Telephone numbers
- Names
- License plates and license numbers
- Social security numbers
- IP addresses
- Medical records
- Email addresses
- Photographs
- Account numbers

Data Interoperability

The ability of data systems and services to openly connect and share data

A data analyst is analyzing sales data for the newest version of a product. They use third-party data about an older version of the product. For what reasons is this inappropriate for their analysis? Select all that apply.

- The data is not current
- The data is not original

Data anonymization

The process of protecting people's private or sensitive data by eliminating identifying information
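
As a rough illustration of the definition above, the sketch below drops identifying fields from a record before it is shared. The field names in `IDENTIFYING_FIELDS` and the sample record are hypothetical, chosen only for this example; real datasets need their own review of which fields can identify a person.

```python
# Hypothetical set of identifying field names (an assumption for this sketch).
IDENTIFYING_FIELDS = frozenset({"name", "email", "phone", "ip_address"})


def anonymize(record, identifying_fields=IDENTIFYING_FIELDS):
    """Return a copy of the record without the identifying fields."""
    return {
        key: value
        for key, value in record.items()
        if key not in identifying_fields
    }


# Made-up example record; only the non-identifying fields survive.
raw_record = {
    "name": "Jane Doe",
    "email": "jane@example.com",
    "age_group": "30-39",
    "purchase_total": 42.5,
}
clean_record = anonymize(raw_record)
print(clean_record)
```

Dropping fields is only the simplest approach. Combinations of the remaining fields can sometimes still identify a person, so this sketch should not be read as a complete anonymization procedure.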

Observer bias (experimenter bias / research bias)

The tendency for different people to observe things differently

Interpretation bias

The tendency to always interpret ambiguous situations in a positive or negative way

Confirmation bias

The tendency to search for or interpret information in a way that confirms pre-existing beliefs

An unbiased sample is representative of the population being measured. Which of the following helps ensure unbiased sampling?

Using random sampling during data collection
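
A minimal sketch of random sampling, assuming a hypothetical class roster of 50 students from which 15 are surveyed. Every student has the same chance of selection, so no subgroup is favored by the way the sample is drawn. The roster names and the `draw_random_sample` helper are invented for illustration.

```python
import random


def draw_random_sample(population, sample_size, seed=None):
    """Pick sample_size distinct members, each equally likely to be chosen."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(list(population), sample_size)


# Hypothetical roster: student_01 through student_50.
students = [f"student_{i:02d}" for i in range(1, 51)]
sample = draw_random_sample(students, 15, seed=42)
print(sample)
```

Random selection removes bias from the choice of participants, but a small sample can still differ from the population by chance.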

Data Ethics

Well-founded standards of right and wrong that dictate how data is collected, shared, and used

Ethics

Well-founded standards of right and wrong that prescribe what humans ought to do, usually in terms of rights, obligations, benefits to society, fairness, or specific virtues

Unbiased sampling

When a sample is representative of the population being measured

Aspects of data ethics

- Ownership
- Transaction transparency
- Consent
- Currency
- Privacy
- Openness

Which of the following are examples of sampling bias? Select all that apply.

- A clinical study includes three times more men than women.
- A survey of high-school-age students does not include homeschooled students.
- A national election poll only interviews people with college degrees.

Data bias

A type of error that systematically skews results in a certain direction

Transaction transparency

All data-processing activities and algorithms should be completely explainable and understood by the individual who provides their data.

Consent

An individual's right to know explicit details about how and why their data will be used before agreeing to provide it

There are 50 students in a class. A data analyst wants to know if a majority of students like the instructor. They decide to survey the 15 students who earned an A in the class because these students were clearly paying attention to the instructor. Which of the following statements best describes this sample?

Biased

Before completing a survey, an individual acknowledges reading information about how and why the data they provide will be used. What is this concept called?

Consent

A data analyst removes personally identifying information from a dataset. What task are they performing?

Data anonymization

Universal participation is a standard of open data. What are the key aspects of universal participation? Select all that apply.

- Everyone must be able to use, re-use, and redistribute open data.
- No one can place restrictions on data to discriminate against a person or group.

Currency

Individuals should be aware of financial transactions resulting from the use of their personal data and the scale of these transactions

To determine if a data source is cited, you should ask which of the following questions? Select all that apply.

- Is this dataset from a credible source?
- Who created this dataset?

Which of the following terms are also ways of describing observer bias? Select all that apply.

- Research bias
- Experimenter bias

Fill in the blank: _____ states that all data-processing activities and algorithms should be completely explainable and understood by the individual who provides their data.

Transaction transparency

Sampling bias

When a sample isn't representative of the population as a whole

Fill in the blank: The tendency to search for or interpret information in a way that validates pre-existing beliefs is _____ bias.

confirmation

Personally identifiable information (PII)

Information about an individual that identifies, links, relates, or describes them

