ITM 209 Final Exam

Which of the following are properties of primary keys?

- Each tuple must have a unique primary key
- Several distinct attributes could be used together to form a primary key
- It is the candidate key that is chosen as the principal means of identifying tuples within a relation
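
As a minimal sketch of these properties, the following uses Python's built-in sqlite3 module with a hypothetical enrollments table (the table and column names are assumptions for illustration, not from the course material). Two attributes together form the primary key, and a second tuple with the same key pair is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: studentID and courseID together form the primary key.
conn.execute("""
    create table enrollments (
        studentID integer,
        courseID  text,
        grade     text,
        primary key (studentID, courseID)
    )
""")
conn.execute("insert into enrollments values (1, 'ITM209', 'A')")
# Same student, different course: the key pair is still unique, so this is allowed.
conn.execute("insert into enrollments values (1, 'ITM101', 'B')")


def insert_duplicate_key():
    """Try to insert a tuple whose (studentID, courseID) pair already exists."""
    try:
        conn.execute("insert into enrollments values (1, 'ITM209', 'C')")
        return "inserted"
    except sqlite3.IntegrityError:
        return "rejected"


duplicate_result = insert_duplicate_key()
row_count = conn.execute("select count(*) from enrollments").fetchone()[0]
```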

Which of the following is an example of "internet of things" (IOT) data?

- GPS Sensors
- Smart Utility Meters
- Fitness Sensors

All organizations need to understand and govern PII through which of the following:

- Identifying all sources of created, received, maintained or transmitted PII
- Evaluating all external sources of PII
- Identifying all human, natural and environmental threats to PII

Which of the following examples could cause a 'butterfly effect' (as defined in the text) in an organization's data?

- Inaccurate customer records
- Incomplete purchasing history
- A cascading spelling mistake

Which of the following is TRUE about a view:

- It can be used within a database to store table relationships (i.e. a query) for users to access
- It conceptually contains the results of a query
- If the underlying data in the tables and relations changes, so will the results of the query

What is the most secure type of authentication?

- Something the user knows, such as a user ID and password
- Something the user has, such as a smart card or token
- Something that is part of the user, such as a fingerprint or voice signature

Which of the following is explained as the reason that humans retain comparative advantage over artificial intelligence when addressing uncertainty and equivocality in decision making?

- Superior intuition
- Imagination
- Creativity

When considering the colors to use in a visualization, which of the following should be considered?

- Whether the color adds value to the visualization or is just decorative in nature
- The manner in which certain color schemes may be interpreted based upon the culture(s) of the audience
- The accessibility/readability by individuals with color blindness

Which of the following is a common characteristic of quality data?

- complete
- accurate
- unique
- timely

When inner joining two tables with a many to many relationship, how many inner join clauses are needed in your query?

2
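
A minimal sketch of why two inner join clauses are needed, using Python's sqlite3 module. The students, courses, and enrollments tables are hypothetical names chosen for illustration; enrollments is the third (junction) table that links the other two.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table students (studentID integer primary key, studentName text);
    create table courses  (courseID  integer primary key, title text);
    -- Junction table that resolves the many-to-many relationship.
    create table enrollments (
        studentID integer references students(studentID),
        courseID  integer references courses(courseID),
        primary key (studentID, courseID)
    );
    insert into students values (1, 'Ana'), (2, 'Ben');
    insert into courses  values (10, 'Databases'), (20, 'Visualization');
    insert into enrollments values (1, 10), (1, 20), (2, 10);
""")

# One inner join reaches the junction table, a second reaches the other side.
rows = conn.execute("""
    select students.studentName, courses.title
    from students
    inner join enrollments on enrollments.studentID = students.studentID
    inner join courses     on courses.courseID = enrollments.courseID
    order by students.studentName, courses.title
""").fetchall()
```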

By what year does Ray Kurzweil predict that machines will be able to achieve the intelligence of human beings?

2029

Which of the following does not describe unstructured data?

A defined length, type, and format.

Which of the following describes what Trend Lines are?

A feature in Tableau that shows a line representing the relationships between a set of data points that have been plotted (i.e., regression)

What is a data lake?

A storage repository that holds a vast amount of raw data in its original format until the business needs it.

Which of the following use of predictive analytics has variables that are changed due to factors outside the data-generating process and are independent of all other variables?

Active Prediction

Angela works for an identity protection company that maintains large amounts of sensitive customer information such as usernames, passwords, personal information, and social security numbers. Angela and a coworker decide to use the sensitive information to open credit cards in a few of her customer's names. This is a classic example of which of the following security breaches?

An insider.

Which level of maturity describes companies with an analytic culture that make data driven decisions and rely upon analytics for strategic insights?

Analytic Innovators

AJ wants to send Bryan a small message securely. He wants to make sure that only Bryan can read the message, thus ensuring confidentiality. Which of the following encryption methods would he use?

Asymmetric Encryption, with the message encrypted using Bryan's public key.

Which of the following represents the three areas where technology can aid in the defense against security attacks?

Authentication and authorization, prevention and resistance, detection and response.

When using colors to visually distinguish between dimensions on a visualization in Tableau, which of the following use of color is most commonly applied?

Categorical

What is the flaw with pseudonymization?

Certain data elements can still be used in combination with additional data to identify the data subject it relates to

Which of the following did the article indicate companies are accomplishing through better use of data analytics?

Competitive advantage and innovation

The assurance that messages and information remain available only to those authorized to view them.

Confidentiality

Joe is working with accounts receivable data. The database he is getting data from lists the balance in accounts receivable as $250,654. However, when he adds the accounts receivables from all the sub-accounts (i.e. customer accounts), he gets a value of $251,928. The database may be suffering from integrity issues due to which of the following quality characteristics?

Consistency

The ZN function in Tableau can be used to do which of the following?

Convert null values for a field in a data set to a value of 0

Scott has data that contains a field showing the high temperature in degrees Fahrenheit (ex. 65, 70, 73, 40, etc.) in his town by day. He wants to be able to show the temperature for each day in one of two categories: a) Normal: <80 b) Above Normal: >= 80

Create the following calculated field: IF [temperature] < 80 THEN "Normal" ELSE "Above Normal" END
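
The same two-way split can be sketched outside Tableau. This is a plain Python analogue of the calculated field above, not Tableau syntax; the function name and the sample temperatures beyond those in the question are assumptions.

```python
def temperature_category(temperature):
    """Mirror the IF/ELSE logic: below 80 is Normal, 80 or higher is Above Normal."""
    return "Normal" if temperature < 80 else "Above Normal"


# 65, 70, 73, 40 come from the question; 80 and 95 are added to show the boundary.
categories = [temperature_category(t) for t in [65, 70, 73, 40, 80, 95]]
```

Note that a reading of exactly 80 falls into "Above Normal", matching the `>= 80` rule in the question.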

Which of the following would be an example of predictive analytics?

Creating an analysis of the sales of a product for the prior year to identify what sales may be in the upcoming year.

Which of the following would be an example of predictive analysis?

Creating an analysis of the sales of a product for the prior year to identify what sales may be for the upcoming year.

If the following join statement is used to join two tables in a query, which of the following tables would all of the tuples in the relation appear in results? right outer join schema.customers on invoices.customerID = customers.customerID

Customers

Which of the following is the process of analyzing data to extract information not offered by the raw data alone?

Data Mining

Eric was asked to setup a visualization summarizing data on patients staying at the hospital based around the number of days they have been there. He has a data set that contains information on patients, which includes the date the patient was admitted (field admittedDate). Doing some research, he found that Tableau uses TODAY() to represent the current date. Which of the following calculated fields in Tableau would identify the number of days that patients have been at the hospital?

DATEDIFF('day',[admittedDate],TODAY())

Which of the following calculated fields in Tableau would identify the number of days it took to complete a service?

DATEDIFF('day',[dateIn],[dateComplete])
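
Both DATEDIFF cards compute an end date minus a start date, counted in days. A rough Python analogue, assuming whole calendar dates (the function name and sample dates are hypothetical):

```python
from datetime import date


def days_between(start, end):
    """Return the number of days from start to end, like a day-level date difference."""
    return (end - start).days


# Hypothetical service that came in on 1 January and was completed on 31 January.
service_days = days_between(date(2024, 1, 1), date(2024, 1, 31))

# For a patient still in the hospital, the end date would be today's date.
days_admitted = days_between(date(2024, 1, 1), date.today())
```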

Collecting information from many sources and storing them together into a single location is referred to as:

Data aggregation

Which of the following is the collection of data from various sources for the purpose of data processing?

Data aggregation

Tools used to find patterns and relationships in large volumes of information that predict future behavior and guide decision making are referred to as:

Data mining tools

Which of the following describes the phenomenon where there is an incentive to record everything?

Datafication

Which of the following is a type of visualization in which you are presenting findings to an audience?

Declarative visualization

Which of the following fields in a data set would usually be found in the Dimensions area in Tableau?

Departments

Bill runs a report of all the sales for the past quarter and puts it into a visualization to show his boss the results. This is an example of what type of analysis?

Descriptive

A summary or interpretation of a data set is an example of:

Descriptive Analytics

Less than half of companies surveyed in 2016 indicated they were effective at using data to guide future strategy, which was down over the prior year.

False

Which of the following keywords when used in an SQL select statement will remove duplicate records from the results?

Distinct

Color and formatting should be used in Tableau to:

Draw attention to relevant data

During which of the following processes does information cleansing usually occur?

ETL processes

Companies use data warehouses for each of the following except:

Enter and process invoices real-time as they are received.

Which of the following is not an example of a primary enterprise system?

Enterprise revenue planning

Which of the following would be an example of causal inference?

Establishing a relationship between tariff levels and the quantity of products imported

The principles and standards that guide our behavior toward other people

Ethics

What governing body passed GDPR?

European Union (EU)

Which of the following can be described as using if-then statements to capture human knowledge?

Expert systems

A data set is a collection of organized or unorganized data.

False

Association analysis groups observations according to some measure of similarity.

False

IBM's Watson can only analyze structured data.

False

Box and whisker plots are used for identifying correlation between two variables.

False

Contemporary database systems provide a three-level hierarchy for naming relations. The top level of the hierarchy consists of schemas, each of which contains catalogs.

False

Data models show the details of the physical view of information for a database.

False

Data within a view is a duplicate copy of the data that is in the underlying tables related to the view.

False

Discrete data can take on any value within a range

False

If data is changed in a table, any views that reference the table will not show the new data until the data in the view is also updated.

False

Intuitive approaches to decision making rely on depth of information; analytical approaches focus on breadth by engaging a problem with a holistic and abstract view.

False

PKE (Public Key Encryption) uses a single common key between the sender and recipient of a message to encrypt and decrypt the message

False

The additional table used to create a many to many relationship will have a primary key that only consists of a single unique attribute in all cases.

False

The difference between a direct causal relationship and an indirect causal relationship is that a direct causal relationship requires a third variable to impact a change in one variable on another variable.

False

The intersect operation does not remove duplicates; intersect all must be utilized.

False

The majority of companies that responded in the study described their companies as "open about sharing data."

False

The majority of organizations reported in the 2016 study that they are analytically challenged (i.e. they rely upon management's intuition more than data for decision making)

False

The only cause of poor quality of data is human error.

False

The problem solving ability of AI is more useful for supporting intuitive rather than analytical decision making.

False

The technique of organizing data into distinct segments that are defined before the analysis begins is referred to as cluster analysis.

False

The validation set of training data for an Advanced Neural Network is used only to test the final solution in order to confirm the actual predictive power of the network.

False

True or False: In most organizations, the managers in the operational areas (such as the manufacturing plant level) would be more interested in less granular information, whereas the executive officers of the organization would be requesting more granular information.

False

Unstructured data extracts information from data and uses it to predict future trends and identify behavioral patterns.

False

When creating relationships between tables, foreign keys are always optional. Only primary keys are needed in each table.

False

What would be the output from a query if the following wildcard pattern were used?

Finds any cities that have "or" in any position

Amazon tracking the behavior of its users is an example of collecting:

First-party data

What does GDPR stand for?

General Data Protection Regulation

Which of the following would violate a foreign-key constraint?

Having a value in the attribute for a foreign key that does not correspond to a value in the table which the foreign key is coming from.
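
A minimal sketch of such a violation using Python's sqlite3 module. The customers and invoices tables are hypothetical; SQLite only enforces foreign keys once `pragma foreign_keys = on` has been issued, which is a SQLite-specific detail rather than something from the course material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite-specific: enable enforcement
conn.executescript("""
    create table customers (customerID integer primary key, customerName text);
    create table invoices (
        invoiceID  integer primary key,
        customerID integer references customers(customerID)
    );
    insert into customers values (1, 'Acme');
""")


def insert_invoice(invoice_id, customer_id):
    """Insert an invoice and report whether the foreign-key constraint allowed it."""
    try:
        conn.execute("insert into invoices values (?, ?)", (invoice_id, customer_id))
        return "inserted"
    except sqlite3.IntegrityError:
        return "rejected"


valid_result = insert_invoice(100, 1)     # customer 1 exists
invalid_result = insert_invoice(101, 99)  # no customer 99 in the referenced table
```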

Kegan is creating a visualization in Tableau that shows the average profit per unit sold by country for each of the four (4) products that his company sells. His data has the unit selling price and unit cost for each sale. These values may vary from one sale to the next depending on market price of materials relating to the cost and the unit price negotiated with the customer. The data looks like the following (assume all currency values have been converted to US Dollars already). He created the following calculated field; however, he is not sure if it is giving an accurate average profit per unit when he applies it in his visualization using the AVG aggregation by country: ([Unit Price] - [Unit Cost]) / [Quantity] What advice would you give him and why?

He should be using the SUM function: SUM([Unit Price] - [Unit Cost]) / SUM([Quantity]) because some sales may contribute more or less to the average than others based upon
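
The general idea that some sales should count for more than others can be illustrated with a small Python sketch. This is an assumption-laden illustration, not the card's Tableau expression: it compares a simple average of per-sale unit profit with a quantity-weighted average, using made-up sales figures and hypothetical function names.

```python
# Hypothetical sales: one small sale with a high margin, one large sale with a low margin.
sales = [
    {"unit_price": 10.0, "unit_cost": 6.0, "quantity": 1},
    {"unit_price": 10.0, "unit_cost": 8.0, "quantity": 9},
]


def simple_average_unit_profit(rows):
    """Average the per-sale unit profit, treating every sale as equally important."""
    return sum(r["unit_price"] - r["unit_cost"] for r in rows) / len(rows)


def weighted_average_unit_profit(rows):
    """Divide total profit by total units, so larger sales carry more weight."""
    total_profit = sum((r["unit_price"] - r["unit_cost"]) * r["quantity"] for r in rows)
    total_units = sum(r["quantity"] for r in rows)
    return total_profit / total_units


simple_result = simple_average_unit_profit(sales)
weighted_result = weighted_average_unit_profit(sales)
```

With these figures the simple average overstates profit per unit, because the single high-margin unit counts as much as the nine low-margin units.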

HIPAA is a regulation that applies to which industry?

Healthcare

Which of the following are better at making decisions when there is uncertainty?

Humans

Shell's marketing department did what to find ways to use data smarter?

Implemented mandatory data-driven communications training for all marketers

Which of the following refers to the measure of the quality of information?

Information integrity

Which of the following is decreased when using a relational database?

Information redundancy

Which of the following is NOT a component of Artificial Intelligence?

Intuition Engine

Which of the following WOULD NOT be considered part of the ACCURATE characteristic of high-quality information?

Is aggregate information in agreement with detailed information?

Encryption:

Is used to scramble information into an alternative form that requires a key to read it.

Jill is creating a visualization in Tableau that is plotting points on a map. She decides to use the 'size' mark in her visualization. What does this accomplish?

It differentiates the points based upon the values of the measures used by making larger values visually bigger points.

What is the role of a foreign key?

It is an attribute that is the primary key of one table that appears as an attribute in another table. It acts to provide a logical relationship between the two tables

What are the first two lines of defense a company should take when addressing security risks?

People first, technology second.

Which of the following describes a full outer join?

It preserves tuples in both relations.

Which of the following is an example of lag information?

KPIs (key performance indicators)

Which of the following charts are good for showing data changes over time?

Line chart

What does PII stand for?

Personally Identifiable Information

Which of the following are technology challenges for big data?

Managing huge volumes of data, Managing streams at an extremely fast and variable pace, Managing a variety of forms and functions of data, Processing data at a huge speed

When did GDPR become effective?

May 25, 2018

Which of the following are all the same value in a normal distribution?

Mean, Median, Mode

The type of qualitative data that cannot be ranked, but can be used to count, group and take a proportion is:

Nominal

Which of the following is the first line of defense in securing information?

People

Jeff is preparing an analysis of sales year over year to determine what sales may be in the upcoming year based upon the relative seasonal sales cycle that his company experiences. This would be an example of what type of data analysis?

Predictive Analysis

Joey is creating a model based upon past stock trading information. The purpose is to indicate to management the best stock derivative arrangements and when to enter into them. This would be an example of what type of analysis?

Prescriptive

What uses techniques that create models indicating the best decision to make or course of action to take?

Prescriptive analytics.

Which of the following is used to uniquely identify a row (or tuple) in a table?

Primary key

The right to be left alone when you want to be.

Privacy

Nominal Data and Ordinal Data both are types of:

Qualitative Data

What feature of Tableau would you utilize to label the percent of total that a slice of a pie chart makes up (such as 5.5%)?

Quick table calculation

When using diverging colors on a diagram, which of the following is one of the least desirable color schemes when considering the ability for those with color blindness to be able to effectively read / use the visualization?

Red-Green Diverging

Tree maps and heat maps use which of the following to show proportional size of values?

Size and color

Which of the following is referred to as the use of social skills to trick people into revealing access credentials or other valuable information?

Social engineering

The pattern of reading that was originally based upon eye tracking behavior on websites but is applied to visualizations in general when determining the best layout for a dashboard is referred to as:

The F Pattern

What should be your focus when designing your visualization?

The audience

What should be your primary focus when designing your visualization?

The audience

With reference to data granularity, which of the following groups of individuals would typically want to see information at the least granular (i.e. more coarse) level?

The board of directors

Which of the following is a characteristic of a data lake?

The data is stored in raw form until needed for processing or analysis

Artificial Neural Networks are designed after which of the following:

The human brain

Size, color, label and detail are all examples of Tableau features that are found where?

The marks card

A null value means:

The value is unknown or does not exist

Early systems of AI used deterministic hard-coded logic. Which of the following describes why this method of creating AI became tenuous?

The world's store of information kept growing

When Artificial Neural Networks are referred to as black boxes, which of the following is being referred to?

They provide little guidance on the intuitive logic behind their predictions

Information itself has no ethics. Therefore, who is responsible for developing ethical guidelines about how to manage it?

Those who own the information

When creating a histogram, what is the purpose of using the 'create bins' feature within Tableau?

To group together bands of values into buckets for measures that represent continuous data
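
The idea of binning can be sketched in Python. This is only an analogue of what Tableau's bins do, assuming fixed-width bins; the function name and sample values are hypothetical.

```python
from collections import Counter


def bin_counts(values, bin_size):
    """Count how many values fall into each fixed-width bin, keyed by the bin's lower edge."""
    return dict(Counter((value // bin_size) * bin_size for value in values))


# Continuous-style measurements grouped into buckets of width 10.
histogram = bin_counts([3, 7, 12, 18, 25], 10)
```

Here the bin keyed 0 covers values from 0 up to (but not including) 10, the bin keyed 10 covers 10 up to 20, and so on.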

What is the where clause in an SQL statement used for?

To select only those rows in the result relation of the from clause that satisfy a specified predicate.

Which of the following sets of data are used in machine learning to adjust the weights on the neural network?

Training Set

What chart type would be best to show the hierarchical nature of data?

Tree Map

What chart type would be the best to show the hierarchical nature of data (i.e. how sub-components build up to their parent components)?

Tree Map

Which function would be used in Tableau to show a line that represents the relationships between a set of data points that have been plotted?

Trend Lines

A person can act legally but not be acting ethically

True

A request for information from a database is called a query.

True

Which of the following applies to many to many relationships but not to one to many relationships?

You need a third table to create the relationship

A schema diagram is a pictorial depiction of the schema of a database that shows the relations in the database, their attributes, and primary keys and foreign keys.

True

Analytic Innovators are more than 60% more likely than Analytic Practitioners to use analytics for innovations that lead to new products, services, and processes to improve existing ones.

True

Analytical Innovators use data and analytics both to innovate incrementally in existing products, services and processes and to create all new products, services and business models

True

Big data is growing at an exponential rate.

True

Companies that are using analytics to automate processes in the business are gaining benefits through employees having more time to work on higher-value-added tasks.

True

Deep learning is a subset of machine learning.

True

Dumpster diving is a method of obtaining information from users by going through discarded items (e.g. trash)

True

Human-AI symbiosis is effective because it allows for a blend of both analytic and intuitive approaches to decision making

True

In a SQL statement, union is used to join two queries together.

True

In a full outer join, all the records from the right and left tables that meet the criteria of the query will appear. This would include records from each table where there are no related records (tuples) in the other table.

True

In order to perform any actions on a database, a user (or a program such as MySQL Workbench) must first connect to a database.

True

One example of continuous data is distance.

True

One example of continuous data is height.

True

Qualitative data is categorical data.

True

Tableau allows for connections to live data in a database for purposes of having dashboards that can be refreshed periodically at a predetermined frequency.

True

Text in a novel is an example of unstructured data.

True

The select clause of the statement is used to list the attributes desired in the result of a query.

True

The use of the and logical connective is to find tuples that meet two or more criteria.

True

True or False: Organizations may have inconsistent data definitions between their production systems / databases. This may be a reason for the organization to utilize a data warehouse.

True

Using the 'as' statement in the select clause for a query will label the column or attributes header in the results with the specified text. For example: select people.personName as 'Name' would return 'Name' as the column header rather than personName.
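
A small sketch of this behaviour using Python's sqlite3 module, where the result's column headers are exposed through `cursor.description`. The people table contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table people (personName text)")
conn.execute("insert into people values ('Ana')")

# The alias given after 'as' becomes the column header of the result.
cursor = conn.execute("select people.personName as 'Name' from people")
column_headers = [column[0] for column in cursor.description]
rows = cursor.fetchall()
```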

True

With enough training through machine learning, a neural network can learn enough to begin to match the predictive accuracy of a human expert

True

Which of the following describes the veracity characteristics of big data?

Uncertainty and or untrustworthiness of data

Big data is mostly, over 90 percent:

Unstructured data

Donovan is creating a chart that utilizes a map. He wants to have the map show the borders of the different counties within each state. Where would he go to enable this on the map?

Use the Map Styles menu option

Which describes prescriptive analytics?

Uses techniques that create models indicating the best decision to make or course of action to take.

Which method of protecting data is better when considering the value of the data once personally identifiable information has been removed?

Using statistical approaches to convert original data to synthetic

Which of the following IS NOT one of the five common characteristics of quality data? (as described in the text and in class)

Valid

Which of the following describes the speed of data?

Velocity

Which of the following may be indicators of big data?

Velocity, Veracity, Variety, Volume

Which of the following would be a reason to utilize a one-to-one (1-1) relationship?

When you have attributes about tuples (records) for which not every tuple may have information for the attribute. For example, if you were recording information about people and did not record physical characteristics such as height for every person, you may create a 1-1 relationship.

When should you use multiple colors?

When you need to differentiate types of data

Which of the following charts functions well for showing proportions (vs. quantitative data)?

Word Clouds

Which of the following charts is described in the chapter as functioning well for showing proportions (vs. quantitative data)?

Word Clouds

Which of the following would be most likely to contain the most unstructured data?

Your personal music library

Joe is doing an analysis of his investment portfolio. His data contains variables that are changed due to factors outside the data-generating process and are independent of all other variables in the data. Which of the following predictive analytics uses describes the type of prediction he is doing?

active prediction

The options for order by when writing a SQL statement are:

asc, desc

If the following join statement is used to join two tables in a query, which of the following tables would all of the tuples in the relation appear in results? full outer join schema.customers on invoices.customerID = customers.customerID

both customers and invoices

The as clause:

can be used to rename attributes in the results of the query.

Which aggregation function shows the number of records that meet a set of criteria?

count

Joe is in the process of trying to eliminate all duplicate records and correct any records in the database where a relationship between the tuples in two related tables no longer exists. This would be an example of:

data scrubbing/cleansing

The purpose of integrity constraints is to:

ensure that changes made to the database do not result in a loss of data consistency.

Regression models are used to:

estimate the relationships among variables

The three factors of the variety of data are:

form, function, source

The global head of CRM (customer relationship management) was positive towards the changes required with GDPR with regard to getting customer consent to collect their data for marketing because it would:

increase data quality

Governance of the ethical and moral issues arises from the development and use of information technologies as well as the creation, collection, duplication, distribution and processing of information.

Information ethics

Considering the following, which would be the correct inner join clause to use in the query: - The two tables being joined are prescriptions and patients. A patient may have multiple prescriptions. A prescription can only relate to a single patient. - The select clause is selecting the following fields: patients.name,patients.dateOfBirth,prescriptions.rxNumber, prescriptions.medication, prescriptions.dosage - The query contained 'from pharmacy.patients' for the from clause - The primary key of patients is patients.patientID - The primary key of prescriptions is prescriptions.rxNumber

inner join pharmacy.prescriptions on prescriptions.patientID = patients.patientID

Which of the following is the human capacity to analyze alternatives with deep perception, transcending ordinary-level functioning based on simple rational thinking?

intuitive intelligence

If the following join statement is used to join two tables in a query, which of the following tables would all of the tuples in the relation appear in results? left outer join schema.customers on invoices.customerID = customers.customerID

invoices
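
A minimal sketch of this left outer join using Python's sqlite3 module. The sample rows are made up, and the `schema.` prefix from the question is dropped because the sketch uses a single in-memory database. Every invoice appears in the result, even one with no matching customer, while a customer with no invoices does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table customers (customerID integer primary key, customerName text);
    create table invoices  (invoiceID integer primary key, customerID integer);
    insert into customers values (1, 'Acme'), (2, 'Globex');
    -- Invoice 2 has no related customer; customer 2 has no invoices.
    insert into invoices values (1, 1), (2, null);
""")

rows = conn.execute("""
    select invoices.invoiceID, customers.customerName
    from invoices
    left outer join customers on invoices.customerID = customers.customerID
    order by invoices.invoiceID
""").fetchall()
```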

A digital certificate:

is a data file that identifies individuals or organizations online

Which of the following is used in a SQL statement where clause to show all records where a particular attribute has null values?

is null
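
A short sketch using Python's sqlite3 module with a hypothetical people table. It also shows that comparing with `= null` returns no rows, which is why `is null` is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table people (personName text, height integer);
    insert into people values ('Ana', 165), ('Ben', null), ('Cy', null);
""")

# 'is null' finds the tuples where height has no value.
null_rows = [row[0] for row in conn.execute(
    "select personName from people where height is null order by personName"
)]

# '= null' never evaluates to true, so nothing is returned.
equals_rows = conn.execute(
    "select personName from people where height = null"
).fetchall()
```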

When creating a view, the data that is returned from querying the view:

is stored in the tables that the view queries.
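
Because a view stores the query rather than a copy of the data, changes to the underlying table show up the next time the view is queried. A minimal sketch with Python's sqlite3 module; the table, view, and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table employees (name text, income integer);
    insert into employees values ('Ana', 160000), ('Ben', 90000);
    create view highEarners as
        select name from employees where income > 150000;
""")


def high_earners():
    """Query the view; the rows come from the employees table at query time."""
    return [row[0] for row in conn.execute("select name from highEarners order by name")]


before_update = high_earners()
conn.execute("update employees set income = 155000 where name = 'Ben'")
after_update = high_earners()  # the view was not touched, only the table
```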

The integrity constraint that requires that an attribute in a tuple not be blank (i.e. no value) is:

not null

What is needed to train a neural network?

large amounts of data

Details about the data is referred to as:

metadata

The operator like in a SQL statement is used for:

pattern matching
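
As a sketch of pattern matching with like, the following uses Python's sqlite3 module and the pattern `'%or%'`, which matches "or" in any position. The pattern and the city names are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table cities (cityName text);
    insert into cities values ('New York'), ('Portland'), ('Boston'), ('Chicago');
""")

# % stands for any run of characters (including none) before and after 'or'.
matching_cities = [row[0] for row in conn.execute(
    "select cityName from cities where cityName like '%or%' order by cityName"
)]
```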

Which of the following is NOT a type of pattern analysis:

perfunctory analysis

Which of the following is "the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information"?

pseudonymization

Which of the following describes the basic premise of how an Artificial Neural Network works?

receive inputs, process the inputs, provide an output

The concept that a value that appears in one relation for a given set of attributes must appear for a set of attributes in another relation is:

referential integrity

Sharing information with other companies for mutual benefit is an example of:

second-party data

Which of the following SQL statements will provide all the tuples (records) and attributes from the table 'employees' for individuals who are younger than 40 and make more than $150,000?

select * from hr.employees where employees.age < 40 and employees.income > 150000
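
A runnable sketch of this query using Python's sqlite3 module. The `hr.` schema prefix is dropped here because the sketch uses a single in-memory database, and the employee rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table employees (name text, age integer, income integer);
    insert into employees values
        ('Ana', 35, 160000),   -- under 40 and over 150000: returned
        ('Ben', 45, 200000),   -- income qualifies, age does not
        ('Cy',  30,  90000),   -- age qualifies, income does not
        ('Dee', 39, 150000);   -- income is not strictly greater than 150000
""")

# Both conditions joined by 'and' must hold for a tuple to be returned.
rows = conn.execute("""
    select * from employees
    where employees.age < 40 and employees.income > 150000
""").fetchall()
```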

Lauren is querying a data set and the results she keeps getting has a lot of duplicate rows returned. She would like to remove duplicates from the results and only display unique rows of data. What function in SQL would she use in her query?

select distinct
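
A brief sketch of the difference distinct makes, using Python's sqlite3 module and a hypothetical cities table that contains a repeated value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table cities (cityName text);
    insert into cities values ('Tampa'), ('Tampa'), ('Miami');
""")

# Without distinct, the duplicate row is returned twice.
all_rows = [row[0] for row in conn.execute(
    "select cityName from cities order by cityName"
)]

# With distinct, each unique row appears once.
unique_rows = [row[0] for row in conn.execute(
    "select distinct cityName from cities order by cityName"
)]
```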

The three basic clauses of a SQL statement to select data are:

select, from, where

Pattern discovery is:

the process of identifying distinctive relationships between observations in a data set.

A variable in a data set is considered to be exogenously altered if:

the variable changes due to factors outside the data-generating process, such as the analyst making a change to a variable to identify if there is an impact on another variable.

Purchasing data from an organization that collected it is referred to as:

third-party data

Which of the following is characterized as a lack of information about all alternatives or their consequences?

uncertainty

The integrity constraint that requires that no two tuples can have the same value for an attribute is:

unique

When is having used instead of where?

when groups are present through the use of an aggregate function (such as avg, count, etc.) and conditions need to be applied to the groups.
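
A minimal sketch contrasting the two clauses, using Python's sqlite3 module with a hypothetical sales table. The where clause filters individual rows before grouping; the having clause filters the groups afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table sales (region text, amount integer);
    insert into sales values
        ('East', 100), ('East', 300),
        ('West', 50),  ('West', 60),
        ('North', 500);
""")

# where: keep only rows with amount > 55 (drops the West sale of 50).
# having: keep only groups that still contain at least two rows.
rows = conn.execute("""
    select region, avg(amount)
    from sales
    where amount > 55
    group by region
    having count(*) >= 2
    order by region
""").fetchall()
```

Only East survives: West and North each have a single row left after the where clause, so the having condition removes those groups.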

