WPC 300 Final

A manager wishes to predict the annual cost (y) of an automobile based on the number of miles (x) driven. The following model was developed: y = $1500 + 0.36x. If a car is driven 15000 miles in a year, the model predicts the annual cost of the car to be: $6900 $2090 $3850 $7400

$6900
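
Worked check for the card above, as a minimal Python sketch. The function name `predict_annual_cost` is made up for illustration; the intercept and slope come from the model in the question.

```python
def predict_annual_cost(miles: float, intercept: float = 1500.0, slope: float = 0.36) -> float:
    """Evaluate the fitted line y = intercept + slope * x from the card."""
    return intercept + slope * miles


cost = predict_annual_cost(15000)
print(f"${cost:,.0f}")  # $6,900
```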

The value of R-Squared always falls between ________ and ________, inclusive. -1 and +1 -infinity to + infinity 0 and -1 0 and 1

0 and 1

What is the confidence level when the level of significance is 0.07? 7% 0.930 0.970 0.093

0.930

The WPC Sports Company has noted that the size of individual "customer order" is normally distributed with a mean of $100 and standard deviation of $12. If a soccer team of 16 players were to make the next batch of orders, what would be the standard error of the mean? 4.00 3.00 3.46 10.00

3.00, because the standard error of the mean is sigma/sqrt(n) = 12/sqrt(16) = 12/4 = 3
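
The same arithmetic as a short Python sketch. The helper name `standard_error` is an assumption for illustration, not something from the course material.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean: population standard deviation over sqrt(sample size)."""
    return sigma / math.sqrt(n)


print(standard_error(12, 16))  # 3.0
```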

The correlation coefficient between the age of a vehicle and the money spent to repair it is 0.9. Which of the following statements is true? 90% of the repair cost will be explained by the age of the vehicle 81% of money spent on repairs is explained by the age of the vehicle 81% of the variation in the money spent on repairs is explained by the age of the vehicle 90% of the money spent on repair is explained by the age of the vehicle

81% of the variation in the money spent on repairs is explained by the age of the vehicle
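
Why 81%: in simple linear regression the share of variation explained is the square of the correlation coefficient. A minimal sketch, with a hypothetical helper name; the range check also shows why a reported correlation outside [-1, 1] cannot be right.

```python
def coefficient_of_determination(r: float) -> float:
    """Return r squared; a valid correlation coefficient must lie in [-1, 1]."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    return r * r


share = coefficient_of_determination(0.9)
print(f"{share:.0%}")  # 81%
```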

Which of the following statements is true based on the following regression equation? IQ = 4.0 + Reading Label * 5.6 A one-point increase in reading label will increase IQ by 5.6 points. Reading label is not a good predictor of IQ. A one-point increase in IQ will result in a 9.6-point increase in reading label. A one-point increase in IQ will result in a 5.6-point increase in reading label.

A one-point increase in reading label will increase IQ by 5.6 points.

You are creating a database to store temperature and wind data from various airports. Which of the following fields is the most likely candidate to use as the basis for a Primary Key in the Airport Table? Airport Code City Address State

Airport Code

What kinds of bias could show up when collecting data? Framing effect Sampling bias Self-selection bias All of the answer selections are correct

All of the answer selections are correct

Which of the following question(s) can be better answered using data in order to reach an evidence-based conclusion? Who will win the NBA championship? How many students will enroll for an online class in the Spring? What is the purchase pattern(s) of our customers? All of the answer selections are correct.

All of the answer selections are correct.

A/B testing can help marketers to Increase more clicks to their website Increase more likes to their social media sites Increase more sales All of the answers are correct

All of the answers are correct

An ideal machine learning process needs Large volume of data Highly granular data Extremely diverse data All other answers are true.

All other answers are true.

In order to reject the null hypothesis, the p-value must be less than the Alpha Standard deviation Variance Degrees of freedom

Alpha
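
The decision rule on this card, together with the earlier card on confidence level (1 minus the level of significance), as a small Python sketch. Both function names are hypothetical.

```python
def reject_null(p_value: float, alpha: float = 0.05) -> bool:
    """Reject the null hypothesis only when the p-value is below alpha."""
    return p_value < alpha


def confidence_level(alpha: float) -> float:
    """Confidence level is the complement of the level of significance."""
    return 1.0 - alpha


print(reject_null(0.03, alpha=0.05))    # True
print(reject_null(0.20, alpha=0.05))    # False
print(f"{confidence_level(0.07):.3f}")  # 0.930
```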

Which of the following is not a component of the relational database? Relationship among rows in tables Tables Analysis Metadata

Analysis

Over-reliance on the first piece of information is called ____________ Bandwagon effect Zero risk bias Clustering illusion Anchoring bias

Anchoring bias

You are collecting data via an online survey to improve the education standard at ASU. Which of the following methods will not result in data collection bias? Polls are completed only by visitors to the site Those with an interest in your mission are the ones to participate in the survey Some individuals are less likely to participate because they have strong opinions against ASU's education standard. Anonymous data collection by hiding the ASU brand in the survey questions

Anonymous data collection by hiding the ASU brand in the survey questions

You bought a top-of-the-line laptop because your friends were so enthusiastic about theirs. Which kind of bias is in action here? Bandwagon effect Zero-risk bias Overconfidence Endowment effect

Bandwagon effect

Which of the following is a step of agglomerative hierarchical clustering? By joining two clusters farthest away from each other By joining two clusters that are not at a Euclidean distance By joining two clusters that are closest to each other By separating a cluster into two finer groups

By joining two clusters that are closest to each other

A loan officer wants to know if the next customer is likely to default or not on a loan. How can she assess the risk of extending the loan to that customer? By asking the customer if he is planning to default the loan or not By utilizing a simple linear regression model developed by an in-house analyst By asking his colleague if he knows the person By utilizing a multiple logistic regression model developed by an in-house analyst

By utilizing a multiple logistic regression model developed by an in-house analyst

Which of the following is an example of unsupervised machine learning? Artificial neural networks Clustering Decision tree Logistic regression

Clustering

In the data extraction process for an ETL tool, which of the following is not an example of a legitimate data source? Competitors' data Online transaction data Point of sale data Customers' social media data

Competitors' data

When sample size increases Confidence interval remains the same Confidence interval increases Confidence interval decreases Standard deviation of the sample mean increases

Confidence interval decreases

In classification problems, the primary source for accuracy estimation of the model is ________. Confusion matrix Logit Odds ratio Probability of success

Confusion matrix

The ________ is often used to describe the performance of a classification model applied to a set of test data for which the true outcomes are known. Effect summary table Parameter estimates table ANOVA table Confusion matrix

Confusion matrix
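
A minimal sketch of how a confusion matrix yields an accuracy estimate for a binary classifier. The labels below are invented for illustration, and the function names are assumptions rather than any particular library's API.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count (actual, predicted) pairs for a binary 0/1 classifier."""
    counts = Counter(zip(actual, predicted))
    return {
        "TP": counts[(1, 1)],  # actual 1, predicted 1
        "FN": counts[(1, 0)],  # actual 1, predicted 0
        "FP": counts[(0, 1)],  # actual 0, predicted 1
        "TN": counts[(0, 0)],  # actual 0, predicted 0
    }


def accuracy(cm):
    """Share of test cases on the diagonal (correct predictions)."""
    return (cm["TP"] + cm["TN"]) / sum(cm.values())


# Hypothetical test-set outcomes, known in advance, and the model's predictions.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

cm = confusion_matrix(actual, predicted)
print(cm)            # {'TP': 3, 'FN': 1, 'FP': 1, 'TN': 3}
print(accuracy(cm))  # 0.75
```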

Which of the following is not an application of clustering analysis? Market segmentation analysis Web click stream analysis Crime prediction analysis Collaborative filtering analysis

Crime prediction analysis

In an ETL process, data is loaded into a final target database such as: Data warehouse Operational dashboard Public database Social media database

Data warehouse

Which of the following is not a standard practice in the "Data Transformation" process of an ETL tool? Data aggregation Splitting data fields Data extraction from ERP Change of data format

Data extraction from ERP

Which of the following statement(s) about charts is true? Data ink can sometimes help tell a richer story The more data ink, the better A useless chart is called "chart junk" We should make as many grids as possible in a chart

Data ink can sometimes help tell a richer story

Which of the following statements below is false about supervised data analysis? The data is labeled for supervised analysis Data is not labeled for supervised analysis Multiple linear regression analysis is a type of supervised data analysis Logistic regression analysis is a type of supervised data analysis

Data is not labeled for supervised analysis

In the loading phase of an ETL tool, the transformed data gets loaded into an end target, usually the _______. Original Database Data warehouse Online analytical processing Master data management

Data warehouse

Which of the following techniques is a modern update of artificial neural networks? Clustering Decision tree Deep learning Logistic regression

Deep learning

Target is examining their online sales data during the pandemic to understand what happened. Which kind of analytical technique are they using? Predictive analytics Descriptive analytics Forecast analytics Prescriptive analytics

Descriptive analytics

What are the four types of data analytical methods? Critical, analytical, predictive and explanatory Descriptive, analytical, predictive and prescriptive Descriptive, explanatory, predictive and prescriptive Descriptive, logical, predictive and prescriptive

Descriptive, explanatory, predictive and prescriptive

Which of the following is not one of the processes involved in data cleaning? Matching Consolidating Parsing Encrypting

Encrypting

When you buy a new car, you value it more than the price you paid because of: Zero-risk bias Endowment effect bias Sunk cost fallacy None of the answer selections are correct

Endowment effect bias

Which of the following statements is true? Experimentation is a way of analytical thinking Using intuition is a way of analytical thinking Analytical thinking is not based on facts Heuristic thinking is slow

Experimentation is a way of analytical thinking

Which of the following is an example of secondary data? Interview data Simulated data Survey data Firm's proprietary data

Firm's proprietary data

Which of the following is an example of association rule learning? How frequently an item set occurs in a transaction How frequently a cluster can be formed in a given transaction The association between customers and what they purchase How frequently items are purchased in a group of transactions

How frequently an item set occurs in a transaction
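
"How frequently an item set occurs" is usually measured as support: the fraction of transactions that contain every item in the set. A short sketch with invented basket data; the helper name `support` is an assumption.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in the item set."""
    wanted = set(itemset)
    hits = sum(1 for basket in transactions if wanted <= set(basket))
    return hits / len(transactions)


# Hypothetical market baskets, made up for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]

print(support({"bread", "milk"}, transactions))  # 0.5
```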

When you access information from two different tables connected by an identifier key, the SQL keyword you should use is _______. INNER JOIN GROUP BY COUNT ORDER BY

INNER JOIN
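
A small runnable illustration using Python's built-in `sqlite3` module. The `airport` and `reading` tables, their columns, and the sample rows are hypothetical, loosely modeled on the airport weather card earlier in this set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE airport (code TEXT PRIMARY KEY, city TEXT);
    CREATE TABLE reading (airport_code TEXT, temp_f REAL);
    INSERT INTO airport VALUES ('PHX', 'Phoenix'), ('SEA', 'Seattle');
    INSERT INTO reading VALUES ('PHX', 104.0), ('SEA', 61.5);
""")

# INNER JOIN keeps only rows whose key matches in both tables.
rows = conn.execute("""
    SELECT airport.city, reading.temp_f
    FROM airport
    INNER JOIN reading ON reading.airport_code = airport.code
    ORDER BY airport.city
""").fetchall()

print(rows)  # [('Phoenix', 104.0), ('Seattle', 61.5)]
```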

Which of the following statements is not true about artificial neural networks? In the hidden layer of the networks, input data is hidden The input layer in the network receives the data The learning process is similar to our brain The network is modeled after the human brain in which brain cells work in a network

In the hidden layer of the networks, input data is hidden

Deleting the grid lines in a chart Decreases the data-ink ratio Increases the lie-factor Decreases the lie-factor Increases the data-ink ratio

Increases the data-ink ratio

Artificial Intelligence _______ Cannot be used for retail industry Is a broad science of mimicking human abilities Does not depend on machine learning Is a specific subset of machine learning

Is a broad science of mimicking human abilities

AI is not embraced everywhere in every industry because _______. It is not very well understood It is not very well developed It is likely to fail in the future It can be operationally expensive

It can be operationally expensive

Which of the following is true about multi-collinearity? The P-value reduces significantly, leading to rejection of the null hypothesis. It is measured using a measure called variance inflation factor (VIF). The effect of an independent variable on the dependent variable becomes easy to isolate. The regression coefficients become clearer and are easier to interpret.

It is measured using a measure called variance inflation factor (VIF).
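
A sketch of the VIF formula, VIF = 1 / (1 - R^2), where R^2 comes from regressing one predictor on the others. This example assumes only two predictors, in which case R^2 is simply their squared correlation; the data and function names are made up for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors (R^2 equals r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictor values with correlation 0.8.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
vif = vif_two_predictors(x1, x2)
print(f"{vif:.2f}")  # 2.78
```

A VIF of 1 means no collinearity; larger values mean the predictor is increasingly explained by the other predictors.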

Which of the following statements is NOT true about experimental studies to compare two treatments? We can design experiments to minimize any bias in the comparison. Experiments allow us to set up a direct comparison between the treatments of interest. It is not easy to control uncertainties in the comparison. We can design experiments so that the error in the comparison is small.

It is not easy to control uncertainties in the comparison.

Which of the following describes the standard deviation? It is the average of the greatest and least values in the data set. It is the square of variance It is the square root of the variance. It is the difference between the first and third quartiles of a data set.

It is the square root of the variance.
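
A quick check of that relationship with Python's `statistics` module. The data set is invented, and the population versions (`pvariance`, `pstdev`) are used here as an assumption; the same relationship holds for the sample versions.

```python
import math
import statistics

# Hypothetical data set with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

variance = statistics.pvariance(data)  # average squared deviation from the mean
std_dev = statistics.pstdev(data)      # population standard deviation

print(variance, std_dev)
print(math.isclose(std_dev, math.sqrt(variance)))  # True
```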

In developing spam filter algorithms, we need Labeled data of spam emails Unlabeled data of spam emails Unlabeled data of non-spam emails Labeled data of both spam and non-spam emails

Labeled data of both spam and non-spam emails

One of the processes in ETL is Treatment Load Transition Extend

Load

The final stage of an ETL process is: Transform Load Extract Data Analysis

Load

In logistic regression, the dependent variable y is defined as: log(p/(1-p)) log(1/p) log(1/(1-p)) log(1-p)

log(p/(1-p))

In logistic regression analysis, instead of Y as a dependent variable, we use a function of Y called ________. Log of Y Logit Odds ratio Odds

Logit
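
The odds and the logit as a minimal Python sketch, with p as the probability of success. The function names are assumptions for illustration; `inverse_logit` is included only to show that the transformation can be undone to recover a probability.

```python
import math


def odds(p: float) -> float:
    """Odds of success: p / (1 - p)."""
    return p / (1 - p)


def logit(p: float) -> float:
    """Log-odds: the quantity logistic regression models as a linear function."""
    return math.log(odds(p))


def inverse_logit(z: float) -> float:
    """Map a log-odds value back to a probability between 0 and 1."""
    return 1 / (1 + math.exp(-z))


print(round(odds(0.8), 6))                  # 4.0
print(logit(0.5))                           # 0.0
print(round(inverse_logit(logit(0.3)), 6))  # 0.3
```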

In a cluster analysis, the distance between the clusters should be: Zero Even Minimized Maximized

Maximized

Regular consumption of organic food will keep you in a good mood. In this example, the confounder could be people's mood Money organic food work ethics

Money

If you want to find out if body weight, calorie intake, fat intake and age have an influence on the probability of having a heart attack (yes or no), which of the following kind of analysis will help determine the answer? Simple logistic regression Multiple logistic regression Simple linear regression Multiple linear regression

Multiple logistic regression

Which of the following biases cannot be categorized as a cognitive bias? Groupthink Anchoring Bias Sunk cost fallacy None of the answer selections are correct

None of the answer selections are correct

Which of the following is not a drawback of analytical decision making? Delayed action Lack of flexibility Frustration in teams None of the answer selections are correct

None of the answer selections are correct

Which of the following statement(s) about charts is false? None of the other answers are false A chart should have graphical integrity A chart should tell a story A chart should minimize graphical complexity

None of the other answers are false

Which of the following propositions describes an existing theory or belief? Null hypothesis Proportion Standard deviation Alternative hypothesis

Null hypothesis

Which of the following tools help in periodic managerial decision-making? Servers OLTP Database OLAP

OLAP

After factoring out the effect of other variables known to affect SAT, such as socioeconomic status, researchers found that music students had a higher SAT score than non-music students. This is an example of __________. Experimental study Observational Study Comparative study None of the other answers is correct

Observational Study

A person who is convinced he is gaining admission to Harvard by merely applying is suffering from: Gambler's fallacy Overconfidence Zero-risk bias None of the answer selections are correct

Overconfidence

In an agile approach of analytics what is the first step of the process? Perform data discovery Score and deploy Perform business discovery Model data

Perform business discovery

What best describes the nature of a rose diagram? Rarely used for azimuthal data Represents various species of flowers Uses various colors to represent cause of mortality Plots data using a circular histogram plot

Plots data using a circular histogram plot

Which of the following examples is not an application of AI? Monitoring epidemics and diseases and stopping them from spreading Predicting human behavior by reading natural language used Predicting the exam score by scanning the appropriate textbook Optimizing traffic patterns over time

Predicting the exam score by scanning the appropriate textbook

Costco wants to know how to stock their warehouses for a future pandemic and are using current sales data to help them project the needs. Which kind of analytical technique are they using? Predictive analytics Prescriptive analytics Explanatory analytics Descriptive analytics

Predictive analytics

Predictive analytics may be applied to __________, which is a set of techniques that use descriptive data and forecasts to identify the decisions most likely to result in the best performance. Explanatory analytics Prescriptive analytics Descriptive analytics Forecast analytics

Prescriptive analytics

Which of the following data analysis models use optimization techniques? Predictive analytics Prescriptive analytics Diagnostic analytics Descriptive analytics

Prescriptive analytics

Your professor is considering purchasing a self-driving car that can figure out the best route and the optimum safe way to drive there without human intervention. What kind of analytics is the car using to do this? Prescriptive analytics Predictive analytics Descriptive analytics Explanatory analytics

Prescriptive analytics

Which of the following is an important task of a database management system? Provides unauthorized access to data when authentication fails Provides support such as performing maintenance and routine backups. Helps collect data from vendors Helps create rules for data analysis

Provides support such as performing maintenance and routine backups.

_______ ensures that related data exist in parent table before allowing an entry into a child table. Referential integrity Data redundancy SQL Data Integrity

Referential integrity
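
A sketch of referential integrity in action, using Python's built-in `sqlite3` module. The parent table `airport` and child table `reading` are hypothetical. SQLite only enforces foreign keys when `PRAGMA foreign_keys` is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.execute("CREATE TABLE airport (code TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE reading (
        airport_code TEXT NOT NULL REFERENCES airport(code),
        temp_f REAL
    )
""")

conn.execute("INSERT INTO airport VALUES ('PHX')")
conn.execute("INSERT INTO reading VALUES ('PHX', 104.0)")  # parent row exists: accepted

try:
    conn.execute("INSERT INTO reading VALUES ('XXX', 70.0)")  # no parent row 'XXX'
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)  # True
```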

The unexplained variance in the regression analysis is also known as: Predicted variance Total variance Residual variance Regression variance

Residual variance

The SQL code to extract only first_name information for all records of the "Actor" table below is: SELECT * FROM Actor WHERE first_name = "NICK"; SELECT * FROM Actor; SELECT first_name FROM Actor; SELECT Actor FROM first_name;

SELECT first_name FROM Actor;

"Google Doc" is an example of _______ in a cloud computing environment. SaaS IaaS Virtualization PaaS

SaaS

Which of the following categories of data mining would you use for spam filtering of emails? Supervised Unsupervised Both supervised and unsupervised Heuristics

Supervised

Which of the following statements below is true about supervised/unsupervised machine learning? Unsupervised learning requires labeled data for training Supervised learning requires unlabeled data for training Unsupervised learning requires no supervision from humans Supervised learning requires labeled data for training

Supervised learning requires labeled data for training

Which of the following statements is a reason not to use a table for data visualization? Large amount of information can be included in a very small space Tables cannot easily show trends Tables display more information in less space than a chart The table has more precise numbers

Tables cannot easily show trends

In the Target story, why did Target send the teen daughter maternity ads? Target was sending ads to all women in a particular neighborhood Target was using special promotion that targeted all teens in her geographical area Target analytic model confused her with an older woman with a similar name Target analytics model suggested she was pregnant based on her buying patterns

Target analytics model suggested she was pregnant based on her buying patterns

Which of the following is an ETL vendor? MySQL Teradata Tableau JMP

Teradata

Which of the following is true of hierarchical clustering? All clusters must have more than one object in them The data partition does not occur in a single step All clusters must have the same number of data points No single cluster can have all data

The data partition does not occur in a single step

Which of the following violates the principle of data visualization? The lie-factor should be closely equal to 1 Avoid unnecessary chart junk The chart should tell a story The data-ink ratio should be higher than 1

The data-ink ratio should be higher than 1

Which of the following is a definition of distance between two clusters in a complete linkage clustering? The distance between the least distant pair of objects, one from each group The sum of square of the distance between clusters The average of distance between all pairs of objects, where each pair is made up of one object from each group The distance between the most distant pair of objects, one from each group

The distance between the most distant pair of objects, one from each group
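
The linkage definitions from this card, sketched in Python with Euclidean distance. The two clusters are invented for illustration and the function names are assumptions; single and average linkage are shown alongside complete linkage for contrast.

```python
import math
from itertools import product


def pairwise_distances(cluster_a, cluster_b):
    """Euclidean distance for every pair with one point from each cluster."""
    return [math.dist(p, q) for p, q in product(cluster_a, cluster_b)]


def complete_linkage(cluster_a, cluster_b):
    """Distance between the most distant pair, one point from each cluster."""
    return max(pairwise_distances(cluster_a, cluster_b))


def single_linkage(cluster_a, cluster_b):
    """Distance between the least distant pair, one point from each cluster."""
    return min(pairwise_distances(cluster_a, cluster_b))


def average_linkage(cluster_a, cluster_b):
    """Average distance over all cross-cluster pairs."""
    distances = pairwise_distances(cluster_a, cluster_b)
    return sum(distances) / len(distances)


# Hypothetical 2-D clusters.
cluster_a = [(0, 0), (0, 1)]
cluster_b = [(3, 0), (4, 0)]

print(single_linkage(cluster_a, cluster_b))              # 3.0
print(round(complete_linkage(cluster_a, cluster_b), 3))  # 4.123
```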

Which are useful principles for data visualization? It is important to include pointed arrows whenever possible to really draw attention to the eyes Including as many grids as possible is vital for fully specifying the data to be represented The use of a wide range of colors is critical to emphasize distinctions The graph suggests a possible true effect

The graph suggests a possible true effect

Which of the following is a Type-I error? The null hypothesis is actually false, but the test incorrectly fails to reject it. The null hypothesis is actually true, but the hypothesis test incorrectly rejects it. The null hypothesis is actually false, and the test correctly rejects it. The null hypothesis is actually true, and the hypothesis test correctly fails to reject it.

The null hypothesis is actually true, but the hypothesis test incorrectly rejects it.

Which of the following is an example of a sample? The population of Canada The number of IT employees out of all employees working in an office of Google The number of individuals who have a Ford car The number of members in the Democratic party

The number of IT employees out of all employees working in an office of Google

Which of the following is a difference between the t-distribution and the standard normal (z) distribution? The standard normal distribution's confidence levels are wider than those of the t-distribution The t-distribution has a larger variance than the standard normal distribution The standard normal distribution is dependent on parameters like degrees of freedom, while the t-distribution is not. The t-distribution cannot be calculated without a known standard deviation, while the standard normal distribution can be.

The t-distribution has a larger variance than the standard normal distribution

Which of the following is a continuous random variable? The number of new hires in a year The number of bounced checks from a bank The time to complete a specific task The outcomes of rolling two dice

The time to complete a specific task

What would be the null hypothesis for testing a linear regression model with profit as the dependent variable and sales as the independent variable? There is a negative relationship between profit and sales. There is a positive relationship between profit and sales. There is no linear relationship between profit and sales. There is a linear relationship between profit and sales that can be either positive or negative.

There is no linear relationship between profit and sales.

Which of the following assumptions is not true for multiple linear regression? The residuals are normally distributed. The relationship between dependent and independent variables is linear. There will be a multi-collinearity effect. The independent variables are not correlated.

There will be a multi-collinearity effect.

A correlation coefficient between "college entrance exam" grades and scholastic achievement was found to be -1.08. On the basis of this, you would tell the university that: Students who do best on this exam will make the worst students. They should hire a new statistician. The exam is a poor predictor of success. The entrance exam is a good predictor of success.

They should hire a new statistician.

In classification analysis, we are determining the probability of an observation ________. To be one To be part of a certain class or not To be zero To be undefined

To be part of a certain class or not

Which of the following is true about A/B testing? You should test multiple elements of your landing page at a time and compare. To increase conversion rate of your website traffic, A/B testing can be beneficial. You need to attend WPC 300 course to learn about A/B Testing. A neutral result on an A/B testing means you correctly performed the test.

To increase conversion rate of your website traffic, A/B testing can be beneficial.

Which of the following is a false statement? The k-means algorithm is a method for doing partitional clustering In cluster analysis, the objects within clusters should exhibit a high degree of similarity To predict sales from transactional data, one should perform clustering analysis. Reducing SSE (sum of squared error) within a cluster increases cohesion

To predict sales from transactional data, one should perform clustering analysis.

In classification analysis, we typically split the data into two mutually exclusive sets, known as ________, to investigate the strength of the developed model. Testing and validation Binary and numeric Training and validation/testing Training and Binary

Training and validation/testing
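
A minimal sketch of such a split in plain Python. The 70/30 ratio, the fixed seed, and the function name are assumptions for illustration, not a prescribed procedure.

```python
import random


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle the rows, then cut them into two mutually exclusive sets."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


train_rows, test_rows = train_test_split(range(10))
print(len(train_rows), len(test_rows))  # 7 3
```

The model is fit on the training rows only, and its strength is then judged on the held-out rows it has never seen.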

When you are asked to design a database for the airline ticket reservation system, based on an Entity-Relationship Data model, which of the following could be an example of "entity"? Arrival time Destination city Flight Number Traveler

Traveler

Which of the following is a cloud service provider? iCloud VMWare Gmail Dropbox

VMWare

Which of the following is true about k-means clustering? It is a type of hierarchical clustering The cluster analysis will give us an optimum value for k We choose the value for k before doing the clustering analysis A tree diagram is used to illustrate the steps in the clustering analysis

We choose the value for k before doing the clustering analysis
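
A one-dimensional sketch of k-means (Lloyd's algorithm) that makes the point concrete: k is fixed up front by the number of starting centroids the caller supplies. The data, the starting centroids, and the function name are all invented for illustration.

```python
def k_means_1d(values, initial_centroids, iterations=10):
    """Lloyd's algorithm in one dimension; k is len(initial_centroids)."""
    centroids = list(initial_centroids)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to its cluster mean (keep it if the cluster is empty).
        centroids = [
            sum(c) / len(c) if c else old
            for c, old in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical data with two obvious groups; k = 2 is chosen before clustering.
centroids, clusters = k_means_1d([1, 2, 3, 10, 11, 12], initial_centroids=[1, 12])
print(centroids)  # [2.0, 11.0]
print(clusters)   # [[1, 2, 3], [10, 11, 12]]
```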

Logistic regression is a specialized type of regression analysis that is designed to predict ________ variables. independent numeric dependent a binary numeric a binary categorical

a binary categorical

Gambler's fallacy is ____________. a clustering illusion an endowment effect bias framing effect bias a zero-risk bias

a clustering illusion

Which of the following describes a positively skewed histogram? a histogram for which mean and mode values are the same. a histogram that tails off towards the right a histogram with large kurtosis a histogram that has no fluctuation in mass

a histogram that tails off towards the right

A market analyst is developing a regression model to predict monthly household expenditures on groceries as a function of family size, household income, and household neighborhood (urban, suburban, and rural). The "neighborhood" variable in this model is ________. a linear variable a continuous variable an independent variable a dependent variable

an independent variable

In order for a chart to minimize graphical complexity, the data-ink ratio must be: greater than 1 close to 1 close to zero less than 1

close to 1

In order for a chart to have graphical integrity, the lie factor must be: less than 1 close to zero greater than 1 close to 1

close to 1

When two variables are highly positively correlated, the correlation coefficient will be _______. close to 0 close to 1 close to 10 close to -1

close to 1

Which of the following is not a requirement for an ETL architecture? data quality data security data integration data compliance

data quality

Data transformation involves: data splitting and aggregation format changes and encryption duplication and load format changes and load

data splitting and aggregation

The central limit theorem states that even if the population is not normally distributed, the distribution of the sample mean will still be normal when the sample size is large Sampling distribution of the mean will vary from the sample to sample Standard error of the mean will not vary from the population mean Mean of the population can be calculated without using samples

distribution of the sample mean will still be normal when the sample size is large
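
A small simulation sketch of the theorem. The exponential population (mean 1, strongly right-skewed), the sample size of 50, and the 2,000 repetitions are assumptions chosen for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable


def sample_mean(n):
    """Mean of n draws from an exponential population with mean 1."""
    return statistics.fmean(rng.expovariate(1.0) for _ in range(n))


# Distribution of the sample mean across many repeated samples.
means = [sample_mean(50) for _ in range(2000)]
center = statistics.fmean(means)
spread = statistics.stdev(means)

print(round(center, 2))  # close to the population mean, 1
print(round(spread, 2))  # close to sigma / sqrt(n) = 1 / sqrt(50), about 0.14
```

A histogram of `means` would look roughly bell-shaped even though the underlying population is skewed.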

For a normal distribution, the mean is _______ to the median. greater than less than not equal equal

equal

In the experimental design example "IQ Water", students are called _______. measurement units response variable experimental units treatments

experimental units

The difference between the first and third quartiles is referred to as the ____________. interquartile range standard deviation variance midrange

interquartile range

Standard deviation of a normal data distribution is a _______. measure of data dispersion measure of data quality measure of data shape measure of data centrality

measure of data dispersion

The ________ is the observation that occurs most frequently. mean median mode outlier

mode

An experiment is said to be double-blinded if _________ the researchers only observe the variables of interest a placebo is given to some of the subjects the researcher is not aware of the confounding effect neither the subject nor those working with the subject is aware of who is being given which treatment

neither the subject nor those working with the subject is aware of who is being given which treatment

Odds ratio is defined as ________, where p is the probability of success. 1/(p-1) 1/(1-p) p/(p-1) p/(1-p)

p/(1-p)

Extract function in ETL reads data from data mart specified source database data warehouse unknown database

specified source database

A _______________ is a relationship between two variables that appear to have interdependence or association with each other but actually do not. spurious correlation negative correlation positive correlation non-correlation

spurious correlation

When you keep eating the food you don't like precisely because you already bought the food, you are committing _____________. sunk-cost fallacy availability heuristics bias endowment effect bias zero risk bias

sunk-cost fallacy

When the lie-factor of a graphical chart is more than 1, the size of the effect shown in the graph is bigger than the actual effect in the data. the graph understates the true effect. the graph simplifies the true effect. the graph suggests a possible true effect.

the size of the effect shown in the graph is bigger than the actual effect in the data.
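
The lie factor is the size of the effect shown in the graphic divided by the size of the effect in the data, with each effect measured as a relative change. A sketch with invented numbers and hypothetical function names:

```python
def effect_size(first: float, second: float) -> float:
    """Relative change from the first value to the second."""
    return abs(second - first) / abs(first)


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Effect shown in the graphic divided by the effect present in the data."""
    return effect_size(graphic_first, graphic_second) / effect_size(data_first, data_second)


# Hypothetical chart: the data rise from 10 to 15 (a 50% increase),
# but the bars are drawn 2 cm and 6 cm tall (a 200% increase).
print(lie_factor(2, 6, 10, 15))  # 4.0
```

A value near 1 means the graphic is faithful to the data; here the chart overstates the effect fourfold.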

According to statistical notation, what does ∑ stand for? to act as a summation operator to represent population measure to represent the number of items in a population to represent sample statistics

to act as a summation operator

The first step for any kind of A/B testing is to determine how we want to evaluate the performance. to develop a tracking URL. to execute the test according to the plan. to develop a test plan for what you want to test.

to develop a test plan for what you want to test.

A sample study is mostly done to estimate the parameters of the population. to establish causality in a controlled environment to learn how different parameters in the population behave together. to rule out any spurious correlation in the data.

to estimate the parameters of the population.

Which of the following is an example of a measure of dispersion? variance median mode mean

variance

