WPC 300 all quizzes for final

When we utilize a visualization on paper/screen, that visualization is limited to exploring:

As many variables as we can coherently communicate in 2 dimensions

A/B testing can help marketers to

Increase likes on their social media sites, increase clicks to their website, and increase sales

Deleting the grid lines in a chart

Increases the data-ink ratio

Which of the following violates the principle of data visualization?

The data-ink ratio should be higher than 1

Which are useful principles for data visualization?

The graph suggests a possible true effect

Which of the following is a Type-I error?

The null hypothesis is actually true, but the hypothesis test incorrectly rejects it.

Which of the following is an example of a sample?

The number of IT employees out of all employees working in an office of Google

Which of the following statements is false with regard to interpretation?

We cannot refine a model after interpreting its results.

Logistic regression is a specialized type of regression analysis that is designed to predict ________ variables.

a binary categorical

Gamblers' fallacy is ____________.

a clustering illusion

Which of the following describes a positively skewed histogram?

a histogram that tails off towards the right

What function can we use to join two or more text strings into one string?

concatenate

In classification problems, the primary source for accuracy estimation of the model is ________.

confusion matrix

The ________ is often used to describe the performance of a classification model applied to a set of test data for which the true outcomes are known.

confusion matrix
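
As a minimal sketch of how a confusion matrix is tallied for a binary classifier, the Python below counts the four outcome types from known true labels and predicted labels. The function names, the spam labels, and the data are made up for illustration and are not from the course material.

```python
def confusion_counts(actual, predicted, positive=1):
    """Tally true/false positives and negatives for a binary classifier."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["TP" if a == positive else "FP"] += 1
        else:
            counts["FN" if a == positive else "TN"] += 1
    return counts


def accuracy(counts):
    """Share of all predictions that were correct."""
    return (counts["TP"] + counts["TN"]) / sum(counts.values())


# Hypothetical test set with known outcomes: 1 = spam, 0 = not spam
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(actual, predicted)
print(counts)
print(accuracy(counts))
```

Accuracy here is the share of correct predictions, (TP + TN) divided by the total number of observations.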

The primary statistical model result should focus on all the answers below except

data cleaning

In an ETL process, data is loaded into a final target database such as:

data warehouse

What function can be used to create a date?

date

Visualizing data is which kind of analytical technique?

descriptive

For a normal distribution, the mean is _______ to the median.

equal

In the experimental design example "IQ Water", students are called _______.

experimental units

An election poll is an example of Machine Learning Applications. (T/F)

false

Compared with observational data, analyses of experimental data are more challenging. (T/F)

false

In order to have a successful A/B test, we should first develop a test plan for what we want to test. (T/F)

false

New product development could not utilize A/B testing to enhance the process. (T/F)

false

The "Unique" keyword will help obtain unique values from returned columns (T/F)

false

We also need to sample the data that will be used as training data in a machine learning algorithm. (T/F)

false

We can adjust or train the input variable and the slope in a machine learning linear regression model. (T/F)

false

We can only have a limited number of variables in a machine learning model. (T/F)

false

You can design an experiment for any scenario. (T/F)

false

Which option below is an example of supervised learning application?

learn to classify spam

In logistic regression analysis, instead of Y as a dependent variable, we use a function of Y called ________.

logit

Visualizations of spatial data are most illustrative when shown using

maps

Standard deviation of a normal data distribution is a _______.

measure of data dispersion

The ________ is the observation that occurs most frequently.

mode

In William Playfair's Line Chart, which two parameters did he chart?

national debt vs time

Which of the following propositions describes an existing theory or belief?

null hypothesis

Odds ratio is defined as ________, where p is the probability of success.

p/(1-p)

William Playfair is credited for inventing which type of chart?

pie chart

When more data is available, machine learning algorithms improve their performance. (T/F)

true

Visual data enables the reader to see trends and dependencies. (T/F)

true

We can use a letter as a delimiter to split a cell into 2 or 3 cells. (T/F)

true

What Range_lookup parameter value should we use if we want to find an exact match?

0

What is the confidence level when the level of significance is 0.07?

0.930

The WPC Sports Company has noted that the size of individual "customer order" is normally distributed with a mean of $100 and standard deviation of $12. If a soccer team of 16 players were to make the next batch of orders, what would be the standard error of the mean?

3.00, since sigma/sqrt(n) = 12/sqrt(16) = 12/4 = 3
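
The arithmetic for this card can be checked with a short Python sketch. The function name `standard_error` is chosen here for illustration; the inputs are the standard deviation of $12 and the sample of 16 orders from the question.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean: population std. dev. over sqrt(sample size)."""
    return sigma / math.sqrt(n)


se = standard_error(sigma=12, n=16)
print(se)  # 3.0
```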

To retrieve information, we will use "Select" and "From" keyword in MySQL (T/F)

true

Over-reliance on the first piece of information is called ____________

Anchoring bias

You are collecting data via an online survey to improve education standards at ASU. Which of the following methods will not result in data collection bias?

Anonymous data collection that hides the ASU brand in the survey questions.

A loan officer wants to know if the next customer is likely to default or not on a loan. How can she assess the risk of extending the loan to that customer?

By utilizing a multiple logistic regression model developed by an in-house analyst

What are the three principles of describing data?

Center, spread and shape

When sample size increases

The width of the confidence interval decreases
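
To see why, here is a rough Python sketch of the half-width (margin of error) of a confidence interval for a mean when the population standard deviation is known. The 95% level, the z value of about 1.96, and the standard deviation of 12 are assumptions picked for illustration.

```python
import math

Z_95 = 1.96  # approximate z critical value for a 95% confidence level


def ci_half_width(sigma, n, z=Z_95):
    """Margin of error for a mean with known population standard deviation."""
    return z * sigma / math.sqrt(n)


# The same spread measured with larger and larger samples
for n in (16, 64, 256):
    print(n, round(ci_half_width(sigma=12, n=n), 2))
```

Because n sits under a square root, quadrupling the sample size only halves the margin of error.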

Which of the following statement(s) about charts is true?

Data ink can sometimes help tell a richer story

What function can be used to calculate the time between two dates? Units can be years, months, or days.

Datedif

What are the four types of data analytical methods?

Descriptive, explanatory, predictive and prescriptive

Which of the following is an example of secondary data?

Firm's proprietary data

Which of the following describes the standard deviation?

It is the square root of the variance.
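
A quick way to confirm this relationship is with Python's standard `statistics` module. The data values below are invented for illustration, and the population versions of variance and standard deviation are used.

```python
import math
import statistics

# Hypothetical data set
data = [2, 4, 4, 4, 5, 5, 7, 9]

variance = statistics.pvariance(data)  # population variance
std_dev = statistics.pstdev(data)      # population standard deviation

print(variance, std_dev)
print(math.isclose(std_dev, math.sqrt(variance)))
```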

In logistic regression, the dependent variable y is defined as:

log(p/(1-p))
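
The two formulas on these cards, the odds p/(1-p) and its logarithm (the logit), can be sketched in Python as follows. The function names are chosen for illustration, and the natural logarithm is assumed, as is usual in logistic regression.

```python
import math


def odds(p):
    """Odds of success for a probability p strictly between 0 and 1."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    return p / (1 - p)


def logit(p):
    """Natural log of the odds; the transformed outcome in logistic regression."""
    return math.log(odds(p))


print(odds(0.5))   # 1.0: success and failure are equally likely
print(logit(0.5))  # 0.0
print(logit(0.8) > 0, logit(0.2) < 0)
```

A probability above 0.5 gives a positive logit and a probability below 0.5 gives a negative one, so the logit maps the interval (0, 1) onto the whole real line.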

If you want to find out if body weight, calorie intake, fat intake and age have an influence on the probability of having a heart attack (yes or no), which of the following kind of analysis will help determine the answer?

Multiple logistic regression

In an agile approach of analytics what is the first step of the process?

Perform business discovery

What best describes the nature of a rose diagram?

Plots data using a circular histogram plot

Predictive analytics may be applied to __________, which is a set of techniques that use descriptive data and forecasts to identify the decisions most likely to result in the best performance.

Prescriptive analytics

Which of the following data analytics models uses optimization techniques?

Prescriptive analytics

What is data visualization?

Process of graphically representing information and data

"Google Doc" is an example of ____________ in a cloud computing environment.

SaaS

The central limit theorem states that if the population is normally distributed, then the

Sampling distribution of the mean will also be normal for any sample size

Which of the following statements is a reason not to use a table for data visualization?

Tables cannot easily show trends

Which of the following is a difference between the t-distribution and the standard normal (z) distribution?

The t-distribution has a larger variance than the standard normal distribution

Which of the following is a continuous random variable?

The time to complete a specific task

In classification analysis, we are determining the probability of an observation ________.

To be part of a certain class or not

In classification analysis, we typically split the data into two mutually exclusive sets, known as ________, to investigate the strength of the developed model.

Training and validation/testing
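
One simple way such a split might be done is sketched below in plain Python. The function name, the 25% test fraction, and the fixed seed are assumptions for illustration; in practice a library routine would normally be used.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows, then cut it into two disjoint sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical data set of 20 observations, identified by row number
rows = list(range(20))
train, test = train_test_split(rows)

print(len(train), len(test))
print(set(train).isdisjoint(test))  # no observation appears in both sets
```

The model is fitted on the training set only, so its performance on the held-out set gives a fairer picture of how it handles unseen data.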

You are creating a database to store temperature and wind data from various airports. Which of the following fields is the most likely candidate to use as the basis for a Primary Key in the Airport Table?

airport code

In order to reject the null hypothesis, the p-value must be less than the

alpha
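
The decision rule on this card can be written as a one-line check. This is only a sketch; the default alpha of 0.05 and the function name are assumptions, not values from the course.

```python
def reject_null(p_value, alpha=0.05):
    """Reject the null hypothesis only when the p-value is below alpha."""
    return p_value < alpha


print(reject_null(0.03))              # True: 0.03 < 0.05
print(reject_null(0.07))              # False: 0.07 is not below 0.05
print(reject_null(0.06, alpha=0.07))  # True under a looser significance level
```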

Which of the following is not a component of a relational database?

analysis

In order for a chart to minimize graphical complexity, the data-ink ratio must be:

close to 1
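
The data-ink ratio is the share of a chart's ink that presents data, so it can never exceed 1. The Python sketch below uses made-up "ink" amounts in arbitrary units to show why removing non-data ink such as grid lines raises the ratio.

```python
def data_ink_ratio(data_ink, total_ink):
    """Share of a chart's ink that is used to present the data itself."""
    if total_ink <= 0 or not 0 <= data_ink <= total_ink:
        raise ValueError("need 0 <= data_ink <= total_ink and total_ink > 0")
    return data_ink / total_ink


# Hypothetical chart: 60 units of data ink, 40 units of non-data ink
with_grid = data_ink_ratio(data_ink=60, total_ink=100)

# Same chart after deleting grid lines worth 20 units of non-data ink
without_grid = data_ink_ratio(data_ink=60, total_ink=80)

print(with_grid, without_grid)
```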

In order for a chart to have graphical integrity, the lie factor must be:

close to 1

The difference between the first and third quartiles is referred to as the

interquartile range
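
A short Python sketch of the interquartile range, using the standard `statistics` module (Python 3.8 or later). The data are invented, and the "inclusive" quartile method is one of several conventions, so other tools may give slightly different quartiles.

```python
import statistics


def interquartile_range(values):
    """Third quartile minus first quartile."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


# Hypothetical observations
data = [1, 2, 3, 4, 5, 6, 7, 8]
iqr = interquartile_range(data)
print(iqr)
```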

What function will return a random number between the numbers you specify?

randbetween

What function can be used to sum numbers in a range that meet supplied criteria?

sumif

When you keep eating the food you don't like precisely because you already bought the food, you are committing _____________.

sunk-cost fallacy

What type of learning do we apply when we try to predict who has a higher chance of surviving on the Titanic?

supervised learning

Which machine learning technique can a bank apply to determine if a loan application will be approved?

supervised learning

When the lie-factor of a graphical chart is more than 1,

the size of the effect shown in the graph is bigger than the actual effect in the data
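
The lie factor is the size of the effect shown in the graphic divided by the size of the effect in the data. In the Python sketch below, each effect is measured as a relative change; the bar heights and data values are hypothetical.

```python
def effect_size(first, second):
    """Relative change from the first value to the second."""
    return abs(second - first) / abs(first)


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Effect shown in the graphic divided by the effect in the data."""
    return effect_size(graphic_first, graphic_second) / effect_size(data_first, data_second)


# Hypothetical chart: sales rise from 100 to 150 (a 50% increase),
# but the bars are drawn 2 cm and 5 cm tall (a 150% increase).
factor = lie_factor(graphic_first=2, graphic_second=5, data_first=100, data_second=150)
print(factor)  # 3.0: the chart overstates the change threefold
```

A value close to 1 means the graphic represents the change in the data faithfully.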

What function can be used to create a time?

time

According to statistical notation, what does ∑ stand for?

to act as a summation operator

What function will remove the extra white space in front of the text?

trim

"Order by" keyword will sort the output in either numerical or alphabetical order. (T/F)

true

A/B testing has been used a lot in marketing promotions. (T/F)

true

The data generation processes for observational studies and experiments are different. (T/F)

true

Experimentation is a way of analytical thinking (T/F)

true

For an even number of observations, the median is the mean of the two middle numbers (T/F)

true
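
For an even number of observations, the rule can be sketched directly in Python and compared with the standard library's `statistics.median`. The function name and data are illustrative.

```python
import statistics


def median_even(values):
    """Median for an even count: the mean of the two middle sorted values."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    return (ordered[mid - 1] + ordered[mid]) / 2


data = [3, 1, 7, 5]  # sorted: 1, 3, 5, 7
print(median_even(data))        # 4.0, the mean of 3 and 5
print(statistics.median(data))  # 4.0
```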

Keyword "where" is used to specify constraints or conditions in SQL. (T/F)

true

Machine learning algorithms can learn better with more data. (T/F)

true

SQL is short for "structured query language". (T/F)

true

To increase conversion rate of your website traffic, A/B testing can be beneficial. (T/F)

true

To retrieve information from a database is called a "query" (T/F)

true
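
The SQL keywords covered in these cards (SELECT, FROM, WHERE, ORDER BY, and DISTINCT rather than "Unique") can be tried out with the sketch below. It uses Python's built-in SQLite as a stand-in for MySQL, and the `orders` table and its rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for a MySQL server
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("Ana", 120.0), ("Ben", 80.0), ("Ana", 95.0), ("Cy", 150.0)],
)

# SELECT ... FROM retrieves columns; WHERE filters rows; ORDER BY sorts them
big_orders = conn.execute(
    "SELECT customer, amount FROM orders WHERE amount > 90 ORDER BY amount"
).fetchall()

# DISTINCT returns each customer name only once
customers = conn.execute(
    "SELECT DISTINCT customer FROM orders ORDER BY customer"
).fetchall()

print(big_orders)
print(customers)
conn.close()
```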

Which machine learning technique can be used to determine where to build cell phone towers?

unsupervised learning

What function can be used to find things in a table or a range by row?

vlookup

