WPC 300 QUIZZES TOGETHER
A manager wishes to predict the annual cost (y) of an automobile based on the number of miles (x) driven. The following model was developed: y = $1500 + 0.6x. If a car is driven 15000 miles, the predicted cost of the car is: $7400 $10500 $13850 $6900
$10500
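A minimal sketch of the arithmetic behind this answer, assuming the fitted line from the question (intercept 1500, slope 0.6). The function name `predicted_annual_cost` is made up for illustration.

```python
def predicted_annual_cost(miles, intercept=1500.0, slope=0.6):
    """Evaluate the fitted line y = intercept + slope * x from the question."""
    return intercept + slope * miles


# 15,000 miles at $0.60 per mile, plus the $1,500 fixed cost.
cost = predicted_annual_cost(15000)
print(f"${cost:,.0f}")
```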
The value of R-squared always falls between ___________ and ___________ inclusive. 0 and -1 0 and 1 -infinity to + infinity -1 and +1
0 and 1
What is the confidence level when the level of significance is 0.06? 7% 0.093 0.930 0.940
0.940
The WPC Sports Company has noted that the size of individual "customer order" is normally distributed with a mean of $100 and a standard deviation of $10. If a soccer team of 25 players was to make the next batch of orders, what would be the standard error of the mean? 3.46 2.00 3.00 10.00
2.00 (sigma/sqrt(n) = 10/sqrt(25) = 2)
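The same calculation as a short sketch, using the values from the question (sigma = 10, n = 25). The helper name `standard_error` is an illustrative choice.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean: population standard deviation over sqrt(n)."""
    return sigma / math.sqrt(n)


se = standard_error(10, 25)
print(se)
```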
The correlation coefficient between the age of an auto and the money spent on repairs is 0.8. Which of the following statements is true? 81% of money spent on repairs is explained by the age of an auto 90% of the repair cost will be explained by the age of an auto 81% of the variation in the money spent on repairs is explained by the age of the auto 64% of the variation in the money spent on repairs is explained by the age of the auto
64% of the variation in the money spent on repairs is explained by the age of the auto Feedback: R=0.8, R-squared = 0.64. Remember the definition of R-squared.
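As a quick check, in simple linear regression R-squared is the square of the correlation coefficient. The sketch below uses r = 0.8 from the question; rounding is only there to hide floating-point noise.

```python
r = 0.8  # correlation coefficient given in the question

# Share of variation in the response explained by the predictor.
r_squared = r ** 2
print(f"{round(r_squared * 100)}% of the variation is explained")
```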
The gambler's fallacy is ______________ Framing effect bias A Zero risk bias A clustering illusion An endowment effect bias
A clustering illusion
A market analyst is developing a regression model to predict monthly household expenditures on groceries as a function of family size, household income, and household neighborhood (urban, suburban, and rural). The "household expenditure" variable in this model is ________. A dependent variable A qualitative variable A continuous variable An independent variable
A dependent variable
Which of the following statements is true based on the following regression equation? IQ = 4.0 + Reading Label * 9.6 A unit point change in IQ will result in 5.6 point increase in reading label A unit point change in reading label will increase IQ by 5.6 point A unit point change in reading label will result in 9.6 point increase in IQ level. Reading label is not a good predictor of IQ
A unit point change in reading label will result in 9.6 point increase in IQ level.
In an ETL process, data is loaded into a final target database such as: Operational dashboard Public database Social media database Data warehouse
Data warehouse
In the loading phase of an ETL tool, the transformed data gets loaded into an end target, usually the ____________. Master data management Online analytical processing Data warehouse Data Mart
Data warehouse
Deleting the grid lines in a chart Decreases the lie-factor Increases the data-ink ratio Decreases the data-ink ratio Increases the lie-factor
Increases the data-ink ratio Feedback: Grid lines are non-data ink, so removing them raises the share of ink that shows data.
What are the four types of data analytical methods? Descriptive, logical, predictive and prescriptive Descriptive, explanatory, predictive and prescriptive Critical, analytical, predictive and explanatory Descriptive, analytical, predictive and prescriptive
Descriptive, explanatory, predictive and prescriptive
________ refers to a bias that causes an individual to value an owned object higher than its market value. Bandwagon effect Clustering illusion Endowment effect Anchoring bias
Endowment effect
In an agile approach of analytics, what is the last step of the process? Model data Score and deploy Perform business discovery Evaluate and Improve
Evaluate and Improve
Which of the following statements is true? Experimentation is a way of analytical thinking Heuristic thinking is slow Using intuition is a way of analytical thinking Analytical thinking is not based on facts
Experimentation is a way of analytical thinking
When you access information from two different tables connected by an identifier key, the SQL keyword you should use is ____________. COUNT INNER JOIN ORDER BY GROUP BY
INNER JOIN
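A minimal sketch of an INNER JOIN on a shared key, run through Python's built-in `sqlite3` module. The `customers` and `orders` tables, their columns, and the sample rows are all hypothetical; they are not from the course material.

```python
import sqlite3

# In-memory database with two hypothetical tables linked by customer_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
INSERT INTO orders VALUES (10, 1, 95.0), (11, 1, 110.0), (12, 2, 100.0);
""")

# INNER JOIN keeps only rows whose key appears in both tables,
# so customer 3 (no orders) does not appear in the result.
rows = conn.execute("""
SELECT c.name, o.amount
FROM customers AS c
INNER JOIN orders AS o ON o.customer_id = c.customer_id
ORDER BY o.order_id
""").fetchall()
conn.close()

print(rows)
```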
Which of the following is an example of primary data? Firm's proprietary data Data collected through censuses Interviews data Internet searches
Interviews data
Which of the following is true about multi-collinearity? Is measured using the statistical variance inflation factor (VIF) Regression coefficients become clearer and are easier to interpret The effect of a dependent variable on another becomes difficult to isolate. P-value reduces significantly leading to rejection of the null hypothesis.
Is measured using the statistical variance inflation factor (VIF)
You need to find out if a customer will buy your product or not. Appropriate sample data is available from your current customer base. Which of the following analysis methods will be appropriate for this study? Linear regression Multiple linear regression Logistic regression Clustering
Logistic regression
Which of the following is the true focus of Information Architecture? Deliver information to the client where there is a misunderstanding Make the information hard to find Make only irrelevant information easy to find Making all information easy to find
Making all information easy to find
Visualizations of spatial data are most illustrative when shown using Bubble graphs Maps Line graphs Bar graph
Maps
Kurtosis of a normal data distribution is a ___________________ Measure of data dispersion Measure of data quality Measure of data centrality Measure of data shape
Measure of data shape
Regular consumption of organic food will keep you in a good mood. In this example, the confounder could be people's mood work ethics Money organic food
Money
Which of the following assumptions is not true for simple linear regression? Correlations between the dependent and independent variables Residuals are normally distributed Multicollinearity effect between independent variables Relationship between dependent and independent variable should be linear
Multicollinearity effect between independent variables
The number on a basketball jersey is an example of Ordinal data Interval data Ratio data Nominal data
Nominal data
The average value of nominal data is measured by None of the other answers is true Median Mode Mean
None of the other answers is true
Which of the following propositions describes an existing theory or belief? Standard deviation Null hypothesis Proportion Alternative hypothesis
Null hypothesis
Which of the following tools help in periodic managerial decision-making? OLTP OLAP Database Servers
OLAP
Predictive analytics may be applied to_______________, which is a set of techniques that use descriptive data and forecasts to identify the decisions most likely to result in the best performance. Prescriptive analytics Forecast analytics Explanatory analytics Descriptive analytics
Prescriptive analytics
Which of the following data analytics models uses optimization techniques? Diagnostic analytics Prescriptive analytics Predictive analytics Descriptive analytics
Prescriptive analytics
Which of the following is an important task of a database management system? Helps collect data from vendors Helps create rules for data analysis Provides support such as performing maintenance and routine backups. Provides unauthorized access to data when authentication fails
Provides support such as performing maintenance and routine backups.
The unexplained variance in the regression analysis is also known as: Regression variance Residual variance Total variance Predicted variance
Residual variance
The SQL code to extract only departure time information for all records of the following "Flight" table is: SELECT Departs FROM Flight; SELECT * FROM Flight; SELECT * FROM Flight WHERE To = "LGA (New York City)"; SELECT Flight # FROM Departs;
SELECT Departs FROM Flight;
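The query can be tried against a stand-in table with Python's built-in `sqlite3` module. The quiz's actual "Flight" table is not reproduced here, so the columns `FlightNo` and `Destination` and all sample rows below are assumptions; only the table name `Flight` and the column `Departs` come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: only Flight and Departs are taken from the quiz.
conn.execute("CREATE TABLE Flight (FlightNo TEXT, Destination TEXT, Departs TEXT)")
conn.executemany(
    "INSERT INTO Flight VALUES (?, ?, ?)",
    [
        ("WP100", "LGA (New York City)", "08:15"),
        ("WP200", "PHX (Phoenix)", "11:40"),
        ("WP300", "LGA (New York City)", "17:05"),
    ],
)

# Naming one column returns just that column for every record;
# SELECT * would return all columns instead.
departures = [row[0] for row in conn.execute("SELECT Departs FROM Flight")]
conn.close()

print(departures)
```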
The central limit theorem states that if the population is normally distributed, then the Sampling distribution of the mean will also be normal for any sample size Standard error of the mean will not vary from the population mean Mean of the population can be calculated without using samples Sampling distribution of the mean will vary from the sample to sample
Sampling distribution of the mean will also be normal for any sample size
Which of the following steps is not used in the k-means clustering algorithm? Assign each data point to some cluster. Choose the number of clusters K. Select cluster centers in such a way that they are as close as possible to each other. Calculate the distance between each data point and each cluster center
Select cluster centers in such a way that they are as close as possible to each other.
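The steps that k-means does use (choose K, pick starting centers, compute distances, assign points, update centers) can be sketched on one-dimensional data as follows. This is an illustrative toy, not a production implementation: the function name `kmeans_1d`, the sample points, and the hand-picked starting centers are all assumptions. Note that the starting centers are placed far apart, not close together.

```python
def kmeans_1d(points, centers, max_iter=100):
    """Plain k-means on 1-D data; K is implied by the number of starting centers."""
    centers = list(centers)
    for _ in range(max_iter):
        # Assignment step: compute the distance from each point to each
        # center and put the point in the nearest center's cluster.
        clusters = [[] for _ in centers]
        for p in points:
            distances = [abs(p - c) for c in centers]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its old center).
        new_centers = [
            sum(cluster) / len(cluster) if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
        if new_centers == centers:  # no center moved, so we have converged
            break
        centers = new_centers
    # clusters holds the result of the most recent assignment step.
    return centers, clusters


points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, clusters = kmeans_1d(points, centers=[1.0, 12.0])  # K = 2
print(centers, clusters)
```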
Florence Nightingale's rose diagram tells a story about Interests of national debt during the war Soldier's cause of mortality in the hospital during the war Cause of cholera outbreak Napoleon's invasion to Russia
Soldier's cause of mortality in the hospital during the war
When you keep eating the food you don't like precisely because you already bought the food, you are committing _______________ Endowment effect bias Sunk-cost fallacy Zero risk bias Availability heuristics bias
Sunk-cost fallacy
Which of the following categories of data mining would you use for spam filtering of emails? Supervised Unsupervised Heuristics Both supervised and unsupervised
Supervised
Which of the following statements is a reason not to use a table for data visualization? Large amount of information can be included in a very small space Tables cannot easily show trends The table has more precise numbers Tables display more information in less space than a chart
Tables cannot easily show trends
What are the three principles of describing numeric data? Centrality, dispersion and size Mean, median and mode Dispersion, range and standard deviation center, spread and shape
center, spread and shape
In inferential statistics, a _____________ is used to infer about a ______________. sample, population Population, significance Population, sample Sample, significance
sample, population
You are creating a database to store temperature and wind data from various airports. Which of the following fields is the most likely candidate to use as the basis for a Primary Key in the Airport Table? Airport code Address State City
Airport code
In order to reject the null hypothesis, the p-value must be less than the Standard deviation Variance Degrees of freedom Alpha
Alpha
A researcher wants to find out if a lack of exercise leads to weight gain. Which of the following variables could not be considered as a confounder? Both "Exercise level" and "Weight" Weight Age Exercise level
Both "Exercise level" and "Weight"
Which of the following is the first stage of agglomerative hierarchical clustering? By separating cluster into two finer groups By joining two clusters farthest away from each other By joining two clusters that are not at a Euclidean distance By joining two clusters that are closest to each other
By joining two clusters that are closest to each other
When sample size increases Confidence interval remains the same Confidence interval increases Confidence interval decreases Standard deviation of the sample mean increases
Confidence interval decreases
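A short sketch of why the interval narrows: the width of a z-based interval for the mean is 2 * z * sigma / sqrt(n), so it shrinks as n grows. The values below (sigma = 10, z = 1.96 for roughly 95% confidence, and the three sample sizes) are assumptions chosen for illustration.

```python
import math


def ci_width(sigma, n, z=1.96):
    """Full width of a z-based confidence interval for the mean (known sigma)."""
    return 2 * z * sigma / math.sqrt(n)


# Quadrupling the sample size halves the interval width.
widths = {n: ci_width(10, n) for n in (25, 100, 400)}
for n, width in widths.items():
    print(n, round(width, 2))
```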
Which of the following is not an application of clustering analysis? Web click stream analysis Market segmentation analysis Collaborative filtering analysis Crime prediction analysis
Crime prediction analysis
Which of the following is not a component of a relational database? Relationship among rows in tables CPU of Database Server Metadata Tables
CPU of Database Server
When two variables are highly negatively correlated, the correlation coefficient will be ______________ Close to 0 Close to 1 Close to 10 Close to -1
Close to -1
In the data extraction process for an ETL tool, which of the following is not an example of a legitimate data source? Customers' social media data Competitors' data Online transaction data Point of Sales data
Competitors' data
In the Target story discussed in the lecture, why did Target send the teen daughter maternity ads? Target's analytics model confused her with an older woman with a similar name Target was using a special promotion that targeted all teens in her geographical area Target's analytics model suggested she was pregnant based on her buying habits Target was sending ads to all women in a particular neighborhood
Target's analytics model suggested she was pregnant based on her buying habits
Which of the following basic data visualization principles is violated for the graph shown below? All three principles of data visualization violated. The chart should minimize graphical complexity The chart should tell a story The chart should have graphical integrity
The chart should have graphical integrity Feedback: Keep the axes the same to maintain graphical integrity.
Which of the following is true of hierarchical clustering? The data partition does not occur in a single step All clusters must have more than one object in it No single cluster can have all data All clusters must have the same number of data
The data partition does not occur in a single step
Which of the following violates the principle of data visualization? The chart should tell a story Avoid unnecessary chart junk The lie-factor should be closely equal to 1 The data-ink ratio should be higher than 1
The data-ink ratio should be higher than 1
Which of the following is a definition of distance between two clusters in a single linkage clustering? The average of distance between all pairs of objects, where each pair is made up of one object from each group The distance between the least distant pair of objects, one from each group The distance between the most distant pair of objects, one from each group The sum of square of the distance between clusters
The distance between the least distant pair of objects, one from each group
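The options in this question correspond to different linkage rules. A small sketch on one-dimensional points makes the difference concrete; the two clusters and the function names are made up for illustration, and absolute difference stands in for the distance measure.

```python
def pair_distances(cluster_a, cluster_b):
    """All distances between pairs made of one object from each cluster."""
    return [abs(a - b) for a in cluster_a for b in cluster_b]


def single_linkage(cluster_a, cluster_b):
    # Distance between the least distant (closest) pair.
    return min(pair_distances(cluster_a, cluster_b))


def complete_linkage(cluster_a, cluster_b):
    # Distance between the most distant pair.
    return max(pair_distances(cluster_a, cluster_b))


def average_linkage(cluster_a, cluster_b):
    # Average distance over all cross-cluster pairs.
    distances = pair_distances(cluster_a, cluster_b)
    return sum(distances) / len(distances)


a, b = [1, 2, 3], [7, 9]
print(single_linkage(a, b), complete_linkage(a, b), average_linkage(a, b))
```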
Which are useful principles for data visualization? The graph suggests a possible true effect The use of a wide range of colors is critical to emphasize distinctions Including as many grids as possible is vital for fully specifying the data to be represented It is important to include pointed arrows whenever possible to really draw attention to the eyes
The graph suggests a possible true effect
You arrived at a significant test statistic (p-value < 0.05) when comparing responses from three treatment groups in a one-way ANOVA. How would you interpret the alternative hypothesis for this test? None of the other answers is true The mean response from at least one treatment group is different from that of the others The mean responses from all three treatment groups are different The mean responses from all three treatment groups are the same
The mean response from at least one treatment group is different from that of the others
Which of the following is not a continuous random variable? The outcomes of rolling two dice The pollution level (measured in air quality index) in the air around us The time to complete a specific task The possible amount of rain on a given day
The outcomes of rolling two dice
Which of the following is a Type-II error? The research hypothesis is actually true, but the test correctly rejects the null hypothesis. The research hypothesis is actually true, and the test incorrectly fails to reject the null hypothesis The null hypothesis is actually true, but the hypothesis test incorrectly rejects it. The null hypothesis is actually true, and the hypothesis test correctly fails to reject it.
The research hypothesis is actually true, and the test incorrectly fails to reject the null hypothesis
When the lie-factor of a graphical chart is more than 1, The graph understates the true effect The graph suggests a possible true effect The size of the effect shown in the graph is bigger than the actual effect in the data. The graph simplifies the true effect
The size of the effect shown in the graph is bigger than the actual effect in the data
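The lie factor is usually computed as the size of the effect shown in the graphic divided by the size of the effect in the data, with each effect measured as a relative change. A minimal sketch under that definition follows; the numbers are illustrative (a bar drawn growing from 0.6 to 5.3 inches for data that grows from 18 to 27.5), and the function names are assumptions.

```python
def relative_change(first, second):
    """Effect size as a relative change from the first value to the second."""
    return (second - first) / first


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Effect shown in the graphic divided by the effect present in the data."""
    shown = relative_change(graphic_first, graphic_second)
    actual = relative_change(data_first, data_second)
    return shown / actual


# Illustrative numbers: the drawing grows far faster than the data does.
factor = lie_factor(0.6, 5.3, 18.0, 27.5)
print(round(factor, 1))
```

A value well above 1, as here, means the graphic overstates the change in the data; a value near 1 means the drawing is faithful.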
Which of the following is a difference between the t-distribution and the standard normal (z) distribution? The t-distribution cannot be calculated without a known standard deviation, while the standard normal distribution can be. The t-distribution has a larger variance than the standard normal distribution The standard normal distribution's confidence intervals are wider than those of the t-distribution The standard normal distribution is dependent on parameters like degrees of freedom, while the t-distribution is not.
The t-distribution has a larger variance than the standard normal distribution
A correlation coefficient between the "college entrance exam" grades and "scholastic achievement" was found to be 1.08. On the basis of this, you would tell the university that: The entrance exam is a good predictor of success. Students who do best on this exam will make the worst students. The exam is a poor predictor of success. They should hire a new statistician.
They should hire a new statistician. Feedback: -1 ≤ r ≤ 1, so a correlation of 1.08 is impossible.
Which of the following is true about A/B testing? To increase conversion rate of your website traffic, A/B testing can be beneficial. You should test multiple elements of your landing page at a time and compare You need to attend WPC 300 course to learn about A/B Testing A neutral result on an A/B testing means you correctly performed the test
To increase conversion rate of your website traffic, A/B testing can be beneficial.
Which of the following is not a true statement? Reducing SSE (sum of squared error) within cluster increases cohesion In the cluster analysis, the objects within clusters should exhibit a high amount of similarity The k-means algorithm is a method for doing partitional clustering To predict sales from transactional data one should perform clustering analysis.
To predict sales from transactional data one should perform clustering analysis.
When you are asked to design a database for an airline ticket reservation system, based on an Entity Relationship Data model, which of the following could be an example of an "entity"? Flight Number Arrival time Destination city Traveler
Traveler
In the experimental design (discussed in the lecture videos) to test the efficacy of "IQ Water", the "IQ water" is called Response variable Treatments Experimental units Measurement units
Treatments
In confirmatory visualization Users expect to see a certain pattern in the data Users typically look for anomaly in the data Users don't know what they are looking for Users confirm the quality of data visualization
Users expect to see a certain pattern in the data
Which of the following is a cloud service provider? Dropbox VMWare Gmail iCloud
VMWare
Which of the following is not a traditional data architectural process? Conceptual Logical Physical Visual
Visual
Do a quick google search and find out about other data visualization tools. Which of the following is not a data visualization tool? Domo Weka Sisense Qlik
Weka