Stats final study guide
Which confidence ratio would lead you to recommend a product (the consequent) if another product is purchased (the antecedent)? 0; 3; 0.25; 1
1
Which lift ratio would lead you to recommend a product (the consequent) if another product is purchased (the antecedent)? 1.41; 0.53; 0; 1
1.41
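As a rough sketch of what the two measures above compute: confidence is the share of antecedent purchases that also contain the consequent, and lift divides that by the consequent's baseline share, so a lift above 1 suggests the pairing is more common than chance. The baskets and function names below are invented for illustration and are not from the course material.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


def lift(transactions, antecedent, consequent):
    """Confidence divided by the consequent's baseline support."""
    return confidence(transactions, antecedent, consequent) / support(
        transactions, consequent
    )


# Hypothetical baskets, made up for this example only.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread"},
    {"milk"},
]

conf = confidence(baskets, {"bread"}, {"butter"})  # (2/4) / (3/4)
lft = lift(baskets, {"bread"}, {"butter"})         # conf / (2/4)
print(f"confidence={conf:.3f}, lift={lft:.3f}")
```

Because confidence is a conditional proportion it cannot exceed 1, which is why 3 is not a plausible option in the first question.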
Using complete linkage, which two clusters should be combined next? [Figure: cluster plot] 1 and 2; 1 and 3; 2 and 3; All three
2 and 3
Given the two points in (X, Y, Z) space, (36, 57, 47) and (53, 45, 53), what is the Euclidean distance between them? (Round to 3 decimal places.)
21.656 (accepted: between 21.646 and 21.666)
A study was done on the effect of age on activity. We recorded each individual's age and the number of push-ups and sit-ups they could do in a minute. What is the Euclidean distance between the two observations (Age, Push-ups, Sit-ups): (55, 37, 50) and (47, 49, 44)? (Round to 3 decimal places.)
15.620 (accepted: between 15.61 and 15.63)
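Both distance questions above can be checked with a few lines of Python. This is only a verification sketch; the helper name `euclidean` is our own, and the points are copied from the questions.

```python
import math


def euclidean(a, b):
    """Square root of the summed squared coordinate differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# (X, Y, Z) question: 17^2 + 12^2 + 6^2 = 289 + 144 + 36 = 469
d1 = round(euclidean((36, 57, 47), (53, 45, 53)), 3)

# (Age, Push-ups, Sit-ups) question: 8^2 + 12^2 + 6^2 = 64 + 144 + 36 = 244
d2 = round(euclidean((55, 37, 50), (47, 49, 44)), 3)

print(d1, d2)
```

The second result is about 15.620, which falls inside the accepted range of 15.61 to 15.63.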
Which chart is used to display the relationship between 3 numerical variables? Cluster column chart; Pie chart; Scatter plot; Bubble chart; Bar chart
Bubble chart
Mark all that are true about K-Means Clustering: Matching coefficient can be used to measure similarities in the process; McQuitty's method can be used to combine clusters; Centroids are calculated once each cycle; Each observation is put into one of k clusters; Jaccard's coefficient can be used to measure similarities in the process
Centroids are calculated once each cycle; Each observation is put into one of k clusters
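The two true statements above correspond to the two steps of each k-means cycle. The sketch below is a bare-bones illustration, not a production implementation; the sample points and starting centroids are made up, and real libraries choose the starting centroids more carefully.

```python
import math


def k_means(points, centroids, max_cycles=100):
    """Plain k-means: assign every point, then recompute centroids once per cycle."""
    centroids = [tuple(c) for c in centroids]
    assignments = []
    for _ in range(max_cycles):
        # Assignment step: each observation goes to exactly one of the k clusters.
        assignments = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid is recalculated once per cycle.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(centroids[k])  # empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # nothing moved, so the clustering has converged
        centroids = new_centroids
    return assignments, centroids


# Hypothetical data and starting centroids (k = 2).
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
labels, centers = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(labels, centers)
```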
Which chart is used to display the relationship between 2 categorical variables? Histogram; Bar chart; Line chart; Cluster column chart; Pie chart
Cluster column chart
Which charts are used to display information about categorical variables? (mark all that apply) Cluster column chart; Line chart; Pie chart; Bar chart; Histogram
Cluster column chart; Pie chart; Bar chart
Which would lead you to combining clusters whose furthest neighbors are the closest? McQuitty's Method; Centroid Method; Complete Linkage; Single Linkage
Complete Linkage
Mark all that are true about Hierarchical Clustering: Each observation is put into one of k clusters; Centroids are calculated once each cycle; Complete linkage can be used to combine clusters; Euclidean distance can be used to measure distance in the process; Matching coefficient can be used to measure similarities in the process
Complete linkage can be used to combine clusters; Euclidean distance can be used to measure distance in the process; Matching coefficient can be used to measure similarities in the process
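For binary (0/1) variables, the matching coefficient and Jaccard's coefficient named in the questions above differ only in how they treat 0-0 agreements. The following is a small sketch using made-up vectors; the function names are our own.

```python
def matching_coefficient(a, b):
    """Share of positions where two binary vectors agree (1-1 or 0-0)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def jaccard_coefficient(a, b):
    """Share of 1-1 agreements among positions where at least one vector is 1."""
    both = sum(x == 1 and y == 1 for x, y in zip(a, b))
    either = sum(x == 1 or y == 1 for x, y in zip(a, b))
    return both / either if either else 0.0


# Hypothetical observations: 1 = attribute present, 0 = absent.
u = [1, 0, 1, 0, 0]
v = [1, 1, 0, 0, 0]

m = matching_coefficient(u, v)  # 3 agreements out of 5 positions
j = jaccard_coefficient(u, v)   # 1 shared 1 out of 3 positions with any 1
print(m, j)
```

Jaccard's coefficient is usually preferred when a shared absence (0-0) should not count as similarity.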
Which chart is used to display the distribution of a single numerical variable? Bar chart; Cluster column chart; Line chart; Heat maps; Histogram
Histogram
Which chart is used to display the change in a numerical variable over time? Line chart; Cluster column chart; Pie chart; Histogram; Bubble chart
Line chart
Which charts are used to display information about numerical variables? (mark all that apply) Line chart; Bubble chart; Cluster column chart; Bar chart
Line chart; Bubble chart
Which would lead you to combining clusters only if their combined internal dissimilarity is smaller than that of any other possible combination? Complete Linkage; McQuitty's Method; Centroid Method; Single Linkage
McQuitty's Method
Which chart is used to display the relationship between 2 numerical variables? Line chart; Heat maps; Scatter plot; Histogram; Bar chart
Scatter plot
Which would lead you to combining clusters whose nearest neighbors are the closest? McQuitty's Method; Single Linkage; Ward's Method; Centroid Method
Single Linkage
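The single- and complete-linkage rules from the questions above can be compared side by side. This sketch uses three invented one-dimensional clusters, chosen so that the two rules pick different pairs; the names `next_merge`, `A`, `B`, and `C` are hypothetical.

```python
import math
from itertools import combinations


def single_linkage(c1, c2):
    """Distance between the nearest neighbors of two clusters."""
    return min(math.dist(p, q) for p in c1 for q in c2)


def complete_linkage(c1, c2):
    """Distance between the furthest neighbors of two clusters."""
    return max(math.dist(p, q) for p in c1 for q in c2)


def next_merge(clusters, linkage):
    """Names of the two clusters with the smallest linkage distance."""
    return min(
        combinations(sorted(clusters), 2),
        key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
    )


# Hypothetical clusters of 1-D points.
clusters = {
    "A": [(0.0,), (4.0,)],
    "B": [(5.0,), (6.0,)],
    "C": [(9.0,), (9.5,)],
}

by_single = next_merge(clusters, single_linkage)      # nearest neighbors: A-B = 1.0
by_complete = next_merge(clusters, complete_linkage)  # furthest neighbors: B-C = 4.5
print(by_single, by_complete)
```

Single linkage merges A and B because their closest points are 1.0 apart, while complete linkage merges B and C because their furthest points (4.5 apart) are closer than those of any other pair.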
Which of the following are benefits of Tableau? (Mark all that apply) Tableau produces quick, interactive visualizations; Tableau is a data creation tool; Tableau makes it easy to create new variables; Tableau is great at pre-processing the data; Tableau handles more data than Excel
Tableau produces quick, interactive visualizations; Tableau handles more data than Excel
What is the purpose of data visualization? To show off your Excel skills; To show the data and all of its details; To oversimplify an idea; To summarize a data set or highlight patterns; All of the above
To summarize a data set or highlight patterns
In testing the hypothesis $H_0: \mu = 75$ versus $H_a: \mu \neq 75$ using $\alpha = 5\%$, which of the following p-values would allow you to reject the null hypothesis ($H_0$)? (Choose all that apply) a. 0.0272 b. 0.0432 c. 0.0624 d. 0.0834 e. 0.0858
a. 0.0272 b. 0.0432
Which of the following are NOT benefits of Tableau? (Mark all that apply) a. Tableau makes it easy to create new variables b. Tableau is great at pre-processing the data c. Tableau handles more data than Excel d. Merging different datasets in Tableau is not a problem! e. Tableau does excellent multi-layered calculations
a. Tableau makes it easy to create new variables b. Tableau is great at pre-processing the data e. Tableau does excellent multi-layered calculations
Universities nationwide would like to see if the number of incoming students is different from what it was in the past. Last year, on average, 5,000 new students started at each university. $H_0: \mu = 5000$ versus $H_a: \mu \neq 5000$. Using $\alpha = 5\%$, which of the following p-values would allow you to reject the null hypothesis ($H_0$)? (Choose all that apply) a. -0.0086 b. 0.0201 c. 0.0313 d. 0.0357 e. 0.0811
b. 0.0201 c. 0.0313 d. 0.0357
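The decision rule behind both hypothesis-test questions is to reject $H_0$ when the p-value is below $\alpha$. The sketch below applies that rule to the option lists from the two questions; the function name is our own, and treating the cutoff as strictly less than $\alpha$ is an assumption (some texts use "less than or equal to"). A p-value is a probability, so a negative option such as -0.0086 is discarded as invalid rather than counted as a rejection.

```python
def rejecting_p_values(p_values, alpha=0.05):
    """Return the valid p-values that lead to rejecting H0 at level alpha."""
    return [
        p for p in p_values
        if 0.0 <= p <= 1.0  # a p-value is a probability, so it must lie in [0, 1]
        and p < alpha       # assumed convention: reject when p is strictly below alpha
    ]


# Option lists copied from the two questions above.
first = rejecting_p_values([0.0272, 0.0432, 0.0624, 0.0834, 0.0858])
second = rejecting_p_values([-0.0086, 0.0201, 0.0313, 0.0357, 0.0811])
print(first, second)
```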