Practice Quiz Module 4 Part I
The main steps of any machine learning project are given below. Put them in the proper order. Look at the big picture. Get the data. Discover and visualize the data to gain insights. Prepare the data for Machine Learning algorithms. Select a model and train it. Fine-tune your model. Present your solution. Launch, monitor, and maintain your system.
Look at the big picture. Get the data. Discover and visualize the data to gain insights. Prepare the data for Machine Learning algorithms. Select a model and train it. Fine-tune your model. Present your solution. Launch, monitor, and maintain your system.
Which of the following performance measures is preferred in regression problems when there are many outliers? Mean squared error Mean absolute error Precision Accuracy
Mean absolute error
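As a minimal sketch of why mean absolute error is less sensitive to outliers than mean squared error, the example below computes both on made-up numbers where one prediction is far off. The function names and the data are illustrative assumptions, not part of the quiz.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute error: each error counts in proportion to its size."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared error: large errors are weighted much more heavily."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up targets and predictions; the last prediction is off by 20.
y_true = [10.0, 12.0, 11.0, 13.0, 12.0]
y_pred = [11.0, 11.0, 12.0, 12.0, 32.0]

mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
rmse = math.sqrt(mse)  # back in the same units as the target

print(f"MAE:  {mae:.2f}")
print(f"MSE:  {mse:.2f}")
print(f"RMSE: {rmse:.2f}")
```

Four of the five errors are 1, yet the single error of 20 dominates the squared measure far more than the absolute one.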
Which of the following are the typical performance measures used in regression problems? Precision Mean absolute error Mean squared error Accuracy
Mean absolute error, Mean squared error
Describe the correlation in the graph shown. No Correlation Strong Negative Weak Negative Strong Positive
No Correlation
What would the correlation be between math grades and the time it takes to run a mile? Cannot Be Determined Positive Correlation Negative Correlation No Correlation
No Correlation
Correlation coefficient = 0 => weak positive weak negative strong negative strong positive No correlation
No correlation
How would you describe this distribution? Right Skewed Normal/Symmetrical Left skewed Flat/Uniform
Normal/Symmetrical
What would the correlation be between study time and test grades? Cannot Be Determined Positive Correlation No Correlation Negative Correlation
Positive Correlation
What is the range of the correlation coefficient? 0 to 1 0 to 100 -100 to 100 -1 to 1
-1 to 1
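The sketch below computes the Pearson correlation coefficient by hand, to illustrate that perfectly linear data reaches the ends of the range, 1 and -1. The helper name `pearson_r` and the data are assumptions made for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up data: one series rises with x, the other falls with x.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]
falling = [10, 8, 6, 4, 2]

r_rising = pearson_r(x, rising)
r_falling = pearson_r(x, falling)

print(r_rising)   # upper end of the range
print(r_falling)  # lower end of the range
```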
Which interval shows the greatest number of pets? 5 - 9 0 - 4 15 - 19 10 - 14
15 - 19
Correlation shows ______________ A relationship Cause and effect Both Neither
A relationship
The correlation coefficient shows: The direction The strength Both Neither
Both
Which of the following objects do you get after reading a CSV file? DataFrame Character Vector Panel All of the mentioned
DataFrame
What type of information does a histogram show? Specific data points Random data Data jumps Frequency of specific data or ranges of data Categorical data
Frequency of specific data or ranges of data
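To make the idea of "frequency of ranges of data" concrete, here is a small sketch that bins made-up values into intervals of width 5 and prints a text histogram. The `histogram` helper and the data are hypothetical and do not reproduce any figure from the quiz.

```python
from collections import Counter


def histogram(values, bin_width):
    """Count how many values fall into each bin, keyed by the bin's lower edge."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Made-up observations (for example, number of pets per household).
values = [1, 3, 6, 7, 8, 12, 15, 16, 17, 19]
bin_width = 5
counts = histogram(values, bin_width)

# One row per interval, one '#' per observation in that interval.
for start, n in counts.items():
    print(f"{start:>2} - {start + bin_width - 1:>2}: {'#' * n}")
```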
Which of the following is the first question to ask when you start working on an ML project? Is it a classification task, a regression task, or something else? Is it supervised, unsupervised, or Reinforcement Learning? How does the company or organization expect to use and benefit from this model? What does the current solution look like (if any)? Should you use batch learning or online learning techniques?
How does the company or organization expect to use and benefit from this model?
Is predicting California house prices a classification task, a regression task, or something else? Classification Other Regression
Regression
Describe the histogram: Normal/Symmetrical Right Skewed Uniform/Flat Left Skewed
Right Skewed
Is predicting California house prices (given that the data has labels) a supervised, unsupervised, or Reinforcement Learning problem? Unsupervised Supervised Reinforcement
Supervised
A percentile indicates the value below which a given percentage of observations in a group of observations fall. For example, if the 25th percentile of housing prices in California is 500k, then 25% of houses in California are priced below 500k.
True
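A quick way to check the definition of a percentile is to compute one and count the share of observations below it. The sketch uses Python's standard `statistics.quantiles` on made-up prices; the data is an assumption for illustration only.

```python
import statistics

# Made-up prices in thousands: 1, 2, ..., 100.
prices = list(range(1, 101))

# n=4 returns the three quartile cut points (25th, 50th, 75th percentiles).
q1, median, q3 = statistics.quantiles(prices, n=4)

# Share of observations that fall below the 25th percentile.
share_below_q1 = sum(p < q1 for p in prices) / len(prices)

print(f"25th percentile: {q1}")
print(f"Share of prices below it: {share_below_q1:.0%}")
```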
After you have gathered the data, you need to create a test set, put it aside, and never look at it until the final evaluation.
True
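One simple way to set a test set aside is a seeded random split, sketched below with the standard library only. The function name `split_train_test`, the 20% ratio, and the seed are assumptions chosen for illustration; other splitting strategies exist.

```python
import random


def split_train_test(rows, test_ratio=0.2, seed=42):
    """Shuffle a copy of the rows with a fixed seed, then cut off the test part."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    test_size = int(len(shuffled) * test_ratio)
    return shuffled[test_size:], shuffled[:test_size]


# Made-up dataset: 100 row identifiers.
rows = list(range(100))
train_set, test_set = split_train_test(rows)

print(len(train_set), "train rows")
print(len(test_set), "test rows")
```

Fixing the seed keeps the same rows in the test set across runs, so the test data does not gradually leak into training.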
It would be beneficial to try out various feature/attribute combinations.
True
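As a sketch of trying an attribute combination, the example below derives a ratio feature from two raw columns and then checks how each column correlates with the target. The column names and all values are made up for illustration and assume pandas is installed.

```python
import pandas as pd

# Made-up housing data; column names are hypothetical.
housing = pd.DataFrame(
    {
        "total_rooms": [880, 7099, 1467, 1274, 1627],
        "households": [126, 1138, 177, 219, 259],
        "median_house_value": [452600, 358500, 352100, 341300, 342200],
    }
)

# Combine two raw attributes into one that may be more informative.
housing["rooms_per_household"] = housing["total_rooms"] / housing["households"]

# Compare how raw and derived attributes correlate with the target.
corr_with_target = housing.corr()["median_house_value"].sort_values(ascending=False)
print(corr_with_target)
```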
The direction of a correlation can be positive or negative.
True
What kind of distribution is this? Right skewed Symmetrical/Normal Left skewed Uniform/Flat
Uniform/Flat
Which of the following pandas methods shows a summary of the numerical attributes? value_counts() info() hist() describe()
describe()
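A minimal sketch of `describe()` on a small made-up DataFrame, assuming pandas is installed. The column names and values are illustrative only.

```python
import pandas as pd

# Made-up numerical data.
df = pd.DataFrame(
    {
        "price": [100.0, 200.0, 300.0, 400.0],
        "rooms": [2, 3, 4, 5],
    }
)

# One column of summary statistics per numerical attribute.
stats = df.describe()
print(stats)
```

The rows of the result are the count, mean, standard deviation, minimum, the 25th, 50th, and 75th percentiles, and the maximum.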
Which is the correct pandas syntax to read in a CSV file and assign it to a DataFrame df? df = read('file.csv', type = 'csv') df = with open('file.csv') as pd.DataFrame df = pd.read_csv('file.csv') df = read_csv('file.csv')
df = pd.read_csv('file.csv')
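The sketch below shows `pd.read_csv` returning a DataFrame. To stay self-contained it reads from an in-memory string rather than a file on disk; the CSV content is made up, and pandas is assumed to be installed.

```python
import io

import pandas as pd

# Made-up CSV content; a path such as 'file.csv' would work the same way.
csv_text = "city,price\nFresno,300\nOakland,800\n"

df = pd.read_csv(io.StringIO(csv_text))

print(type(df))
print(df)
```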
Which method finds relationships between columns in a DataFrame? df.corr() df.relation() df.rel() df.correlation()
df.corr()
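A short sketch of `df.corr()` on made-up numerical columns, assuming pandas is installed. The column names and values are hypothetical; the result is a square matrix of pairwise correlation coefficients.

```python
import pandas as pd

# Made-up data: scores rise with study time, errors fall with it.
df = pd.DataFrame(
    {
        "study_hours": [1, 2, 3, 4, 5],
        "test_score": [52, 60, 71, 80, 88],
        "errors_made": [9, 7, 6, 4, 2],
    }
)

# Pairwise Pearson correlation between all numerical columns.
corr_matrix = df.corr()
print(corr_matrix)
```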
Which of the following pandas methods is useful to get a quick description of the data, in particular the total number of rows, each attribute's type, and the number of non-null values? hist() describe() info() value_counts()
info()
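A minimal sketch of `info()` on a made-up DataFrame with one missing value, assuming pandas is installed. `info()` prints its report rather than returning it, so the example captures the text in a buffer before printing it.

```python
import io

import pandas as pd

# Made-up data with one missing city.
df = pd.DataFrame(
    {
        "city": ["Fresno", "Oakland", None],
        "price": [300.0, 800.0, 500.0],
    }
)

# Write the report to a buffer so it can be inspected as a string.
buffer = io.StringIO()
df.info(buf=buffer)
summary = buffer.getvalue()

print(summary)
```

The report lists the number of entries, and for each column its non-null count and data type.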
Correlation coefficient = - 0.89 => weak positive weak negative strong negative strong positive
strong negative
Correlation coefficient close to -1 => weak positive strong negative strong positive weak negative
strong negative
Correlation coefficient close to 1 => strong positive weak positive weak negative strong negative
strong positive
Correlation coefficient = 0.3 => weak positive weak negative strong negative strong positive
weak positive
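The strength and direction labels used in the questions above can be sketched as a small helper. The 0.7 cutoff between "weak" and "strong" is an assumed rule of thumb, not a standard, and the function name is hypothetical.

```python
def describe_correlation(r, strong_cutoff=0.7):
    """Label a correlation coefficient by strength and direction.

    strong_cutoff is an assumed rule-of-thumb threshold, not a fixed standard.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    if r == 0:
        return "no correlation"
    strength = "strong" if abs(r) >= strong_cutoff else "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"


# Coefficients taken from the questions above.
for r in (-0.89, 0.3, 0):
    print(r, "=>", describe_correlation(r))
```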