Intro to Machine Learning
DataFrame
holds the type of data you might think of as a table
count
how many rows have non-missing values
selecting features
provide a list of column names inside square brackets. Ex. melbourne_features = ['Rooms', 'Bathroom', 'Landsize', 'Lattitude', 'Longtitude']
columns property
shows a list of all columns in the dataset
.head()
shows the top few rows (the first five by default)
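A small sketch of the columns property and .head() in pandas. The DataFrame here is a made-up stand-in for the housing data, so the values are illustrative only.

```python
import pandas as pd

# Made-up stand-in for the housing data; the values are illustrative only.
melbourne_data = pd.DataFrame({
    "Rooms": [2, 3, 4, 3, 2, 5],
    "Bathroom": [1, 2, 2, 1, 1, 3],
    "Price": [600000, 850000, 1200000, 780000, 640000, 1500000],
})

# The columns property lists every column name in the DataFrame.
column_names = list(melbourne_data.columns)
print(column_names)

# .head() returns the first five rows by default; pass n for a different count.
first_rows = melbourne_data.head()
print(first_rows)
```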
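min, 25%, 50%, 75%, and max
found by sorting each column from lowest to highest value: min is the smallest value, 25% is the value larger than a quarter of the values, 50% is the median, 75% is the value larger than three quarters of the values, and max is the largest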
std
standard deviation, measures how numerically spread out the values are
mean
the average
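All of the summary statistics above (count, mean, std, min, 25%, 50%, 75%, max) come back from a single call to .describe(). A minimal sketch, assuming a made-up DataFrame with one missing value so that count differs from the number of rows:

```python
import pandas as pd

# Made-up data; None marks a missing value in the Rooms column.
melbourne_data = pd.DataFrame({
    "Rooms": [2, 3, 4, 3, None],
    "Price": [600000, 850000, 1200000, 780000, 640000],
})

# describe() summarizes each numeric column, one statistic per row.
summary = melbourne_data.describe()
print(summary)

# count ignores missing values, so Rooms has 4 while Price has 5.
rooms_count = summary.loc["count", "Rooms"]
price_count = summary.loc["count", "Price"]

# mean is the average and 50% is the median of the non-missing values.
rooms_mean = summary.loc["mean", "Rooms"]
rooms_median = summary.loc["50%", "Rooms"]
```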
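features
the columns inputted into our model and used to make predictions. Ex. X = melbourne_data[melbourne_features] (by convention this data is called X)
selecting prediction target
use dot notation to select the column we want to predict. Ex. y = melbourne_data.Price (by convention the prediction target is called y)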
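Putting feature selection and target selection together might look like the sketch below. The column names come from the notes above; the rows are made up for illustration, and the names X and y follow common convention.

```python
import pandas as pd

# Made-up rows using the column names from the notes.
melbourne_data = pd.DataFrame({
    "Rooms": [2, 3, 4, 3],
    "Bathroom": [1, 2, 2, 1],
    "Landsize": [150.0, 300.0, 520.0, 280.0],
    "Lattitude": [-37.80, -37.81, -37.79, -37.82],
    "Longtitude": [144.99, 145.00, 144.98, 145.01],
    "Price": [600000, 850000, 1200000, 780000],
})

# Features: pass a list of column names inside brackets to get a DataFrame.
melbourne_features = ['Rooms', 'Bathroom', 'Landsize', 'Lattitude', 'Longtitude']
X = melbourne_data[melbourne_features]

# Prediction target: dot notation returns a single column (a Series).
y = melbourne_data.Price

print(X.shape)
print(y.name)
```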
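decision tree
a model that uses the data to split examples into multiple groups based on feature values, then predicts for each group from the examples that fall into it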
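A minimal sketch of fitting a tree-based model, assuming scikit-learn's DecisionTreeRegressor and a made-up DataFrame. The tree splits the rows into groups by feature value and predicts the average target of the group each row lands in.

```python
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

# Made-up data; every row has a distinct (Rooms, Bathroom) combination.
melbourne_data = pd.DataFrame({
    "Rooms": [2, 3, 4, 3, 1, 5],
    "Bathroom": [1, 2, 2, 1, 1, 3],
    "Price": [600000, 850000, 1200000, 780000, 450000, 1500000],
})

X = melbourne_data[["Rooms", "Bathroom"]]  # features
y = melbourne_data.Price                   # prediction target

# random_state fixes the tie-breaking so repeated runs give the same tree.
model = DecisionTreeRegressor(random_state=1)
model.fit(X, y)

# Predicting on the training rows only shows the mechanics;
# it says nothing about accuracy on new data.
predictions = model.predict(X)
print(predictions)
```

With no depth limit and no duplicate feature rows, the tree keeps splitting until each group holds a single training row, so it reproduces the training prices exactly.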