Chapter 7 ISDS 574: k-Nearest Neighbors (k-NN)


What is the idea in k-nearest neighbor methods?

To identify k records in the training dataset that are similar to a new record that we wish to classify. We then use these similar (neighboring) records to classify the new record into a class, assigning the new record to the predominant class among these neighbors.
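
A minimal from-scratch sketch of this idea, assuming two numeric predictors; the records, labels, and the new record below are made up for illustration and do not come from the chapter:

```python
# Minimal sketch of k-NN classification: find the k closest training records,
# then assign the new record to their predominant class (data is hypothetical).
import numpy as np
from collections import Counter

def knn_classify(X_train, y_train, x_new, k=3):
    """Classify x_new by majority vote among its k nearest training records."""
    dists = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))  # Euclidean distances
    nearest = np.argsort(dists)[:k]                         # indices of the k closest records
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]                       # predominant class

X_train = np.array([[60.0, 18.4], [85.5, 16.8], [64.8, 21.6], [51.0, 22.0]])
y_train = np.array(["owner", "owner", "nonowner", "nonowner"])
print(knn_classify(X_train, y_train, np.array([60.0, 20.0]), k=3))
```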

How do we set the cutoff value?

Typically 0.5 (50%). In the example where k = 8 was used to classify the new household, the "Prob for Owner" of 0.625 was obtained because 5 of the 8 neighbors were owners. Using a cutoff of 0.5 leads to a classification of "owner" for the new household.
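
A small sketch of how the cutoff is applied to the neighbor vote; the 5-of-8 owner count mirrors the chapter's example, and the cutoff itself is just a parameter:

```python
# Applying a cutoff to the proportion of "owner" neighbors (k = 8 here,
# matching the chapter's example; the cutoff can be changed as needed).
neighbor_classes = ["owner"] * 5 + ["nonowner"] * 3
prob_owner = neighbor_classes.count("owner") / len(neighbor_classes)  # 0.625
cutoff = 0.5
classification = "owner" if prob_owner >= cutoff else "nonowner"
print(prob_owner, classification)  # 0.625 owner
```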

What distance measurement does k-NN use?

Euclidean distance, which is computationally cheap, is the most popular in k-NN.
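
For two records (x1, ..., xp) and (u1, ..., up), the Euclidean distance is sqrt((x1 - u1)^2 + ... + (xp - up)^2). A short sketch with two made-up records:

```python
# Euclidean distance between a new record and one training record
# (both records are hypothetical).
import numpy as np

x_new = np.array([60.0, 20.0])
x_i = np.array([85.5, 16.8])
d = np.sqrt(np.sum((x_new - x_i) ** 2))  # sqrt of summed squared differences
print(d)
```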

The idea of k-NN can readily be extended to predicting a continuous value (as is our aim with multiple linear regression models). How?

The first step of determining neighbors by computing distances remains unchanged. The second step, where a majority vote of the neighbors is used to determine class, is modified such that we take the average response value of the k-nearest neighbors to determine the prediction.
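
A minimal sketch of this averaging step, with made-up training data and response values:

```python
# k-NN for a numerical response: same distance step, but the prediction is
# the average response of the k nearest neighbors (data here is invented).
import numpy as np

def knn_predict(X_train, y_train, x_new, k=3):
    dists = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]
    return y_train[nearest].mean()  # average instead of majority vote

X_train = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 3.0], [8.0, 8.0]])
y_train = np.array([10.0, 12.0, 11.0, 40.0])
print(knn_predict(X_train, y_train, np.array([2.0, 2.0]), k=3))
```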

k-NN with More Than Two Classes

The k-NN classifier can easily be applied to an outcome with m classes, where m > 2. The "majority rule" means that a new record is classified as a member of the majority class of its k neighbors.
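
The vote itself needs no change when m > 2; a tiny sketch with arbitrary class labels:

```python
# Majority rule with more than two classes: the predominant class among the
# k neighbors wins (labels here are arbitrary).
from collections import Counter

neighbor_classes = ["A", "C", "A", "B", "A"]           # classes of k = 5 neighbors
print(Counter(neighbor_classes).most_common(1)[0][0])  # "A"
```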

Does the k-nearest neighbor algorithm make assumptions about the form of the relationship between class membership (Y) and the predictors X1, X2, ..., Xp?

No. The k-nearest neighbor algorithm is a classification method that does not make such assumptions. It is a nonparametric method because it does not involve estimation of parameters in an assumed functional form, such as the linear form assumed in linear regression.

In some cases we might want to choose a cutoff other than the default 0.5 for the purpose of?

Maximizing accuracy or incorporating misclassification costs.

The main advantage of k-NN methods is?

their simplicity and lack of parametric assumptions.

What is the k-NN classification rule?

1. Find the nearest k neighbors to the record to be classified.
2. Use a majority decision rule to classify the record, where the record is classified as a member of the majority class of the k neighbors.
Example: Riding Mowers (a sketch follows below).
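
A hedged sketch of this rule using scikit-learn's KNeighborsClassifier; the Riding Mowers numbers below (Income, Lot Size) are invented rather than the chapter's table, and in practice the predictors would usually be normalized first:

```python
# k-NN classification rule with scikit-learn (illustrative data only).
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

X = np.array([[60.0, 18.4], [85.5, 16.8], [64.8, 21.6],
              [51.0, 22.0], [75.0, 19.5], [52.8, 20.8]])  # Income, Lot Size
y = np.array(["owner", "owner", "owner", "nonowner", "nonowner", "nonowner"])

knn = KNeighborsClassifier(n_neighbors=3)  # majority vote among 3 neighbors
knn.fit(X, y)
print(knn.predict([[60.0, 20.0]]))         # class assigned to the new household
```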

What are some ways to overcome the time it takes to find the nearest neighbors in a large training set?

1. Reduce dimension using dimension reduction techniques such as principal components analysis.
2. Use sophisticated data structures such as search trees to speed up identification of the nearest neighbor; this approach often settles for an "almost nearest" neighbor to improve speed (see the sketch below).
3. Edit the training data to remove redundant or almost redundant records to speed up the search for the nearest neighbor.
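
A sketch of item 2, using a KD-tree (via scikit-learn's NearestNeighbors) on synthetic data; the data and sizes are arbitrary:

```python
# Using a search tree (KD-tree) to speed up neighbor identification.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X = rng.normal(size=(10000, 5))   # a larger synthetic training set

nn = NearestNeighbors(n_neighbors=8, algorithm="kd_tree").fit(X)
dists, idx = nn.kneighbors(rng.normal(size=(1, 5)))
print(idx)                        # indices of the 8 nearest records
```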

What are two difficulties with the practical exploitation of the power of the k-NN approach?

1. The time to find the nearest neighbors in a large training set can be prohibitive.
2. The number of records required in the training set to qualify as large increases exponentially with the number of predictors p.

How are the values of the predictors for a new record denoted?

They are denoted by (x1, x2, ..., xp).

What error metric does k-NN for a numerical response use?

Rather than the overall error rate used in classification, RMSE (or another prediction error metric) is used in prediction (see Chapter 5).
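
A minimal sketch of computing RMSE on validation predictions; the actual and predicted values below are made up:

```python
# RMSE = square root of the mean squared prediction error.
import numpy as np

actual = np.array([24.0, 30.0, 18.0, 27.0])
predicted = np.array([22.5, 31.0, 20.0, 25.0])
rmse = np.sqrt(np.mean((actual - predicted) ** 2))
print(rmse)
```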

How do we pick the value of k?

Typically, values of k fall in the range of 1-20. Often, an odd number is chosen to avoid ties.

So how is k chosen?

We choose the k that has the best classification performance. We use the training data to classify the records in the validation data, then compute error rates for various choices of k.
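
A hedged sketch of that search, on synthetic data with scikit-learn; the partition sizes are arbitrary and the range 1-20 is just the typical choice mentioned above:

```python
# Choose k by scoring each candidate on a validation partition.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=400, n_features=4, random_state=1)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.4, random_state=1)

scores = {}
for k in range(1, 21):                       # typical range of k
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    scores[k] = knn.score(X_valid, y_valid)  # validation accuracy

best_k = max(scores, key=scores.get)
print(best_k, scores[best_k])
```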

Why does the number of records required in the training set to qualify as large increase exponentially with the number of predictors p?

Because the expected distance to the nearest neighbor goes up dramatically with p. This phenomenon is known as the curse of dimensionality, a fundamental issue pertinent to all classification, prediction, and clustering techniques. This is why we often seek to reduce the number of predictors.
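
A small simulation of this effect, assuming uniform random predictors (the exact numbers are illustrative): with n held fixed, the distance to a point's nearest neighbor grows as p increases.

```python
# With n fixed, the nearest-neighbor distance grows with the number of predictors p.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
for p in (1, 2, 5, 10, 20):
    X = rng.uniform(size=(n, p))
    x_new = rng.uniform(size=p)
    d = np.sqrt(((X - x_new) ** 2).sum(axis=1))
    print(p, round(d.min(), 3))  # distance to the nearest neighbor
```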

Once k is chosen then what?

the algorithm uses it to generate classifications of new records. An example is shown in Figure 7.3, where eight neighbors are used to classify the new household.

How can we reduce the number of predictors?

By selecting subsets of the predictors for our model, or by combining predictors using methods such as principal components analysis, singular value decomposition, and factor analysis.
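
A hedged sketch of the principal components route, on synthetic data; the choice of 3 components is arbitrary:

```python
# Reduce p predictors to a few principal components before running k-NN.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))                 # 12 original predictors

X_std = StandardScaler().fit_transform(X)      # standardize before PCA
X_reduced = PCA(n_components=3).fit_transform(X_std)
print(X_reduced.shape)                         # (200, 3) inputs for k-NN
```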

The advantage of choosing k > 1 is?

Higher values of k provide smoothing that reduces the risk of overfitting due to noise in the training data. If k is too low, we may be fitting to the noise in the data; if k is too high, we will miss out on the method's ability to capture the local structure in the data.

What can the k-nearest neighbor algorithm be used for?

It can be used for classification (of a categorical outcome) or prediction (of a numerical outcome). To classify or predict a new record, the method relies on finding "similar" records in the training data. These "neighbors" are then used to derive a classification or prediction for the new record by voting (for classification) or averaging (for prediction).

