11. NAIVE BAYES
Naive Bayes: Cons
1. Assumes independence of features
2. No comprehensible high-level model
3. Expressiveness is somewhat limited
What are the three Sklearn Naive Bayes Classifiers we covered?
1. Gaussian Naive Bayes (GaussianNB)
2. Multinomial Naive Bayes (MultinomialNB)
3. Complement Naive Bayes (ComplementNB)
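A minimal sketch of importing and instantiating the three classifiers (the comments note typical use; the setup is illustrative, not from the source):

```python
from sklearn.naive_bayes import GaussianNB, MultinomialNB, ComplementNB

gnb = GaussianNB()     # continuous features, one Gaussian likelihood per feature
mnb = MultinomialNB()  # discrete count features, e.g. word counts
cnb = ComplementNB()   # multinomial variant suited to imbalanced classes
```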
Naive Bayes: Pros
1. Easy and fast to use, train, classify, and interpret examples
2. Works well on high-dimensional problems; handles multi-class problems
3. Not sensitive to irrelevant features or noise
Define "independence" in Naive Bayes
If two events are independent, then knowing the outcome of one does not change the probability of the other.
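A small numeric check of this definition, using two fair coin flips as an assumed example:

```python
# A = "first flip is heads", B = "second flip is heads" (fair coins).
p_a, p_b = 0.5, 0.5
p_a_and_b = 0.25                 # one of four equally likely outcomes (HH)

# Independence: P(A and B) = P(A) * P(B) ...
assert p_a_and_b == p_a * p_b
# ... equivalently, conditioning on B does not change P(A):
print(p_a_and_b / p_b)           # P(A | B) = 0.5 = P(A)
```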
term: Posterior Probability
A revised probability based on additional information. P(A|C) is the posterior probability of A given that C occurs.
term: Probabilistic Inference
Computes a desired probability from other known probabilities
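A worked example tying these two cards together: using Bayes' theorem to infer a posterior from known probabilities. The numbers (a spam filter seeing the word "free") are invented for illustration:

```python
p_spam = 0.2                 # prior: P(spam)
p_free_given_spam = 0.6      # likelihood: P("free" | spam)
p_free_given_ham = 0.05      # likelihood: P("free" | not spam)

# Total probability of the evidence: P("free")
p_free = p_free_given_spam * p_spam + p_free_given_ham * (1 - p_spam)

# Bayes' theorem: P(spam | "free") = P("free" | spam) * P(spam) / P("free")
posterior = p_free_given_spam * p_spam / p_free
print(round(posterior, 2))   # 0.75: the evidence revises the prior upward
```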
Describe Gaussian Naive Bayes (GaussianNB)
Essentially the standard Naïve Bayes classifier. Feature distributions are assumed to be Gaussian.
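A minimal GaussianNB sketch; the iris dataset and split are illustrative choices, not from the source:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)         # continuous measurements
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = GaussianNB().fit(X_train, y_train)  # fits one Gaussian per feature, per class
print(clf.score(X_test, y_test))          # held-out accuracy
```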
Why is it "Naive" ?
It's called "naive" because it simplifies the real-world scenario by assuming independence among features, an assumption that rarely holds exactly. Despite this simplification, Naive Bayes often performs surprisingly well in text classification and other machine learning tasks, especially on relatively simple, well-separated data.
REVIEW / understanding check:
▪ Should understand Bayesian classification and the use of the independence assumption
▪ Should be able to classify an example using Naïve Bayes
▪ Should understand Laplace smoothing and some of the details of Naïve Bayes (see the smoothing sketch below)
▪ Should be able to list the pros and cons of Naïve Bayes and compare it to other methods
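Laplace smoothing is listed above but not spelled out in this section; a minimal sketch with invented counts, assuming add-one (alpha = 1) smoothing as in sklearn's default:

```python
# Hypothetical counts: the word "prize" never appears among 100 ham words,
# with a vocabulary of 50 distinct words.
count_word, total_words, vocab_size = 0, 100, 50

unsmoothed = count_word / total_words                      # 0.0 would zero out the whole product
smoothed = (count_word + 1) / (total_words + vocab_size)   # add 1 to every count
print(smoothed)                                            # ~0.0067: small but nonzero
```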
Describe Multinomial Naive Bayes (MultinomialNB)
Suitable for discrete features, like word counts. Used for text classification (e.g., spam filtering).
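A small text-classification sketch with MultinomialNB; the corpus and labels are invented:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = ["win a free prize now", "meeting moved to noon",
         "free cash offer", "lunch at noon tomorrow"]
labels = [1, 0, 1, 0]                       # 1 = spam, 0 = ham

vec = CountVectorizer()                     # word-count features
X = vec.fit_transform(texts)
clf = MultinomialNB().fit(X, labels)
print(clf.predict(vec.transform(["free prize offer"])))   # expected: [1] (spam)
```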
Complement Naive Bayes (ComplementNB)
Variant of multinomial Naïve Bayes that uses statistics from the complement of each class. Works well with imbalanced data and text classification.
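A sketch of ComplementNB on a toy imbalanced count matrix (the data is invented); the API mirrors MultinomialNB:

```python
import numpy as np
from sklearn.naive_bayes import ComplementNB

X = np.array([[3, 0], [4, 1], [2, 0], [5, 1], [0, 4]])   # count features
y = np.array([0, 0, 0, 0, 1])                            # class 1 is rare

# Feature statistics come from each class's *complement*, which tends to
# be more stable when one class has very few examples.
clf = ComplementNB().fit(X, y)
print(clf.predict([[1, 3]]))                              # leans toward the rare class: [1]
```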
Naive Bayes Classifier
Naive Bayes is a simple yet powerful classification technique that predicts the probability of an outcome based on prior occurrences of related events, applying Bayes' theorem with a strong assumption of independence between features.
Bayes' Theorem: describes the probability of an event based on prior knowledge of conditions that might be related to the event.
Naive Assumption: the presence or absence of a particular feature in a class is assumed to be unrelated to the presence or absence of any other feature. This is a strong and often unrealistic assumption, but it simplifies the calculation by treating each feature independently.
Classification: in data mining, Naive Bayes calculates the probability that a data point belongs to each class given its features, and selects the class with the highest probability as the prediction (see the worked sketch below).
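A worked sketch of that classification rule with invented probabilities: score each class as prior times the product of per-feature likelihoods (the naive independence assumption), then pick the argmax:

```python
priors = {"spam": 0.2, "ham": 0.8}
likelihoods = {                        # hypothetical P(word | class) values
    "spam": {"free": 0.6, "meeting": 0.05},
    "ham":  {"free": 0.05, "meeting": 0.4},
}
doc = ["free", "meeting"]

scores = {}
for c in priors:
    score = priors[c]
    for word in doc:
        score *= likelihoods[c][word]  # features multiply independently
    scores[c] = score

print(max(scores, key=scores.get), scores)  # ham wins: 0.016 vs 0.006
```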
Conditional Probability
the probability of an event (A), given that another event (B) has already occurred.
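A one-die worked example of this definition, P(A | B) = P(A and B) / P(B):

```python
# One fair die: A = "roll is even", B = "roll is greater than 3" ({4, 5, 6}).
p_b = 3 / 6
p_a_and_b = 2 / 6           # even AND greater than 3: outcomes 4 and 6
print(p_a_and_b / p_b)      # P(A | B) = 2/3
```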
term: Prior Probability
the probability that an event will occur before obtaining additional evidence regarding its occurrence; in other words, the probability assuming no specific information. P(A) is the prior probability of event A occurring.