Naive Bayes Classifiers, an Overview

1
Naive Bayes Classifiers, an Overview
  • By Roozmehr Safi

2
What is Naive Bayes Classifier (NBC)?
  • NBC is a probabilistic classification method.
  • Classification (a.k.a. discrimination, or supervised learning) is assigning new cases to one of the pre-defined classes, given a sample of cases for which the true classes are known.
  • NBC is one of the oldest and simplest
    classification methods.

3
Some NBC Applications
  • Credit scoring
  • Marketing applications
  • Employee selection
  • Image processing
  • Speech recognition
  • Search engines

4
How does NBC Work?
  • NBC applies Bayes' theorem with (naive) independence assumptions.
  • A more descriptive term for it would be
    "independent feature model".

5
How does NBC work? (cont'd)
  • Let X1, …, Xm denote our features (height, weight, foot size), let Y denote the class (1 for men, 2 for women), and let C be the number of classes (2). The problem consists of classifying the case (x1, …, xm) into the class c that maximizes P(Y = c | X1 = x1, …, Xm = xm) over c = 1, …, C. Applying Bayes' rule gives
  • P(Y = c | X1 = x1, …, Xm = xm) = P(X1 = x1, …, Xm = xm | Y = c) P(Y = c) / P(X1 = x1, …, Xm = xm).
  • Under NB's assumption of conditional independence, P(X1 = x1, …, Xm = xm | Y = c) is replaced by the product P(X1 = x1 | Y = c) × … × P(Xm = xm | Y = c),
  • and NB reduces the original problem to finding the class c that maximizes P(Y = c) × P(X1 = x1 | Y = c) × … × P(Xm = xm | Y = c), as sketched in code below.
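
A minimal Python sketch of this reduced decision rule; the names classify, priors, and likelihood are illustrative, and the per-class conditional probabilities P(Xi = xi | Y = c) are assumed to be supplied by the caller:

import math

def classify(x, classes, priors, likelihood):
    """Return the class c maximizing P(Y=c) * prod_i P(Xi=xi | Y=c)."""
    best_c, best_score = None, -math.inf
    for c in classes:
        # Sum log-probabilities rather than multiplying raw probabilities,
        # which avoids numeric underflow when the number of features is large.
        score = math.log(priors[c]) + sum(
            math.log(likelihood(i, xi, c)) for i, xi in enumerate(x)
        )
        if score > best_score:
            best_c, best_score = c, score
    return best_c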

6
An example
  • P(observed height | Male) = a
  • P(observed weight | Male) = b
  • P(observed foot size | Male) = c
  • P(Male | observed case) ∝ P(Male) · a · b · c
  • P(observed height | Female) = d
  • P(observed weight | Female) = e
  • P(observed foot size | Female) = f
  • P(Female | observed case) ∝ P(Female) · d · e · f
  • Pick the class whose score is larger; the common denominator cancels, so only the numerators need to be compared (see the numeric sketch below).
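
Plugging hypothetical numbers into the rule (the probabilities a through f below are made up purely for illustration):

# Hypothetical estimates for one observed case.
p_male   = 0.5 * 0.05 * 0.10 * 0.02   # P(Male)   * a * b * c
p_female = 0.5 * 0.20 * 0.25 * 0.15   # P(Female) * d * e * f
print("Male" if p_male > p_female else "Female")   # prints "Female"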

7
NBC advantages
  • Despite its unrealistic assumption of independence, NBC is remarkably successful even when independence is violated.
  • Due to its simple structure, NBC is appealing when the set of variables is large.
  • NBC requires only a small amount of training data:
  • it only needs to estimate the means and variances of the variables (see the sketch below);
  • there is no need to form a covariance matrix.
  • It is computationally inexpensive.
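
A minimal sketch of that estimation step, assuming NumPy; the helper name fit_gaussian_nb is illustrative:

import numpy as np

def fit_gaussian_nb(X, y):
    """Estimate per-class priors, feature means, and feature variances.

    X: (n_samples, n_features) array; y: (n_samples,) class labels.
    Only means and variances are kept -- no covariance matrix is formed.
    """
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = {
            "prior": len(Xc) / len(X),
            "mean": Xc.mean(axis=0),
            "var": Xc.var(axis=0),
        }
    return params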

8
A Demonstration
  • Data: from an online B2B exchange (1,220 cases).
  • Purpose: to distinguish cheaters from good sellers.
  • Predictors:
  • Member type: enterprise, personal, other.
  • Years since joining: 1 to 10 years.
  • Number of months since the last membership renewal.
  • Membership renewal duration.
  • Type of service bought: standard, limited edition.
  • Whether the member has a registered company.
  • Whether the company page is decorated.
  • Number of days the member logged in during the past 60 days.
  • Industry: production, distribution, investment.
  • Target: to predict whether a seller is likely to cheat buyers, based on data from old sellers.

9
Issues involved: probability distribution
  • With discrete (categorical) features, the probabilities can be estimated using frequency counts.
  • With continuous features, one can assume a particular parametric form for the probability distribution.
  • There is evidence that discretizing the data before applying NB is effective.
  • Equal Frequency Discretization (EFD) divides the sorted values of a continuous variable into k equally populated bins (see the sketch below).
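
A sketch of EFD with NumPy; the helper name equal_frequency_bins is illustrative:

import numpy as np

def equal_frequency_bins(values, k):
    """Map each value to one of k (approximately) equally populated bins."""
    # Interior cut points sit at the 1/k, 2/k, ..., (k-1)/k quantiles.
    edges = np.quantile(values, np.linspace(0, 1, k + 1)[1:-1])
    return np.searchsorted(edges, values, side="right")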

10
Issues involved: zero probabilities
  • When a class and a feature value never occur together in the training set, a problem arises: assigning a probability of zero to one of the terms causes the whole product to evaluate to zero.
  • The zero probability can be replaced by a small constant, such as 0.5/n, where n is the number of observations in the training set (see the sketch below).
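
A sketch of the frequency estimate with that correction applied; the 0.5/n constant comes from the slide, and the function name is illustrative:

def conditional_probability(count, class_count, n):
    """Estimate P(Xi = xi | Y = c) from frequency counts.

    count:       co-occurrences of feature value xi with class c
    class_count: number of training cases in class c
    n:           total number of observations in the training set
    """
    if count == 0:
        return 0.5 / n   # small constant instead of a hard zero
    return count / class_count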

11
Issues involved: missing values
  • In some applications, values are missing not at random and can therefore be meaningful. In that case, missing values are treated as a separate category (see the sketch below).
  • If one does not want to treat missing values as a separate category, they should be handled before applying this macro, either by imputing the missing values or by excluding the cases where they are present.
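
One way to implement the separate-category treatment, sketched with pandas (the column values and the "missing" label are illustrative, and this sketch is independent of the macro mentioned above):

import pandas as pd

# Hypothetical member-type feature with a missing entry.
member_type = pd.Series(["enterprise", None, "personal"])

# Missing values become their own explicit category.
member_type = member_type.fillna("missing").astype("category")
print(member_type.cat.categories)  # ['enterprise', 'missing', 'personal']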

12
  • Thank you