Classification and Prediction - PowerPoint PPT Presentation Transcript

1
Classification and Prediction
  • Fuzzy

2
Fuzzy Set Approaches
  • Fuzzy logic uses truth values between 0.0 and 1.0
    to represent the degree of membership (e.g., via
    a fuzzy membership function graph)
  • Attribute values are converted to fuzzy values
  • e.g., income is mapped into the discrete
    categories low, medium, high, with a fuzzy
    membership value calculated for each
  • For a given new sample, more than one fuzzy value
    may apply
  • Each applicable rule contributes a vote for
    membership in the categories
  • Typically, the truth values for each predicted
    category are summed
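This fuzzification-and-voting step can be sketched as follows; the income thresholds and the triangular membership shape are illustrative assumptions, not values from the slides:

```python
def tri_mf(x, a, b, c):
    """Triangular membership function: 0 outside [a, c], peaking at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical income categories (units, e.g., $1000s; thresholds assumed).
categories = {
    "low":    lambda x: tri_mf(x, -1, 0, 40),
    "medium": lambda x: tri_mf(x, 20, 50, 80),
    "high":   lambda x: tri_mf(x, 60, 100, 201),
}

def fuzzify(income):
    """Return the degree of membership of `income` in each category."""
    return {name: mf(income) for name, mf in categories.items()}

votes = fuzzify(35)   # an income of 35 matches both "low" and "medium"
```

Note how a single value receives nonzero grades in more than one category, which is exactly why the votes are summed per predicted category.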

3
Fuzzy Sets
  • Sets with fuzzy boundaries

[Figure: the set of tall people as a fuzzy set A: a
membership function over heights, assigning grades
such as 0.5 at 5'10" and 0.9 at 6'2"]
4
Membership Functions (MFs)
  • Characteristics of MFs
  • Subjective measures
  • Not probability functions

[Figure: two MFs over heights, μ"tall in Asia" and
μ"tall in the US", assigning different grades (e.g.,
0.8 vs. 0.1) to the same height of 5'10"]
5
Fuzzy Sets
  • Formal definition
  • A fuzzy set A in X is expressed as a set of
    ordered pairs

Membership function (MF)
Universe or universe of discourse
Fuzzy set
A fuzzy set is totally characterized by
a membership function (MF).
6
Fuzzy Sets with Discrete Universes
  • Fuzzy set A = "sensible number of children"
  • X = {0, 1, 2, 3, 4, 5, 6} (discrete universe)
  • A = {(0, .1), (1, .3), (2, .7), (3, 1), (4, .6),
    (5, .2), (6, .1)}
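A discrete fuzzy set like this maps naturally onto a dictionary from universe elements to membership grades:

```python
# The discrete fuzzy set A = "sensible number of children" from the slide,
# as a dict mapping each element of X = {0, ..., 6} to its grade.
A = {0: 0.1, 1: 0.3, 2: 0.7, 3: 1.0, 4: 0.6, 5: 0.2, 6: 0.1}

support = [x for x, mu in A.items() if mu > 0]    # elements with nonzero grade
core = [x for x, mu in A.items() if mu == 1.0]    # fully-member elements
```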

7
Fuzzy Sets with Cont. Universes
  • Fuzzy set B = "about 50 years old"
  • X = set of positive real numbers (continuous)
  • B = {(x, μB(x)) | x ∈ X}
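A common choice of μB for "about 50" is a bell-shaped function; the center 50 and width 10 below are assumed values for illustration:

```python
def mu_B(x, c=50.0, a=10.0):
    """Bell-shaped MF for 'about c years old' with assumed width a:
    grade 1 at x = c, decaying smoothly away from c."""
    return 1.0 / (1.0 + ((x - c) / a) ** 4)
```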

8
Fuzzy Partition
  • Fuzzy partitions formed by the linguistic values
    young, middle-aged, and old

(MATLAB demo: lingmf.m)
9
Set-Theoretic Operations
  • Subset
  • Complement
  • Union
  • Intersection
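These four operations have standard fuzzy (min/max) definitions, sketched here for discrete fuzzy sets represented as dicts:

```python
def f_complement(A):
    """Fuzzy complement: mu(x) -> 1 - mu(x)."""
    return {x: 1.0 - mu for x, mu in A.items()}

def f_union(A, B):
    """Fuzzy union: elementwise max of the two membership grades."""
    return {x: max(A.get(x, 0.0), B.get(x, 0.0)) for x in set(A) | set(B)}

def f_intersection(A, B):
    """Fuzzy intersection: elementwise min of the two membership grades."""
    return {x: min(A.get(x, 0.0), B.get(x, 0.0)) for x in set(A) | set(B)}

def f_subset(A, B):
    """A is a subset of B iff mu_A(x) <= mu_B(x) for every x."""
    return all(mu <= B.get(x, 0.0) for x, mu in A.items())
```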

10
Set-Theoretic Operations
(MATLAB demos: subset.m, fuzsetop.m)
11
MF Formulation
(MATLAB demo: disp_mf.m)
12
Fuzzy If-Then Rules
  • General format
  • If x is A then y is B
  • Examples
  • If pressure is high, then volume is small.
  • If the road is slippery, then driving is
    dangerous.
  • If a tomato is red, then it is ripe.
  • If the speed is high, then apply the brake a
    little.
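A minimal sketch of evaluating one such rule ("If pressure is high, then volume is small") with min-clipping; the ramp-shaped MFs and their thresholds are invented for illustration:

```python
def high_pressure(p):
    """Assumed ramp MF: grade 0 below pressure 50, grade 1 above 100."""
    return min(1.0, max(0.0, (p - 50) / 50))

def small_volume(v):
    """Assumed ramp MF: grade 1 at volume 0, grade 0 above 10."""
    return min(1.0, max(0.0, (10 - v) / 10))

def rule_output(p, v):
    """Degree to which the rule supports 'volume is v' given pressure p:
    the consequent is clipped by the antecedent's firing strength (min)."""
    return min(high_pressure(p), small_volume(v))
```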

(Slides 13-20: no transcript)
21
Classification and Prediction
  • Fuzzy
  • Support Vector Machine

22
Support Vector Machine
  • Search for the optimal separating hyperplane,
    i.e., the one that maximizes the margin

23
Support Vector Machine
  • Training an SVM is equivalent to solving a
    quadratic programming problem
  • Test phase: classify x by the sign of
    f(x) = Σi ai yi K(si, x) + b
  • si : support vectors; yi : class of si
  • K(·) : kernel function; ai, b : parameters
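The test-phase decision function over the support vectors can be sketched as below; the RBF kernel is an assumed example choice:

```python
import math

def rbf(u, v, gamma=1.0):
    """Gaussian (RBF) kernel K(u, v) = exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def svm_decision(x, svs, ys, alphas, b, kernel=rbf):
    """f(x) = sum_i a_i * y_i * K(s_i, x) + b; the predicted class is sign(f)."""
    return sum(a * y * kernel(s, x) for s, y, a in zip(svs, ys, alphas)) + b
```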

24
Support Vector Machine
  • Kernel function:
  • K(x, y) = Φ(x) · Φ(y)
  • x, y are vectors in the input space
  • Φ(x), Φ(y) are vectors in the feature space
  • dim(feature space) >> dim(input space)
  • No need to compute Φ(x) explicitly
  • e.g., a tree kernel Tr(x, y) = sub(x) · sub(y),
    where sub(x) is a vector representing all the
    sub-trees of x
  • www.csie.ntu.edu.tw/~cjlin
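The "no need to compute Φ(x) explicitly" point can be verified numerically for a degree-2 polynomial kernel (an illustrative choice, not from the slides): the kernel evaluated in input space matches the dot product under the explicit feature map, which the kernel trick never constructs.

```python
import math

def k_poly2(x, y):
    """Degree-2 polynomial kernel K(x, y) = (x . y)^2 on 2-D inputs."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2

def phi(x):
    """Explicit feature map for K above: (x1^2, sqrt(2) x1 x2, x2^2)."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x, y = (1.0, 2.0), (3.0, 4.0)
# Same value either way, but k_poly2 never builds the 3-D feature vectors.
assert abs(k_poly2(x, y) - dot(phi(x), phi(y))) < 1e-9
```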

25
Classification and Prediction
  • Fuzzy
  • Support Vector Machine
  • Prediction

26
What Is Prediction?
  • Prediction is similar to classification
  • First, construct a model
  • Second, use the model to predict unknown values
  • The major method for prediction is regression
  • Linear and multiple regression
  • Non-linear regression
  • Prediction is different from classification
  • Classification predicts a categorical class label
  • Prediction models continuous-valued functions

27
Regression Analysis and Log-Linear Models in
Prediction
  • Linear regression: Y = α + β X
  • The two parameters α and β specify the line and
    are estimated from the data at hand,
  • e.g., by applying the least squares criterion to
    the known values Y1, Y2, …, X1, X2, ….
  • Multiple regression: Y = b0 + b1 X1 + b2 X2
  • Many nonlinear functions can be transformed into
    the above.
  • Log-linear models
  • The multi-way table of joint probabilities is
    approximated by a product of lower-order tables.
  • Probability: p(a, b, c, d) = αab βac χad δbcd
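The least-squares estimation of α and β can be sketched in a few lines:

```python
def fit_line(xs, ys):
    """Least-squares estimates of alpha and beta in Y = alpha + beta * X."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n          # sample means
    beta = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))  # covariance / variance
    alpha = my - beta * mx                     # line passes through the means
    return alpha, beta
```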

28
Locally Weighted Regression
  • Construct an explicit approximation to f over a
    local region surrounding the query instance xq
  • Locally weighted linear regression:
  • the target function f is approximated near xq
    using a linear function
  • minimize the squared error, weighted by a
    distance-decreasing kernel K, e.g., via the
    gradient descent training rule
  • In most cases, the target function is
    approximated by a constant, linear, or quadratic
    function.
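For the constant approximation mentioned in the last bullet, minimizing the kernel-weighted squared error reduces to a kernel-weighted average of the targets; the Gaussian weight and the bandwidth tau below are assumed choices:

```python
import math

def lwr_constant(xq, xs, ys, tau=1.0):
    """Locally weighted constant fit at query xq: the weighted mean of ys,
    with the assumed distance-decreasing weight K(d) = exp(-d^2 / (2 tau^2))."""
    ws = [math.exp(-((x - xq) ** 2) / (2 * tau ** 2)) for x in xs]
    return sum(w * y for w, y in zip(ws, ys)) / sum(ws)
```

Points far from xq get exponentially small weight, so the prediction tracks the local behavior of f rather than a single global fit.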

29
Classification and Prediction
  • Fuzzy
  • Support Vector Machine
  • Prediction
  • Classification accuracy

30
Classification Accuracy Estimating Error Rates
  • Partition: training-and-testing
  • use two independent data sets, e.g., training set
    (2/3), test set (1/3)
  • used for data sets with a large number of samples
  • Cross-validation
  • divide the data set into k subsamples
  • use k-1 subsamples as training data and one
    subsample as test data --- k-fold
    cross-validation
  • for data sets of moderate size
  • Bootstrapping (leave-one-out)
  • for small data sets
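The k-fold splitting step can be sketched as:

```python
def k_fold_splits(n, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation
    over n samples: each fold serves as the test set exactly once."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size
```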

31
Boosting and Bagging
  • Boosting increases classification accuracy
  • Applicable to decision trees or Bayesian
    classifiers
  • Learn a series of classifiers, where each
    classifier in the series pays more attention to
    the examples misclassified by its predecessor
  • Boosting requires only linear time and constant
    space

32
Boosting Technique (II) Algorithm
  • Assign every example an equal weight 1/N
  • For t = 1, 2, …, T do
  • Obtain a hypothesis (classifier) h(t) under the
    weights w(t)
  • Calculate the error of h(t) and re-weight the
    examples based on the error
  • Normalize w(t+1) to sum to 1
  • Output a weighted sum of all the hypotheses, with
    each hypothesis weighted according to its
    accuracy on the training set
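One round of this loop, written in the style of AdaBoost.M1 as an assumed concrete instance of the generic algorithm above (err is assumed to lie strictly between 0 and 0.5):

```python
import math

def adaboost_round(weights, predictions, labels):
    """One boosting round: weighted error of h(t), its vote alpha, and the
    re-normalized example weights w(t+1). Misclassified examples gain weight,
    so the next classifier pays more attention to them."""
    err = sum(w for w, p, y in zip(weights, predictions, labels) if p != y)
    alpha = 0.5 * math.log((1 - err) / err)      # hypothesis weight (vote)
    new_w = [w * math.exp(alpha if p != y else -alpha)
             for w, p, y in zip(weights, predictions, labels)]
    z = sum(new_w)
    return alpha, [w / z for w in new_w]         # normalize to sum to 1
```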

33
Is Accuracy Enough to Judge?
  • Sensitivity = t_pos / pos
  • Specificity = t_neg / neg
  • Precision = t_pos / (t_pos + f_pos)
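With the usual four outcome counts (true/false positives and negatives), these measures compute as:

```python
def clf_metrics(t_pos, f_neg, t_neg, f_pos):
    """Sensitivity, specificity, and precision from the outcome counts."""
    pos = t_pos + f_neg          # all actually-positive samples
    neg = t_neg + f_pos          # all actually-negative samples
    sensitivity = t_pos / pos
    specificity = t_neg / neg
    precision = t_pos / (t_pos + f_pos)
    return sensitivity, specificity, precision
```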

34
Classification and Prediction
  • Decision tree
  • Bayesian Classification
  • ANN
  • KNN
  • GA
  • Fuzzy
  • SVM
  • Prediction
  • Some issues

35
Summary
  • Classification is an extensively studied problem
    (mainly in statistics, machine learning, and
    neural networks)
  • Classification is probably one of the most widely
    used data mining techniques, with many
    extensions
  • Scalability is still an important issue for
    database applications; thus, combining
    classification with database techniques should be
    a promising research topic
  • Research directions: classification of
    non-relational data, e.g., text, spatial, and
    multimedia data