Classification and Prediction - PowerPoint PPT Presentation Transcript

1
Classification and Prediction
  • Fuzzy

2
Fuzzy Set Approaches
  • Fuzzy logic uses truth values between 0.0 and 1.0
    to represent the degree of membership (e.g., via
    a fuzzy membership function graph)
  • Attribute values are converted to fuzzy values
  • e.g., income is mapped into the discrete
    categories low, medium, high, with a fuzzy
    membership value calculated for each
  • For a given new sample, more than one fuzzy value
    may apply
  • Each applicable rule contributes a vote for
    membership in the categories
  • Typically, the truth values for each predicted
    category are summed
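This fuzzification-and-voting step can be sketched as follows; the income thresholds and the triangular membership shape are illustrative assumptions, not values from the slides:

```python
def tri_mf(x, a, b, c):
    """Triangular membership function: 0 outside [a, c], peaking at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical income categories (units, e.g., $1000s; thresholds assumed).
categories = {
    "low":    lambda x: tri_mf(x, -1, 0, 40),
    "medium": lambda x: tri_mf(x, 20, 50, 80),
    "high":   lambda x: tri_mf(x, 60, 100, 201),
}

def fuzzify(income):
    """Return the degree of membership of `income` in each category."""
    return {name: mf(income) for name, mf in categories.items()}

votes = fuzzify(35)   # an income of 35 matches both "low" and "medium"
```

Note how a single value receives nonzero grades in more than one category, which is exactly why the votes are summed per predicted category.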

3
Fuzzy Sets
  • Sets with fuzzy boundaries

[Figure: the set of tall people as a fuzzy set A: a
membership function over heights, assigning grades
such as 0.5 at 5'10" and 0.9 at 6'2"]
4
Membership Functions (MFs)
  • Characteristics of MFs
  • Subjective measures
  • Not probability functions

[Figure: two MFs over heights, μ"tall in Asia" and
μ"tall in the US", assigning different grades (e.g.,
0.8 vs. 0.1) to the same height of 5'10"]
5
Fuzzy Sets
  • Formal definition
  • A fuzzy set A in X is expressed as a set of
    ordered pairs

Membership function (MF)
Universe or universe of discourse
Fuzzy set
A fuzzy set is totally characterized by
a membership function (MF).
6
Fuzzy Sets with Discrete Universes
  • Fuzzy set A = "sensible number of children"
  • X = {0, 1, 2, 3, 4, 5, 6} (discrete universe)
  • A = {(0, .1), (1, .3), (2, .7), (3, 1), (4, .6),
    (5, .2), (6, .1)}
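A discrete fuzzy set like this maps naturally onto a dictionary from universe elements to membership grades:

```python
# The discrete fuzzy set A = "sensible number of children" from the slide,
# as a dict mapping each element of X = {0, ..., 6} to its grade.
A = {0: 0.1, 1: 0.3, 2: 0.7, 3: 1.0, 4: 0.6, 5: 0.2, 6: 0.1}

support = [x for x, mu in A.items() if mu > 0]    # elements with nonzero grade
core = [x for x, mu in A.items() if mu == 1.0]    # fully-member elements
```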

7
Fuzzy Sets with Cont. Universes
  • Fuzzy set B = "about 50 years old"
  • X = set of positive real numbers (continuous)
  • B = {(x, μB(x)) | x ∈ X}
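A common choice of μB for "about 50" is a bell-shaped function; the center 50 and width 10 below are assumed values for illustration:

```python
def mu_B(x, c=50.0, a=10.0):
    """Bell-shaped MF for 'about c years old' with assumed width a:
    grade 1 at x = c, decaying smoothly away from c."""
    return 1.0 / (1.0 + ((x - c) / a) ** 4)
```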

8
Fuzzy Partition
  • Fuzzy partitions formed by the linguistic values
    young, middle-aged, and old

(MATLAB demo: lingmf.m)
9
Set-Theoretic Operations
  • Subset
  • Complement
  • Union
  • Intersection
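These four operations have standard fuzzy (min/max) definitions, sketched here for discrete fuzzy sets represented as dicts:

```python
def f_complement(A):
    """Fuzzy complement: mu(x) -> 1 - mu(x)."""
    return {x: 1.0 - mu for x, mu in A.items()}

def f_union(A, B):
    """Fuzzy union: elementwise max of the two membership grades."""
    return {x: max(A.get(x, 0.0), B.get(x, 0.0)) for x in set(A) | set(B)}

def f_intersection(A, B):
    """Fuzzy intersection: elementwise min of the two membership grades."""
    return {x: min(A.get(x, 0.0), B.get(x, 0.0)) for x in set(A) | set(B)}

def f_subset(A, B):
    """A is a subset of B iff mu_A(x) <= mu_B(x) for every x."""
    return all(mu <= B.get(x, 0.0) for x, mu in A.items())
```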

10
Set-Theoretic Operations
(MATLAB demos: subset.m, fuzsetop.m)
11
MF Formulation
(MATLAB demo: disp_mf.m)
12
Fuzzy If-Then Rules
  • General format
  • If x is A then y is B
  • Examples
  • If pressure is high, then volume is small.
  • If the road is slippery, then driving is
    dangerous.
  • If a tomato is red, then it is ripe.
  • If the speed is high, then apply the brake a
    little.
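A minimal sketch of evaluating one such rule ("If pressure is high, then volume is small") with min-clipping; the ramp-shaped MFs and their thresholds are invented for illustration:

```python
def high_pressure(p):
    """Assumed ramp MF: grade 0 below pressure 50, grade 1 above 100."""
    return min(1.0, max(0.0, (p - 50) / 50))

def small_volume(v):
    """Assumed ramp MF: grade 1 at volume 0, grade 0 above 10."""
    return min(1.0, max(0.0, (10 - v) / 10))

def rule_output(p, v):
    """Degree to which the rule supports 'volume is v' given pressure p:
    the consequent is clipped by the antecedent's firing strength (min)."""
    return min(high_pressure(p), small_volume(v))
```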

(Slides 13-20: no transcript)
21
Classification and Prediction
  • Fuzzy
  • Support Vector Machine

22
Support Vector Machine
  • Search for the optimal separating hyperplane,
    i.e., the one that maximizes the margin

23
Support Vector Machine
  • Training an SVM is equivalent to solving a
    quadratic programming problem
  • Test phase: classify x by the sign of
    f(x) = Σi ai yi K(si, x) + b
  • si : support vectors; yi : class of si
  • K(·) : kernel function; ai, b : parameters
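The test-phase decision function over the support vectors can be sketched as below; the RBF kernel is an assumed example choice:

```python
import math

def rbf(u, v, gamma=1.0):
    """Gaussian (RBF) kernel K(u, v) = exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def svm_decision(x, svs, ys, alphas, b, kernel=rbf):
    """f(x) = sum_i a_i * y_i * K(s_i, x) + b; the predicted class is sign(f)."""
    return sum(a * y * kernel(s, x) for s, y, a in zip(svs, ys, alphas)) + b
```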

24
Support Vector Machine
  • Kernel function:
  • K(x, y) = Φ(x) · Φ(y)
  • x, y are vectors in the input space
  • Φ(x), Φ(y) are vectors in the feature space
  • dim(feature space) >> dim(input space)
  • No need to compute Φ(x) explicitly
  • e.g., a tree kernel Tr(x, y) = sub(x) · sub(y),
    where sub(x) is a vector representing all the
    sub-trees of x
  • www.csie.ntu.edu.tw/~cjlin
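The "no need to compute Φ(x) explicitly" point can be verified numerically for a degree-2 polynomial kernel (an illustrative choice, not from the slides): the kernel evaluated in input space matches the dot product under the explicit feature map, which the kernel trick never constructs.

```python
import math

def k_poly2(x, y):
    """Degree-2 polynomial kernel K(x, y) = (x . y)^2 on 2-D inputs."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2

def phi(x):
    """Explicit feature map for K above: (x1^2, sqrt(2) x1 x2, x2^2)."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x, y = (1.0, 2.0), (3.0, 4.0)
# Same value either way, but k_poly2 never builds the 3-D feature vectors.
assert abs(k_poly2(x, y) - dot(phi(x), phi(y))) < 1e-9
```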

25
Classification and Prediction
  • Fuzzy
  • Support Vector Machine
  • Prediction

26
What Is Prediction?
  • Prediction is similar to classification
  • First, construct a model
  • Second, use the model to predict unknown values
  • The major method for prediction is regression
  • Linear and multiple regression
  • Non-linear regression
  • Prediction is different from classification
  • Classification predicts a categorical class label
  • Prediction models continuous-valued functions

27
Regression Analysis and Log-Linear Models in
Prediction
  • Linear regression: Y = α + β X
  • The two parameters α and β specify the line and
    are estimated from the data at hand,
  • e.g., by applying the least squares criterion to
    the known values Y1, Y2, …, X1, X2, ….
  • Multiple regression: Y = b0 + b1 X1 + b2 X2
  • Many nonlinear functions can be transformed into
    the above.
  • Log-linear models
  • The multi-way table of joint probabilities is
    approximated by a product of lower-order tables.
  • Probability: p(a, b, c, d) = αab βac χad δbcd
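The least-squares estimation of α and β can be sketched in a few lines:

```python
def fit_line(xs, ys):
    """Least-squares estimates of alpha and beta in Y = alpha + beta * X."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n          # sample means
    beta = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))  # covariance / variance
    alpha = my - beta * mx                     # line passes through the means
    return alpha, beta
```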

28
Locally Weighted Regression
  • Construct an explicit approximation to f over a
    local region surrounding the query instance xq
  • Locally weighted linear regression:
  • the target function f is approximated near xq
    using a linear function
  • minimize the squared error, weighted by a
    distance-decreasing kernel K, e.g., via the
    gradient descent training rule
  • In most cases, the target function is
    approximated by a constant, linear, or quadratic
    function.
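For the constant approximation mentioned in the last bullet, minimizing the kernel-weighted squared error reduces to a kernel-weighted average of the targets; the Gaussian weight and the bandwidth tau below are assumed choices:

```python
import math

def lwr_constant(xq, xs, ys, tau=1.0):
    """Locally weighted constant fit at query xq: the weighted mean of ys,
    with the assumed distance-decreasing weight K(d) = exp(-d^2 / (2 tau^2))."""
    ws = [math.exp(-((x - xq) ** 2) / (2 * tau ** 2)) for x in xs]
    return sum(w * y for w, y in zip(ws, ys)) / sum(ws)
```

Points far from xq get exponentially small weight, so the prediction tracks the local behavior of f rather than a single global fit.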

29
Classification and Prediction
  • Fuzzy
  • Support Vector Machine
  • Prediction
  • Classification accuracy

30
Classification Accuracy Estimating Error Rates
  • Partition: training-and-testing
  • use two independent data sets, e.g., training set
    (2/3), test set (1/3)
  • used for data sets with a large number of samples
  • Cross-validation
  • divide the data set into k subsamples
  • use k-1 subsamples as training data and one
    subsample as test data --- k-fold
    cross-validation
  • for data sets of moderate size
  • Bootstrapping (leave-one-out)
  • for small data sets
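The k-fold splitting step can be sketched as:

```python
def k_fold_splits(n, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation
    over n samples: each fold serves as the test set exactly once."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size
```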

31
Boosting and Bagging
  • Boosting increases classification accuracy
  • Applicable to decision trees or Bayesian
    classifiers
  • Learn a series of classifiers, where each
    classifier in the series pays more attention to
    the examples misclassified by its predecessor
  • Boosting requires only linear time and constant
    space

32
Boosting Technique (II) Algorithm
  • Assign every example an equal weight 1/N
  • For t = 1, 2, …, T do
  • Obtain a hypothesis (classifier) h(t) under the
    weights w(t)
  • Calculate the error of h(t) and re-weight the
    examples based on the error
  • Normalize w(t+1) to sum to 1
  • Output a weighted sum of all the hypotheses, with
    each hypothesis weighted according to its
    accuracy on the training set
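One round of this loop, written in the style of AdaBoost.M1 as an assumed concrete instance of the generic algorithm above (err is assumed to lie strictly between 0 and 0.5):

```python
import math

def adaboost_round(weights, predictions, labels):
    """One boosting round: weighted error of h(t), its vote alpha, and the
    re-normalized example weights w(t+1). Misclassified examples gain weight,
    so the next classifier pays more attention to them."""
    err = sum(w for w, p, y in zip(weights, predictions, labels) if p != y)
    alpha = 0.5 * math.log((1 - err) / err)      # hypothesis weight (vote)
    new_w = [w * math.exp(alpha if p != y else -alpha)
             for w, p, y in zip(weights, predictions, labels)]
    z = sum(new_w)
    return alpha, [w / z for w in new_w]         # normalize to sum to 1
```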

33
Is Accuracy Enough to Judge?
  • Sensitivity = t_pos / pos
  • Specificity = t_neg / neg
  • Precision = t_pos / (t_pos + f_pos)
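With the usual four outcome counts (true/false positives and negatives), these measures compute as:

```python
def clf_metrics(t_pos, f_neg, t_neg, f_pos):
    """Sensitivity, specificity, and precision from the outcome counts."""
    pos = t_pos + f_neg          # all actually-positive samples
    neg = t_neg + f_pos          # all actually-negative samples
    sensitivity = t_pos / pos
    specificity = t_neg / neg
    precision = t_pos / (t_pos + f_pos)
    return sensitivity, specificity, precision
```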

34
Classification and Prediction
  • Decision tree
  • Bayesian Classification
  • ANN
  • KNN
  • GA
  • Fuzzy
  • SVM
  • Prediction
  • Some issues

35
Summary
  • Classification is an extensively studied problem
    (mainly in statistics, machine learning, and
    neural networks)
  • Classification is probably one of the most widely
    used data mining techniques, with many
    extensions
  • Scalability is still an important issue for
    database applications; thus, combining
    classification with database techniques should be
    a promising research topic
  • Research directions: classification of
    non-relational data, e.g., text, spatial, and
    multimedia data