Different Forms of Learning

Slides: 9

Provided by: scie231

Learn more at: https://www2.cs.uh.edu

1
Learning Paradigms and General Aspects of
Learning

Different Forms of Learning
A learning agent receives feedback with respect to
its actions (e.g. from a teacher).
Supervised Learning: feedback is received with
respect to all possible actions of the agent.
Reinforcement Learning: feedback is received only
with respect to the action the agent took.
Unsupervised Learning: learning when there is no
hint at all about the correct action.
Inductive Learning is a form of supervised
learning that centers on learning a function
based on sets of training examples. Popular
inductive learning techniques include decision
trees, neural networks, nearest neighbor
approaches, discriminant analysis, and
regression.
The performance of an inductive learning system
is usually evaluated using n-fold
cross-validation.
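
As a rough illustration of n-fold cross-validation, the sketch below splits a data set into n folds, trains on n-1 of them, and tests on the remaining one. The helper names (`n_fold_cross_validation`, `train_nearest_neighbor`) and the toy data are assumptions made for this example, not part of the slides.

```python
import random

def n_fold_cross_validation(examples, train_fn, n=5, seed=0):
    """Return the mean accuracy of train_fn over n folds.

    examples: list of (features, label) pairs.
    train_fn: takes a training list, returns a predict(features) function.
    """
    data = list(examples)
    random.Random(seed).shuffle(data)
    folds = [data[i::n] for i in range(n)]  # n nearly equal-sized folds
    accuracies = []
    for i, test_fold in enumerate(folds):
        # Train on every fold except the i-th, then test on the i-th.
        training = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        predict = train_fn(training)
        correct = sum(predict(x) == y for x, y in test_fold)
        accuracies.append(correct / len(test_fold))
    return sum(accuracies) / n

def train_nearest_neighbor(training):
    """1-nearest-neighbor learner on numeric feature tuples (illustrative)."""
    def predict(x):
        def squared_distance(example):
            return sum((a - b) ** 2 for a, b in zip(x, example[0]))
        return min(training, key=squared_distance)[1]
    return predict

# Hypothetical data: two well-separated clusters on a line.
examples = [((i * 0.1, 0.0), "low") for i in range(10)] + \
           [((10 + i * 0.1, 0.0), "high") for i in range(10)]
accuracy = n_fold_cross_validation(examples, train_nearest_neighbor, n=5)
```

Every example is used for testing exactly once, so the averaged accuracy estimates performance on unseen data more reliably than a single train/test split.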

2
Classifier Systems

According to Goldberg [113], a classifier system
is a machine learning system that learns
syntactically simple string rules to guide its
performance in an arbitrary environment.
A classifier system consists of three main
components:
Rule and message system
Apportionment of credit system
Genetic algorithm (for evolving classifiers)
First implemented in a system called CS1 by
Holland and Reitman (1978).
Example of classifier rules:
000000
0001100
111000
000001
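
To illustrate how such string rules are matched against messages, here is a minimal sketch. It assumes the common `condition:action` notation over the alphabet {0, 1, #}, where `#` is a "don't care" symbol; the rule strings and the message are hypothetical.

```python
WILDCARD = "#"  # assumed "don't care" symbol of the alphabet {0, 1, #}

def parse_rule(text):
    """Split 'condition:action' into its two parts."""
    condition, action = text.split(":")
    return condition, action

def matches(condition, message):
    """A condition matches when every non-wildcard position agrees."""
    return len(condition) == len(message) and all(
        c == WILDCARD or c == m for c, m in zip(condition, message)
    )

# Hypothetical rules, chosen only for illustration.
rules = [parse_rule(r) for r in ["01##:0000", "00#0:1100", "##00:0001"]]
message = "0100"

# Actions (messages) that the matching classifiers would like to post.
posted = [action for condition, action in rules if matches(condition, message)]
```

Here the first and third rules match the message `0100`, so their actions become candidate messages for the next cycle.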
The fitness of a classifier is defined by its
surrounding environment, which pays payoff to
classifiers and extracts fees from them.
Classifier systems employ a Michigan approach
(populations consist of single rules) in the
context of an externally defined fitness
function.

3
Bucket Brigade Algorithm

Developed by Holland for the apportionment of
credit. It relies on the model of a service
economy consisting of two main components: an
auction and a clearing house.
The environment as well as the classifiers post
messages.
Each classifier maintains a bank account that
measures its strength. Classifiers that match a
posted string make a bid proportional to their
strength. Usually, the highest-bidding classifier
is selected to post its message (other, more
parallel schemes are also used).
The auction permits appropriate classifiers to
post their messages. Once a classifier is
selected for activation, it must clear its
payment through the clearing house, paying its bid
to other classifiers or to the environment for
the matching messages they rendered. A matched and
activated classifier sends its bid to those
classifiers responsible for sending the messages
that matched the bidding classifier's conditions.
The bid is distributed in some manner among those
classifiers.
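
The auction and clearing house can be sketched as follows. This is a simplified illustration, not Holland's implementation: the bid coefficient `C_BID`, the equal split of the bid among senders, and the single-winner auction are assumptions.

```python
from dataclasses import dataclass

C_BID = 0.1  # assumed bid coefficient

@dataclass
class Classifier:
    condition: str
    action: str
    strength: float

def matches(condition, message):
    return all(c == "#" or c == m for c, m in zip(condition, message))

def auction_step(classifiers, message, senders):
    """Run one auction; return the winner, or None if nothing matches.

    senders: classifiers that posted `message`. An empty list means the
    environment posted it, so the bid leaves the population.
    """
    bidders = [c for c in classifiers if matches(c.condition, message)]
    if not bidders:
        return None
    winner = max(bidders, key=lambda c: c.strength)  # highest bid wins
    bid = C_BID * winner.strength
    winner.strength -= bid                 # clearing house: the winner pays ...
    for sender in senders:                 # ... and the suppliers share the bid
        sender.strength += bid / len(senders)
    return winner

a = Classifier("0#", "11", strength=100.0)
b = Classifier("11", "00", strength=50.0)
c = Classifier("1#", "01", strength=80.0)

first = auction_step([a, b, c], "01", senders=[])               # environment message
second = auction_step([a, b, c], first.action, senders=[first])  # message posted by `first`
```

In the second step, classifier `c` outbids `b` and pays its bid to `a`, whose message it matched. This is how credit flows backwards along a chain of cooperating rules.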

4
Bucket Brigade (continued)

Rules that cooperate with a classifier are
rewarded by receiving the classifier's bid. The
last classifier in a chain receives the
environmental reward; all the other classifiers
receive their reward, in the form of a bid, from
the classifier that follows them in the chain.
A classifier's strength might be subject to
taxation. The idea that underlies taxation is to
punish inactive classifiers:
$T_i(t) = c_{tax} \cdot S_i(t)$
The strength of a classifier is updated using the
following equation:
$S_i(t+1) = S_i(t) - P_i(t) - T_i(t) + R_i(t)$
where $P_i(t)$ is the payment made for a winning
bid, $T_i(t)$ is the tax, and $R_i(t)$ is the
receipts from other classifiers or the environment.
A classifier bids proportional to its strength:
$B_i = c_{bid} \cdot S_i$
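
A small numeric sketch of the strength update, reading it as S_i(t+1) = S_i(t) - P_i(t) - T_i(t) + R_i(t) with a proportional bid and tax. The coefficient values and the function name `update_strength` are assumptions chosen for illustration.

```python
C_BID = 0.1   # assumed bid coefficient
C_TAX = 0.01  # assumed tax coefficient

def update_strength(strength, won_auction, receipts):
    """Return S_i(t+1) = S_i(t) - P_i(t) - T_i(t) + R_i(t)."""
    # P_i(t): the bid is paid only if the classifier was activated.
    payment = C_BID * strength if won_auction else 0.0
    # T_i(t): every classifier pays tax, active or not.
    tax = C_TAX * strength
    return strength - payment - tax + receipts

idle = update_strength(100.0, won_auction=False, receipts=0.0)
active = update_strength(100.0, won_auction=True, receipts=20.0)
```

An idle classifier slowly loses strength through taxation alone, while an active one pays its bid but can more than recover it through receipts.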
Genetic algorithms are used to evolve
classifiers. A classifier's strength defines its
fitness: fitter classifiers reproduce with higher
probability (e.g. roulette wheel selection might
be employed), and binary string mutation and
crossover operators are used to generate new
classifiers. Newly generated classifiers replace
weak, low-strength classifiers (other schemes such
as crowding could also be employed).
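
One generation step of such a genetic algorithm might look like the sketch below. It assumes rules are condition strings over {0, 1, #}, uses one-point crossover, and gives the child the average strength of its parents; all of these are illustrative choices rather than details from the slides.

```python
import random

ALPHABET = "01#"  # assumed symbol set for condition strings

def roulette_select(population, rng):
    """Pick a (rule, strength) pair with probability proportional to strength."""
    weights = [strength for _, strength in population]
    return rng.choices(population, weights=weights, k=1)[0]

def crossover(a, b, rng):
    """One-point crossover of two equal-length rule strings."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(rule, rate, rng):
    """Replace each symbol by a random one with probability `rate`."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else s for s in rule)

def ga_step(population, rng, mutation_rate=0.05):
    """Breed one child and let it replace the weakest classifier."""
    mother, s1 = roulette_select(population, rng)
    father, s2 = roulette_select(population, rng)
    child = mutate(crossover(mother, father, rng), mutation_rate, rng)
    weakest = min(range(len(population)), key=lambda i: population[i][1])
    next_population = list(population)
    next_population[weakest] = (child, (s1 + s2) / 2)  # assumed initial strength
    return next_population

rng = random.Random(0)
population = [("01#0", 10.0), ("1#00", 30.0), ("0000", 1.0)]
new_population = ga_step(population, rng)
```

Only the weakest classifier is overwritten, so the rest of the population, and the strengths it has accumulated, survive the step.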

5
Pittsburgh-style Systems

Populations consist of rule-sets, and not of
individual rules.
No bucket brigade algorithm is necessary.
Mechanisms to evaluate individual rules are
usually missing.
Michigan-style systems are geared towards
applications with dynamically changing
requirements (models of adaptation). Pitt-style
systems rely on more static environments and
assume a fixed fitness function for rule-sets,
which is not necessary in the Michigan approach.
Pittsburgh approach systems usually have to cope
with variable-length chromosomes.
Popular Pittsburgh-style systems include:
Smith's LS-1 system (learns symbolic rule-sets)
Janikow's GIL system (learns symbolic rules;
employs operators of Michalski's inductive
learning theory as its genetic operators)
Giordana and Saitta's REGAL (learns symbolic
concept descriptions)
DELVAUX (learns (numerical) Bayesian rule-sets)
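
The following sketch contrasts the Pittsburgh representation with the Michigan one: an individual is a whole rule-set, fitness is computed for the set as a unit, and crossover can change the number of rules. The first-match semantics, the default class, and the XOR data are assumptions for illustration and do not describe any of the systems listed above.

```python
import random

def matches(condition, example):
    return all(c == "#" or c == e for c, e in zip(condition, example))

def classify(rule_set, example, default="0"):
    """Return the action of the first matching rule (assumed semantics)."""
    for condition, action in rule_set:
        if matches(condition, example):
            return action
    return default

def fitness(rule_set, examples):
    """Fitness of the whole rule-set: fraction of examples classified correctly."""
    return sum(classify(rule_set, x) == y for x, y in examples) / len(examples)

def crossover(parent_a, parent_b, rng):
    """Cut each parent at its own point, so the child's length can vary."""
    cut_a = rng.randint(0, len(parent_a))
    cut_b = rng.randint(0, len(parent_b))
    return parent_a[:cut_a] + parent_b[cut_b:]

examples = [("00", "0"), ("01", "1"), ("10", "1"), ("11", "0")]  # XOR
good = [("01", "1"), ("10", "1")]   # individual with two rules
poor = [("##", "1")]                # individual with one rule
child = crossover(good, poor, random.Random(1))
```

Because the fitness function scores complete rule-sets, no credit has to be apportioned to individual rules, which is why no bucket brigade is needed.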

6
New Trends in Learning Classifier Systems (LCS)

Holland-style LCS work is very similar to work in
reinforcement learning, especially Evolutionary
Reinforcement Learning and an approach called
Q-Learning. Newer papers claim that the bucket
brigade and Q-Learning are basically the same
thing, and that LCS can benefit from recent
advances in the area of Q-Learning.
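
For comparison with the bucket brigade, here is a minimal sketch of the standard tabular Q-Learning update. The state and action names, the learning rate `alpha`, and the discount factor `gamma` are assumptions chosen for illustration.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """One tabular Q-Learning step:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    """
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])

q = defaultdict(float)  # all Q-values start at 0.0
actions = ["left", "right"]

# The step that reaches the goal is rewarded directly ...
q_update(q, "s1", "right", reward=10.0, next_state="goal", actions=actions)
# ... and the earlier step inherits part of that value, without any reward of its own.
q_update(q, "s0", "right", reward=0.0, next_state="s1", actions=actions)
```

The second update shows value flowing backwards from a later state to an earlier one, much as bids flow backwards along a chain of classifiers.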
Wilson's accuracy-based XCS has received
significant attention in the literature (to be
covered later)
Holland stresses the adaptive component of his
invention in his newer work.
Recently, many Pittsburgh-style systems have been
designed that learn rule-based systems using
evolutionary computing. They are quite different
from Holland's data-driven message-passing
systems, and include:
Systems that learn Bayesian rules or Bayesian
belief networks
Systems that learn fuzzy rules
Systems that learn first-order logic rules
Systems that learn PROLOG-style programs
Work somewhat similar to classifier systems has
become quite popular in the field of agent-based
systems that have to learn how to communicate and
collaborate in a distributed environment.

7
Important Parameters for XCS

XCS learns/maintains the following parameters for
all its classifiers during the course of its
operation
p is the expected payoff. Combined with the
rule's fitness value, it has a strong influence
on whether a matching classifier's action is
selected for execution.
e is the error made in predicting the payoff.
F (called fitness) denotes a classifier's
normalized accuracy; accuracy is the inverse of
the degree of error made by a classifier. F
combined with as determines which classifiers are
chosen to be deleted from the population. F
combined with p determines which actions of
competing classifiers are selected for execution.
as is the average size of the action sets this
classifier belonged to. The smaller as/F is, the
less likely it becomes that this classifier is
deleted.
exp (experience) counts how often the classifier
belonged to the action set. It has some influence
on the prediction of the other parameters:
namely, if exp is low, default parameters are
used when predicting the other parameters
(especially e, F and as).
Moreover, it is important to know that only
classifiers belonging to the action set are
considered for reproduction.
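
The roles of p, F and as can be sketched as follows. The fitness-weighted payoff average is the usual XCS system prediction. The deletion vote is a deliberately simplified stand-in that only captures the as/F dependence described above. The class name, field names and numeric values are assumptions.

```python
from dataclasses import dataclass

@dataclass
class XCSClassifier:
    action: str
    p: float    # expected payoff
    e: float    # prediction error
    F: float    # fitness (normalized accuracy)
    as_: float  # average action-set size ("as" is a Python keyword)
    exp: int    # experience

def system_prediction(match_set):
    """Fitness-weighted average of the expected payoff, per action."""
    totals, weights = {}, {}
    for cl in match_set:
        totals[cl.action] = totals.get(cl.action, 0.0) + cl.p * cl.F
        weights[cl.action] = weights.get(cl.action, 0.0) + cl.F
    return {a: totals[a] / weights[a] for a in totals if weights[a] > 0}

def deletion_vote(cl):
    """Simplified: a larger as/F makes deletion more likely."""
    return cl.as_ / max(cl.F, 1e-9)

match_set = [
    XCSClassifier("left", p=100.0, e=5.0, F=0.75, as_=4.0, exp=30),
    XCSClassifier("left", p=900.0, e=400.0, F=0.25, as_=4.0, exp=30),
    XCSClassifier("right", p=500.0, e=10.0, F=0.5, as_=8.0, exp=30),
]
prediction = system_prediction(match_set)
best_action = max(prediction, key=prediction.get)
```

The inaccurate "left" classifier promises a high payoff, but its low fitness gives it little weight in the prediction and a high deletion vote.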

8
Symbolic Empirical Learning (SEL)

SEL's topic is creating symbolic descriptions
whose structure is unknown a priori. Its most
important subfield is learning symbolic concept
descriptions from sets of examples. Popular
systems include:
Systems of the ID./C4 family that employ decision
trees (originated from the work of Quinlan and
his co-workers). C4.5 is one of the most popular
and powerful inductive learning systems.
Systems of the AQ.-family, which originated from
the work of Michalski and his co-workers.
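
As an illustration of how decision-tree learners in the ID3 tradition choose an attribute to split on, the sketch below computes information gain, the entropy reduction achieved by a split. The attribute names and the tiny data set are hypothetical, and the sketch covers only split selection, not a full tree learner.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(examples, attribute):
    """Entropy reduction from splitting (attributes, label) pairs on `attribute`."""
    labels = [label for _, label in examples]
    partitions = {}
    for attrs, label in examples:
        partitions.setdefault(attrs[attribute], []).append(label)
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder

# Hypothetical examples: the label depends only on "outlook".
examples = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "play"),
    ({"outlook": "rainy", "windy": "no"}, "stay"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
]
best = max(["outlook", "windy"], key=lambda a: information_gain(examples, a))
```

The attribute with the highest gain becomes the test at the current tree node, and the procedure is then repeated on each partition.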
On the other hand, various systems that employ
numerical empirical learning have been proposed
to obtain classifications from sets of examples;
these include:
Neural networks
Systems that employ statistical and/or
probabilistic reasoning, and fuzzy techniques
GA-style systems (in between numerical and
symbolic approaches)