Data Mining - PowerPoint PPT Presentation

About This Presentation
Title:

Data Mining

Description:

Data Mining Lecture 3 – PowerPoint PPT presentation

Slides: 24
Provided by: Erta7

Transcript and Presenter's Notes


1
Data Mining
  • Lecture 3

2
Course Syllabus
  • Course topics
  • Introduction (Week 1 - Week 2)
  • What is Data Mining?
  • Data Collection and Data Management Fundamentals
  • The Essentials of Learning
  • The Emerging Needs for Different Data Analysis
    Perspectives
  • Data Management and Data Collection Techniques
    for Data Mining Applications (Week 3 - Week 4)
  • Data Warehouses: gathering raw data from
    relational databases and transforming it into
    information
  • Information Extraction and Data Processing
    Techniques
  • Data Marts: the need for building highly
    specialized data stores for data mining
    applications

3
Week 3 - Reminder - Data to Knowledge Pyramid
  • Figure: a pyramid of layers; the potential to
    support business decisions increases toward the
    top. Layers from top to bottom, with the roles
    shown beside them:
  • Making Decisions (End User)
  • Data Presentation: Visualization Techniques
    (Business Analyst)
  • Data Mining: Information Discovery (Data Analyst)
  • Data Exploration: Statistical Analysis, Querying
    and Reporting
  • Data Warehouses / Data Marts: OLAP, MDA (DBA)
  • Data Sources: Paper, Files, Information
    Providers, Database Systems, OLTP

4
Week 2 - Reminder - Data Mining Perspective to
Knowledge Discovery
  • [Figure; surviving label: Knowledge]
  • Adapted from U. Fayyad et al. (1995), "From
    Knowledge Discovery to Data Mining: An
    Overview", in Advances in Knowledge Discovery
    and Data Mining, U. Fayyad et al. (Eds.),
    AAAI/MIT Press.

5
Week 3 - Reminder - Essentials of Learning
  • Learning?
  • Can we formalize it?
  • Is it just a chemical activation?
  • Is it memorization?
  • Is it the continuous connecting and disconnecting
    of nodes in a dynamically changing brain network
    topology?

6
Week 3 - Reminder - Essentials of Learning
  • The Artificial Intelligence View
  • Learning is central to human knowledge and
    intelligence, and it is essential for building
    intelligent machines.
  • Years of effort in AI have shown that trying to
    build intelligent computers by programming all
    the rules cannot be done; automatic learning is
    crucial. For example, we humans are not born with
    the ability to understand language; we learn it.
    It therefore makes sense to try to have computers
    learn language instead of trying to program it
    all in.

7
Week 3 - Reminder - Essentials of Learning
  • The Software Engineering View
  • Machine Learning allows us to program computers
    by example, which can be easier than writing code
    the traditional way.
  • The Stats View
  • Machine Learning is the marriage of computer
    science and statistics
  • computational techniques are applied to
    statistical problems. Machine Learning has been
    applied to a vast number of problems in many
    contexts, beyond the typical statistics problems.
    Machine Learning is often designed with different
    considerations than statistics (e.g., speed is
    often more important than accuracy).

8
Week 3 - Essentials of Learning
  • Informal Learning Problem Definition
  • A computer program that improves its performance
    at some task through experience.
  • Formal Learning Problem Definition
  • A computer program is said to learn from
    experience E with respect to some class of tasks
    T and performance measure P if its performance at
    tasks in T, as measured by P, improves with
    experience E.

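The formal definition can be made concrete with a toy learner. The sketch below is an illustration only and is not taken from the lecture: the task T (deciding whether an integer in 0..99 is at least 50), the learner, the training stream, and every name in the code are assumptions chosen for this example. It records the performance measure P (accuracy over all 100 integers) after each additional piece of experience E.

```python
"""Toy illustration: performance P at task T measured as experience E grows."""

# Hypothetical task T: decide whether an integer in 0..99 is >= 50.
def true_label(x: int) -> bool:
    return x >= 50

# Hypothetical experience E: a fixed stream of training values,
# labeled by true_label (the "teacher").
TRAINING_STREAM = [80, 20, 65, 55, 40, 50]

def learn_threshold(examples):
    """Return the smallest positive example seen, or None if none was seen."""
    positives = [x for x in examples if true_label(x)]
    return min(positives) if positives else None

def predict(threshold, x: int) -> bool:
    """Predict positive only for values at or above the learned threshold."""
    return threshold is not None and x >= threshold

def accuracy(threshold) -> float:
    """Performance measure P: fraction of 0..99 classified correctly."""
    correct = sum(predict(threshold, x) == true_label(x) for x in range(100))
    return correct / 100

# P after 0, 1, ..., 6 training examples.
learning_curve = [
    accuracy(learn_threshold(TRAINING_STREAM[:k]))
    for k in range(len(TRAINING_STREAM) + 1)
]
print(learning_curve)  # [0.5, 0.7, 0.7, 0.85, 0.95, 0.95, 1.0]
```

For this particular learner P never decreases as E grows, which is what the definition asks for. Real learners are usually noisier, so improvement is judged on average rather than example by example.
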
9
Week 3 - Essentials of Learning
  • A chess learning problem
  • Task T: playing chess
  • Performance measure P: percent of games won
    against opponents
  • Training experience E: playing practice games
    against itself
  • A handwriting recognition learning problem
  • Task T: recognizing and classifying handwritten
    words within images
  • Performance measure P: percent of words
    correctly classified
  • Training experience E: a database of handwritten
    words with given classifications
  • A robot driving learning problem
  • Task T: driving on public four-lane highways
    using vision sensors
  • Performance measure P: average distance traveled
    before an error (as judged by a human overseer)
  • Training experience E: a sequence of images and
    steering commands recorded while observing a
    human driver

10
Week 3 - Essentials of Learning
  • Attributes of Experience
  • Learning from direct training examples
    consisting of states and the correct move for
    each: supervised learning
  • CHESS PROBLEM: providing individual chess board
    states and the correct move for each
  • Learning from indirect information consisting of
    the moves and the final outcomes of these moves:
    unsupervised learning
  • CHESS PROBLEM: providing sequences of moves and
    final outcomes of various games played
  • Indirect information raises the causality
    (credit assignment) problem: deciding how much
    each move contributed to the final outcome

11
Week 3 - Essentials of Learning
  • Attributes of Experience: the degree to which
    the learner controls the sequence of training
    examples
  • CHESS PROBLEM
  • The learner may rely on the teacher to select
    informative board states and to provide the
    correct move for each.
  • The learner might itself propose board states
    that it finds particularly confusing and ask the
    teacher for the correct move.
  • The learner may have complete control over both
    the board states and the (indirect) training
    classifications, as it does when it learns by
    playing against itself with no teacher present.

12
Week 3 - Essentials of Learning
  • Attributes of Experience: how well it represents
    the distribution of examples over which the
    final system performance P must be measured
  • Important: most current theory of machine
    learning rests on the crucial assumption that
    the distribution of training examples is
    identical to the distribution of test examples.
    Although we need this assumption to obtain
    theoretical results, it is important to keep in
    mind that it is often violated in practice.

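A small experiment shows why the assumption matters. The following is a minimal sketch under stated assumptions, not something from the slides: the target concept (an interval), the threshold learner, and the two test ranges are all hypothetical. The learner sees only values 0..69, so a simple threshold rule fits the training data perfectly, yet the same rule fails once the test values come from a different range.

```python
"""Toy illustration of a mismatch between training and test distributions."""

# Hypothetical target concept: positive only inside the interval [50, 80).
def true_label(x: int) -> bool:
    return 50 <= x < 80

def fit_threshold(train_xs) -> int:
    """Learn the rule 'x >= t', with t the smallest positive training value."""
    positives = [x for x in train_xs if true_label(x)]
    return min(positives)

def accuracy(threshold: int, test_xs) -> float:
    """Fraction of test values on which the threshold rule is correct."""
    test_xs = list(test_xs)
    correct = sum((x >= threshold) == true_label(x) for x in test_xs)
    return correct / len(test_xs)

train_xs = range(0, 70)              # training distribution: values 0..69
threshold = fit_threshold(train_xs)  # the learner never sees values >= 80

same_distribution = accuracy(threshold, range(0, 70))       # like training
shifted_distribution = accuracy(threshold, range(60, 100))  # shifted test set

print(threshold, same_distribution, shifted_distribution)   # 50 1.0 0.5
```

The rule is perfect on data that looks like the training set and no better than a coin flip on the shifted set, because the upper end of the interval never appeared during training.
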
13
Week 3 - Essentials of Learning
  • Central Limit Theorem
  • The Central Limit Theorem states that the sum of
    a large number of independent, identically
    distributed random variables approximately
    follows a Normal distribution.
  • Consider a set of independent, identically
    distributed random variables Y1 ... YN governed
    by an arbitrary probability distribution with
    finite variance.
  • Even if we do not know the distribution of the
    individual Yi, we can still describe the
    distribution of their sample mean,
    (Y1 + ... + YN) / N, which approximately follows
    the Normal distribution for large N.
  • A common rule of thumb is that we can use the
    Normal approximation when N > 30.

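The theorem can be checked empirically. The simulation below is a sketch under assumptions of my own choosing, not part of the lecture: it draws from an exponential distribution (clearly not Normal, with mean 1 and standard deviation 1), uses a sample size of 40, and fixes a random seed. It compares the spread of the sample means with the value sigma / sqrt(N) that the Normal approximation predicts.

```python
"""Simulation sketch of the Central Limit Theorem using sample means."""
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

N = 40          # sample size, above the rule-of-thumb value of 30
TRIALS = 2000   # number of sample means to collect

def sample_mean(n: int) -> float:
    """Mean of n draws from an exponential distribution (mean 1, std 1)."""
    return sum(random.expovariate(1.0) for _ in range(n)) / n

means = [sample_mean(N) for _ in range(TRIALS)]

observed_mean = statistics.fmean(means)
observed_std = statistics.stdev(means)
predicted_std = 1.0 / math.sqrt(N)  # sigma / sqrt(N) with sigma = 1

# Share of sample means within one predicted standard deviation of 1.0;
# a Normal distribution would put roughly 68% of its mass there.
within_one_std = sum(abs(m - 1.0) <= predicted_std for m in means) / TRIALS

print(observed_mean, observed_std, predicted_std, within_one_std)
```

If the approximation holds, the observed mean should be close to 1, the observed standard deviation close to the predicted one, and the share within one standard deviation close to 0.68.
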
14
Week 3 - Essentials of Learning
  • Operational Definition of Learning Function -
    Target Function
  • Given the training experience and target
    definition, decide on the learning architecture
    by considering correctness, applicability, and
    performance.
  • CHESS PROBLEM
  • Ideal Target Function: ChooseMove : B -> M,
    indicating that this function accepts as input
    any board from the set of legal board states B
    and produces as output some move from the set of
    legal moves M.
  • What if only indirect training experience is
    available to our system?

15
Week 3 - Essentials of Learning
  • Operational Definition of Learning Function -
    Target Function
  • Given the training experience and target
    definition, decide on the learning architecture
    by considering correctness, applicability, and
    performance.
  • CHESS PROBLEM
  • Operational Target Function: V : B -> R,
    denoting that V maps any legal board state from
    the set B to some real value.
  • V assigns higher scores to better board states
    and is then used to select the best move from
    any current board position.
  • This can be accomplished by generating the
    successor board state produced by every legal
    move, then using V to choose the best successor
    state and therefore the best legal move.
  • Is it really operational? Not so! Computing the
    exact value of a board state requires searching
    all the way down to the end of the game, so this
    definition is computationally not operational.

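The move-selection step (generate every successor board, score each with V, keep the best) can be sketched in a few lines. This is an illustrative sketch only: chess is replaced by a hypothetical counter game, and `legal_moves`, `apply_move`, `TARGET`, and the hand-written `V` are assumptions made for this example, not the lecture's definitions. A hand-written V sidesteps the search to the end of the game, which is exactly the part a learner would have to approximate.

```python
"""Sketch: pick a move by scoring every successor state with V."""
from typing import Callable, Iterable

# Hypothetical toy game standing in for chess: the board is a counter,
# a move adds 1, 2 or 3, and boards closer to TARGET are better.
TARGET = 10

def legal_moves(board: int) -> Iterable[int]:
    """Every legal move from this board (the same three in this toy game)."""
    return (1, 2, 3)

def apply_move(board: int, move: int) -> int:
    """Successor board produced by playing the move."""
    return board + move

def V(board: int) -> float:
    """Hand-written evaluation function: higher score means better board."""
    return -abs(TARGET - board)

def choose_move(board: int, evaluate: Callable[[int], float] = V) -> int:
    """Generate each successor board and return the move with the best score."""
    return max(legal_moves(board), key=lambda m: evaluate(apply_move(board, m)))

print(choose_move(6))  # successors 7, 8, 9 -> move 3 is chosen
print(choose_move(9))  # successors 10, 11, 12 -> move 1 is chosen
```

Passing the evaluation function as a parameter means a learned approximation of V could be swapped in without changing the selection logic.
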
17
Week 3 - Essentials of Learning
  • Operational Definition of Learning Function -
    Target Function
  • Given the training experience and target
    definition, decide on the learning architecture
    by considering correctness, applicability, and
    performance.
  • CHESS PROBLEM
  • Choosing a complex target function brings
    expressibility, but it also brings a performance
    bottleneck and an urgent need for many more
    training examples to learn from.
  • The real issue is choosing the operational
    target function -> MODELING -> function
    approximation

18
Week 3 - Essentials of Learning
  • Importance of Target Function
  • The target function simply determines the size
    of our hypothesis space (solution space).
  • What if the needed solution cannot be
    represented in our hypothesis space?
  • Suppose we have a perfect hypothetical
    hypothesis space H that can represent every
    teachable function, so expressibility is not our
    problem.
  • Are we OK with that H? NO: we are now completely
    unable to generalize beyond the observed
    examples.

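The loss of generalization can be seen by brute force on a tiny instance space. The sketch below is illustrative and rests on assumptions: three binary attributes, a hypothesis space containing every possible labeling of the eight instances, and a made-up training set of five examples. It keeps the hypotheses consistent with the training data and counts how they vote on each unseen instance.

```python
"""Sketch: an unbiased hypothesis space cannot generalize to unseen instances."""
from itertools import product

# All 8 instances over three binary attributes.
instances = list(product([0, 1], repeat=3))

# Unbiased hypothesis space: every possible labeling of the 8 instances,
# i.e. 2**8 = 256 hypotheses, each stored as a dict instance -> label.
hypotheses = [
    dict(zip(instances, labels))
    for labels in product([False, True], repeat=len(instances))
]

# Hypothetical training data: labels for 5 of the 8 instances.
training = {
    (0, 0, 0): False,
    (0, 1, 1): True,
    (1, 0, 1): True,
    (1, 1, 0): True,
    (1, 1, 1): True,
}

# Hypotheses that agree with every training example.
consistent = [h for h in hypotheses if all(h[x] == y for x, y in training.items())]

def votes(x):
    """Return (votes for positive, votes for negative) among consistent hypotheses."""
    positive = sum(1 for h in consistent if h[x])
    return positive, len(consistent) - positive

unseen = [x for x in instances if x not in training]
for x in unseen:
    print(x, votes(x))  # every unseen instance gets a 4-to-4 tie
```

Eight hypotheses survive, and for every unseen instance exactly half of them say positive and half say negative, so the training data gives no basis for preferring either label.
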
19
Week 3 - Essentials of Learning
  • Importance of Target Function
  • If we need generalization and applicability to
    unseen instances, we must choose a biased target
    function (a generalizable target function).
  • A learner that makes no a priori assumptions
    regarding the identity of the target function
    has no rational basis for classifying any unseen
    instances.
  • What we wish to capture here is the policy by
    which the learner generalizes beyond the
    observed training data to infer the
    classification of new instances.

20
Week 3 - Essentials of Learning
  • Inductive Bias
  • Target concept, target function
  • Formal Definition

21
Week 3 - Essentials of Learning
  • Inductive Bias Examples
  • ROTE-LEARNER: Learning corresponds simply to
    storing each observed training example in
    memory. Subsequent instances are classified by
    looking them up in memory. If the instance is
    found in memory, the stored classification is
    returned. Otherwise, the system refuses to
    classify the new instance.
  • Inductive Bias: bias-free
  • CANDIDATE-ELIMINATION ALGORITHM: New instances
    are classified only in the case where all
    members of the current version space (the subset
    of hypotheses consistent with our training
    examples) agree on the classification.
    Otherwise, the system refuses to classify the
    new instance.
  • Inductive Bias: the target concept is in the
    hypothesis space
  • FIND-S: This algorithm finds the most specific
    hypothesis consistent with the training
    examples. It then uses this hypothesis to
    classify all subsequent instances.
  • Inductive Bias: the target concept is in the
    hypothesis space, and the most specific
    consistent hypothesis represents it

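The difference in bias shows up as soon as an unseen instance arrives. The sketch below contrasts a rote learner with Find-S; it is a simplified illustration under assumptions, not the lecture's code. Hypotheses are conjunctions of attribute values with "?" as a wildcard, and the three weather-style training examples and all names are invented for this example.

```python
"""Sketch: a rote learner versus Find-S on conjunctive hypotheses."""
from typing import Dict, List, Optional, Tuple

Instance = Tuple[str, ...]
Example = Tuple[Instance, bool]

class RoteLearner:
    """Stores examples verbatim; refuses (returns None) on unseen instances."""

    def __init__(self) -> None:
        self.memory: Dict[Instance, bool] = {}

    def fit(self, examples: List[Example]) -> None:
        for instance, label in examples:
            self.memory[instance] = label

    def classify(self, instance: Instance) -> Optional[bool]:
        return self.memory.get(instance)

def find_s(examples: List[Example]) -> Optional[List[str]]:
    """Most specific conjunctive hypothesis covering all positive examples."""
    hypothesis: Optional[List[str]] = None  # None covers no instance at all
    for instance, label in examples:
        if not label:
            continue  # Find-S ignores negative examples
        if hypothesis is None:
            hypothesis = list(instance)  # first positive: copy it exactly
        else:
            # Generalize only where the hypothesis disagrees with the example.
            hypothesis = [h if h == v else "?" for h, v in zip(hypothesis, instance)]
    return hypothesis

def matches(hypothesis: Optional[List[str]], instance: Instance) -> bool:
    """True if the instance satisfies every constraint of the hypothesis."""
    if hypothesis is None:
        return False
    return all(h == "?" or h == v for h, v in zip(hypothesis, instance))

# Hypothetical training data: (sky, temperature, wind) -> label.
examples: List[Example] = [
    (("sunny", "warm", "strong"), True),
    (("sunny", "warm", "weak"), True),
    (("rainy", "cold", "strong"), False),
]

rote = RoteLearner()
rote.fit(examples)
hypothesis = find_s(examples)

unseen = ("sunny", "warm", "calm")
print(hypothesis)                   # ['sunny', 'warm', '?']
print(rote.classify(unseen))        # None: the rote learner refuses
print(matches(hypothesis, unseen))  # True: Find-S generalizes
```

The rote learner makes no assumption and so cannot answer, while Find-S commits to one conjunctive hypothesis and classifies every new instance, correctly or not.
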
22
Week 3 - Essentials of Learning
  • Search Bias vs Restriction Bias
  • ID3 searches a complete hypothesis space (i.e.,
    one capable of expressing any finite
    discrete-valued function). It searches
    incompletely through this space, from simple to
    complex hypotheses, until its termination
    condition is met (e.g., until it finds a
    hypothesis consistent with the data). Its
    inductive bias is solely a consequence of the
    ordering of hypotheses by its search strategy.
    Its hypothesis space introduces no additional
    bias. (SEARCH BIAS, PREFERENCE BIAS)
  • The CANDIDATE-ELIMINATION algorithm searches an
    incomplete hypothesis space (i.e., one that can
    express only a subset of the potentially
    teachable concepts). It searches this space
    completely, finding every hypothesis consistent
    with the training data (the version space). Its
    inductive bias is solely a consequence of the
    expressive power of its hypothesis
    representation. Its search strategy introduces
    no additional bias. (RESTRICTION BIAS, LANGUAGE
    BIAS)

23
Week 3-End
  • read
  • Supplemantary Book Machine Learning- Tom
    Mitchell Chapter 1 Chapter 2
  • Course Text Book Chapter 2 (preparation for the
    next week)