1
A model of Inductive Bias Learning
  • Jonathan Baxter
  • Frans Oliehoek
  • <faolieho@science.uva.nl>

2
Overview of presentation
  • Problem: selecting the inductive bias
  • Environment of related tasks
  • Revision: the PAC-learning model
  • Bias learning model
  • Example: feature learning
  • The covering numbers (?)
  • Conclusion
  • Questions / Discussion

3
Problem: selecting the inductive bias
  • An important question
  • COIL example
  • How to choose the hypothesis space?
  • Large enough to contain a solution
  • Small enough to generalize well
  • Model for learning the inductive bias
  • Assumption: the learner is embedded in an
    environment of related tasks

4
Environment of related tasks
  • Idea: learning a bias for related tasks
  • Fewer examples required per task
  • Bias appropriate for new tasks of the same type
  • Examples
  • Handwritten character recognition
  • 1 task: distinguish 'A' from all other characters
  • Pre-processing that is good for all these different tasks
  • Recognizing n faces
  • Learning a bias that is good for recognizing new
    faces

5
Revision: the PAC-learning model
  • Input and output spaces X, Y
  • P: probability distribution on X × Y
  • defines the (non-deterministic) task
  • H: hypothesis space, h: X → Y
  • l: loss function, l: Y × Y → R
  • Training error er_z(h)
  • training set z = ((x1,y1),…,(xm,ym))
  • Generalization error er_P(h) (both errors written out below)
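For reference, the two error measures can be written out explicitly (standard definitions, consistent with Baxter's notation; not copied from the slide):

  er_z(h) = \frac{1}{m} \sum_{i=1}^{m} l(h(x_i), y_i)

  er_P(h) = \mathbb{E}_{(x,y) \sim P}\, l(h(x), y)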

6
Revision: the PAC-learning model (2)
  • Upper bound on the number of examples needed for a
    certain generalization error, via the VC-dimension of H
  • m ≥ (1/ε) · ( 4 log2(2/δ) + 8 VC(H) log2(13/ε) )
    (evaluated numerically after this slide)
  • Gives a condition under which er_P(h) is likely to
    be small, but this is no guarantee: H must still
    contain a good hypothesis!
  • Bias = selection of the hypothesis space H
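As a quick illustration, a small Python sketch that evaluates this bound numerically; the function name and the chosen ε, δ, VC(H) values are illustrative, not from the slides:

  import math

  def pac_sample_bound(eps, delta, vc_dim):
      """Number of examples sufficient for generalization error <= eps
      with probability >= 1 - delta (the bound quoted on this slide)."""
      return (4 * math.log2(2 / delta) + 8 * vc_dim * math.log2(13 / eps)) / eps

  # Illustrative values (not from the slides): eps = 0.1, delta = 0.05, VC(H) = 10
  print(math.ceil(pac_sample_bound(0.1, 0.05, 10)))   # roughly 5,800 examples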

7
Bias learning model
  • Goal: learning a good bias, i.e.
  • find an appropriate H for
  • the environment of related tasks
  • a probability distribution P on X × Y is a task
  • Q is a probability distribution over tasks
  • what tasks the learner is likely to see
  • gives the environment (P, Q)
  • Hypothesis space family ℍ = {H}: a set of hypothesis spaces
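Following Baxter's paper, the quality of one hypothesis space H for this environment can be written as (my rendering of the paper's definition, not copied from the slide):

  er_Q(H) = \mathbb{E}_{P \sim Q} \Big[ \inf_{h \in H} er_P(h) \Big]

i.e. the expected best-achievable generalization error when a task is drawn from Q and the learner must choose its hypothesis from H.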

8
1 task vs. bias learning
  • Single-task learning:
  • find h ∈ H
  • er_P(h)
  • er_z(h)
  • z = ((x1,y1),…,(xm,ym))
  • sample complexity bounded by the VC-dimension
  • Bias learning:
  • find H ∈ ℍ
  • er_Q(H)
  • er_z(H) (written out below)
  • an (n,m)-sample z:
      (x11,y11) … (x1m,y1m)
          ⋮             ⋮
      (xn1,yn1) … (xnm,ynm)
  • bounded by covering numbers
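The empirical counterpart of er_Q(H) on an (n,m)-sample, consistent with the definitions above (again my rendering, not copied from the slide):

  er_z(H) = \frac{1}{n} \sum_{i=1}^{n} \inf_{h \in H} \frac{1}{m} \sum_{j=1}^{m} l(h(x_{ij}), y_{ij})

Each task contributes the training error of the best hypothesis in H on its own m examples.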

9
Uniform convergence for bias learners
  • Covering numbers
  • capacities of ℍ, related to a lower bound on the
    generalization error and to average loss
    functions over n hypotheses
  • Sample complexity bounds
  • Given
  • environment (P, Q)
  • an (n,m)-sample z
  • n > … (enough tasks)
  • m > … (enough examples per task)
  • with probability at least 1 − δ, all H ∈ ℍ satisfy
    er_Q(H) ≤ er_z(H) + ε

10
Implications
  • For any bias H the learner selects, er_Q(H) can be bounded
  • In order to learn a bias such that
  • er_Q(H) ≤ er_z(H) + ε, for all H ∈ ℍ
  • both m and n need to be sufficiently large
  • When an H ∈ ℍ with a small er_z(H) has been learned,
    this H can be used to learn new related tasks
    with improved bounds
  • For fixed δ and ε, the number of examples required per
    task, m, decreases as the number of tasks, n, increases ⇒
    information is shared between tasks

11
Example: feature learning
  • Feature learning as a bias learning problem
  • Selecting strong features
  • f: X → V maps the input to a space V of lower dimension
  • F = {f}, the set of all feature maps
  • Then apply classification (regression, etc.)
  • g: V → Y, g ∈ G
  • G is a class of functions (a hypothesis space relative to V)
  • H = G ∘ f = {g ∘ f : g ∈ G} for each f
  • ℍ = {G ∘ f : f ∈ F}

12
Feature learning 2
  • ℍ = {H_w}
  • Each H_w has W parameters (v_ij, u_ij); w is the vector of
    parameters
  • The feature map computes k features, using a neural net
    with h hidden units
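A minimal Python sketch of this architecture: a shared feature map (weights v_ij, u_ij) followed by one simple output map per task. Layer sizes, function names and the use of NumPy are my own illustrative assumptions, not from the slides:

  import numpy as np

  def init_params(d_in, h, k, n_tasks, rng):
      """Shared feature map (d_in -> h hidden -> k features) plus one linear head per task."""
      return {
          "V": 0.1 * rng.standard_normal((d_in, h)),   # shared input-to-hidden weights (v_ij)
          "U": 0.1 * rng.standard_normal((h, k)),      # shared hidden-to-feature weights (u_ij)
          "heads": [0.1 * rng.standard_normal(k) for _ in range(n_tasks)],  # per-task g in G
      }

  def features(params, x):
      """Shared feature map f_w : X -> V (k features), sigmoid hidden units."""
      hidden = 1.0 / (1.0 + np.exp(-x @ params["V"]))
      return hidden @ params["U"]

  def predict(params, x, task):
      """Task-specific hypothesis g(f_w(x)): a linear head applied to the shared features."""
      return features(params, x) @ params["heads"][task]

  rng = np.random.default_rng(0)
  params = init_params(d_in=16, h=8, k=4, n_tasks=3, rng=rng)
  x = rng.standard_normal((5, 16))            # a batch of 5 inputs
  print(predict(params, x, task=0).shape)     # -> (5,)

Sharing V and U across tasks is what makes this a bias learner: the learned feature map constrains the hypothesis space used for each individual task.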

13
Feature learning 3
  • We want to learn a good bias
  • z: an (n,m)-sample
  • locate an H_w with a small er_z(H_w)
  • er_z(H_w): the average, over the n tasks, of the best
    empirical error achievable within H_w (sketched after this slide)
  • gradient descent over w and (a_1, …, a_{k+1})
  • What n, m are needed for good generalization?
  • given by the theorem, but what about the capacities?
  • for squared loss …
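With the composed hypotheses g ∘ f_w, the empirical error being minimized can be written as follows (my reconstruction from the definitions above; the slide's original formula is not preserved in this transcript):

  er_z(H_w) = \frac{1}{n} \sum_{i=1}^{n} \min_{g \in G} \frac{1}{m} \sum_{j=1}^{m} l\big( g(f_w(x_{ij})),\, y_{ij} \big)

Gradient descent is then run jointly over the shared feature-map weights w and the per-task output weights.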

14
The covering numbers
  • The convergence theorem depends on the covering
    numbers
  • Characteristics of ℍ that play a role similar to the
    VC dimension of H
  • We start with …

15
The covering numbers 2
  • For each H ∈ ℍ:
  • a function that maps each task P to the lowest achievable
    generalization error within H
  • the set of all these functions
  • a pseudo-metric:
  • the difference in this generalization error under the
    task distribution Q

16
The covering numbers 3
  • an ε-cover of …
  • is a set …
  • The size of the smallest ε-cover …
  • Now the capacity of ℍ is …
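For reference, the generic definitions being instantiated here (standard covering-number definitions; the slide's specific sets and pseudo-metrics are the ones described in the surrounding bullets):

  A set T \subseteq S is an ε-cover of a pseudo-metric space (S, d) if
  \forall s \in S \ \exists t \in T : d(s, t) \le \varepsilon.

  \mathcal{N}(\varepsilon, S, d) = the size of the smallest ε-cover of (S, d).

The capacities used in the theorem are covering numbers of the function classes derived from ℍ, taken with respect to the pseudo-metrics introduced on these slides.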

17
The covering numbers 4
  • the average error of n hypotheses on n different
    tasks
  • all of these functions, for a certain H
  • the union over the hypothesis space family ℍ

18
The covering numbers 5
  • a pseudo-metric:
  • the difference in error over a fixed vector of tasks
    P = (P1, …, Pn)
  • again, the size of the smallest ε-cover
  • the capacity of …

19
Feature learning (continued)
  • For the network used it can be shown that …
  • Therefore, if …
  • with probability at least 1 − δ, any H_w satisfies
  • er_Q(H_w) ≤ er_z(H_w) + ε

20
Choosing the hypothesis space family
  • Choosing the hypothesis space family ℍ
  • which to select?
  • hyper-bias
  • claimed to be easier

21
Conclusions
  • A formal model of bias learning
  • Assumption: the learner is embedded in an environment of
    related tasks
  • Bounds on sample complexity
  • A first step towards a formal model of hierarchical learning

22
Discussion questions
  • Who starts?