1
Automated Theory Formation for Machine Learning Tasks
  • Simon Colton

2
Goal of Talk - Feedback
  • This work is very much a sideline
  • Forward look ahead, Bayesian averaging
  • Are the methods proposed novel?
  • Are the methods proposed a good idea?

3
Michalski Trains
Progol takes about a second to solve this
4
Solving this with HR
  • Simple search: form concepts blindly
  • Demo: HR takes 60 seconds normally
  • Hill climbing is not always the best option
  • No discernible gradient
  • Better to know when you're nearly there

5
Forward Look Ahead Overview
  • For each new concept invented
  • Pass it through each production rule
  • Check if the concept produced is a winner
  • Can stop checking very quickly in most cases
  • Can check up to three steps in advance
  • Also checks all pairs of concepts
  • Add relevant steps to the agenda (see the sketch below)
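
A minimal sketch of this look-ahead loop, in Python, using hypothetical helpers (production_rules, is_winner, agenda) that stand in for HR's internals:

    # Sketch only: rule application, winner checking and the agenda are
    # assumptions standing in for HR's actual machinery.
    def forward_look_ahead(new_concept, production_rules, is_winner, agenda, depth=3):
        """Check whether new_concept leads to a winning concept within
        depth rule applications; if so, queue the relevant steps."""
        frontier = [(new_concept, [])]          # (concept, steps that built it)
        for _ in range(depth):                  # check up to three steps in advance
            next_frontier = []
            for concept, steps in frontier:
                for rule in production_rules:
                    produced = rule(concept)    # pairs of concepts would be tried here too
                    if produced is None:
                        continue                # rule not applicable: stop checking quickly
                    path = steps + [(rule, concept)]
                    if is_winner(produced):
                        agenda.extend(path)     # add the relevant steps to the agenda
                        return True
                    next_frontier.append((produced, path))
            frontier = next_frontier
        return False

In the trains example on the next slide, a loop like this is what lets HR notice that composing "small carriages" with "closed carriages" reaches the target concept without blind search.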

6
Forward Look Ahead Example
  • Trains example
  • small, closed carriage → eastbound
  • HR invents the concept of small carriages
  • Forward look ahead fails
  • HR invents the concept of closed carriages
  • Using the compose rule, it realises it should
    combine the two concepts
  • Demo: less than a second

7
Forward Look Ahead Results
  • From the 2000 ICML paper
  • Integer sequences
  • 20 sequences, e.g., primes, squares, repdigit
  • Average 373 seconds without look ahead
  • Average 3.5 seconds with look ahead
  • Odious numbers: from 90 mins to 7 secs

8
Back to the Trains
Which way will the new train go? Are you sure?
9
A More Considered Approach
  • Progol is great for such IQ test problems
  • Very successful in scientific discovery
  • But there are situations where many concepts
    should be taken into account
  • Automated theory formation
  • A theory contains 100s of concepts and conjectures
  • Form a theory, use it all to make predictions
  • Question: how best to use a theory?

10
Dataset
  • Example dataset from the UCI repository
  • Moral reasoner: is someone guilty? (Wogulis)
  • 100 guilty people, 100 not guilty, 23 properties
  • E.g., person p10
  • plan_known(y), careful(y), plan_included_harm(y),
    external_cause(n), etc.
  • Guilty as charged.

11
HR's Concepts
  • HR produces many, many concepts
  • Instantiate: careful(y/n) becomes careful(y)
  • Compose: careful(n) and plan_known(y)
  • Disjunct: external_cause(y) or careful(y)
    (these rules are sketched in code below)
  • Other ways to produce concepts
  • Not applicable to this dataset
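
A minimal sketch of these three production rules, in Python, treating a concept as a boolean predicate over a person's attribute dictionary (an illustrative representation, not HR's own):

    # Sketch only: concepts as predicates over attribute dictionaries.
    def instantiate(attribute, value):
        # careful(y/n) becomes careful(y)
        return lambda person: person.get(attribute) == value

    def compose(concept_a, concept_b):
        # e.g. careful(n) and plan_known(y)
        return lambda person: concept_a(person) and concept_b(person)

    def disjunct(concept_a, concept_b):
        # e.g. external_cause(y) or careful(y)
        return lambda person: concept_a(person) or concept_b(person)

    # Example concept from this slide:
    careful_n_and_plan_known_y = compose(instantiate("careful", "n"),
                                         instantiate("plan_known", "y"))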

12
Using Bayesian Probabilities
  • Given a theory T constructed by HR
  • Given a new, unseen person P
  • And possible classifications X1, X2, X3, ...
  • For each classification Xi
  • For each property concept C in the theory
  • If C is true of P
  • Use examples in the theory to calculate P(Xi | C)
  • Average over all Cs in the theory
  • To produce an estimate of P(Xi | T)
  • Choose X such that P(X | T) > P(Xi | T) for every
    other Xi (see the sketch below)
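
A minimal sketch of this averaging scheme, in Python, assuming the theory is just a list of concept predicates plus the labelled examples it was formed from (not HR's actual representation):

    # Sketch only: theory = (concepts, labelled examples) is an assumption.
    def conditional_prob(label, concept, examples):
        """Estimate P(label | C) from the examples in the theory."""
        satisfying = [l for ex, l in examples if concept(ex)]
        return satisfying.count(label) / len(satisfying) if satisfying else 0.0

    def predict(person, concepts, examples, labels):
        """Average P(label | C) over every concept C true of the person,
        then choose the label with the highest average."""
        averages = {}
        for label in labels:
            probs = [conditional_prob(label, c, examples)
                     for c in concepts if c(person)]
            averages[label] = sum(probs) / len(probs) if probs else 0.0
        return max(averages, key=averages.get), averages

    # e.g. predict(p77, concepts, examples, ["guilty", "not_guilty"])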

13
Example: is p77 guilty?
  • Person p77
  • P(guilty | plan_known(n)) = 0.63..
  • P(guilty | severity_harm(1)) = 0.66..
  • Etc., giving average P(guilty | T) = 0.53
  • P(not_guilty | plan_known(n)) = 0.36..
  • P(not_guilty | benefit_victim(0)) = 0.34..
  • Etc., giving average P(not_guilty | T) = 0.46..
  • Hence predict that p77 is not guilty
  • Demo: HR with p77

14
Very Preliminary Results
  • Train theory using leave-one-out

15
Moral Results
80% of data held back: HR gets between 80% and 90%
16
Anomalies
  • Holding back less data doesn't help
  • Forming a theory for longer doesn't help?
  • Must exhaust theory formation
  • Up to a given complexity limit
  • Otherwise bias creeps in

17
Advantages and Disadvantages
  • Can predict for any concept
  • Not just one prescribed in advance
  • Demo: did p77 cause harm? Was s/he careful?
  • Often not appropriate (e.g., trains)
  • Is very time/memory inefficient
  • E.g., on a laptop it cannot finish the mutagenesis theory

18
Research Questions
  • Forward look ahead
  • Is this a novel/plausible approach?
  • Does it scale up? (probably not)
  • Bayesian averaging
  • Is there any better way to use the theory?
  • Is averaging a good idea?
  • Is it novel?
  • Do certain choices for theory formation relate
    to other machine learning techniques (naïve Bayes)?

19
Other Possibilities: Near Equivalences
  • HR grew up in mathematics domains
  • A conjecture is only true if it is 100% true
  • But in ML in general an 80% result is good
  • HR now makes near equivalences (scored as sketched below)
  • P is guilty iff severity_harm(1): 75% true
  • P is guilty iff benefit_victim(0): 72% true
  • Could select from these for prediction
  • HR can also fix faulty conjectures
  • Lakatos-style reasoning (Alison Pease's PhD)
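
A minimal sketch of how such a near equivalence could be scored, in Python, again treating concepts as predicates over labelled examples (names and representation are assumptions for illustration):

    # Sketch only: scores the biconditional "guilty(P) iff concept(P)"
    # over the labelled examples.
    def near_equivalence_score(concept, examples, positive_label="guilty"):
        """Fraction of examples on which the biconditional holds."""
        agree = sum(1 for person, label in examples
                    if (label == positive_label) == bool(concept(person)))
        return agree / len(examples)

    # A conjecture scoring 0.75 would be reported as 75% true and could be
    # kept as a near equivalence for prediction.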

20
Future Work
  • Develop more strategies for prediction
  • Apply HR to many more datasets
  • Compare the results with standard machine
    learning algorithms
  • Check out the WEKA project
  • http://www.cs.waikato.ac.nz/ml/