CIS 830 Advanced Topics in AI, Lecture 45 of 45

1
Lecture 45
Course Review and Future Research Directions
Friday, May 5, 2000
William H. Hsu
Department of Computing and Information Sciences, KSU
http://www.cis.ksu.edu/~bhsu
Readings: Chapters 1-10, 13, Mitchell; Chapters 14-21, Russell and Norvig
2
Main Themes: Artificial Intelligence and KDD
  • Analytical Learning: Combining Symbolic and
    Numerical AI
  • Inductive learning
  • Role of knowledge and deduction in integrated
    inductive and analytical learning
  • Artificial Neural Networks (ANNs) for KDD
  • Common neural representations; current
    limitations
  • Incorporating knowledge into ANN learning
  • Uncertain Reasoning in Decision Support
  • Probabilistic knowledge representation
  • Bayesian knowledge and data engineering (KDE):
    elicitation, causality
  • Data mining: KDD applications
  • Role of causality and explanations in KDD
  • Framework for data mining: wrappers for
    performance enhancement
  • Genetic Algorithms (GAs) for KDD
  • Evolutionary algorithms (GAs, GP) as optimization
    wrappers
  • Introduction to classifier systems

3
Class 0: A Brief Overview of Machine Learning
  • Overview: Topics, Applications, Motivation
  • Learning: Improving with Experience at Some Task
  • Improve over task T,
  • with respect to performance measure P,
  • based on experience E.
  • Brief Tour of Machine Learning
  • A case study
  • A taxonomy of learning
  • Intelligent systems engineering: specification of
    learning problems
  • Issues in Machine Learning
  • Design choices
  • The performance element: intelligent systems
  • Some Applications of Learning
  • Database mining, reasoning (inference/decision
    support), acting
  • Industrial usage of intelligent systems

4
Class 1: Integrating Analytical and Inductive
Learning
  • Learning Specification (Inductive, Analytical)
  • Instances X, target function (concept) c: X → H,
    hypothesis space H
  • Training examples D: positive, negative examples
    of target function c
  • Analytical learning: also given domain theory T
    for explaining examples
  • Domain Theories
  • Expressed in formal language: propositional
    logic, predicate logic
  • Set of assertions (e.g., well-formed formulae)
    for reasoning about domain
  • Expresses constraints over relations (predicates)
    within model
  • Example: Ancestor (x, y) ← Parent (x, z) ∧
    Ancestor (z, y).
  • Determine
  • Hypothesis h ∈ H such that h(x) = c(x) for all x
    ∈ D
  • Such h are consistent with training data and
    domain theory T
  • Integration Approaches
  • Explanation (proof and derivation)-based
    learning: EBL
  • Pseudo-experience: incorporating knowledge of
    environment, actuators
  • Top-down decomposition: programmatic (procedural)
    knowledge, advice
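
A minimal sketch of the two ingredients named on this slide: a domain theory (the Ancestor rule) used to derive facts, and the consistency test h(x) = c(x) over training examples D. The base case Ancestor(x, y) ← Parent(x, y), the family names, and the helper names are assumptions added for illustration; they are not from the lecture.

```python
def ancestors(parent_facts):
    """Forward-chain Parent facts to a fixed point of Ancestor facts."""
    # Assumed base case (not on the slide): Parent(x, y) implies Ancestor(x, y).
    ancestor = set(parent_facts)
    changed = True
    while changed:
        changed = False
        # Rule from the slide: Ancestor(x, y) <- Parent(x, z) and Ancestor(z, y).
        for (x, z) in parent_facts:
            for (z2, y) in list(ancestor):
                if z == z2 and (x, y) not in ancestor:
                    ancestor.add((x, y))
                    changed = True
    return ancestor


def is_consistent(h, examples):
    """True if hypothesis h agrees with the label c(x) on every example in D."""
    return all(h(x) == label for x, label in examples)


# Hypothetical domain facts and labeled training examples D.
parents = {("ann", "bob"), ("bob", "cat"), ("cat", "dan")}
closure = ancestors(parents)
D = [(("ann", "dan"), True), (("dan", "ann"), False), (("bob", "dan"), True)]
h = lambda pair: pair in closure  # hypothesis derived from the domain theory
print(is_consistent(h, D))
```

Here the hypothesis is read directly off the domain theory, so it is consistent with both D and T by construction; an inductive learner would instead search H for such an h.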

5
Classes 2-3: Explanation-Based Neural Networks
  • Paper
  • Topic: Explanation-Based and Inductive Learning
    in ANNs
  • Title: Integrating Inductive Neural Network
    Learning and EBL
  • Authors: Thrun and Mitchell
  • Presenter: William Hsu
  • Key Strengths
  • Idea: (state, action)-to-state mappings as steps
    in generalizable proof (explanation) for observed
    episode
  • Generalizable approach (significant for RL, other
    learning-to-predict inducers)
  • Key Weaknesses
  • Other numerical learning models (HMMs, DBNs) may
    be more suited to EBG
  • Tradeoff: domain theory of EBNN lacks semantic
    clarity of symbolic EBL
  • Future Research Issues
  • How to get the best of both worlds (clear DT,
    ability to generate explanations)?
  • Applications to explanation in commercial,
    military, legal decision support
  • See work by Thrun, Mitchell, Shavlik, Towell,
    Pearl, Heckerman

6
Classes 4-5: Phantom Induction
  • Paper
  • Topic: Distal Supervised Learning and Phantom
    Induction
  • Title: Iterated Phantom Induction: A Little
    Knowledge Can Go a Long Way
  • Authors: Brodie and DeJong
  • Presenter: Steve Gustafson
  • Key Strengths
  • Idea: apply knowledge to generate
    (pseudo-experiential) training data
  • Speedup: learning curve significantly shortened
    with respect to RL by application of small
    amount of knowledge
  • Key Weaknesses
  • Haven't yet seen how to produce plausible,
    comprehensible explanations
  • How much knowledge is a small amount? (How to
    measure?)
  • Future Research Issues
  • Control, planning domains similar (but not
    identical) to robot games
  • Applications: adaptive (e.g., ANN, BBN, MDP, GA)
    agent control, planning
  • See work by Brodie, DeJong, Rumelhart,
    McClelland, Sutton, Barto

7
Classes 6-7: Top-Down Hybrid Learning
  • Paper
  • Topic: Learning with Prior Knowledge
  • Title: A Divide-and-Conquer Approach to Learning
    from Prior Knowledge
  • Authors: Chown and Dietterich
  • Presenter: Aiming Wu
  • Key Strengths
  • Idea: apply programmatic (procedural) knowledge
    to select training data
  • Uses simulation to boost inductive learning
    performance (cf. model checking)
  • Divide-and-conquer approach (multiple experts)
  • Key Weaknesses
  • Doesn't illustrate form, structure of
    programmatic knowledge clearly
  • Doesn't systematize and formalize model checking
    / simulation approach
  • Future Research Issues
  • Model checking and simulation-driven hybrid
    learning
  • Applications: consensus under uncertainty,
    simulation-based optimization
  • See work by Dietterich, Frawley, Mitchell,
    Darwiche, Pearl

8
Classes 8-9: Learning Using Prior Knowledge
  • Paper
  • Topic: Refinement of Approximate Domain-Theoretic
    Knowledge
  • Title: Refinement of Approximate Domain Theories
    by Knowledge-Based Neural Networks
  • Authors: Towell, Shavlik, and Noordewier
  • Presenter: Li-Jun Wang
  • Key Strengths
  • Idea: build relational explanations; compile into
    ANN representation
  • Applies structural, functional, constraint-based
    knowledge
  • Uses ANN to further refine domain theory
  • Key Weaknesses
  • Can't get refined domain theory back!
  • Explanations also no longer clear after
    compilation (transformation) process
  • Future Research Issues
  • How to retain semantic clarity of explanations,
    DT, knowledge representation
  • Applications: intelligent filters (e.g., fraud
    detection), decision support
  • See work by Shavlik, Towell, Maclin, Sun,
    Schwalb, Heckerman
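
The "compile into ANN representation" step above can be illustrated with a small sketch of how a single conjunctive rule might be turned into one sigmoid unit. It follows the rule-to-unit translation commonly described for KBANN (link weight ω for positive antecedents, -ω for negated ones, bias -(P - 1/2)ω for P positive antecedents); the value ω = 4, the example rule, and all function names are assumptions for illustration, not details taken from the lecture.

```python
import math
from itertools import product

OMEGA = 4.0  # assumed link weight magnitude


def compile_rule(positive, negated):
    """Return (weights, bias) for one sigmoid unit encoding a conjunctive rule."""
    weights = {name: OMEGA for name in positive}
    weights.update({name: -OMEGA for name in negated})
    # Net input is positive only if every positive antecedent is on
    # and every negated antecedent is off.
    bias = -(len(positive) - 0.5) * OMEGA
    return weights, bias


def activate(weights, bias, inputs):
    """Sigmoid activation of the compiled unit for a dict of 0/1 inputs."""
    net = bias + sum(w * inputs[name] for name, w in weights.items())
    return 1.0 / (1.0 + math.exp(-net))


# Hypothetical rule: flies <- bird AND NOT penguin.
weights, bias = compile_rule(positive=["bird"], negated=["penguin"])
for bird, penguin in product([0, 1], repeat=2):
    out = activate(weights, bias, {"bird": bird, "penguin": penguin})
    print(bird, penguin, round(out, 3))
```

Once rules are compiled this way, ordinary gradient training can adjust the weights, which is also why the refined theory is hard to read back out of the network.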

9
Class 10: Introduction to Artificial Neural
Networks
  • Architectures
  • Nonlinear transfer functions
  • Multi-layer networks of nonlinear units (sigmoid,
    hyperbolic tangent)
  • Hidden layer representations
  • Backpropagation of Error
  • The backpropagation algorithm
  • Relation to error gradient function for nonlinear
    units
  • Derivation of training rule for feedforward
    multi-layer networks
  • Training issues: local optima, overfitting
  • References: Chapter 4, Mitchell; Chapter 4,
    Bishop; Rumelhart et al.
  • Research Issues: How to
  • Learn from observation, rewards and penalties,
    and advice
  • Distribute rewards and penalties through learning
    model, over time
  • Generate pseudo-experiential training instances
    in pattern recognition
  • Partition learning problems on the fly, via
    (mixture) parameter estimation
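
As a rough illustration of the backpropagation training rule for a feedforward network of sigmoid units, the sketch below computes the squared-error gradient for one hidden layer and one output unit. The network shape, the omission of bias terms, and all names and numeric values are simplifying assumptions, not the derivation used in class.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(w_hidden, w_out, x):
    """w_hidden: one weight list per hidden unit; w_out: output-unit weights."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    o = sigmoid(sum(w * hj for w, hj in zip(w_out, h)))
    return h, o


def loss(w_hidden, w_out, x, t):
    """Squared error E = 1/2 (t - o)^2 for a single example."""
    _, o = forward(w_hidden, w_out, x)
    return 0.5 * (t - o) ** 2


def backprop(w_hidden, w_out, x, t):
    """Gradient of E with respect to every weight."""
    h, o = forward(w_hidden, w_out, x)
    delta_o = (o - t) * o * (1.0 - o)  # dE/dnet at the output unit
    grad_out = [delta_o * hj for hj in h]
    # Propagate the output error back through each hidden unit's sigmoid.
    delta_h = [delta_o * w * hj * (1.0 - hj) for w, hj in zip(w_out, h)]
    grad_hidden = [[dh * xi for xi in x] for dh in delta_h]
    return grad_hidden, grad_out


def train_step(w_hidden, w_out, x, t, lr=0.5):
    """One gradient-descent update; returns new weight lists."""
    gh, go = backprop(w_hidden, w_out, x, t)
    new_hidden = [[w - lr * g for w, g in zip(row, grow)]
                  for row, grow in zip(w_hidden, gh)]
    new_out = [w - lr * g for w, g in zip(w_out, go)]
    return new_hidden, new_out


# Hypothetical weights and a single training example (first input acts as a bias).
w_h = [[0.1, -0.2, 0.3], [0.4, 0.1, -0.5]]
w_o = [0.2, -0.3]
x, t = [1.0, 0.5, -1.0], 1.0
print(loss(w_h, w_o, x, t), loss(*train_step(w_h, w_o, x, t), x, t))
```

A finite-difference comparison against `loss` is a convenient way to check such a gradient before using it for training.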

10
Classes 11-12: Reinforcement Learning and Advice
  • Paper
  • Topic: Knowledge and Reinforcement Learning in
    Intelligent Agents
  • Title: Incorporating Advice into Agents that
    Learn from Reinforcements
  • Authors: Maclin and Shavlik
  • Presenter: Kiranmai Nandivada
  • Key Strengths
  • Idea: compile advice into ANN representation for
    RL
  • Advice expressed in terms of constraint-based
    knowledge
  • Like KBANN, achieves knowledge refinement through
    ANN training
  • Key Weaknesses
  • Like KBANN, lose semantic clarity of advice,
    policy, explanations
  • How to evaluate refinement effectively?
    Quantitatively? Logically?
  • Future Research Issues
  • How to retain semantic clarity of explanations,
    DT, knowledge representation
  • Applications: intelligent agents, web mining
    (spiders, search engines), games
  • See work by Shavlik, Maclin, Stone, Veloso, Sun,
    Sutton, Pearl, Kuipers

11
Classes 13-14: Reinforcement Learning Over Time
  • Paper
  • Topic: Temporal-Difference Reinforcement Learning
  • Title: TD Models: Modeling the World at a Mixture
    of Time Scales
  • Author: Sutton
  • Presenter: Vrushali Koranne
  • Key Strengths
  • Idea: combine state-action evaluation function
    (Q) estimates over multiple time steps of
    lookahead
  • Effective temporal credit assignment (TCA)
  • Biologically plausible (simulates TCA aspects of
    dopaminergic system)
  • Key Weaknesses
  • TCA methodology is effective but semantically
    hard to comprehend
  • Slow convergence: can knowledge help? How will
    we judge?
  • Future Research Issues
  • How to retain clarity, improve convergence speed,
    of multi-time RL models
  • Applications: control systems, robotics, game
    playing
  • See work by Sutton, Barto, Mitchell, Kaelbling,
    Smyth, Shafer, Goldberg
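
One standard way to combine estimates over multiple steps of lookahead is the λ-return, a weighted mixture of n-step returns. The sketch below shows that textbook construction only; it is not the multi-time-scale model of the paper, it uses state values V rather than Q for brevity, and the function names and numbers are assumptions for illustration.

```python
def n_step_return(rewards, values, t, n, gamma):
    """n rewards from step t, discounted, plus a bootstrapped value if not terminal.

    rewards[k] is the reward on the transition out of state k;
    values[k] is the current estimate for state k; the terminal state has value 0.
    """
    T = len(rewards)
    end = min(t + n, T)
    g = sum(gamma ** (k - t) * rewards[k] for k in range(t, end))
    if end < T:
        g += gamma ** (end - t) * values[end]
    return g


def lambda_return(rewards, values, t, gamma, lam):
    """Mixture of n-step returns with weights (1 - lam) * lam**(n - 1)."""
    T = len(rewards)
    total = 0.0
    for n in range(1, T - t):
        total += (1 - lam) * lam ** (n - 1) * n_step_return(rewards, values, t, n, gamma)
    # All remaining weight goes to the full (Monte Carlo) return.
    total += lam ** (T - t - 1) * n_step_return(rewards, values, t, T - t, gamma)
    return total


# Hypothetical three-step episode with a reward only at the end.
rewards = [0.0, 0.0, 1.0]
values = [0.5, 0.6, 0.7]
for lam in (0.0, 0.5, 1.0):
    print(lam, lambda_return(rewards, values, 0, 0.9, lam))
```

With lam = 0 this reduces to the one-step TD target, and with lam = 1 to the full discounted return, so lam controls how far credit is assigned back in time.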

12
Classes 15-16: Generative Neural Models
  • Paper
  • Topic: Pattern Recognition using Unsupervised
    ANNs
  • Title: The Wake-Sleep Algorithm for Unsupervised
    Neural Networks
  • Authors: Hinton, Dayan, Frey, and Neal
  • Presenter: Prasanna Jayaraman
  • Key Strengths
  • Idea: use 2-phase algorithm to generate training
    instances (dream stage) and maximize
    conditional probability of data given model
    (wake stage)
  • Compare: expectation-maximization (EM) algorithm
  • Good for image recognition
  • Key Weaknesses
  • Not all data admits this approach (small samples,
    ill-defined features)
  • Not immediately clear how to use for
    problem-solving performance elements
  • Future Research Issues
  • Studying information theoretic properties of
    Helmholtz machine
  • Applications: image/speech/signal recognition,
    document categorization
  • See work by Hinton, Dayan, Frey, Neal,
    Kirkpatrick, Hajek, Ghahramani

13
Classes 17-18: Modularity in Neural Systems
  • Paper
  • Topic: Combining Models using Modular ANNs
  • Title: Modular and Hierarchical Learning Systems
  • Authors: Jordan and Jacobs
  • Presenter: Afrand Agah
  • Key Strengths
  • Idea: use interleaved EM update steps to update
    expert, gating components
  • Effect: forces specialization among ANN
    components (GLIMs); boosts performance of single
    experts; very fast convergence in some cases
  • Explores modularity in neural systems (artificial
    and biological)
  • Key Weaknesses
  • Often cannot achieve higher accuracy than ML,
    MAP, Bayes optimal estimation
  • Doesn't provide experts that specialize in
    spatial, temporal pattern recognition
  • Future Research Issues
  • Constructing, selecting mixtures of other ANN
    components (not just GLIMs)
  • Applications: pattern recognition, time series
    prediction
  • See work by Jordan, Jacobs, Nowlan, Hinton,
    Barto, Jaakkola, Hsu
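
To make the expert and gating components concrete, here is a minimal sketch of a flat mixture of experts: a softmax gate weights the expert outputs, and an E-step style posterior ("responsibility") shows how a well-fitting expert is pushed to specialize. The linear gate, the Gaussian noise model, and every name and number are assumptions for illustration, not the architecture from the paper.

```python
import math


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def mixture_predict(x, experts, gate_weights):
    """Gate-weighted combination of expert outputs for input vector x."""
    gate_scores = [sum(w * xi for w, xi in zip(v, x)) for v in gate_weights]
    g = softmax(gate_scores)
    outputs = [f(x) for f in experts]
    y = sum(gi * oi for gi, oi in zip(g, outputs))
    return y, g, outputs


def responsibilities(g, outputs, target, sigma=1.0):
    """Posterior probability that each expert generated the target (E-step)."""
    lik = [gi * math.exp(-0.5 * ((target - o) / sigma) ** 2)
           for gi, o in zip(g, outputs)]
    z = sum(lik)
    return [l / z for l in lik]


# Two hypothetical experts and a one-dimensional linear gate.
experts = [lambda x: 2.0 * x[0], lambda x: -1.0 * x[0]]
gate_weights = [[1.0], [-1.0]]
y, g, outputs = mixture_predict([1.0], experts, gate_weights)
r = responsibilities(g, outputs, target=2.0)
print(y, g, r)
```

In an EM-style update, the M-step would then refit each expert and the gate using these responsibilities as example weights.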

14
Class 19: Introduction to Probabilistic Reasoning
  • Architectures
  • Bayesian (Belief) Networks
  • Tree structured, polytrees
  • General
  • Decision networks
  • Temporal variants (beyond scope of this course)
  • Parameter Estimation
  • Maximum likelihood (MLE), maximum a posteriori
    (MAP)
  • Bayes optimal classification, Bayesian learning
  • References: Chapter 6, Mitchell; Chapters 14-15,
    19, Russell and Norvig
  • Research Issues: How to
  • Learn from observation, rewards and penalties,
    and advice
  • Distribute rewards and penalties through learning
    model, over time
  • Generate pseudo-experiential training instances
    in pattern recognition
  • Partition learning problems on the fly, via
    (mixture) parameter estimation
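
A small worked example of the difference between maximum likelihood and maximum a posteriori estimation, for a Bernoulli parameter with a Beta prior. The choice of a Beta(α, β) prior and the function names are assumptions made for this sketch.

```python
def bernoulli_mle(successes, trials):
    """Maximum likelihood estimate of the success probability."""
    return successes / trials


def bernoulli_map(successes, trials, alpha, beta):
    """Posterior mode under a Beta(alpha, beta) prior (alpha, beta >= 1 assumed)."""
    return (successes + alpha - 1) / (trials + alpha + beta - 2)


# Three successes in three trials: MLE commits fully, MAP is pulled toward the prior.
print(bernoulli_mle(3, 3))        # 1.0
print(bernoulli_map(3, 3, 2, 2))  # 0.8
```

With a uniform Beta(1, 1) prior the two estimates coincide, which is the usual way to see MLE as a special case of MAP.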

15
Classes 20-21: Approaches to Uncertain Reasoning
  • Paper
  • Topic: The Case for Probability
  • Title: In Defense of Probability
  • Author: Cheeseman
  • Presenter: Pallavi Paranjape
  • Key Strengths
  • Idea: probability is mathematically sound way to
    represent uncertainty
  • Views of probability considered: objectivist,
    frequentist, logicist, subjectivist
  • Argument made for meta-subjectivist belief
    measure concept of probability
  • Key Weaknesses
  • Highly dogmatic view without concrete
    justification for all assertions
  • Does not quantitatively, empirically compare
    Bayesian, non-Bayesian methods
  • Future Research Issues
  • Integrating symbolic and numerical (statistical)
    models of uncertainty
  • Applications: uncertain reasoning, pattern
    recognition, learning
  • See work by Cheeseman, Cox, Good, Pearl, Zadeh,
    Dempster, Shafer

16
Classes 22-23: Learning Bayesian Network Structure
  • Paper
  • Topic: Learning Bayesian Networks from Data
  • Title: Learning Bayesian Network Structure from
    Massive Datasets
  • Authors: Friedman, Pe'er, Nachman
  • Presenter: Jincheng Gao
  • Key Strengths
  • Idea: can use graph constraints, scoring
    functions to select candidate parents in
    constructing directed graph model of probability
    (BBN)
  • Tabu search, greedy score-based methods (K2),
    etc. also considered
  • Key Weaknesses
  • Optimal Bayesian network structure learning still
    intractable for conventional (single-instruction
    sequential) architectures
  • More empirical comparison among alternative
    methods warranted
  • Future Research Issues
  • Scaling up to massive real-world data sets (e.g.,
    medical, agricultural, DSS)
  • Applications: diagnosis, troubleshooting, user
    modeling, intelligent HCI
  • See work by Friedman, Goldszmidt, Heckerman,
    Cooper, Beinlich, Koller
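
One simple way to shortlist candidate parents before a structure search is to rank the other variables by empirical mutual information with the target variable. The sketch below shows only that ranking step on discrete data; it is not the paper's full procedure, and the function names and the toy data set are assumptions for illustration.

```python
import math
from collections import Counter


def mutual_information(data, i, j):
    """Empirical mutual information (in nats) between columns i and j."""
    n = len(data)
    joint = Counter((row[i], row[j]) for row in data)
    pi = Counter(row[i] for row in data)
    pj = Counter(row[j] for row in data)
    # p(a, b) * log(p(a, b) / (p(a) p(b))), written with raw counts.
    return sum((c / n) * math.log(c * n / (pi[a] * pj[b]))
               for (a, b), c in joint.items())


def candidate_parents(data, target, k):
    """Indices of the k variables sharing the most information with the target."""
    others = [j for j in range(len(data[0])) if j != target]
    ranked = sorted(others, key=lambda j: mutual_information(data, target, j),
                    reverse=True)
    return ranked[:k]


# Hypothetical data: column 1 copies column 0, column 2 is independent of it.
data = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
print(candidate_parents(data, target=0, k=1))
```

Restricting each variable to a few candidate parents shrinks the search space, which is what makes score-based search feasible on large data sets.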

17
Classes 24-25: Bayesian Networks for User Modeling
  • Paper
  • Topic: Decision Support Systems and Bayesian User
    Modeling
  • Title: The Lumiere Project: Bayesian User
    Modeling for Inferring the Goals and Needs of
    Software Users
  • Authors: Horvitz, Breese, Heckerman, Hovel,
    Rommelse
  • Presenter: Yuhui (Cathy) Liu
  • Key Strengths
  • Idea: BBN model is developed from user logs, used
    to infer mode of usage
  • Can infer goals, skill level of user
  • Key Weaknesses
  • Need high accuracy in inferring goals to deliver
    meaningful content
  • May be better to use next-generation search
    engine (more interactivity, less passive
    monitoring)
  • Future Research Issues
  • Designing better interactive user modeling
  • Applications: clickstream monitoring, e-commerce,
    web search, help
  • See work by Horvitz, Breese, Heckerman, Lee,
    Huang

18
Classes 26-27: Causal Reasoning
  • Paper
  • Topic: KDD and Causal Reasoning
  • Title: Symbolic Causal Networks for Reasoning
    about Actions and Plans
  • Authors: Darwiche and Pearl
  • Presenter: Yue Jiao
  • Key Strengths
  • Idea: use BBN to represent symbolic constraint
    knowledge
  • Can use to generate mechanistic explanations
  • Model actions
  • Model sequences of actions (plans)
  • Key Weaknesses
  • Integrative methods (numerical, symbolic BBNs)
    still need exploration
  • Unclear how to incorporate methods for learning
    to plan
  • Future Research Issues
  • Reasoning about systems
  • Applications: uncertain reasoning, pattern
    recognition, learning
  • See work by Horvitz, Breese, Heckerman, Lee,
    Huang

19
Classes 28-29: Knowledge Discovery from
Scientific Data
  • Paper
  • Topic: KDD for Scientific Data Analysis
  • Title: KDD for Science Data Analysis: Issues and
    Examples
  • Authors: Fayyad, Haussler, and Stolorz
  • Presenter: Arulkumar Elumalai
  • Key Strengths
  • Idea: investigate how and whether KDD techniques
    (OLAP, learning) scale up to huge data sets
  • Answer: it depends on computational
    complexity, many other factors
  • Key Weaknesses
  • Haven't developed clear theory yet of how to
    assess how much data is really needed
  • No technical treatment or characterization of
    data cleaning
  • Future Research Issues
  • Data cleaning (aka data cleansing), pre- and
    post-processing (OLAP)
  • Applications: intelligent databases,
    visualization, high-performance CSE
  • See work by Fayyad, Smyth, Uthurusamy, Haussler,
    Foster

20
Classes 30-31: Relevance Determination
  • Paper
  • Topic: Relevance Determination in KDD
  • Title: Irrelevant Features and the Subset
    Selection Problem
  • Authors: John, Kohavi, and Pfleger
  • Presenter: DingBing Yang
  • Key Strengths
  • Idea: cast problem of choosing relevant
    attributes (given top-level learning problem
    specification) as search
  • Effective state space search (A/A*-based)
    approach demonstrated
  • Key Weaknesses
  • May not have good enough heuristics!
  • Can either develop them (via information theory)
    or use MCMC methods
  • Future Research Issues
  • Selecting relevant data channels from continuous
    sources (e.g., sensors)
  • Applications: bioinformatics (genomics,
    proteomics, etc.), prognostics
  • See work by Kohavi, John, Rendell, Donoho, Hsu,
    Provost
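
Casting attribute selection as search can be sketched with greedy forward selection: states are feature subsets, operators add one feature, and an evaluation function scores each state. This is plain hill climbing rather than the A/A*-style search mentioned on the slide, and the toy scoring function stands in for a real wrapper evaluation (for example, cross-validated accuracy of an inducer); all names are assumptions.

```python
def forward_selection(features, evaluate):
    """Greedy search over feature subsets; evaluate(subset) is maximized."""
    selected = frozenset()
    best_score = evaluate(selected)
    improved = True
    while improved:
        improved = False
        best_feature = None
        # Try every single-feature extension of the current subset.
        for f in sorted(set(features) - selected):
            score = evaluate(selected | {f})
            if score > best_score:
                best_score, best_feature, improved = score, f, True
        if improved:
            selected = selected | {best_feature}
    return selected, best_score


# Hypothetical evaluation: reward relevant features, penalize irrelevant ones.
RELEVANT = {"a", "b"}


def toy_score(subset):
    return len(subset & RELEVANT) - 0.1 * len(subset - RELEVANT)


subset, score = forward_selection(["a", "b", "c", "d"], toy_score)
print(sorted(subset), score)
```

The search stops as soon as no single addition improves the score, which is exactly where a weak heuristic or evaluation function can leave it stuck in a local optimum.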

21
Classes 32-33: Learning for Text Document
Categorization
  • Paper
  • Topic: Text Documents and Information Retrieval
    (IR)
  • Title: Hierarchically Classifying Documents using
    Very Few Words
  • Authors: Koller and Sahami
  • Presenter: Yan Song
  • Key Strengths
  • Idea: use rank-frequency scoring methods to find
    keywords that make a difference
  • Break into meaningful hierarchy
  • Key Weaknesses
  • Sometimes need to derive semantically meaningful
    cluster labels
  • How to integrate this method with dynamic cluster
    segmentation, labeling?
  • Future Research Issues
  • Bayesian architectures using non-Bayesian
    learning algorithms (e.g., GAs)
  • Applications: digital libraries (hierarchical,
    distributed dynamic indexing), intelligent search
    engines, intelligent displays (and help indices)
  • See work by Koller, Sahami, Roth, Charniak,
    Brill, Yarowsky

22
Classes 34-35: Web Mining
  • Paper
  • Topic: KDD and The Web
  • Title: Learning to Extract Symbolic Knowledge
    from the World Wide Web
  • Authors: Craven, DiPasquo, Freitag, McCallum,
    Mitchell, Nigam, and Slattery
  • Presenter: Ping Zou
  • Key Strengths
  • Idea: build probabilistic model of web documents
    using keywords that matter
  • Use probabilistic model to represent knowledge
    for indexing into web database
  • Key Weaknesses
  • How to account for concept drift?
  • How to explain and express constraints (e.g.,
    proper nouns that are person names don't
    matter)? Not considered here
  • Future Research Issues
  • Using natural language processing (NLP), image /
    audio / signal processing
  • Applications: searchable hypermedia, digital
    libraries, spiders, other agents
  • See work by McCallum, Mitchell, Roth, Sahami,
    Pratt, Lee

23
Class 36: Introduction to Evolutionary Computation
  • Architectures
  • Genetic algorithms (GAs), genetic programming
    (GP), genetic wrappers
  • Simple vs. parameterless GAs
  • Issues
  • Loss of diversity
  • Consequence: collapse of Pareto front
  • Solutions: niching (sharing, preselection,
    crowding)
  • Parameterless GAs
  • Other issues (not covered): genetic drift,
    population sizing, etc.
  • References: Chapter 9, Mitchell; Chapters 1-6,
    Goldberg; Chapters 1-5, Koza
  • Research Issues: How to
  • Design GAs based on credit assignment system (in
    performance element)
  • Build hybrid analytical / inductive learning GP
    systems
  • Use GAs to perform relevance determination in KDD
  • Control diversity in GA solutions for hard
    optimization problems
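
Niching by fitness sharing, one of the diversity-preserving solutions listed above, can be sketched as follows: each individual's raw fitness is divided by a niche count that grows with the number of close neighbors. The triangular sharing function is the commonly used form; the parameter values, the one-dimensional population, and the function names are assumptions for illustration.

```python
def sharing(distance, sigma_share, alpha=1.0):
    """Sharing function: 1 at distance 0, falling to 0 at sigma_share."""
    if distance >= sigma_share:
        return 0.0
    return 1.0 - (distance / sigma_share) ** alpha


def shared_fitness(population, raw_fitness, distance_fn, sigma_share):
    """Divide each raw fitness by the individual's niche count."""
    shared = []
    for i, xi in enumerate(population):
        # The niche count includes the individual itself, so it is at least 1.
        niche_count = sum(sharing(distance_fn(xi, xj), sigma_share)
                          for xj in population)
        shared.append(raw_fitness[i] / niche_count)
    return shared


# Hypothetical population: two crowded individuals and one isolated individual.
population = [0.0, 0.1, 5.0]
raw = [10.0, 10.0, 10.0]
result = shared_fitness(population, raw, lambda a, b: abs(a - b), sigma_share=1.0)
print(result)
```

Because crowded individuals are penalized, selection pressure is spread across niches instead of collapsing the population onto one region.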

24
Classes 37-38: Genetic Algorithms and Classifier
Systems
  • Paper
  • Topic: Classifier Systems and Inductive Learning
  • Title: Generalization in the XCS Classifier
    System
  • Author: Wilson
  • Presenter: Elizabeth Loza-Garay
  • Key Strengths
  • Idea: incorporate performance element (classifier
    system) into GA design
  • Solid theoretical foundation: advanced building
    block (aka schema) theory
  • Can use to engineer more efficient GA model, tune
    parameters
  • Key Weaknesses
  • Need to progress from toy problems (e.g., MUX
    learning) to real-world ones
  • Need to investigate scaling up of GA principles
    (e.g., building block mixing)
  • Future Research Issues
  • Building block scalability in classifier systems
  • Applications: reinforcement learning, mobile
    robotics, other animats, a-life
  • See work by Wilson, Goldberg, Holland, Booker

25
Classes 39-40: Knowledge-Based Genetic Programming
  • Paper
  • Topic: Genetic Programming and Multistrategy
    Learning
  • Title: Genetic Programming and Deductive-Inductive
    Learning: A Multistrategy Approach
  • Authors: Aler, Borrajo, and Isasi
  • Presenter: Yuhong Cheng
  • Key Strengths
  • Idea: use knowledge-based system to calibrate
    starting state of MCMC optimization system (here,
    GP)
  • Can incorporate knowledge (as in CIS830 Part 1 of
    5)
  • Key Weaknesses
  • Generalizability of HAMLET population seeding
    method not well established
  • General-purpose problem solving systems can
    become Rube Goldberg-ian
  • Future Research Issues
  • Using multistrategy GP systems to provide
    knowledge-based decision support
  • Applications: logistics (military, industrial,
    commercial), other problem solving
  • See work by Aler, Borrajo, Isasi, Carbonell,
    Minton, Koza, Veloso

26
Classes 41-42: Genetic Wrappers for Inductive
Learning
  • Paper
  • Topic: Genetic Wrappers for KDD Performance
    Enhancement
  • Title: Simultaneous Feature Extraction and
    Selection Using a Masking Genetic Algorithm
  • Authors: Raymer, Punch, Goodman, Sanschagrin,
    Kuhn
  • Presenter: Karthik K. Krishnakumar
  • Key Strengths
  • Idea: use GA to empirically (statistically)
    validate inducer
  • Can use to select, synthesize attributes (aka
    features)
  • Can also use to tune other GA parameters (hence
    "wrapper")
  • Key Weaknesses
  • Systematic experimental studies of genetic
    wrappers have not yet been done
  • Wrappers don't yet take performance element into
    explicit account
  • Future Research Issues
  • Improving supervised learning inducers (e.g., in
    MLC++)
  • Applications: better combiners; feature subset
    selection, construction
  • See work by Raymer, Punch, Cherkauer, Shavlik,
    Freitas, Hsu, Cantu-Paz
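
The wrapper idea can be sketched as a small genetic algorithm over binary feature masks, where the fitness of a mask would normally come from validating an inducer trained on the masked features. The sketch below is not the paper's algorithm: the encoding, the truncation selection, the elitism, the parameter defaults, and the toy fitness function are all assumptions chosen to keep the example short.

```python
import random


def masking_ga(n_features, fitness, pop_size=20, generations=30,
               p_mut=0.1, seed=0):
    """Evolve a 0/1 feature mask; returns the best mask and best-so-far history."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        next_pop = [pop[0][:]]  # elitism: carry the best mask forward unchanged
        while len(next_pop) < pop_size:
            a, b = rng.sample(pop[:pop_size // 2], 2)  # parents from the top half
            cut = rng.randrange(1, n_features)         # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


# Hypothetical stand-in for inducer validation: features 0 and 2 carry signal.
RELEVANT_FEATURES = {0, 2}


def toy_fitness(mask):
    hits = sum(mask[i] for i in RELEVANT_FEATURES)
    noise = sum(bit for i, bit in enumerate(mask) if i not in RELEVANT_FEATURES)
    return hits - 0.25 * noise


best_mask, history = masking_ga(5, toy_fitness)
print(best_mask, history[-1])
```

In a real wrapper, `toy_fitness` would be replaced by a call that trains and validates the inducer on the selected features, which is where most of the computational cost lies.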

27
Classes 43-44: Genetic Algorithms for Optimization
  • Paper
  • Topic: Genetic Optimization and Decision Support
  • Title: A Niched Pareto Optimal Genetic Algorithm
    for Multiobjective Optimization
  • Authors: Horn, Nafpliotis, and Goldberg
  • Presenter: Li Lian
  • Key Strengths
  • Idea: control representation of neighborhoods
    Pareto optimal front by niching
  • Gives abstract and concrete case studies of
    niching (sharing) effects
  • Key Weaknesses
  • Need systematic exploration, characterization of
    "sweet spot"
  • Shows static comparisons, not small-multiple
    visualizations that led to them
  • Future Research Issues
  • Biologically (ecologically) plausible models
  • Applications: engineering (ag / bio, civil,
    computational, environmental, industrial,
    mechanical, nuclear) optimization; computational
    life sciences
  • See work by Goldberg, Horn, Schwefel, Punch,
    Minsker, Kargupta
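
The Pareto optimal front referred to above is the set of solutions that no other solution dominates. A minimal sketch, assuming every objective is to be maximized and using made-up objective vectors:

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(points):
    """Points not dominated by any other point (all objectives maximized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-objective scores for five candidate solutions.
points = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1)]
print(pareto_front(points))
```

Niching then aims to keep the population spread along this front rather than clustered at one trade-off.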

28
Class 45: Meta-Summary
  • Data Mining / KDD Problems
  • Business decision support
  • Classification
  • Recommender systems
  • Control and policy optimization
  • Data Mining / KDD Solutions: Machine Learning,
    Inference Techniques
  • Models
  • Version space, decision tree, perceptron, winnow
  • ANN, BBN, SOM
  • Q functions
  • GA building blocks (schemata), GP building
    blocks
  • Algorithms
  • Candidate elimination, ID3, delta rule, MLE,
    Simple (Naïve) Bayes
  • K2, EM, backprop, SOM convergence, LVQ, ADP,
    simulated annealing
  • Q-learning, TD(λ)
  • Simple GA, GP