Learning Methodology - PowerPoint PPT Presentation

About This Presentation
Title:

Learning Methodology


Slides: 33
Provided by: randyz

Transcript and Presenter's Notes

Title: Learning Methodology


1
Lecture 1
  • Learning Methodology
  • (Chapter 1, Cristianini & Shawe-Taylor)

2
Basics of Machine Learning
  • We want to use a computer algorithm to classify
    data. (Is this sequence likely to contain an
    intron splice site?)
  • Classification is based on descriptors, variables
    which measure the observations.
  • Descriptors can be continuous (GC content) or
    categorical (sequence character at position n of
    the sequence).
  • We visualize the possible values of the
    descriptors as labeling the axes of a coordinate
    system. Each observation maps to a single point
    in this descriptor space.

3
Machine Learning (cont)
  • Thought question - how can we code a nucleotide
    sequence ten characters long into a descriptor
    space?
  • Can we visualize that space? Why or why not?
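
One possible answer to the thought question, offered as a sketch rather than the intended solution: give each sequence position four 0/1 descriptors, one per nucleotide (a "one-hot" coding). Ten characters then map to a point in a 40-dimensional space, which is why it cannot be drawn directly. The class and method names are hypothetical.

```java
// Sketch: one-hot coding of a nucleotide sequence (an assumed scheme, not from the slides).
class SequenceEncoder {
    private static final String ALPHABET = "ACGT";

    // Maps a sequence of length n to 4*n categorical descriptors, each 0 or 1.
    static double[] oneHot(String sequence) {
        double[] descriptors = new double[4 * sequence.length()];
        for (int i = 0; i < sequence.length(); i++) {
            int code = ALPHABET.indexOf(Character.toUpperCase(sequence.charAt(i)));
            if (code < 0) {
                throw new IllegalArgumentException("Not a nucleotide: " + sequence.charAt(i));
            }
            descriptors[4 * i + code] = 1.0;
        }
        return descriptors;
    }

    // A continuous descriptor: the fraction of characters that are G or C.
    static double gcContent(String sequence) {
        int gc = 0;
        for (char c : sequence.toUpperCase().toCharArray()) {
            if (c == 'G' || c == 'C') {
                gc++;
            }
        }
        return sequence.isEmpty() ? 0.0 : (double) gc / sequence.length();
    }

    public static void main(String[] args) {
        String seq = "ACGTACGTAC";
        System.out.println("Dimensions: " + oneHot(seq).length);
        System.out.println("GC content: " + gcContent(seq));
    }
}
```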

4
Machine Learning (cont)
  • Once observations are mapped to points in our
    descriptor space, we want to classify them, or
    report some measure that describes/summarizes the
    data.
  • Requires a target function that maps from a data
    point (which lives in our descriptor space) to
    some target space (usually the real line). The
    target function usually provides some numerical
    measure of the likelihood that a data point
    represents a member of a particular category.

5
Machine Learning (cont)
  • The target function is usually supplemented with
    an additional function, the decision function,
    that makes a final assignment of an observation
    to a category.
  • Typically this is achieved by applying a
    threshold to the target function. In most cases,
    if the target function is sufficiently positive
    when fed a particular data point, that point is
    then placed in the category associated with the
    target/decision function.
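
The pairing of a target function with a thresholding decision function can be sketched as follows. The linear form of the target function, the weights, and the threshold are all assumptions chosen for illustration; the slides do not commit to any particular form.

```java
// Sketch: a linear target function plus a threshold decision function.
// The weights, bias, and threshold are made-up values.
class ThresholdClassifier {
    private final double[] weights;
    private final double bias;
    private final double threshold;

    ThresholdClassifier(double[] weights, double bias, double threshold) {
        this.weights = weights;
        this.bias = bias;
        this.threshold = threshold;
    }

    // Target function: maps a point in descriptor space to the real line.
    double target(double[] x) {
        double score = bias;
        for (int i = 0; i < weights.length; i++) {
            score += weights[i] * x[i];
        }
        return score;
    }

    // Decision function: assigns the category when the score exceeds the threshold.
    boolean decide(double[] x) {
        return target(x) > threshold;
    }

    public static void main(String[] args) {
        ThresholdClassifier c = new ThresholdClassifier(new double[] {1.0, -1.0}, 0.0, 0.5);
        double[] point = {2.0, 1.0};
        System.out.println("Target value: " + c.target(point));
        System.out.println("In category:  " + c.decide(point));
    }
}
```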

6
Machine Learning (cont)
  • A hypothesis is a particular choice for the
    target function. Our goal is to find a hypothesis
    (possibly coupled with a decision function) which
    most accurately categorizes data.
  • Often, we already have in mind the form of the
    target function. The function will always include
    one or more parameters that define its specific
    shape. A particular choice of parameters then
    constitutes a single hypothesis.
  • We often speak of working with a space of
    hypotheses, and of picking the best hypothesis
    from this space.

7
Machine Learning (cont)
  • Example - orbit of a satellite
  • In our context, the hypothesis is not "Orbits
    are elliptical", but rather "My satellite has
    an orbit with semi-major axis of 3.5 million
    miles and eccentricity 0.1".
  • Graphic from Windows on the Universe,
    http://www.windows.ucar.edu/

8
Machine Learning (cont)
  • Learning means developing one or more target
    functions that can be used to describe or
    categorize our data in some useful way.
  • One or more target functions, possibly in concert
    with decision functions, can implement:
  • Binary classification
  • Multi-class classification
  • Regression
  • For example, we might develop a machine learning
    algorithm that could predict the semi-major axis
    of the orbit of a satellite, and its
    eccentricity, from a series of 5 observations of
    the position of the satellite.

9
Machine Learning (cont)
  • Thought question
  • In the satellite example,
  • What might the descriptors be? Are they
    continuous or categorical descriptors?
  • What are the target functions? What category of
    classification is this?
  • Are there any decision functions in this example?
    If not, how might we introduce one?
  • As observations are added, how does the space of
    possible hypotheses change?

10
Machine Learning (cont)
  • How does a machine learn?
  • In this course we will focus on supervised
    learning
  • Training data - used to adjust the target
    function implemented by the learning method
  • Generalization - After training, how well does
    our algorithm classify observations not included
    in the training set?
  • A machine learning algorithm may perfectly
    classify its training data, yet be totally
    useless in practice. It must be able to
    generalize!

11
Machine Learning (cont)
  • In our satellite example, the training data might
    consist of a series of sets of five measurements
    of position for a collection of known objects
    circling the Sun (planets, asteroids, comets,
    etc). One set of five measurements corresponds to
    a single object, and comprises a single
    observation in constructing our model.
  • To test our model, we would apply it to a series
    of objects not included in our training set, but
    each with a known orbit. This is the validation
    set.
  • If the model accurately predicts the orbits of
    the objects in the validation set, we say it
    generalizes well. We can compute some metrics
    that measure just how well the model performs.
  • If the model does not generalize, it is of little
    use!
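
The train-then-validate cycle can be sketched with a much simpler problem than the satellite example: one descriptor, two categories, and a hypothesis space consisting of a few candidate thresholds. All data values and names below are invented for illustration.

```java
// Sketch: pick the best hypothesis on training data, then measure
// generalization on a separate validation set. All numbers are invented.
class ValidationDemo {
    // Fraction of observations for which "x > threshold" matches the known label.
    static double accuracy(double threshold, double[] xs, boolean[] labels) {
        int correct = 0;
        for (int i = 0; i < xs.length; i++) {
            if ((xs[i] > threshold) == labels[i]) {
                correct++;
            }
        }
        return (double) correct / xs.length;
    }

    // Training: choose the candidate threshold with the highest training accuracy.
    static double train(double[] candidates, double[] xs, boolean[] labels) {
        double best = candidates[0];
        for (double t : candidates) {
            if (accuracy(t, xs, labels) > accuracy(best, xs, labels)) {
                best = t;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] trainX = {0.1, 0.3, 0.6, 0.9};
        boolean[] trainY = {false, false, true, true};
        double[] validX = {0.2, 0.4, 0.7, 0.8};
        boolean[] validY = {false, false, true, true};
        double[] candidates = {0.0, 0.25, 0.5, 0.75};

        double chosen = train(candidates, trainX, trainY);
        System.out.println("Chosen threshold:    " + chosen);
        System.out.println("Training accuracy:   " + accuracy(chosen, trainX, trainY));
        System.out.println("Validation accuracy: " + accuracy(chosen, validX, validY));
    }
}
```

Accuracy is only one possible metric; for a regression problem such as the orbit prediction, an error measure on the predicted values would take its place.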

12
Machine Learning (cont)
  • Computational Complexity
  • Time complexity: How does compute time scale with
    the size of the problem?
  • Polynomial Time - algorithm scales in accordance
    with a polynomial function of problem size.
  • Exponential Time - algorithm scales exponentially
    with the problem size. Intractable.
  • Size complexity: How does memory usage scale with
    the size of the problem?

13
Machine Learning (cont)
  • NP-complete problems
  • The Traveling Salesman problem is a good example
    of an NP-complete problem
  • How does the time needed to generate the
    candidate solutions scale with problem size?
  • How does the time needed to evaluate a solution
    vary with problem size? How about the overall
    compute time?
  • Solve one problem in the NP-Complete class in
    polynomial time and you've done it for all of
    them!
  • No one has figured out how to do that.
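
The two scaling questions above can be made concrete with a brute-force search, sketched here under the assumption that we simply try every tour. Evaluating a single tour takes time linear in the number of cities, but the number of candidate tours grows factorially: (n-1)! when the starting city is fixed. The class name and the four-city distance table are made up for the example.

```java
// Sketch: brute-force Traveling Salesman search over all tours that start at city 0.
class BruteForceTsp {
    static long toursExamined;

    // Evaluating one tour is linear in the number of cities.
    static double tourLength(int[] order, double[][] dist) {
        double total = 0.0;
        for (int i = 0; i < order.length; i++) {
            total += dist[order[i]][order[(i + 1) % order.length]];
        }
        return total;
    }

    // Tries every arrangement of order[k..]; returns the shortest tour length found.
    static double search(int[] order, int k, double[][] dist) {
        if (k == order.length) {
            toursExamined++;
            return tourLength(order, dist);
        }
        double best = Double.POSITIVE_INFINITY;
        for (int i = k; i < order.length; i++) {
            swap(order, k, i);
            best = Math.min(best, search(order, k + 1, dist));
            swap(order, k, i); // restore before trying the next choice
        }
        return best;
    }

    static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    static double shortestTour(double[][] dist) {
        int[] order = new int[dist.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        toursExamined = 0;
        return search(order, 1, dist); // city 0 stays fixed as the start
    }

    public static void main(String[] args) {
        double d = Math.sqrt(2.0); // diagonal of a unit square
        double[][] dist = {
            {0, 1, d, 1},
            {1, 0, 1, d},
            {d, 1, 0, 1},
            {1, d, 1, 0}
        };
        System.out.println("Shortest tour:  " + shortestTour(dist));
        System.out.println("Tours examined: " + toursExamined);
    }
}
```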

14
Machine Learning (cont)
  • Algorithmic Complexity
  • Can be measured by the length of the algorithm
    itself, measured in bytes.
  • We may imagine compressing the algorithm before
    measuring its length
  • Big algorithms generalize poorly. They
    typically handle many special cases derived from
    the training set. In the worst scenario, the
    algorithm simply stores all the training examples
    and spits them back on cue! (Rote learning)
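
The worst case described above, rote learning, can be sketched in a few lines. The class name and the sample observations are hypothetical. The learner is perfect on anything it has stored and has no answer at all for anything else.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: a rote learner that memorizes its training examples.
class RoteLearner {
    private final Map<String, Boolean> memory = new HashMap<>();

    // "Training" is just storing the observation with its label.
    void train(String observation, boolean label) {
        memory.put(observation, label);
    }

    // Returns the stored label, or null for anything not seen during training.
    Boolean classify(String observation) {
        return memory.get(observation);
    }

    public static void main(String[] args) {
        RoteLearner learner = new RoteLearner();
        learner.train("ACGT", true);
        learner.train("TTTT", false);
        System.out.println("Seen in training: " + learner.classify("ACGT"));
        System.out.println("Never seen:       " + learner.classify("ACGA"));
    }
}
```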

15
Machine Learning (cont)
  • Machine learning methods we will explore in this
    course
  • Neural networks
    (http://www.comp.rgu.ac.uk/staff/nc/Files/Internal/SummaryBANN.htm)
  • Hidden Markov models
    (http://en.wikipedia.org/wiki/Image:MarkovModel.png)
  • Support vector machines

16
Machine Learning (cont)
  • Applications
  • Neural networks to identify ribosome binding
    sites in bacterial sequences
  • Hidden Markov models to identify bacterial
    promoters
  • Support vector machines to predict protein
    secondary structure

17
Lecture 1
  • Introduction to Java
  • (Chapters 1-2, Lewis & Loftus)

18
Introduction to Java
  • Developed as a cross-platform language
  • Depends upon a Java Virtual Machine (JVM), a
    virtual processor that must be implemented
    separately for each target platform
    (Intel/Windows, Intel/Linux, Sun/SPARC, Apple/OS
    X, etc)
  • The JVM runs an instruction set called bytecode.
    Java programs are compiled into bytecode, and
    will then run (in theory at least) on any
    supported platform.

19
Introduction to Java (cont)
  • Java is an object-oriented language. It supports
    objects, which are defined by classes. Java
    classes encapsulate both data and methods
    (functions which operate on class data).
  • Java is an event-driven language. It is expected
    that Java applications will be run in a graphical
    environment, and there is extensive support for
    capturing mouse events and creating live
    graphical objects such as buttons and sliders.

20
Introduction to Java (cont)
  • Java plays well with web browsers. A Java
    application can be written as an applet, and can
    be loaded and run from within a browser.
  • Java incorporates special security features which
    provide various levels of protection which depend
    upon how an application is run. Applets have more
    restricted access to system resources than
    standalone applications.

21
Introduction to Java (cont)
  • Java is supported by a number of APIs
    (Application Programming Interfaces). These are
    software toolkits which provide extensive support
    for a broad variety of graphical objects and data
    types. APIs are available to construct menus, to
    handle text editing, for image and sound
    processing, and for development of
    special-purpose applications (e.g. neural
    networks!)

22
Introduction to Java (cont)
  • Fundamental concepts of software and hardware
  • Review sections 1.1-1.3 of the text as needed.
  • Syntax. We will start with the example program,
    Listing 1.1.
  • All computers should have available the editor
    SubEthaEdit. You may use this for editing
    programs. (You may also use Xcode, provided in OS
    X, although this provides a sophisticated
    environment intended more for experienced
    programmers.)
  • Once you have created your code, in a file named
    Lincoln.java, you compile it like this:
    OS X> javac Lincoln.java
  • The compiled code is in Lincoln.class
  • To run it, type:
    OS X> java Lincoln

23
Introduction to Java (cont)
  • Key concepts
  • Identifiers (names for variables, functions,
    etc). Distinguish
  • Those chosen by the programmer
  • Those chosen by other programmers
  • Reserved words
  • Comment styles
  • Use of whitespace

24
Introduction to Java (cont)
  • Object-oriented programming
  • Objects encapsulate data and the methods that
    manipulate the data. They are often
    representations, in software, of real-world
    objects.
  • Attributes are the variables that describe the
    state of an object. They may be primitives (like
    floats or ints), or objects.
  • An object is defined using a class. A class is a
    blueprint to make objects. An object constructed
    from a class is said to be an instance of the
    class.
  • We can define a new class based on an existing
    class by using inheritance. The new class
    inherits all the attributes and methods of the
    parent, but may add more of its own.
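
A minimal sketch of these ideas, using hypothetical class names borrowed from the satellite example: a class with attributes and methods, an instance built from it, and a subclass that inherits from it and adds an attribute of its own.

```java
// Sketch: a class, an instance, and inheritance. All names are hypothetical.
class Satellite {
    // Attributes: the variables that describe the state of the object.
    protected final String name;
    protected final double eccentricity;

    Satellite(String name, double eccentricity) {
        this.name = name;
        this.eccentricity = eccentricity;
    }

    // Methods operate on the object's own data.
    boolean isCircular() {
        return eccentricity == 0.0;
    }

    String describe() {
        return name + " (e = " + eccentricity + ")";
    }
}

// Inherits the attributes and methods of Satellite and adds one of its own.
class CrewedSatellite extends Satellite {
    private final int crewSize;

    CrewedSatellite(String name, double eccentricity, int crewSize) {
        super(name, eccentricity);
        this.crewSize = crewSize;
    }

    int getCrewSize() {
        return crewSize;
    }
}

class OopDemo {
    public static void main(String[] args) {
        // "station" is an instance of CrewedSatellite, and therefore also a Satellite.
        CrewedSatellite station = new CrewedSatellite("Station", 0.0, 3);
        System.out.println(station.describe());
        System.out.println("Circular orbit: " + station.isCircular());
        System.out.println("Crew size: " + station.getCrewSize());
    }
}
```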

25
Introduction to Java (cont)
  • Chapter 2 (Data and expressions).
  • Java primitives are basically the same as C
    variable types.
  • Declarations of int, float, double are just like
    in C. The final keyword can be placed in front of
    a primitive to declare it as a constant.
  • Assignment statements, using =, are just like in
    C.
  • Increment (++), decrement (--) and combined
    operators (+=, -=, *=, /=) are available just as
    in C.
  • Operator precedence is just like in C.
  • Type casting is just like in C.
  • Review sections 2.2-2.5 to refresh yourself on
    these basic ideas.
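
The points above can be checked with a short sketch. The class name, the constant, and the values are arbitrary choices for illustration.

```java
// Sketch: primitives, a final constant, combined operators, precedence, and casting.
class PrimitivesDemo {
    // final marks this primitive as a constant; it cannot be reassigned.
    static final double MILES_PER_KM = 0.621371;

    static int operators() {
        int count = 5;
        count++;      // increment: 6
        count--;      // decrement: 5
        count += 10;  // 15
        count -= 3;   // 12
        count *= 2;   // 24
        count /= 5;   // integer division: 4
        return count;
    }

    // Multiplication binds more tightly than addition, as in C.
    static int precedence() {
        return 2 + 3 * 4;
    }

    // Casting a double to an int discards the fractional part.
    static int truncate(double value) {
        return (int) value;
    }

    public static void main(String[] args) {
        System.out.println("operators():    " + operators());
        System.out.println("precedence():   " + precedence());
        System.out.println("truncate(3.99): " + truncate(3.99));
        System.out.println("10 km in miles: " + 10 * MILES_PER_KM);
    }
}
```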

26
Introduction to Java (cont)
  • String Handling
  • The class String is used to create and manipulate
    text strings.
  • Type in, compile and execute the programs in
    Listings 2.1, 2.2 and 2.3.
  • Key points
  • Understand the use of + for string
    concatenation. This is an example of operator
    overloading.
  • Understand how the library utilities
    System.out.println and System.out.print differ.
  • Understand the tricky example presented in 2.3!
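
The textbook listings are not reproduced in these slides, so the following is only a sketch of the two key points, with made-up strings. It shows that + concatenates from left to right once a String is involved, so parentheses are needed to force arithmetic first, and that print leaves the cursor on the same line while println ends the line.

```java
// Sketch: + as string concatenation, and print versus println.
class ConcatDemo {
    // Evaluated left to right: ("Digits: " + 2) + 3, so the digits are appended as text.
    static String concatenated() {
        return "Digits: " + 2 + 3;
    }

    // The parentheses force the integer addition to happen first.
    static String added() {
        return "Sum: " + (2 + 3);
    }

    public static void main(String[] args) {
        System.out.print("print does not end the line, ");
        System.out.println("println does.");
        System.out.println(concatenated());
        System.out.println(added());
    }
}
```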

27
Introduction to Java (cont)
  • Take this opportunity to learn more about the
    String class.
  • Point your browser to http://java.sun.com ->
    Reference/API Specifications -> J2SE 1.5.0
  • Choose java.lang -> String
  • Find methods that will allow you to
  • Return the fourth character of a string
  • Compare two strings for equality, case
    insensitive
  • Convert a string to all upper case
  • Convert an integer to a String
  • Try out these methods in your test code!
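
For comparison once you have searched the API yourself, here is one set of java.lang.String methods that fits the four tasks. Other methods would also work (for example Integer.toString for the last task); the wrapper class and method names are hypothetical.

```java
// Sketch: one choice of String methods for the four tasks above.
class StringMethodsDemo {
    // Indexing starts at 0, so the fourth character is at index 3.
    static char fourthChar(String s) {
        return s.charAt(3);
    }

    static boolean sameIgnoringCase(String a, String b) {
        return a.equalsIgnoreCase(b);
    }

    static String upper(String s) {
        return s.toUpperCase();
    }

    static String fromInt(int n) {
        return String.valueOf(n);
    }

    public static void main(String[] args) {
        System.out.println(fourthChar("ACGTAC"));
        System.out.println(sameIgnoringCase("acgt", "ACGT"));
        System.out.println(upper("acgt"));
        System.out.println(fromInt(42));
    }
}
```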

28
Introduction to Java (cont)
  • Sec. 2.6. The Scanner class allows you to easily
    parse a line of text with tokens separated by
    whitespace (the default).
  • Look at the list of methods, Fig. 2.7.
  • Try out the program shown in Listing 2.9.
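  • Notice how methods in a class are accessed:
    scan = Scanner.create(System.in);
    message = scan.nextLine();
    Here Scanner is the class name, create is a class
    method, scan is the instance name, and nextLine
    is an instance method. (In the released J2SE 5.0
    and later, a Scanner is constructed with
    new Scanner(System.in) rather than
    Scanner.create.)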
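
A small sketch of Scanner's default whitespace tokenizing, written against the released java.util.Scanner API, which uses constructors. The class name and the helper method are hypothetical, and this is not the textbook's Listing 2.9.

```java
import java.util.Scanner;

// Sketch: parsing whitespace-separated tokens with java.util.Scanner.
class ScannerDemo {
    // Counts the tokens in a line, using the default whitespace delimiter.
    static int countTokens(String line) {
        Scanner scan = new Scanner(line);
        int count = 0;
        while (scan.hasNext()) {
            scan.next();
            count++;
        }
        scan.close();
        return count;
    }

    public static void main(String[] args) {
        // Scanner is the class; scan is an instance; nextLine() is an instance method.
        Scanner scan = new Scanner(System.in);
        System.out.println("Enter a line of text:");
        if (scan.hasNextLine()) {
            String message = scan.nextLine();
            System.out.println("You entered " + countTokens(message) + " token(s).");
        }
    }
}
```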
29
Introduction to Java (cont)
  • 2.7 Graphics. Here we learn a little about
    drawing in a Java program. Main points
  • Pixels as elements of graphical representations
  • Coordinate system assumed in Java
  • Use of the Color class; the RGB color system
  • How class variables are shown: Color.blue
  • Color.blue is a static class variable that always
    points at a predefined instance of the Color
    class with RGB components R=0, G=0, B=1 (that is,
    0, 0, 255 on the 0-255 scale). Here Color is the
    class name and blue is the class variable name.
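
The components of Color.blue can be read back directly, as in this sketch. The wrapper class and method are hypothetical; java.awt.Color and its getRed, getGreen, and getBlue methods are part of the standard library and report values on the 0-255 scale.

```java
import java.awt.Color;

// Sketch: inspecting the RGB components of a predefined Color instance.
class ColorDemo {
    // Returns {red, green, blue}, each in the range 0-255.
    static int[] components(Color c) {
        return new int[] {c.getRed(), c.getGreen(), c.getBlue()};
    }

    public static void main(String[] args) {
        // Color.blue is accessed through the class name, not through an instance.
        int[] rgb = components(Color.blue);
        System.out.println("R=" + rgb[0] + " G=" + rgb[1] + " B=" + rgb[2]);
    }
}
```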
30
Introduction to Java (cont)
  • 2.8 Graphics class
  • The Graphics class supplies many methods for
    drawing shapes and text.
  • An instance of the Graphics class contains as an
    attribute a drawing area (technically an instance
    of the Component class) on which to draw. Drawing
    takes place in the coordinate system of this
    associated drawing area.
  • Figure 2.12 shows a collection of useful methods
    from this class.
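
To experiment with Graphics methods without a window or applet, one option is to draw into an off-screen image. This sketch assumes a java.awt.image.BufferedImage as the drawing area, which is a substitution for the Component described above; the sizes and coordinates are arbitrary. It shows that the origin is the top-left corner, with x increasing to the right and y increasing downward.

```java
import java.awt.Color;
import java.awt.Graphics;
import java.awt.image.BufferedImage;

// Sketch: drawing with a Graphics instance whose drawing area is an off-screen image.
class GraphicsDemo {
    static BufferedImage draw() {
        BufferedImage image = new BufferedImage(100, 50, BufferedImage.TYPE_INT_RGB);
        Graphics page = image.getGraphics();
        page.setColor(Color.blue);
        // fillRect(x, y, width, height): covers x = 10..29 and y = 5..14.
        page.fillRect(10, 5, 20, 10);
        page.dispose();
        return image;
    }

    public static void main(String[] args) {
        BufferedImage image = draw();
        // getRGB packs alpha, red, green, blue into one int; mask off the alpha byte.
        System.out.printf("Inside the rectangle:  %06X%n", image.getRGB(15, 8) & 0xFFFFFF);
        System.out.printf("Outside the rectangle: %06X%n", image.getRGB(50, 40) & 0xFFFFFF);
    }
}
```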

31
Introduction to Java (cont)
  • 2.8 Java Applets
  • Applets are Java applications meant to be run
    through a web browser. They are typically sent
    from a server to a browser running on a client
    machine (although they can easily be run on a
    local browser). Fig. 2.11 illustrates this
    process.
  • Listing 2.10 shows how to create a simple
    applet. Let's implement it!
  • To run it, you need to make a short HTML
    document. Let's call it Einstein.html:
    <html>
    <applet code="Einstein.class" width="350"
    height="175"></applet>
    </html>

32
Introduction to Java (cont)
  • Applets, cont.
  • Notice that the applet HTML tag specifies a size
    for the area in which to draw. A Graphics
    instance with a drawing component of this size is
    created by the browser when the applet is run.
    This instance is passed to the paint() method as
    its argument.
  • To test your applet, you can
  • Open the file Einstein.html from within a
    browser, or
  • Use an applet viewer:
    OS X> appletviewer Einstein.html
  • For fun, implement and run Listing 2.11.