1
Neural Networks
2
Learning Objectives
  • Understand the principles of neural networks
  • Understand the backpropagation algorithm

3
Principles of Neural Networks
  • A Neural Network (NN) is an Artificial Neural
    Network (ANN) based on the analogy of the brain
    as a network of neurons: a neuron is a brain
    cell capable of collecting electric signals,
    processing them, and disseminating them.
  • Synonyms: connectionist networks, connectionism,
    neural computation, parallel distributed
    processing.
  • Among the most effective machine learning methods
    for interpreting complex real-world sensor data,
    for example recognizing hand-written characters
    (LeCun), spoken words (Lang), or faces (Cottrell).

4
Principles of Neural Networks
  • Biological background: the human brain contains a
    network of 10^11 interconnected neurons, each with
    a high number of connections (about 10^4)
  • Fastest neuron response time is 10^-3 seconds, yet
    the brain is capable of fast decisions (10^-1 s to
    recognize one's mother); this speed of response can
    be explained by massively parallel processing
  • ANNs imitate real neurons imperfectly: many
    characteristics of real neural networks are not /
    cannot be reproduced in ANNs

5
Principles of Neural Networks
  [Diagram: unit j receives activities y_i through
   weights w_{i,j}, plus a bias input y_0 = -1 through
   weight w_{0,j}; the input function sums them into
   x_j, the activation function f is applied, and the
   output is y_j = f(x_j).]
  • Mathematical model for a neuron

6
Principles of Neural Networks
  • Bias weight: w_{0,i} is the bias, or threshold, of
    the unit, and is associated with a fixed input
    activity of y_0 = -1
  • Criteria for the activation function:
  • The unit should be active (near 1) when the right
    input arrives and inactive (near 0) when other
    inputs arrive
  • Nonlinear function
  • Threshold function
  • Sigmoid function: 1 / (1 + e^-x) (see the sketch
    below)
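
Written as code, the two activation functions above look like this minimal Python sketch (not part of the original slides):

```python
import numpy as np

def threshold(x, theta=0.0):
    """Hard threshold: active (1) when the input reaches theta, else 0."""
    return np.where(x >= theta, 1.0, 0.0)

def sigmoid(x):
    """Smooth, differentiable alternative: 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + np.exp(-x))
```

The sigmoid is usually preferred for learning because its derivative, sigmoid(x) * (1 - sigmoid(x)), exists everywhere, which the gradient-based update rules on the later slides rely on.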

7
Principles of Neural Networks
[Plots: the threshold function and the sigmoid function]
8
Principles of Neural Networks
  • Types of neural networks:
  • Feed-forward (acyclic) networks: the output is a
    function of the current input only
  • Recurrent (cyclic) networks: feed outputs back
    into their own inputs
  • Networks are organized in several layers:
  • Input layer of input units
  • One or more layers of hidden units
  • Output layer of output units

9
Principles of Neural Networks
  • x5 = f( w3,5 x3 + w4,5 x4 )
       = f( w3,5 f( w1,3 x1 + w2,3 x2 )
            + w4,5 f( w1,4 x1 + w2,4 x2 ) )
    i.e., x5 is a nonlinear function of x1 and x2
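
In code, the same composition reads as follows (a Python sketch; the weight values in `w` are hypothetical placeholders, not from the slides):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x1, x2, w, f=sigmoid):
    """x5 as a composition of unit computations (weights keyed by (src, dst))."""
    x3 = f(w[1, 3] * x1 + w[2, 3] * x2)    # hidden unit 3
    x4 = f(w[1, 4] * x1 + w[2, 4] * x2)    # hidden unit 4
    return f(w[3, 5] * x3 + w[4, 5] * x4)  # output unit 5

# Hypothetical weights, for illustration only.
w = {(1, 3): 0.5, (2, 3): -0.4, (1, 4): 0.3, (2, 4): 0.8,
     (3, 5): 1.0, (4, 5): -0.6}
print(forward(0.2, 0.7, w))
```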

10
Principles of Neural Networks
  [Diagram: two threshold units over inputs x1, x2
   with weights w1 = w2 = 1; bias weight w0 = 0.5
   implements OR, bias weight w0 = 1.5 implements
   AND.]
  • ANNs can represent the boolean functions AND, OR,
    NAND, and NOR, and hence any boolean function,
    with a network two levels deep (see the sketch
    below)
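
A minimal sketch of the OR and AND units with the weights from the diagram (threshold activation, bias input fixed at -1):

```python
def unit(x1, x2, w0, w1=1.0, w2=1.0):
    """Threshold unit: fires when w1*x1 + w2*x2 - w0 >= 0 (bias input -1, weight w0)."""
    return 1 if w1 * x1 + w2 * x2 - w0 >= 0 else 0

def OR(x1, x2):  return unit(x1, x2, w0=0.5)   # fires if at least one input is 1
def AND(x1, x2): return unit(x1, x2, w0=1.5)   # fires only if both inputs are 1

for a in (0, 1):
    for b in (0, 1):
        print(f"{a} {b} | OR={OR(a, b)} AND={AND(a, b)}")
```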

11
Principles of Neural Networks
  • A perceptron is a single-layer feed-forward
    neural network
  • Each output unit is independent of the others

12
Principles of Neural Networks
  • A perceptron (single-layer feed-forward neural
    network) can only represent functions that are
    linearly separable; XOR, for example, is not

13
Principles of Neural Networks
  • Learn by adjusting weights to reduce the error on
    the training set
  • The squared error for an example with input x and
    true output y is
    E = 1/2 Err^2 = 1/2 (y - h_w(x))^2
  • Perform optimization search by gradient descent
  • Simple weight update rule (for a unit with
    activation f and input in = sum_j w_j x_j):
    w_j <- w_j + alpha * Err * f'(in) * x_j
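
A minimal training sketch for a single sigmoid unit under this rule (Python; the learning rate, epoch count, and data are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_unit(X, y, alpha=0.5, epochs=2000, seed=0):
    """Gradient descent on E = 1/2 (y - f(w.x))^2 for one sigmoid unit.
    X: (n, d) inputs with a -1 bias column appended; y: (n,) targets."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.1, size=X.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            out = sigmoid(w @ xi)
            err = yi - out
            # update rule: w_j <- w_j + alpha * Err * f'(in) * x_j
            w += alpha * err * out * (1.0 - out) * xi
    return w

# Learn OR (linearly separable, so a single unit suffices).
X = np.array([[0, 0, -1], [0, 1, -1], [1, 0, -1], [1, 1, -1]], dtype=float)
y = np.array([0, 1, 1, 1], dtype=float)
w = train_unit(X, y)
print(np.round(sigmoid(X @ w)))  # -> [0. 1. 1. 1.]
```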

14
Principles of Neural Networks
  • ANNs can be used for
  • Classification
  • Regression
  • Machine learning terminology
  • Data (d1, t1), , (dn, tn) (pairs
    data/target)
  • Training set / validation set
  • Supervised learning model is fitted to the pairs
    data/target
  • Unsupervised learning the target is not known
  • Classification ? supervised learning
  • Regression ? unsupervised learning

15
Universal Approximation Properties
  • Neural networks can approximate any reasonable
    real function to any degree of precision
    (regression) with a 3-layer network
  • Any boolean function can be represented by a
    multi-layer feed-forward network, since such
    networks are combinations of threshold gates
  • The 3-layer network has x in the input layer, a
    hidden layer of sigmoid units (as large as
    needed), and one layer of linear (identity
    function) output units.

16
Universal Approximation Properties
  • Hypothesis:
  • f is uniformly continuous on [0,1]
  • f can be approximated with a step function g such
    that:
  • g(0) = f(0)
  • g(x) = f(k/n) for x in ((k-1)/n, k/n], k = 1..n
  • The network needs one input unit, one output unit
    receiving a connection from each hidden unit, and
    n+1 hidden threshold units (see the sketch below)
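
A sketch of this construction (Python; the choice of f = sin and n = 100 is only for illustration):

```python
import numpy as np

def step_approx(f, n):
    """g(x) = f(k/n) on ((k-1)/n, k/n], built from n+1 hidden threshold units.
    Unit 0 always fires and carries f(0); unit k fires when x > (k-1)/n and
    carries f(k/n) - f((k-1)/n), so the active weights telescope to f(k/n)."""
    thresholds = np.concatenate(([-1.0], np.arange(n) / n))
    weights = np.concatenate(
        ([f(0.0)], [f(k / n) - f((k - 1) / n) for k in range(1, n + 1)]))
    def g(x):
        active = x > thresholds          # hidden layer of threshold units
        return float(weights @ active)   # linear (identity) output unit
    return g

g = step_approx(np.sin, n=100)
print(g(0.5), np.sin(0.5))  # the gap shrinks as n grows
```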

17
Backpropagation Algorithm
  • Layers are usually fully connected; the number of
    hidden units is typically chosen by hand

18
Backpropagation Algorithm
  • Expressiveness of multilayer perceptrons: all
    continuous functions with 2 layers, all functions
    with 3 layers

19
Backpropagation Algorithm
  • Output layer: same update rule as for the
    single-layer perceptron,
    w_{j,i} <- w_{j,i} + alpha * a_j * Delta_i,
    with Delta_i = Err_i * f'(in_i)
  • Hidden layer: back-propagate the error from the
    output layer,
    Delta_j = f'(in_j) * sum_i ( w_{j,i} * Delta_i )
  • Update rule for weights in the hidden layer:
    w_{k,j} <- w_{k,j} + alpha * a_k * Delta_j

20
Backpropagation Algorithm
  • The squared error on a single example is defined
    as E = 1/2 * sum_i (y_i - a_i)^2
  • where the sum is over the nodes i in the output
    layer.

21
Backpropagation Algorithm
22
Backpropagation Algorithm
  • At each epoch (one cycle through the examples),
    sum the gradient updates for all examples and
    apply them once, as in the sketch below
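
A batch-update sketch of the algorithm for one hidden layer (Python with NumPy; bias terms are omitted for brevity, and the shapes and learning rate are assumptions):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def backprop_epoch(X, Y, W1, W2, alpha=0.1):
    """One epoch of batch backpropagation for a 1-hidden-layer sigmoid network.
    Gradient updates are summed over all examples, then applied once.
    W1: (d, h) input-to-hidden weights; W2: (h, m) hidden-to-output weights."""
    dW1, dW2 = np.zeros_like(W1), np.zeros_like(W2)
    for x, y in zip(X, Y):
        a_h = sigmoid(x @ W1)                    # forward: hidden activations
        a_o = sigmoid(a_h @ W2)                  # forward: output activations
        d_o = (y - a_o) * a_o * (1.0 - a_o)      # output delta: Err * f'(in)
        d_h = (W2 @ d_o) * a_h * (1.0 - a_h)     # hidden delta, back-propagated
        dW2 += np.outer(a_h, d_o)
        dW1 += np.outer(x, d_h)
    W1 += alpha * dW1
    W2 += alpha * dW2
    return W1, W2
```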

23
Backpropagation Algorithm
24
Backpropagation Algorithm
  • Handwritten digit recognition error rates:
  • 3-nearest-neighbor: 2.4% error
  • 400-300-10 unit MLP: 1.6% error
  • LeNet 768-192-30-10 unit MLP: 0.9% error

25
Applications
  • Data clustering
  • Classification
  • Gene Reduction
  • Gene Regulatory Networks

26
Clustering
  • Tamayo et al. (1999) used SOMs to cluster gene
    expressions of yeast and humans.
  • Data: yeast (Saccharomyces cerevisiae) cell-cycle
    data from Spellman et al. (1998), and
    hematopoietic differentiation data
  • SOMs (Self-Organizing Feature Maps, Kohonen) are
    well suited to finding clusters in complex
    multidimensional data (a minimal training sketch
    follows)
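
A minimal Kohonen SOM training sketch (Python; the 6 x 5 grid echoes the paper, but the learning rate, neighbourhood width, and epoch count are illustrative assumptions):

```python
import numpy as np

def train_som(data, grid_w=6, grid_h=5, epochs=50, lr=0.5, sigma=1.5, seed=0):
    """Each grid node holds a prototype vector; for every sample, the
    best-matching node and its grid neighbours move toward the sample."""
    rng = np.random.default_rng(seed)
    nodes = rng.normal(size=(grid_w * grid_h, data.shape[1]))
    coords = np.array([(i, j) for i in range(grid_w) for j in range(grid_h)])
    for _ in range(epochs):
        for x in data:
            bmu = np.argmin(np.linalg.norm(nodes - x, axis=1))   # best match
            dist2 = ((coords - coords[bmu]) ** 2).sum(axis=1)    # grid distance
            h = np.exp(-dist2 / (2 * sigma ** 2))                # neighbourhood
            nodes += lr * h[:, None] * (x - nodes)
    return nodes
```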

27
Clustering
  • Gene expression was measured at 10-minute
    intervals throughout two cell cycles (160
    minutes), giving 16 timesteps
  • The data were first filtered to find the genes
    showing significant variation in expression over
    the time series
  • Gene expression levels were normalized across
    experiments to focus on the shape of the
    patterns, not their magnitude (see the sketch
    below)
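
A sketch of this normalization step (Python; the paper's exact procedure may differ, but zero mean / unit variance per gene is the usual way to compare shape rather than magnitude):

```python
import numpy as np

def normalize_profiles(expr):
    """Normalize each gene's profile (one row = 16 timesteps) to zero mean
    and unit variance, so clustering compares pattern shape, not magnitude."""
    mu = expr.mean(axis=1, keepdims=True)
    sd = expr.std(axis=1, keepdims=True)
    return (expr - mu) / np.where(sd == 0.0, 1.0, sd)
```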

28
Clustering
  • Self-Organizing Maps are the basis of
    GENECLUSTER, developed by the authors to cluster
    and visualize gene expressions
  • A 6 x 5 node SOM was trained on 416 genes from
    yeast cell-cycle gene expressions previously
    analyzed by hand, to compare the SOM clusters
    with the manual ones.
  • Of the 30 clusters found, 4 replicate the four
    cell-cycle stages. The clusters identified by the
    SOM match the clusters built by human experts
    very well, and correspond to the G1, S, G2, and M
    phases of the cell cycle.

29
Clustering
[Figure: SOM-derived clusters vs. human-derived clusters]
30
Classification
  • Cai and Chou (1998) use ANNs to predict HIV
    protease cleavage sites in proteins.
  • Knowing the HIV protease cleavage sites in
    proteins helps in designing specific and
    efficient HIV protease inhibitors.
  • Subject of study: the HIV-1 protease.
  • Training set: 299 oligopeptides. Test set: 63
    oligopeptides. Result: a high rate of correct
    prediction (58/63 = 92.06%).

31
Classification
32
Classification
  • HIV data: 114 positive sequences, 248 negative
    sequences, for a total of 362 sequences; 300
    training cycles of the ANNs.
  • HCV data: 168 positive sequences, 752 negative
    sequences, for a total of 920 sequences; 500
    training cycles of the ANNs.
  • 20% of positives were held out for testing. 10
    different training and test sets were created for
    HCV and HIV using roulette-wheel random selection
    preserving the 20% criterion. Each training/test
    pair was run three times with random
    initialization of the network.

33
Gene Expression Data (GED)
  • GED measures the relative expression levels of
    genes at a single timestep using cDNA or
    Affymetrix chips
  • When individuals are measured only once, a gene
    classificatory network for the population can be
    extracted (see the myeloma data)
  • When individuals are measured more than once
    across time, a gene regulatory network needs to
    be reverse-engineered

34
Gene Reduction
  • Narayanan et al. (2004) use ANNs to analyze
    myeloma gene expressions
  • Goal: by analyzing the genes temporally involved
    in the development of a disease, identify
    patterns of genes that better characterize the
    disease, and design efficient drugs.
  • Drugs can then be designed to target specific
    genes at important points in time.

35
Gene Reduction
  • There are two major problems for current gene
    expression analysis techniques:
  • Dimensionality: the sheer volume of data leads to
    the need for fast analytical tools
  • Sparsity: there are many more genes than samples
  • G = S + C: gene expression analysis (G) is
    concerned with selecting a small subset of
    relevant genes (the S problem) as well as
    combining individual genes to identify important
    causal and classificatory relationships (the C
    problem).

36
Gene Reduction
  • Myeloma data: 7129 gene expression values for 105
    samples. The ANN is a one-layer feed-forward
    backpropagation network with 7129 input nodes and
    one output node (myeloma / normal).
  • Trained until the sum of squared errors (SSE) on
    the output node is less than 0.001 (3000 epochs,
    8 minutes on a Pentium laptop).
  • Weight values ranged between -0.08196 and
    0.07343, with an average of 0.000746; 1443 links
    had zero weights across all runs.

37
Gene Reduction
  • The top 220 genes were then selected, and the
    process of training the network was repeated on
    this subset.
  • The relevant data was extracted from the full
    dataset, with the class information of each
    sample. The top 21 genes for myeloma were finally
    extracted (a sketch of this ranking step follows).
  • The network learnt interesting causal and
    classificatory rules.
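
A sketch of the reduction step, under the assumption that genes are ordered by the magnitude of their learned input weight (the function and array names are hypothetical):

```python
import numpy as np

def top_genes(weights, gene_ids, k):
    """Keep the k genes whose input weights have the largest magnitude,
    mirroring the iterative reduction 7129 -> 220 -> 21 described above."""
    order = np.argsort(-np.abs(weights))
    return [gene_ids[i] for i in order[:k]]
```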

38
Gene Reduction
  • Negative rules:
  • If U24685 (-1.84127) is absent, then myeloma.
    U24685 corresponds to the anti-B cell
    autoantibody IgM heavy chain variable V-D-J
    region (VH4). This rule correctly classified 63
    of 75 myeloma cases, with no false positives.
  • If L00022 (-1.79993) is absent, then myeloma.
    L00022 corresponds to the Ig active heavy chain
    epsilon-1. This rule correctly classified 68 of
    75 myeloma cases, but also misclassified three
    normal cases.

39
Gene Reduction
  • Positive rules:
  • If X57809 (1.58233) is present, then myeloma.
    X57809 corresponds to the rearranged
    immunoglobulin lambda light chain. This rule
    correctly classified 51 of 75 myeloma cases, with
    no false positives.
  • If M34516 is present, then myeloma. M34516
    corresponds to the omega light chain protein 14.1
    (Ig lambda chain related). This rule correctly
    classified 61 of 75 myeloma cases, but also
    misclassified two normal cases.

40
Gene Regulatory Networks
  • Gene network construction:
  • Requires temporal GED
  • Develops relationships between gene expression
    values across timesteps
  • These relationships can then form a gene
    regulatory network
  • This network describes the excitation and
    inhibition that govern gene expression patterns

41
Gene Regulatory Networks
42
Gene Regulatory Networks
  • Boolean network model:
  • Each gene receives one or several inputs from
    other genes
  • A sigmoid function models the gene as a binary
    element
  • Compute the output (time T+1) from the inputs
    (time T) according to boolean logic (see the
    sketch below)
  • Time is discretized
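
A sketch of one discrete update step (Python; the three-gene network and its rules are purely illustrative):

```python
def boolean_step(state, rules):
    """Each gene's value at time T+1 is a boolean function of the values of
    its input genes at time T."""
    return {g: fn(*(state[i] for i in inputs))
            for g, (inputs, fn) in rules.items()}

# Hypothetical network: C activates A, A activates B, B inhibits C.
rules = {"A": (("C",), lambda c: c),
         "B": (("A",), lambda a: a),
         "C": (("B",), lambda b: not b)}
state = {"A": True, "B": False, "C": False}
for t in range(5):
    print(t, state)
    state = boolean_step(state, rules)  # the trajectory settles into a cycle
```

Because the updates are deterministic and the state space is finite, every trajectory eventually enters a repeating attractor, which is the point made on the final slide.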

43
Gene Regulatory Networks
  • Boolean gene network example:
  • Input is at time T
  • Output is at time T+1

44
Gene Regulatory Networks
  • Process to construct Liang networks:
  • Train the ANN on pairs of gene expression values
    from the training set
  • Train the network between the pattern at time T
    and at time T+1, on the differences between
    expected and observed patterns
  • Train on all pairs in the training set and
    calculate the percentage of correct values
  • Single-layer networks reduce complexity and
    improve transparency

45
Gene Regulatory Networks
  • All Boolean network time series terminate in
    specific, repeating attractor patterns.
  • These can be visualized as basin-of-attraction
    graphs.
  • All trajectories are strictly determined, and
    many states converge on one attractor.
  • This accounts for the stability of gene networks.