Bioinformatics - PowerPoint PPT Presentation

1
Bioinformatics
  • Richard Tseng and Ishwar Hosamani

2
Outline
  • Homology modeling (Ishwar)
  • Structural analysis
  • Structure prediction
  • Structure comparisons
  • Cluster analysis
  • Partitioning method
  • Density-based method
  • Phylogenetic analysis

3
Structural Analysis
  • Overview
  • Structure prediction
  • Structural alignment
  • Similarity

4
  • Tools for protein structure prediction
  • Protein
  • Secondary structure prediction: SSEA
  • http://protein.cribi.unipd.it/ssea/
  • Tertiary structure prediction
  • Wurst: http://www.zbh.uni-hamburg.de/wurst/
  • LOOPP: http://cbsuapps.tc.cornell.edu/loopp.aspx

5
  • WURST (Torda et al. (2004) Wurst: a protein
    threading server with a structural scoring
    function, sequence profiles and optimized
    substitution matrices. Nucleic Acids Res., 32,
    W532-W535)
  • Rationale
  • Alignment: sequence-to-structure alignments are
    done with a Smith-Waterman style alignment and
    the Gotoh algorithm
  • Score function: a fragment-based sequence-to-structure
    compatibility score and a pure sequence-sequence
    component (a substitution score)
  • Library: Dali PDB90 (24599 structures)
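
The alignment step named above (Smith-Waterman local alignment with Gotoh's affine gap recurrences) can be sketched in a few lines. This is only an illustration on plain sequence-sequence scores: the function name and the match, mismatch and gap values are hypothetical, and Wurst's real score function (structure compatibility plus substitution matrices) is not reproduced here.

```python
def local_align_affine(a, b, match=2, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Best local alignment score of a and b with affine gap penalties.

    A gap of length k costs gap_open + (k - 1) * gap_extend.
    All scoring values are illustrative assumptions.
    """
    neg = float("-inf")
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]    # best score of an alignment ending at (i, j)
    E = [[neg] * cols for _ in range(rows)]  # alignment ends with a gap in a
    F = [[neg] * cols for _ in range(rows)]  # alignment ends with a gap in b
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            # Gotoh: either open a new gap or extend the one already open.
            E[i][j] = max(H[i][j - 1] + gap_open, E[i][j - 1] + gap_extend)
            F[i][j] = max(H[i - 1][j] + gap_open, F[i - 1][j] + gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Smith-Waterman: the 0 lets a local alignment restart anywhere.
            H[i][j] = max(0, H[i - 1][j - 1] + s, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best
```

With these example values, aligning "AAAACCCC" against "AAAAGGCCCC" is best served by one gap of length two rather than two mismatches, which is the behaviour affine penalties are meant to give.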

6
  • Tools for structure comparison
  • Pairwise structure comparison
  • TopMatch
  • Matras (http://biunit.naist.jp/matras/)
  • Multiple structure comparison
  • 3D-surfer
  • Matras (http://biunit.naist.jp/matras/)

7
  • TopMatch (Sippl & Wiederstein (2008) A note on
    difficult structure alignment problems.
    Bioinformatics 24, 426-427)
  • Rationale
  • Structure alignment: http://www.cgl.ucsf.edu/home/meng/grpmt/structalign.html
  • Similarity measurement
  • Input format
  • PDB, SCOP and CATH codes
  • PDB structure directly
  • Exercise: http://topmatch.services.came.sbg.ac.at/

8
  • 3D-surfer (David La et al. 3D-SURFER: software
    for high throughput protein surface comparison
    and analysis. Bioinformatics, in press (2009))
  • Rationale
  • Define a surface function
  • Transform the surface function into a 3D Zernike
    descriptor
  • Input format
  • PDB and CATH codes
  • PDB structure directly
  • Exercise: http://dragon.bio.purdue.edu/3d-surfer/

9
Cluster analysis
  • Goal
  • Grouping the data into classes or clusters, so
    that objects within a cluster have high
    similarity in comparison to one another but are
    very dissimilar to objects in other clusters.
  • Methods
  • Partitioning method: k-means
  • Density-based method: Ordering Points To
    Identify the Clustering Structure (OPTICS)

10
  • k-means
  • Rationale: partition n observations
    into k clusters in which each observation belongs
    to the cluster with the nearest mean
  • Exercise: http://cgm.cs.ntust.edu.tw/etrex/kMeansClustering/kMeansClustering2.html

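The k-means rationale (assign each observation to the nearest mean, then recompute the means) can be illustrated with a minimal sketch of Lloyd's iteration. The function name is hypothetical, and seeding the centroids with the first k points is an assumption made only to keep the example deterministic; practical implementations usually use random or k-means++ seeding.

```python
import math

def kmeans(points, k, iterations=100):
    """Cluster coordinate tuples into k groups; returns (labels, centroids)."""
    # Assumption: the first k points serve as initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to the cluster with the nearest mean.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the means are stable
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids
```

For example, `kmeans([(0, 0), (0, 1), (10, 10), (10, 11)], 2)` separates the two points near the origin from the two points near (10, 10).
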
11
  • OPTICS
  • Rationale: partition observations based on the
    density of similar objects
  • Exercise: http://www.dbs.informatik.uni-muenchen.de/Forschung/KDD/Clustering/OPTICS/Demo/

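As a rough illustration of the density-based idea, the sketch below computes an OPTICS ordering and reachability distances for a small point set. It is a simplified, quadratic-time version: the function name and the parameter names eps and min_pts are chosen here for illustration, and counting a point itself toward min_pts is an assumption (conventions differ between implementations).

```python
import math

def optics(points, eps, min_pts):
    """Return (ordering, reachability) for a list of coordinate tuples."""
    if min_pts < 2:
        raise ValueError("this sketch assumes min_pts >= 2")
    n = len(points)
    reach = [math.inf] * n        # reachability distance per point
    processed = [False] * n
    order = []                    # cluster ordering

    def neighbors(i):
        return [j for j in range(n)
                if j != i and math.dist(points[i], points[j]) <= eps]

    def core_distance(i, nbrs):
        # Assumption: min_pts includes the point itself.
        if len(nbrs) + 1 < min_pts:
            return None           # not a core point
        dists = sorted(math.dist(points[i], points[j]) for j in nbrs)
        return dists[min_pts - 2]

    def visit(i, seeds):
        processed[i] = True
        order.append(i)
        nbrs = neighbors(i)
        core = core_distance(i, nbrs)
        if core is None:
            return
        # Lower the reachability of unprocessed neighbours where possible.
        for j in nbrs:
            if processed[j]:
                continue
            new_reach = max(core, math.dist(points[i], points[j]))
            if new_reach < reach[j]:
                reach[j] = new_reach
                seeds[j] = new_reach

    for start in range(n):
        if processed[start]:
            continue
        seeds = {}
        visit(start, seeds)
        while seeds:
            # Always continue with the most densely reachable point.
            nxt = min(seeds, key=seeds.get)
            del seeds[nxt]
            visit(nxt, seeds)
    return order, reach
```

In the resulting ordering, runs of small reachability values correspond to dense clusters, and a large (here infinite) value marks the start of a new cluster.
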
12
  • Example: Folding of Trp-cage peptide

13
Phylogenetic analysis
  • Overview
  • Comparisons of more than two sequences
  • Analysis of gene families, including functional
    predictions
  • Estimation of evolutionary relationships among
    organisms

14
  • Theoretical tree
  • Parsimony method
  • Distance matrix method
  • Maximum likelihood and Bayesian methods
  • Invariants method

15
  • Software
  • Collections of tools
  • http://evolution.genetics.washington.edu/phylip/software.html
  • Web server versions for tree construction and
    display
  • PHYLIP, http://bioweb2.pasteur.fr/phylogeny/intro-en.html
  • Interactive Tree of Life, http://itol.embl.de/
  • Most commonly used standalone software
  • PHYLIP, a tool for evaluating the similarity of
    nucleotide and amino acid sequences.
  • http://evolution.gs.washington.edu/phylip.html
  • TreeView, a tool for visualization and manipulation
    of family trees.
  • http://taxonomy.zoology.gla.ac.uk/rod/treeview.html
  • MATLAB Bioinformatics Toolbox

16
  • Example: Alignment-based phylogenetic tree of the
    tubulin family
  • Searching for homologous sequences of tubulin (PDB
    code 1JFF) in the RCSB Protein Data Bank
  • BLAST for pairwise sequence alignment
  • ClustalW for comparative sequence alignment
  • Evaluating the protein distance matrix
  • Using Protdist of PHYLIP (specifically, the Point
    Accepted Mutation (PAM) matrix is used)
  • Clustering proteins using Neighbor of PHYLIP
    (the Neighbor-Joining method is used)
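
The clustering step can be illustrated with a minimal sketch of the Neighbor-Joining algorithm applied to a distance matrix. This is not PHYLIP's Neighbor program: the function name, the internal node labels ("node0", "node1", ...) and the return format are hypothetical choices for this example.

```python
def neighbor_joining(names, dist):
    """Build an unrooted tree from a symmetric distance matrix.

    Returns (joins, last_edge): joins is a list of
    (child_a, length_a, child_b, length_b, parent) tuples and
    last_edge is (node_a, node_b, length) for the final two nodes.
    """
    # Distances keyed by node label, so new internal nodes can be added.
    d = {a: {b: dist[i][j] for j, b in enumerate(names)}
         for i, a in enumerate(names)}
    nodes = list(names)
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        r = {a: sum(d[a][b] for b in nodes) for a in nodes}  # row sums
        # Pick the pair that minimises Q(a, b) = (n - 2) d(a, b) - r(a) - r(b).
        best = None
        for x in range(n):
            for y in range(x + 1, n):
                a, b = nodes[x], nodes[y]
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        # Branch lengths from the joined pair to their new parent node.
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        parent = f"node{len(joins)}"
        d[parent] = {parent: 0.0}
        for c in nodes:
            if c not in (a, b):
                d[parent][c] = d[c][parent] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        nodes = [c for c in nodes if c not in (a, b)] + [parent]
        joins.append((a, len_a, b, len_b, parent))
    a, b = nodes
    return joins, (a, b, d[a][b])
```

Each join replaces two nodes with their common parent, so five taxa give three joins plus one final edge between the last two remaining nodes.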

17
  • Example: n-distance phylogenetic tree
  • Evaluating the n-distance matrix
  • n-distance method
  • Clustering proteins using Neighbor of PHYLIP
    (the Neighbor-Joining method is used)
  • 16S and 18S ribosomal RNA sequences of 35
    organisms

18
Summary
  • Homology modeling
  • Tools for structure prediction and comparisons
  • Tools for phylogenetic tree construction

Thanks for your attention!!
20
  • Protein distance matrix

        1Z5V_A    3CB2_A    1JFF_B    1FFX_B    1TUB_B    1Z2B_B
1Z5V_A  0         0.000010  1.349411  1.349411  1.303115  1.345634
3CB2_A  0.000010  0         1.350506  1.350506  1.303115  1.346730
1JFF_B  1.349411  1.350506  0         0.000010  0.000010  0.010729
1FFX_B  1.349411  1.350506  0.000010  0         0.000010  0.010729
1TUB_B  1.303115  1.303115  0.000010  0.000010  0         0.006725
1Z2B_B  1.345634  1.346730  0.010729  0.010729  0.006725  0
21
  • Tubulin family tree

22
  • n-distance method
  • Frequency count of n-letter words
  • n-distance matrix
  • Advantages
  • Identifies fully conserved words located at
    nearly the same sites
  • Efficient

MREIVHIQAGQCGNQIGAKFWEVISDEHGIDPTGSYHGDSDLQLERINVYYNE
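
The slide names the two ingredients (a frequency count of n-letter words and a distance matrix built from it) without giving the distance formula. The sketch below therefore makes an explicit assumption: the distance between two sequences is taken as the sum of absolute differences of their normalised word frequencies. It also ignores word positions, so it does not capture the "nearly the same sites" aspect. All function names are hypothetical.

```python
from collections import Counter

def word_frequencies(seq, n):
    """Relative frequency of every overlapping n-letter word in seq."""
    if len(seq) < n:
        raise ValueError("sequence is shorter than the word length")
    words = [seq[i:i + n] for i in range(len(seq) - n + 1)]
    counts = Counter(words)
    return {w: c / len(words) for w, c in counts.items()}

def n_distance(seq_a, seq_b, n):
    # Assumption: L1 distance between the two word-frequency profiles.
    fa = word_frequencies(seq_a, n)
    fb = word_frequencies(seq_b, n)
    return sum(abs(fa.get(w, 0.0) - fb.get(w, 0.0)) for w in set(fa) | set(fb))

def n_distance_matrix(seqs, n):
    """Symmetric matrix of pairwise n-distances for a list of sequences."""
    size = len(seqs)
    matrix = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1, size):
            matrix[i][j] = matrix[j][i] = n_distance(seqs[i], seqs[j], n)
    return matrix
```

Under this assumed definition the distance is 0 for identical word profiles and 2 when two sequences share no n-letter word. A matrix of this form can then be passed to a tree-building program such as Neighbor.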