BIO-TRAC 25 (Proteomics: Principles and Methods)


1
Tutorial Bioinformatics Resources
  • BIO-TRAC 25 (Proteomics: Principles and Methods)
  • October 10, 2003
  • NIH, Bethesda, MD
  • Zhang-Zhi Hu, M.D.
  • Senior Bioinformatics Scientist,
  • Protein Information Resource
  • National Biomedical Research Foundation, GUMC

2
What is Bioinformatics?
  • Bioinformatics is the application of information
    technology to the analysis, organization and
    distribution of biological data in order to
    answer complex biological questions.
  • NIH Biomedical Information Science and Technology
    Initiative (BISTI) Working Definition (2002) -
    Research, development, or application of
    computational tools and approaches for expanding
    the use of biological, medical, behavioral or
    health data, including those to acquire, store,
    organize, archive, analyze, or visualize such
    data.

3
Bioinformatics Resources
  • The Molecular Biology Database Collection: An
    Online Compilation of Relevant Database Resources
  • 2003 update: http://www3.oup.co.uk/nar/database/
  • Nucleic Acids Research Database Issues (annually,
    in January) (2003:
    http://nar.oupjournals.org/content/vol31/issue1/)
  • DBcat: A Catalog of > 500 Biological Databases
  • http://www.infobiogen.fr/services/dbcat/

4
Molecular Biology Database Collection
(http://nar.oupjournals.org/cgi/content/full/31/1/1GKG120TB1)
5
The Molecular Biology Database Collection: 2003
update (Baxevanis, A.D.): an online resource of
386 key databases in 18 categories
  • Major sequence repositories
  • Comparative Genomics
  • Gene Expression
  • Gene Identification and Structure
  • Genetic and Physical Maps
  • Genomic Databases
  • Intermolecular Interactions
  • Metabolic Pathways and Cellular Regulation
  • Mutation Databases
  • Pathology
  • Protein Sequence Motifs
  • Proteome Resources
  • Retrieval Systems and Database Structure
  • RNA Sequences
  • Structure
  • Transgenics
  • Varied Biomedical Content

6
Overview
  • Protein Sequence Analysis
  • I. Sequence Similarity Search and Alignment
  • II. Family Classification Methods
  • III. Structure Prediction Methods
  • Molecular Biology Databases
  • IV. Protein Family Databases
  • V. Database of Protein Functions
  • VI. Databases of Protein Structures
  • Proteomic Resources
  • VII. 2D-gel databases
  • VIII. Proteomic analyses

7
I. Sequence Similarity Search
  • Find a protein sequence: text search
  • Based on Pair-Wise Comparisons
  • BLOSUM scoring matrix
  • PAM scoring matrix
  • Dynamic Programming Algorithms
  • Global Similarity: Needleman-Wunsch (GAP)
  • Local Similarity: Smith-Waterman (BestFit, SSEARCH)
  • Heuristic Algorithms (Sequence Database
    Searching)
  • FASTA: Based on K-Tuples (2-Amino Acid)
  • BLAST: Triples of Conserved Amino Acids
  • Gapped-BLAST: Allow Gaps in Segment Pairs (NREF)
  • PHI-BLAST: Pattern-Hit Initiated Search (NCBI)
  • PSI-BLAST: Iterative Search (NCBI)
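
A minimal sketch of the k-tuple idea listed for FASTA above, assuming k = 2 as on the slide. It only indexes shared two-residue words and counts them per diagonal; it is not the FASTA program itself, which goes on to rescore and join regions with a scoring matrix. The function names and the toy sequences are illustrative assumptions.

```python
from collections import defaultdict


def ktuple_index(seq, k=2):
    """Map each k-letter word to the positions where it starts in seq."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def best_diagonal(query, subject, k=2):
    """Return (offset, count) for the diagonal with the most shared k-tuples.

    The offset is subject position minus query position. Returns (None, 0)
    when the two sequences share no k-tuple.
    """
    index = ktuple_index(subject, k)
    diagonals = defaultdict(int)
    for q in range(len(query) - k + 1):
        for s in index.get(query[q:q + k], []):
            diagonals[s - q] += 1
    if not diagonals:
        return None, 0
    offset = max(diagonals, key=diagonals.get)
    return offset, diagonals[offset]


# Toy example: the query occurs in the subject, shifted by two residues.
print(best_diagonal("ACDEFG", "WWACDEFGWW"))
```

A diagonal with many shared words marks a region worth aligning in detail, which is why heuristic searches of this kind are much faster than running dynamic programming against every database entry.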

8
Sequence Search by Text or Unique ID
Entrez (http://www.ncbi.nlm.nih.gov/Entrez/)
(http://pir.georgetown.edu/pirwww/search/textsearch.html)
9
Pair-Wise Comparisons
  • Scoring matrix
  • Global and local similarity
  • Dynamic Programming (Needleman-Wunsch,
    Smith-Waterman)

(http://www.ebi.ac.uk/emboss/align/)
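
A minimal sketch of the two dynamic programming schemes named on this slide, computing scores only (no traceback). It assumes a simple match/mismatch score and a linear gap penalty in place of a BLOSUM or PAM matrix; the parameter values are illustrative assumptions, not those used by the servers listed.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score: both sequences are aligned end to end."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + s,   # align a[i-1] with b[j-1]
                            prev[j] + gap,     # gap in b
                            curr[j - 1] + gap))  # gap in a
        prev = curr
    return prev[-1]


def smith_waterman(a, b, match=1, mismatch=-1, gap=-2):
    """Local alignment score: best-scoring pair of subsequences."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # The extra 0 lets an alignment restart anywhere.
            curr.append(max(0, prev[j - 1] + s, prev[j] + gap, curr[j - 1] + gap))
            best = max(best, curr[j])
        prev = curr
    return best


print(needleman_wunsch("GATTACA", "GATTACA"))
print(smith_waterman("XXXGATTACAYYY", "ZZGATTACAZZ"))
```

The only differences between the two are the boundary conditions and the floor at zero: the global score pays for every unaligned end, while the local score ignores flanking regions that do not match.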
10
FASTA Search
(http://pir.georgetown.edu/pirwww/search/fasta.html)
(http://www.ebi.ac.uk/fasta33/)
11
Gapped-BLAST Search
(http://pir.georgetown.edu/pirwww/search/pirnref.shtml)
(http://www.ncbi.nlm.nih.gov/BLAST/)
12
A BLAST Result
13
PSI-BLAST Iterative Search
(http://www.ncbi.nlm.nih.gov/BLAST/)
14
PSI-BLAST
15
II. Family Classification Methods
  • Multiple Sequence Alignment and Phylogenetic
    Analysis
  • ClustalW: Multiple Sequence Alignment
  • Alignment Editor; Phylogenetic Trees
  • Searches Based on Family Information
  • PROSITE Pattern Search
  • Motif and Profile Search
  • Hidden Markov Models (HMMs)

16
Multiple Sequence Alignment
  • ClustalW
    (http://pir.georgetown.edu/pirwww/search/multaln.html)

17
Alignment Editor (Jalview)
(http://www.ebi.ac.uk/clustalw/)
18
Alignment Editor (GeneDoc)
(http://www.psc.edu/biomed/genedoc/)
19
Phylogenetic Analysis
Tree Programs (http://evolution.genetics.washington.edu/phylip.html)
Tree Searches (http://pauling.mbu.iisc.ernet.in/pali/index.html)
20
Phylogenetic Trees (IGFBP Superfamily)
(Radial Tree)
(Phylogram)
21
PROSITE Pattern Search
(http://pir.georgetown.edu/pirwww/search/patmatch.html)
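
A minimal sketch of how a PROSITE-style pattern can be scanned against a sequence, by translating it into a regular expression. It assumes the common pattern syntax: elements separated by "-", "x" for any residue, "[...]" for allowed residues, "{...}" for excluded residues, "(n)" or "(n,m)" for repeats, and "<" / ">" for the sequence ends. The function name is hypothetical, rarer syntax is not handled, and this is not the code behind the server above.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a compiled regular expression."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        prefix = suffix = repeat = ""
        if element.startswith("<"):      # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        if element.endswith(")"):        # repeat count, e.g. x(3) or x(2,4)
            element, count = element[:-1].split("(")
            repeat = "{" + count + "}"
        if element == "x":               # any residue
            core = "."
        elif element.startswith("{"):    # any residue except those listed
            core = "[^" + element[1:-1] + "]"
        else:                            # a single residue or an [ABC] set
            core = element
        parts.append(prefix + core + repeat + suffix)
    return re.compile("".join(parts))


# N-glycosylation site pattern, scanned against a toy sequence.
site = prosite_to_regex("N-{P}-[ST]-{P}.")
print(site.pattern)
print([m.start() for m in site.finditer("MNGSANPTQ")])
```

Note that `finditer` reports non-overlapping matches only, so overlapping occurrences of a short pattern would need a lookahead-based scan.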
22
Profile Search
(http://bmerc-www.bu.edu/bioinformatics/profile_request.html)
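
A minimal sketch of what a profile search does: build position-specific log-odds scores from an ungapped multiple alignment, then slide the profile along a sequence. It assumes a uniform background frequency and a fixed pseudocount, both chosen for illustration; real profile methods use observed background frequencies, sequence weighting, and gap handling. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(alignment, pseudocount=1.0):
    """Return one {residue: log-odds score} dict per alignment column."""
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    profile = []
    for column in zip(*alignment):
        counts = Counter(c for c in column if c in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: math.log2((counts[aa] + pseudocount) / total / background)
            for aa in AMINO_ACIDS
        })
    return profile


def best_window(profile, sequence):
    """Return (score, start) of the best ungapped match, or None if too short."""
    width = len(profile)
    scores = [
        (sum(col.get(aa, 0.0) for col, aa in zip(profile, sequence[i:i + width])), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(scores) if scores else None


profile = build_profile(["HCKF", "HCRF", "HCKY"])
print(best_window(profile, "AAAHCKFAAA"))
```

Unlike a single pattern, the profile scores every residue at every position, so a sequence that deviates from the family at one column can still score well overall.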
23
Hidden Markov Model Search
(http://www.sanger.ac.uk/Software/Pfam/search.shtml)
(http://smart.embl-heidelberg.de)
24
III. Structural Prediction Methods
  • Signal Peptide: SIGFIND, SignalP
  • Transmembrane Helix: TMHMM, TMAP
  • 2D Prediction (α-helix, β-sheet, Coiled-coils):
    PHD, JPred
  • 3D Modeling: Homology Modeling (Modeller,
    SWISS-MODEL), Threading, Ab-initio Prediction

25
Structure Prediction: A Guide
(http://speedy.embl-heidelberg.de/gtsp/flowchart2.html)
26
Protein Prediction Server
(http://www.cbs.dtu.dk/services/)
27
Signal Peptide Prediction
(http://www.stepc.gr/synaptic/sigfind.html)
(http://www.cbs.dtu.dk/services/SignalP-2.0)
28
Transmembrane Helix
(http://www.cbs.dtu.dk/services/TMHMM/)
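
A minimal sketch of a much simpler approach to the same question: a Kyte-Doolittle sliding-window hydropathy scan. TMHMM itself uses a hidden Markov model, which is not reproduced here. The window of 19 residues and the threshold of 1.6 follow the usual Kyte-Doolittle convention for membrane-spanning segments; the function names and the toy sequence are illustrative assumptions, and the code assumes the sequence contains only the 20 standard residues.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the sequence."""
    values = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def candidate_tm_windows(seq, window=19, threshold=1.6):
    """Start positions of windows hydrophobic enough to span a membrane."""
    return [i for i, mean in enumerate(hydropathy_profile(seq, window))
            if mean >= threshold]


# Toy sequence: a 19-residue leucine stretch flanked by charged residues.
toy = "DEKRDEKRDE" + "L" * 19 + "DEKRDEKRDE"
print(candidate_tm_windows(toy))
```

A window scan of this kind flags hydrophobic stretches but says nothing about helix orientation or loop topology, which is what model-based predictors add.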
29
Protein Structure Prediction
(http://cmgm.stanford.edu/WWW/www_predict.html)
(http://restools.sdsc.edu/biotools/biotools9.html)
30
Structure Prediction Server
(http://cubic.bioc.columbia.edu/predictprotein/)
(http://www.compbio.dundee.ac.uk/WWW_Servers/JPred/jpred.html)
31
3D-Modelling
(http://www.salilab.org/modeller/modeller.html)
(http://www.expasy.ch/swissmod/SWISS-MODEL.html)
32
IV. Protein Family Databases
  • Whole Proteins
  • PIR: Superfamilies and Families
  • COG (Clusters of Orthologous Groups) of Complete
    Genomes
  • ProtoNet: Automated Hierarchical Classification
    of Proteins
  • Protein Domains
  • Pfam: Alignments and HMM Models of Protein
    Domains
  • SMART: Protein Domain Families
  • Protein Motifs
  • PROSITE: Protein Patterns and Profiles
  • BLOCKS: Protein Sequence Motifs and Alignments
  • PRINTS: Protein Sequence Motifs and Signatures
  • Integrated Family Databases
  • iProClass: Superfamilies/Families, Domains,
    Motifs, Rich Links
  • InterPro: Integrates Pfam, PRINTS, PROSITE,
    ProDom, SMART

33
Protein Clustering
(http://www.ncbi.nlm.nih.gov/COG/)
34
Protein Domains
  • Pfam (http://www.sanger.ac.uk/Software/Pfam/)
  • SMART
    (http://smart.embl-heidelberg.de/smart/show_motifs.pl)

35
Protein Motifs
  • PROSITE is a database of protein families and
    domains. It consists of biologically significant
    sites, patterns and profiles.
    (http://www.expasy.ch/prosite/)

36
Integrated Family Classification
  • InterPro: An integrated resource unifying
    PROSITE, PRINTS, ProDom, Pfam, SMART,
    TIGRFAMs, and PIRSF.
    (http://www.ebi.ac.uk/interpro/search.html)

37
V. Databases of Protein Functions
  • Metabolic Pathways, Enzymes, and Compounds
  • Enzyme Classification: Classification and
    Nomenclature of Enzyme-Catalysed Reactions
    (EC-IUBMB)
  • KEGG (Kyoto Encyclopedia of Genes and Genomes):
    Metabolic Pathways
  • LIGAND (at KEGG): Chemical Compounds, Reactions
    and Enzymes
  • EcoCyc: Encyclopedia of E. coli Genes and
    Metabolism
  • MetaCyc: Metabolic Encyclopedia (Metabolic
    Pathways)
  • WIT: Functional Curation and Metabolic Models
  • BRENDA: Enzyme Database
  • UM-BBD: Microbial Biocatalytic Reactions and
    Biodegradation Pathways
  • Klotho: Collection and Categorization of
    Biological Compounds
  • Cellular Regulation and Gene Networks
  • EpoDB: Genes Expressed during Human
    Erythropoiesis
  • BIND: Descriptions of interactions, molecular
    complexes and pathways
  • DIP: Catalogs experimentally determined
    interactions between proteins
  • RegulonDB: Escherichia coli Pathways and
    Regulation

38
KEGG Metabolic and Regulatory Pathways
  • KEGG is a suite of databases and associated
    software, integrating our current knowledge
    on molecular interaction networks, the
    information of genes and proteins, and of
    chemical compounds and reactions.
    (http://www.genome.ad.jp/kegg/kegg2.html)

(http://www.genome.ad.jp/dbget-bin/show_pathway?hsa00590874)
39
BioCyc (EcoCyc/MetaCyc Metabolic Pathways)
  • The BioCyc Knowledge Library is a collection of
    Pathway/Genome Databases (http://biocyc.org/)

40
Protein-Protein Interactions: DIP
(http://dip.doe-mbi.ucla.edu/)
41
Protein-Protein Interactions: BIND
(http://www.bind.ca/)
42
BioCarta: Cellular Pathways
(http://www.biocarta.com/index.asp)
43
VI. Databases of Protein Structures
  • Protein Structure and Classification
  • PDB: Structure Determined by X-ray
    Crystallography and NMR
  • CATH: Hierarchical Classification of Protein
    Domain Structures
  • SCOP: Familial and Structural Protein
    Relationships
  • FSSP: Protein Fold Family Database
  • Protein Sequence-Structure Relationship
  • PIR-NRL3D: Protein Sequence-Structure Database
  • PIR-RESID: Protein Structure/Post-Translational
    Modifications
  • HSSP: Families and Alignments of
    Structurally-Conserved Regions

44
PDB Structure Data
(http://www.rcsb.org/pdb/)
45
PDBsum
Summary and Analysis (http://www.biochem.ucl.ac.uk/bsm/pdbsum)
46
Protein Structural Classification
CATH: Hierarchical domain classification of
protein structures
(http://www.biochem.ucl.ac.uk/bsm/cath_new/)
47
Protein Structural Classification
The SCOP database aims to provide a detailed and
comprehensive description of the structural and
evolutionary relationships between all proteins
whose structure is known, including all entries
in the PDB.
(http://scop.mrc-lmb.cam.ac.uk/scop/)
48
VII. Proteomic Resources
  • GELBANK (http://gelbank.anl.gov): 2D-gel patterns
    from completed genomes
  • SWISS-2DPAGE (http://www.expasy.org/ch2d/)
  • PEP: Predictions for Entire Proteomes
    (http://cubic.bioc.columbia.edu/pep/):
    Summarized analyses of protein sequences
  • Proteome BioKnowledge Library
    (http://www.proteome.com): Detailed information
    on human, mouse and rat proteomes
  • Proteome Analysis Database
    (http://www.ebi.ac.uk/proteome/): Online
    application of InterPro and CluSTr for the
    functional classification of proteins in whole
    genomes
  • Expression Profiling databases: GNF
    (http://expression.gnf.org/cgi-bin/index.cgi,
    human and mouse transcriptome), SMD
    (http://genome-www5.stanford.edu/MicroArray/SMD/,
    Stanford microarray data analysis), EBI
    Microarray Informatics
    (http://www.ebi.ac.uk/microarray/index.html,
    managing, storing and analyzing microarray data)

49
2D-Gel Image Databases (1)
(http://gelbank.anl.gov/2dgels/index.asp)
50
2D-Gel Image Databases (2)
(http://us.expasy.org/ch2d/2d-index.html)
(http://us.expasy.org/cgi-bin/nice2dpage.pl?P06493)
51
VIII. Proteome Analysis
(http://www.ebi.ac.uk/proteome)
52
Expression Profiling
  • Human and Mouse Transcriptome

(http://expression.gnf.org/cgi-bin/index.cgi)
(http://genome-www.stanford.edu/serum/)
53
Lab
  • Visit selected websites and analyze some protein
    sequences of your own choice.
  • A list of the Bioinformatics Resources in this
    tutorial is available at
    http://pir.georgetown.edu/huz/bioinfo_resource.html
  • Try some of the following sequences for
    analysis:
  • 1) well characterized proteins:
    PIR: A26366 (CYP17), JS0747 (Sp1)
  • 2) less characterized proteins:
    PIR: A59000 (MATER);
    TrEMBL: Q9QY16 (GRTH)
  • 3) hypothetical proteins:
    PIR: T12515, T00338, T47130;
    SWISS-PROT: Q9BWT7