Title: Protein Modularity and Evolution: An examination of organism complexity via protein domain structure
1Protein Modularity and Evolution: An examination of organism complexity via protein domain structure
- Presented by
- Jennelle Heyer and Jonathan Ebbers
- December 7, 2004
2Presentation Outline
- Background Material
- - Protein Evolution, Theory of Domains, Gene Number
- Hypothesis
- - Using a model protein family
- Procedure/Methods
- - DPIP Program, Phylogenetic Analysis
- Results
- Discussion/Conclusions
3Theories of Protein Evolution
A long time ago, in the primordial soup of life,
small polypeptides began to form
[Diagram: short polypeptides (HDLC, TCP, QZX) join and mutate (HDLC TCP, HCLCTCP, HCICTCP) to give functional proteins]
4Concept of Modularity
- Proteins consist of one or more domains that were pieced together over time
- Domains are the building blocks of proteins
- Defined as spatially distinct structures that could conceivably fold and function in isolation (Ponting and Russell, 2002)
- Dictate the function of the protein
- Evolutionary pressure to conserve (sequence and/or structure)
5Organismal Complexity
- The nematode C. elegans has 19,500 genes in its genome
- Humans have between 20,000 and 25,000 genes in their genome
- HOW CAN THAT BE?
- Alternative splicing, multi-functional/network proteins
6Hypothesis
- Gene products, proteins, can be multi-functional with the introduction of domains
- "Evolution does not produce innovation from scratch. It works on what already exists, either transforming a system to give it a new function or combining several systems to produce a more complex one" (Jacob, 1946)
- More complex or phylogenetically derived organisms produce proteins with greater domain complexity
7Hypothesis Part II
- Create a protein domain tool
- Position
- Partner domain
- General organization
- Protein evolution
- Using a variety of sequenced genomes
- Allow investigators to learn about a domain of interest and apply it to their research
8Kinesins: A model protein family
- Motor proteins found in eukaryotic organisms
- Contain a conserved motor domain
- Bind and walk along microtubules
- Can carry a variety of cargo
- May contain multiple domains
http://www.mb.tn.tudelft.nl/projects/
9Kinesins: A model protein family
From Reddy and Day, 2001
- Arabidopsis thaliana, a model plant species, contains 61 kinesins
- S. pombe 10, C. elegans 22, Drosophila 25
- Human and mouse 45
10Programming Approach
- Two programs used, BLAST and InterProScan, held together with Perl scripts
- Give a domain sequence to PSI-BLAST, which will identify proteins that have that domain.
- One by one, give those protein sequences to InterProScan (IPR), which identifies domains in the protein.
- Create a listing of proteins and map the data into a phylogeny.
- Create a tree based on the phylogeny and domains.
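The steps above can be sketched as a small driver. This is a minimal sketch in Python rather than the Perl glue the slides mention, and it does not call BLAST or InterProScan: `find_proteins` and `find_domains` are hypothetical stand-ins backed by toy dictionaries, and every protein and domain name below is illustrative only.

```python
from typing import Callable, Dict, List

# Toy stand-in data (hypothetical): proteins a similarity search returns for a
# query domain, and the domains an annotation step reports for each protein.
TOY_SEARCH_HITS: Dict[str, List[str]] = {
    "kinesin_motor": ["protA", "protB", "protC"],
}
TOY_ANNOTATIONS: Dict[str, List[str]] = {
    "protA": ["Kinesin_motor"],
    "protB": ["Kinesin_motor", "FHA"],
    "protC": ["Kinesin_motor", "PH"],
}


def find_proteins(query_domain: str) -> List[str]:
    """Stand-in for the PSI-BLAST step: proteins that contain the query domain."""
    return TOY_SEARCH_HITS.get(query_domain, [])


def find_domains(protein_id: str) -> List[str]:
    """Stand-in for the InterProScan step: domains found in one protein."""
    return TOY_ANNOTATIONS.get(protein_id, [])


def build_domain_listing(
    query_domain: str,
    search: Callable[[str], List[str]] = find_proteins,
    annotate: Callable[[str], List[str]] = find_domains,
) -> Dict[str, List[str]]:
    """Run the search once, then annotate each hit one by one."""
    return {protein: annotate(protein) for protein in search(query_domain)}


listing = build_domain_listing("kinesin_motor")
for protein, domains in listing.items():
    print(protein, "->", ", ".join(domains))
```

Passing the search and annotation steps in as functions keeps the driver independent of how the external tools are actually invoked, which is the part the slides leave to the scripts.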
11Program Flowchart
Domain Sequence -> BLAST -> List of proteins with similar domains -> InterProScan -> List of domains in every protein -> Maketree -> Tree (includes domains)
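The final step of the flowchart, attaching domain lists to the tips of a tree, might look like the following. This is a sketch only: the nested-tuple tree, the `to_newick` helper, the tip names, and the label format are all assumptions for illustration, not the output format of the original Maketree step.

```python
def to_newick(node, domains_by_tip):
    """Render a nested-tuple tree as Newick text, appending domains to tip labels."""
    if isinstance(node, str):
        domains = domains_by_tip.get(node, [])
        # '|' and '+' are used as separators because they are allowed in
        # unquoted Newick labels (no spaces, commas, colons, or brackets).
        return node + "|" + "+".join(domains) if domains else node
    return "(" + ",".join(to_newick(child, domains_by_tip) for child in node) + ")"


# Hypothetical tree and annotations; protC deliberately has no domain entry.
tree = (("protA", "protB"), "protC")
domains_by_tip = {
    "protA": ["Kinesin_motor"],
    "protB": ["Kinesin_motor", "FHA"],
}

newick = to_newick(tree, domains_by_tip) + ";"
print(newick)
```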
12Program Details
- Database selection
- BLAST: RefSeq over nr
- InterProScan: SMART database only
- Threshold values
- BLAST: option to change, to improve resolution
- InterProScan: E-value at 0.99, up from 0.01
- Used Arabidopsis sequences as a control
- Name: DPIP (Domain Placement in Proteins)
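To illustrate what raising the E-value threshold from 0.01 to 0.99 does, here is a minimal sketch of filtering domain hits by E-value. The `DomainHit` record, the `filter_hits` helper, and all hit values are hypothetical; they are not InterProScan output.

```python
from typing import List, NamedTuple


class DomainHit(NamedTuple):
    protein: str
    domain: str
    evalue: float


def filter_hits(hits: List[DomainHit], max_evalue: float) -> List[DomainHit]:
    """Keep only hits whose E-value is at or below the threshold."""
    return [hit for hit in hits if hit.evalue <= max_evalue]


# Hypothetical hits: two strong motor-domain matches and two weaker partners.
hits = [
    DomainHit("protA", "Kinesin_motor", 1e-80),
    DomainHit("protA", "FHA", 0.2),
    DomainHit("protB", "Kinesin_motor", 3e-75),
    DomainHit("protB", "PH", 0.005),
]

strict = filter_hits(hits, 0.01)
loose = filter_hits(hits, 0.99)
print(len(strict), "hits at 0.01;", len(loose), "hits at 0.99")
```

A looser threshold keeps weaker matches, so more candidate domains are reported per protein, at the cost of admitting more chance matches.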
13Results
- A Quick Look at the Data
- Phylogenetic Approach
- Hypothesis I
- Qualitative Approach
- Hypothesis II
14A Quick Look
15Phylogenetic Approach
- More complex or phylogenetically derived organisms produce proteins with greater domain complexity
- Trace domain characteristics on a preset tree
- Use MacClade tree drawing software
- Uses input data to create most parsimonious trace
- Characteristics: maximum domains, unique domains
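The two characteristics traced on the tree can be computed from a protein-to-domain listing. The sketch below assumes a simple list of (organism, protein, domains) records; the record layout, the `domain_characters` helper, and the data are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (organism, protein, domains found in that protein).
records = [
    ("organism_1", "protA", ["Kinesin_motor"]),
    ("organism_1", "protB", ["Kinesin_motor", "FHA"]),
    ("organism_2", "protC", ["Kinesin_motor", "PH", "FHA"]),
    ("organism_2", "protD", ["Kinesin_motor"]),
]


def domain_characters(records):
    """Return {organism: (max domains in any one protein, number of unique domains)}."""
    max_domains = defaultdict(int)
    unique_domains = defaultdict(set)
    for organism, _protein, domains in records:
        max_domains[organism] = max(max_domains[organism], len(domains))
        unique_domains[organism].update(domains)
    return {
        organism: (max_domains[organism], len(unique_domains[organism]))
        for organism in unique_domains
    }


characters = domain_characters(records)
for organism, (max_count, unique_count) in characters.items():
    print(organism, "max domains:", max_count, "unique domains:", unique_count)
```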
16Maximum Number of Domains per Protein
Legend: Green = 1, Black = 3
17Number of Unique Domains per Organism
Legend: Blue = 1, Pink = 2, Dark Blue = 3, Yellow = 5, Black = 6, Dash = ???
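A most parsimonious trace of a character on a fixed tree can be illustrated with Fitch's algorithm. The slides do not describe MacClade's settings, so this is a stand-in sketch for an unordered character on a binary tree, not a reproduction of the analysis; the tree, tip names, and states are hypothetical.

```python
def fitch(node, tip_states):
    """Return (candidate state set, minimum number of changes) for a subtree.

    A tip is a string naming the organism; an internal node is a pair
    (left, right) of subtrees.
    """
    if isinstance(node, str):
        return {tip_states[node]}, 0
    left, right = node
    left_set, left_changes = fitch(left, tip_states)
    right_set, right_changes = fitch(right, tip_states)
    shared = left_set & right_set
    if shared:
        # The children agree on at least one state: no extra change needed.
        return shared, left_changes + right_changes
    # The children disagree: keep both options and count one change.
    return left_set | right_set, left_changes + right_changes + 1


# Hypothetical character: maximum number of domains per protein, per organism.
tree = ((("org_A", "org_B"), "org_C"), "org_D")
tip_states = {"org_A": 1, "org_B": 1, "org_C": 3, "org_D": 3}

root_states, changes = fitch(tree, tip_states)
print("root states:", root_states, "changes:", changes)
```

Counts such as the number of domains could also be treated as an ordered character, in which case a different parsimony criterion would apply.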
18Phylogenetic Conclusions
- Inconclusive or null hypothesis supported
- Possible explanations
- Kinesins may have limited domain complexity due to function or folding
- Inherent bias in DPIP (RefSeq database)
- Future Work
- Testing other domains through same process
- Updating database
- Include measure for position (N/I/C)
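One way the proposed position measure could work is to classify each domain as N-terminal, internal, or C-terminal from its coordinates. The sketch below assumes 1-based residue coordinates and splits the protein into thirds by domain midpoint; both the `classify_position` helper and the thirds rule are assumptions, not part of DPIP.

```python
def classify_position(start: int, end: int, protein_length: int) -> str:
    """Classify a domain as 'N', 'I', or 'C' by where its midpoint falls."""
    if not (1 <= start <= end <= protein_length):
        raise ValueError("expected 1 <= start <= end <= protein_length")
    midpoint = (start + end) / 2
    if midpoint <= protein_length / 3:
        return "N"  # midpoint in the first third
    if midpoint > 2 * protein_length / 3:
        return "C"  # midpoint in the last third
    return "I"      # midpoint in the middle third


# Hypothetical domain coordinates in a 1000-residue protein.
print(classify_position(10, 330, 1000))
print(classify_position(400, 700, 1000))
print(classify_position(650, 980, 1000))
```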
19Qualitative Approach
- Create a protein domain tool
- Position
- Partner domain
- General organization
- Protein evolution
- Using a variety of sequenced genomes
- Compile data into a more informative table
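A partner-domain table of the kind described above can be compiled by counting which other domains occur alongside the domain of interest. This is a minimal sketch; the `partner_counts` helper and the protein and domain names are hypothetical.

```python
from collections import Counter

# Hypothetical listing: protein -> domains found in it.
proteins = {
    "protA": ["Kinesin_motor"],
    "protB": ["Kinesin_motor", "FHA"],
    "protC": ["Kinesin_motor", "PH", "FHA"],
    "protD": ["PH"],
}


def partner_counts(proteins, domain_of_interest):
    """Count, per partner domain, the proteins in which it co-occurs with the domain of interest."""
    counts = Counter()
    for domains in proteins.values():
        if domain_of_interest in domains:
            # Use a set so a repeated domain is counted once per protein.
            counts.update(set(domains) - {domain_of_interest})
    return counts


partners = partner_counts(proteins, "Kinesin_motor")
for domain, count in partners.most_common():
    print(domain, count)
```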
20- Can I trace domain or protein evolution??
21Presence of FHA/PH domain in kinesins
Legend: Yellow = Absent, Blue = Present
22Conclusions
- DPIP program was created to answer two questions
- Does organismal complexity correspond with protein complexity?
- Can we create a tool for researchers to better understand domains in protein families?
- For kinesin motor domains: No and Yes
- For other domains????
Thanks to Webb Miller, Richard Cyr, Claude dePamphilis, Alexander Richter, and the Plant Physiology, Biology, and Bioinformatics Depts.