Protein structures in the PDB

Provided by: mattc5
1
Protein structures in the PDB
2
Domains
  • proteins can be modular
  • single chain may be divisible into smaller
    independent units of tertiary structure called
    domains
  • domains are the basic unit of structure
    classification
  • different domains in a protein are also often
    associated with different functions carried out
    by the protein.

3
Definition of domain
  • A polypeptide or part of a polypeptide chain
    that can independently fold into a stable
    tertiary structure...
  • from Introduction to Protein Structure, by
    Branden and Tooze
  • Compact units within the folding pattern of a
    single chain that look as if they should have
    independent stability.
  • from Introduction to Protein Architecture, by
    Lesk

MBP Figure to go here
4
Motif (Supersecondary Structure)
  • there are certain favored arrangements of
    multiple secondary structure elements that recur
    again and again in proteins; these are known as
    motifs or supersecondary structures
  • a motif is usually smaller than a domain but can
    encompass an entire domain

[Figure: Greek key and beta-alpha-beta motifs]
5
Protein Taxonomy: The CATH Hierarchy
  • 1. Divide PDB structure entries into domains
    using domain recognition algorithms (the domain
    is the fundamental unit of structure
    classification)
  • 2. Classify each domain according to a five-level
    hierarchy

  • the five levels: Class, Architecture, Topology,
    Homologous Superfamily, Sequence Family
  • the top three levels of the hierarchy are purely
    phenetic: based on characteristics of the
    structure, not on evolutionary relationships
  • the bottom two levels also include some phyletic
    classification: groupings according to putative
    common ancestry based on structural similarity,
    functional similarity, and sequence similarity
  • protein evolution is not well understood; there
    is to date no purely phyletic classification
    system
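
The first four levels of the hierarchy can be written as a dotted numeric code, one number per level. The sketch below assumes that convention and the public CATH class numbering (1 to 4), neither of which appears on the slides. The function names are hypothetical, and the sequence family level is left out.

```python
from typing import NamedTuple

# Assumed mapping of CATH class numbers to class names (not from the slides).
CLASS_NAMES = {
    1: "mainly alpha",
    2: "mainly beta",
    3: "alpha-beta",
    4: "few secondary structures",
}

LEVELS = ["class", "architecture", "topology", "homologous superfamily"]


class CathNode(NamedTuple):
    class_: int
    architecture: int
    topology: int
    homologous_superfamily: int


def parse_cath_code(code: str) -> CathNode:
    """Split a dotted C.A.T.H code into its four integer levels."""
    parts = code.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated numbers, got {code!r}")
    return CathNode(*(int(p) for p in parts))


def deepest_shared_level(a: CathNode, b: CathNode) -> str:
    """Name the deepest level at which two domains are still grouped together."""
    shared = "none"
    for name, x, y in zip(LEVELS, a, b):
        if x != y:
            break
        shared = name
    return shared
```

Two domains whose codes agree in the first three numbers but differ in the fourth would share a topology (fold) without being placed in the same homologous superfamily.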
6
Class
  • In the CATH hierarchy, Class simply describes
    what type of secondary structure is present.
  • There are only four classes:
    • mainly alpha
    • mainly beta
    • alpha-beta
    • few secondary structures
  • 90% of structures are trivial to assign at this
    level.
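
As a rough illustration of why class assignment is usually easy, the sketch below assigns one of the four classes from a per-residue secondary structure string (H = helix, E = strand, anything else = coil). The function name and both thresholds are hypothetical choices made for this example; they are not CATH's actual criteria.

```python
def assign_class(ss: str, min_fraction: float = 0.15, dominance: float = 3.0) -> str:
    """Assign a class from the helix and strand content of one domain.

    min_fraction and dominance are illustrative cutoffs, not CATH values.
    """
    if not ss:
        raise ValueError("empty secondary structure string")
    alpha = ss.count("H") / len(ss)
    beta = ss.count("E") / len(ss)
    # Too little regular secondary structure of either kind.
    if alpha + beta < min_fraction:
        return "few secondary structures"
    # One type clearly dominates the other.
    if alpha >= dominance * beta:
        return "mainly alpha"
    if beta >= dominance * alpha:
        return "mainly beta"
    # Substantial amounts of both.
    return "alpha-beta"
```

Domains near the cutoffs are the minority that are not trivial to assign; any fixed threshold will split some of them arbitrarily.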

7
Architecture
  • Architecture is hard to define precisely
  • In CATH it is defined broadly as describing
    general features of protein shape such as
    arrangements of secondary structure in 3D space
  • It does not define connectivities between
    secondary structural elements; that's what the
    topology level does. It does not even explicitly
    define directionality of secondary structure,
    e.g. parallel or antiparallel beta-sheets.
  • in CATH, architectures are presently assigned
    manually, by visual inspection.
  • let's look at some architectures!

8
Some mostly beta architectures
9
Some mixed alpha-beta architectures
10
Topology (Fold)
  • if two proteins have the same topology, it means
    they have the same number and arrangement of
    secondary structures, and the connectivities
    between these elements are the same.
  • this is also sometimes called the fold of a
    protein.
  • in CATH, automated structure alignment is used
    to group proteins according to topology. We will
    discuss this later.
  • we will now look at some examples which
    illustrate differences in topology.

11
Topology: differences in connectivity
  • example: a four-stranded antiparallel beta-sheet
    can have many different topologies based on the
    order in which the four beta-strands are
    connected.
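
To get a feel for "many different topologies", the sketch below enumerates the spatial orders in which four strands, numbered by their position in the sequence, can sit side by side in a sheet. It is a simplification: it counts an order and its mirror image once and ignores strand direction and handedness. The function names are hypothetical.

```python
from itertools import permutations


def sheet_orders(n_strands: int = 4) -> list[tuple[int, ...]]:
    """List spatial strand orders, treating an order and its reverse as the same."""
    seen = set()
    for order in permutations(range(1, n_strands + 1)):
        # Reading the sheet left-to-right or right-to-left gives the same sheet.
        seen.add(min(order, tuple(reversed(order))))
    return sorted(seen)


def hairpin_connections(order: tuple[int, ...]) -> int:
    """Count sequence-consecutive strands that are also spatial neighbours."""
    position = {strand: i for i, strand in enumerate(order)}
    return sum(
        abs(position[s] - position[s + 1]) == 1 for s in range(1, len(order))
    )
```

Under these assumptions there are 12 distinct orders for four strands. The order (1, 2, 3, 4) has three hairpin connections, which corresponds to an up-and-down sheet; (3, 2, 1, 4) has two, with strand 4 returning to pair with strand 1, which is one common way a Greek key is drawn.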

[Figure: Greek key and up-and-down topologies]
12
Topology: differences in handedness
  • example: in a beta-alpha-beta motif, if the two
    parallel strands are oriented to face toward you,
    the helix can be either above or below the plane
    of the strands.

13
Visualizing protein topology: TOPS cartoons
  • up triangles = up-facing beta strands
  • down triangles = down-facing beta strands
  • horizontal rows of triangles = beta sheets (a beta
    barrel would be a ring of triangles)
  • circles = helices
  • lines = loops
  • if loops enter from top, line drawn to center
  • if loops enter from bottom, line drawn to
    boundary

fold above is clearly an antiparallel
beta-sandwich
14
Visual summary of top three levels of CATH hierarchy
[Figure: examples at the Class, Architecture, and Topology levels]
15
Discovery of New Folds
  • structural taxonomy reveals that although
    structures are being solved more rapidly than
    ever, fewer and fewer of them have new folds!
    Will we get them all soon?

16
Homologous superfamily/Sequence family
  • The lowest two levels in the CATH hierarchy
    relate to common ancestry
  • some, but not all proteins with the same fold
    show evidence of common ancestry
  • the surest way of identifying common ancestry is
    that two proteins have sequences roughly >30%
    identical (sequence family level)
  • if protein sequences are not that similar, common
    ancestry may still be inferred on the basis of a
    combination of structural and functional
    similarity, and possibly weak sequence similarity
    (homologous superfamily level)
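
The 30% rule of thumb can be sketched as a percent-identity calculation over a pairwise alignment. The example assumes the two sequences are already aligned with "-" for gaps and counts identity over columns where neither sequence has a gap; other definitions of the denominator exist and give different numbers. The function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over the ungapped columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def suggested_level(identity: float, threshold: float = 30.0) -> str:
    """Apply the rough >30% rule from the slide; the default is only a guide."""
    if identity > threshold:
        return "sequence family"
    return "needs structural/functional evidence"
```

A pair that falls below the threshold is not thereby unrelated; it is where the structural and functional evidence mentioned above has to carry the argument.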

17
Multifunctional Superfolds
  • some architectures have many folds; these are
    called superarchitectures
  • some folds have many homologous superfamilies,
    which means they are used for a variety of
    functions; these are called superfolds
18
Common core
  • structures need not share exactly the same
    number, type and connectivity of secondary
    structural elements to be grouped into a single
    fold type.
  • in fact, evolutionarily related proteins often
    share a common core of structurally related
    elements but may differ in presence or absence of
    a secondary structure element or two.

19
Problems in Fold Classification
  • Structure space has a continuous aspect,
    especially in certain types of folds, which makes
    clustering structures into fold families
    difficult. This is an inherent problem for any
    classification method based on hierarchical
    clustering.
  • It seems reasonable to group as having the same
    fold proteins which share some common core but
    differ in addition/subtraction of a few secondary
    structure elements.
  • But this can lead to unnaturally large and
    diverse fold families via the Russian doll effect
    and motif overlap.

20
Russian Doll Effect
  • A continuous range of slight size differences
    will lead to clustering proteins of very
    different size: small -> medium -> large.

21
Motif Overlap
  • Motif overlap effects: sometimes two proteins
    will share a common core but one of them will
    share a slightly different (but not necessarily
    larger) common core with a third protein.
  • A continuous range of overlapping common cores
    (AB -> BC -> CD) will lead to grouping
    proteins that have no common core.
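
The chaining behind both the Russian doll effect and motif overlap can be reproduced with a toy single-linkage clustering. In this sketch each protein is reduced to a set of labelled structural elements, and two proteins are linked if they share at least one element. The protein names, element labels, and linking rule are all made up for illustration.

```python
def single_linkage(items: dict[str, set[str]], linked) -> list[list[str]]:
    """Group items so that any two linked items end up in the same cluster."""
    parent = {name: name for name in items}

    def find(x: str) -> str:
        # Follow parent pointers to the cluster representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    names = list(items)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if linked(items[a], items[b]):
                parent[find(a)] = find(b)

    clusters: dict[str, set[str]] = {}
    for name in names:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(sorted(c) for c in clusters.values())


def share_core(s: set[str], t: set[str]) -> bool:
    """Hypothetical linking rule: at least one structural element in common."""
    return len(s & t) >= 1


# Toy proteins following the AB -> BC -> CD pattern, plus one outlier.
proteins = {
    "P1": {"A", "B"},
    "P2": {"B", "C"},
    "P3": {"C", "D"},
    "P4": {"X", "Y"},
}
clusters = single_linkage(proteins, share_core)
```

Here P1, P2, and P3 land in one cluster even though P1 and P3 have no element in common; P2 is the only thing holding them together.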

22
Comparison of SCOP and CATH Hierarchies
  SCOP           CATH
  class          class
  (none)         architecture
  fold           topology
  superfamily    homologous superfamily
  family         sequence family
  domain         domain

CATH is more directed toward structural
classification; SCOP pays more attention to
evolutionary relationships
23
Another SCOP/CATH difference
  • in CATH, there is one class to represent mixed
    alpha-beta
  • in SCOP there are two:
    • a/b (alpha/beta): beta structure is largely
      parallel, made of beta-alpha-beta motifs
    • a+b (alpha+beta): alpha and beta structure
      segregated to different parts of the structure

24
SCOP and CATH
  • they have in common that they are hierarchical
    and based on abstractions
  • they both include some manual aspects and are
    curated by experts in the field of protein
    structure
  • are there automated methods for structure
    classification/comparison?