Valeri Barsegov - PowerPoint PPT Presentation

1
Computer simulations of proteins: all-atom and
coarse-grained models
  • Valeri Barsegov
  • Department of Chemistry
  • University of Massachusetts Lowell

YITP, Kyoto University, Japan (2008)
2
Outline
  • I. Introduction
  • single-molecule spectroscopy of protein
    unfolding: biological relevance, pulling
    experiments (AFM, laser/optical tweezers, force
    protocols)
  • single-molecule spectroscopy of unbinding:
    biological relevance, experimental probes,
    resolution of forces, lifetimes, and extension
  • II. Molecular simulations of proteins
  • proteins: structure, fold types, examples
  • all-atom Molecular Dynamics (MD) simulations:
    force fields, examples, simulations of IR spectra
  • coarse-grained description of proteins:
    approximations, examples
  • III. New direction: computer simulations using
    graphics cards
  • basic facts, computer architecture, algorithms
  • applications

3
I.1 Single-molecule dynamic force spectroscopy of
forced unfolding of proteins: biological relevance
Fact 1: mechanically active proteins perform
their biological function in linear tandems of
head-to-tail (C-terminal-to-N-terminal)
connected protein domains
  • Examples
  • Titin contains tandems of immunoglobulin (Ig)
    domains, separated by short linker sequences
    (muscle function)
  • Actin-crosslinking filamins contain a rod-like
    tandem of ddFLN domains (cellular locomotion)
  • Fibronectin tandems consist of nonidentical Fn
    domains (extracellular matrix, cell elasticity)
  • Ubiquitin is a multimeric protein (Ub)n of n = 9
    identical Ub repeats (protein degradation,
    signaling pathways)

4
I.2 Single-molecule dynamic force spectroscopy of
forced unfolding of proteins: AFM experiment
force-clamp mode
force-ramp mode
M. Rief, M. Gautel, F. Oesterhelt, J. Fernandez & H. Gaub, Science, 276, 1109 (1997);
R. Zinober, D. Brockwell, G. Beddard, A. Blake, P. Olmsted, S. Radford & D. Smith, Protein Sci., 11, 2759 (2002);
J. Brujic, R. Hermans, K. Walther & J. Fernandez, Nature Phys., 2, 282 (2006);
J. Fernandez & H. Li, Science, 303, 1674 (2004)
5
I.3 Single-molecule dynamic force spectroscopy of
forced unbinding of proteins: biological
relevance
6
I.4 Single-molecule dynamic force spectroscopy of
forced unbinding of proteins: leukocyte rolling
on endothelium
J.-G. Geng, M. Chen, K.-C. Chou, Curr Med Chem, 11, 2153 (2004);
L. M. Coussens, Z. Werb, Nature, 420, 860 (2002);
Y. J. Kim, L. Borsig, N. M. Varki, A. Varki, Proc. Natl. Acad. Sci. USA, 95, 9325 (1998);
J. Weisel, H. Shuman, R. Litvinov, Curr Opin Struct Biol, 13, 227 (2003)
7
I.5 Single-molecule dynamic force spectroscopy of
forced unfolding of proteins: pulling force, AFM
experiment
Force protocols (plots of f, pN versus t, s): constant force, f = const; force ramp, f(t) = r_f t
J. Weisel, H. Shuman, R. Litvinov, Curr Opin Struct Biol, 13, 227 (2003);
M. Schlierf, H. Li, J. Fernandez, PNAS, 101, 7299 (2004);
J. Liphardt, D. Smith, C. Bustamante, Curr Opin Struct Biol, 19, 279 (2000);
J.-F. Allemand, D. Bensimon, V. Croquette, ibid, 13, 266 (2003);
S. Weiss, Science, 283, 1676 (1999);
E. Evans, PNAS, 98, 3784 (2001)
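The two force protocols named on this slide, force-clamp (constant f) and force-ramp (f(t) = r_f t, with r_f the loading rate), can be sketched as simple functions of time. This is only an illustration: the function names and the numerical values (100 pN, 50 pN/s) are assumptions chosen for the example, not values from the talk.

```python
def force_clamp(t, f_const):
    """Force-clamp protocol: the applied force stays at f_const (pN) at all times t."""
    return f_const


def force_ramp(t, r_f):
    """Force-ramp protocol: f(t) = r_f * t, with loading rate r_f in pN/s."""
    return r_f * t


# Hypothetical illustration values (not taken from the slides).
times = [0.0, 0.5, 1.0, 1.5, 2.0]                    # s
clamp = [force_clamp(t, 100.0) for t in times]       # constant 100 pN
ramp = [force_ramp(t, 50.0) for t in times]          # 50 pN/s loading rate
```

In a pulling simulation, either function would supply the external force applied to the tagged end of the molecule at each time step.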
8
I.6 Single-molecule dynamic force spectroscopy of
proteins: experimental resolution of unfolding
forces, times, and distances
  • Experimental resolution
  • protein extension ΔX ~ 1 nm
  • stretching force f_S ~ 100 pN
  • force-quench f_Q ~ 5-10 pN
  • relaxation interval T ~ 10-100 μs

J. Fernandez & H. Li, Science, 303, 1674 (2004);
I. Schwaiger, M. Schleicher, A. Noegel & M. Rief, EMBO Reports, 6, 46 (2005);
J. Brujic, R. Hermans, K. Walther & J. Fernandez, Nature Phys., 2, 282 (2006)
9
II.1 Molecular simulations of proteins: levels of
structure of proteins
  • Amino acids in proteins (or polypeptides) are
    joined together by peptide bonds.
  • The sequence of R-groups along the chain is
    called the primary structure.
  • Secondary structure refers to the local folding
    of the polypeptide chain.
  • Tertiary structure is the arrangement of
    secondary structure elements in 3D
  • Quaternary structure describes the arrangement
    of a protein's subunits.

The PDB is the single worldwide repository of 3D
structure data of proteins and nucleic acids
(35,000 structures as of August 2005;
www.rcsb.org/pdb).
Other Web Resources
  • 1. NCBI
  • 2. The European Bioinformatics Institute (EBI)
    (www.ebi.ac.uk)
  • 3. The RNA world (www.imb-jena.de/RNA.html)

10
II.2 Molecular simulations of proteins: secondary
and tertiary structure of proteins
Φ = -57°, Ψ = -47°: right-handed alpha-helix
Chain has directionality!
11
II.3 Molecular simulations of proteins: secondary
and tertiary structure of proteins
Φ (-110°, -140°), Ψ (110°, 135°) => beta-sheet
12
II.4 Molecular simulations of proteins:
quaternary structure of proteins
Alpha-beta folds
Multi-domain proteins
a) Control protein b) Immunoglobulin (muscles) c)
Fibronectin d) Growth factor
Knotted proteins
13
II.5 All-atom classical Molecular Dynamics (MD)
simulations: force fields
I. Potential for bonded interactions
V_BL: bond-length potential, V_BA: bond-angle
potential, V_DIH: dihedral-angle potential, V_SS:
disulfide-bond potential
II. Potential for non-bonded interactions
V_PP: protein-protein interaction potential, V_WW:
water-water potential, V_WP: water-protein
interaction potential
III. Software (open-source)
  • GROMACS (force fields OPLS and GROMOS)
  • NAMD (force fields CHARMM22, CHARMM27)
IV. Water models
  • GROMACS (SPC, SPC/E, SPC-fw)
  • NAMD (TIP, TIP3P)

GROMACS (Univ. of Groningen, Netherlands):
ftp://ftp.gromacs.org/pub/
NAMD (Univ. of Illinois at Urbana-Champaign, USA):
http://www.ks.uiuc.edu/Research/namd/
14
II.6 All-atom MD simulations of proteins:
examples of fibrinogen and A-knob-a-hole complex
of fibrin
  • Fibrin polymerisation (2,400 a.a., 48 nm)
  • essential for blood clotting
  • implicated in heart attack and stroke

15
II.7 All-atom MD simulations of proteins: IR
spectroscopy of proteins
- infrared light (vibrations of bonds)
- Amide I and Amide II are the major bands
- conformationally sensitive
- localized at individual a.a. sites
- Amide I: C=O stretching (90%) + C-N stretching (10%)
- Amide II: N-H bending (60%) + C-N stretching (40%)
Amide I
Krimm & Bandekar, Adv. Prot. Chem., 38, 181 (1986);
Woutersen & Hamm, J. Phys.: Cond. Matt., 14, R1035 (2002);
Venyaminov & Kalnin, Biopolymers, 30, 1243 (1990);
Chirgadze & Nevskaya, ibid, 15, 637 (1976)
16
II.8 All-atom MD simulations of proteins: IR
spectroscopy of proteins
1. Vibrational exciton Hamiltonian
2. Transition dipole coupling (TDC)
3. Linear absorption spectrum
Cheatum et al, JCP, 120, 8201 (2004);
Torii & Tasumi, JCP, 96, 3379 (1992);
S. Mukamel, Principles of Nonlinear Optical Spectroscopy
17
II.9 All-atom MD simulations of proteins: IR
spectroscopy of proteins
Assumptions used in the vibrational exciton Hamiltonian
- dynamics in the near-equilibrium state
- fast bath relaxation (fixed line-broadening parameter)
- fitting parameters (diagonal energies, peak amplitudes, frequency splitting)
- energies/amplitudes are from ab initio maps of N-methylacetamide, glycine dipeptide analogs
- transferability of ab initio maps to larger proteins
Direct calculation of IR spectra of Amide I from MD
- Amide I ~ C=O vibration
- Correction Factor due to assumptions/harmonic force field
Advantages of correlation functions
- IR obtained directly from classical MD
- beyond ensemble average
- far-from-equilibrium regime
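As a rough illustration of obtaining a spectrum directly from an MD time series through a correlation function, the sketch below autocorrelates a signal and takes its cosine transform. It is a minimal sketch under stated assumptions: the synthetic cosine stands in for a C=O coordinate or dipole trajectory, the units are arbitrary, all function names are hypothetical, and the prefactors, windowing and the correction factor mentioned on the slide are omitted.

```python
import math


def autocorrelation(x):
    """Time autocorrelation C(k) = <x(t) x(t + k)>, averaged over time origins t."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) / (n - k)
            for k in range(n // 2)]


def cosine_spectrum(corr, dt, freqs):
    """Cosine transform of the correlation function at the requested frequencies."""
    return [dt * sum(c * math.cos(2.0 * math.pi * f * k * dt)
                     for k, c in enumerate(corr))
            for f in freqs]


# Synthetic stand-in for an MD time series (assumption: a single oscillation at f0).
dt = 0.1                      # time step, arbitrary units
f0 = 0.5                      # oscillation frequency, inverse time units
signal = [math.cos(2.0 * math.pi * f0 * i * dt) for i in range(400)]

corr = autocorrelation(signal)
freqs = [0.1 * j for j in range(1, 11)]
spectrum = cosine_spectrum(corr, dt, freqs)
peak = freqs[spectrum.index(max(spectrum))]   # frequency of the strongest band
```

For a real trajectory the same two steps apply to the recorded coordinate, and averaging the correlation function over several trajectories reduces noise.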
18
II.10 All-atom MD simulations of proteins: IR
spectroscopy of proteins
Ubiquitin (1UBQ, 76 a.a.)
- water box (4,600 TIP3P, 47 Å × 51 Å × 57 Å)
- 8 trajectories (t = 4 ps, dt = 0.1 fs, NVE) at T = 300 K
- Ewald sum method (long-range electrostatics)
- 12 Å cutoff for L-J forces
Aβ16-22 (3 × KLVFFAE, 21 a.a.)
- water box (2000 TIP3P, 44 Å × 41 Å × 36 Å)
- Ewald sum method
- 12 Å cutoff for L-J forces
- 12 trajectories (t = 8 ps, dt = 0.1 fs, NVE) at T = 300 K
Correction Factor = 0.985 (CHARMM22)
Chung et al, PNAS, 102, 612 (2005);
Cheatum et al, JCP, 120, 8201 (2004)
19
II.11 Coarse-grained (CG) descriptions of
proteins: building the CG model
I. Coarse-grained model for P-selectin
  • Step 1: creating structure file of Cα centers
    of mass of residues from PDB structure of
    P-selectin (www.rcsb.org)
  • mimicking hydrogen bonds
  • modeling S-S bonds
  • Step 2: computing potential energy of the obtained
    conformation of P-selectin
  • Step 3: following Langevin dynamics

K. Dill et al, Protein Sci, 4, 561 (1995);
D. Thirumalai, D. Klimov, PNAS, 97, 2544 (2000);
J. Bryngelson et al, Proteins, 21, 167 (1995);
M. Karplus, A. Sali, Curr Opin Struct Biol, 5, 58 (1995);
A. Kolinski, J. Skolnick, Polymer, 45, 511 (2004)
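Steps 2 and 3 (compute forces from the potential energy, then propagate with Langevin dynamics) can be sketched for a toy bead chain. This is not the P-selectin model from the talk: it assumes the overdamped (Brownian) form of the Langevin equation, a 1D chain, and harmonic bonds only. The 3.8 Å bond length is the value quoted for the CG force field; the other parameter values and the function names are arbitrary choices for the example.

```python
import math
import random


def bond_forces(x, k_bond, r0):
    """Forces from harmonic bonds V = (k/2)(r - r0)^2 between neighbours on a 1D chain."""
    f = [0.0] * len(x)
    for i in range(len(x) - 1):
        r = x[i + 1] - x[i]
        sign = 1.0 if r >= 0 else -1.0
        g = -k_bond * (abs(r) - r0) * sign   # force on bead i + 1
        f[i + 1] += g
        f[i] -= g                            # equal and opposite force on bead i
    return f


def langevin_step(x, dt, zeta, kT, k_bond, r0, rng):
    """One overdamped Langevin step: dx = F dt / zeta + sqrt(2 kT dt / zeta) * N(0, 1)."""
    f = bond_forces(x, k_bond, r0)
    sigma = math.sqrt(2.0 * kT * dt / zeta)
    return [xi + fi * dt / zeta + sigma * rng.gauss(0.0, 1.0)
            for xi, fi in zip(x, f)]


rng = random.Random(0)
x = [0.0, 5.0, 9.0]          # three beads with stretched bonds (Å)
for _ in range(2000):
    # kT = 0 switches the noise off, so the chain relaxes deterministically.
    x = langevin_step(x, dt=0.01, zeta=1.0, kT=0.0, k_bond=10.0, r0=3.8, rng=rng)
```

With kT > 0 the same loop generates thermally fluctuating trajectories, and adding a time-dependent pulling force to one bead would turn it into a forced unfolding or unbinding run.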
20
II.11 Coarse-grained (CG) descriptions of
proteins: force field
I. Scales of energy/length/mass/time
- hydrophobic interaction (1.25 kcal/mol)
- bond length (3.8 Å)
- the timescale (3 ps)
- residue mass ( )
II. Harmonic connectivity potentials
III. Dihedral angle potential
turn
β-sheet
α-helix
21
II.11 Coarse-grained (CG) descriptions of
proteins: force field
IV. Hydrogen bond potential
V. Potential for native contacts
b_ij: contact interaction matrix, contact
distance (Kolinski et al, JCP, 98, 7420 (1993))
VI. Nonbonded potential
VII. Unfolding/unbinding trajectories

22
II.12 Coarse-grained (CG) descriptions of
proteins: forced rupture of the P-selectin-sPSGL
noncovalent bond
N-terminus of P-selectin
C-terminus of sPSGL-1
23
III.1 Computer simulations using graphics cards:
basic facts
  • CPU
  • Advantages
  • can perform very sophisticated flow control
    (IF/THEN, loops, conditionals, etc.)
  • single CPU cores are fast (3.0 GHz or faster)
  • a lot of well-tested (commercial) software is
    available
  • Disadvantages
  • has no more than 6 cores (today)
  • parallel programming on CPU is difficult
  • data exchange between nodes in a cluster occurs
    through a relatively slow network
  • GPU
  • Advantages
  • up to 240 cores (GeForce 280, Tesla C1060)
  • easy to write parallel code with the CUDA language
    (extension of C)
  • memory bandwidth is high because all cores are
    local
  • Disadvantages
  • single core clock is not as fast as a CPU core
    (0.5 GHz)
  • can't be used for applications with
    sophisticated flow control
  • not much software is available for GPU (started in
    2006)

24
III.2 Computer simulations using graphics cards:
hardware
  • GPU
  • highly parallel
  • multithreaded
  • manycore processor
  • Historically, the GPU
  • was designed for compute-intensive, highly
    parallel computation
  • more transistors are devoted to data processing
    rather than data caching and flow control
  • well-suited for problems that involve
    data-parallel computations, i.e. the same program
    is executed on many data elements in
    parallel (MD, coarse-grained simulations).

25
III.3 Computer simulations using graphics cards:
programming model
  • CUDA
  • consists of a minimal extension to the C language
  • parallel programming model and software
    environment
  • designed to overcome the challenge of creating
    software that transparently scales on manycore
    processors
  • Example
  • vecAdd() function is called N times on GPU.
  • <<<1, N>>> means that the procedure runs in one
    1D block with N threads.
  • i = threadIdx.x is the way for a thread to identify
    which element of the vector it should work with.
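The code for the vecAdd() example did not survive in this transcript. The sketch below is not CUDA and not the original slide code: it is a serial Python emulation of the launch semantics described above, in which one 1D block of N "threads" each handles the element selected by its thread index. The names vec_add_kernel and launch_one_block are hypothetical.

```python
def vec_add_kernel(thread_idx_x, a, b, c):
    """Body executed by one 'thread': add the single element selected by its index."""
    i = thread_idx_x          # plays the role of i = threadIdx.x
    c[i] = a[i] + b[i]


def launch_one_block(kernel, n_threads, *args):
    """Serial stand-in for a <<<1, N>>> launch: one 1D block of n_threads threads."""
    for thread_idx_x in range(n_threads):
        kernel(thread_idx_x, *args)


a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
c = [0.0] * len(a)
launch_one_block(vec_add_kernel, len(a), a, b, c)   # kernel body runs N = 4 times
```

On a GPU the N kernel invocations run concurrently rather than in a loop, which is why the kernel body may depend only on its own index.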

26
III.4 Computer simulations using graphics cards:
software organization
Thread hierarchy
  • 1. thread index is a 3D vector, so that threads can
    be identified using a 1D, 2D or 3D index forming
    a 1D, 2D or 3D thread block
  • 2. multiple blocks can be organized into a 1D or 2D
    grid. Each block can be identified within the grid
    using a 1D, 2D or 3D block index
  • 3. all threads in one block are doing the same
    thing with different data
  • 4. threads can synchronize and pass data to each
    other within a block using shared memory
  • 5. threads can pass the data to the CPU through
    the GPU global memory
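When several blocks are launched, each thread usually combines its block index and thread index into one global element index. The Python emulation below sketches this for a 1D grid of 1D blocks using the standard rule index = blockIdx.x * blockDim.x + threadIdx.x. The helper names are hypothetical, and the loops run serially where a GPU would run the threads in parallel.

```python
def global_index(block_idx_x, block_dim_x, thread_idx_x):
    """Global element index for a 1D grid of 1D blocks."""
    return block_idx_x * block_dim_x + thread_idx_x


def scale_kernel(block_idx_x, block_dim_x, thread_idx_x, data, factor):
    """Each 'thread' scales one element; spare threads in the last block do nothing."""
    i = global_index(block_idx_x, block_dim_x, thread_idx_x)
    if i < len(data):
        data[i] *= factor


def launch_grid(kernel, n_blocks, threads_per_block, *args):
    """Serial stand-in for launching n_blocks blocks of threads_per_block threads."""
    for block_idx_x in range(n_blocks):
        for thread_idx_x in range(threads_per_block):
            kernel(block_idx_x, threads_per_block, thread_idx_x, *args)


data = [1.0] * 10
launch_grid(scale_kernel, 3, 4, data, 2.0)   # 3 blocks x 4 threads cover 10 elements
```

The bounds check matters because 3 blocks of 4 threads provide 12 threads for only 10 elements.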

27
III.5 Computer simulations using graphics cards:
software organization
Memory hierarchy
  • 1. each thread has its own local memory (in cache)
    for storing temporary variables
  • 2. each block has shared memory (in cache) for
    synchronizing the threads within the block
  • 3. device has global memory that can be accessed
    from any thread on the GPU
  • 4. local memory and shared memory are much faster
    than global, but they are available only locally
    and exist only during the lifetime of a thread
    or a block.
  • 5. global memory is relatively slow, and can
    also be accessed from the CPU.

28
III.6 Computer simulations using graphics cards:
hardware model
Hardware organization
  • 1. each device has N multiprocessors
  • 2. multiprocessors can share data only through
    device (global) memory
  • 3. a multiprocessor has M processors (ALUs)
  • 4. number of threads that can run at the same
    time is equal to N×M; for GeForce 8800 GT, M = 8,
    N = 14 (number of processors = 112!)
  • 5. one block can run only on one multiprocessor,
    so the number of blocks in a program should be at
    least equal to the number of multiprocessors on
    the device.
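The arithmetic in points 4 and 5 can be checked with a short sketch. The values N = 14 and M = 8 are the GeForce 8800 GT numbers quoted above. The problem size (10,000 elements), the block size (256 threads) and the helper names are assumptions made for the example.

```python
def concurrent_threads(n_multiprocessors, processors_per_mp):
    """Threads that can execute at the same instant: N x M."""
    return n_multiprocessors * processors_per_mp


def blocks_needed(n_items, threads_per_block):
    """Ceiling division: number of blocks so that every item gets its own thread."""
    return (n_items + threads_per_block - 1) // threads_per_block


N_MP, M_PROC = 14, 8                          # GeForce 8800 GT, from the slide
total = concurrent_threads(N_MP, M_PROC)      # number of processors

n_blocks = blocks_needed(10000, 256)          # hypothetical problem and block size
uses_all_multiprocessors = n_blocks >= N_MP   # point 5: at least one block per multiprocessor
```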

29
III.7 Computer simulations using graphics cards:
applications
  • MD and CG simulations are suitable for GPU
  • same potential (force field) for all atoms
    (beads)
  • integration scheme is explicit
  • systems have a huge number of atoms (beads)
  • Example
  • Rouse chain model of homopolymer
  • Lennard-Jones potential (self-avoidance)
  • 1,000,000 time steps for each chain
  • Intel Xeon 2GHz Dual Core (CPU, 350) vs
    GeForce 8800 GT (GPU, 130)
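The point that every bead is treated with the same potential can be illustrated with a Lennard-Jones force loop. This is a minimal sketch, assuming the standard 12-6 form V(r) = 4ε[(σ/r)^12 - (σ/r)^6] on a 1D set of beads. The exact form used for self-avoidance in the talk is not given, and the function names and parameter values are hypothetical.

```python
def lj_energy(r, eps, sigma):
    """Standard 12-6 Lennard-Jones pair energy."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)


def lj_force(r, eps, sigma):
    """Radial force -dV/dr; a positive value means repulsion."""
    sr6 = (sigma / r) ** 6
    return 24.0 * eps * (2.0 * sr6 * sr6 - sr6) / r


def forces_on_beads(x, eps, sigma):
    """Same formula for every bead i: on a GPU, one thread per bead would run the inner loop."""
    n = len(x)
    f = [0.0] * n
    for i in range(n):
        for j in range(n):
            if i != j:
                d = x[i] - x[j]
                direction = 1.0 if d > 0 else -1.0
                f[i] += lj_force(abs(d), eps, sigma) * direction
    return f


eps, sigma = 1.0, 1.0
r_min = 2.0 ** (1.0 / 6.0) * sigma            # location of the potential minimum
forces = forces_on_beads([0.0, 1.0, 2.5], eps, sigma)
```

Because the outer loop iterations are independent of each other, they map directly onto data-parallel threads.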