Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

1
Structure Prediction and Modeling of a Eukaryotic
Member of the Major Facilitator Superfamily
  • Gaurav Narale

2
Major Facilitator Superfamily (MFS)
  • MEMBRANE TRANSPORT
  • Largest secondary transporter protein family
    known so far, with more than 1000 members
    identified [1].
  • Use a solute gradient to drive the translocation
    of substrates such as ions, sugars, amino acids,
    peptides and other hydrophilic solutes [2].
  • Typically 400-600 amino acids long.
  • 12 transmembrane α-helices, with both the N- and
    C-termini in the cytosol [3].
  • Two six-helix halves connected by a central loop.
  • Found in all three kingdoms of living organisms.

3
Identifying Templates and Targets
  • TEMPLATES - Two known structures
  • Lactose Permease (LacY), E. coli
  • Glycerol-3-Phosphate Transporter (GlpT), E. coli
  • Sequence identity between the two is negligible
    (9%).
  • CE algorithm for structural alignment indicates
    that they superimpose over most of their chain
    length (RMSD = 3.7 Å)
  • 1st GOAL: To find a eukaryotic member of the MFS
    that shows enough sequence identity with one of
    the known structures to allow reasonable
    alignment.
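
The RMSD quoted above is a root-mean-square distance between paired atoms of the two superimposed structures. A minimal sketch of the distance calculation alone, assuming the two coordinate sets are already superimposed and paired one-to-one. The function name `rmsd` and the toy coordinates are illustrative and not from the slides; CE also finds the pairing and the superposition, which this sketch does not attempt.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between one pair of equivalent atoms
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates (hypothetical): the pairs are 3 and 4 units apart
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 3.0, 0.0), (1.0, 0.0, 4.0)]
print(rmsd(a, b))
```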

4
Function and Mechanism of LacY and GlpT
  • Both use a solute gradient to drive translocation
    of substrate
  • LacY mediates the coupled transport of lactose
    and H+
  • GlpT catalyzes the exchange of
    glycerol-3-phosphate for phosphate
  • Alternating-Access Model
  • Outward-facing conformation exposed to the
    extracellular side.
  • Inward-facing conformation exposed to the
    cytoplasm.
  • Ribbon Representation
  • Amino-terminal domain (blue).
  • Carboxyl-terminal domain (green).
  • Bends and other irregularities in the α-helices
    are indicated by deviations from an ideally
    straight and continuous helical ribbon.

5
Identifying Templates and Targets
  • Lactose Permease (LacY)
  • Obtained the protein PDB file from the Protein
    Data Bank (1PV6) and extracted the amino acid
    sequence in FASTA format. www.rcsb.org/pdb
  • Searched for a TARGET with high sequence identity
    using NCBI BLAST. www.ncbi.nlm.nih.gov
  • 1. General search against all organisms: 2
    iterations, threshold 0.005
  • - hits were mainly bacterial proteins.
  • 2. Saved the results as a profile (PSSM)
  • 3. More sensitive search using the original
    sequence as well as the saved profile as input,
    limited to eukaryotes: 2 iterations, threshold
    0.01
  • Unable to identify a suitable target.
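
The searches above appear to have been run through the NCBI web site. For a local run, step 1 could be approximated with the BLAST+ `psiblast` program. The sketch below only assembles the command line and does not execute it. The file names and the `nr` database name are placeholders, and mapping the slide's "threshold" onto the `-inclusion_ethresh` option is an assumption.

```python
import shlex


def psiblast_command(query_fasta, database, iterations, inclusion_evalue, pssm_out):
    """Build (but do not run) a psiblast command line as a list of arguments."""
    return [
        "psiblast",
        "-query", query_fasta,                        # template sequence in FASTA format
        "-db", database,                              # placeholder database name
        "-num_iterations", str(iterations),           # "2 iterations" on the slide
        "-inclusion_ethresh", str(inclusion_evalue),  # assumed meaning of "threshold"
        "-out_pssm", pssm_out,                        # saved profile for the follow-up search
    ]


# Hypothetical file names for the LacY (1PV6) search
cmd = psiblast_command("1PV6.fasta", "nr", 2, 0.005, "lacy_profile.pssm")
print(shlex.join(cmd))
```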

6
Identifying Templates and Targets
  • Glycerol-3-Phosphate Transporter (GlpT)
  • Obtained the protein PDB file from the Protein
    Data Bank (1PW4) and extracted the amino acid
    sequence in FASTA format. www.rcsb.org/pdb
  • Searched for a TARGET with high sequence identity
    using NCBI BLAST. www.ncbi.nlm.nih.gov
  • 1. General search against all organisms: 2
    iterations, threshold 0.005
  • 2. Obtained a suitable TARGET:
    Glucose-6-Phosphate Translocase
  • Homo sapiens
  • 3. Utilized BLink to identify several closely
    related eukaryotic targets for use in multiple
    sequence alignments.

7
Multiple sequence alignment
  • Only template and target - initial review
  • Both templates, target and close targets
  • 15 proteins similar to the target selected from
    different species to get a better alignment
  • Only template and target extracted
  • Around 30% similarity between template and
    target
  • Well distributed alignment
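
A figure such as the one above can be checked directly from an aligned pair. A minimal sketch, assuming identity is counted over columns where neither sequence has a gap (other conventions exist, and the slide's "similarity" may also count conservative substitutions, which this does not). The function name is illustrative. The example strings are the first 23 columns of the second block of the MPSA template-target alignment in this deck, a well-conserved stretch that is not representative of the whole alignment.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns in which both sequences have the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


template_part = "DLGFALSGISIAYGFSKFIMGSV"  # 1PW4 (GlpT)
target_part = "DLGFITSSQSAAYAISKFVSGVL"    # human glucose-6-phosphate translocase
print(round(percent_identity(template_part, target_part), 1))
```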

8
Alignment using FUGUE
                        10        20        30        40        50
hs1pw4a (   5 ) fkpaphkarlpaaeidptYrrlrwqIflGIffGyaAYylVRkNFALAMpy
QUERY g6pt      -------------MAAQGYGYYRTVIFSAMFGGYSLYYFNRKTFSFVMPS
                aaaaaaaaaaaaaaaaaaaaaaaaaa aaaaaaa

                        60        70        80        90       100
hs1pw4a (  55 ) L-veqgfsrgDLGfALSGISiAygfSkfimgsvSdrsnPrvfLPaGLilA
QUERY g6pt      LVEEIPLDKDDLGFITSSQSAAYAISKFVSGVLSDQMSARWLFSSGLLLV
                aaaaaaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaa

                       110       120       130       140       150
hs1pw4a ( 104 ) AavMlfMGfvpwATssiavMfvlLflCGwfQGmGwpPCgrTmvhwwsqke
QUERY g6pt      GLVNIFFAWSSTV----PVFAALWFLNGLAQGLGWPPCGKVLRKWFEPSQ
                aaaaaaaaa aaaa aaaaaaaaaaaaaaa aaaaaaaaa a

                       160       170       180       190       200
hs1pw4a ( 154 ) rggivsVwncAhNvggGiPPllFllGmawfndwhAALYmPAfcAilvAlf
QUERY g6pt      FGTWWAILSTSMNLAGGLGPILATILAQSY-SWRSTLALSGALCVVVSFL
                aaaaaaaaaaaaaaaa aaaaaaaaaaa aaaaaaaaaaaaa

                       210       220       230       240       250
hs1pw4a ( 204 ) AfamMrdTpqsCglppiee-----ykndtakqifmqyVlpnklLwyIAiA
QUERY g6pt      CLLLIHNEPADVGLRNLDPMPSEGKKGSLKEESTLQELLLSPYLWVLSTG
                aaaa aaaaaa aaaaaaaaa

                       260       270       280       290       300
hs1pw4a ( 262 ) NvfVyLLRYGiLDwSPtylkevKhfaldkSSwAYflYEyagipGTllCgw
QUERY g6pt      YLVVFGVKTCCTDWGQFFLIQEKGQSALVGSSYMSALEVGGLVGSIAAGY
                aaaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaaaa

                       310       320       330       340       350
hs1pw4a ( 312 ) msdkv----------frgnrGaTGvfFMtlVtiaTivywmnpagNptvdm
QUERY g6pt      LSDRAMAKAGLSNYGNPRHGLLLFMMAGMTVSMYLFRVTVTSDSPKLWIL
                aaaa aaaaaaaaaaaaaaaaaa aaaaa

                       360       370       380       390       400
hs1pw4a ( 352 ) iCmivIGflIyGPvmLIglHAleLApkkAagtAagfTglfGylgGSvaAs
QUERY g6pt      VLGAVFGFSSYGPIALFGVIANESAPPNLCGTSHAIVGLMANVGGFL-AG
                aaaaaaaaaa aaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaa

                       410       420       430       440       450
hs1pw4a ( 402 ) aiVGytvdffgwdgGfmvMigGSilAvilLivVmigekrrheqllqelvp
QUERY g6pt      LPFSTIAKHYSWSTAFWVAEVICAASTAAFFLLRNIRTKMGRVSKKAE--
                aaaaaaaa aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa33333
9
MPSA - only template and target
P_1P4W         FKPAPHKARLPAAEIDPTYRRLRWQIFLGIFFGYAAYYLVRKNFALAMPYLVEQG-FSRG
GLUCOSE6HUMAN  -------------MAAQGYGYYRTVIFSAMFGGYSLYYFNRKTFSFVMPSLVEEIPLDKD
               . .. ..

P_1P4W         DLGFALSGISIAYGFSKFIMGSVSDRSNPRVFLPAGLILAAAVMLFMGFVPWATSSIAVM
GLUCOSE6HUMAN  DLGFITSSQSAAYAISKFVSGVLSDQMSARWLFSSGLLLVGLVNIFFAWS----STVPVF
               . . .. ... . .

P_1P4W         FVLLFLCGWFQGMGWPPCGRTMVHWWSQKERGGIVSVWNCAHNVGGGIPPLLFLLGMAWF
GLUCOSE6HUMAN  AALWFLNGLAQGLGWPPCGKVLRKWFEPSQFGTWWAILSTSMNLAGGLGPILATI-LAQS
               . . . . . .

P_1P4W         NDWHAALYMPAFCAILVALFAFAMMRDTPQSCGLP-----PIEEYKNDTAKQIFMQYVLP
GLUCOSE6HUMAN  YSWRSTLALSGALCVVVSFLCLLLIHNEPADVGLRNLDPMPSEGKKGSLKEESTLQELLL
               . .. .. . ..

P_1P4W         NKLLWYIAIANVFVYLLRYGILDWSPTYLKEVKHFALDKSSWAYFLYEYAGIPGTLLCGW
GLUCOSE6HUMAN  SPYLWVLSTGYLVVFGVKTCCTDWGQFFLIQEKGQSALVGSSYMSALEVGGLVGSIAAGY
               . . . . . . .

P_1P4W         MSDKVFRGN--------RGATGVFFMTLVTIATIVYWMNPAGN--PTVDMICMIVIGFLI
GLUCOSE6HUMAN  LSDRAMAKAGLSNYGNPRHGLLLFMMAGMTVSMYLFRVTVTSDSPKLWILVLGAVFGFSS
               . . . .

P_1P4W         YGPVMLIGLHALELAPKKAAGTAAGFTGLFGYLGGSVAASAIVGYTVDFFGWDGGFMVMI
GLUCOSE6HUMAN  YGPIALFGVIANESAPPNLCGTSHAIVGLMANVGGFLAGLPFSTIAKHYSWSTAFWVAEV
               . ... . . . . .

P_1P4W         GGSILAVILLIVVMIGEKRRHEQLLQELVP
GLUCOSE6HUMAN  ICAASTAAFFLLRNIRTKMGRVSKKAE---
               . .
10
Extracted template-target
P_1PW4      -------FKPAPHKARLPAAEIDPTYRRLRWQIFLGIFFGYAAYYLVRKNFALAMPYLVE
gi2765461e  --------------------MAAQGYGYYRTVIFSAMFGGYSLYYFNRKTFSFVMPSLVE
            .

P_1PW4      QGFS---RGDLGFALSGISIAYGFSKFIMGSVSDRSNPRVFLPAGLILAAAVMLFMGFVP
gi2765461e  EIPLD--KDDLGFITSSQSAAYAISKFVSGVLSDQMSARWLFSSGLLLVGLVNIFFAWSS
            . . .

P_1PW4      WATSS--IAVMFVLLFLCGWFQGMGWPPCGRTMVHWWSQKERGGIVSVWNCAHN--VGGG
gi2765461e  TVP------VFAALWFLNGLAQGLGWPPCGKVLRKWFEPSQFGTWWAILSTSMN--LAGG
            . . ..

P_1PW4      IPP-------LLFLLGMAWFN-----------DWHAALYMPAFCAILVALFAFAMMRDTP
gi2765461e  LGP-------ILATILAQSYS------------WRSTLALSGALCVVVSFLCLLLIHNEP
            . ..

P_1PW4      QSCGLPPIEEYKNDT-------------------AKQIFMQYVLPNKLLWYIAIANVFVY
gi2765461e  ADVGLRNLDPMPSEG--------------KKGSLKEESTLQELLLSPYLWVLSTGYLVVF
            . . .

P_1PW4      LLRYGILDWSPTYLKEVKHFALDK-SSWAYFLYEYAGIPGTLLCGWMSDKVFR-------
gi2765461e  GVKTCCTDWGQFFLIQEKGQSALV-GSSYMSALEVGGLVGSIAAGYLSDRAMAKAGLSNY
            . .

P_1PW4      -GNRGATGVFFMTLVTIATIVYWMNPAG---------------NPTVDMICMIVIGFLIY
gi2765461e  GNPRHGLLLFMMAGMTVSMYLFRVTVTSD-----------S--PKLWILVLGAVFGFSSY

P_1PW4      GP-VMLIGLHALELAPKKAAGTAAGFTGLFGYLGGSVAASAIVGYTVDF-FGWDGGFMVM
gi2765461e  GP-IALFGVIANESAPPNLCGTSHAIVGLMANVG-GFLAGLPFSTIAKH-YSWSTAFWVA

P_1PW4      IGGSILAVILLIVVMIGEKRRHEQLLQELVP-----------------------------
gi2765461e  EVICAASTAAFFLLRNIRTKMGRVSKKAE-------------------------------
11
Checking alignment in MODELER
  • Using chk_align.top script
 _aln.pos  210 220 230 240 250 260 270
1PW4       MRDTPQSCGLPPIEEYKND/T-----AKQIFMQYVLPNKLLWYIAIANVFVYLLRYGILDWSPTYLKE
G6PT       IHNEPADVGLRNLDPMPSE-GKKGSLKEESTLQELLLSPYLWVLSTGYLVVFGVKTCCTDWGQFFLIQ
 _consrvd
  • Problem near chain break
 _aln.pos  210 220 230 240 250 260 270
1PW4       MRDTPQSCGLPPIEEYKND/----TAKQIFMQYVLPNKLLWYIAIANVFVYLLRYGILDWSPTYLKEV
G6PT       IHNEPADVGLRNLDPMPSEGKKGSLKEESTLQELLLSPYLWVLSTGYLVVFGVKTCCTDWGQFFLIQE
 _consrvd

12
Modeler Runs
  • Using extracted template and target alignment
  • Sequence for template extracted from structure
    using Insight
  • Missing residues in structure appear as chain
    breaks
  • Parameters
  • OUTPUT_CONTROL 1 1 1 1 1
  • STARTING_MODEL 1
  • ENDING_MODEL 5
  • LIBRARY_SCHEDULE 4
  • MD_LEVEL 'refine_1'

13
PROSA 2 runs
  • Used to evaluate models
  • Models with best scores from MODELER were
    compared using PROSA
  • Z value used for initial comparison
  • Graph used to identify location of major
    violations

14
Model Selection Criteria
  • MODELER log file
  • Minimum energy
  • Number of violations
  • Number of really bad violations
  • Location of violations with respect to alignment
    and structure
  • PROSA 2 log file
  • Z score closest to template
  • Peaks and troughs in graph relative to template

15
Adjusting the alignment
  • Comparison of structures obtained from modeler in
    Insight
  • Alignment violations clearly visible
  • Criteria for modifying alignment
  • Unequal number of residues in loop
  • Unsatisfied structural similarity constraints
  • Residues violating constraints as generated by
    modeler

16
(No Transcript)
17
1st run - adjustment in Insight
18
(No Transcript)
19
Loop Modeling
  • Modeler Run 2
  • Loop Modeling Run 1

20
Loop modeling
  • Generate models based on adjusted alignment
  • 25 models obtained
  • Models selected based on minimum energy and
    constraint violations
  • Parameters
  • OUTPUT_CONTROL 1 1 1 1 1
  • STARTING_MODEL 1
  • ENDING_MODEL 5
  • LIBRARY_SCHEDULE 2
  • MD_LEVEL 'refine_3'
  • DO_LOOPS 1
  • LOOP_ENDING_MODEL 5
  • LOOP_MD_LEVEL 'refine_3'
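
The count of 25 models is consistent with 5 base models (STARTING_MODEL 1 to ENDING_MODEL 5) times 5 loop models each (LOOP_ENDING_MODEL 5). A small sketch of that bookkeeping, assuming (the slides do not state this) that ID1 indexes the base model, ID2 indexes the loop model, and loop numbering starts at 1:

```python
from itertools import product

starting_model, ending_model = 1, 5  # STARTING_MODEL, ENDING_MODEL
loop_starting_model = 1              # assumed; not listed on the slide
loop_ending_model = 5                # LOOP_ENDING_MODEL

# Every (ID1, ID2) combination: one entry per loop model of each base model
model_ids = list(
    product(
        range(starting_model, ending_model + 1),
        range(loop_starting_model, loop_ending_model + 1),
    )
)
print(len(model_ids))
```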

21
Loop Modeling Run 1: Best 4 Models Picked
    ID1, ID2    Current energy    PROSA Z score
    1, 5        192               -6.60
    3, 2        387               -6.57
    4, 2        363               -6.76
    5, 4        242               -6.3
  • (Z score of template: -7.3)
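
Two of the selection criteria named earlier in the deck (minimum energy, Z score closest to the template) can be applied to the four models listed here. A minimal sketch with the values copied from this slide; the dictionary layout and variable names are illustrative only, and the violation-based criteria are not included.

```python
# (ID1, ID2) -> (current energy, PROSA Z score), copied from the slide
models = {
    (1, 5): (192.0, -6.60),
    (3, 2): (387.0, -6.57),
    (4, 2): (363.0, -6.76),
    (5, 4): (242.0, -6.3),
}
TEMPLATE_Z = -7.3  # Z score of the template, from the slide

# Criterion 1: lowest MODELER energy
lowest_energy = min(models, key=lambda ids: models[ids][0])

# Criterion 2: PROSA Z score closest to that of the template
closest_z = min(models, key=lambda ids: abs(models[ids][1] - TEMPLATE_Z))

print(lowest_energy, closest_z)
```

The two criteria pick different models here, which is why the deck also looks at the number and location of violations.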

22
Violations - MODELER log file
ID1, ID2: 1, 5    Current energy: 192.1849

    RESTRAINT_GROUP                         NUM  NUMVI  NUMVP    RMS_1    RMS_2  MOL.PDF    S_i
-----------------------------------------------------------------------------------------------
25  Phi/Psi pair of dihedral restraints      64     44     11   36.170  140.638   79.036  1.000
-----------------------------------------------------------------------------------------------

Feature 25: Phi/Psi pair of dihedral restraints
List of the RVIOL violations larger than 6.5000

    ICSR   RESNO1/2 ATM1/2 INDATM1/2     FEAT    restr    viol  rviol   RESTR    VIOL  RVIOL
 7  1360   45D  46K   C  N   368 370   -68.99   -70.20   30.80   2.20  -62.90  150.55  19.23
 7         46K  46K   N CA   370 371   109.62   140.40                 -40.80
 8  1361   46K  47D   C  N   377 379   173.18    54.50  123.21  12.43  -63.30  132.44  18.20
 8         47D  47D   N CA   379 380     7.79    40.90                 -40.00
 9  1362   47D  48D   C  N   385 387  -138.58   -63.30   76.02  11.52  -63.30   76.02  11.52
 9         48D  48D   N CA   387 388   -29.45   -40.00                 -40.00
12  1369  103F 104A   C  N   811 813   -69.81   -68.20   21.24   1.77  -62.50  165.18  26.73
12        104A 104A   N CA   813 814   124.12   145.30                 -40.90
13  1370  104A 105A   C  N   816 818  -169.75   -62.50  107.58  21.02  -62.50  107.58  21.02
13        105A 105A   N CA   818 819   -49.29   -40.90                 -40.90
23
1st loop model - violations in Insight
Residue 104
Residue 46
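
The residue numbers in the violation list can be mapped back onto the target sequence and the alignment. A minimal sketch using the first three QUERY (g6pt) blocks of the FUGUE alignment in this deck; the helper names are illustrative, and the sketch assumes residue numbering starts at 1 at the first residue of the target.

```python
# First 150 alignment columns of the g6pt query row from the FUGUE alignment
query_blocks = [
    "-------------MAAQGYGYYRTVIFSAMFGGYSLYYFNRKTFSFVMPS",
    "LVEEIPLDKDDLGFITSSQSAAYAISKFVSGVLSDQMSARWLFSSGLLLV",
    "GLVNIFFAWSSTV----PVFAALWFLNGLAQGLGWPPCGKVLRKWFEPSQ",
]
aligned = "".join(query_blocks)
sequence = aligned.replace("-", "")  # ungapped target sequence


def residue(seq, number):
    """One-letter code of residue `number` (1-based)."""
    return seq[number - 1]


def alignment_column(aligned_seq, number):
    """1-based alignment column that holds residue `number` of the gapped sequence."""
    seen = 0
    for column, char in enumerate(aligned_seq, start=1):
        if char != "-":
            seen += 1
            if seen == number:
                return column
    raise ValueError("residue number beyond end of sequence")


for n in (46, 104):
    print(n, residue(sequence, n), alignment_column(aligned, n))
```

The one-letter codes recovered this way agree with the labels in the log (45D, 46K, 47D, 48D, 103F, 104A, 105A).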
24
Loop Model Run 1 - adjustment
25
Loop Modeling 2
  • Refinement of Loop Model 1
  • Loop Modeling 2
  • Modeler Run 3

26
Loop Modeling Run 2: Best 5 Models
    ID1, ID2    Current energy    PROSA Z score
    5, 1        237.4322          -5.82
    3, 1        222.2522          -6.27
    1, 1        195.7286          -6.32
    2, 4        226.8002          -6.09
    2, 2        198.0359          -6.15

27
Violations - MODELER log file
ID1, ID2: 1, 1    Current energy: 195.7286

    RESTRAINT_GROUP                         NUM  NUMVI  NUMVP    RMS_1    RMS_2  MOL.PDF    S_i
-----------------------------------------------------------------------------------------------
 4  Stereochemical improper torsion pot     156      1      2    1.943    1.943   16.723  1.000
25  Phi/Psi pair of dihedral restraints      67     40     11   34.260  132.074   73.358  1.000
-----------------------------------------------------------------------------------------------

Feature 25: Phi/Psi pair of dihedral restraints
List of the RVIOL violations larger than 6.5000

    ICSR   RESNO1/2 ATM1/2 INDATM1/2     FEAT    restr    viol  rviol   RESTR    VIOL  RVIOL
 3  1430   45D  46K   C  N   368 370  -103.79  -118.00   33.92   1.76  -62.90  154.80  22.53
 3         46K  46K   N CA   370 371   169.89   139.10                 -40.80
 4  1431   46K  47D   C  N   377 379   -95.02   -70.90   59.16   2.00  -63.30  119.95  16.85
 4         47D  47D   N CA   379 380  -155.68   150.30                 -40.00
 5  1432   47D  48D   C  N   385 387   -63.33   -70.90   31.08   1.19  -63.30  160.16  19.77
 5         48D  48D   N CA   387 388   120.16   150.30                 -40.00
 9  1441  103F 104A   C  N   811 813  -122.41  -134.00   20.39   1.24  -62.50  166.47  30.50
 9        104A 104A   N CA   813 814   163.78   147.00                 -40.90
10  1442  104A 105A   C  N   816 818   -64.90   -68.20   29.69   2.28  -62.50  156.71  25.57
10        105A 105A   N CA   818 819   115.80   145.30                 -40.90
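
Both log excerpts flag the same two stretches (residues 45-48 and 103-105). A small sketch for comparing the reported violations between loop modeling run 1 (model 1 5) and run 2 (model 1 1). The numbers are copied from the two excerpts; the dictionary layout and the helper name are illustrative only. Only the `viol` and `RVIOL` columns are compared, and how to weigh them against each other is left open.

```python
# Residue pair -> (viol, RVIOL) for each Phi/Psi restraint, copied from the logs
run1 = {
    "45D-46K": (30.80, 19.23),
    "46K-47D": (123.21, 18.20),
    "47D-48D": (76.02, 11.52),
    "103F-104A": (21.24, 26.73),
    "104A-105A": (107.58, 21.02),
}
run2 = {
    "45D-46K": (33.92, 22.53),
    "46K-47D": (59.16, 16.85),
    "47D-48D": (31.08, 19.77),
    "103F-104A": (20.39, 30.50),
    "104A-105A": (29.69, 25.57),
}


def reduced(column):
    """Residue pairs whose value in the given column is lower in run 2 than in run 1."""
    return [pair for pair in run1 if run2[pair][column] < run1[pair][column]]


viol_reduced = reduced(0)   # column 0: viol
rviol_reduced = reduced(1)  # column 1: RVIOL
print(viol_reduced)
print(rviol_reduced)
```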
28
Loop Model Violation Sites
29
(No Transcript)
30
(No Transcript)
31
(No Transcript)
32
(No Transcript)
33
Refinements in Final Model
  • Some regions can be realigned and refined further
    taking into consideration their energy
    violations.
  • Other tools, such as PROCHECK, could be used in
    addition to MODELER and PROSA to get further
    insight into energy details.
  • Structural alignment of model with other known
    transport protein structures might be of some
    help.