
1
MrBUMP: Molecular Replacement with Bulk Model Preparation
Automated search model discovery and preparation for structure solution by
molecular replacement
  • Ronan Keegan, Martyn Winn
  • CCP4 group, Daresbury Laboratory

Leuven, August 8th 2006
2
The aim of MrBUMP
  • An automation framework for Molecular
    Replacement.
  • Particular emphasis on generating a variety of
    search models.
  • Wraps Phaser and/or Molrep.
  • Also uses a variety of helper applications (e.g.
    Chainsaw) and bioinformatics tools (e.g. Fasta,
    Mafft)
  • Uses on-line databases (e.g. PDB, SCOP)
  • In favourable cases, gives one-button solution
  • In unfavourable cases, will suggest likely search
    models for manual investigation

3
The pipeline - first steps
Input: Target MTZ + Sequence
Target Details: number of residues, molecular weight, Matthews coefficient,
estimated number of molecules in the a.s.u.
Template Search: generate a list of structures that are possible templates
for search models
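
The Target Details step can be illustrated with a minimal sketch of the standard Matthews coefficient calculation. This is not MrBUMP's own code: the function names, the preferred Vm of 2.5 Å³/Da and the cell volume used in the usage note are assumptions chosen for illustration.

```python
# Sketch only: standard Matthews coefficient arithmetic, not MrBUMP's code.

def matthews_coefficient(cell_volume_A3, n_symops, n_molecules, mol_weight_Da):
    """Vm = V_cell / (Z * n * MW), in cubic Angstroms per Dalton.

    Z is the number of symmetry operators of the space group, n the number
    of molecules assumed in the asymmetric unit, MW the molecular weight.
    """
    return cell_volume_A3 / (n_symops * n_molecules * mol_weight_Da)


def solvent_fraction(vm):
    """Approximate solvent fraction from Vm (Matthews, 1968)."""
    return 1.0 - 1.23 / vm


def estimate_copies(cell_volume_A3, n_symops, mol_weight_Da,
                    target_vm=2.5, max_copies=20):
    """Pick the copy number whose Vm is closest to target_vm.

    target_vm = 2.5 is an assumed "typical" value, not a MrBUMP setting.
    """
    return min(
        range(1, max_copies + 1),
        key=lambda n: abs(
            matthews_coefficient(cell_volume_A3, n_symops, n, mol_weight_Da)
            - target_vm
        ),
    )
```

For example, with a made-up cell volume of 1,000,000 Å³, 8 symmetry operators and a 205-residue chain at roughly 110 Da per residue, `estimate_copies(1.0e6, 8, 205 * 110)` compares the Vm for each candidate copy number and returns the one nearest the assumed target.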
4
Search for homologous proteins
  • FASTA search of PDB
  • Sequence based search using sequence of target
    structure.
  • Can be run locally if user has fasta34 program
    installed or remotely using the OCA web-based
    service hosted by the EBI.
  • Local search is done against the complete list of
    PDB sequences derived from ATOM records in the
    PDB structure files.
  • All of the resulting PDB id codes are added to a
    list
  • Not interested in the alignment to target at
    this stage.

5
Search for additional similar structures
  • Additional structure-based search (optional)
  • Top hit from the FASTA search is used as the
    template structure for a secondary structure
    based search.
  • Uses the SSM webservice provided by the EBI
    (a.k.a. MSDfold)
  • Any new structures found are added to the list.
  • Provides structural variation, not based on
    direct sequence similarity to target
  • Manual addition
  • Can add additional PDB id codes to the list,
    e.g. from FFAS or PSI-BLAST searches

6
Multiple Alignment
  • After the set of PDB ids is collected in the
    FASTA and SSM searches, their coordinate-based
    sequences are collected and put through a
    multiple alignment with the target sequence
  • Aims
  • Extract pairwise alignment between template and
    target for use in Chainsaw step. Multiple
    alignment should give a better set of alignments
    than the original pairwise FASTA alignments
  • Score template structures in a consistent manner,
    in order to prioritise them for subsequent steps
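
Extracting a pairwise alignment from a multiple alignment can be sketched as follows. This is an illustrative assumption about how such a step might look, not MrBUMP's implementation: the alignment is represented as a plain dictionary of equal-length gapped strings, and `extract_pairwise` is a hypothetical helper.

```python
# Sketch only: pull one target/template pair out of a multiple alignment.

def extract_pairwise(msa, target_id, template_id, gap="-"):
    """Return the (target, template) rows with shared gap columns removed.

    msa maps a sequence id to its gapped row; all rows must have the same
    length. Columns where both rows hold a gap carry no information for
    the pair and are dropped.
    """
    target_row = msa[target_id]
    template_row = msa[template_id]
    if len(target_row) != len(template_row):
        raise ValueError("alignment rows must have equal length")

    kept = [
        (t, m)
        for t, m in zip(target_row, template_row)
        if not (t == gap and m == gap)
    ]
    target_aln = "".join(t for t, _ in kept)
    template_aln = "".join(m for _, m in kept)
    return target_aln, template_aln
```

The resulting pair of gapped strings is the kind of target/template alignment that a tool such as Chainsaw takes as input, although the file format Chainsaw expects is not covered here.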

7
Multiple Alignment
[Figure: multiple alignment viewed in Jalview 2.08.1 (Barton group, Dundee),
showing the target, the model templates and an extracted pairwise alignment]
  • Currently supports ClustalW or MAFFT for multiple alignment
8
Template Model Scoring
  • Alignment Scoring
  • score = sequence identity × alignment quality
  • Sequence identity
  • Ungapped sequence identity i.e. sequence identity
    of aligned target residues
  • Alignment quality
  • Dependent on the alignment length, the number of
    gaps created in the template alignment and the
    extent of each of these gaps.
  • The penalties for the number and size of gaps are
    biased so that alignments that preserve domains
    of the structure score higher than those that
    spread the aligned residues out.
  • The top scoring models are then used for further
    processing
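
The scoring scheme above can be sketched as below. The slides do not give the alignment quality formula, so the one used here (target coverage minus a per-gap penalty) and the penalty values `gap_open` and `gap_extend` are hypothetical stand-ins. Only the product form, score = sequence identity × alignment quality, and the ungapped identity come from the slide.

```python
# Sketch only: the quality term is a hypothetical stand-in, not MrBUMP's.
import re


def ungapped_identity(target_aln, template_aln, gap="-"):
    """Fraction of identical residues among columns where both are residues."""
    aligned = [(t, m) for t, m in zip(target_aln, template_aln)
               if t != gap and m != gap]
    if not aligned:
        return 0.0
    return sum(t == m for t, m in aligned) / len(aligned)


def alignment_quality(target_aln, template_aln, gap_open=0.05, gap_extend=0.01):
    """Hypothetical quality: target coverage minus a penalty per template gap.

    Charging gap_open for every separate gap means that many short gaps
    cost more than one long gap of the same total length, which favours
    alignments that keep contiguous regions together.
    """
    target_len = sum(c != "-" for c in target_aln)
    if target_len == 0:
        return 0.0
    aligned = sum(t != "-" and m != "-"
                  for t, m in zip(target_aln, template_aln))
    coverage = aligned / target_len
    gap_runs = re.findall(r"-+", template_aln)
    penalty = sum(gap_open + gap_extend * len(run) for run in gap_runs)
    return max(0.0, coverage - penalty)


def template_score(target_aln, template_aln):
    """score = sequence identity x alignment quality."""
    return (ungapped_identity(target_aln, template_aln)
            * alignment_quality(target_aln, template_aln))
```

Templates would then be ranked by `template_score` and the top few passed on. With these assumed penalties, a template aligned as `MK--AG` scores higher than one aligned as `M-L-AG` against the same six-residue target, because the second spreads its gaps out.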

9
Domains
  • Suitable templates for individual target domains
    may exist in isolation in PDB, or in combination
    with dissimilar domains
  • In case of relative domain motion, may want to
    solve domains separately

10
Domains
  • Domains search
  • Top scoring templates from multiple alignment are
    tested to see if they contain any domains.
  • Uses the SCOP database. This only lists domains
    that appear more than once in the PDB.
  • The database is scanned to see if domains
    exist for each of the PDBs in the list of
    templates
  • Domains are then extracted from the parent PDB
    structure file and added to the list of template
    models as additional search models for MR.

11
Multimers
  • Multimer search
  • Search for quaternary structures that may be used
    as search models.
  • Better signal-to-noise ratio than monomer, if
    assembly is correct for the target.
  • Multimeric structures based on top templates are
    retrieved using the PQS service at the EBI, and
    added to the list of search models
  • PQS will soon be replaced by the use of the PISA
    service at the EBI (Eugene Krissinel)

1n5a  SPLIT-ASU into 4 Oligomeric files of type TRIMERIC
1n5b  SPLIT-ASU into 2 Oligomeric files of type DIMERIC
1n5c  SYMMETRY-COMPLEX Oligomeric file of type DIMERIC
1n5d  SYMMETRY-COMPLEX Oligomeric file of type DIMERIC
12

Target MTZ + Sequence → Target Details → Model Search → Model Preparation
Model Preparation: raw template structures are not usually appropriate
for MR. Edit to create a search model.
13
Search Model Preparation
  • Search models prepared in four ways
  • PDBclip
  • original PDB with waters removed, hydrogens
    removed, most probable conformations for side
    chains selected and chain IDs added if missing.
  • Molrep
  • Molrep contains a model preparation function
    which will align the template sequence with the
    target sequence and prune the non-conserved side
    chains accordingly.
  • Chainsaw
  • Can be given any alignment between the target and
    template sequences.
  • Non-conserved residues are pruned back to the
    gamma atom.
  • Polyalanine
  • Created by excluding all of the side chain atoms
    beyond the CB atom using the Pdbset program

14
Search Model Preparation
  • Ensemble for Phaser
  • Top scoring search models are superposed to
    create an ensemble model.
  • This may provide a better search model than any
    of the individual models on their own.
  • Currently the default is to use the top 5 scoring
    search models, but the plan is to choose the set dynamically
    based on MW and RMSDs of constituent search models

15

Target MTZ + Sequence → Target Details → Model Search → Model Preparation
→ Molecular Replacement & Refinement
16
Molecular Replacement and Refinement
  • The search models can be processed with Molrep or
    Phaser or both.
  • The resulting models from molecular replacement
    are passed to Refmac for restrained refinement.
  • The change in the Rfree value during refinement
    is used as rough estimate of how good the
    resulting model is.

final Rfree < 0.35, or final Rfree < 0.5 and dropped by 20% → success
final Rfree < 0.5 and dropped by 5% → marginal
otherwise → failure
  • MR scores and un-refined models available for
    later inspection.
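
The Rfree criteria above can be written as a small decision function. This is a sketch under one stated assumption: "dropped by 20" and "dropped by 5" are read as relative drops of 20% and 5% from the initial Rfree. The function name is hypothetical.

```python
# Sketch only: assumes the drops are percentages of the initial Rfree.

def classify_solution(initial_rfree, final_rfree):
    """Label a refined MR result as 'success', 'marginal' or 'failure'."""
    if initial_rfree > 0:
        relative_drop = (initial_rfree - final_rfree) / initial_rfree
    else:
        relative_drop = 0.0

    if final_rfree < 0.35 or (final_rfree < 0.5 and relative_drop >= 0.20):
        return "success"
    if final_rfree < 0.5 and relative_drop >= 0.05:
        return "marginal"
    return "failure"
```

The order of the tests matters: the stricter success condition is checked first, so the marginal branch only catches results that missed it.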

17
Serial mode

Target MTZ + Sequence → Target Details → Model Search → Model Preparation
→ Molecular Replacement & Refinement
After each Molecular Replacement & Refinement run: check scores and exit,
or select the next model
18
Parallel mode

Target MTZ + Sequence → Target Details → Model Search → Model Preparation
→ several Molecular Replacement & Refinement jobs side by side
Start multiple MR jobs and exit when one finds a solution
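
The parallel strategy, starting several jobs and stopping at the first solution, can be sketched with Python's standard `concurrent.futures` module. This is not how MrBUMP itself launches jobs: `run_mr` and `is_solution` are hypothetical callables standing in for an MR plus refinement run and for the success test.

```python
# Sketch only: first-solution-wins scheduling with hypothetical callables.
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_parallel(models, run_mr, is_solution, max_workers=4):
    """Run run_mr on each model concurrently; return the first solution.

    Returns (model, result) for the first finished job whose result
    satisfies is_solution, or None if no job produces a solution.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(run_mr, model): model for model in models}
        for finished in as_completed(futures):
            result = finished.result()
            if is_solution(result):
                # Cancel jobs that have not started yet. Jobs already
                # running are not interrupted; leaving the with-block
                # waits for them to finish.
                for other in futures:
                    other.cancel()
                return futures[finished], result
    return None
```

A serial run is the same loop with one job at a time: run a model, check the result, then either stop or move on to the next model. A real pipeline would also need to terminate jobs that are still running, which this sketch does not attempt.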
19
MrBUMP on compute clusters
  • MrBUMP can take advantage of a compute cluster to
    farm out the Molecular Replacement jobs.
  • Currently Sun Grid Engine enabled clusters are
    supported but support will be added for LSF and
    Condor and any other types of queuing system if
    there is enough demand.

20
Pre-release version of MrBUMP
  • Pre-release made available in Jan 06
  • Simple installation
  • Currently runs on Linux and OSX.
  • Windows version almost ready.
  • Comes with CCP4 GUI.
  • Can also be run from the command line with
    keyword input
  • First citation in Obiero et al., Acta Cryst.
    (2006). F62, 757-760
  • Regular updates (currently version 0.3.1)

http://www.ccp4.ac.uk/MrBUMP
21
Example 1
1vlw: aldolase from T. maritima, 3 chains of 205 aa.
Data in C2221 to 2.3 Å. Using Molrep.
Solutions based on 1fq0. Other models based on 1eun and 1eua.
Most, but not all, work.
22
Example 2
1k6d: alpha subunit of acetate CoA-transferase from E. coli.
Solutions based on 1ope.
1ope: longer chains, colinear with the bacterial alpha and beta subunits.
Domain 2 is therefore the best model.
The whole chain also works because of alignment / model editing.
23
A few observations ...
  • In difficult cases, success in MrBUMP may depend
    on particular template, chain and model
    preparation method
  • Nevertheless, may get several putative solutions
  • Ease of subsequent model re-building, model
    completion may depend on choice of solution
  • First solution or check everything?
  • Expectation that quick solution required - in
    fact, most users seem happy to let MrBUMP run for
    long time (hours, days)
  • Worth checking failed solutions!

24
Future developments
  • Windows support (requires installer)
  • Complexes (in progress)
  • Processing of multiple target sequences
  • Improved alignment
  • Multiple alignment against larger sequence
    database
  • Alignment from profile-based search
  • User-supplied alignment
  • Incorporate PISA multimer determining service
    (in progress)
  • Model generation
  • Identification of flexible loops
  • Normal mode generated conformations
  • Develop web-service version to allow CCP4i users
    to run jobs on CCP4 cluster

25
Acknowledgements
  • Ronan Keegan, CCP4 @ Daresbury
  • Thanks to authors of all underlying programs and
    services
  • Other suggestions from
  • Dave Meredith, Graeme Winter, Daresbury
    Laboratory.
  • Eugene Krissinel, EBI, Cambridge.
  • Eleanor Dobson, YSBL, York University
  • Geoff Barton, Charlie Bond, University of Dundee
  • Randy Read, Airlie McCoy, Cambridge
  • Funding
  • BBSRC (e-HTPX, CCP4)
  • See posters m34.p07 (Keegan, sic), m03.p03
    (Vagin), m32.p06 (Remacle)

http://www.ccp4.ac.uk/MrBUMP