www.wwpdb.org - PowerPoint PPT Presentation

Title:

www.wwpdb.org

Slides: 66
Provided by: Mart492

1
www.wwpdb.org
September 29, 2008
2
Agenda
10:00 am  Welcome and Introductions (KH)
10:15  Overview of recent wwPDB progress (HB)
10:35  Outreach (HN)
10:55  NMR Task Force (JM)
11:15  Improvements in Data Deposition and Processing (KH)
11:45  New Projects (HB)
Noon  Working Lunch
1:00 pm  Funding Update (All)
1:30  Matters Arising: Committee membership, Next meeting
2:00  Discussion
3:00  Executive Session
3:15  Feedback
3:30  Adjourn
3
Overview
Worldwide Protein Data Bank www.wwpdb.org
  • Helen Berman

4
wwPDBAC 2007 (on wwPDB Intranet)
5
wwPDBAC 2007 Recommendations
  • Structure factors and/or NMR restraints should be
    a prerequisite for receiving a PDB ID
  • Done
  • Inform the relevant journals of this new policy
  • Done; adopted by some but not all
  • Validation
  • Establish additional X-ray crystallography and
    NMR validation procedures
  • In progress
  • Results should be made available to depositors
    immediately after submission. Upon depositor
    request, the validation reports should be made
    available to designated scientific journal
    editors
  • Possible now; journal policies have not yet
    changed
  • Work to establish recommendations for additional
    experimental data deposition and release
    requirements
  • In progress

6
wwPDB Achievements: October 2007 - September 2008
  • Continued growth of archive: now more than
    50,000 structures
  • Website updates
  • Download statistics available
  • Publications and presentations
  • Enhanced complex molecule annotation
  • New Format document
  • Initiation of Common Annotation Tool development

7
Depositions
8
Depositions to the PDB by decade
(Chart: number of released entries by year)
9
Depositor locations and download locations
(Charts by site: RCSB PDB, PDBe, PDBj)
11
PDB File Downloads
Last 12 months
FTP: 256,753,220
HTTP: 47,102,103
Total: 303,855,323
12
Outreach
Worldwide Protein Data Bank www.wwpdb.org
  • Haruki Nakamura

13
Outreach
  • wwPDB website
  • Simultaneous updating of PDB archives
  • Publications
  • Professional society meetings
  • Presentations
  • Exhibit booth

14
wwPDB website
Deposition and Release Policies
Deposition and download statistics
Format Description
Meeting information and preliminary
recommendations
15
Simultaneous weekly update of PDB archive
  • In the past, the PDBj site started to copy the
    latest data and load them into its local database
    system only after the RCSB-PDB archive was updated
    on Wednesday. As a result, the database at PDBj
    was updated with some delay. This frustrated
    potential PDBj users, who preferred to access
    RCSB-PDB.
  • Since Sept. 2008, PDBj copies the latest data
    directly from the internal database at RCSB-PDB
    to pre-construct the PDBj database at midnight on
    Saturday.
  • RCSB-PDB automatically sends a mail after
    updating its public ftp-site every Wednesday. On
    receiving this mail, the ftp-site at PDBj is also
    updated, with little time delay.

16
Joint publications
  • K. Henrick, Z. Feng, W. Bluhm, D. Dimitropoulos,
    J.F. Doreleijers, S. Dutta, J.L.
    Flippen-Anderson, J. Ionides, C. Kamada, E.
    Krissinel, C.L. Lawson, J.L. Markley, H.
    Nakamura, R. Newman, Y. Shimizu, J. Swaminathan,
    S. Velankar, J. Ory, E.L. Ulrich, W. Vranken, J.
    Westbrook, R. Yamashita, H. Yang, J. Young, M.
    Yousufuddin, and H. Berman (2008) Remediation of
    the Protein Data Bank Archive. Nucleic Acids Res.
    36 (Database issue): D426-D433.
  • J.L. Markley, E.L. Ulrich, H. Berman, K. Henrick,
    H. Nakamura, and H. Akutsu (2008) BioMagResBank
    (BMRB) as a partner in the Worldwide Protein Data
    Bank (wwPDB): New policies affecting biomolecular
    NMR depositions. J. Biomol. NMR 40: 153-155.
  • S. Dutta, K. Burkhardt, G.J. Swaminathan, T.
    Kosada, K. Henrick, H. Nakamura, and H.M. Berman
    (2008) Data deposition and annotation at the
    Worldwide Protein Data Bank, in Methods in
    Molecular Biology 426: Structural Proteomics:
    High-Throughput Methods, B. Kobe, M. Guss, and T.
    Huber, Editors. Humana Press: Totowa, NJ.
  • C.L. Lawson, S. Dutta, J.D. Westbrook, K.
    Henrick, and H.M. Berman (2008) Representation of
    viruses in the remediated PDB archive. Acta
    Cryst. D64: 874-882.

17
Interactions
  • Exchange visits
  • PDBe/RCSB PDB
  • PDBj/RCSB PDB
  • PDBj/BMRB
  • BMRB/RCSB PDB
  • BMRB/PDBe
  • Phone conference with site directors: twice a year
  • VTCs among staff
  • BMRB/RCSB PDB twice a month (ADIT-NMR)
  • MSD/RCSB PDB weekly
  • RCSB PDB/PDBj and BMRB/PDBj
  • BMRB/PDBe
  • Daily emails among staff
  • PDBe/RCSB PDB
  • PDBj/RCSB PDB
  • BMRB/RCSB PDB, PDBj, PDBe

wwPDB Retreat 2007
18
wwPDB Retreat
19
IUCr Osaka 2008
  • Joint exhibition stand
  • Presentations
  • Keynote lecture, What the Protein Data Bank tells
    us about the past, present and future of
    structural biology
  • Validation talk, Data Quality in the PDB Archive
  • Q&A at the Commission on Biological
    Macromolecules
  • Specialized Participation
  • Small Angle Commission
  • Workshop on New Routes to Crystallographic Data
    Publication
  • COMCIFS

20
http://www.eccb08.org
A demonstration describing the wwPDB, highlighting
the collaboration as well as the services offered
by member sites
21
NMR Update
Worldwide Protein Data Bank www.wwpdb.org
  • John Markley

22
NMR structure depositions
  • Number of NMR structures deposited through
    ADIT-NMR (09/01/07-08/31/08)
  • BMRB -> RCSB PDB: 461
  • PDBj-BMRB -> PDBj: 112
  • Restraints remediation
  • Processing is virtually complete
  • Will be released as soon as it can be made
    consistent with the remediated chemical
    components dictionary

23
wwPDB policies and rules on NMR entries
  • Two types of NMR experiments will be
    distinguished in the PDB entries
  • Solution NMR
  • Solid-state NMR
  • NMR entries will have new PDB records
  • MDLTYP to indicate MINIMIZED AVERAGE
  • NUMMDL to specify number of models in entry
  • These changes are reflected in Format Guide 3.2

24
wwPDB policies and rules on NMR entries
  • The numbering of models is sequential, beginning
    with 1
  • All models in a deposition (ensemble members and
    minimized average, if provided) should be
    superimposed in an appropriate, author-determined
    manner, and only one superposition method should
    be used.
  • All models in an NMR ensemble and the minimized
    average structure, if provided, should have the
    same sequence and covalent structure (exact same
    number and type of atoms, both hydrogens and heavy
    atoms), and chemistry (e.g., protonation state)
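
The model rules above lend themselves to a simple automated check. The following is only a minimal sketch, not wwPDB's validation code: it assumes the fixed-column layout of the PDB format (model number in columns 11-14 of MODEL and NUMMDL, atom name in columns 13-16, residue name in columns 18-20), and the function name check_models and the two-model fragment are hypothetical.

```python
def check_models(pdb_lines):
    """Return a list of problems found in the NUMMDL/MODEL records."""
    declared = None   # value of the NUMMDL record, if present
    serials = []      # model serial numbers in file order
    atoms = {}        # model serial -> list of (atom name, residue name)
    current = None
    for line in pdb_lines:
        record = line[:6].strip()
        if record == "NUMMDL":
            declared = int(line[10:14])
        elif record == "MODEL":
            current = int(line[10:14])
            serials.append(current)
            atoms[current] = []
        elif record == "ENDMDL":
            current = None
        elif record in ("ATOM", "HETATM") and current is not None:
            atoms[current].append((line[12:16].strip(), line[17:20].strip()))

    problems = []
    # Rule: numbering is sequential, beginning with 1.
    if serials != list(range(1, len(serials) + 1)):
        problems.append("model serials are not sequential from 1")
    # NUMMDL should agree with the number of MODEL records.
    if declared is not None and declared != len(serials):
        problems.append("NUMMDL does not match number of MODEL records")
    # Rule: every model has the same number and type of atoms.
    for serial in serials[1:]:
        if atoms[serial] != atoms[serials[0]]:
            problems.append(f"model {serial} differs in atoms from model {serials[0]}")
    return problems


# Hypothetical two-model fragment (coordinates omitted), for illustration only.
example = [
    "NUMMDL    2",
    "MODEL        1",
    "ATOM      1  N   GLY A   1",
    "ATOM      2  CA  GLY A   1",
    "ENDMDL",
    "MODEL        2",
    "ATOM      1  N   GLY A   1",
    "ATOM      2  CA  GLY A   1",
    "ENDMDL",
]
print(check_models(example))
```

An empty list means none of the three checks failed. Superposition and protonation state are not covered by this sketch.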

25
Policies clarified by NMR Task Force, August 26,
2008
  • PDB will accept minimized average structures only
    if they meet the above criteria for alignment and
    covalent structure
  • The number of models will not be limited in a PDB
    file
  • Chemical shifts deposition will become mandatory
  • Depositors are encouraged to avail themselves of
    third-party validation software prior to
    deposition of NMR structures

26
Improvements in Data Deposition and Annotation
Worldwide Protein Data Bank www.wwpdb.org
  • Kim Henrick

27
A year of VTCs and discussions
28
PDB Contents Guide Version 3.2
  • The goal was to further clarify all formats and
    procedures so as to create a more uniform archive

29
Process
  • Every record was reviewed for scientific
    correctness and clarity by wwPDB annotators
  • Some records were added and others expanded
  • Task Force members were consulted where
    appropriate

30
Added PDB Format Records
  • SPLIT: for large structures, to indicate number
    of PDB entries
  • NUMMDL: number of MODELS in an entry
  • MDLTYP: model types and if C-alpha only chains
  • REMARK 0: Re-refinement notice
  • REMARK 475: Residues modeled with zero occupancy
  • REMARK 480: Polymer atoms modeled with zero
    occupancy
  • REMARK 620: Metal coordination
  • REMARK 630: Inhibitor Description
  • DBREF1 / DBREF2: To match very long UniProt
    Identifiers
  • DBREF (standard format still used)
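
As an illustration of the information summarized by REMARK 475 and REMARK 480, zero-occupancy atoms can be located directly in the coordinate records. This is a rough sketch under stated assumptions rather than the annotation pipeline itself: it relies on the fixed-column ATOM/HETATM layout (occupancy in columns 55-60), and both atom_line and zero_occupancy_atoms are hypothetical helper names.

```python
def atom_line(serial, name, res, chain, resseq, x, y, z, occ, b):
    """Hypothetical helper: build an ATOM record in PDB fixed columns."""
    return (f"ATOM  {serial:>5}  {name:<3} {res:>3} {chain}{resseq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{occ:6.2f}{b:6.2f}")


def zero_occupancy_atoms(pdb_lines):
    """List (chain, residue name, residue number, atom name) with occupancy 0."""
    hits = []
    for line in pdb_lines:
        if line.startswith(("ATOM  ", "HETATM")):
            occupancy = float(line[54:60])   # columns 55-60
            if occupancy == 0.0:
                hits.append((
                    line[21],                # chain identifier, column 22
                    line[17:20].strip(),     # residue name, columns 18-20
                    int(line[22:26]),        # residue number, columns 23-26
                    line[12:16].strip(),     # atom name, columns 13-16
                ))
    return hits


# Made-up coordinates, for illustration only.
records = [
    atom_line(1, "N", "GLY", "A", 1, 11.104, 13.207, 2.100, 1.00, 20.00),
    atom_line(2, "CA", "GLY", "A", 1, 12.500, 13.000, 2.900, 0.00, 20.00),
]
print(zero_occupancy_atoms(records))
```

Grouping the hits by residue would separate residues modeled entirely with zero occupancy from residues where only some atoms are affected.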
31
Internal Documentation
32
Results
  • Complete new Format document produced and
    released to public September 15, 2008
  • Files will be processed according to this
    specification starting November 15, 2008
  • All files in archive will be brought up to this
    standard Q1 2009

33
X-ray Validation Task Force Workshop, April 14-16,
2008, EBI, Hinxton, UK
www.wwpdb.org/workshop/2008/index.html
Randy Read (Chair), Paul Adams, Axel Brunger,
Paul Emsley, Robbie Joosten, Gerard Kleywegt,
Eugene Krissinel, Thomas Luetteke, Zbyszek
Otwinowski, Tassos Perrakis, Jane Richardson,
Will Sheffler, Janet Smith, Ian Tickle, Gert
Vriend
34
wwPDB Validation Task Force
This meeting of the X-ray Validation Task Force
was held to collect recommendations and develop
consensus on additional validation that should be
performed on PDB entries, and to identify
software applications to perform validation tasks.
Preliminary Outcomes
  • Workshop report to be published in Fall 2008
  • Candidate global and local validation measures
    were identified
  • These measures were reviewed in terms of the
    requirements of depositors, reviewers, and users

35
Remediation and Curation of Complex Chemistry in
the PDB
36
SCOPE
  • Inhibitor molecules: annotate the chem comp
    dictionary and migrate details to PDB entries
  • Ribosomal (post-translational modifications) and
    non-ribosomal cyclic, modified and conjugated
    peptides, consistently given a SEQRES and SOURCE:
    annotate an entity look-up table and transfer to
    PDB entries

37
2VUM
AMANITIN
38
Recently shown to be a gene product
Mapping to UniProt, e.g. AMATX_AMAPH (P85421)
2VUM is cyclically permuted and needs to be
corrected from
SEQRES 1 M 8 ASN HYP ILX TRX GLY ILE GLY CSX
to
SEQRES 1 M 8 ILX TRX GLY ILE GLY CSX ASN HYP
to align with the gene sequence for beta-amanitin
from Amanita phalloides and alpha-amanitin from
Amanita bisporigera.
The encoded sequence would be
Ile-Trp-Gly-Ile-Gly-Cys-Asn-Pro
Needs MODRES to match gene product
AMANITIN
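
The correction for 2VUM amounts to finding the cyclic rotation of the deposited SEQRES that lines up with the encoded sequence. A minimal sketch of that idea follows. The PARENT mapping of modified residues to standard residues is an assumption read off the encoded sequence on this slide, and align_cyclic is a hypothetical name, not an existing wwPDB tool.

```python
# Assumed parent residues for the modified components (inferred from the
# encoded sequence Ile-Trp-Gly-Ile-Gly-Cys-Asn-Pro shown on the slide).
PARENT = {"HYP": "PRO", "ILX": "ILE", "TRX": "TRP", "CSX": "CYS"}


def align_cyclic(seqres, encoded):
    """Rotate a cyclic SEQRES so its parent residues match the encoded sequence.

    Returns (shift, rotated_seqres), or None if no rotation matches.
    """
    parents = [PARENT.get(residue, residue) for residue in seqres]
    for shift in range(len(seqres)):
        if parents[shift:] + parents[:shift] == encoded:
            return shift, seqres[shift:] + seqres[:shift]
    return None


deposited = "ASN HYP ILX TRX GLY ILE GLY CSX".split()
gene = "ILE TRP GLY ILE GLY CYS ASN PRO".split()

shift, corrected = align_cyclic(deposited, gene)
print(shift, " ".join(corrected))
# prints: 2 ILX TRX GLY ILE GLY CSX ASN HYP
```

The same PARENT table indicates which residues would need MODRES records once the rotation is fixed.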
39
Cyclic, Modified and Conjugated Peptides May be
Ribosomal or Non-Ribosomal
Non-gene peptides, e.g. actinomycin D, i.e.
require a gene cluster
Nonribosomal peptides:
http://bioinfo.lifl.fr/norine/
or Novel Antibiotics DataBase:
http://www.nih.go.jp/jun/NADB/search.html
40
Value to users
  • To understand unique and shared aspects of a
    particular occurrence
  • To find a specific system: some components of a
    PDB file, such as inhibitors and antibiotic
    peptides, might not be found or even be apparent
  • To study related ligands across different
    proteins

41
Challenges
  • Inclusion of non-standard amino acids,
    nucleotides, or other chemical groups in sequence
  • Non-linear (cyclic or branched) sequences
  • Microheterogeneity (some cases)
  • Non-uniform annotation of the same molecule in
    different PDB entries
  • Lack of annotation regarding the source and
    function of these molecules

42
Solutions
  • Analysis and classification
  • Identify antibiotics and inhibitors and group
    them into polymeric molecules or single
    components
  • Dictionary updates
  • Build single chemical components for appropriate
    cases
  • Update dictionary with source, function and other
    details
  • Remediation and future processing
  • Edit/revise files to include compound name,
    sequence, source and function for all antibiotics
    and inhibitors
  • Establish rules and procedures to make new
    annotations consistent

43
Single component vs. Polymeric
  • Single component antibiotics or inhibitors
  • Build component and retain subcomponent
    information annotate dictionary with details
    about molecule
  • Migrate details from dictionary to entry files in
    specific remarks
  • e.g. D-Phenylalanyl-L-prolyl-L-arginine
    chloromethyl ketone (PPACK)
  • Polymeric (peptide-like) antibiotics or
    inhibitors
  • Present sequence, compound name, and source
    information as any regular polymer
  • Include details about functions in specific
    remarks
  • e.g. post-translationally modified ribosomal
    peptides, non-ribosomal cyclic, modified or
    conjugated peptides

44
How many?
1300 identified PDB entries
Antibacterial, Antiviral, Antimicrobial,
Antifungal, Antibiotic
Overlap with Anticancer, Anti-inflammatory,
Immunosuppressant, Herbicide
  • Antibiotics
  • Single component: 1000
  • Polymeric: 300
  • Inhibitors
  • Natural and synthetic inhibitors of enzymes and
    other cellular processes
  • Single component: 350
  • Polymeric: 350
  • Others
  • Toxins: 120

45
THIOSTREPTON
46
4 PDB entries with 4 different representations
1e9w SEQRES THR ILE ALA DHA ALA DHA PYT
2jq7 SEQRES ILE ALA DHA ALA
1oln LINKed HETs, ROP incorrectly used
3cf5 is single molecule TXX
SEQRES should be
TZO THR TZB TSI TZO XAA QUA ILE ALA DHA ALA XBB
TZO DHA PYT
Now matched in all 4 entries, TXX obsolete
THIOSTREPTON
47
THIOSTREPTON
_entity.pdbx_description
Thiostrepton: complex bacterial natural product
containing thiazole rings that's used as a topical
veterinary antibiotic and also has promising
antimalarial and anticancer activity. First
isolated from bacteria in 1955, thiostrepton has
an unusual type of antibiotic activity: it disables
protein biosynthesis by binding to ribosomal RNA
and one of its associated proteins, and interacts
directly with 23S rRNA nucleotides 1067A and 1095A.

_entity.type
Polypeptide, sulfur-containing antibiotic

_entity.details
Thiostrepton is a macrocyclic antibiotic
incorporating thiazoles and other atypical amino
acids. Patented in 1961, thiostrepton has been
used as an antibiotic and acts by binding to
ribosomes to prevent the binding of the EF-G
elongation factor and GTP to the 50S ribosomal
subunit. Thiostrepton is an inducer of tipA, a
gene that controls the bacterial transcription
regulators TipAL and TipAS, members of the MerR
proteins that are central regulators in multidrug
resistance. Closely related to siomycin, a
recently discovered inhibitor of the oncogenic
transcription factor FoxM1. The
thiostrepton-resistance gene is also commonly used
as a selective marker for recombinant DNA/plasmid
technologies.
48
1 CAS 1393-48-2 ?
1 PUBCHEM 16130278 ?
1 Merck Index 119295 149364 ?
1 RTECS XN6300100 ?
1 MDL number MFCD00135828 http://www.mdli.com/
1 EG/EC Number 215-734-9 ?
1 ChemSpider 10469505 http://www.chemspider.com/
1 URL http://www.fermentek.co.il/Thiostrepton.htm ?
1 URL http://www.tebu-bio.com/file/product/170BIA-T1158-1/ ?
1 URL http://www.bioaustralis.com/pdfs/thiostrepton.pdf ?
1 Sigma Aldrich T8902 http://www.sigmaaldrich.com/
1 Chemical Class macrolide ?
1 MESH Peptides, Cyclic D04.345.566 ?
1 Pharm. Action Anti-Bacterial Agent ?
1 Image http://pubs.acs.org/cen/images/8239/8239notw4image.gif ?
1 Image http://en.wikipedia.org/wiki/Image:Thiostrepton.png ?
THIOSTREPTON
49
Alert - New Protein Modifications
Thu, September 25, 2008 1:17 pm
John S. Garavelli, UniProt/RESID database
micrococcin P1: SCTTCVCTCSCCT
Bacillus cereus strain ATCC 14579
UniProt: Q812G9_BACCR, incorrectly annotated as a
Putative lantibiotic peptide
Now believe that all the pyridinyl polythiazole
antibiotics, including micrococcin P1,
thiostrepton, thiocillin, GE2270 A and sulfamycin
B, are genetically encoded directly.
50
THIOSTREPTON
TZO THR TZB TSI TZO XAA QUA ILE ALA DHA ALA XBB
TZO DHA PYT
SEQRES
QUA ILE ALA SER ALA SER CYS THR THR CYS ILE CYS
THR CYS SER CYS SER SER NH2
51
Inhibitors
52
CHYMOSTATIN
1ke2 SEQRES CSI LEU PHA
1bcs SEQRES CSI LEU PHA
1m21 single HET group CHY
1wvm single HET group CHY
1sgc single HET group CST
5 PDB entries with 3 representations; all cases
bound to Serine-OG
CHY C31 H41 N7 O6 (OG missing aldehyde)
CST C31 H41 N7 O7 (OG present carboxylic acid)
Convert all to pseudo SEQRES with BIOLOGICAL
SOURCE
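
The two HET-group formulas above differ by a single oxygen, which a quick comparison makes explicit. This is a small sketch only; parse_formula and formula_difference are hypothetical helpers and handle just the simple "element count" notation used on this slide.

```python
import re
from collections import Counter


def parse_formula(text):
    """Parse a formula such as 'C31 H41 N7 O6' into element counts."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)\s*(\d*)", text):
        counts[element] += int(number) if number else 1
    return counts


def formula_difference(first, second):
    """Return {element: count in second minus count in first} for changed elements."""
    a, b = parse_formula(first), parse_formula(second)
    return {el: b[el] - a[el] for el in sorted(set(a) | set(b)) if a[el] != b[el]}


CHY = "C31 H41 N7 O6"
CST = "C31 H41 N7 O7"
print(formula_difference(CHY, CST))
# prints: {'O': 1}
```

A check of this kind could flag HET groups that describe the same molecule in different chemical states before they are merged into one representation.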
53
Border-line?
PDB ID 1qr3: Inhibitor of human leukocyte elastase
from Streptomyces resistomycificus
Should this be a single component or a polymeric?
Sequence: AIB ORN THR AA3 AA4 PHE AA6 VAL
FR901277
54
Miri Hirshberg, Hyunmi Sun, Shuchismita Dutta,
John Westbrook, Jasmine Young, Kim Henrick,
John S. Garavelli (UniProt)
55
New Projects
Worldwide Protein Data Bank www.wwpdb.org
  • Helen Berman

56
Small Angle Scattering
  • Two-member annotator team reviewing possible SAXS
    and SANS templates
  • Attendance at SAS Commission to discuss
    deposition and publication requirements
  • Template recommendation expected in 2009

57
Common Deposition and Annotation Tool
  • Selected as the most important project going
    forward by participants of the 2007 wwPDB Retreat
  • Project timeline: Concept in 2008, design and
    development 2009 - 2011, with delivery by 2012
  • Progress
  • wwPDB Directors adopted role of Steering
    Committee and initiated the project Concept Phase
  • Concept Team, representing the 4 partner sites,
    meet to create Scope Document (December 2008)
  • Steering Committee approved the Scope Document in
    May 2008
  • Core Team Kick Off meeting: July 2008

58
Scope
  • wwPDB-wide project
  • Will allow full sharing of data load worldwide
    and eliminate individual points of failure
  • Will implement recommendations of NMR and X-ray
    Validation Task Forces
  • Will allow for data acquisition of coordinate,
    experimental and metadata for all methods
  • Will ensure quality, consistency and efficiency
    of data processing and annotation process

59
Assumptions
  • The deposition tools must be able to handle all
    current, agreed upon, data entry formats from the
    user community
  • The underlying system design will not be driven
    by existing formats
  • The product must provide an extensible framework
    enabling support for new experimental methods
    over its ten-year life span
  • The project technical level will be set at a
    reasonable standard. Technology should not be
    bleeding edge nor declining.

60
Core Team Kick Off
61
Core Team Meeting Outcome
  • Establish Project Management Strategy for this
    project
  • Draft a conceptual design for the solution and
    identify critical components that need to be
    investigated
  • Identify the top three challenges and initiate
    study groups
  • Future system data model (John Westbrook and Tom
    Oldfield)
  • Technologies and strategies for data and
    state management (John Westbrook)
  • Technologies and strategies for automation of
    the validation and annotation pipeline (Sameer
    Velankar)

62
Path Forward
Adapt Agile Development to our environment as
appropriate.
Final design and Full Requirements realized
through incremental deliveries, using lessons
learned along the way.
63
Archiving of Raw Diffraction Data
  • Discussion at Commission on Biological
    Macromolecules
  • Outcome
  • Appoint working group to study requirements for
    archiving raw experimental data (Chair: Judith L.
    Flippen-Anderson)

64
Funding Update
  • RCSB has received approval from NSB for
    funding through 2013
  • BMRB is currently funded through Aug 2009 and
    has submitted a competitive renewal application
    to the National Library of Medicine (U.S.
    National Institutes of Health); even if
    successful, the current budget will be reduced
    by 30
  • PDBj will be reviewed this November, at the
    midpoint of the current project, which runs
    until Mar 2011
  • EMBL-EBI (PDBe) has 6 months of bridging funds
    from the Wellcome Trust to cover the transition
    of team leader; 6 staff funded until 1-Dec-09

65
Worldwide Protein Data Bank www.wwpdb.org
Matters Arising
  • Committee membership
  • HPUB proposed revision
  • Industrial structures
  • Validation guidelines