Title: NYSGRC → NYSGXRC. PI: Stephen K. Burley (CSO, SGX); PI-of-Record: JB Bonanno (Rockefeller); Mark R. Chance (AECOM); S. Swaminathan (BNL); Larry Shapiro (Columbia Medical School); Andrej Sali (Rockefeller University); Chris Lima (Weill Medical College). Web Site: http://www.nysgrc.org/
1. NYSGRC → NYSGXRC
PI: Stephen K. Burley (CSO, SGX)
PI-of-Record: JB Bonanno (Rockefeller)
Mark R. Chance (AECOM)
S. Swaminathan (BNL)
Larry Shapiro (Columbia Medical School)
Andrej Sali (Rockefeller University)
Chris Lima (Weill Medical College)
Web Site: http://www.nysgrc.org/
2. NYC/SGX Division of Labor
[Diagram; surviving label: PDB]
3. Work Flow: NYC → SGX → NYC → Public Domain
- Target Selection: Andrej Sali plus NYC Members
- Uploading of Selected Targets from NYC ICE-DB → SGX LIMS
- Topoisomerase Cloning → E. coli
- Small Scale Expression (N- and C-His, C-Smit3/His Tags)
- Solubility Screening/Testing
- Biophysical Characterization/Domain Mapping with MALDI-MS
- Recloning of Various Constructs p.r.n.
- Large Scale Expression/Purification (1 liter E. coli fermentation)
- Quality Control/Quality Assurance (MS, DLS, CD, Fl, UV/Vis Abs)
- Crystallization Screening (2 temps, incomplete factorial screens)
- Diffraction Data Collection (APS, BNL)
- Transfer of Reagents and SGX LIMS Entries → ICE-DB in NYC
- Structure Determination/Refinement/QA/QC/Annotation → PDB
- All reagents and data will be made public via ICE-DB/NIH
4. Exploiting Genomic Diversity in Target Selection
[Figure; surviving labels: Structural domains, Targets, Species. Targets are based on a 30% ID cutoff for good quality homology modeling]
5. NYSGRC Mission Statement
- Establish and exploit a technology platform for high-throughput protein structure determination with X-ray crystallography.
- Targets are chosen from archaea, eubacteria, and eukaryotes with the goal of providing one experimental protein structure for each protein sequence family defined at the 30% identity level → PDB.
- Preference is given to targets of high biological or medical relevance, provided that they also satisfy the 30% sequence identity criterion.
- Perform high-throughput comparative protein structure (a.k.a. homology) modeling on all sequences within modeling distance (>30% identity) of each experimental structure → MODBASE.
- Provide accurate functional annotations and supply molecular biology/protein reagents to relevant experts.
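The 30% sequence identity criterion can be illustrated with a small calculation. The sketch below is only an illustration under stated assumptions, not the consortium's target selection code: it expects two sequences that have already been aligned (gaps written as "-"), counts identity over columns where neither sequence has a gap, and uses hypothetical function names.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: the inputs are pre-aligned strings of equal length with '-'
    marking gaps. Other definitions of the denominator are also in use.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def within_modeling_distance(aligned_a, aligned_b, cutoff=30.0):
    """True if the pair exceeds the identity cutoff (30% on the slide)."""
    return percent_identity(aligned_a, aligned_b) > cutoff
```

Under this definition a sequence sharing more than 30% identity with a solved structure would count as covered by modeling, while one below the cutoff would remain a candidate target.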
6. Topoisomerase Cloning of Expression Vectors
- 5 min ligation reactions at 20 °C
- No requirement for restriction sites
- Initial experience with His tags (20/reaction)
- N- and C-terminal +/- cleavage (polioviral protease)
- TA Topo cloning used at SGX
  - Advantage: 90% efficiency
  - Disadvantage: 50:50 mixture
- Directional Topo cloning used by NYSGRC
  - Advantage: 90% inserts valid (x PCR errors)
  - Disadvantage: estimated 75% efficiency
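One rough way to compare the two cloning routes is to multiply the quoted figures. The sketch below rests on two assumptions that are interpretations rather than statements from the slide: that the "50:50 mixture" means about half of the TA clones are usable, and that the quoted percentages are independent and can simply be multiplied.

```python
def usable_fraction(cloning_efficiency, usable_insert_fraction):
    """Fraction of reactions expected to give a usable clone.

    Assumption: the two fractions are independent, so they multiply.
    """
    return cloning_efficiency * usable_insert_fraction


# TA Topo cloning: 90% efficiency, 50:50 mixture (assumed: half usable)
ta = usable_fraction(0.90, 0.50)

# Directional Topo cloning: estimated 75% efficiency, 90% of inserts valid
directional = usable_fraction(0.75, 0.90)

print(f"TA Topo: {ta:.1%}")
print(f"Directional Topo: {directional:.1%}")
```

Under these assumptions the directional route comes out ahead despite its lower raw efficiency, which is one possible reading of the slide's comparison.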
7. Chris Lima Vector: Double Tag Strategy
- New fusion protein system with unique protease
- Protein of Interest (POI)--Smit3--His double tag
- Combined with directional Topo cloning
- Developed, tested and patented by Chris Lima, WMC Co-PI for NYSGRC
- Advantages
  - Two stage purification: Ni and Ion Exchange
  - Clean removal of tags with protease that recognizes Smit3 and cuts at fusion with POI
- Commercially available from Invitrogen soon
- Technology transferred to SGX under license from WMC
8. 96-Well Cloning and Protein Expression
- N- and C-terminal His tags and Smit3/His Topoisomerase cloning
- Qiagen 3000, Qiagen 8000 and Beckman Biomek FX for PCR set-up, PCR fragment and plasmid purification with magnetic beads, cherry picking
- Modified Genetix Q-Pik for colony picking
- Genemachines Hi-Gro for E. coli expression at small scale (1-2 ml)
- Solubility screening in 96-well format
- Clone confirmation by MALDI-MS following Ni ion purification with Zip-tips (1 mg yield)
- MALDI-MS/proteolysis (trypsin, AspN, GluC, chymotrypsin, subtilisin) for domain mapping (integration with bioinformatics)
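Domain mapping by MALDI-MS/proteolysis relies on matching observed fragments to peptides predicted from the sequence. The following is a minimal sketch of the prediction step for trypsin only, assuming the commonly used rule (cleave C-terminal to K or R, except when the next residue is P) and complete digestion. Limited proteolysis in practice gives larger, partially cleaved fragments, and the function name is hypothetical.

```python
def trypsin_digest(sequence):
    """Return peptides from a complete in-silico tryptic digest.

    Rule assumed here: cut after K or R unless the following residue is P.
    """
    seq = sequence.upper()
    peptides = []
    start = 0
    for i, residue in enumerate(seq):
        next_residue = seq[i + 1] if i + 1 < len(seq) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])  # C-terminal peptide with no cut site
    return peptides


print(trypsin_digest("AKRPGKLM"))
```

The other proteases listed on the slide would need their own cleavage rules; only the trypsin rule is sketched here.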
9. BL21(pLysIce) Improves Cell Lysis Yields
[Gel image; lanes: Marker, then whole/sup pairs for LysIce f/t, sonication, LysS f/t and RIL f/t, Marker; POI band indicated]
Expression of Lambda lytic gene subset allows for freeze-thaw lysis
Advantages: Faster, Cheaper, Gentler, 96-well Compatible
10. Expression of 262 Bacterial Genes Proves Utility of the Protein Families-Based Approach
[Chart; axis labels: Species (Af to Tm), Gene number]
- Soluble proteins for 80% of gene sets
- 40 Species
- 1852 Gene Sets
11. Protein Engineering by N/C Truncations of Kinase
Oligo design
[Sequence figure; the annotation track (runs of _, H and -) drawn over the sequence is not recoverable from the transcript]
DGKLYVSSES RFNTLAELVH HHSTVADGLI TTLHYPAPKR NKPTIYGVSP NYDKWEMERT
DITMKHKLGG GQYGEVYEGV WKKYSLTVAV KTLKEDTMEV EEFLKEAAVM KEIKHPNLVQ
LLGVCTREPP FYIITEFMTY GNLLDYLREC NRQEVSAVVL LYMATQISSA MEYLEKKNFI
HRDLAARNCL VGENHLVKVA DFGLSRLMTG DTYTAHAGAK FPIKWTAPES LAYNKFSIKS
DVWAFGVLLW EIATYGMSPY PGIDLSQVYE LLEKDYRMER PEGCPEKVYE LMRACWQWNP
SDRPSFAEIH QAFETMFQES SISDEVEKEL GKRGTRGGAG SMLQAPELPT KTRTCRRAAE
Solubility screen
Functional screen
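The N/C truncation approach amounts to enumerating constructs that begin and end at different residues and then screening each one. A minimal sketch of the enumeration step follows, assuming 1-based inclusive residue numbering. The start and end positions and the naming scheme are hypothetical and are not taken from the slide.

```python
from itertools import product


def truncation_constructs(sequence, n_starts, c_ends):
    """Return {name: subsequence} for every valid start/end combination.

    Positions are 1-based and inclusive (an assumption of this sketch).
    """
    constructs = {}
    for start, end in product(n_starts, c_ends):
        if not (1 <= start <= end <= len(sequence)):
            continue  # skip combinations that fall outside the sequence
        constructs[f"{start}-{end}"] = sequence[start - 1:end]
    return constructs


# First 20 residues of the sequence shown on the slide
example = "DGKLYVSSESRFNTLAELVH"
for name, sub in truncation_constructs(example, [1, 4], [15, 20]).items():
    print(name, sub)
```

Each generated construct would then need its own primer pair (the oligo design step) before going into the solubility and functional screens.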
12. Protein QA/QC: Mass Spectrometry
- Analysis of all expressed proteins to verify termini and MW
- More extensive characterization of eukaryotic proteins
- Limited proteolysis MS to define domain boundaries
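Verifying termini and MW by mass spectrometry means comparing an observed mass with the mass expected from the construct sequence. The sketch below computes an average mass from standard average residue masses plus one water. The tolerance value and function names are hypothetical, and the calculation ignores modifications, disulfides, and any tag residues not included in the input sequence.

```python
# Standard average residue masses in Da (amino acid minus water)
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528


def average_mass(sequence):
    """Average mass of an unmodified linear polypeptide, in Da."""
    return sum(AVERAGE_RESIDUE_MASS[r] for r in sequence.upper()) + WATER


def mass_matches(sequence, observed_mass, tolerance_da=5.0):
    """True if the observed mass is within a (hypothetical) tolerance."""
    return abs(average_mass(sequence) - observed_mass) <= tolerance_da
```

Comparing the observed mass against the expected masses of several candidate truncations is one way such a check could indicate which termini are actually present.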
13. Phosphorylated Forms can be Separated by Ion-Exchange Chromatography: Human Kinase
[Figure; surviving labels: IEF gel, MonoQ load, M/s fraction 6]
14. ReSurface: Protein Engineering → Crystallizability (both gene shuffling and GFP methods tried)
- Program that suggests mutations to alter protein solubility and crystallizability
- First version finished, plan to explore SNPs, PPISP, crystal contacts
- Accumulating data will be mined to improve method
CFTR NBD1 domain resurface suggestions
CONFIDENTIAL
17. E. coli vs. Baculovirus Suggests Low Threshold for Using Insect Cells
[Figure panels: Baculovirus Mouse Kinase; E. coli Mouse Kinase; E. coli C. elegans Kinase]
- For this Kinase Set
  - Less complex phosphorylation pattern from baculovirus
  - Single phosphorylation is same as in vivo
- Bac-to-Bac System provides reasonable throughput
18. Crystallization
- Customized liquid handling robotics
  - Robbins 96- and single-channel pipettor
  - TECAN
- Proprietary sitting drop plate
- DLS on all samples prior to crystallization
- Developing screens based on protein families and statistical analyses using 4th generation screen automated recipes
- High-capacity storage system maintains
  - 960,000 trials at each temperature
- Automated imaging/scoring of crystals
  - 96,000 trials analyzed/day/temperature
20. Collaboration with Robodesign, Intl.
- Completed December 2001
- RoboStore (10,000 plates)
- RoboVision (programmable plate imaging and scoring)
21. Integrated Storage/Imaging System
Capacity: 10,000 plates (approx. 10^6 trials) at 2 temps
Capability: 96,000 trials examined/day
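The quoted capacity and imaging rate can be checked with simple arithmetic. The sketch below assumes 96 trials per plate, which is inferred from the 10,000-plate and 960,000-trial figures rather than stated outright, and treats the imaging rate as applying per temperature.

```python
import math

PLATES_PER_TEMPERATURE = 10_000
TRIALS_PER_PLATE = 96           # assumption: inferred from 960,000 / 10,000
TRIALS_IMAGED_PER_DAY = 96_000  # per temperature, as quoted

# Total trials held in storage at one temperature
total_trials = PLATES_PER_TEMPERATURE * TRIALS_PER_PLATE

# Plates that can be imaged per day at the quoted rate
plates_per_day = TRIALS_IMAGED_PER_DAY // TRIALS_PER_PLATE

# Days needed to image every stored trial once
days_per_full_pass = math.ceil(total_trials / TRIALS_IMAGED_PER_DAY)

print(total_trials, plates_per_day, days_per_full_pass)
```

On these numbers a fully loaded store would be re-imaged roughly every ten days at each temperature.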
22. Fermentation and Purification
- Fermentation at 1 liter scale
- Overnight induction at 20 °C
- Affinity purification on 5 ml nickel columns
- BioCad and AKTA Explorer machines
- Throughput is 60-75 pellets per week
23. Diffraction Data Collection at SGX
- SGX-CAT at the APS (Sector 31)
- Collected first data sets in December 2001
- Automation will be completed during Q2 2002
- State-of-the-Art insertion device beamline
- Rapid data collection with small crystals
- High-redundancy SAD data collection
- Streamlined data reduction to Fobs
- T1 line to NYC (ICE-DB) will be installed during Q4 2002
25. SGX ID Beamlines (first 55 cases: 2.0 Å); NYSGRC BM/Wiggler (first 27 cases: 2.3 Å)
- Average Resolution 2.0 Å, Range 1.3-2.9 Å
26. Automated Structure Determination Platform
- ASDP: J-S Jiang, BNL
- One day structure determinations
  - <24 hours from data collection to refinement of most (up to 95%) of the polypeptide chain
  - (limited by CPU speed and resolution limit of diffraction data)
27. NYC and SGX Computational Platforms
Target Selection, Homology Modeling
- Fold Assignment
- Domain Prediction
- MODBASE/MOD
- Coding SNPs
Computational Chemistry
- DOCK5, AMBER
- Virtual Screening pipeline
Structure Annotation
- Active site prediction
- Classifying protein active sites
- Structure similarity
- Distribution of reagents to experts
SGX LIMS DataMining
ReSurface Technology
MALDI-MS/Domain Mapping Technology
28. NYSGXRC Summary
- Target Selection in Public Domain by NYC institutions
- Industrial processes to be performed in industrial setting (SGX)
- Transparent access to SGX-PSI progress reports via ICE-DB
- Free distribution of reagents from NYC Institutions/NIH
- Structure determination/annotation to occur in Public Domain
- Intellectual property from structures owned by NYC Institutions
- Full deposition and timely release of atomic coordinates, Fs, interim results to PDB, etc. by NYC Institutions (No SGX control!)
- Benefits to PSI: leveraging SGX investment, increased throughput, roadmap leading towards the production phase
- Benefits to SGX: continuous improvement of platform, pooling of SGX and PSI experiences for more rigorous statistical analyses, high PR value, high quality academic collaborations, potential benefit to the larger scientific community