Title: A heavy price was paid for molecular biology's obsession with metaphysical reductionism. It stripped
1The Shifting Paradigm for Biology
- "A heavy price was paid for molecular biology's
obsession with metaphysical reductionism. It
stripped the organism from its environment ...
shredded it into parts to the extent that a sense
of the whole--the whole cell, the whole
multicellular organism, the biosphere--was
effectively gone... Our task is to resynthesize
biology... The time has come for biology to
enter the nonlinear world." - Carl Woese, MMBR 2004
Biology of Complex Systems is a National Science
Priority: "Agencies should target investments
toward the development of a deeper understanding
of complex biological systems... Scientific and
technological breakthroughs are expected in
diverse areas such as environmental
management..." - OSTP/OMB interagency guidance memo
for FY 2005
2A Systems Approach is needed to Bring the Genome
to Life
Systems Biology: studying biological systems by
systematically perturbing them (biologically,
genetically, or chemically); monitoring the gene,
protein, and informational pathway responses;
integrating these data; and ultimately
formulating mathematical models that describe the
structure of the system and its responses to
individual perturbations (Ideker et al., 2001,
Annu. Rev. Genom. Hum. Genet. 2:343).
Mission Science Requires an inherent systems view.
3Integrated Approach to Shewanella Biology
MR-1 Genome Sequence, Bioinformatics
Information Synthesis Interpretation
Linked measurements
Imaging AFM, EM, CLSM Immuno-EM
Computational Biology
Concepts Hypotheses
Physiology Metabolism
Cellular networks, Modeling
Controlled Cultivation
Proteomics MS 2-D gel
Data Analysis Integration
- ?
- Perturbation
- mutation
- cultivation
Gene Expression Microarrays Reporters
4Knowledge of Microbial Processes and Communities
Can Lead to New Solutions for Global Change
- Recommendations
- To enhance microbiological solutions to global change challenges, strengthen and expand ongoing research efforts, and direct new resources for basic research programs that:
- (1) Integrate an understanding of microbiological processes at all organizational levels, from individual organisms to ecosystems
- (2) Discover, characterize, and harness the abilities of microbes that play important roles in transformations of trace gases and various toxic elements.
- Implement policies that promote effective long-term research on the microbiology of global change
5GTL Program and Facilities Responding to AAM
Recommendations
- To make progress, science should not accept the limitations placed on discovery by traditional methods, conventional approaches, or existing infrastructure.
- Although progress in microbial genomics is being made at a fantastic rate, availability of appropriate tools still places limits on research.
- Powerful, but expensive, modern equipment should be housed in community facilities, open to researchers who might not otherwise have access to these technologies.
- ______
- Microbiology in the 21st Century: Where Are We and Where Are We Going? American Academy of Microbiology, 2004
6GTL Program and Facilities Responding to AAM
Recommendations
- Currently, ability to predict the responses of a single microorganism from the sequence of its genome can best be described as feeble, and our ability to make predictions for an assemblage of multiple organisms is even weaker.
- The ecosystem predictive capability will come from detailed work that links an understanding of the genome to an understanding of gene expression, protein function, and complex metabolic networks.
- A deep genomic understanding of these integrated microbial processes will provide a better understanding of our planet, our interaction with it, and our ability to predict and influence its future behavior.
- Create centers that facilitate research community access to post-genomic analytical capabilities
- ________
- The Global Genome Question: Microbes as the Key to Understanding Evolution and Ecology, American Academy of Microbiology, 2004
- http://www.asm.org/ASM/files/CCPAGECONTENT/docfilename/0000026555/GENOMEweb.pdf
7Microbes and Biotechnology for DOE Missions: R&D is Critical
- Why Microbes?
- Microbes and microbial communities are the
- Foundation of the biosphere and the planet's ability to sustain all life.
- Masters at capturing, storing, and transforming energy.
- Most abundant and biochemically versatile life forms, possessing diverse and sophisticated capabilities
- Recent discoveries have revealed the dominant role of microbes in earth systems processes
- Prochlorococcus is the dominant photosynthetic organism on the planet.
- Sequencing of the Sargasso Sea has revealed immense diversity
- Microbes exist in the deep subsurface and survive on redox couples unrelated to photosynthesis
- Microbes are catalysts for acid mine drainage
Improvements in cellulase molecular machine
function are needed to more efficiently convert
cellulose to glucose for conversion to ethanol.
Applications of microbial capabilities will
stimulate new biotechnology industries to put
these new processes into the marketplace.
Published in J. Am. Chem. Soc. 2002, 124,
11242-3
Bioenzymes have been captured in a ceramic
nanomembrane with improved function and
stability.
8GTL Systems Biology Molecules to Cells to
Communities
Communities
Single cells
Populations
Subcellular location and dynamics of machines
Gene expression in individual cells
Who is expressing what, when, where under what
conditions?
- Programmatic goals of understanding
- Proteins and Molecular Machines
- Regulatory processes
- Complex Microbial Communities
- and, developing the necessary computational and
information architectures
9Science enabled by the production and
characterization of proteins and molecular tags
- Investigating functions of unknown and hypothetical genes from microbial and meta-genomes
- The flood of genomes, unculturable systems
- Understanding natural systems' genomic potential and behaviors
- Oceans, terrestrial, deep subsurface
- Understand microbial community organization and function
- Investigating natural variations in critical processes and designing and optimizing functionality
- Hydrogenases, cellulases, reductases
- Binding, products, biophysical parameters
- Enabling new single-molecule investigation techniques
- Tags for protein-protein and protein-DNA interactions, e.g. monitoring molecular structure, regulatory interactions, post-translational modifications
- Quantification standards for proteomics techniques
10The Unknown Gene Function Problem
- Hypotheticals account for >50% of potential protein-coding regions
- Escherichia coli
- 4,400 genes, ½ characterized to any extent
- current rate of 10 IDs/month, complete in 20 years!
- Hundreds of genome sequences available, nos. increasing
- Microbial population sequencing: Sargasso Sea sequencing (Venter et al.) alone added 1.2 million previously unknown genes
- JGI 2004: Prokaryotes 235 Mb; Microeuks 80 Mb; Communities 340 Mb
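The "complete in 20 years" figure for E. coli can be checked with simple arithmetic. The sketch below is only an illustration: it assumes that exactly half of the 4,400 genes remain uncharacterized and that the rate of 10 identifications per month stays constant. The function name is hypothetical.

```python
def months_to_characterize(total_genes, characterized_fraction, ids_per_month):
    """Months needed to characterize the remaining genes at a constant rate."""
    remaining = total_genes * (1.0 - characterized_fraction)
    return remaining / ids_per_month


# Figures taken from the slide: 4,400 genes, half characterized, 10 IDs/month.
months = months_to_characterize(4400, 0.5, 10)
years = months / 12

print(f"{months:.0f} months, about {years:.1f} years")
```

Under these assumptions the backlog takes roughly 18 years to clear, which the slide rounds to 20.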
11One-third of Hypothetical Genes are expressed in
Shewanella oneidensis MR-1
- Gene expression (microarrays): either a signal was detected or not.
- Protein expression: >500 MS/MS runs, >2 million files!
- 0 FP, identifications with >90% confidence level
- Hypothetical ORFs expressed as both RNAs and proteins
Total no. of expressed genes (%):
Original genome annotation   Total no. of genes   Transcriptome   Proteome     Both
All predicted ORFs           4468                 3564 (80%)      2252 (50%)   2082 (47%)
Hypothetical ORFs            1623                 1223 (75%)      592 (36%)    538 (33%)
7 Additional Strains of Shewanella being
Sequenced by JGI
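The expressed-gene percentages on this slide follow directly from the counts. A minimal sketch that recomputes them is given here. The dictionary layout and function name are illustrative assumptions, and the assignment of counts to the transcriptome, proteome, and "both" columns follows the slide's header order.

```python
# Counts as listed on the slide for Shewanella oneidensis MR-1.
counts = {
    "All predicted ORFs": {"total": 4468, "transcriptome": 3564, "proteome": 2252, "both": 2082},
    "Hypothetical ORFs": {"total": 1623, "transcriptome": 1223, "proteome": 592, "both": 538},
}


def expressed_percentages(row):
    """Return each expressed-gene count as a whole-number percentage of the total."""
    total = row["total"]
    return {key: round(100 * value / total) for key, value in row.items() if key != "total"}


percentages = {name: expressed_percentages(row) for name, row in counts.items()}

for name, pct in percentages.items():
    print(name, pct)
```

The 33% of hypothetical ORFs detected as both RNA and protein is the "one-third" in the slide title.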
12Identifying Protein Function
- Bioinformatics
- Biochemistry
- Biophysics
- Structure
- Genetics (e.g., mutagenesis)
- Partners/interactions
- Co-expression patterns
GTL Facilities will Provide Necessary
Capabilities & Scale
Facility I - generate hypothetical proteins for
biochemical screen etc. as part of annotation
for each genome sequenced, even for uncultured
organisms!
13Natural Systems of Interest to DOE Predict
behaviors and understand impacts of manipulations
Mar. 3-4 DOE/GTL Missions Science Workshop
14Biotech could be a significant source of H2 in a
stabilization regime.
Total US End-Use Energy in 2005
Direct biological production of H2.
Jae Edmonds, JGCRI
15BioH2 Production
- The development of methods for a large-scale (i.e., hundreds of exajoules of energy per year), cost-effective source of H2 using biotechnology could lower the costs of addressing climate change by trillions of dollars!
- If H2-using technology (e.g. fuel cells) can be made cost effective
- Biotechnology could become a major source of energy
- Lower land-use emissions
- Lower agricultural prices.
Jae Edmonds, JGCRI
16Hydrogenase Research Biophotolytic Hydrogen
Production
17Bacteriorhodopsins via Ocean Metagenomics
- Rhodopsins are light-absorbing pigments consisting of membrane proteins + retinal
- Discovery of new rhodopsin via genome analysis of marine bacterioplankton (Béjà et al. 2000)
- Protein expressed in E. coli bound retinal and formed an active light-driven proton pump
- 782 new rhodopsin-like photoreceptors, representing 13 distinct subfamilies, identified in Sargasso Sea sequence (Venter et al. 2004)
- Do all variants absorb light? Structure-function relationships?
18Cellulosic Ethanol Processing Plant in a
Microbe?
Today: we utilize food starch to make alcohol, and
complex and costly processing of cellulose.
Tomorrow: we want to utilize high-yield
cellulose crops with integrated processes
in microbes to convert to alcohols and other fuels.
Cellulose Today
Decrystallization Hydrolysis of Cellulose,
Hemicellulose, and Lignin Multiple Sugar
Metabolism Alcohol Synthesis
Tomorrow?
19Roadmap for Microbial Cellulosic Ethanol
Production Today Interim Goals
Endpoint
20Molecular Tags for Single Molecule Methods
- A much more direct, sensitive and quantitative
way to assess protein-DNA and protein-protein
interactions (compared to yeast 2-hybrid,
protein chips and mass spectrometry)
- GTL facilities provide an opportunity to bring this technology to high-throughput performance (relatively large engineering effort)
- Proteins, tags
- Instrumentation, data capture & analysis
21Single-Molecule Methods for the Characterization
of Protein-Protein Interactions and Expression
Levels in Shewanella oneidensis MR-1
Natalie R. Gassman1 (presenter, ngassman_at_chem.ucla.edu),
Achillefs N. Kapanidis1, Nam Ki Lee1 (presently Seoul
National University), Ted A. Laurence1 (presently
Lawrence Livermore National Laboratory), Xiangxu Kong1,
and Shimon Weiss1,2,3
1Dept. of Chemistry and Biochemistry, UCLA; 2Dept. of
Physiology, David Geffen School of Medicine, UCLA;
3California NanoSystems Institute
22SORTING MOLECULES USING E, S
FRET-efficiency ratio (E)
(CONFORMATION INFO)
Fluorescence-aided molecule sorter
FAMS
Kapanidis et al., PNAS 101, 8936 (2004)
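Sorting on E and S can be sketched in a few lines. The slide text names only the FRET-efficiency ratio E, so the sketch assumes S is the stoichiometry ratio commonly used in alternating-laser excitation experiments, with both ratios computed from photon counts in three channels. The function names, the gamma default, and the S thresholds are hypothetical choices for illustration, not values from the cited paper.

```python
def fret_efficiency_ratio(f_dex_dem, f_dex_aem, gamma=1.0):
    """E = F(Dex,Aem) / (F(Dex,Aem) + gamma * F(Dex,Dem)).

    f_dex_dem: donor-emission photons under donor excitation
    f_dex_aem: acceptor-emission photons under donor excitation
    gamma:     detection-correction factor (assumed 1.0 here)
    """
    return f_dex_aem / (f_dex_aem + gamma * f_dex_dem)


def stoichiometry_ratio(f_dex_dem, f_dex_aem, f_aex_aem, gamma=1.0):
    """S = F(Dex) / (F(Dex) + F(Aex,Aem)), with F(Dex) = F(Dex,Aem) + gamma * F(Dex,Dem)."""
    f_dex = f_dex_aem + gamma * f_dex_dem
    return f_dex / (f_dex + f_aex_aem)


def sort_molecule(s, s_low=0.3, s_high=0.8):
    """Classify a burst by S; the thresholds are illustrative assumptions."""
    if s > s_high:
        return "donor-only"
    if s < s_low:
        return "acceptor-only"
    return "donor-acceptor"


# Hypothetical photon counts for a single burst.
e = fret_efficiency_ratio(f_dex_dem=60, f_dex_aem=40)
s = stoichiometry_ratio(f_dex_dem=60, f_dex_aem=40, f_aex_aem=100)
label = sort_molecule(s)
print(e, s, label)
```

In this picture E carries the conformation (distance) information for molecules that S has already classified as carrying both a donor and an acceptor.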
23Protein-DNA Interaction: DNA - Catabolite
Activator Protein (CAP) interaction
24Protein-Protein Interactions of Transcription
Regulation
Two-component signal transduction, which
regulates transcription by the alternative sigma
factor, σ54, provides a diverse array of
protein-protein and protein-DNA interactions to
assay using single molecule methods. Our focus
will be on the two-component system involving
nitrogen regulatory protein (NtrC).
25Biological processes that can be dissected by
simultaneous monitoring of structure and
interactions
26Purification of NtrC
Protein Production is a Bottleneck for HT Single
Molecule Methods!
Overexpression
Affinity Chromatography
27GTL Program - Microbial Community Analyses
Cross-Cutting Processes
Microbial Communities
Marine
Soil
Subsurface
Photolytic H2 generation
Cross cutting processes/genes of Interest
Contaminant transformations
Biomass conversion
Example: Analyses of mission-relevant natural
communities and optimization of processes require
a minimum production of 20,000 proteins per year
28DOE Missions Tie to Capabilities for Microbial
Systems
Ocean
Natural Systems Behavior
Terrestrial
Subsurface
Toxic Metal Reduction
Convert Cellulose to Fuels
Energy
Convert Sunlight to Hydrogen
29We can Achieve Dramatic Improvements in Quality,
Throughput, and Cost
- Genome projects demonstrated the power of high-throughput technologies, computational analyses, and data made available to all scientists.
- This approach can provide the needed gains for GTL biology.
- GTL pilot studies give us confidence in the technologies.
- Initial GTL systems biology research has identified key issues and requirements.
- The Approach
- Multidisciplinary teams,
- Scaleup and automation, focus on production
- Drive to dramatic production gains and cost reductions
- Strict standards and methods for quality improvements,
- Characterization with advanced technologies such as synchrotron methodologies,
- Informatics and computing,
- Rapid and open availability of results to scientists -- freeing scientists from production activities
The learning curve for sequencing (above) is
being repeated for soluble and membrane proteins
(below)
30Genomics:GTL Facilities - Enabling Cellular
Systems Science
Cellular Activities Understanding how cells
respond to environmental cues
Cellular Components Providing the basis for
determining protein function and cellular
processes
Developing a predictive understanding of the
functions of cells and communities of cells
Understanding how molecular machines are formed
and how they function
A New Infrastructure for Biological Research
31Facility for Production and Characterization of
Proteins and Molecular Tags
- What It Will Produce
- Genomic information directly translated into proteins
- 10,000 to 25,000 proteins/yr
- 100,000 molecular tags/yr (initial targets)
- Biophysical and biochemical characterizations
- Research to support difficult protein and tag production
- Learning curve informed by successes and failures
- Research and development to import new technologies and methods
- Data, standards, and protocols
- What It Will Be
- 150,000-sq. ft. facility
- Advanced automated robots or production lines
- Advanced microelectromechanical systems: lab on a chip, microfluidics
- Synchrotron and other characterizations
- Informatics and computing infrastructure
- Cryogenic storage, handling, and shipping
- Comprehensive R&D program (both at the facility and distributed)
Tracking proteins with tags in live cells.
Characterizing proteins with synchrotron X-rays.
32Functions of the Facility for the Production and
Characterization of Proteins and Molecular Tags
(1) Inputs Gene Sequences
- Remote
- Characterization
- Facilities
- Specialty
- (1) Genomics
- Gene Synthesis
- Cloning
- Modification
- DNA Sequencing
- (2-3) Protein Production
- Cellular
- Cell Free
- Chemical Synthesis
- Labeling and Mutations
- Administration
- Management
- Staff Offices
- Conference
- User Offices
- (5) Affinity Reagent Production
- Libraries
- Synthesis
- (4) Characterization
- Quality Assurance
- Quality Control
- Biophysical/Biofunctional
- (8) Technology Research and Development
- High Production
- Automation
- Computing Programs
- (6) Computing/Information
- LIMS and workflow
- Data and Tools
- Simulation and Analysis
- Production Strategies
- (7) Cryogenic Archives
- Shipping
- Receiving
- Storage
- Outputs
- Proteins 2,3,7
- Affinity Reagents 5,7
- Characterizations 4
- Protocols and clones 1,3,4,5,6,7
- Data and Tools 1,2,3,4,5,6,7,8
University/Industry Technology R&D Partners
33(1) Genomics and (2) Lab-on-a-Chip Preproduction
Screening Lines
Major Equipment Layout
(1)
(2)
Incubator
Rapid Translation Systems
LC
Dispensing And Harvesting Robots
Lab-on-a-chip Nanoliter Pre-production Screening
Technologies
Light Scattering
Public and GTL Data Bases
Computing And Information Tools and Systems
Lab on a Chip Micro- Electrophoresis Separations
Lab on a Chip Expression
Dispensing Robots
DNA sequencers
DNA synthesizers
PCR machines
Colony Picker
UV Absorbance
(2)
(1)
Successful Protocols To Mass Production
Cell-free Transcription And Translation
Expression Verification
Purification By multiple Conditions
Purity and Solubility
Annotated DNA sequence
Computed Methods protocols
Verify Sequence
Amplification With PCR
Ligation/ Purification
DNA Oligonucleotides synthesized
Clones
Cellular Vectors
In Vivo Expression
Colony Picker
Transfection/ Expression
Data and All Protocols To Data Base
(6-7)
(6-7)
Materials Archived for Production
Schedule Prescreening
Validation
Compute Amino Acid Sequence
Production Chemical Synthesis
Chemical Synthesis Micro Test
Characterization
Compute Feasibility and Variants
Process Diagram
- Outputs
- >200 successful protocols/day
- Preliminary Characterizations
- Genome Annotation
Gene Sequence Inputs to Protein Production (500
starts/day)
Comparative Computational Determination of
Methods and Strategies (Process terabytes of
Data/day)
Preproduction Screens Utilizing Lab-on-a-Chip
Technologies (>100,000 trials/replicates/day)
Prepare Genes for Multiple Production
Modalities (2000/day)
Production Targets
34(3) Protein Production and (4) Characterization
Lines
Major Equipment Layout
(3)
(4)
Production Robots
Automated Peptide Synthesis And Ligation Systems
Chemical Synthesis
Flow Systems
Analytical Robots
Dispensing And Harvesting Robots
Robot Incubator/ Shakers
Rapid Translation Systems
- Liquid Chromatography
- Capillary Electrophoresis
- Affinity Columns
Mass Spec
FTIR
UV Circular Dichroism
SAXS SANS
X-Ray Absorption Fine Structure
Cell-Free
UV Absorbance
Enzyme Activity
Size Exclusion Chromatograph LLS
Electron Microscopy
Fluorescence Emission Lifetime
Dispensing And Harvesting Robots
Multisample Fermentors
Automated Centrifuges
Cellular
(3)
(4)
In Vitro Insertion
Translation And Transcription
Incubation
Prepared Genes
Cell-free Synthesis
Extraction and Purification
Solubility Analysis
Biophysical Characterization
Validation QA/QC
Biofunctional Assay
Stability Assessment
Refolding Protocols
Protocols
In Vivo Vectors
Inoculate into Medium Culture
18 hour Incubation
Cellular Synthesis
Automated Data Logging LIMS Protocols and
Strategy Design Super Annotation
(6,7)
Chemical Synthesis of Peptides
Chemical Ligation
Gene Sequence
Amino Acid Labeling
Chemical Synthesis
Process Diagram
Multimodal Protein Production and
Purification (500 attempts/day) (Milligram
Quantities)
Mutants and Labeled Proteins (As Needed)
Primary, Secondary And Tertiary
Structural Characterizations (1000s/day)
Multiple Biofunctional Characterizations (10,000s/day)
Proteins, Data, and Protocols To
Users (100s/day)
Production Targets
35Computing Environment For All Production Lines
(5) Affinity Reagent Production Line
Major Equipment Layout
Hardware and Software
(5)
Production Robots
Distributed Computing And DataBases
Central DataBases
Central Computing
Data Management And Modeling Tools
Screening Robots
- Liquid Chromatography
- Capillary Electrophoresis
- Affinity Columns
Analysis and Assays Robots
UV Absorbance
Isothermal Calorimetry
- Displays and Assays
- Phage
- Ribosome
- Bacterial
- Yeast Two-Hybrid
Production Robots
Surface Plasmon Resonance
- Solubility
- Dye Marker
- CD
- Mass Spec
Computing and Information Processes and Products
Comparative Analyses
Annotation and Production Strategies
Protocols
Simulations/ Workflow Planning
(6,7)
(5)
Post Synthesis Biotinylation or Secondary Tagging
Production of Proteins
Extract and Purify
Reagents And Data To Archives And DataBase
LIMS
Automated Data Capture
Data reduction And Archiving
User Access
Synthesizing Proteins With Secondary Tags
Engineer tag Into Protein Gene
Synthesize
Extract and Purify or Use In Situ
Characterization
Production Automation
Expert Systems
Facility Models
Virtual Facility
Acquire Libraries
Amplify Libraries
Present Target Protein
Affinity Determination
Display Selection
Produce Tag In Quantity
Library Method
Data Management
Data Interpretation
Modeling
Process Refinement
Linear and Constrained Peptides
Process Diagram
- Data Archiving
- Genome Annotation
- Cloning
- Purification
- Libraries
- Affinity Selection and Screening
- Characterization
- Etc.
- Infrastructure
- Machine Environment
- Storage Infrastructure
- Network
- Distributed Systems
Data, Protocols, and Affinity Reagents
to Users (1000s/day)
Multiple Affinity Reagents for Each
Protein (1000s of attempts/day)
Comprehensive Data and Protocols to Production
and Archives (1000s/day)
Production Targets
36Technology Options for Protein Production
June 14-16, 2004 GTL Technology Deep Dive
Workshop, Working Group on Genome-Based Reagents
37Technology Options for Molecular Tag Production
June 14-16, 2004 GTL Technology Deep Dive
Workshop, Working Group on Genome-Based Reagents
38Potential Protein Characterizations (slide being worked)
- quality control
- assignment of possible function
- structural-activity relations
- high-throughput thermodynamic stability
- probing the folding landscape
- identification of reconstitution conditions
- computationally predict protein properties (more efficient production)
- discovering substrates (orphan enzymes)
- identification of co-factors (metals, NADH, ATP, etc.)
- identify long-term storage conditions (what keeps the protein from aggregating or losing activity)
- biological effect of post-translational modifications
- intermolecular interactions (dissociation constants)
- centralized and standardized characterization assays for computational analysis
- identification of DNA/RNA binding sites
- methodology for screening thousands of variants for desirable enzymatic activities
- feedback on protein production computational analyses
- screen small molecule natural product libraries for modifiers (i.e., agonists and antagonists)
- identify the ordered and disordered regions within a protein (provides insights into binding partners)
June 14-16, 2004 GTL Technology Deep Dive
Workshop, Working Group on Genome-Based Reagents
39Elements of Facility 1 Construction Project Scope
and Cost
40Factors in Central/Distributed and Lease/Buy
considerations
- DOE project management Order 413.3 applies equally to built, leased, modified, central/distributed
- The project delivers a fully functional facility at CD4
- Lease Considerations (for identical facilities)
- Integrated cost analyses show cost of leased exceeds cost of build by year 9 after start of project.
- Mortgages program dollars
- All R&D and equipment costs remain identical; usually final building design follows final equipment acquisition decisions
- For existing buildings, modification costs can be large
- Much of this facility is specialty space: characterization instruments, biological isolation, cryogenic storage, etc.
41Factors in Central/Distributed and Lease/Buy
considerations
- Central vs. Distributed
- End-to-end performance would require duplication of equipment, people, R&D: higher cost, lower performance
- JGI showed qualitative gains in quality, throughput, and cost only after consolidation and strict focus on production goals
- Critical mass in people, facilities, learning
- Total building costs will increase if space is not contiguous
- Distributed functions might include
- R&D and methods development
- Specialty Characterizations
- Affinity reagent and detergent libraries
- Selected Affinity reagents