Title: DNA Computing


1
DNA Computing
  • Charles Ormsby III
  • CSE 497
  • 4/15/2004

2
Outline
  • DNA Computing Characteristics
  • Different Approaches
  • Lipton's Paper
  • DNA Solution of Hard Computational Problems
  • Practical Purposes
  • Future Work/Funding
  • References

3
  • DNA Computing Characteristics
  • (Advantages and Disadvantages)

4
DNA Computation Characteristics
  • Parallel Processing
  • Processes all possible solutions simultaneously!
  • Well kind of, but it is not instantaneous
  • AND, it is a Physical Process!
  • Therefore, the molecular steps required to
    process the solution set can take weeks
  • But we are finding ways to improve time efficiency!
    More to come

5
DNA Computation Characteristics
  • Read/Write Rate of DNA
  • DNA replication rate: 500 base pairs per second
  • - 10 times faster than human cells
  • - Very low error rates
  • But that is only 1000 bits/sec. Compared to the data
    throughput of an average hard drive, that is SLOW!!!
  • Can anyone think of an advantage that DNA-based
    computers might have over the way today's PCs
    interact with memory?

http://www.arstechnica.com/reviews/2q00/dna/dna-2.html
6
DNA Computation Characteristics
  • YES, copies of the replication enzymes can work
    on DNA in parallel
  • Bonus - Replication enzymes can start on the
    second replicated strand of DNA even before
    they're finished copying the first one. So
    already the data rate jumps to 2000 bits/sec
  • Electronic computers are incapable of such a feat!

7
DNA Computation Characteristics
  • Read/Write Rate of DNA (cont'd)
  • Look at what happens after each replication
    iteration:
  • The number of DNA strands increases exponentially
  • 2^n strands after n iterations
  • Data rate increases by 1000 bits/sec per strand
  • After 10 iterations, the replication rate is about 1 Mbit/sec
  • And after 30 iterations it increases to about 1000
    Gbits/sec
  • This is well beyond the sustained data rates of
    the fastest hard drives!!!
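
The arithmetic behind these figures can be sketched as follows. This is only an illustration of the slide's numbers: the conversion of 2 bits per base (four possible bases) is an assumption, and the function name is made up for this example.

```python
# Figures quoted on the slides; 2 bits per base is an assumption
# (four possible bases -> log2(4) = 2 bits).
BASES_PER_SECOND = 500
BITS_PER_BASE = 2


def replication_rate_bits_per_sec(iterations: int) -> int:
    """Aggregate copy rate once the strand count has doubled `iterations` times."""
    strands = 2 ** iterations
    return strands * BASES_PER_SECOND * BITS_PER_BASE


for n in (0, 1, 10, 30):
    print(f"n={n:2d}: {replication_rate_bits_per_sec(n):,} bits/sec")
```

With these assumptions, 10 iterations give roughly 1 Mbit/sec and 30 iterations roughly 1000 Gbits/sec, matching the slide.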

8
DNA Computation Characteristics
  • Data density (bases A, T, C, G)
  • Bases spaced every 0.35 nanometers
  • 1 dimension: 18 Mbits per inch
  • 2 dimensions: over one million Gbits per square
    inch (assuming one base per square nanometer)
  • Typical high-performance hard drive
  • data density: 7 Gbits per square inch
  • A factor of over 100,000 smaller!!
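
A rough check of the two-dimensional figure, as a sketch only. It assumes one base per square nanometer (as the slide states) and 2 bits per base (an assumption made here); the function and constant names are illustrative.

```python
NM_PER_INCH = 2.54e7          # 1 inch = 2.54 cm = 2.54e7 nm
HARD_DRIVE_GBITS_PER_SQ_INCH = 7  # figure quoted on the slide


def dna_gbits_per_sq_inch(bases_per_sq_nm: float = 1.0, bits_per_base: int = 2) -> float:
    """Areal density of a hypothetical flat layer of DNA bases, in Gbits per square inch."""
    bases_per_sq_inch = bases_per_sq_nm * NM_PER_INCH ** 2
    return bases_per_sq_inch * bits_per_base / 1e9


dna_density = dna_gbits_per_sq_inch()
ratio = dna_density / HARD_DRIVE_GBITS_PER_SQ_INCH
print(f"DNA: {dna_density:.3g} Gbits per square inch, about {ratio:,.0f} times the hard drive")
```

Under these assumptions the result is a little over one million Gbits per square inch, and the ratio to the hard drive exceeds 100,000.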

9
DNA Computation Characteristics
  • Double-stranded nature
  • - Every DNA sequence has a natural complement
  • If S = ATTACGTCG,
  • then TAATGCAGC is its complement
  • DNA's complementary nature makes it a unique data
    structure for computation and can be exploited in
    many ways, such as error correction
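
The base pairing can be written as a small lookup. This sketch takes the complement position by position, as the slide does; the function name is illustrative.

```python
# Watson-Crick pairing: A<->T, C<->G.
_PAIRING = str.maketrans("ACGT", "TGCA")


def complement(strand: str) -> str:
    """Position-by-position complement of a strand written with A, C, G, T."""
    return strand.translate(_PAIRING)


print(complement("ATTACGTCG"))
```

Applying the function twice returns the original strand, which is why one strand can serve as a reference copy of the other.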

10
DNA Computation Characteristics
  • DNA Error Rates
  • Biological error rate: 1/10^9 copied bases
  • Hard drive read error rate: 1/10^13
  • Error Correction: errors occur due to many
    factors, for example
  • Incorrect insertions/deletions
  • Damage from thermal energy and UV energy from the
    sun
  • However, if the error occurs in one of the
    strands of double-stranded DNA, repair enzymes
    can restore the proper DNA sequence by using the
    complement strand as a reference.
  • Comparable to a RAID 1 array

http://www.arstechnica.com/reviews/2q00/dna/dna-1.html
11
DNA Computation Characteristics
  • The Statistics of Randomness
  • Pertaining to Adleman's method
  • All HDPP paths are equally likely to be formed
    during the random production of sequences
  • In other words, over a large, well-distributed
    solution set, all solutions (or at least a great
    majority) should be present
  • This is key because in order for the DNA
    computer to arrive at the correct solution, the
    solution must first exist in the solution set
  • Statistics: If only 99% of the solutions exist
    in the solution set, then the method will have a
    success rate of only...?

12
  • Different Approaches
  • Free Floating vs. DNA Chips

13
Free Floating
  • Approach 1: Bits of DNA float freely in a test
    tube
  • (pioneered by Leonard M. Adleman)

14
Free Floating
  • Advantages
  • - Strong general problem-solving application
  • - Increased freedom in experimentation
  • e.g., immediate scalability by amplification
  • (could the freedom also be considered a
    disadvantage?)
  • - Can encode unique problems
  • - Scales very well
  • Can you think of any other advantages?

HAHA, neither could I
15
DNA-based Chips
  • Approach 2: A gold-plated square of glass (one
    inch square) anchors as many as a trillion
    individual strands of DNA to the glass.
  • Microarrays

http://www.dhgp.de/ethics/ethics02.html
16
DNA-based Chips
  • Advantages
  • - Easier to handle, specific orientation
  • - Keeps out impurities
  • - Serves as a building block to scale upwards
  • - Programmable interfaces (in the future)
  • - Very useful for storing information about
    Bio-agents
  • Business Quiz
  • Why is this approach more appealing to
    corporations and institutions who fund research?

17
DNA-based Chips
  • Can be manufactured!!!

18
  • Lipton's Paper
  • DNA Solution of Hard Computational Problems
  • Lipton, Richard J., "DNA Solution of Hard
    Computational Problems." Science, New Series, Vol.
    268, No. 5210 (April 28, 1995), 542-545

19
Richard Lipton's DNA Solution of Hard
Computational Problems
  • Two factors limit any computer's performance
  • Parallel processing capabilities
  • 3 grams of water ≈ 10^22 molecules
  • Computations per unit time
  • 100 million instructions per second
  • Human Time vs. Computation Time

20
Richard Lipton's DNA Solution of Hard
Computational Problems
  • State-of-the-Art Supercomputer
  • 100 million instructions per second
  • Biological computers are limited to only a
    fraction of an experiment per second
  • Doesn't the complexity of the experiment
    determine the difference?
  • However, DNA computers counter the instruction
    time disparity with parallelism

21
Richard Lipton's DNA Solution of Hard
Computational Problems
  • Traveling Salesman Revisited
  • A conventional computer can solve a tour with 70
    cities, but would fail with 100 or more cities
  • Even with 10^23 parallel processors, brute force
    is too inefficient
  • However, are DNA computers only advantageous for
    problems with very large solution sets?
  • No, Adleman's work can be extended to solve
    problems, both those within reach of traditional
    CPUs and those beyond it, in much less time

22
Richard Lipton's DNA Solution of Hard
Computational Problems
  • NP-complete: The Satisfiability Problem
    (SAT)
  • SAT is a simple search problem, and was one of
    the first NP-complete problems
  • Consider
  • F = (x ∨ y) ∧ (¬x ∨ ¬y)
  • Current Best Method: test all 2^n solutions for
    n variables
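
Testing all 2^n assignments can be sketched in a few lines. This is a plain brute-force illustration on a conventional computer, not part of Lipton's method; the function names are made up for the example.

```python
from itertools import product


def formula(x: int, y: int) -> bool:
    """F = (x OR y) AND (NOT x OR NOT y)."""
    return bool((x or y) and ((not x) or (not y)))


def satisfying_assignments(f, n: int):
    """Try every one of the 2**n assignments and keep those that satisfy f."""
    return [bits for bits in product((0, 1), repeat=n) if f(*bits)]


print(satisfying_assignments(formula, 2))
```

For F, only the assignments (0, 1) and (1, 0) survive, which is the result the DNA procedure later arrives at in test tube t6.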

23
Richard Lipton's DNA Solution of Hard
Computational Problems
  • Truth Table
  • Current Best Method: test all 2^n solutions for
    n variables

24
Richard Lipton's DNA Solution of Hard
Computational Problems
  • Initial Assumptions/Conditions
  • This model is simple and idealized
  • Ignores many known complex effects, but is an
    excellent first-order approximation
  • Strands of DNA are just sequences
  • a1, ..., ak over the set {A, C, G, T}
  • Double-stranded DNA is a pair of sequences
  • Given a1, ..., ak and b1, ..., bk, both
    sequences over the set {A, C, G, T}, ai must
    complement bi for i = 1, ..., k, meaning A↔T or C↔G
  • Only consider strands with a length of 20
    nucleotides

25
Richard Lipton's DNA Solution of Hard
Computational Problems
  • Five simple operations that can be performed on
    test tubes that contain DNA strands
  • It is possible to synthesize a large number of copies
    of any single strand
  • Annealing produces a double strand from a single
    strand and its complementary strand
  • Given a test tube of DNA, one can extract the
    strands that contain some simple pattern of
    length l
  • Using a Polymerase Chain Reaction (PCR), one can
    detect whether there are DNA strands at all in
    the test tube
  • All of the DNA in the test tube may be amplified
    by replicating the strands in the test tube

26
The Theory
  • One fixed test tube
  • The set in the test tube corresponds to the
    following graph Gn

27
  • All paths that travel from a1 to a(n+1) encode an
    n-bit binary string
  • At each stage, a path has exactly two choices
  • Unprimed node encodes a 1
  • Primed node encodes a 0
  • Therefore, the example path a1 x' a2 y a3 encodes 01
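
The correspondence between paths and bit strings can be sketched as below. The node names x1, x2, ... (and their primed versions) are a naming assumption for this example; the slide uses x and y for the two-variable case.

```python
def path_for_bits(bits: str) -> list:
    """Build the path a1 -> (x1 or x1') -> a2 -> ... -> a(n+1) for a bit string."""
    path = []
    for k, bit in enumerate(bits, start=1):
        path.append(f"a{k}")
        # Unprimed node encodes a 1, primed node encodes a 0.
        path.append(f"x{k}" if bit == "1" else f"x{k}'")
    path.append(f"a{len(bits) + 1}")
    return path


def bits_for_path(path: list) -> str:
    """Read the bit string back from the x-nodes along a path."""
    return "".join("0" if node.endswith("'") else "1"
                   for node in path if node.startswith("x"))


print(path_for_bits("01"))
```

Each of the 2^n bit strings maps to exactly one path, and reading the primes back off a path recovers the string.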

28
The Solution Set Discovery
  • 1) Encode the graph's vertices in DNA
  • 2) Encode edges in DNA
  • 3) Encode starting and ending points in DNA

29
Step 1 - vertices in DNA
  • The graph is encoded in a test tube of DNA
  • Each vertex of the graph is assigned a random
    pattern of length l from {A, C, G, T}
  • Each encoding is referred to as the name of the
    vertex and comprises two parts
  • 1st half → pi
  • 2nd half → qi
  • Therefore, each vertex can be referenced by pi qi

30
Step 2 - edges in DNA
  • Then, fill a test tube with the following
  • For each vertex, add many copies of a 5' → 3'
    DNA sequence of the form pi qi
  • For each edge i → j, put many copies of a 3' →
    5' sequence of the form (¬qi ¬pj), where ¬
    denotes the complementary sequence
  • If
  • Vertex i = ATCGGCTACTCCTGACTTGA
  • pi = ATCGGCTACT
  • qi = CCTGACTTGA
  • Vertex j = AGGTTCAGTCAGGCCTATTC
  • pj = AGGTTCAGTC
  • qj = AGGCCTATTC
  • Therefore, for edge i → j a sequence like the
    following would be added
  • ¬qi = GGACTGAACT, ¬pj = TCCAAGTCAG
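
The example sequences on this slide work out to the complement of the second half of vertex i followed by the complement of the first half of vertex j. A minimal sketch of that construction, with illustrative function names:

```python
_PAIRING = str.maketrans("ACGT", "TGCA")


def complement(strand: str) -> str:
    """Position-by-position complement (A<->T, C<->G)."""
    return strand.translate(_PAIRING)


def halves(vertex_name: str):
    """Split a vertex name into its first half (p) and second half (q)."""
    mid = len(vertex_name) // 2
    return vertex_name[:mid], vertex_name[mid:]


def edge_strand(vertex_i: str, vertex_j: str) -> str:
    """Strand for edge i -> j: complement of q_i followed by complement of p_j."""
    _, q_i = halves(vertex_i)
    p_j, _ = halves(vertex_j)
    return complement(q_i) + complement(p_j)


print(edge_strand("ATCGGCTACTCCTGACTTGA", "AGGTTCAGTCAGGCCTATTC"))
```

The first half of the edge strand pairs with the tail of vertex i and the second half with the head of vertex j, which is what lets the edge act as a splint joining the two vertices.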

31
Step 3 - end points in DNA
  • Then, add the following DNA strands
  • Add a 3' → 5' sequence of length l/2 that is
    complementary to the first half of the initial
    vertex
  • Similarly, add a 3' → 5' sequence of length l/2
    that is complementary to the last half of the
    final vertex
  • In other words, add ¬p1 and ¬qn
  • If the initial vertex was
  • ACTTGCCATCTCCGATACTT
  • And the final vertex was
  • TCGCCTAATCTACGATCTTA
  • then add
  • TGAACGGTAG and ATGCTAGAAT
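
The two capping strands in this example can be reproduced with a short sketch. The function name is illustrative, and the complement is taken position by position as in the slide's example.

```python
_PAIRING = str.maketrans("ACGT", "TGCA")


def end_strands(initial_vertex: str, final_vertex: str):
    """Return the complement of the first half of the initial vertex and
    the complement of the last half of the final vertex."""
    start_half = initial_vertex[: len(initial_vertex) // 2]
    end_half = final_vertex[len(final_vertex) // 2:]
    return start_half.translate(_PAIRING), end_half.translate(_PAIRING)


print(end_strands("ACTTGCCATCTCCGATACTT", "TCGCCTAATCTACGATCTTA"))
```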

32
Goal of Initial Solution Set
  • Key: every legal path in Gn corresponds to
    a correctly matched sequence of vertices and
    edges
  • Any path through the graph must contain a
    sequence that alternates between vertex, edge,
    vertex, edge,...
  • Try this visual
  • Consider the edge v → u: any path that passes
    through v and then passes through u must fit
    together like bricks

33
  • So, the top 5' → 3' strand represents a series of
    vertices
  • Whereas the bottom 3' → 5' strand represents an edge
  • Furthermore
  • Vertex v is encoded as pv qv
  • Edge v → u is encoded as ¬qv ¬pu

34
  • Why is this ordering significant?

35
  • The end of the vertex and the beginning of the
    edge can anneal because they are complementary!
  • Similarly, the end of the edge and the beginning
    of the next vertex can anneal too!
  • High probability of no inadvertent paths
  • 1) Sequences are chosen at random
  • 2) The sequence lengths are large
  • After the annealing, all of the possible paths
    through the graph will be encoded as DNA
    sequences, each representing an n-bit string
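
Why random, long sequences make inadvertent paths unlikely can be illustrated with a back-of-the-envelope estimate. The sketch assumes each base is chosen independently and uniformly from the four bases and counts only exact full-length matches; real hybridization can tolerate some mismatches, so this is an idealization.

```python
def chance_of_exact_match(length: int) -> float:
    """Probability that two independent random sequences of this length are identical."""
    return 0.25 ** length


for half_length in (5, 10, 20):
    print(half_length, chance_of_exact_match(half_length))
```

For the half-names of length 10 used in the examples, the chance that two unrelated halves coincide is about one in a million (4^10 = 1,048,576), and it shrinks by a factor of four with every added base.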

36
Similarity Between Sequences
  • At any given vertex in a path, the choice is
    simply left or right; therefore, all paths are
    similar
  • What does this mean?
  • All paths are equally likely to be formed
    during the random production of sequences
  • In other words, over a large well distributed
    solution set, all solutions (or at least a great
    majority) should be present
  • This is key because in order for the computer
    to arrive at the correct solution, the solution
    must first exist in the solution set
  • Statistics!

37
Extraction Operations
  • Notation
  • E(t, i, a) denotes all sequences in test tube t
    whose i-th bit equals a
  • This takes one extract operation, which
  • checks for the sequence that corresponds to the
    name of xi if a = 1,
  • and if a = 0, checks for the name of xi'
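
The extract operation can be sketched as a search for a node name within each path. Here a tube is modeled as a list of paths and each path as a tuple of node names; the names x1, x1' and so on are assumptions made for this illustration.

```python
def extract(tube, i: int, a: int):
    """E(t, i, a): split a tube into paths whose i-th bit is a, and the remainder.

    The check looks for the unprimed node x_i when a == 1 and the primed
    node x_i' when a == 0.
    """
    name = f"x{i}" if a == 1 else f"x{i}'"
    selected = [path for path in tube if name in path]
    remainder = [path for path in tube if name not in path]
    return selected, remainder


# All four two-variable paths, written out by hand.
t0 = [
    ("a1", "x1'", "a2", "x2'", "a3"),  # 00
    ("a1", "x1'", "a2", "x2", "a3"),   # 01
    ("a1", "x1", "a2", "x2'", "a3"),   # 10
    ("a1", "x1", "a2", "x2", "a3"),    # 11
]
selected, remainder = extract(t0, 1, 1)
print(selected)
```

In the laboratory the remainder is simply what stays in the original tube after the matching strands are pulled out, so a single operation yields both sets.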

38
Extraction Operations
  • 1) Construct a series of test tubes
  • Values Present
  • t0 contains all assignments: {00, 01, 10, 11}
  • t1 = E(t0, 1, 1) = {10, 11}
  • t1' = remainder of t0 = {00, 01}
  • t2 = E(t1', 2, 1) = {01}
  • Pour t1 and t2 together to form t3
  • t3 = t1 + t2 = {01, 10, 11}

39
Extraction Operations
  • 2) Construct a series of test tubes
  • Values Present
  • t4 = E(t3, 1, 0) = {01}
  • t4' = remainder of t3 = {10, 11}
  • t5 = E(t4', 2, 0) = {10}
  • Pour t4 and t5 together to form t6
  • t6 = t4 + t5 = {01, 10}

40
Extraction Operations
  • 3) Check to see if there are DNA strands
    available in t6
  • Those left in t6 are the satisfying assignments!
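
The three steps can be traced with ordinary sets of bit strings standing in for test tubes. This is a simulation sketch of the example only; variable names such as t1_rest are made up here for the remainder tubes.

```python
def extract(tube: set, i: int, a: int):
    """Return (strings whose i-th bit equals a, everything else in the tube)."""
    hit = {s for s in tube if s[i - 1] == str(a)}
    return hit, tube - hit


t0 = {"00", "01", "10", "11"}

# First clause (x OR y): keep strings with x = 1, then rescue those with y = 1.
t1, t1_rest = extract(t0, 1, 1)
t2, _ = extract(t1_rest, 2, 1)
t3 = t1 | t2  # pouring two tubes together is a set union

# Second clause (NOT x OR NOT y): keep x = 0, then rescue those with y = 0.
t4, t4_rest = extract(t3, 1, 0)
t5, _ = extract(t4_rest, 2, 0)
t6 = t4 | t5

# Detect: is anything left?
print(sorted(t6), "satisfiable" if t6 else "unsatisfiable")
```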

41
Understanding How it Works
  • Test tube t3 consists of all the sequences that
    satisfy the first clause: {01, 10, 11}
  • and, similarly, t6 consists of all those that
    satisfy the second clause and are contained in t3

42
More General Case
  • Any SAT problem on
  • n variables, and
  • m clauses,
  • can be solved with on the order of m extract steps
  • (with one detect step at end)
  • Lipton's Acknowledgments
  • Operations are assumed perfect and without error
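
A generalization of the worked example to arbitrary clauses might look like the sketch below. It simulates tubes as sets of bit strings, assumes error-free operations as Lipton does, and represents a literal as a (variable index, required bit) pair; that representation and the function name are choices made for this illustration.

```python
def solve_sat(n: int, clauses):
    """Filter all 2**n assignments clause by clause; return survivors and extract count."""
    tube = {format(v, f"0{n}b") for v in range(2 ** n)}  # initial tube: every n-bit string
    extract_steps = 0
    for clause in clauses:
        kept = set()
        remainder = tube
        for var, value in clause:  # one extract per literal in the clause
            hit = {s for s in remainder if s[var - 1] == str(value)}
            extract_steps += 1
            kept |= hit                 # pour the extracted strands into the clause's tube
            remainder = remainder - hit
        tube = kept                     # only strands satisfying this clause move on
    return tube, extract_steps


# F = (x OR y) AND (NOT x OR NOT y)
clauses = [[(1, 1), (2, 1)], [(1, 0), (2, 0)]]
survivors, steps = solve_sat(2, clauses)
print(sorted(survivors), steps)
```

Each clause costs one extract per literal, so with a bounded number of literals per clause the total grows linearly with m; the two-clause example uses four extracts and one detect.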

43
  • Practical Purposes

44
Purposes
  • Counter Bioterrorism/Monitor Genetic Progression
  • Institute for Countermeasures against
    Agricultural Bioterrorism (ICAB)
  • Plan
  • 1) Obtain DNA sequences from crops, animals,
    bio-agents, etc.
  • 2) Deploy DNA-chip technology to identify and
    characterize
  • 3) Build geo-referenced information system
  • 4) Predict and track the spread of bio-agents
    after introduction
  • 5) Create powerful DNA-based tools for
    monitoring and enhanced diagnosis
  • DNA microarrays (DNA-based chips)
  • - Can store 1,000 to 100,000 different
    diagnostic DNA sequences
  • Next generation will contain one million tags!

http://icab.tamu.edu/
45
Purposes
  • Predictive Gene Testing

http://www.dhgp.de/ethics/ethics02.html
46
  • Poker Playing

DNA Computing 7th International Workshop on DNA
Based Computers, Dna7, Tampa, Florida, June
10-13, 2001 Revised Papers
47
  • Weighted-Recursive Algorithms

DNA Computing 7th International Workshop on DNA
Based Computers, Dna7, Tampa, Florida, June
10-13, 2001 Revised Papers
48
Pessimism
  • 1) Too fragile and prone to error
  • 2) The field is dominated by hard-core
    enthusiasts who will be forced to "slog through
    and do the heavy research" before there is a
    major breakthrough

Optimism
However, keep in mind that the first commercially
available electronic computer was not well
received, and in 1951 IBM had to reinvent what
it had spent millions of dollars and years working
on to fit customers' needs (such as payroll)
http://www.jsonline.com/alive/news/0607dna.stm
49
The Future of DNA Computing
  • Commercial application by 2010
  • Alternative to traditional computing by 2020
  • Vision: Today we have not one but several
    companies making "DNA chips," where DNA strands
    are attached to a silicon substrate in large
    arrays (for example, Affymetrix's GeneChip).
    Production technology of MEMS is advancing
    rapidly, allowing for novel integrated small
    scale DNA processing devices. The Human Genome
    Project is producing rapid innovations in
    sequencing technology. The future of DNA
    manipulation is speed, automation, and
    miniaturization

http://www.jsonline.com/alive/news/0607dna.stm
50
Research Funding
  • Funding
  • National Science Foundation
  • Pentagon's Defense Advanced Research Projects
    Agency - Much of the military's interest arises
    from the increasing sophistication of encryption
    techniques that other countries can use to encode
    their data. As a result, Washington needs
    ever-more-powerful computers for code breaking

51
  • Internet References
  • http://chronicle.com/data/articles.dir/art-44.dir/issue-4.dir/14a02301.htm
  • http://www.jsonline.com/alive/news/0607dna.stm
  • http://www.arstechnica.com/reviews/2q00/dna/dna-1.html
  • Book/Papers References
  • Lipton, Richard J., "DNA Solution of Hard
    Computational Problems." Science, New Series, Vol.
    268, No. 5210 (April 28, 1995), 542-545
  • DNA Computing: 8th International Workshop on DNA
    Based Computers, DNA8, Sapporo, Japan, June
    10-13, 2002, Revised Papers (Lecture Notes in
    Computer Science, 2568)
  • DNA Computing: 7th International Workshop on DNA
    Based Computers, DNA7, Tampa, Florida, June
    10-13, 2001, Revised Papers
  • Future References
  • http://www.nas.nasa.gov/
  • http://www.nas.nasa.gov/Research/Reports/reportsarchive.html