Vyacheslav V. Rykov - PowerPoint PPT Presentation

1
DNA CODES GENERATION USING AN IMPROVED METRIC
Vyacheslav V. Rykov
2
Outline
  • DNA Hybridization/Cross Hybridization
  • DNA Codes
  • Nearest Neighbor Thermodynamics
  • complete computations
  • bounds
  • Overview of Applications and Purposes
  • DNA Bitstring Library
  • Biomolecular Computing

3
DNA Hybridization
  • DNA strands are modeled by directed (3'-5')
    sequences of letters from the alphabet {A, C, G,
    T}.
  • (A, T) and (C, G) are complementary pairs.
  • Two oppositely directed DNA sequences are capable
    of coalescing into a duplex.
  • Because an A (C) in one strand can (usually)
    only bind to a T (G) in the oppositely directed
    strand, the greatest energy of duplex formation
    is obtained when the two sequences are
    reverse-complements (complements)
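The complementary pairing above can be sketched in a few lines of Python. This is a minimal illustration; the function name `reverse_complement` is a label chosen here and does not come from the slides.

```python
# Complement map for the pairs (A, T) and (C, G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(strand: str) -> str:
    """Return the reverse-complement of a strand written 5' to 3'."""
    return strand.upper().translate(COMPLEMENT)[::-1]

# A coding strand and its probing complement from the DNA code example.
print(reverse_complement("TACGCGACTTTC"))  # GAAAGTCGCGTA
```

A strand and its reverse-complement form the Watson-Crick duplex; any other pairing of strands is a potential cross hybridization.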

4
Orientation of single DNA strands is important
for hybridization.
5
A DNA Code
Coding Strands for Ligation    Probing Complement Strands for Reading
5' TACGCGACTTTC 3'             5' GAAAGTCGCGTA 3'
5' ATCAAACGATGC 3'             5' GCATCGTTTGAT 3'
5' TGTGTGCTCGTC 3'             5' GACGAGCACACA 3'
5' ATTTTTGCGTTA 3'             5' TAACGCAAAAAT 3'
5' CACTAAATACAA 3'             5' TTGTATTTAGTG 3'
5' GAAAAAGAAGAA 3'             5' TTCTTCTTTTTC 3'

Watson Crick (WC) Duplexes: Must Have
5' TACGCGACTTTC 3' with 5' GAAAGTCGCGTA 3'
ATCAAACGATGC with GCATCGTTTGAT

Cross Hybridized (CH) Duplexes: Must Avoid
TACGCGACTTTC with GCATCGTTTGAT
ATTTTTGCGTTA with GAAAAAGAAGAA
7
x = 5' ggCaCaTcatAct 3'
y = 5' agatgGttAAccT 3'
x against y in the opposite direction: 5' ggCaCaTcatAct 3' and 3' TccAAttGgtaga 5'
x against the reverse-complement of y: 5' ggCaCaTcatAct 3' and 5' AggTTaaCcatct 3'
8
DNA codes serve as universal components for
biomolecular computing. DNA codes are closed
under reverse-complementation. The strands in a
DNA code have such binding specificity that a
code strand will only hybridize with its
reverse-complement and will not cross hybridize
with any other code strand in the DNA code. Such
collections of strands are crucial to the success
of biomolecular computing and biomolecular
nanotechnology.
The basic idea is to have correct, parallel, and
autonomous addressing.
9
Characterization of synthetic DNA bar codes in
Saccharomyces cerevisiae gene-deletion strains
(Eason et al., PNAS).
DNA codes for self-assembly of any components
that can be attached to DNA. Their size presents
the potential for increased complexity and
location control in nanostructures produced by
assembly that is driven by DNA duplex formation.
Fundamental physical limits and increasing costs
of fabrication facilities will force alternatives
to conventional microelectronics manufacturing to
be developed.
In self-assembly, weak, local interactions among
molecular components spontaneously organize those
components into aggregates with properties that
range from simple to complex
DNA memory: The capacity and storage density of
such memories are potentially very large.
Information could be mined through massively
parallel template-matching reactions. In
addition, information could be processed based
upon context, and information matched
associatively based upon content.
10
DNA Computing
  • Interest in DNA computing was sparked in
    1994 by Len Adleman.

Adleman showed how DNA molecules can be used to
solve a mathematical problem (the Hamiltonian path
problem).
DNA computing relies on the fact that DNA strands
can be represented as sequences of bases (4-ary
sequences) and on the property of hybridization.
In hybridization, errors can occur. Thus,
error-correcting codes are required for efficient
synthesis of DNA strands to be used in computing.

11
DNA Computing Strand Engineering
Requirements: no codeword-codeword CH (cc-CH), no codeword-probe CH
(cp-CH), no probe-probe CH (pp-CH).

Codeword          Bead probe
T1  AAAAAAAACC    GGTTTTTTTT
F1  TTTCCAAAAA    TTTTTGGAAA
T2  TTTCTTAACC    GGTTAAGAAA
F2  ACTAACAAAA    TTTTGTTAGT
T3  CATAAAACAC    GTGTTTTATG
F3  ATCTTTTCAA    TTGAAAAGAT
T4  CAATCCATTA    TAATGGATTG
F4  CCTTCTAAAT    ATTTAGAAGG
T5  ACTCCTAATA    TATTAGGAGT
F5  TCTCTCTACT    AGTAGAGAGA

Only Allowed Hybridizations
Codeword            Probe
T1  CCAACCAAAAAA    TTTTTTGGTTGG
F1  AAAAAAACCACC    GGTGGTTTTTTT
T2  TTTTTCCTTCCA    TGGAAGGAAAAA
F2  TTACCTCAAACC    GGTTTGAGGTAA
T3  TTTCACAACTCC    GGAGTTGTGAAA
F3  TTCAATCCACAA    TTGTGGATTGAA
T4  TCACTCTCTCAA    TTGAGAGAGTGA
F4  TCTTTCTCCTCT    AGAGGAGAAAGA
T5  CATCTCACCATC    GATGGTGAGATG
F5  AACACTACACAC    GTGTGTAGTGTT

No cp-CH
No cc-CH
No pp-CH
12
DNA Computing Strand Engineering: No codeword cp-CH

Codeword            Probe
F3  TTCAATCCACAA    TTGTGGATTGAA
T4  TCACTCTCTCAA    TTGAGAGAGTGA
F4  TCTTTCTCCTCT    AGAGGAGAAAGA
T5  CATCTCACCATC    GATGGTGAGATG
F5  AACACTACACAC    GTGTGTAGTGTT
T1  CCAACCAAAAAA    TTTTTTGGTTGG
F1  AAAAAAACCACC    GGTGGTTTTTTT
T2  TTTTTCCTTCCA    TGGAAGGAAAAA
F2  TTACCTCAAACC    GGTTTGAGGTAA
T3  TTTCACAACTCC    GGAGTTGTGAAA

PROBE(F2)
G G T T T G A G G T A A
C A A C C A A A A A A - T T A C C T C A A A C C - T T C A A T C C A C A A - T C A C T C T C T C A A - C A T C T C A C C A T C
Library strand: T1-F2-F3-T4-T5 (bitstring 1 0 0 1 1)
WC bonding? Yes. Bitstring is F2: good read.

PROBE(T2)
G G A G T T G T G A A
C A A C C A A A A A A - T T A C C T C A A A C C - T T C A A T C C A C A A - T C A C T C T C T C A A - C A T C T C A C C A T C
Library strand: T1-F2-F3-T4-T5
Darn! CH bonding. Bitstring is not T2: bad read.
13
DNA Computing Strand Engineering: No codeword
pp-CH, cc-CH

pp-CH interferes with reading
PROBE(F2)  G G T T T G A G G T A A
PROBE(T4)  T T G A G A G A GT G

Bonding site competition
PROBE(F2)  G G T T T G A G G T A A
PROBE(T4)  T T G A G A G A GT G
C A A C C A A A A A A - T T A C C T C A A A C C - T T C A A T C C A C A A - T C A C T C T C T C A A - C A T C T C A C C A T C

cc-CH interferes with separation and leads to
unwanted library strand interaction
T1-F2-F3-T4-T5
C A A C C A A A A A A - T T A C C T C A A A C C - T T C A A T C C A C A A - T C A C T C T C T C A A - C A T C T C A C C A T C
F1-F2-T3-T4-F5
T T T C C A A A A-A T T A C C T C A A A C C - T T T C A C A A C T C C - T C A C T C T C T C A A - A A C A C T A C A C A C
14
Watson-Crick Nearest Neighbor Computation
WC Duplex
5' g g c a c a 3'
3' c c g t g t 5'
Stacked pairs: gg 1.84, gc 2.24, ca 1.45, ac 1.44, ca 1.45
NNFE = 8.42
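The nearest-neighbor sum for a Watson-Crick duplex can be sketched as follows. The table `NN` holds stacking free-energy magnitudes in kcal/mol; these are commonly cited unified nearest-neighbor parameters and should be read as an assumption, since the slides show only the handful of values used in the example. The function names are illustrative, and initiation and end corrections are ignored.

```python
# Assumed stacking free-energy magnitudes (kcal/mol), keyed by the
# dinucleotide read 5' to 3' on one strand. Not taken from the slides
# except where the example values coincide.
NN = {"AA": 1.00, "AT": 0.88, "TA": 0.58, "CA": 1.45, "GT": 1.44,
      "CT": 1.28, "GA": 1.30, "CG": 2.17, "GC": 2.24, "GG": 1.84}

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(strand: str) -> str:
    return strand.upper().translate(COMPLEMENT)[::-1]

def stack_energy(pair: str) -> float:
    """Energy of one stacked pair; a stack and its reverse-complement are equal."""
    pair = pair.upper()
    return NN[pair] if pair in NN else NN[reverse_complement(pair)]

def wc_nnfe(strand: str) -> float:
    """Sum of stacked-pair energies for a strand bound to its reverse-complement."""
    s = strand.upper()
    return sum(stack_energy(s[k:k + 2]) for k in range(len(s) - 1))

print(round(wc_nnfe("ggcaca"), 2))  # 8.42
```

For the six-base example the five stacked pairs are gg, gc, ca, ac, ca, which reproduces the 8.42 shown on the slide.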
15
Cross Hybridized Nearest Neighbor Upper Bound
Computation
5' ggCaCaTcatAct 3'
3' TccAAttGgtaga 5'
Compared in the same direction: 5' ggCaCaTcatAct 3' and 5' AggTTaaCcatct 3'
Stacked-pair values shown: 1.45, 1.28, .27, 1.84, 0.88
NNFE
16
Intermolecular Interactions Duplexes
[Figure: duplexes with a symmetric loop and an asymmetric loop; values shown: 2.90 (5.3) and 2.20 (5.3)]
17
Intramolecular Interactions
CAAGACTTTTTGGTAGTAAA
TTTCCCGGAAGGGAAATTCC
19
Virtual Duplex
5' ggCaCaTcatAct 3'
3' TccAAttGgtaga 5'
Virtual Stacked Pairs
5' gg C a C a T c a t A ct 3'
5' A gg TT a a C c a t ct 3'
20
5' GGCACATCATACT 3' and its reverse-complement 5' AGTATGATGTGCC 3'
5' AGGTTAACCATCT 3' and its reverse-complement 5' AGATGGTTAACCT 3'

Nearest Neighbor Approximate Free Energy of duplex
formation (WC)
5' GGCACATCATACT 3' with 5' AGTATGATGTGCC 3': 18.8
Stacked-pair values shown: 2.24, 1.84, 1.45

Cross hybridized (CH) duplex
5' ggCaCaTcatAct 3' with 3' TccAAttGgtaga 5'
Compared in the same direction: 5' gg C a C a T c a t A ct 3' and 5' A gg TT a a C c a t ct 3'
NNFE CH 6.45
Stacked-pair values shown: 1.28, 1.45, 0.88, 1.84
21
[Scatter plot: our FE bound versus precise FE; length 16, 435 random; correlation .737]
22
Basic Notations
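Let $\mathcal{A}^n$ denote a set consisting of all vectors
(codewords) of length n built over $\mathcal{A}$,
i.e. $\mathcal{A} = \{A, C, G, T\}$.
Let $d : \mathcal{A}^n \times \mathcal{A}^n \to \mathbb{R}$ be such that
1) $d(x, y) \ge 0$, with equality if and only if $x = y$; 2) $d(x, y) = d(y, x)$; 3) $d(x, z) \le d(x, y) + d(y, z)$.
Let $C \subseteq \mathcal{A}^n$ be such that $|C| = M$ and $d(x, y) \ge d$ for all distinct $x, y \in C$.
$C$ is referred to as a Code of length n, size M, and
minimum distance d.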
23
A sphere in $\mathcal{A}^n$ centered at x having radius d: $S_d(x) = \{ y \in \mathcal{A}^n : d(x, y) \le d \}$
Volume of the sphere around x, of radius d: $V_d(x) = |S_d(x)|$
Spaces
A space is HOMOGENEOUS when the volume of a
sphere does not depend on where it is centered,
i.e. $V_d(x) = V_d$ for every x.
A space is NON-HOMOGENEOUS when the volume of a
sphere does depend on where it is centered.
24
Similarity
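  • Sequence $z = (z_1, \dots, z_s)$ is a subsequence of $x = (x_1, \dots, x_n)$
    if and only if there exists a strictly increasing
    sequence of indices $1 \le i_1 < i_2 < \dots < i_s \le n$
    such that $z_k = x_{i_k}$ for $k = 1, \dots, s$.

$\mathcal{LCS}(x, y)$ is defined to be the set of longest common
subsequences of x and y.
LCS(x, y) is defined to be the length of the longest common
subsequence of x and y.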
25
Example of LCS
Just what it says x
y LCS(x, y) C G A G LCS(x, y) 7
26
Insertion-Deletion Metric
Original Insertion-Deletion metric (Levenshtein
1966)
This metric results from the number of deletions
and insertions that need to be made to obtain y
from x .
For vectors that have the same length n, the number
of deletions that will be made is n - LCS(x, y);
likewise, the number of insertions that will be
made is n - LCS(x, y).
27
Better Metric ?
  • LCS is simple and easy to compute.
  • LCS essentially is a count of the number of base
    pairings between two sequences, and thus does
    approximate bonding energy.
  • Clue: if two bases bond, but neither their
    neighbors to the right nor to the left bond, it really
    doesn't contribute much.
  • We might call such inconsequential bonds lone
    bonds.

28
Lone Bonds
  • B B B B B B B B B B B B B B B B
  • B B B B B B B B B B B B B B B B
  • The red bonds are lone bonds that don't
    contribute to the binding energy.

29
Block LCS
  • The longest common subsequence SUCH THAT
  • If x_i is matched to y_j, THEN EITHER
  • x_{i-1} is matched to y_{j-1}, OR
  • x_{i+1} is matched to y_{j+1}

30
Longest Common Stacked Pair Subsequence
A common subsequence $z = (z_1, \dots, z_s)$ is called a common
stacked pair subsequence of length s
between x and y if any two elements
$z_k, z_{k+1}$ are consecutive in x and consecutive in y,
or, if they are non-consecutive in x or
non-consecutive in y, then $z_{k-1}, z_k$ and $z_{k+1}, z_{k+2}$
are consecutive in x and y.
Let $S(x, y)$ denote the length of the longest
sequence occurring as a common stacked pair
subsequence z between sequences x
and y. The number $S(x, y)$ is called a
similarity of blocks between x and y. The metric
is defined to be $d(x, y) = n - S(x, y)$.
31
Bounds in Coding Theory
We will be working in a NON-HOMOGENEOUS space,
which makes obtaining exact formulas for
sphere volumes and code sizes VERY HARD.
[6] L. M. G. M. Tolhuizen (1997), The Generalized
Varshamov-Gilbert Bound is Implied by Turán's
Theorem, IEEE Transactions on Information Theory,
43(5).
Varshamov-Gilbert Lower Bound on Code Size in
with any metric
32
Turán's Theorem
  • Let G be a simple graph on N vertices and e
    edges. G contains an M-clique if $e > \frac{M-2}{M-1} \cdot \frac{N^2}{2}$.

CLIQUES
33
The edge set of G is constructed as follows: an
edge (x, y) exists in G if and only if d(x, y) ≥
d. The first question is: how many edges does G
have? This can be found by taking spheres of
radius d - 1 around each vector and counting how
many vectors are outside the particular sphere.
Since edges will be double counted, we must
divide by 2.
34
From Turan to Varshamov-Gilbert
If
Then there exists a code of size M.
35
Let
Then
Hence there exists a code of size M and so
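The step from Turán's theorem to the Varshamov-Gilbert bound can be written out as a short derivation. This is a sketch under assumed notation that the transcript does not preserve: $N$ is the number of vectors in the space, $V_{d-1}(x)$ is the volume of the sphere of radius $d-1$ centered at $x$, and $\bar{V}_{d-1}$ is the average of those volumes.

```latex
% Edge count: each vector x is joined to the N - V_{d-1}(x) vectors outside
% its sphere of radius d-1, and every edge is counted twice.
e = \frac{1}{2} \sum_{x} \bigl( N - V_{d-1}(x) \bigr)
  = \frac{N}{2} \bigl( N - \bar{V}_{d-1} \bigr),
\qquad
\bar{V}_{d-1} = \frac{1}{N} \sum_{x} V_{d-1}(x).

% Turan's condition for an M-clique, i.e. a code of size M:
\frac{N}{2} \bigl( N - \bar{V}_{d-1} \bigr) > \frac{M-2}{M-1} \cdot \frac{N^{2}}{2}
\iff \frac{N}{M-1} > \bar{V}_{d-1}
\iff M < \frac{N}{\bar{V}_{d-1}} + 1.

% Taking M = \lceil N / \bar{V}_{d-1} \rceil satisfies the condition, so the
% largest code size with minimum distance d is at least
M_{\max} \ge \frac{N}{\bar{V}_{d-1}}.
```

Only the average sphere volume enters the bound, which is why it remains usable in a non-homogeneous space.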
36
Stacked Pair Metric Bounds
The upper bound for the average sphere volume in
this metric will be
The Varshamov-Gilbert bound becomes
37
Bounds for Stacked Pair Metric
[Plot: bounds for d = 6, 7, 8, 9, 10]
38
Insertion-deletion stacked pair
thermodynamic metric
Thermodynamic weight of virtual stacked pairs.
  • Can use statistical estimation of sphere volume.

39
LCS(x, y) is easy to compute: O(n^2).
Algorithm for LCS metric
  • Notation: Let x_i be symbol i of the sequence x.
  • Notation: Let x(i) be the first i symbols of the
    sequence x.
  • Notation: Let LCS(i, j) = LCS(x(i), y(j)).

If x , then x3
C, and x(3)
40
Case 1: sequences end with the same symbol
  • A C G C G T T A
  • C T G A T A C A
  • Get the LCS of the two sequences without their
    last symbols and add 1 for the As that have to
    match.
41
Case 2: sequences end with different symbols
  • A C G C G T T A
  • C T G A T A C C
  • Take the best LCS of these two

42
Solve Problem Recursively
  • If x(i) and y(j) end with the same symbol, say A,
    then LCS(x(i), y(j)) = LCS(x(i - 1), y(j - 1)) + A
  • If x(i) and y(j) do NOT end with the same symbol,
    then LCS(x(i), y(j)) = max{LCS(x(i - 1), y(j)),
    LCS(x(i), y(j - 1))}

43
Dynamic Programming
  • Inefficient: we keep evaluating the same LCS(i,
    j) over and over.
  • Instead, use dynamic programming.
  • Fill in a table of LCS(i, j) values by i and j.
  • You only have to figure each LCS(i, j) once.
  • O(n^2).
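The table-filling idea can be sketched as below. This is a standard textbook dynamic program for the LCS length, written here for illustration; the names `lcs_table` and `lcs_length` are not from the slides.

```python
def lcs_table(x: str, y: str) -> list[list[int]]:
    """table[i][j] = LCS length of the first i symbols of x and first j of y."""
    n, m = len(x), len(y)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if x[i - 1] == y[j - 1]:
                # Case 1: same last symbol, extend the LCS of the shorter prefixes.
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # Case 2: different last symbols, take the better of the two options.
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table

def lcs_length(x: str, y: str) -> int:
    return lcs_table(x, y)[len(x)][len(y)]

x, y = "ACTGCT", "GACGCT"
print(lcs_length(x, y))           # 5
print(len(x) - lcs_length(x, y))  # 1 deletion (and 1 insertion) to turn x into y
```

Each cell is computed once from its three neighbors, which gives the O(n^2) running time.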

44
In terms of dynamic programming table
Cell we are trying to figure out
Information we use
45
Stacked pair metric
Algorithm for Stacked Pair Metric
  • The longest common subsequence SUCH THAT there
    are no lone bonds.
  • If x_i is matched to y_j, THEN EITHER
  • x_{i-1} is matched to y_{j-1}, OR
  • x_{i+1} is matched to y_{j+1}

46
Cannot break a block LCS
  • Big regular LCS
  • A C T G C T
  • G A C G C T
  • Break to get two smaller regular LCSs
  • A C T G C T
  • G A C G C T

47
Cannot break a block LCS
  • Big block LCS
  • G G T A G G
  • C C T A C C
  • CANNOT break to get two smaller block LCSs
  • G G T A G G
  • C C T A C C

48
Adding a single symbol to a string can have
effects arbitrarily far back
  • A C T C C C C T
  • G G G G G A C T G
  • A C T C C C C T G
  • G G G G G A C T G

These three bonds make the LCSP.
Add just one symbol, G, and the red bond must be
moved to make the new LCSP.
49
Case I
  • Requirement: either x_i ≠ y_j or x_{i-1} ≠ y_{j-1} or
    both.
  • Result: LCSP(i, j) = max{LCSP(i - 1, j), LCSP(i,
    j - 1)}

Case II
  • Requirement: Not case I, and LCSP(i - 2, j - 2) =
    LCSP(i - 1, j - 1)
  • Result: LCSP(i, j) = LCSP(i - 1, j - 1) + 2

C A T G A T
2
0
0
50
Case III
  • Requirement: Not case I or II, and LCSP(i -
    1, j - 1)
  • Result: LCSP(i, j) = max{LCSP(i - 1, j), LCSP(i,
    j - 1)}

Case IV
  • Requirement: Not case I, II, or III, and
    LCSP(i - 2, j - 2) = LCSP(i - 1, j - 1) - 1
  • Result: LCSP(i, j) = LCSP(i - 1, j - 1) + 1

51
Case V
  • Requirement: Not case I, II, III, or IV, and
    x_{i-2} ≠ y_{j-2}
  • Result: LCSP(i, j) = LCSP(i - 1, j - 1)

Case VI
  • Requirement: Not case I, II, III, IV, or V.
  • Result: LCSP(i, j) = LCSP(i - 1, j - 1) + 1

52
Two algorithms
53
Tail Equality
  • The tail equality of two sequences is the number
    of symbols they have at their ends that are the
    same.
  • That is, if x and y are sequences, |x| = i and
    |y| = j, t is the tail equality of x and y if
    x_{i-t} ≠ y_{j-t}, but x_{i-k} = y_{j-k} for 0 ≤ k < t.
  • Tail equality is a function of two sequences,
    independent of any matching between them.

54
End Count
  • The end count of a matching of two sequences is
    the number of pairs of symbols at their ends that
    are matched.
  • That is, if x and y are sequences, |x| = i and
    |y| = j, and M is a matching between them, e is
    the end count of M if M does not match x_{i-e} to
    y_{j-e}, but does match x_{i-k} to y_{j-k} for 0 ≤ k < e.
  • End count is a function of a PARTICULAR MATCHING
    between two sequences.

55
Tail equality of two sequences
Tail equality 3: A G C T C and A T C T C
Tail equality 0: A G C T G and A T C T A
End count of a matching
End count 2: A G C T C and A T C T C
End count 0: A G C T G and A T C T A
56
  • The end count of a matching between x and y
    cannot exceed the tail equality of x and y.
  • Let LCSP(k)(i, j) be the length of the longest
    LCSP(i, j) achievable with a matching of end
    count k.
  • where e is the tail equality of x and y.
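The end-count idea suggests a direct dynamic program: for each cell, try every end count k allowed by the tail equality, where the last k symbols are matched as one block. The sketch below is one way to realize that, not the presenter's exact case analysis; it counts matched symbols, treats blocks of length 1 as forbidden lone bonds, and uses illustrative function names.

```python
def tail_equality(x: str, y: str, i: int, j: int) -> int:
    """Number of equal symbols at the ends of the prefixes x[:i] and y[:j]."""
    t = 0
    while t < i and t < j and x[i - 1 - t] == y[j - 1 - t]:
        t += 1
    return t

def lcsp(x: str, y: str) -> int:
    """Longest common subsequence with no lone bonds: every matched symbol
    belongs to a block of at least two consecutive matched pairs."""
    n, m = len(x), len(y)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # End count 0: the last symbol of x or of y is left unmatched.
            best = max(table[i - 1][j], table[i][j - 1])
            # End count k >= 2: the last k symbols are matched as a block.
            for k in range(2, tail_equality(x, y, i, j) + 1):
                best = max(best, table[i - k][j - k] + k)
            table[i][j] = best
    return table[n][m]

print(lcsp("ACTCCCCT", "GGGGGACTG"))   # 3 (the block ACT)
print(lcsp("ACTCCCCTG", "GGGGGACTG"))  # 4 (blocks AC and TG)
```

The second call shows how appending one symbol rearranges bonds far back in the strings: the block ACT gives way to the two blocks AC and TG. The inner loop over k is what makes one cell cost O(n) in the worst case.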

57
k 3 A C T G C T A T A C T G C T
A T
best of these two 3
58
Example Figure LCSP(3) A C T G C T A
T A C T G C T A T
best of these two 3
59
  • Substituting
  • O(n) worst case for one cell.
  • O(n^3) for algorithm.
  • In practice, only 56% more time.
  • Efficient algorithm takes O(n) memory.
  • Simple algorithm takes O(n^2) memory.

62
Code Generation
  • Start with empty code.
  • Repeatedly generate random codewords and add them
    if they meet the distance requirement.
  • When to stop?
  • After n trials?
  • When n trials in a row have failed?
  • When fewer than i of the last n trials have
    succeeded?
  • When the size of the code is near a maximum
    predicted by theory?
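The random generation loop can be sketched as follows. Several choices here are assumptions made for illustration: the distance is taken to be n - LCS(x, y), the stopping rule is "a fixed number of trials in a row have failed", and the names `generate_code` and `max_consecutive_failures` are hypothetical.

```python
import random

def lcs_length(x: str, y: str) -> int:
    """LCS length using one row of the dynamic programming table at a time."""
    prev = [0] * (len(y) + 1)
    for a in x:
        cur = [0]
        for j, b in enumerate(y, 1):
            cur.append(prev[j - 1] + 1 if a == b else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def distance(x: str, y: str) -> int:
    # Assumed metric for equal-length words: n - LCS(x, y).
    return len(x) - lcs_length(x, y)

def generate_code(n: int, d: int, max_consecutive_failures: int = 200,
                  seed: int = 0) -> list[str]:
    """Add random codewords until too many candidates in a row are rejected."""
    rng = random.Random(seed)
    code: list[str] = []
    failures = 0
    while failures < max_consecutive_failures:
        candidate = "".join(rng.choice("ACGT") for _ in range(n))
        if all(distance(candidate, word) >= d for word in code):
            code.append(candidate)
            failures = 0
        else:
            failures += 1
    return code

code = generate_code(n=6, d=3, max_consecutive_failures=30, seed=1)
print(len(code), code[:3])
```

Swapping `distance` for a stacked-pair distance, or changing the stopping rule to one of the other options listed above, leaves the loop unchanged.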

63
Markov Parameter
  • When generating the random sequences, one can
    pick a Markov parameter, f.
  • f is the probability one symbol differs from the
    one before.
  • f = 0.0 means all As, all Cs, all Gs, or all Ts.
  • f = 0.75 means fully random: one symbol does not
    affect the next (no memory).

Optimal Fixed Markov
  • For some small codes it appeared f = 0.62 yielded
    the biggest codes for a given number of trials.
  • A slightly bigger f seemed better for longer
    codewords.
  • Setting f to 0.62 yields a bigger code for a
    given number of trials than random (f = 0.75).

Adaptive Markov
  • Can do better yet with adaptive Markov.
  • Start f at 0.0, and increase it by a small amount
    whenever there are some number of consecutive
    failures, up to 0.62.
  • Makes sure the codewords with lots of repetition
    get into the code.
  • They have a small ball size in the space
    containing all possible codewords.

64
Adaptive Markov Problem
  • Adaptive Markov generally is an improvement.
  • Hard to control.
  • If f doesn't get up to 0.62 early in the code
    generation, adaptive Markov does WORSE.

65
Empirical Relation Between Codeword Length and
Code Size
68
A DNA Computing Paradigm
The identification of maximal frequent sets in
data fields is the computational bottleneck in
association rule discovery. This is an important
problem, and the independent sets and maximal
cliques problems fit this paradigm.
69
DNA Code and DNA Bitstring Library
DNA CODE
Codeword          Bead probe
T1  AAAAAAAACC    GGTTTTTTTT
F1  TTTCCAAAAA    TTTTTGGAAA
T2  TTTCTTAACC    GGTTAAGAAA
F2  ACTAACAAAA    TTTTGTTAGT
T3  CATAAAACAC    GTGTTTTATG
F3  ATCTTTTCAA    TTGAAAAGAT
T4  CAATCCATTA    TAATGGATTG
F4  CCTTCTAAAT    ATTTAGAAGG
T5  ACTCCTAATA    TATTAGGAGT
F5  TCTCTCTACT    AGTAGAGAGA

DNA LIBRARY (DNA BITSTRINGS)
 1. AAAAAAAACC-TTTCTTAACC-CATAAAACAC-T4-T5
 2. AAAAAAAACC-TTTCTTAACC-CATAAAACAC-T4-F5
 3. AAAAAAAACC-TTTCTTAACC-CATAAAACAC-F4-T5
 4. AAAAAAAACC-TTTCTTAACC-CATAAAACAC-F4-F5
 5. AAAAAAAACC-TTTCTTAACC-ATCTTTTCAA-T4-T5
 6. AAAAAAAACC-TTTCTTAACC-ATCTTTTCAA-T4-F5
 7. AAAAAAAACC-TTTCTTAACC-ATCTTTTCAA-F4-T5
 8. AAAAAAAACC-TTTCTTAACC-ATCTTTTCAA-F4-F5
 9. AAAAAAAACC-ACTAACAAAA-CATAAAACAC-T4-T5
10. AAAAAAAACC-ACTAACAAAA-CATAAAACAC-T4-F5
11. AAAAAAAACC-ACTAACAAAA-CATAAAACAC-F4-T5
12. AAAAAAAACC-ACTAACAAAA-CATAAAACAC-F4-F5
13. AAAAAAAACC-ACTAACAAAA-ATCTTTTCAA-T4-T5
14. AAAAAAAACC-ACTAACAAAA-ATCTTTTCAA-T4-F5
15. AAAAAAAACC-ACTAACAAAA-ATCTTTTCAA-F4-T5
16. AAAAAAAACC-ACTAACAAAA-ATCTTTTCAA-F4-F5
17. TTTCCAAAAA-TTTCTTAACC-CATAAAACAC-T4-T5
18. TTTCCAAAAA-TTTCTTAACC-CATAAAACAC-T4-F5
19. TTTCCAAAAA-TTTCTTAACC-CATAAAACAC-F4-T5
20. TTTCCAAAAA-TTTCTTAACC-CATAAAACAC-F4-F5
21. TTTCCAAAAA-TTTCTTAACC-ATCTTTTCAA-T4-T5
22. TTTCCAAAAA-TTTCTTAACC-ATCTTTTCAA-T4-F5
23. TTTCCAAAAA-TTTCTTAACC-ATCTTTTCAA-F4-T5
24. TTTCCAAAAA-TTTCTTAACC-ATCTTTTCAA-F4-F5
25. TTTCCAAAAA-ACTAACAAAA-CATAAAACAC-T4-T5
26. TTTCCAAAAA-ACTAACAAAA-CATAAAACAC-T4-F5
27. TTTCCAAAAA-ACTAACAAAA-CATAAAACAC-F4-T5
28. TTTCCAAAAA-ACTAACAAAA-CATAAAACAC-F4-F5
29. TTTCCAAAAA-ACTAACAAAA-ATCTTTTCAA-T4-T5
30. TTTCCAAAAA-ACTAACAAAA-ATCTTTTCAA-T4-F5
31. TTTCCAAAAA-ACTAACAAAA-ATCTTTTCAA-F4-T5
32. TTTCCAAAAA-ACTAACAAAA-ATCTTTTCAA-F4-F5
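The library is every concatenation of one codeword per bit position, so it can be enumerated mechanically. The sketch below uses the ten codewords of the DNA code on this slide; the dictionary layout and the name `build_library` are illustrative choices, and the strands are written without the hyphens used on the slide.

```python
from itertools import product

# Codewords of the DNA code: T_k encodes bit k = 1, F_k encodes bit k = 0.
CODE = {
    "T1": "AAAAAAAACC", "F1": "TTTCCAAAAA",
    "T2": "TTTCTTAACC", "F2": "ACTAACAAAA",
    "T3": "CATAAAACAC", "F3": "ATCTTTTCAA",
    "T4": "CAATCCATTA", "F4": "CCTTCTAAAT",
    "T5": "ACTCCTAATA", "F5": "TCTCTCTACT",
}

def build_library(code: dict[str, str], bits: int = 5) -> list[str]:
    """All 2**bits strands, ordered T...T first with the last bit varying fastest."""
    library = []
    for choice in product("TF", repeat=bits):
        library.append("".join(code[f"{v}{k}"] for k, v in enumerate(choice, 1)))
    return library

library = build_library(CODE)
print(len(library))  # 32
print(library[23])   # strand 24: F1-T2-F3-F4-F5
```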
70
Example Independent Sets and Cliques
[Figure: a graph G on vertices 1-5 and its complement Ḡ]
Edges in G are {1,2}, {2,3}, {3,4}, {4,5}, {1,4},
{2,5}.
Edges in Ḡ are {1,3}, {1,5}, {2,4}, {3,5}.
An independent set is a collection of vertices
that contains no edge. A clique is a subgraph
where every pair of vertices has an edge between
them. For a graph G, its complement Ḡ is the set
of edges not in G. A maximal independent set in G
is a maximal clique in Ḡ, e.g., {1,3,5}.
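The relation between independent sets and cliques in the complement can be checked directly on the five-vertex example. This is a small illustrative sketch; the function names are not from the slides.

```python
from itertools import combinations

def complement_edges(n: int, edges: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """All pairs on vertices 1..n that are not edges of the given graph."""
    present = {frozenset(e) for e in edges}
    return [(u, v) for u, v in combinations(range(1, n + 1), 2)
            if frozenset((u, v)) not in present]

def is_independent(vertices: list[int], edges: list[tuple[int, int]]) -> bool:
    """True if no edge has both endpoints inside the vertex set."""
    vs = set(vertices)
    return not any(u in vs and v in vs for u, v in edges)

def is_clique(vertices: list[int], edges: list[tuple[int, int]]) -> bool:
    """True if every pair of the given vertices is joined by an edge."""
    present = {frozenset(e) for e in edges}
    return all(frozenset(p) in present for p in combinations(vertices, 2))

G = [(1, 2), (2, 3), (3, 4), (4, 5), (1, 4), (2, 5)]
G_bar = complement_edges(5, G)
print(G_bar)                          # [(1, 3), (1, 5), (2, 4), (3, 5)]
print(is_independent([1, 3, 5], G))   # True
print(is_clique([1, 3, 5], G_bar))    # True
```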
71
DNA Computing for Independent Sets and Cliques
[Figure: the graph G and its complement Ḡ on vertices 1-5]
72
DNA Library: strands 1-32, as listed on slide 69
2^(Coding Strands / 2) strands, Coding Strands / 2 bits

Edge 1,2 STM
Probe(F1): T T T T T G G A A A
Probe(F2): T T T T G T T A G T

Strands shown with the probes:
24. TTTCCAAAAA-TTTCTTAACC-ATCTTTTCAA-F4-F5 with T T T T T G G A A A
10. AAAAAAAACC-ACTAACAAAA-CATAAAACAC-T4-F5 with T T T T G T T A G T
29. TTTCCAAAAA-ACTAACAAAA-ATCTTTTCAA-T4-T5 with T T T T T G G A A A and T T T T G T T A G T

X1F or X2F (all subsets not containing 1,2): strands 9-32 of the library
X1T and X2T: strands 1-8 of the library
73
DNA Library
Edge STMs: 1,2  1,3  1,4  1,5  2,3  2,4  2,5  3,4  3,5  4,5
Black ON, Red OFF: Independent Sets in G
Black OFF, Red ON: Cliques in G
74
Universal DNA Computer for any Graph on n Vertices
DNA Library
Every graph G on n vertices has G ∪ Ḡ = all
possible pairs on n vertices. This enables the
construction of a universal device.
Edge STMs: 1,2; 1,3; ...; n-2,n; n-1,n
Each possible edge is an STM. Then, depending on
the problem, the flow is directed by the
edges present (or absent) in the given graph.
Edges in G ON, edges in Ḡ OFF: independent sets
in G when flow completed
Edges in G OFF, edges in Ḡ ON: cliques in G
when flow completed