Genetics and Molecular Biology Tutorial II -- Computational Perspective

1
Genetics and Molecular Biology Tutorial II --
Computational Perspective
  • The goal is to introduce some topics to
    individuals with a minimal background in
    genetics/biology, and yet try to provide some
    examples of topics to maintain the interest of
    individuals with extensive biological/genetics
    backgrounds.

2
Outline
  • Gene structure
  • genomic structure vs mRNA structure
  • coding and noncoding exons
  • introns
  • primary transcript processing
  • aside -- nonsense mediated mRNA degradation
  • alternative splicing and differential
    polyadenylation
  • evolutionary conservation of coding and noncoding
    sequences

3
Outline
  • Genomic structure
  • repetitive sequences
  • LINES and SINES
  • example -- Y chromosome palindromes
  • C value paradox
  • genomes of model organisms
  • example
  • yeast genome and gene-chip
  • single/double knockouts
  • cross-species sequence similarities for putative
    function identification
  • example -- chaperonine

4
Fundamental Genetics and Probability Concepts
  • meiosis and sampling
  • patterns of inheritance
  • monogenic and complex inheritance
  • phenocopy
  • reduced penetrance
  • DNA variation
  • polymorphisms, SNPs, and mutations
  • positional cloning

5
Gene Structure
6
Transcript Processing
  • DNA -> pre-mRNA -> mRNA -> protein
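
The flow on this slide can be sketched in a few lines of Python. This is only a toy model: the sequence, the exon coordinates, and the partial codon table below are made up for illustration and are not taken from the presentation.

```python
# Toy model of DNA -> pre-mRNA -> mRNA -> protein.
# The gene sequence, exon coordinates, and partial codon table are hypothetical.

CODON_TABLE = {
    "AUG": "M", "GCU": "A", "GGA": "G", "UUU": "F", "AAA": "K",
    "UAA": "*", "UAG": "*", "UGA": "*",
}


def transcribe(coding_strand: str) -> str:
    """The pre-mRNA carries the coding-strand sequence with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon intervals (0-based, end-exclusive); introns are dropped."""
    return "".join(pre_mrna[start:end] for start, end in exons)


def translate(mrna: str) -> str:
    """Translate from the first AUG until a stop codon or the end of the message."""
    start = mrna.find("AUG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "X")  # X = codon missing from the toy table
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# exon 1 + intron (starts with GT, ends with AG) + exon 2
gene = "ATGGCT" + "GTAAGTCCAG" + "GGATTTAAATAA"
exons = [(0, 6), (16, 28)]

pre_mrna = transcribe(gene)
mrna = splice(pre_mrna, exons)
protein = translate(mrna)
print(pre_mrna, mrna, protein)
```

Capping and polyadenylation, which are also part of primary transcript processing, are left out of this sketch.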

7
Nonsense mediated mRNA degradation
  • unknown mechanism
  • more rapidly degrades mRNA containing premature
    termination (nonsense) codons
  • Lykke-Andersen, mRNA quality control: Marking
    the message for life or death. Current Biology,
    11, 2001.

8
Nonsense Mediated mRNA Degradation
9
Genome Structure -- repeat classes
10
C-Value Paradox (Hartl, Molecular melodies in
high and low C, Nat. Rev. Genetics, Nov. 2000)
  • refers to the massive, counterintuitive and
    seemingly arbitrary differences in genome size
    observed in eukaryotic organisms
  • Drosophila melanogaster 180 Mb
  • Podisma pedestris 18,000 Mb
  • difference is difficult to explain in view of
    apparently similar levels of evolutionary,
    developmental, and behavioral complexity

11
Alternative Splicing
  • Every conceivable pattern of alternative
    splicing is found in nature. Exons have multiple
    5' or 3' splice sites alternatively used (a, b).
    Single cassette exons can reside between 2
    constitutive exons such that the alternative exon
    is either included or skipped (c). Multiple
    cassette exons can reside between 2 constitutive
    exons such that the splicing machinery must
    choose between them (d). Finally, introns can be
    retained in the mRNA and become translated.
  • Graveley, Alternative splicing: increasing
    diversity in the proteomic world. Trends in
    Genetics, Feb. 2001.
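
To make the combinatorics of patterns (c) and (d) concrete, the sketch below enumerates the isoforms produced by cassette exons that are independently included or skipped, and by a set of exons of which exactly one is chosen. The exon names and function names are hypothetical.

```python
from itertools import product


def cassette_isoforms(exons, cassette):
    """Pattern (c): each cassette exon is independently included or skipped.

    exons: ordered exon names; cassette: names of the optional exons.
    """
    choices = [(name, None) if name in cassette else (name,) for name in exons]
    return [tuple(e for e in combo if e is not None) for combo in product(*choices)]


def mutually_exclusive_isoforms(upstream, alternatives, downstream):
    """Pattern (d): the splicing machinery picks exactly one of the alternatives."""
    return [tuple(upstream) + (alt,) + tuple(downstream) for alt in alternatives]


included_or_skipped = cassette_isoforms(["E1", "E2", "E3", "E4"], {"E2", "E3"})
one_of_many = mutually_exclusive_isoforms(["E1"], ["E2a", "E2b", "E2c"], ["E3"])

for isoform in included_or_skipped + one_of_many:
    print("-".join(isoform))
```

With k independent cassette exons the first function returns 2^k isoforms, which is one way to see how a single gene can encode many distinct mRNAs.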

12
Classic View of Gene No Longer Valid -- Strachan
pg 185
13
Alternative Splicing Example -- Graveley 2001
14
Alternative PolyAdenylation
  • common in human RNA (Edwards-Gilbert 1997)
  • in many genes, 2 or more poly-A signals in 3' UTR
  • alternative transcripts can show tissue
    specificity
  • alternative poly-A signals may be brought into
    play following alternative splicing

15
Edwards-Gilbert. Nucleic Acids Res, 13, 1997
16
  • Evolution of the mitochondrial genome and origin
    of eukaryotic cells

17
Evolutionary Conservation of Coding and Noncoding
Sequences
  • Sequencing of H. sapiens and model organisms is
    basis for comparative genomics
  • Generally, functional solutions (encoded as
    genes) are conserved across organisms, which
    allows us to compare gene sequences and infer
    function
  • protein functional/structural region domains
  • Intergenic regions are generally not conserved
    (always exceptions)

18
Example - MKKS (UniGene Clusters)
  • human vs. rat: 87.4%
  • human vs. mouse: 84.9%
  • human vs. cow: 87.1%
  • mouse vs. rat: 97.8%
  • rat vs. cow: 91.0%
  • mouse vs. cow: 85.1%
  • frog vs. rat: 62.5%
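
Pairwise figures like these are typically a percent identity over an alignment. The slide does not say how the UniGene values were computed, so the following is only a minimal sketch under the assumption that two sequences have already been aligned to equal length with "-" marking gaps; the example sequences are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over alignment columns that contain no gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


# Hypothetical aligned fragments, not real MKKS sequence.
print(percent_identity("ACGTACGT", "ACGTACGA"))
print(percent_identity("AC-GTTA", "ACTGTCA"))
```

Other conventions (counting gap columns as mismatches, or dividing by the shorter sequence length) give different percentages, so the convention should be stated whenever such numbers are compared.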

19
Example - MKKS
21
Computational Approach to Using Conserved Regions
  • Problem -- want to screen genes for mutations
  • Conventional approach -- screen all exons of a
    single gene
  • Alternative -- identify domains within multiple
    genes, and screen domains first, to optimize
    screening time and resources

22
Cross-Species Similarities
  • yeast
  • gene chip for hybridization/expression
  • complete genome (first eukaryote)
  • single knockouts and double knockouts

23
Fundamental Genetics
  • meiosis
  • humans (Hs, Homo sapiens) are diploid
  • meiosis produces haploid gametes
  • mechanism for transmission of genetic material to
    offspring
  • recombination by cross-over (Holliday structure)
    or by independent segregation of homologous pairs

24
Fundamental Genetics (Background for Linkage
Analysis)
  • Rule of Segregation
  • offspring receive ONE allele (genetic material)
    from the pair of alleles possessed by EACH
    parent
  • Rule of Independent Assortment
  • alleles of one gene can segregate independently
    of alleles of other genes
  • (Linkage Analysis relies on the violation of
    Independent Assortment Rule)
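
A small simulation can illustrate why linkage shows up as a violation of independent assortment. The sketch assumes a doubly heterozygous parent with haplotypes AB and ab and a recombination fraction theta between the two loci; theta = 0.5 corresponds to independent assortment. The function names, the seed, and the sample sizes are arbitrary choices for illustration.

```python
import random
from collections import Counter


def simulate_gametes(theta: float, n: int, seed: int = 0) -> Counter:
    """Draw n gametes from an AB/ab parent with recombination fraction theta."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n):
        # Segregation: the allele at locus A comes from one of the two haplotypes.
        source = rng.choice(["AB", "ab"])
        other = "ab" if source == "AB" else "AB"
        # With probability theta the allele at locus B comes from the other haplotype.
        recombined = rng.random() < theta
        gamete = source[0] + (other[1] if recombined else source[1])
        counts[gamete] += 1
    return counts


def recombinant_fraction(counts: Counter) -> float:
    """Proportion of gametes carrying a non-parental combination (Ab or aB)."""
    total = sum(counts.values())
    return (counts["Ab"] + counts["aB"]) / total


for theta in (0.5, 0.1, 0.0):
    gametes = simulate_gametes(theta, 100_000)
    print(theta, dict(gametes), round(recombinant_fraction(gametes), 3))
```

Under independent assortment the four gamete types appear in roughly equal numbers; as theta shrinks, the parental types AB and ab dominate, which is the signal linkage analysis looks for.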

25
Genetic Marker -- Prelude to Linkage Analysis (LA)
  • A genetic marker allows for the observation of
    the genetic state at a particular genomic
    location (locus).
  • A genotype is the measured state of a genetic
    marker.
  • May never be feasible to sequence cases directly.
  • An informative marker is often heterozygous,
    or polymorphic, and enables the observation of
    the inheritance of genetic material.

26
Monogenic and Polygenic Diseases
  • monogenic (Mendelian) -- one gene
  • simple (dominant and recessive) Mendelian
    inheritance
  • direct correspondence between one gene mutation
    and one disorder
  • majority of disease genes found are monogenic
  • polygenic -- (complex) multiple genes
  • heterogeneity and epistasis
  • combinatorics
  • no longer have direct correspondence between one
    gene and disorder
  • majority of disorders are probably polygenic
  • complexity of organisms and observed pathways

27
...Monogenic and Polygenic Diseases
  • phenocopy
  • reduced penetrance
  • Example -- sickle cell anemia
  • classic recessive disorder
  • defect in red blood cells (hemoglobin)
  • but the fetal (infant) hemoglobin gene can leak
    (remain partly expressed)
  • wide range of phenotypes

28
Examples
29
Examples
30
Example
31
BBS4 Pedigree
32
Hardy-Weinberg Equilibrium
  • Rule that relates allelic and genotypic
    frequencies in a population of diploid, sexually
    reproducing individuals if that population has
    random mating, large size, no mutation or
    migration, and no selection
  • Implications
  • allelic frequencies will not change in a
    population from one generation to the next
  • genotypic frequencies are determined in a
    predictable way by allelic frequencies
  • the equilibrium is neutral -- if perturbed, it
    will reestablish within one generation of random
    mating at the new allelic frequency

34
H-W
  • $f(AA) = p^2$
  • $f(Aa) = 2pq$
  • $f(aa) = q^2$
  • $p^2 + 2pq + q^2 = (p + q)^2 = 1$
  • three alleles: $p^2 + q^2 + r^2 + 2pq + 2pr + 2qr = (p + q + r)^2 = 1$
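
The Hardy-Weinberg proportions can be checked numerically. The sketch below computes genotype frequencies from allele frequencies for two or more alleles, and shows a perturbed population returning to Hardy-Weinberg proportions after one round of random mating. The starting genotype frequencies are hypothetical.

```python
from itertools import combinations_with_replacement


def hw_genotype_freqs(p: float) -> dict:
    """Two-allele case: f(AA) = p^2, f(Aa) = 2pq, f(aa) = q^2 with q = 1 - p."""
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


def allele_freq(genotype_freqs: dict) -> float:
    """Frequency of allele A: all of AA plus half of Aa."""
    return genotype_freqs["AA"] + 0.5 * genotype_freqs["Aa"]


def hw_multiallelic(allele_freqs: dict) -> dict:
    """General case: the expansion of (p + q + r + ...)^2, keyed by allele pair."""
    out = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        fa, fb = allele_freqs[a], allele_freqs[b]
        out[(a, b)] = fa * fa if a == b else 2 * fa * fb
    return out


# A hypothetical population that is NOT in Hardy-Weinberg proportions.
perturbed = {"AA": 0.5, "Aa": 0.2, "aa": 0.3}
p = allele_freq(perturbed)                 # allele frequency of A
next_generation = hw_genotype_freqs(p)     # after one round of random mating

print(p, next_generation, allele_freq(next_generation))
print(hw_multiallelic({"A": 0.5, "B": 0.3, "C": 0.2}))
```

The allele frequency is unchanged by random mating, while the genotype frequencies move to the Hardy-Weinberg values in a single generation.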

35
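Dominant and Recessive Penetrance
Modeled penetrance: P(pt | gt), the probability of the phenotype given the genotype
  • dominant, full penetrance: DD = 1, Dd = 1, dd = 0
  • dominant, reduced penetrance: DD = 0.9, Dd = 0.9, dd = 0.0
  • recessive, full penetrance: DD = 0, Dd = 0, dd = 1
  • recessive, reduced penetrance: DD = 0, Dd = 0, dd = 0.8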
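
Given a penetrance vector P(pt | gt), the population prevalence and the genotype distribution among affected individuals follow from the law of total probability and Bayes' rule. The sketch below uses the four penetrance vectors from this slide; the allele frequencies and the assumption of Hardy-Weinberg genotype frequencies are illustrative choices, not values from the presentation.

```python
# Penetrance vectors from the slide, in the order DD, Dd, dd.
PENETRANCE = {
    "dominant, full": {"DD": 1.0, "Dd": 1.0, "dd": 0.0},
    "dominant, reduced": {"DD": 0.9, "Dd": 0.9, "dd": 0.0},
    "recessive, full": {"DD": 0.0, "Dd": 0.0, "dd": 1.0},
    "recessive, reduced": {"DD": 0.0, "Dd": 0.0, "dd": 0.8},
}


def genotype_freqs(freq_D: float) -> dict:
    """Genotype frequencies under Hardy-Weinberg (an assumption of this sketch)."""
    freq_d = 1.0 - freq_D
    return {"DD": freq_D ** 2, "Dd": 2 * freq_D * freq_d, "dd": freq_d ** 2}


def prevalence(freq_D: float, penetrance: dict) -> float:
    """P(affected) = sum over genotypes of P(gt) * P(pt | gt)."""
    freqs = genotype_freqs(freq_D)
    return sum(freqs[gt] * penetrance[gt] for gt in freqs)


def genotype_given_affected(freq_D: float, penetrance: dict) -> dict:
    """Bayes' rule: P(gt | affected) = P(gt) * P(pt | gt) / P(affected)."""
    freqs = genotype_freqs(freq_D)
    total = prevalence(freq_D, penetrance)
    return {gt: freqs[gt] * penetrance[gt] / total for gt in freqs}


# Hypothetical allele frequency: D is rare (0.01).
for model in ("dominant, full", "dominant, reduced"):
    print(model, prevalence(0.01, PENETRANCE[model]))
print(genotype_given_affected(0.01, PENETRANCE["dominant, reduced"]))
```

Penetrance values between 0 and 1 are also how phenocopies can be modeled, for example by giving the non-susceptible genotype a small non-zero value.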

36
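D-R Heterogeneous, DD Epistatic
  • D-R heterogeneous (penetrance; rows are locus B
    genotypes, columns are locus A genotypes)
         AA  Aa  aa
    BB   1   1   0
    Bb   1   1   0
    bb   1   1   1
  • reduced penetrance
  • 3, 9, 27, 81, 243, ... $3^n$ (genotype
    combinations for n loci)
  • DD epistatic (same layout)
         AA  Aa  aa
    BB   1   1   0
    Bb   1   1   0
    bb   0   0   0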
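
One way to read the two tables is as combinations of single-locus penetrance vectors: under heterogeneity either locus alone is sufficient for disease, and under epistasis both loci are required. The sketch below rebuilds both tables under that reading (an interpretation, not something the slide states) and counts multilocus genotypes to show the 3^n growth.

```python
from itertools import product

DOMINANT = [1, 1, 0]   # penetrance for genotypes XX, Xx, xx
RECESSIVE = [0, 0, 1]


def heterogeneity(pen_a, pen_b):
    """Either locus is sufficient: 1 - (1 - a)(1 - b). Rows: locus B; columns: locus A."""
    return [[1 - (1 - a) * (1 - b) for a in pen_a] for b in pen_b]


def epistasis(pen_a, pen_b):
    """Both loci are required: a * b. Rows: locus B; columns: locus A."""
    return [[a * b for a in pen_a] for b in pen_b]


def n_multilocus_genotypes(n_loci: int) -> int:
    """Each biallelic locus has 3 genotypes, so n loci give 3^n combinations."""
    return sum(1 for _ in product(range(3), repeat=n_loci))


dr_heterogeneous = heterogeneity(DOMINANT, RECESSIVE)  # A dominant, B recessive
dd_epistatic = epistasis(DOMINANT, DOMINANT)           # A dominant, B dominant

for name, table in (("D-R heterogeneous", dr_heterogeneous), ("DD epistatic", dd_epistatic)):
    print(name)
    for label, row in zip(("BB", "Bb", "bb"), table):
        print(" ", label, row)
print([n_multilocus_genotypes(n) for n in range(1, 6)])
```

Replacing the 0/1 entries with fractional penetrances gives reduced-penetrance versions of the same models, and the 3^n growth is the reason exhaustive model search becomes expensive for polygenic disorders.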

37
Dom-Rec Heterozygous
Screen genes A, B?, b
38
Uninformative Marker
39
Informative Marker
40
  • Given the following observations: family
    structure, affection status, genotypes, and
    disease allele frequencies. Assuming a model for
    the disease, can we calculate the probability
    that these observations fit the assumed model?

41
Linkage
42
Linkage Analysis
  • Goal: find a marker linked to a disease gene.
  • LOD score: log10 of the likelihood ratio
  • $L(\theta \mid \text{data}) = k \, P(\text{data} \mid \theta)$
  • $\theta$: estimate of genetic distance
    (recombination fraction) between marker and
    disease
  • proportion of recombinant gametes / total gametes
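
For the simplest case, where every meiosis can be scored directly as recombinant or non-recombinant (phase known, an assumption of this sketch), the LOD score compares the likelihood at a given recombination fraction with the likelihood under free recombination (0.5). The counts used below are hypothetical.

```python
import math


def lod(theta: float, recombinants: int, meioses: int) -> float:
    """LOD(theta) = log10[ theta^r * (1 - theta)^(n - r) / 0.5^n ] for phase-known meioses."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie in [0, meioses]")
    if theta == 0.0 and recombinants > 0:
        return float("-inf")  # any observed recombinant rules out theta = 0
    non_recombinants = meioses - recombinants
    log_linked = non_recombinants * math.log10(1.0 - theta)
    if recombinants:
        log_linked += recombinants * math.log10(theta)
    log_unlinked = meioses * math.log10(0.5)
    return log_linked - log_unlinked


def max_lod(recombinants: int, meioses: int, steps: int = 50):
    """Grid search over theta in [0, 0.5]; returns (best theta, LOD at that theta)."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    best = max(grid, key=lambda t: lod(t, recombinants, meioses))
    return best, lod(best, recombinants, meioses)


# Hypothetical data: 1 recombinant among 10 informative meioses.
theta_hat, z_max = max_lod(1, 10)
print(theta_hat, round(z_max, 3))
```

The maximizing theta is simply the observed proportion of recombinants. Real pedigrees have unknown phase, missing genotypes, and penetrance parameters, which is why dedicated programs are used in practice.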

43
Linkage Analysis
  • Linkage analysis calculates the likelihood that
    the inheritance pattern of the phenotype
    (disease) is supported by the observed
    inheritance patterns (genotypes) in a pedigree.
  • few monogenic models, easy to test
  • more difficult to find models explaining
    inheritance in polygenic models
  • parameter maximization

44
Linkage Analysis Programs
  • FASTLINK - 2 point
  • $O(n^2)$, where n = number of markers
  • GeneHunter - multipoint, 2 point
  • $O(n^2)$, where n = number of people

45
Allele Sharing
  • tries to show that affected family members
    inherit the same chromosomal regions more often
    than expected by chance
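
For sib pairs, "expected by chance" has a simple form: at any locus, two siblings share 0, 1, or 2 alleles identical by descent (IBD) with probabilities 1/4, 1/2, and 1/4. The sketch below compares observed IBD counts in affected sib pairs with that expectation using a chi-square goodness-of-fit statistic. The counts are hypothetical, and this is only one of several allele-sharing statistics.

```python
NULL_IBD_PROBS = (0.25, 0.50, 0.25)   # P(share 0, 1, 2 alleles IBD) for siblings
CHI2_CRITICAL_DF2 = 5.991             # 5% critical value, 2 degrees of freedom


def ibd_chi_square(counts) -> float:
    """Goodness-of-fit statistic for observed (IBD0, IBD1, IBD2) sib-pair counts."""
    total = sum(counts)
    expected = [total * prob for prob in NULL_IBD_PROBS]
    return sum((obs - exp) ** 2 / exp for obs, exp in zip(counts, expected))


def mean_sharing(counts) -> float:
    """Average proportion of alleles shared IBD; 0.5 is expected by chance."""
    total = sum(counts)
    return (counts[1] + 2 * counts[2]) / (2 * total)


# Hypothetical data: 100 affected sib pairs typed at one marker.
observed = (10, 45, 45)
statistic = ibd_chi_square(observed)
print(statistic, mean_sharing(observed), statistic > CHI2_CRITICAL_DF2)
```

Because the method only asks whether affected relatives share more than expected, it needs no disease model, which is its main appeal for complex traits.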

46
Allele Sharing Example
Needs at least sibs.
47
Association Studies
  • Allelic association studies provide the most
    powerful method for locating genes of small
    effect contributing to complex diseases and
    traits. Daniels, Am J Hum Genet 621189-1197,
    1998.
  • Linkage analysis
  • genome wide screen, 400 markers 10 cM (10 MB),
    association needs 4000 polymorphic markers
  • generally need nuclear family or larger
  • Association finds linkage disequilibruim

48
Association Studies
  • Association is simply a statistical statement
    about the co-occurrence of alleles or phenotypes.
    Allele A is associated with disease D if people
    who have D also have A more (or maybe less) often
    than would be predicted from the individual
    frequencies of D and A in the population. Pg.
    286 Human Molecular Genetics 2, Tom Strachan

49
Examples
  • HLA-DR4 (antigen marker)
  • 36% in UK
  • 78% with rheumatoid arthritis
  • CF (RFLP markers XV2.c (X1, X2), KM19 (K1, K2))
    Marker alleles   CF (case)   Normal (control)
    X1, K1           3           49
    X1, K2           147         19
    X2, K1           8           70
    X2, K2           8           25
  • CF associated with X1, K2 in 89% (Strachan)
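
The CF figure can be reproduced from the haplotype counts on this slide. The sketch below collapses the table into a 2x2 comparison (X1,K2 versus all other haplotypes, CF versus normal chromosomes) and computes the proportion, an odds ratio, and a Pearson chi-square statistic. The choice of statistics is an illustration rather than the analysis used in the cited source.

```python
# Haplotype counts from the slide: (XV2.c allele, KM19 allele) -> count.
CF_COUNTS = {("X1", "K1"): 3, ("X1", "K2"): 147, ("X2", "K1"): 8, ("X2", "K2"): 8}
CONTROL_COUNTS = {("X1", "K1"): 49, ("X1", "K2"): 19, ("X2", "K1"): 70, ("X2", "K2"): 25}

CHI2_CRITICAL_DF1 = 3.841  # 5% critical value, 1 degree of freedom


def two_by_two(haplotype):
    """Return (a, b, c, d): cases with / without the haplotype, controls with / without."""
    a = CF_COUNTS[haplotype]
    b = sum(CF_COUNTS.values()) - a
    c = CONTROL_COUNTS[haplotype]
    d = sum(CONTROL_COUNTS.values()) - c
    return a, b, c, d


def odds_ratio(a, b, c, d) -> float:
    return (a * d) / (b * c)


def chi_square(a, b, c, d) -> float:
    """Pearson chi-square for a 2x2 table, without continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


a, b, c, d = two_by_two(("X1", "K2"))
share_in_cf = 100 * a / (a + b)
share_in_controls = 100 * c / (c + d)
print(round(share_in_cf), round(share_in_controls))
print(odds_ratio(a, b, c, d), chi_square(a, b, c, d))
```

The X1,K2 haplotype is carried by about 89% of CF chromosomes but only about 12% of normal chromosomes, which is the association the slide reports.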

50
Linkage Disequilibrium
  • linkage equilibrium (the two-locus analogue of
    Hardy-Weinberg) is true if
  • $P(gt_1, gt_2) = P(gt_1)\,P(gt_2)$,
    where $P(gt_1, gt_2)$ is P(haplotype)
  • case vs controls
  • TDT (heterozygous marker transmitted), HRR
    (untransmitted alleles as control)
  • allelic associations (outbred populations)
    maintained at only < 1 cM
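
The equilibrium condition can be turned into a number: the disequilibrium coefficient D is the haplotype frequency minus the product of the allele frequencies, and r^2 rescales it to lie between 0 and 1. The sketch below applies both to the XV2.c / KM19 haplotype counts from the Examples slide; D and r^2 are standard measures, but using them here is an illustration rather than something the slides do.

```python
# Haplotype counts from the Examples slide: (XV2.c allele, KM19 allele) -> count.
CONTROL_HAPLOTYPES = {("X1", "K1"): 49, ("X1", "K2"): 19, ("X2", "K1"): 70, ("X2", "K2"): 25}
CF_HAPLOTYPES = {("X1", "K1"): 3, ("X1", "K2"): 147, ("X2", "K1"): 8, ("X2", "K2"): 8}


def ld_stats(counts):
    """Return (D, r^2) for two biallelic markers, taking X1 and K1 as reference alleles."""
    n = sum(counts.values())
    p_x1 = (counts[("X1", "K1")] + counts[("X1", "K2")]) / n
    p_k1 = (counts[("X1", "K1")] + counts[("X2", "K1")]) / n
    p_x1k1 = counts[("X1", "K1")] / n
    # D = 0 exactly when P(haplotype) equals the product of the allele frequencies.
    d = p_x1k1 - p_x1 * p_k1
    r_squared = d * d / (p_x1 * (1 - p_x1) * p_k1 * (1 - p_k1))
    return d, r_squared


d_controls, r2_controls = ld_stats(CONTROL_HAPLOTYPES)
d_cf, r2_cf = ld_stats(CF_HAPLOTYPES)
print("controls:", d_controls, r2_controls)
print("CF:", d_cf, r2_cf)
```

On normal chromosomes D is close to zero, so the two markers look nearly independent; the strong signal in the Examples table comes from comparing haplotype frequencies between CF and normal chromosomes.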

51
Equilibrium
52
SNPs
  • Single-Nucleotide Polymorphisms
  • 1 every 1000 bp (estimated)
  • 2,972,052 SNPs submitted to dbSNP
  • dbSNP summary link
  • 50% of all SNPs are in question
  • 10% of UTRs have SNPs
  • 100,000 - 500,000 SNPs needed
  • Why don't we do this?

53
Homozygosity Mapping
54
Positional Cloning
55
Disease Gene Identification
  • SSCP -- single strand conformational polymorphism
  • PCR -- polymerase chain reaction
  • primers amplify template sequence
  • direct sequencing
  • BBS2 (Bardet-Biedl Syndrome)

56
BBS2 genetic mapping
C16
1 2 3 4 5 6 7 8 9 10 11 12
57
BBS2 genetic mapping
unaffected
affected
C16
1 2 3 4 5 6 7 8 9 10 11 12
58
BBS4 Gene (Direct Sequencing)
(Hs.26471)
59
BBS4 Deletion (by PCR)
exons 3 and 4
60
BBS4 Mutations (direct sequencing)
(R295P)
61
Summary
  • Disease Gene Identification
  • challenges
  • interval localization
  • genotyping and genetic markers, linkage analysis,
    allele sharing, association studies (SNiPs),
    homozygosity mapping
  • disease gene identification techniques
  • Take home
  • A complex disorder (with interacting genes) has
    yet to be characterized

62
Demo -- installing a database
  • A database organizes data
  • Most common
  • relational database (Oracle, Sybase)
  • perceived as a collection of tables,
  • where a table is an unordered collection of rows
  • each row has a fixed number of fields, and each
    field can store a predefined type of data value
    (date, integer, string, etc.)
  • simplest
  • flat file
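
As a minimal illustration of the relational model described above, the sketch below uses Python's built-in sqlite3 module rather than Oracle or Sybase. The table name, column names, marker names, and rows are all hypothetical.

```python
import sqlite3

# An in-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Each row has a fixed number of fields, and each field has a declared type.
conn.execute(
    """
    CREATE TABLE genotypes (
        individual_id INTEGER,
        marker        TEXT,
        allele1       INTEGER,
        allele2       INTEGER,
        typed_on      TEXT
    )
    """
)

rows = [
    (1, "M1", 120, 124, "2001-03-01"),
    (2, "M1", 120, 120, "2001-03-01"),
    (3, "M1", 124, 126, "2001-03-02"),
    (1, "M2", 88, 88, "2001-03-02"),
]
conn.executemany("INSERT INTO genotypes VALUES (?, ?, ?, ?, ?)", rows)

# Rows are unordered, so any required order has to be requested explicitly.
heterozygotes = conn.execute(
    "SELECT individual_id FROM genotypes "
    "WHERE marker = ? AND allele1 <> allele2 ORDER BY individual_id",
    ("M1",),
).fetchall()
n_rows = conn.execute("SELECT COUNT(*) FROM genotypes").fetchone()[0]

print(heterozygotes, n_rows)
conn.close()
```

A flat file could hold the same four rows, but a query such as "heterozygotes at marker M1" would then have to be written by hand as parsing code.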

63
Databases
  • NCBI
  • BLAST
  • Amazon
  • Yahoo
  • Several of our own
  • genotypes
  • rat ESTs
  • eye clones from differential display
  • micro-array data

64
This space intentionally left blank