Genetics in Epidemiology - PowerPoint PPT Presentation

Title:

Genetics in Epidemiology

Description:

Genetics in Epidemiology. Nazarbayev University, July 2012. Jan Dorman, PhD, University of Pittsburgh, Pittsburgh, PA, USA. jsd_at_pitt.edu

Slides: 58
Provided by: JanDo6
Learn more at: http://www.pitt.edu
Transcript and Presenter's Notes

Title: Genetics in Epidemiology


1
Genetics in Epidemiology
  • Nazarbayev University
  • July 2012
  • Jan Dorman, PhD
  • University of Pittsburgh
  • Pittsburgh, PA, USA
  • jsd_at_pitt.edu

2
Genetics in Epidemiology
  • Is important because
      • It focuses on heritable, non-modifiable determinants of disease
      • It allows examination of gene-gene and gene-environment interactions
      • It can contribute to personalized medicine
  • Is being transformed because
      • The Human Genome Project is complete
      • Genetic variation can now be examined across the entire genome at a very low cost
      • The contribution of genome-wide association studies (GWAS) to identifying disease-susceptibility genes has been enormous

3
Human Genome Project
  • February 2010 marked the 10th anniversary of the
    completion of the Human Genome Project
  • Initial sequence was finished early because of
    advancements in genome sequencing technology
  • Resulted in drastically reduced labor and delivery
    costs

4
Human Genome Sequencing Costs
  • 2000: Human Genome Project, $3 billion
  • 2007: James Watson, $2 million
  • 2009: Illumina, Helicos, $50,000
  • 2010: Illumina HiSeq, $10,000
  • 2014: Multiple companies, $1,000
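
To make the scale of the drop concrete, the figures on this slide can be turned into fold reductions. This is a minimal sketch that assumes the amounts are US dollars per genome; the names COSTS_USD and fold_reduction are illustrative and not from the presentation.

```python
# Cost per genome as listed on the slide (assumed to be US dollars).
COSTS_USD = [
    (2000, "Human Genome Project", 3_000_000_000),
    (2007, "James Watson", 2_000_000),
    (2009, "Illumina, Helicos", 50_000),
    (2010, "Illumina HiSeq", 10_000),
    (2014, "Multiple companies", 1_000),
]


def fold_reduction(earlier_cost, later_cost):
    """Return how many times cheaper the later cost is than the earlier one."""
    return earlier_cost / later_cost


baseline = COSTS_USD[0][2]
for year, source, cost in COSTS_USD:
    # Compare every entry against the 2000 baseline.
    print(f"{year} {source}: ${cost:,} ({fold_reduction(baseline, cost):,.0f}-fold cheaper than 2000)")
```

On these figures, the last entry is 3,000,000 times cheaper than the first.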

5
(No Transcript)
6
Genetics in Epidemiology
  • Is there evidence of familial aggregation of the disorder (phenotype)?
      • Is a positive family history an independent risk factor for the disorder?
      • For many chronic disorders, a positive family history is associated with odds ratios between 2 and 6
  • Is there evidence of heritability?
      • A heritability of 50% indicates that half of the variation in disease risk in a population is due to genetics
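
As an illustration of the family-history odds ratio mentioned above, here is a minimal sketch for a 2x2 case-control table. The counts are hypothetical, and the function name odds_ratio and the Woolf (log-based) 95% confidence interval are choices made for this example rather than anything specified in the presentation.

```python
import math


def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds ratio with an approximate 95% CI (Woolf log method) for a 2x2 table.

    "Exposed" here means a positive family history. All four cells must be > 0.
    """
    a, b, c, d = cases_exposed, cases_unexposed, controls_exposed, controls_unexposed
    estimate = (a * d) / (b * c)
    # Standard error of log(OR)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - 1.96 * se_log)
    upper = math.exp(math.log(estimate) + 1.96 * se_log)
    return estimate, lower, upper


# Hypothetical counts: 40 of 100 cases and 20 of 100 controls report a family history.
estimate, lower, upper = odds_ratio(40, 60, 20, 80)
print(f"OR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With these made-up counts the odds ratio is about 2.7, inside the 2 to 6 range quoted for many chronic disorders.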

7
J Intern Med 2008;263:16
8
Candidate Gene Approach
  • Are there potential candidate genes?
      • Genes that are selected based on known biological, physiological, or functional relevance to the phenotype under investigation
      • Approach is limited by its reliance on existing knowledge about the biology of disease
      • Associations may be population-specific
  • E.g., type 2 diabetes
      • Genes encoding molecules known to primarily influence pancreatic β-cell function or insulin action
      • ABCC8 (sulphonylurea receptor), INS, INSR, etc.

PLOS Bio 2003;1:41
9
Alternative Approach
  • Genome-wide association studies (GWAS)
      • Hypothesis: common genetic variants (>5%) underlie common diseases (traits)
      • Limited number of variants, each with a small effect
      • No a priori hypotheses
      • Power to identify rare variants (1-5%) is limited
  • First publication was in 2005
      • Complement factor H and age-related macular degeneration
  • Require
      • Large, well-characterized populations
      • Genotyping across the entire genome
      • Sophisticated data analysis: collaborate on this!

10
Monogenic vs. Common Disorders
11
GWAS
  • Multi-tiered approach
      • 1st tier: genotyping identifies the discovery set
      • 2nd tier: discovery set is genotyped in another population
      • Replication is a requirement for publication
      • 3rd tier: rule out false positives and false negatives
  • Requires consortia
  • Possible because
      • High-density genotyping platforms
      • By 2007, chips contained 500,000-1,000,000 markers
      • DNA samples were available from well-characterized epidemiological cohorts
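
Why replication and false-positive control matter at this marker density can be shown with a little arithmetic. The sketch below assumes a Bonferroni correction and a nominal alpha of 0.05; neither is stated on the slide, and both function names are illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def expected_false_positives(alpha, n_tests):
    """Expected number of null markers passing an uncorrected threshold of alpha."""
    return alpha * n_tests


n_markers = 1_000_000  # upper end of the chip density quoted above
print(bonferroni_threshold(0.05, n_markers))       # per-marker threshold after correction
print(expected_false_positives(0.05, n_markers))   # chance hits with no correction
```

Testing one million markers at an uncorrected 0.05 would be expected to yield about 50,000 chance associations, which is why a threshold near 5e-8 and an independent replication population are commonly used.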

12
GWAS Example
NEJM 2010;362:166
13
GWAS Example
NEJM 2010;362:166
14
GWAS Example
NEJM 2010;362:166
15
GWAS
  • Have identified novel gene-disease (trait)
    associations
  • Most alleles are common (>5%)
  • Most have small effect sizes (OR 1.5)
  • Are providing insights into pathways of complex
    diseases

16
Genome-Wide Associations Published for 249 Traits
NHGRI GWA Catalog: www.genome.gov/GWAStudies
17
Genetics Review
18
Anatomy of the Cell
19
Chromosomes, Genes & DNA
  • Somatic cells are diploid: 46 chromosomes
      • 22 pairs of autosomes + 1 pair of sex chromosomes
  • Each pair of autosomes is homologous
      • Contains the same genes in the same order
      • 1 is maternal, the other is paternal
  • Chromosomes are composed of deoxyribonucleic acid (DNA)
      • Genome contains 3 billion base pairs (haploid)
      • About 1% encodes proteins
  • Genes are located on chromosomes

20
Human Karyogram
21
Figure of a Chromosome
22
DNA Double Helix
23
Base Pairs of a Double Helix
(Figure: base pairing in the double helix. A pairs with T; C pairs with G.)
24
Structure of a Gene
A gene is a functional unit that includes introns and exons, enhancer and
promoter sequences, and untranslated sequences at the 5' and 3' ends
25
Transcription Results in mRNA
26
Primary Transcript
27
mRNA Processing
28
From Genes to Proteins via mRNA
  • Proteins consist of 1 or more polypeptide chains
  • Polypeptide chains are made of amino acids
      • There are 20 amino acids
      • Their order is determined by the mRNA sequence, read in triplets
  • Genetic code
      • 64 combinations of 3 bases, called codons
      • 3 are stop codons (UAA, UGA, UAG)
      • Genetic code is degenerate
      • Genetic code is universal
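
The numbers on this slide (64 codons, 3 stop codons, 20 amino acids, a degenerate code) can be checked with a short sketch. It builds the standard genetic code from its usual compact one-letter listing; the names CODON_TABLE and translate are illustrative, and "*" is used here to mark a stop codon.

```python
from itertools import product

# Standard genetic code, codons ordered with U, C, A, G at each position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate an mRNA string codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
amino_acids = set(CODON_TABLE.values()) - {"*"}
print(len(CODON_TABLE), stop_codons, len(amino_acids))
print(translate("AUGGCCUAA"))
```

Because 61 sense codons map onto only 20 amino acids, several codons must encode the same amino acid, which is what "degenerate" means here.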

29
Genetic Code
30
mRNA Determines AA Sequence
31
Translation is Protein Synthesis
32
Post-Translation Modifications
33
Advancements in Biotechnology
34
Original Method for DNA Sequencing
35
Polymerase Chain Reaction (PCR)
  • Revolutionized molecular genetics
  • Exploits the in vivo processes of DNA replication
    to copy short DNA fragments in vitro within a few
    hours
  • Exponential increase of target DNA sequences
  • Highly sensitive: needs only a small amount of
    template DNA
  • Acts as a "DNA photocopier"
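
The "exponential increase" can be sketched as a doubling per cycle. This is an idealized model that assumes every template is copied in every cycle; the efficiency parameter and the function name pcr_copies are illustrative additions, since real reactions fall short of perfect doubling.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized number of target copies after a given number of PCR cycles.

    efficiency=1.0 means perfect doubling each cycle; lower values model
    reactions in which only a fraction of templates is copied per cycle.
    """
    return initial_copies * (1 + efficiency) ** cycles


for n in (10, 20, 30):
    print(n, f"{pcr_copies(1, n):,.0f}")
```

Under perfect doubling, a single template molecule gives about a billion copies after 30 cycles, which is why only a small amount of starting DNA is needed.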

36
PCR - Cycle 1
5' A C G T T A C C G T G A A C G T C T T A 3'
Denaturation, 30 seconds: H bonds dissolve at 95°C
3' T G C A A T G G C A C T T G C A G A A T 5'
37
PCR - Cycle 1
5' A C G T T A C C G T G A A C G T C T T A 3'
                            3' C A G A A T 5'
Anneal primers, 30 seconds at 35-65°C. Temperature is
determined by primer sequence / length
5' A C G T T A 3'
3' T G C A A T G G C A C T T G C A G A A T 5'
38
PCR - Cycle 1
5' A C G T T A C C G T G A A C G T C T T A 3'
3' T G C A A T G G C A C T T G C A G A A T 5'
Extension of primers, 30 seconds at 70-75°C. Taq
polymerase is thermostable
5' A C G T T A C C G T G A A C G T C T T A 3'
3' T G C A A T G G C A C T T G C A G A A T 5'
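
The strand and primer sequences in the three Cycle 1 slides follow from base-pairing rules, which the sketch below reproduces. The 20-base template is taken from the slides; the 6-base primer length matches the primers drawn there, and the function names are illustrative.

```python
# Watson-Crick pairing: A-T and C-G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Complementary strand, read in the opposite (3' to 5') direction."""
    return seq.translate(COMPLEMENT)


def reverse_complement(seq):
    """Complementary strand, rewritten in the conventional 5' to 3' direction."""
    return complement(seq)[::-1]


TEMPLATE = "ACGTTACCGTGAACGTCTTA"  # top strand from the slides, 5' to 3'

bottom_strand_3_to_5 = complement(TEMPLATE)
forward_primer = TEMPLATE[:6]                        # anneals to the bottom strand
reverse_primer = reverse_complement(TEMPLATE[-6:])   # anneals to the top strand, 5' to 3'

print("bottom strand (3'->5'):", bottom_strand_3_to_5)
print("forward primer (5'->3'):", forward_primer)
print("reverse primer (5'->3'):", reverse_primer)
print("reverse primer (3'->5'):", reverse_primer[::-1])
```

Read 3' to 5', the reverse primer is CAGAAT, which is how it is drawn beneath the top strand in the annealing slide.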
39
Post-Genome Era
40
Human Genetic Variation
  • Single nucleotide polymorphisms (SNPs)
  • Tandem repeat sequences
      • Microsatellites (<8 bp)
      • Minisatellites (VNTRs; 8-100 bp)
  • Copy number variants (CNVs; 1 Kb-1 Mb)
  • Insertions and deletions (indels; 100 bp-1 Kb)
  • Note: size limits are arbitrary, with no biological basis; definitions are not consistent across studies

41
SNPs
  • 10 million SNPs in the human genome, and counting
  • Most common type of genetic variation
      • 2 alleles, e.g., A → T
  • Occur across the entire genome, in stable regions
  • Many SNPs are in linkage disequilibrium
      • SNPs close together are more likely to travel together in a block than SNPs far apart
      • Can use 1 tagging SNP per block: cost-effective
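
Linkage disequilibrium between two SNPs is usually summarized with D, D' and r². The presentation does not define these statistics, so the sketch below uses their standard textbook forms, with hypothetical haplotype and allele frequencies and an illustrative function name.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a, p_b: frequencies of allele A and allele B (each strictly between 0 and 1)
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r_squared


# Hypothetical frequencies: alleles always inherited together vs. independently.
print(ld_stats(0.5, 0.5, 0.5))    # complete LD
print(ld_stats(0.25, 0.5, 0.5))   # no LD
```

When r² between two SNPs is close to 1, genotyping one of them tells you almost everything about the other, which is the idea behind using one tagging SNP per block.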

42
Linkage Disequilibrium
Haplotype Block
NEJM 2007;356:1094
43
SNPs Tag Haplotype Blocks
NEJM 2007;356:1094
44
International HapMap
  • Emerged as next logical step after sequencing
    human genome
  • Goal was to create a public genome-wide database
    of common genetic variants
  • Genotyped SNPs in 270 samples from four populations: Nigeria, Utah, Han Chinese, Japanese
  • Phase I
      • Typed 1 million common SNPs (>5%) to characterize LD patterns
  • Phase II
      • Typed 3 million rare SNPs (1-5%)

45
DNA Microarray
Used to genotype 500,000-1 million SNPs
46
International HapMap
  • Where are the SNPs?
  • 12% occur in protein-coding regions
  • 8% occur in gene regulatory regions
  • 40% occur in non-coding introns
  • 40% occur in intergenic sequences
  • Regions of high linkage disequilibrium are
    similar across populations
  • HapMap was instrumental in facilitating GWAS

47
Tandem Repeat Sequences
  • 100,000 TRSs in the human genome
  • Minisatellites (VNTRs)
      • Repeat units of 8-100 bp
  • Microsatellites
      • Repeat units of 2-8 bp
      • E.g., CAGCAGCAGCAGCAGCGACAG
  • More than 200 disease genes identified
      • E.g., Huntington's disease, Fragile X syndrome
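
Counting how many times a repeat unit occurs back to back is the basic measurement for a tandem repeat. The sketch below does this with a regular expression on the example sequence from this slide; the function name repeat_runs is illustrative.

```python
import re


def repeat_runs(sequence, motif):
    """Return the number of consecutive motif copies in each uninterrupted run."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    return [len(match.group(0)) // len(motif) for match in pattern.finditer(sequence)]


example = "CAGCAGCAGCAGCAGCGACAG"  # sequence as written on the slide
runs = repeat_runs(example, "CAG")
print(runs, "longest run:", max(runs))
```

In the sequence as written, the CGA triplet interrupts the repeat, so the longest uninterrupted run is five CAG units followed by a separate single unit.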

48
Copy Number Variants
  • Size is 1 Kb to 1 Mb
  • Duplications or deletions
  • Less is known about CNVs
      • Term was introduced in 2004
  • Are ubiquitous; reflect 12% of the human genome
  • May span multiple genes
  • May change gene dosage or affect transcription and translation
  • A CNV map is being created alongside HapMap
  • Associated with autism, schizophrenia, lupus, Crohn's disease, rheumatoid arthritis

49
Copy Number Variants
50
Indels
  • Insertions and deletions
  • Size: 100 bp to 1 Kb
  • Millions in the genome
  • Term was introduced in 2006
  • Phenotype may depend on gene dosage
  • May occur within genes or in promoters
  • An indel map is also being created

51
Consequences of Genetic Variation
  • No change
      • In a non-coding region
      • In a coding region (the genetic code is degenerate)
  • Change in 1 amino acid of a protein
  • Change in multiple amino acids of a protein
  • A truncated protein
  • Change in gene expression
      • In a regulatory region or splice site
  • Next-generation GWAS will be based on markers other than SNPs
      • Tandem repeats, CNVs, indels
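
The first few outcomes listed on this slide (no change, a single amino acid change, a truncated protein) can be illustrated for a one-base substitution inside a codon. This sketch uses the standard genetic code; the labels synonymous, missense, nonsense and stop-loss are conventional terms not used on the slide, and the function name is illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with U, C, A, G at each position; "*" = stop.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon, alt_codon):
    """Classify the protein-level effect of replacing one codon with another."""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"   # no change: the code is degenerate
    if alt_aa == "*":
        return "nonsense"     # premature stop, truncated protein
    if ref_aa == "*":
        return "stop-loss"    # stop codon lost, protein extended
    return "missense"         # change in 1 amino acid


for ref, alt in [("GAA", "GAG"), ("GAG", "GUG"), ("UAC", "UAA")]:
    print(ref, "->", alt, classify_substitution(ref, alt))
```

The three pairs shown are a silent Glu-to-Glu change, a Glu-to-Val change, and a Tyr-to-stop change.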

52
Genetic Variation Databases
Database | Content | Address
dbSNP | SNPs covering the human genome | http://www.ncbi.nlm.nih.gov/projects/SNPs
HapMap | Catalog of variants from the HapMap Project | http://hapmap.org
1000 Genomes Project | Extension of HapMap; aims to catalog 95% of variants with ≥1% frequency | www.1000genomes.org
UCSC Genome Bioinformatics | Reference human genome sequence with annotation | http://genome.ucsc.edu
Ensembl | Genome browser, annotation, comparative genomics | http://www.ensembl.org/index.html
53
Genetic Variation Databases
Database | Content | Address
GeneCards | Database of human genes linked to relevant databases | http://www.genecards.org
PharmGKB | SNPs involved in drug metabolism | http://www.pharmgkb.org
DGV | Database of Genomic Variants, including CNVs | http://projects.tcag.ca/variation
SCAN | SNP and CNV annotation based on gene function and expression | http://www.scandb.org/newinterface
OMIM | Online Mendelian Inheritance in Man; over 12,000 genes | http://www.ncbi.nlm.nih.gov/sites/entrez?db=omim
54
Genetic Variation Databases
Database | Content | Address
HuGE Navigator | Human genome epidemiology knowledge base | http://hugenavigator.net/HuGENavigator/home.do
Best Pract Res Clin Endo Metab 2012;26:119
55
Collecting DNA
  • Sources of DNA
  • Blood samples
  • Buccal brushes
  • Saliva samples
  • Dried blood spots
  • Depends on
  • Conditions at time of collection
  • Resources available to process samples
  • What other biological samples will be collected
  • Long- and short-term storage
  • Quality control

56
Saliva vs. Blood Samples
  • Considerations
      • Lower cost
      • More convenient and acceptable to patients
      • Increases compliance
      • Lower mean yield of DNA
      • But quality is comparable
      • No difference in success of high-throughput genotyping

57
Other Considerations
  • Informed consent
  • What analysis can be performed now?
  • What analysis can be performed in the future?
  • Who has control of the specimen?
  • Do you need to re-consent the participants due
    to IRB changes?
  • Will you inform participants of results?