Pharmacogenomics presentation

Transcript and Presenter's Notes

Title: Pharmacogenomics

1
Pharmacogenomics
Pharmacogenetics is an old discipline. One may
distinguish pharmacogenetics (the study of a
single gene) from pharmacogenomics (the study of
many genes or entire genomes), or use
pharmacogenomics for approaches that go beyond
DNA to include mRNA and proteins. Today, it is
possible to assess entire pathways that might be
relevant to disease or to drug response at the
DNA, mRNA and protein levels. Eventually, the
entire genome, transcriptome and proteome will be
available. Therefore, pharmacogenetics/-genomics
and disease genetics/genomics are undergoing
similar transitions, with a shift in focus from
Mendelian examples (one gene → one disease) to
more complex modes of genetic causation.
2
Where do drugs interact with proteins?
This figure shows the paths that are taken by the
anti-epileptic drug phenytoin and the
angiotensin-converting enzyme (ACE) inhibitor
imidapril in the human body. Phenytoin is
absorbed into the bloodstream in the gut and
circulates through the liver to the brain. It
crosses the blood-brain barrier, where it binds
and inhibits its target, neuronal sodium
channels. It is pumped back out across the
blood-brain barrier into the bloodstream by
multidrug resistance protein 1 (MDR1, also
known as ABCB1) efflux pumps. In the liver,
phenytoin is metabolized by the cytochrome P450
enzymes CYP2C9 and CYP2C19, and it is eliminated
through the kidneys. Imidapril is a prodrug.
After its absorption from the gut into the
bloodstream it is hydrolysed in the liver to
the active metabolite imidaprilat. Imidaprilat
binds and inhibits ACE in the plasma. Imidaprilat
is also eliminated through the kidneys.

Goldstein et al. Nature Rev. Gen. 4, 937 (2003)
3
These associations were compiled from the
literature by using the keywords "pharmacogenetics"
OR "pharmacogenomics", "association study"
AND "drug response", "polymorphism" AND "drug
response". Therefore, the list omits many
polymorphisms and probably includes some false
positives. Most of the polymorphisms are either
in the drug target or in a protein that is in
the pathway in which the target acts.
Goldstein et al. Nature Rev. Gen. 4, 937 (2003)
4
The SNP Consortium
Goldstein et al. Nature Rev. Gen. 4, 937 (2003)
5
Haplotypes
The diagram shows 5 haplotypes. 12 SNPs are
localized in order along the chromosome. The
letters on the top indicate groups of SNPs that
have perfect pairwise linkage disequilibrium (LD)
with one another, and the numbers on the bottom
indicate each of the 12 SNPs. SNP 9 is the causal
variant, which in this simple example determines
drug response: allele C results in a therapeutic
response, whereas allele G results in an adverse
reaction. In this example, the selection of just
one SNP from each of the groups A-E would be
sufficient to fully represent all of the
haplotype diversity. Each haplotype can be
identified by just five tagging SNPs (tSNPs), and
the causal variant would be tagged even if it
were not itself typed. So, tSNP profiles that are
highlighted predict an adverse reaction to the
medicine. Normally, LD patterns are not so
clear-cut and statistical methods are required to
select appropriate sets of tSNPs.

Goldstein et al. Nature Rev. Gen. 4, 937 (2003)
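
The tagging idea on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the method of Goldstein et al.: the five toy haplotypes are invented (they are not the ones in the figure), alleles are coded 0/1, and the helper names `ld_groups` and `pick_tag_snps` are hypothetical. SNPs whose alleles split the haplotypes in exactly the same way are treated as being in perfect pairwise LD, and one SNP per group is kept as the tag.

```python
from collections import defaultdict

# Invented toy data: 12 SNPs assigned to LD groups A-E, and the 0/1 allele
# carried by each of 5 haplotypes for every SNP in a group.
GROUP_OF_SNP = "AAABBCCDDDEE"          # SNP 9 (index 8) sits in group D
PATTERNS = {"A": "00011", "B": "01001", "C": "00100", "D": "01110", "E": "00001"}
haplotypes = ["".join(PATTERNS[g][h] for g in GROUP_OF_SNP) for h in range(5)]


def ld_groups(haps):
    """Group SNP indices whose alleles partition the haplotypes identically."""
    groups = defaultdict(list)
    for pos in range(len(haps[0])):
        column = [hap[pos] for hap in haps]
        # Relabel relative to the first haplotype so allele naming does not matter.
        pattern = tuple(allele == column[0] for allele in column)
        groups[pattern].append(pos)
    return list(groups.values())


def pick_tag_snps(haps):
    """Keep the first SNP of every perfect-LD group as its tag SNP."""
    return [group[0] for group in ld_groups(haps)]


tags = pick_tag_snps(haplotypes)
# Each haplotype should still be distinguishable from its tag-SNP profile alone.
profiles = {tuple(hap[i] for i in tags) for hap in haplotypes}
print(tags, len(profiles))
```

With real data LD is rarely perfect, so exact pattern matching like this would have to be replaced by a statistical criterion, as the slide notes.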
6
Haplotypes
(b) The diagram depicts the same 12 SNPs, but with
different associations among them, as might
happen in a different population group. Because
patterns of LD are different, some patients would
be misclassified if the same five tSNPs were used
and interpreted in the same way. Using the same
SNP profiles as defined in population A,
haplotype profiles 1, 2 and 3 are predicted to
have allele C at the causal SNP 9 (a therapeutic
response), whereas haplotype profiles 4 and 5 are
predicted to have an adverse response. However,
because the pattern of association has changed,
the new haplotypes 6 and 7 in population B are
misclassified.

Goldstein et al. Nature Rev. Gen. 4, 937 (2003)
7
Discovering genotypes underlying phenotypes:
from mendelian diseases to complex diseases

Traditional view: over the past decade, about
1200 genes causing human diseases or traits have
been identified, largely by positional
cloning. Identification of the gene → knowledge
of relevant protein(s) → often leads to
understanding of the molecular and physiological
basis of the disease phenotype.
Successful examples in positional cloning:
- identification of genes underlying chronic
granulomatous disease, X-linked muscular
dystrophies, cystic fibrosis, Fanconi anemia,
ataxia telangiectasia, neurofibromatosis I,
Huntington disease
- identification of genes underlying hereditary
predispositions to cancer, including
retinoblastoma, breast cancer, polyposis
colorectal cancer
Botstein & Risch, Nature Gen. 33, 228 (2003)
8
Linkage mapping
Positional cloning begins with linkage
analysis. Families in which the disease
phenotype segregates are analyzed using a group
of DNA polymorphisms. This is the ideal method
for diseases with a very clear diagnosis. The
limit of resolution remains the number of meioses
in which crossovers might have occurred. In
favorable cases (such as cystic fibrosis), the
patterns of crossovers in the region of the gene
among the cohorts studied leave only a few
predicted genes, all within about 1 cM (about
1 Mb), as likely candidates. In less favorable
cases, there may be as many as a few hundred
predicted genes that might be the relevant
disease genes.

Botstein & Risch, Nature Gen. 33, 228 (2003)
9
Linkage disequilibrium
Greater power in fine-mapping is obtained by
haplotype analysis, in which all markers are
considered simultaneously as haplotypes rather
than individually. Haplotype analysis allows the
inference of likely historical crossover points,
which localize the disease mutation. New
algorithms based on haplotype analysis are being
developed to estimate statistically the likely
locations of such crossovers and thus the likely
location of the disease mutation. The success of
linkage disequilibrium (LD) mapping depends
heavily on the degree of genetic heterogeneity
underlying a disease sample. Unless one or a few
mutations account for most instances of disease,
the signal will be too inconsistent to find
mutations. Some degree of heterogeneity is
tolerable and can be overcome by clustering of
disease chromosomes.

Botstein & Risch, Nature Gen. 33, 228 (2003)
10
Lessons from cloned mendelian genes
HGMD lists 27,000 mutations in 1222 genes
associated with human diseases and
traits. In-frame amino acid substitutions are
the most frequent. Less than 1% are found in
regulatory regions.

These data provide overwhelming support for the
notion that mendelian clinical phenotypes are
associated primarily with alterations in the
normal coding sequence of proteins.
Botstein & Risch, Nature Gen. 33, 228 (2003)
11
Criteria for amino acid replacements
Distinguish (1) biochemical severity of missense
changes, and (2) location and/or context of the
altered amino acid in the protein sequence. A
useful guide is the Grantham scale: categorize
codon replacements into classes of increasing
chemical dissimilarity between the encoded amino
acids:
- conservative
- moderately conservative
- moderately radical
- radical
- stop or nonsense.
There is a clear relationship between the
severity of amino acid replacement and the
likelihood of clinical observation.

Botstein & Risch, Nature Gen. 33, 228 (2003)
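
As a rough illustration of how a replacement could be binned into the classes on this slide, here is a minimal Python sketch. The cut-offs (50, 100, 150) are an assumption: they are the thresholds commonly used with Grantham distances and are not stated in these notes. The function name `grantham_class` is hypothetical, and the Grantham distance itself must be looked up in the published matrix.

```python
def grantham_class(distance, is_stop=False):
    """Bin an amino acid replacement by its Grantham distance.

    Assumed cut-offs (not from the slides): <=50, <=100, <=150, >150.
    A change to a stop codon is its own class, whatever the distance.
    """
    if is_stop:
        return "stop or nonsense"
    if distance < 0:
        raise ValueError("Grantham distances are non-negative")
    if distance <= 50:
        return "conservative"
    if distance <= 100:
        return "moderately conservative"
    if distance <= 150:
        return "moderately radical"
    return "radical"


# 5 and 215 are the extremes usually quoted for the Grantham matrix.
print(grantham_class(5), "|", grantham_class(215), "|", grantham_class(0, is_stop=True))
```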
12
Clinical severity increases with severity of AA
substitution
Purple bars represent the ratio of frequencies of
the indicated class of change compared to
conservative changes for functional human genes
compared to pseudogenes. Orange bars represent
the ratio of the likelihood of clinical
observation for a conservative change versus the
indicated class of change. A nonsense change is
9 times more likely to present clinically than a
conservative amino acid substitution. For the
other changes, the ratios are 3, 2.3, and 1.8.

The same trend exists for the relative abundance
of the different types of substitutions found in
SNPs from human genes as compared with their
abundance in pseudogenes. Evolution selects
against radical changes!
Botstein & Risch, Nature Gen. 33, 228 (2003)
13
Clinical significance correlates with degree of
cross-species evolutionary conservation
An obvious way to measure the importance of a
particular amino acid is its conservation across
species. The figure shows that the disease
probability decreases monotonically with the
number of amino acid differences among species.
In simple terms: if evolution allows mutations
between species, this amino acid cannot be so
crucial.

Relative risks (log odds ratios) for the observed
versus the expected number of amino acid
changes. Purple: severe diseases; orange: milder
disease mutations (G6PD).
Botstein & Risch, Nature Gen. 33, 228 (2003)
14
Correlation of clinical severity and severity of
gene lesion
In numerous cases, genotype-phenotype correlation
has identified milder forms of disease that are
associated with less severe mutations. A classic
example is Duchenne (severe) and Becker (mild)
muscular dystrophy: Duchenne is caused primarily
by frame-shift deletions, Becker is caused by
in-frame changes. Other examples:
- hemolytic anemia: associated with globin
mutations
- hemochromatosis: high penetrance with radical
amino acid substitution, low penetrance with
milder amino acid substitution
- Gaucher disease: common milder mutation
associated with fewer clinical symptoms
- G6PD deficiency: severity of amino acid
substitution correlates with clinical
significance

Botstein & Risch, Nature Gen. 33, 228 (2003)
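
The frame-shift versus in-frame distinction behind the Duchenne/Becker example comes down to whether the number of deleted coding bases is a multiple of three. The sketch below illustrates this on an invented 18-base sequence. The helper names `delete`, `codons` and `deletion_effect` are hypothetical, and nothing here models the real dystrophin gene.

```python
def delete(seq, start, length):
    """Return seq with `length` bases removed, beginning at index `start`."""
    return seq[:start] + seq[start + length:]


def codons(seq):
    """Split a coding sequence into complete codons, dropping a trailing partial one."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def deletion_effect(deleted_bases):
    """Classify a coding deletion by its effect on the reading frame."""
    if deleted_bases <= 0:
        raise ValueError("deletion length must be positive")
    return "in-frame" if deleted_bases % 3 == 0 else "frame-shift"


toy = "ATGGCCAAAGGGTTTTGA"            # invented sequence: ATG GCC AAA GGG TTT TGA
print(deletion_effect(3), codons(delete(toy, 3, 3)))   # downstream codons unchanged
print(deletion_effect(1), codons(delete(toy, 3, 1)))   # every downstream codon changes
```

After the 3-base deletion the remaining codons are read exactly as before. After the 1-base deletion every downstream codon differs and the stop codon is lost, which is why frame-shift lesions tend to be the more severe ones.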
15
The future: understand complex diseases
Classical linkage analysis and positional cloning
remain the methods of choice for identifying
rare, high-risk, disease-associated mutations,
owing to their clear inheritance
patterns. Knowledge of the human genome sequence
will certainly help. But simple mendelian
inheritance is often not so simple:
- multiple different mutations are often
identified in the same or in different loci,
with variable phenotypic effects and highly
variable associated risks.
- mutational or genotypic heterogeneity can
explain some of the clinical variability observed
in single-gene diseases, but usually not all
→ modifier genes, environmental contributors.
For non-mendelian diseases and for diseases with
multi-gene effects, all contributing loci might
be thought of as modifiers, as no single locus of
large effect exists.

Botstein & Risch, Nature Gen. 33, 228 (2003)
16
large-scale SNP discovery projects
Two strategies: map-based or sequence-based.
It is unclear which one will be more
effective. The private sequencing effort has
reported 2.1 million SNPs (Venter et al. 2001)
and the public SNP consortium has identified 1.4
million SNPs (Sachidanandam et al. 2001). Rates
of false-positives (10-15%) are modest. Rates of
false-negatives (undetected SNPs) are more
problematic. Neither collection was based on the
sequences of many individuals → many
lower-frequency (< 10%) SNPs were not detected,
especially those that are specific to a single
population.

Botstein & Risch, Nature Gen. 33, 228 (2003)
17
fine-scale SNP discovery projects
Study A analyzed 313 genes (720 kb of genomic
sequence) for 84 ethnically diverse
individuals. Only 2% (or 6% excluding
singletons) of the SNPs identified are in
dbSNP, suggesting that there exist many more SNPs
than the roughly 1.2 million unique SNPs in
dbSNP. Study B analyzed 65% of the unique
sequence of chromosome 21 for 10
individuals. 36,000 SNPs were identified → >
6.4 million SNPs for the whole genome. Only 45%
of the SNPs in dbSNP were found in this
study. Conclusion: the number of SNPs in the
human genome (defined by a rare-allele frequency
of 1% or greater in at least one population) is
likely to be > 15 million. Note: there are only
30,000 genes.

Botstein & Risch, Nature Gen. 33, 228 (2003)
18
fine-scale SNP discovery projects

The alternative to the map-based strategy is
based on genes and sequence. Here, genotyping
focuses on SNPs identified in coding regions that
alter or terminate amino acid sequence, or
disrupt splice sites, or occur in promoter
regions. The table shows that we expect
50,000-100,000 such gene-related SNPs. Based on
results from cloned mendelian disease, one can
prioritize amino acid replacements according to
(a) the severity of the alteration, and (b) the
degree of evolutionary conservation.
Botstein & Risch, Nature Gen. 33, 228 (2003)
19
Can disease-associated alleles be predicted from
sequence?
The main feature that distinguishes a map-based
approach from a sequence-based approach to
genome-wide association studies is the degree to
which functional variants can be predicted on the
basis of sequence in, for example, coding and/or
conserved regions of the genome. Table 1 showed
that for mendelian phenotypes:
- most diseases are the result of changes that
cause loss or alterations in encoded proteins.
- < 1% of listed mutations occur in regulatory
regions (these would be more difficult to predict
from sequence).
The greatest risk of a disease phenotype is
associated with splice-site mutations, deletions
and insertions.

Botstein & Risch, Nature Gen. 33, 228 (2003)
20
Can disease-associated alleles be predicted from
sequence?
Can this distribution of risks be extrapolated to
alleles of moderate to low relative risk which
are assumed to underlie complex disease
phenotypes?

Literature: 18 changes: 15 AA substitutions, 1
large deletion, 1 frameshift, 1 variation in
promoter region. This is not very different from
high-risk diseases and is also biased towards
substitutions.
Botstein & Risch, Nature Gen. 33, 228 (2003)
21
Natural variation in human membrane transporter
genes: identify evolutionary and functional
constraints

Large-scale SNP and haplotype maps have only
analyzed 24-40 chromosomes within an ethnic
population and therefore identified common
variants (> 5%) with good accuracy. These
screens could not identify less common variants
that may have more severe functional
consequences. Little is known about the
relative levels of genetic diversity within
classes of genes. Here: focus on membrane
transporters, which are important drug targets.
Leabman et al. PNAS 100, 5896 (2003)
22
Structure of Membrane Transporters

Transmembrane helices: 25-residue-long stretches,
purely hydrophobic; prediction accuracy > 90%.
Typically 12-14 TM helices align to form a
pore. External domains are very variable in size.
Predicted secondary structures of two
representative membrane transporters from the ABC
and SLC superfamilies. The transmembrane topology
is schematically rendered.
Leabman et al. PNAS 100, 5896 (2003)
23
Membrane Transporters
Membrane transporters play a critical role in
many biological processes:
- maintain cellular and organismal homeostasis by
importing nutrients essential for cellular
metabolism
- export cellular waste products and toxic
compounds.
They are important in drug response:
- they provide the targets for many commonly used
drugs
- they are major determinants for drug
absorption, distribution, and elimination.
Two major superfamilies:
- ABC (ATP-binding cassette) transporters
- SLC (solute carrier) transporters: take up
neurotransmitters, nutrients, heavy metals ...
Here: screen for variation in a set of 24
genes encoding membrane transporters.

Leabman et al. PNAS 100, 5896 (2003)
24
24 TM transporters with potential roles in drug
response
Transporters are grouped based on transporter
family (e.g., OCT1, OCT2, and OCT3 belong to the
SLC22 family; CNT1 and CNT2 belong to the SLC28
family). Blue ovals: transporters of the SLC
superfamily; red rectangles: ABC superfamily;
green hexagon: P-type ATPase. Typical
substrates for each family of transporters are
listed. The direction of transport is indicated
by an arrow pointing into the cell (influx) or
out of the cell (efflux).

Leabman et al. PNAS 100, 5896 (2003)
25
Aims of SNP scan
Analyze 247 DNA samples from an ethnically
diverse collection (100 European Americans, 100
African Americans, 30 Asians, 10 Mexicans, 7
Pacific Islanders). Identify SNPs.
Aim 1: determine the levels and patterns of
genetic diversity
- in different ethnic groups
- in different transporter families
- across different structural regions of membrane
transporters.
Aim 2: combine population-genetic and
phylogenetic analysis to identify amino acid
residues and protein domains that may be
important for human fitness. Infer functional
consequences of amino acid substitution.
To identify polymorphisms, screen all exons plus
35-100 bp of flanking intronic sequence.

Leabman et al. PNAS 100, 5896 (2003)
26
Variation in transporter genes

680 biallelic SNPs, 2 tri-allelic SNPs. 91/477
SNPs were already deposited in dbSNP.
Leabman et al. PNAS 100, 5896 (2003)
27
Population specificity
421/680 SNPs are population specific. 248/421
are singletons: they occur only once among 494
chromosomes. (This explains why large-scale SNP
projects have so far identified far fewer
SNPs.) Of the 259 population-unspecific SNPs, 83
are present in all 5 populations. Few
population-specific alleles were found at high
frequency:
- only 4/278 African American-specific alleles
had frequency > 0.1
- only 1/50 Asian-specific alleles had frequency
> 0.1
- the European American population sample had no
population-specific allele (0/80) at frequency >
0.05.
The relatively high incidence of
moderately frequent population-specific alleles
in African Americans may facilitate
identification of ethnic-specific disease loci in
this population.

Leabman et al. PNAS 100, 5896 (2003)
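
The counts on this slide are easier to compare as percentages. The short Python sketch below only re-expresses numbers quoted in these notes (247 samples, 680 biallelic SNPs, 421 population-specific, 248 singletons, and the per-population counts). The helper `pct` is a hypothetical convenience function and no new data are introduced.

```python
def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


samples = 247
chromosomes = 2 * samples              # diploid samples -> 494 chromosomes
total_snps = 680
population_specific = 421
singletons = 248

print("chromosomes:", chromosomes)
print("population-specific SNPs: %.1f%%" % pct(population_specific, total_snps))
print("singletons among those:   %.1f%%" % pct(singletons, population_specific))
print("SNPs shared by >1 group: ", total_snps - population_specific)
# A singleton is seen on 1 of 494 chromosomes, i.e. a sample frequency of about 0.2%.
print("singleton frequency:      %.1f%%" % pct(1, chromosomes))
# Population-specific alleles above the stated frequency thresholds.
print("African American > 0.1:   %.1f%%" % pct(4, 278))
print("Asian > 0.1:              %.1f%%" % pct(1, 50))
```

A sample frequency of about 0.2% lies far below the roughly 5-10% detection range of the large-scale projects mentioned on earlier slides, which is consistent with the remark that those projects missed most of these SNPs.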
28
Analysis of Nucleotide Diversity

On average, genetic variation in membrane
transporters (π) is similar to that in other
genes. Next: study nucleotide diversity in TM
domains and in loop domains.
Leabman et al. PNAS 100, 5896 (2003)
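
For readers unfamiliar with nucleotide diversity, π is the average number of differences per site between two sequences drawn from the sample. The Python sketch below computes it directly from aligned sequences. It is a textbook-style illustration on four invented 10-base sequences, not the estimator used by Leabman et al. (who also separate synonymous from non-synonymous sites), and the function name `nucleotide_diversity` is hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Average pairwise differences per site across all pairs of aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(sequences, 2))
    differences = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return differences / (len(pairs) * length)


# Invented alignment: two identical copies, one single variant, one double variant.
toy_alignment = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGAACGTAC",
    "ACGAACGTAT",
]
pi = nucleotide_diversity(toy_alignment)
print(round(pi, 4))
```

Restricting the alignment to TM-domain or loop positions before calling the function would give the per-region comparison that the next slides discuss.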
29
Variation across structural regions
As expected, amino acid diversity (πNS) is
significantly lower in TM domains than in
loops. This is consistent with the observation
that TM domains are evolutionarily more conserved
than loops, suggesting that there are constraints
on TM domains of transporters.

EC: evolutionarily conserved; EU: evolutionarily
unconserved.
Agreement suggests that constraints on structural
regions of proteins (e.g. TM domains) occur
across long and short evolutionary distances for
this set of proteins.
Leabman et al. PNAS 100, 5896 (2003)
30
ABC and SLC superfamilies
ABC and SLC superfamilies of transporters have
evolved to transport structurally diverse
biological molecules. TMDs of both superfamilies
contain residues and structural domains
responsible for substrate specificity. Only the
loops of the ABC transporters contain ATP-binding
domains. Observation: π is extremely low in TM
domains of ABC transporters, much lower than in
TM domains of SLC family members.

Leabman et al. PNAS 100, 5896 (2003)
31
Paralogue identification
Predicted secondary structures of two
representative membrane transporters (BSEP and
CNT1) from the ABC and SLC superfamilies showing
positions of nonsynonymous SNPs (leading to amino
acid mutations). The transmembrane topology
schematic was rendered by using the program
TOPO. Nonsynonymous amino acid changes are shown
in red.

Leabman et al. PNAS 100, 5896 (2003)
32
Evolutionary conservation
Surprisingly, the extent of amino acid diversity
did not parallel evolutionary conservation: the
fraction of EU residues in the TM domains of the
ABC superfamily is significantly higher than in
the TM domains of the SLC superfamily. This
implies that a protein segment (TM domains of ABC
transporters) is more constrained within humans
than across species → may be related to substrate
properties.

For the SLC superfamily, πNS-EC is significantly
lower than πNS-EU both for the TM domains and for
the loops. For the TM domains of the ABC
superfamily, πNS-EC ≈ πNS-EU. This may reflect
special functional demands on the TM domain of
this superfamily. → Again: variation among humans
does not always parallel phylogenetic variation!

Leabman et al. PNAS 100, 5896 (2003)
33
Back to Pharmacogenomics
With the linkage of genomics with transcriptomics
and proteomics, pharmacogenomics is undergoing a
similar shift in focus from Mendelian examples to
more complex modes of genetic causation.
Candidate genes for variable drug response:
(1) genes that code for drug-metabolizing enzymes
(DME). Most DME-encoding genes have polymorphisms
that have been shown to influence enzymatic
activity.
(2) proteins involved in drug transport. Drug
transporters (e.g. ABC and SLC) show considerable
genetic variation, including many functional
polymorphisms.

Goldstein et al. Nature Rev. Gen. 4, 937 (2003)
34
Future of Pharmacogenomics
Detecting the effect of a gene variant that
explains 5% of the total phenotypic variation in
a quantitative response to a drug by typing 100
independent SNPs would require 500 patients to
provide an 80% chance of detection, assuming an
experiment-wide false-positive rate of 5%. The
behaviour of most drugs will be influenced by a
wide range of gene products (DMEs, transporters,
targets, and others), and in many cases the
importance of polymorphisms in one of the
relevant genes might depend on polymorphisms in
other genes. As a simple example, CYP1A2 and
N-acetyltransferase 2 act in different stages in
the pathway that metabolizes compounds in burnt
meat. Variants might interact to influence the
risk of colorectal cancer. The polymorphisms
indicate that regulatory variants have a far more
important role in variable drug response than
they do in Mendelian diseases.

Goldstein et al. Nature Rev. Gen. 4, 937 (2003)
