Various Career Options Available - PowerPoint PPT Presentation

About This Presentation
Slides: 20
Provided by: imtec5

Transcript and Presenter's Notes


1
Basics of Comparative Genomics
Dr G. P. S. Raghava
2
  • Aim: To understand the biology of organisms
  • Importance: More than 100 genomes sequenced, more
    than 250 in progress
  • Definition: Comparison of the set of proteins of
    one genome with that of another genome; comparison
    of gene location, gene order and gene regulation
  • Applications:
  • Visualization of information on genomes
  • Genome annotation (prediction of genes, repeats,
    regulatory regions)
  • Evolutionary information (gene loss, duplication,
    horizontal gene transfer, ancestry)
  • Essential genes for cell survival
  • Classification of genes based on function
  • Tools and databases

3
What is comparative genomics?
  • Analyzing and comparing genetic material from
    different species to study evolution, gene
    function, and inherited disease
  • Understanding what makes different species
    unique

4
Why Comparative Genomics?
  • It tells us what is common and what is unique
    between different species at the genome level.
  • Genome comparison may be the surest and most
    reliable way to identify genes and predict their
    functions and interactions.
  • e.g., to distinguish orthologs from
    paralogs
  • The functions of human genes and other DNA
    regions can be revealed by studying their
    counterparts in lower organisms.

5
What is compared?
  • Gene location
  • Gene structure
  • Exon number
  • Exon lengths
  • Intron lengths
  • Sequence similarity
  • Gene characteristics
  • Splice sites
  • Codon usage
  • Conserved synteny

6
A few facts from genome comparison
  • High degree of conservation of microbial proteins
    (70% ancestral conserved regions)
  • Proteins related to ENERGY processes are generally
    found in all genomes
  • Proteins related to COMMUNICATION represent the
    most distinctive functions in each genome
  • INFORMATION-related proteins show complex
    behaviour
  • High frequency (10%) of non-orthologous gene
    displacement

7
Some Terminology
  • Homology - Homology is the relationship between
    any two characters (such as two proteins that have
    similar sequences) that have descended, usually
    through divergence, from a common ancestral
    character. Homologues are thus components or
    characters (such as genes/proteins with similar
    sequences) that can be attributed to a common
    ancestor of the two organisms during evolution.

8
Homologues can be orthologues, paralogues or
xenologues.
  • Orthologues are homologues that have evolved from
    a common ancestral gene by speciation. They
    usually have similar functions.
  • Paralogues are homologues that are produced by
    duplication within a genome followed by
    subsequent divergence. They often have
    different functions.
  • Xenologues are homologues that are related by an
    interspecies (horizontal) transfer of the genetic
    material for one of the homologues. The functions
    of xenologues are quite often similar.

9
Analogues
  • Analogues are non-homologous genes/proteins that
    have arisen by convergence from unrelated
    ancestors. They have similar functions although
    they are unrelated in either sequence or
    structure.

10
Frequently used terms
  • Homology
  • Orthologous: derived from a common ancestral gene;
    usually have similar functions
  • Paralogous: produced by duplication of a gene
    within a genome; usually have different functions
  • Xenologous: related by an interspecies
    (horizontal gene) transfer of the genetic
    material; have similar functions
  • Analogous: not evolved from the same ancestor
  • Similarity: sequence similarity
  • Percent identity
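
Percent identity can be made concrete with a short sketch. The function below assumes the two sequences have already been aligned to equal length with "-" marking gaps. The name percent_identity and the choice of denominator (all alignment columns except those where both sequences have a gap) are assumptions made for illustration; other conventions divide by the shorter sequence length or by ungapped columns only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences have a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    # A column counts as a match only if the residues agree and are not gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


print(percent_identity("ACDE", "ACDF"))
```

Two sequences can have high percent identity without being homologous in the strict sense, so this number measures similarity, not ancestry.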

11
Visualising Genome Information
12
Genome Annotation
  • The process of adding biological information and
    predictions to a sequenced genome framework

13
All-against-all Self-comparison
  • How?
  • Making a database of the proteome
  • Use each protein as a query in a similarity
    search against the database
  • (BLAST, WU-BLAST or FASTA)
  • Generate a matrix of alignment scores (P or E
    value)
  • A conservative cutoff E value 10e-6
  • Why?
  • Number of Gene Families
  • This comparison distinguishes unique proteins
    from proteins that have arisen by gene duplication,
    and also reveals the number of gene families.
  • Paralogs
  • Significantly matched pairs of protein sequences
    may be paralogs.
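
The self-comparison on this slide can be sketched as follows, assuming the similarity search has already been run and its results are available as (query, subject, E value) tuples. The function name gene_families, the tuple layout, and the single-linkage grouping are illustrative assumptions, not part of the slide; the default cutoff reads the slide's value as 10^-6, which is also an assumption.

```python
from collections import defaultdict


def gene_families(proteins, hits, e_cutoff=1e-6):
    """Group proteins linked by significant hits; return (families, unique proteins)."""
    parent = {p: p for p in proteins}

    def find(p):
        # Follow parent links to the representative, compressing the path.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for query, subject, evalue in hits:
        # Skip self-hits and hits weaker than the cutoff.
        if query != subject and evalue <= e_cutoff:
            parent[find(query)] = find(subject)

    groups = defaultdict(set)
    for p in proteins:
        groups[find(p)].add(p)

    families = [g for g in groups.values() if len(g) > 1]
    unique = sorted(next(iter(g)) for g in groups.values() if len(g) == 1)
    return families, unique


# Hypothetical hit list for four proteins of one proteome.
example_hits = [("p1", "p1", 0.0), ("p1", "p2", 1e-30),
                ("p2", "p3", 1e-12), ("p3", "p4", 0.5)]
print(gene_families(["p1", "p2", "p3", "p4"], example_hits))
```

Members of the same family are candidate paralogs; proteins left in no family are the unique ones. Single linkage can chain multi-domain proteins into oversized families, so the result is a starting point only.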

14
Between-Proteome Comparisons: Why?
  • To identify orthologs, gene families, and domains
  • Orthologs (proteins that share a common
    ancestry and function)
  • A pair of proteins in two organisms that align
    along most of their lengths with a highly
    significant alignment score.
  • These proteins perform the core biological
    functions shared by the two organisms.
  • Two matched sequences (X in A, Y in B) may not be
    orthologs
  • (Y and Z are paralogs in B, X and Z are
    orthologs)
  • Identify true orthologs
  • highest-scoring match (best hit)
  • E value < 0.01
  • > 60% alignment over both proteins
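
The three criteria for true orthologs listed above can be sketched as a filter over search hits. The Hit record, its field names, and the function candidate_orthologs are hypothetical; the coverage threshold reads the slide's "60" as 60% of each protein's length, which is an assumption.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    query: str          # protein in organism A
    subject: str        # protein in organism B
    score: float        # alignment score (higher is better)
    evalue: float
    query_cov: float    # fraction of the query inside the alignment (0-1)
    subject_cov: float  # fraction of the subject inside the alignment (0-1)


def candidate_orthologs(hits, max_evalue=0.01, min_cov=0.60):
    """Keep each query's best-scoring hit if it passes the E value and coverage tests."""
    best = {}
    for h in hits:
        if h.query not in best or h.score > best[h.query].score:
            best[h.query] = h
    return {q: h.subject for q, h in best.items()
            if h.evalue < max_evalue
            and h.query_cov > min_cov
            and h.subject_cov > min_cov}


# Hypothetical hits: X matches both Y and Z, but Z scores higher.
example = [Hit("X", "Y", 200, 1e-50, 0.90, 0.85),
           Hit("X", "Z", 350, 1e-90, 0.95, 0.90),
           Hit("W", "V", 40, 0.5, 0.70, 0.70)]
print(candidate_orthologs(example))
```

In the example, X is paired with Z and not Y, mirroring the slide's point that a matched pair (X, Y) need not be the orthologous pair when Y has a paralog Z in the same genome.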

15
Between-Proteome Comparisons: How?
  1. Choose a yeast protein and perform a database
     similarity search of the worm proteome
     (WU-BLAST): a yeast-versus-worm search
  2. Group the worm seqs that match the yeast query
     seq with a highly significant P value (10^-10 to
     10^-100); also include the yeast query seq in
     the group
  3. From the group made in step 2, choose a worm seq
     and search the yeast proteome, using the same
     P limit
  4. Add any matching yeast seq to the group made in
     step 2
  5. Repeat steps 3 and 4 for all initially matched
     seqs in the group
  6. Repeat steps 1-5 for every yeast protein
  7. As in steps 1-6, perform a comparable
     worm-versus-yeast search
  8. Coalesce the groups of related seqs and remove
     any redundancies so that every sequence is
     represented only once
  9. Eliminate any matched pairs in which less than
     80% of each seq is in the alignment
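
The procedure above can be sketched in a few functions, assuming both searches have already been run and stored as dictionaries that map each query ID to a list of (subject ID, P value, query coverage, subject coverage) tuples. All names here are hypothetical. The P limit of 1e-10 takes the least stringent end of the slide's range, and the 80% coverage test is applied to each matched pair before grouping, which is a simplification of the final step.

```python
def build_group(query, forward_hits, reverse_hits, p_limit, min_cov):
    """One query: forward search, then one reverse search per initial match."""
    def matches(table, q):
        return [s for s, p, cov_q, cov_s in table.get(q, [])
                if p <= p_limit and cov_q >= min_cov and cov_s >= min_cov]

    group = {query}
    initial = matches(forward_hits, query)
    group.update(initial)
    for seq in initial:
        group.update(matches(reverse_hits, seq))
    return group


def coalesce(groups):
    """Merge overlapping groups so every sequence appears in exactly one group."""
    merged = []
    for g in groups:
        g = set(g)
        disjoint = []
        for m in merged:
            if m & g:
                g |= m          # absorb any existing group that shares a member
            else:
                disjoint.append(m)
        merged = disjoint + [g]
    return merged


def cluster_proteomes(yeast_hits, worm_hits, p_limit=1e-10, min_cov=0.80):
    """Yeast-versus-worm pass, worm-versus-yeast pass, then coalesce."""
    groups = [build_group(q, yeast_hits, worm_hits, p_limit, min_cov)
              for q in yeast_hits]
    groups += [build_group(q, worm_hits, yeast_hits, p_limit, min_cov)
               for q in worm_hits]
    # Groups of size one contain no cross-proteome match and are dropped.
    return [g for g in coalesce(groups) if len(g) > 1]
```

IDs must be unique across the two proteomes (for example by prefixing them with the organism), and every protein should appear as a key even if its hit list is empty.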

16
Figure 1. Regions of the human and mouse
homologous genes: coding exons (white), noncoding
exons (gray), introns (dark gray), and intergenic
regions (black). Corresponding strong (white) and
weak (gray) alignment regions of GLASS are shown
connected with arrows. Dark lines connecting the
alignment regions denote very weak or no
alignment. The predicted coding regions of
ROSETTA in human, and the corresponding regions in
mouse, are shown (white) between the genes and
the alignment regions.
17
Target Validation
  • Target validation involves taking steps to prove
    that a DNA, RNA, or protein molecule is directly
    involved in a disease process and is therefore a
    suitable target for development of a new
    therapeutic compound.
  • Genes that do not belong to an established
    family are critical to many disease processes and
    also need to be validated as potential drug
    targets.
