Extra credit problem - PowerPoint PPT Presentation

Provided by: lindaih

Transcript and Presenter's Notes

Title: Extra credit problem


1
Extra credit problem
  • For the Functional and GO exercises,
  • Eukaryotic Annotation Course

2
Example 1
  • This protein from Ixodes scapularis has 386 amino
    acids, and was submitted by a member of the
    community.
  • gtexample_sequence_1
  • MKKKPISKVFVPRSRNSESDAIFKIPMAALKFTGLFWNTTCRPARLLSFL
    LKISIVTTQA KLLSDAFTYETVDMVLYGSRILTANVSFIIFALQERNLR
    NAIKDLSDKASFLLPLQRQRK IRTLSCSLACVSAIIIAVFLSGPAYVLF
    FTDKRLQTDLLSRFVAYLNEVCFAVVIWYPLC
    FMPILFVNVSQTFAELLSQYNEMIPKLFCTENHNIYSLNCKFRHSREQRH
    EMRRLLSVCG KIFAPCLFIWYGPTFLGCCAELSNFMRQSDAWVHRYYKA
    VTSAHGWAMFWGVSLAAHHVY ATGRASWDVLQDCTLRLPLDVGVHMELV
    MLKEDCRKIAMAFTIGGFYKLTLRTAFSVFSC
    MLTYAFVWYQIGPGSQPNVASHTNSD
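The sequence as transcribed above contains stray spaces and uneven line breaks, so it may need cleaning before it is pasted into BLASTP, Pfam, or the other tools used in the following slides. A minimal Python sketch of that clean-up is shown here; the helper names `clean_sequence` and `to_fasta` are hypothetical and not part of the original presentation, and only the first two transcribed lines are used as sample input.

```python
def clean_sequence(raw: str) -> str:
    """Strip all whitespace from a pasted protein sequence and upper-case it."""
    return "".join(raw.split()).upper()


def to_fasta(name: str, raw: str, width: int = 60) -> str:
    """Return the cleaned sequence as a FASTA record wrapped at `width` residues."""
    seq = clean_sequence(raw)
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join([f">{name}"] + lines)


if __name__ == "__main__":
    # First two lines of the sequence exactly as they appear in the transcript.
    fragment = """MKKKPISKVFVPRSRNSESDAIFKIPMAALKFTGLFWNTTCRPARLLSFL
    LKISIVTTQA KLLSDAFTYETVDMVLYGSRILTANVSFIIFALQERNLR"""
    print(to_fasta("gtexample_sequence_1", fragment))
    # Residue count after cleaning; compare against the stated protein length
    # once the full sequence has been pasted in.
    print(len(clean_sequence(fragment)))
```

Running the same function on the full sequence gives a quick check of the stated length of 386 amino acids.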

3
Check the gene structure
4
BLASTP
http://www.ncbi.nlm.nih.gov/blast/Blast.cgi
5
BLASTP Result
Conclusion: No significant homology
6
Pfam
gtexample_sequence_1
http://pfam.sanger.ac.uk/
7
Pfam Result
8
HMM score significance
For each Pfam family, there is a "trusted cutoff"
and a "noise cutoff", TC1 and NC1. TC1 is the
lowest score for sequences included in the family
(e.g. in the Full alignment). NC1 is the highest
score for sequences not included in the Full
alignment.
There are two HMMs for each Pfam entry: one to
represent full-length matches (ls model), and one
to represent fragment matches (fs model).
No significant hits: the score for the 7TM_7
domain is not even above the Noise Cutoff, and the
E-value is poor.
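The cutoff logic described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_hit` and the numeric values in the usage example are hypothetical, and real TC1/NC1 values have to be read from the Pfam entry for the family in question.

```python
def classify_hit(score: float, trusted_cutoff: float, noise_cutoff: float) -> str:
    """Place an HMM bit score relative to a family's trusted and noise cutoffs.

    trusted_cutoff (TC1): lowest score among sequences included in the family.
    noise_cutoff (NC1): highest score among sequences not included.
    """
    if trusted_cutoff < noise_cutoff:
        raise ValueError("trusted cutoff should not be lower than noise cutoff")
    if score >= trusted_cutoff:
        return "trusted"          # scores like a family member
    if score > noise_cutoff:
        return "between cutoffs"  # above noise but not trusted
    return "below noise"          # no better than known non-members


if __name__ == "__main__":
    # Hypothetical numbers for illustration only; not taken from the slides.
    print(classify_hit(score=5.0, trusted_cutoff=20.0, noise_cutoff=15.0))
```

A hit that falls in the "below noise" branch, as the 7TM_7 hit does here, is weak evidence at best and should not drive the annotation on its own.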
9
Look at HMM alignment
10
Transmembrane Domains (TMHMM)
http://www.cbs.dtu.dk/services/TMHMM/
11
TMHMM Result
5 Transmembrane helices predicted
Red bars denote significant result.
12
Looking for a Signal Sequence: SignalP
http://www.cbs.dtu.dk/services/SignalP/
13
SignalP Result
No signal sequence found.
14
Look for a Targeting Sequence: TargetP
http://www.cbs.dtu.dk/services/TargetP/
15
TargetP Result
Result ambiguous
RC: Reliability class, from 1 to 5, where 1
indicates the strongest prediction. RC is a
measure of the size of the difference ('diff')
between the highest (winning) and the second
highest output scores. There are 5 reliability
classes, defined as follows:
    1: diff > 0.800
    2: 0.800 > diff > 0.600
    3: 0.600 > diff > 0.400
    4: 0.400 > diff > 0.200
    5: 0.200 > diff
Thus, the lower the value of RC the safer the
prediction.
SP: Secretory pathway, i.e. the sequence contains
SP, a signal peptide. Here the result is ambiguous.
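The reliability class defined on this slide can be sketched as a small Python function. This is an illustrative sketch, not TargetP's own code: the name `reliability_class` is hypothetical, the example scores are made up, and the slide does not say which class a difference lying exactly on a boundary (for example 0.800) belongs to, so the sketch assumes it falls into the weaker class.

```python
def reliability_class(scores):
    """Return RC (1 = safest, 5 = weakest) from a list of output scores.

    'diff' is the gap between the highest and second-highest score.
    Assumption: a diff exactly on a boundary falls into the weaker class.
    """
    if len(scores) < 2:
        raise ValueError("need at least two output scores")
    ranked = sorted(scores, reverse=True)
    diff = ranked[0] - ranked[1]
    if diff > 0.8:
        return 1
    if diff > 0.6:
        return 2
    if diff > 0.4:
        return 3
    if diff > 0.2:
        return 4
    return 5


if __name__ == "__main__":
    # Hypothetical output scores for illustration only.
    print(reliability_class([0.50, 0.45, 0.10]))
```

Two closely matched top scores, as in the usage example, give a small diff and therefore a high RC, which is the situation the slide describes as ambiguous.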
16
Superfamily
17
There is no significant Superfamily result,
either.
18
Other information
ClustalW alignment of other members of this
putative family
19
Viewing the phylogenetic tree
20
Discussion
  • BLASTP: No significant matches.
  • Domains: Possibly a diverged tick domain. Weak
    hit (below Noise Cutoff) to Pfam08395 7tm_7. 7tm
    Chemosensory receptor. This family includes a
    number of gustatory and odorant receptors mainly
    from insect species such as A. gambiae and D.
    melanogaster. They are classified as
    G-protein-coupled receptors (GPCRs), or
    seven-transmembrane receptors. They show high
    sequence divergence, consistent with an ancient
    origin for the family.
  • 5 transmembrane helices.
  • Possibly secreted, ambiguous.
  • Not closely matched to others in the putative
    family. Not grouped in a domain-based family
    calculation.
  • What's a curator to do?

In this case, the conservative approach would be
to name the protein "hypothetical protein". Alternatively,
we could call it "transmembrane protein, putative".