CS177 homework assigned March 2

Provided by: barryz

1
CS177 homework assigned March 2
  • this can be either a group or individual
    assignment, whichever is easier for you
  • this will cover
    • multiple alignment
    • html and java
  • this is a lot of stuff, so it is due in 2 weeks
  • the balance should start shifting from assigned
    homework to doing your projects

2
vertical multiple alignment
  • pick a gene, find the human mRNA RefSeq (i.e.,
    NM_XXXX), and query NCBI nucleotides using BLAST
  • see if you get hits to 5 or 6 different species
  • if not, try another gene
  • pick out one good hit (i.e., low p value and pretty
    long) for each species (including the original
    human RefSeq)
  • submit these sequences to ClustalW
  • visually identify one column on the alignment
    that is highly conserved, one that is
    moderately conserved, and one that is poorly
    conserved
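
To make the three conservation categories concrete, here is a minimal sketch in Java that scores one column of an alignment by the fraction of sequences sharing its most common character. The class name, the thresholds (0.9 and 0.5), and the sample rows are all assumptions made up for illustration; they are not part of ClustalW or of the assignment, which asks for a visual judgment.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the course materials.
class ColumnConservation {

    // Fraction of rows that share the most common residue in one column.
    // Gap characters ('-') are skipped, so a gap never counts as a match.
    static double conservation(String[] alignedRows, int column) {
        Map<Character, Integer> counts = new HashMap<>();
        int best = 0;
        for (String row : alignedRows) {
            char c = Character.toUpperCase(row.charAt(column));
            if (c == '-') {
                continue;
            }
            int n = counts.merge(c, 1, Integer::sum);
            best = Math.max(best, n);
        }
        return (double) best / alignedRows.length;
    }

    // Thresholds are arbitrary assumptions chosen for illustration only.
    static String label(double score) {
        if (score >= 0.9) {
            return "highly conserved";
        }
        if (score >= 0.5) {
            return "moderately conserved";
        }
        return "poorly conserved";
    }

    public static void main(String[] args) {
        // Made-up rows of equal length, standing in for ClustalW output.
        String[] rows = { "ATGCA", "ATGTA", "ATCGA", "AT-AT", "ATGCC" };
        for (int col = 0; col < rows[0].length(); col++) {
            double score = conservation(rows, col);
            System.out.println("column " + (col + 1) + ": " + score + " " + label(score));
        }
    }
}
```

Every row must have the same length, which is true of the rows in an alignment. A real ClustalW file also carries sequence names and is split into blocks, so those would need to be stripped and joined first.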

3
horizontal multiple alignment
  • go to the Pfam site
  • http://www.sanger.ac.uk/Software/Pfam/
  • look it over until you are completely confused
  • here is my example
  • run through it
  • then do your own example
  • enter fibrin in keywords box
  • click on kringle
  • click on view species tree
  • click in the box next to Homo sapiens and then
    click view selected species alignment
  • meditate on what you are seeing
  • amino acids, not nucleotides
  • uses the one letter symbol format for amino acids
  • can you make any observations about the multiple
    alignments?
  • try it in a second browser window for gorilla and
    compare with human
  • ditto for mouse

4
java
  • take a look at the NCBI_STRUCTURES.java program
  • go to the web site from the last homework
  • http://java.sun.com/j2se/1.3/docs/api/index.html
  • see if you can find something in the web site
    that helps you make sense out of one or two
    things in the NCBI_STRUCTURES.java program
  • hint: look at the URLConnection class
  • just spend 30 or 40 minutes on this. don't get
    too frustrated now - you will have plenty of time
    for that once you get a real job
  • think of this as a growth experience that builds
    character
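
For orientation while reading the URLConnection documentation, here is a minimal sketch of how that class is typically used to read a page line by line. It is not the NCBI_STRUCTURES.java program, whose contents are not shown here. The class name and the tempPage helper are hypothetical, and tempPage exists only so the sketch can run without a network connection. The code also uses try-with-resources and java.nio.file, which are newer than the J2SE 1.3 API linked above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the course's NCBI_STRUCTURES.java.
class UrlConnectionSketch {

    // Opens a connection to the URL and returns the response body as lines.
    static List<String> readLines(URL url) {
        List<String> lines = new ArrayList<>();
        try {
            URLConnection conn = url.openConnection();
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = in.readLine()) != null) {
                    lines.add(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    // Offline stand-in for a web page: writes content to a temp file
    // and returns a file: URL pointing at it.
    static URL tempPage(String content) {
        try {
            Path p = Files.createTempFile("page", ".html");
            p.toFile().deleteOnExit();
            Files.write(p, content.getBytes(StandardCharsets.UTF_8));
            return p.toUri().toURL();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Pass a real URL as the first argument, or run with none for the offline demo.
        URL url = args.length > 0
                ? new URL(args[0])
                : tempPage("<html>\n<dd>example</dd>\n</html>\n");
        for (String line : readLines(url)) {
            System.out.println(line);
        }
    }
}
```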

5
java and html - we will do this in class next
week, but you will have to do it on your own also
  • the NCBI_STRUCTURES.java program can be used as a
    prototype for this part
  • go to NCBI web site
  • view the html source code underlying the web
    page
  • locate the form action POST stuff
  • look at the stuff that happens between the <form>
    and the </form> tags
  • copy the html source into a file, and change POST
    to GET, save as NCBI.html
  • open another browser window, and read in
    NCBI.html using the File menu
  • perform a query type of your choice
  • this will not actually work since POST is
    expected, but notice the stuff in the URL window
    that is exposed by using GET
  • copy the html source into a file, and change the
    NCBI URL into the URL for my cgi program
    testloop.cgi, save as NCBIecho.html
  • repeat the last 3 steps, and see if the echo is
    the same as the GET
  • modify NCBI_STRUCTURES.java
  • make it put out what you need for your query
  • modify the part that does the parsing (i.e., the
    line with <dd>) to make it relevant for parsing
    your output
  • hint: figure out the modification by looking at
    the real output html source from a real query at
    the real NCBI site
  • if you cannot figure out how to modify the
    parsing, then at least comment it out entirely
    or you will not see any output!!
  • remember that the java program is run as:
  • java NCBI_STRUCTURES inputfilename
  • inputfilename is the name of the input file that
    has 4 or 5 gene names to test out
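
Two of the steps above can be sketched in Java: what a GET request exposes in the URL, and filtering returned HTML down to the lines that carry a marker tag such as <dd>. This is a sketch under stated assumptions, not the actual NCBI_STRUCTURES.java code. The class name, the example.org address, and the field names db and term are hypothetical placeholders; the real field names come from the form in the NCBI page source.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the real program's parsing may differ.
class QuerySketch {

    // Builds the URL a browser would show when a form is submitted with GET:
    // the form's action, then ?name=value&name=value with values URL-encoded.
    static String buildGetUrl(String action, Map<String, String> fields) {
        StringBuilder sb = new StringBuilder(action);
        char sep = '?';
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append(sep).append(encode(e.getKey()))
              .append('=').append(encode(e.getValue()));
            sep = '&';
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Keeps only the lines containing the marker (for example "<dd>"),
    // then strips anything that looks like a tag and trims whitespace.
    static List<String> linesContaining(List<String> html, String marker) {
        List<String> hits = new ArrayList<>();
        for (String line : html) {
            if (line.contains(marker)) {
                hits.add(line.replaceAll("<[^>]*>", "").trim());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("db", "nucleotide");   // hypothetical field name and value
        fields.put("term", "BRCA1 human"); // hypothetical field name and value
        System.out.println(buildGetUrl("http://example.org/query.cgi", fields));
    }
}
```

If the real query output marks results with a different tag, changing the marker argument is the analogue of modifying the <dd> line in the parsing step.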