Inside the Gene Sorter

Slides: 17
Provided by: jimk88

1
Inside the Gene Sorter
  • A moderately complex CGI application

2
Built on top of library modules
  • cheapcgi - creates widgets
  • cart - gathers input
  • web - UCSC look and feel
  • axtAffine - align two sequences
  • hash - in memory table keyed by string
  • linefile - fast line oriented parsing
  • ra - file full of var/value records, used to
    configure columns etc.

3
Gene Sorter Main Page
  • Controls up top followed by big table.
  • One of about a dozen pages produced by hgNear CGI
    script.

4
hgNear controls CGI vars
variable             tag     type    value         options
hgsid                INPUT   HIDDEN  97309948
org                  SELECT  n/a     Human         Human, Mouse, Rat, C. elegans, D. melanogaster, S. cerevisiae
db                   SELECT  n/a     hg18          hg18, hg17, hg16
near_search          INPUT   TEXT
submit               INPUT   SUBMIT  n/a
near_order           SELECT  n/a     expGnfAtlas2  expGnfAtlas2, blastp, pfamSimilarity, geneDistance, genomePos, nameSimilarity
near.do.configure    INPUT   SUBMIT  n/a
near.do.advFilter    INPUT   SUBMIT  n/a
near.count           SELECT  n/a     50            25, 50, 100, 200, 500, 1000, all
near.do.getSeqPage   INPUT   SUBMIT  n/a
  • Note 'near' prefix of CGI-specific vars.
  • Note 'near.do' prefix of buttons.
  • Presence of a button var in CGI is used to figure
    out which page to draw.
  • All near.do vars are ripped out of cart so page
    only drawn once.
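
The button convention above can be illustrated without the cart library. This is only a minimal sketch: the real cart is a hash-backed structure, while here the submitted variables are a plain array of names, and pageFromVars and removePrefixed are hypothetical helpers invented for the example, not hgNear functions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the part after "near.do." of the first
 * button variable present, or NULL if no button variable was submitted. */
const char *pageFromVars(const char *vars[], int count)
{
    static const char prefix[] = "near.do.";
    size_t prefixLen = sizeof(prefix) - 1;
    for (int i = 0; i < count; i++)
        if (strncmp(vars[i], prefix, prefixLen) == 0)
            return vars[i] + prefixLen;
    return NULL;
}

/* Hypothetical helper: drop every variable that starts with prefix,
 * compacting the array in place. Returns the new count. */
int removePrefixed(const char *vars[], int count, const char *prefix)
{
    size_t prefixLen = strlen(prefix);
    int kept = 0;
    for (int i = 0; i < count; i++)
        if (strncmp(vars[i], prefix, prefixLen) != 0)
            vars[kept++] = vars[i];
    return kept;
}
```

Calling removePrefixed(vars, count, "near.do.") after the page is drawn leaves ordinary settings such as near.count in place, so a reload shows the default page rather than repeating the button action.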

5
Snippet of C for big dispatch
void doMiddle(struct cart *theCart)
/* Write the middle parts of the HTML page. This routine sets
 * up some globals and then dispatches to the appropriate
 * page maker. */
{
...
else if (cartVarExists(cart, customPageDoName))
    doCustomPage(conn, colList);
else if (cartVarExists(cart, customSubmitDoName))
    doCustomSubmit(conn, colList);
else if (cartVarExists(cart, customClearDoName))
    doCustomClear(conn, colList);
else if (cartVarExists(cart, customPasteDoName))
    doCustomPaste(conn, colList);
else if (cartVarExists(cart, customUploadDoName))
    doCustomUpload(conn, colList);
else if (cartVarExists(cart, customFromUrlDoName))
    doCustomFromUrl(conn, colList);
else if (cartVarExists(cart, orderInfoDoName))
    doOrderInfo(conn);
else if (cartVarExists(cart, affineAliVarName))
    doAffineAlignment(conn);
else if (cartNonemptyString(cart, searchVarName))
    doSearch(conn, colList);
else if (gotAdvFilter())
    displayData(conn, colList, knownPosFirst(conn));
else
    doExamples(conn, colList);
cartRemovePrefix(cart, "near.do.");
}

6
Gene Sorter Columns
7
columnDb.ra example
name num
shortLabel
longLabel Item Number in Displayed List/Select Gene
priority 1
visibility on
type num

name name
shortLabel Name
longLabel Gene Name/Select Gene
priority 2
visibility on
type knownName kgXref kgID geneSymbol
search fuzzy
searchLabel Known Gene Names

name proteinName
shortLabel UniProt
longLabel UniProt (SwissProt/TrEMBL) Protein Display ID
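
Each line in the records above is a field name followed by a free-text value. A rough sketch of splitting one such line is below. The splitRaLine name is hypothetical (the real parsing lives in the ra and linefile modules), and treating a leading '#' as a comment is an assumption made for this example.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: split one "name value..." line in place.
 * Returns 1 and sets *retName / *retVal on success, or 0 for a blank
 * line or (by assumption) a line starting with '#'. */
int splitRaLine(char *line, char **retName, char **retVal)
{
    char *s = line;
    while (isspace((unsigned char)*s))
        s++;
    if (*s == '\0' || *s == '#')
        return 0;
    *retName = s;
    /* The field name is the first whitespace-delimited word. */
    while (*s != '\0' && !isspace((unsigned char)*s))
        s++;
    if (*s != '\0') {
        *s++ = '\0';
        while (isspace((unsigned char)*s))
            s++;
    }
    *retVal = s;
    /* Trim the trailing newline and spaces from the value. */
    size_t len = strlen(s);
    while (len > 0 && isspace((unsigned char)s[len - 1]))
        s[--len] = '\0';
    return 1;
}
```

A line with a field name but no value, such as a bare shortLabel, yields an empty value string rather than an error.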

8
hgNearData
  • columnDb.ra lives in hgNearData directory.
  • There are three levels of columnDb.ra files, in
    three levels of directory hierarchy:
  • root - applicable to all organisms
  • organism - override root for an organism
  • database - override for specific assembly
  • Can override specific fields as well as entire
    record. Always need at least name field.
  • genome.ra and orderDb.ra also live in hgNearData,
    as well as column.html files that describe the
    columns.
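
The field-level override described above might be sketched as follows. This is an illustration only: hgNear keeps each record in a hash, whereas this sketch uses a small fixed array, and raRecord, raGet, raSet and raOverride are hypothetical names.

```c
#include <string.h>

#define MAX_FIELDS 16

struct raField { const char *name; const char *val; };
struct raRecord { struct raField fields[MAX_FIELDS]; int count; };

/* Look up a field value by name; NULL if absent. */
const char *raGet(const struct raRecord *rec, const char *name)
{
    for (int i = 0; i < rec->count; i++)
        if (strcmp(rec->fields[i].name, name) == 0)
            return rec->fields[i].val;
    return NULL;
}

/* Set a field, replacing an existing value or appending a new field.
 * Returns 0 on success, -1 if the record is full. */
int raSet(struct raRecord *rec, const char *name, const char *val)
{
    for (int i = 0; i < rec->count; i++) {
        if (strcmp(rec->fields[i].name, name) == 0) {
            rec->fields[i].val = val;
            return 0;
        }
    }
    if (rec->count >= MAX_FIELDS)
        return -1;
    rec->fields[rec->count].name = name;
    rec->fields[rec->count].val = val;
    rec->count++;
    return 0;
}

/* Apply an override record on top of a base record of the same name:
 * fields present in the override win, all others are inherited. */
int raOverride(struct raRecord *base, const struct raRecord *over)
{
    for (int i = 0; i < over->count; i++)
        if (raSet(base, over->fields[i].name, over->fields[i].val) != 0)
            return -1;
    return 0;
}
```

Applying the organism record and then the database record to the root record gives the precedence order root, organism, database.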

9
Routine to get active columns
struct column *getColumns(struct sqlConnection *conn)
/* Return list of columns for big table. */
{
char *raName = "columnDb.ra";
struct column *col, *next, *customList, *colList = NULL;
struct hash *raList = readRa(raName), *raHash = NULL;

/* Create built-in columns. */
if (raList == NULL)
    errAbort("Couldn't find anything from %s", raName);
for (raHash = raList; raHash != NULL; raHash = raHash->next)
    {
    AllocVar(col);
    col->settings = raHash;
    columnVarsFromSettings(col, raName);
    if (!hashFindVal(raHash, "hide"))
        {
        setupColumnType(col);
        if (col->exists(col, conn))
            {
            slAddHead(&colList, col);
            }
        }
    }

/* Create custom columns. */
customList = customColumnsRead(conn, genome, database);
for (col = customList; col != NULL; col = next)
    {
    next = col->next;
    setupColumnType(col);
    if (col->exists(col, conn))
        {
        slAddHead(&colList, col);
        }
    }
...

10
Column structure
struct column
/* A column in the big table. The central data structure for
 * hgNear. */
    {
    /* Data set during initialization that is guaranteed to be in
     * each column. */
    struct column *next;    /* Next column. */
    char *name;             /* Column name, not allocated here. */
    char *shortLabel;       /* Column label. */
    char *longLabel;        /* Column description. */
    boolean on;             /* True if turned on. */
    char *type;             /* Type - encodes which methods to use etc. */

    boolean (*exists)(struct column *col, struct sqlConnection *conn);
    /* Return TRUE if column exists in database. */

    char *(*cellVal)(struct column *col, struct genePos *gp,
        struct sqlConnection *conn);
    /* Get value of one cell as string. FreeMem this when done.
     * Note that gp->chrom may be NULL legitimately. */

    void (*cellPrint)(struct column *col, struct genePos *gp,
        struct sqlConnection *conn);
    /* Print one cell of this column in HTML. Note that gp->chrom
     * may be NULL legitimately. */

    void (*labelPrint)(struct column *col);
    /* Print the label in the label row. */

    void (*configControls)(struct column *col);
    /* Print out configuration controls. */
    ...

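
The function-pointer members are what let each column type supply its own behaviour. Here is a cut-down sketch of the idea; miniColumn, miniGene, numCellVal, nameCellVal and formatRow are all hypothetical stand-ins, and unlike the real cellVal this version writes into a caller-supplied buffer instead of returning an allocated string.

```c
#include <stdio.h>
#include <string.h>

struct miniGene { const char *name; int row; };

struct miniColumn {
    struct miniColumn *next;   /* Next column in list. */
    const char *name;          /* Column name. */
    int on;                    /* Non-zero if column is displayed. */
    /* Write this column's cell value for gene into buf. */
    void (*cellVal)(const struct miniColumn *col, const struct miniGene *gene,
                    char *buf, size_t bufSize);
};

/* Method for a "num"-like column: the row number. */
void numCellVal(const struct miniColumn *col, const struct miniGene *gene,
                char *buf, size_t bufSize)
{
    (void)col;
    snprintf(buf, bufSize, "%d", gene->row);
}

/* Method for a "name"-like column: the gene name. */
void nameCellVal(const struct miniColumn *col, const struct miniGene *gene,
                 char *buf, size_t bufSize)
{
    (void)col;
    snprintf(buf, bufSize, "%s", gene->name);
}

/* Build one tab-separated row by calling each active column's method. */
void formatRow(const struct miniColumn *colList, const struct miniGene *gene,
               char *out, size_t outSize)
{
    size_t used = 0;
    if (outSize == 0)
        return;
    out[0] = '\0';
    for (const struct miniColumn *col = colList; col != NULL; col = col->next) {
        char cell[64];
        if (!col->on)
            continue;
        col->cellVal(col, gene, cell, sizeof(cell));
        int n = snprintf(out + used, outSize - used, "%s%s",
                         used ? "\t" : "", cell);
        if (n < 0 || (size_t)n >= outSize - used)
            break;  /* Output truncated; stop adding cells. */
        used += (size_t)n;
    }
}
```

The row-building loop never inspects the column type; adding a new kind of column only means supplying another function with the cellVal signature.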
11
Drawing big table
hPrintf("<TABLE BORDER=1 CELLSPACING=0 CELLPADDING=1 COLS=%d BGCOLOR=\""HG_COL_INSIDE"\">\n",
    totalHtmlColumns(colList));

/* Print label row. */
hPrintf("<TR BGCOLOR=\""HG_COL_HEADER"\">");
for (col = colList; col != NULL; col = col->next)
    {
    if (col->on)
        col->labelPrint(col);
    }
hPrintf("</TR>\n");

/* Print other rows. */
for (gene = geneList; gene != NULL; gene = gene->next)
    {
    if (sameString(gene->name, curGeneId->name))
        hPrintf("<TR BGCOLOR=\"#D0FFD0\">");
    else
        hPrintf("<TR>");
    for (col = colList; col != NULL; col = col->next)
        {
        if (col->on)
            col->cellPrint(col, gene, conn);
        }
    if (ferror(stdout))
        errAbort("Write error to stdout");
    }
hPrintf("</TABLE>");

12
Common column types
  • Lookup - works with database table keyed by gene
    name. Only a single string value allowed for
    each gene.
  • Association - allows arbitrary SQL query
    including gene name. Multiple values per gene ok.
  • Float - like lookup but with numerical values.
  • Distance - reports a value associated with two
    genes - selected gene and gene on current row.
  • expMulti - expression microarray data
  • See hgNearData.doc for more details.

13
Adding kgTxInfo columns
name geneCategory
shortLabel Gene Category
longLabel High Level Gene Category - Coding, Antisense, etc.
priority 2.6001
visibility off
type lookup kgTxInfo name category

name cdsScore
shortLabel CDS Score
longLabel Coding potential score from txCdsPredict
priority 2.6002
visibility off
type float kgTxInfo name cdsScore

14
Updating coding SNPs Column
name codingSnps
shortLabel Coding SNPs
longLabel Simple Nucleotide Polymorphisms in Coding Regions
priority 7.5
visibility off
type association knownToCdsSnp
queryFull select name,value from knownToCdsSnp
queryOne select value,value from knownToCdsSnp where name = '%s'
invQueryOne select name from knownToCdsSnp where value = '%s'
itemUrl http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?type=rs&rs=%s
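
The query and URL settings each carry one %s placeholder that is filled in per gene or per item. The sketch below shows one way to do that substitution with a bounded buffer. fillTemplate is a hypothetical helper, not the hgNear routine, and it does no SQL escaping, which real code would need before sending the query.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy tmpl into out, replacing the first "%s"
 * with arg. Returns 0 on success, -1 if tmpl has no "%s" or the result
 * does not fit in outSize bytes. */
int fillTemplate(const char *tmpl, const char *arg, char *out, size_t outSize)
{
    const char *slot = strstr(tmpl, "%s");
    if (slot == NULL)
        return -1;
    /* Text before the slot, then arg, then text after the slot. */
    int n = snprintf(out, outSize, "%.*s%s%s",
                     (int)(slot - tmpl), tmpl, arg, slot + 2);
    return (n >= 0 && (size_t)n < outSize) ? 0 : -1;
}
```

Searching for the placeholder by hand, rather than passing the setting to printf as a format string, keeps a stray % in a configuration file from being interpreted as a conversion.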

15
knownToCdsSnp from hg17.txt
  • Make knownToCdsSnp table (DONE Nov 11, 2004,
    Heather)
  • ssh hgwdev
  • hgMapToGene hg17 snp knownGene knownToCdsSnp
    -all -cds
  • row count 165728
  • unique 34013
  • approx. 5 minutes running time

But there is no snp table in hg18. Try instead:
hgMapToGene hg18 snp126 knownGene knownToCdsSnp -all -cds

16
Making intronSize column
  • A new column type
  • Copy colTemplate.c to intronSize.c
  • Edit code
  • Add intronSize type to hgNear.c/.h
  • Add intronSize to columnDb
  • Make/test
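
The slides do not say what the intronSize column reports. As one possibility, assuming it is the total intron length of a gene, the core calculation for the "Edit code" step could look like this sketch. totalIntronSize is a hypothetical function, and the exon arrays are assumed to be sorted, half-open [start, end) coordinates.

```c
/* Hypothetical helper: sum of the gaps between consecutive exons.
 * exonStarts/exonEnds are assumed sorted, half-open coordinates. */
long totalIntronSize(const int *exonStarts, const int *exonEnds, int exonCount)
{
    long total = 0;
    for (int i = 1; i < exonCount; i++)
        total += exonStarts[i] - exonEnds[i - 1];
    return total;
}
```

A single-exon gene has no introns, so the loop body never runs and the result is 0.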