PaLS: Pathways and Literature Strainer

1
PaLS: Pathways and Literature Strainer
Filtering common literature, ontology terms and pathway information.
Andrés Cañada Pallarés, Instituto Nacional de Bioinformática
acanada_at_cnio.es
2
INTRODUCTION
-Studies of differential expression and, especially, gene selection in the context of classification and prediction with microarray data usually output lists of interesting genes.
-Do some of the members of those lists have a function in common, or do they belong to the same metabolic pathway?
-PaLS takes a list or a set of lists of gene or protein identifiers and shows which ones share certain descriptors.
-Variable selection with microarray data (where number of variables >> number of samples) can lead to many solutions. Different runs of the same algorithm often return different lists of interesting genes, which is a problem for the interpretability of the results.
-PaLS allows us to try to discover the major biological themes that are shared among different solutions, even if the identity of the genes in each solution is different.
3
FUNCTIONALITY
Run.1.component.1 NM_002358 NM_001786 NM_003258 NM_001809 NM_003318 NM_020188 NM_004203 NM_004217
Run.2.component.1 NM_001809 NM_001826 NM_001827 NM_003318 NM_020242 NM_003600 . . .
-Main input file: plain text
-A list or several lists of genes/proteins
-Each list can have its own name
-Types of identifiers accepted:
  -Ensembl Gene IDs
  -UniGene Cluster IDs
  -Gene names (HUGO)
  -GenBank accessions
  -Clone IDs
  -Affymetrix IDs
  -EntrezGene IDs
  -RefSeq_RNAs
  -RefSeq_peptides
  -SwissProt Names
-Organisms accepted:
  -Human
  -Mouse
  -Rat
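As a rough illustration of reading such an input, the Python sketch below assumes one list per line, with the list name as the first token and whitespace-separated identifiers after it. That layout is only inferred from the example as it appears in this transcript and may not match the real PaLS file format; the function name `parse_lists` is hypothetical.

```python
from typing import Dict, List


def parse_lists(text: str) -> Dict[str, List[str]]:
    """Parse named identifier lists from plain text.

    Assumed layout (not confirmed by the slides): one list per line,
    first token = list name, remaining tokens = identifiers.
    """
    lists: Dict[str, List[str]] = {}
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        name, ids = tokens[0], tokens[1:]
        # Drop duplicate identifiers while keeping first-seen order.
        lists[name] = list(dict.fromkeys(ids))
    return lists


# A shortened version of the example input shown on the slide.
example = """\
Run.1.component.1 NM_002358 NM_001786 NM_003258 NM_001809
Run.2.component.1 NM_001809 NM_001826 NM_001827 NM_003318
"""
parsed = parse_lists(example)
print(parsed["Run.2.component.1"])
```

Keeping each list under its own name is what lets the later filtering steps report results per list as well as across lists.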
4
FUNCTIONALITY
-PaLS has three different methods of filtering annotations:
1.- Filter descriptors referenced by more than a given percentage of the identifiers, giving results for each list separately. Intended to discern which lists have common published information showing that their genes/proteins share a similar function.
2.- Group all lists into one list (removing duplicates) and display those descriptors that are most referenced in the global list. Intended to reveal commonalities even if they are not seen within each list.
3.- Look for those descriptors that are referenced by more than a given threshold of identifiers in more than a given percentage of lists. Intended to find commonalities present within and among sets of lists.
-Threshold values are part of the input information needed.
-Thresholds default to 50%.
-Lower values are suggested.
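The three filtering methods can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the PaLS implementation: the annotation table `ann` (identifier to set of descriptors) is assumed to be already available, "more than" is read as a strict comparison, and all function names and the toy data are hypothetical.

```python
from typing import Dict, List, Set

Lists = Dict[str, List[str]]       # list name -> identifiers
Annotations = Dict[str, Set[str]]  # identifier -> descriptors (assumed given)


def descriptor_share(ids: List[str], ann: Annotations) -> Dict[str, float]:
    """Percentage of the unique identifiers in `ids` carrying each descriptor."""
    unique = list(dict.fromkeys(ids))
    counts: Dict[str, int] = {}
    for gene in unique:
        for d in ann.get(gene, set()):
            counts[d] = counts.get(d, 0) + 1
    return {d: 100.0 * c / len(unique) for d, c in counts.items()}


def filter_per_list(lists: Lists, ann: Annotations, pct: float) -> Dict[str, Set[str]]:
    """Method 1: descriptors referenced by more than `pct` percent, per list."""
    return {
        name: {d for d, s in descriptor_share(ids, ann).items() if s > pct}
        for name, ids in lists.items()
    }


def filter_global(lists: Lists, ann: Annotations, pct: float) -> Set[str]:
    """Method 2: pool all lists (duplicates removed), then filter once."""
    pooled = [g for ids in lists.values() for g in ids]
    return {d for d, s in descriptor_share(pooled, ann).items() if s > pct}


def filter_across_lists(lists: Lists, ann: Annotations,
                        pct_ids: float, pct_lists: float) -> Set[str]:
    """Method 3: descriptors passing `pct_ids` in more than `pct_lists` percent of lists."""
    hits: Dict[str, int] = {}
    for descs in filter_per_list(lists, ann, pct_ids).values():
        for d in descs:
            hits[d] = hits.get(d, 0) + 1
    return {d for d, n in hits.items() if 100.0 * n / len(lists) > pct_lists}


# Toy data: identifiers g1..g6 and their descriptors are made up.
lists = {"run1": ["g1", "g2", "g3"], "run2": ["g3", "g4"], "run3": ["g5", "g6"]}
ann = {
    "g1": {"cell cycle"},
    "g2": {"cell cycle", "nucleus"},
    "g3": {"nucleus"},
    "g4": {"nucleus", "mitosis"},
    "g5": {"mitosis"},
    "g6": set(),
}
print(filter_per_list(lists, ann, 50))
print(filter_global(lists, ann, 30))
print(filter_across_lists(lists, ann, 50, 50))
```

In the toy data no descriptor passes the global filter at 50, while all three pass at 30, which is consistent with the advice that lower thresholds are often more informative.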
5
OUTPUT
The most time-consuming process is the first search. After that, the user can change thresholds for each type of descriptor and filtering method, obtaining an answer in a short time (Redo Analysis button, see figure later).
-The output consists of lists of those descriptors that fulfill the threshold criteria selected by the user. Every input identifier related to each descriptor is linked to IDClight to present the user with as much information as possible.
-For lists of fewer than 100 nodes, graph plots that describe the data structure of the lists are created. These plots show the genes/proteins that share at least one descriptor. The more descriptors they share, the closer they appear.
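The data behind such a plot can be sketched as a weighted edge list: two identifiers are connected if they share at least one descriptor, and the weight is the number of shared descriptors. The Python below is only an illustration; the slides do not say how PaLS builds or lays out its graphs, and the function name, the `max_nodes` parameter and the toy annotations are assumptions.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def shared_descriptor_edges(
    ann: Dict[str, Set[str]], max_nodes: int = 100
) -> Dict[Tuple[str, str], int]:
    """Edges between identifiers sharing >= 1 descriptor; weight = number shared."""
    if len(ann) >= max_nodes:
        return {}  # plots are only described for lists of fewer than 100 nodes
    edges: Dict[Tuple[str, str], int] = {}
    for a, b in combinations(sorted(ann), 2):
        shared = len(ann[a] & ann[b])
        if shared:
            edges[(a, b)] = shared
    return edges


# Toy annotations: identifiers and descriptors are made up.
toy = {
    "g1": {"cell cycle", "mitosis"},
    "g2": {"cell cycle", "mitosis", "nucleus"},
    "g3": {"nucleus"},
    "g4": {"apoptosis"},
}
edges = shared_descriptor_edges(toy)
print(edges)
```

A layout routine could then treat larger weights as stronger attraction, so that identifiers sharing more descriptors end up closer together; g4 shares nothing and would not appear in the plot.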
6
PRACTICE
-Data set from van't Veer et al. (Gene expression profiling predicts clinical outcome of breast cancer. Nature, 415(6871), 530-536)
-Lists of genes obtained using our CNIO application SignS (Díaz-Uriarte, R)
-At the 50% threshold, GO terms in most lists refer to "nucleus".
-At the 40% threshold, the term "cell cycle" appears in several of the lists. As reported in the original van't Veer et al. paper, genes involved in cell cycle are upregulated in the poor prognosis signature.
-At the 20% threshold, the term "mitosis" appears in most of the lists.
-If we examine PaLS results from Reactome at the 20% threshold, we see "Cell Cycle, Mitotic" in most of the lists.
-The list for the 6th cross-validation run shows "E2F mediated regulation of DNA replication".
7
ACKNOWLEDGMENT
-Ramón Díaz-Uriarte. Structural Biology and
Biocomputing. CNIO
-Andreu Alibés. EMBL-CRG Systems Biology Unit.
-Edward R. Morrissey. Systems Biology DTC.
University of Warwick
THANK YOU