Title: 2nd ISA-TAB workshop Outcome/Summary (to date)
This workshop is funded by the BBSRC Tools and
Resources (WODS, BB/E025080/1), with
contributions from EBI and NERC Bioinformatics
Center
Workshops on Data Standards (WODS), EBI,
Cambridge, UK, 16th, 17th and 18th June 2008
Monday, 16th: Reviewing XSLT
XSLT issues discussed and solutions
- FuGE extensions status
- ACTION (Andy) to create a portal on the FuGE page listing all the extensions with their status, contacts, links, examples etc.
- This can help maximize interactions and advertise the status of each extension (as MIBBI does for the checklists)
- E.g. the RNAi group (which is building a FuGE extension in this domain) needs to develop something to describe a microtiter plate, and the work on the Array Design can be reused
- If there is to be a FuGE-derived MAGE-ML, then this group could reuse the ADF part or other parts
- ACTION (Javier, Helen) explore encoding of the microplate representation in the data files and referencing it from the Assay
- Problem is that we do not have a final decision on whether there will be an MGED extension of FuGE
- ACTION (Helen) to check with the MAGE list / MGED board
XSLT issues discussed and solutions
- Namespace inconsistency
- Should we have FuGE controlled namespaces?
- The OBO Foundry is considering doing it for ontologies
- ACTION (Andy) to ask the list if desirable
- Namespace is a critical issue for XSL processing, not so much for other parsing methods
- This is probably a recommendation for those who wish to use XSLT for presentation purposes; a dedicated page will be set up on the ISA-TAB website to list such XSLT recommendations, ACTION (Philippe)
- Annotation overloading
- Descriptions are used as term gathering fields
- We could recommend on-the-fly term creation (collection of terms as supplied by users)?
- ACTION (Andy, Ally) add recommendations on the FuGE wiki (explain the use of FuGE)
- Paper soon out on recommendations to extend FuGE
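The namespace point above can be illustrated with a minimal Python sketch. It is not one of the workshop scripts: the namespace URIs and the element names are hypothetical, chosen only to show why a lookup written for one namespace silently finds nothing in a document that declares another, while a lookup on the local name still works.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: same structure, different default namespace.
DOC_A = '<FuGE xmlns="http://example.org/fuge/a"><Protocol name="P1"/></FuGE>'
DOC_B = '<FuGE xmlns="http://example.org/fuge/b"><Protocol name="P1"/></FuGE>'


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def find_by_qualified_name(xml_text, namespace, name):
    """Namespace-sensitive lookup of direct children of the root."""
    root = ET.fromstring(xml_text)
    return root.findall(f"{{{namespace}}}{name}")


def find_by_local_name(xml_text, name):
    """Namespace-agnostic lookup over the whole tree."""
    root = ET.fromstring(xml_text)
    return [el for el in root.iter() if local_name(el.tag) == name]
```

A template keyed to the first namespace matches DOC_A but not DOC_B, which is the inconsistency a controlled namespace would avoid.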
XSLT issues discussed and solutions
- Name attribute optionality
- When this is missing, XSLT uses the ID, giving a less human-readable transformation
- We could recommend that name is used when readability is required/preferred; ACTION (Philippe) to modify XSL templates
- Way to categorize assays is not in FuGE
- How to code technology and endpoint to categorize the assay (InvestigationComponent)?
- It can be done implicitly, but it would be useful to have these as explicit objects
- However, as there will not be a FuGE v1.1, workarounds for any issues or needs will be provided via recommendations; ACTION (Andy) to add to the FuGE wiki
- Reagents info is in ISA-TAB
- Flow cytometry examples have more depth/granularity, e.g. all the reagents are listed; they have coded it via Material
  - Even if FuGE recommends doing it via structured Protocol (see Gel-ML)
- ACTION (Andy, Ally) point them to design patterns on the FuGE wiki
- ISA-TAB can (somehow) deal with it via the Protocol Component field, just added
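The name/ID fallback described above can be sketched as follows. This is an illustration in Python rather than the XSL templates themselves; the attribute names `name` and `identifier`, the element names and the sample values are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the second element lacks the optional name.
SAMPLE = (
    "<Protocols>"
    '<GenericProtocol identifier="urn:x:1" name="RNA extraction"/>'
    '<GenericProtocol identifier="urn:x:2"/>'
    "</Protocols>"
)


def display_label(element):
    """Prefer the optional 'name' attribute; fall back to 'identifier'."""
    name = element.get("name")
    if name and name.strip():
        return name.strip()
    return element.get("identifier", "")


labels = [display_label(el) for el in ET.fromstring(SAMPLE)]
```

Here the first label is the readable name and the second is the raw identifier, which is the less readable output the recommendation tries to avoid.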
XSLT scripts library next steps
- More FuGE ML files are needed to test the current scripts
- ACTION (Ally) to give more examples to Philippe from Symba
- ACTION (Philippe) to send script to Ally
- ACTION (Philippe) to set up an XSLT page on ISA-TAB to post all the scripts
- Then the scripts will be tested with FuGE extensions, e.g. GelML
- More examples are needed to test, evaluate and finalize the scripts
- ACTION (Frank, Philippe) collaboratively finalize the scripts for GelML
- Final comment on scope of ISA-TAB in relation to FuGE
- FuGE or other XMLs are more granular/expressive
- We have to accept the fact that when transforming into ISA-TAB we will/may lose/compress some of the info
MAGE-TAB to ISA-TAB converter
Tuesday, 17th: Reviewing the ISA-TAB
Investigation File Changes and decisions (1)
- Add Investigation PubMed ID and Investigation Publication DOI to Investigation section
  - Only for papers describing work across Studies
- Studies section needs to be singular (Study)
- Comment point: Study header is sufficient to separate the sections, no need to have a start/end repeatable block
- If developers want to add a comment then this would be this is a comment thingy.
  - Comment must have as the first char
  - But in Study/Assay by adding a column comment (see Table 5 in spec v0.3) for the users
- Create a new section Study Publications where we group publication attributes, moving the id, description and date under Study Section
- Create a new section Study Design Descriptors
Investigation File Changes and decisions (2)
- All field names are case sensitive
- Edit: every field must have its first letter in upper case
- Section headers go all upper case
- To allow easy visualization when imported in a spreadsheet
- File will be interpreted as Unicode
- Any subsections within a repeatable block (Study) must remain within the block
  - But the order of the subsections within the block can vary
- Use the triplet (type, accession, source ref) consistently if an ontology/CV is used; if not, Name and Type are entered as free text (add example in the spec)
- Add Protocol Parameter Name, followed by Protocol Parameter Type, Protocol Parameter Type Term Accession and Protocol Parameter Type Term Source REF
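The triplet naming pattern above can be sketched in Python. The helper names are hypothetical and the suffixes are simply read off the Protocol Parameter Type example in the list; the final wording of the fields is for the spec to fix.

```python
def triplet_fields(base):
    """Return the three field names that make up an annotated triplet."""
    return [base, f"{base} Term Accession", f"{base} Term Source REF"]


def annotate(base, value, accession="", source_ref=""):
    """Map the triplet's field names to values.

    With free text (no ontology/CV) only the first field is filled
    and the accession and source reference stay empty.
    """
    return dict(zip(triplet_fields(base), [value, accession, source_ref]))


# Free-text entry: no accession, no source reference.
free_text = annotate("Protocol Parameter Type", "temperature")
```

Generating the names from one base string keeps the three fields consistent, which is the point of using the triplet everywhere.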
Investigation File Changes and decisions (3)
- Correct to allow for multiple values for Protocol Parameter Type/Term/Source and Study Design Type/Term/Source triplets
- Add Protocol URI and Protocol Version fields in the Study Protocol subsection
- The pointer to external file(s) allows users to provide these in the format they wish
- URI should be resolvable
- Ultimately these requirements are up to the implementers, as is making e.g. other Protocol fields mandatory
- Remove Protocol Component Parameter, Instrument Component, Software triplets and Processing fields
- Add Protocol Component Name, Protocol Component Types, Protocol Component Types Term Accession, Protocol Component Types Term Source REF
- Used for listing e.g. instruments, software, reagents, operator
- Semicolon separator; ACTION (Marco) to provide examples of options
- Clean up all the field names and make them unique by prefixing them with the name of the section, e.g. Investigation PubMed ID vs Study PubMed ID
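As a rough Python sketch of the multi-value and prefixing decisions above: the semicolon is taken as the separator under discussion (the options are still to be documented), and the function names and sample values are hypothetical.

```python
def split_values(cell, separator=";"):
    """Split a multi-valued cell, keeping empty positions so values stay aligned."""
    return [part.strip() for part in cell.split(separator)]


def zip_triplets(types, accessions, source_refs):
    """Pair up the n-th type, accession and source reference of a triplet."""
    t = split_values(types)
    a = split_values(accessions)
    s = split_values(source_refs)
    if not (len(t) == len(a) == len(s)):
        raise ValueError("triplet cells must hold the same number of values")
    return list(zip(t, a, s))


def qualify(section, field):
    """Make a field name unique by prefixing it with its section name."""
    return f"{section} {field}"
```

For example, `qualify("Study", "PubMed ID")` gives `Study PubMed ID`, distinct from `Investigation PubMed ID`. Empty positions are kept on purpose: dropping them would shift the accessions against their types.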
Study/Assay File Recommendations
- The table represents a graph and each edge needs to appear at least once; nodes do not need to be repeated, e.g.
- Microarray (technology) / Gene expression (endpoint) tab
- Document how to represent the case when 2 different analysis protocols are applied to the same set of data files
- In this case we follow MAGE-TAB by repeating vertically the data file names (only; no need to repeat the previous columns) followed by a new analysis protocol and output data file names
- Factor Values can be referenced in both Study and Assay tabs
- But the same value cannot be in both tabs; examples to be added
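The "table as graph" reading above can be sketched in Python. This is a simplification for illustration: each row is reduced to its node cells only (protocol columns are left out), empty cells are assumed to occur only as leading cells of a repeated row, and all file and sample names are made up.

```python
def edges_from_rows(rows):
    """Collect the distinct (from, to) edges implied by consecutive non-empty cells."""
    edges = set()
    for row in rows:
        nodes = [cell for cell in row if cell]
        edges.update(zip(nodes, nodes[1:]))
    return edges


# Second row repeats only the data file name, followed by the output
# of a second analysis; the earlier columns are left empty.
rows = [
    ["source1", "sample1", "data1.txt", "normalized1.txt"],
    ["", "", "data1.txt", "clustered1.txt"],
]
edges = edges_from_rows(rows)
```

Because edges are collected into a set, repeating a node vertically adds only the new edge out of it and does not duplicate the path that led to it.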
Tuesday, 17th: Tools and implementations
Scripts and tools plans
- From ISA-TAB to FuGE ML
- To be done (Phil; Ally wants this)
- Map the ISAcreator Java model to FuGE general elements
- Ally to help checking/validating the mapping
- From FuGE ML to ISA-TAB
- Work in progress, XSLT under development
- Philippe, Frank, Ally, Nigel and Andy
- ISA-TAB creation
- ISAcreator (will be open source)
- Other tools from participating systems.
- ISA-TAB validation
  - Common, minimum validation rules/scripts to be defined/developed (e.g. structure, case sensitive)
- Use part of the ISAcreator configuration as library
  -> Google doc with list of basic rules (to be identified when creating the v1 spec)
  -> The ISAcreator config code will be stripped down to the basic rules and posted on the ISA-TAB sf site (SVN)
- ISA-TAB and MAGE-TAB
  - Helen and Susanna to talk to MGED, ref to NIH grant
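To give an idea of what a "common, minimum" rule set might look like, here is a small Python sketch covering two of the decisions recorded for the Investigation file: section headers in upper case and case-sensitive field names. It is not the ISAcreator configuration; the function name, the messages and the sets of allowed labels passed in are all hypothetical.

```python
def check_labels(lines, allowed_fields, section_headers):
    """Return (line number, message) pairs for labels that break the basic rules.

    The label is the first tab-separated cell of each line.
    section_headers must be given in upper case.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        label = line.split("\t", 1)[0].strip()
        if not label:
            continue  # blank line, nothing to check
        if label.upper() in section_headers:
            if label != label.upper():
                problems.append((number, "section header must be upper case"))
        elif label not in allowed_fields:  # exact, case-sensitive match
            problems.append((number, "unknown field name"))
    return problems


lines = [
    "INVESTIGATION",
    "Investigation PubMed ID\t12345",
    "investigation publication doi\t10.1/x",
    "Study",
    "STUDY",
]
problems = check_labels(
    lines,
    allowed_fields={"Investigation PubMed ID", "Investigation Publication DOI"},
    section_headers={"INVESTIGATION", "STUDY"},
)
```

In this run the lower-case field on line 3 and the mixed-case header on line 4 are reported; the other lines pass.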
Wednesday, 18th: Next steps and publication plans
Release plans
- Release candidate 1, ISA-TAB v1
- Philippe, Marco to edit/add all the agreed changes in the spec
  -> done by June 27th
- Dave, Ally, Kieran check and review
  -> done by July 18th
- All to read and comment/suggest
  -> wiki pages will be set up on ISA-TAB site to facilitate discussion
  -> all comments received by end of August
- Philippe to fix the current ISA-TAB examples to reflect new spec
  -> Release candidate 1, ISA-TAB v1 out by mid Sept
- This version will include details on fields in the Investigation file and the list of fields allowed in the Study and Assay files
  - The specific Assay files defined by the participating communities will be listed, and new ones can be added without having to release new versions
Pending issue
- Reference system for SEND and CDISC (SDTM)
- Take this discussion to the ISA-TAB list with interested parties, Michael and Steve in particular
- Subject ID in SDTM should be the same as Source Name in ISA-TAB (or add another subject ID column?), then add the file as external
- Each SDTM file has a Study ID, Domain ID, Subject ID (2 types of these; probably we can use the USUBJID), IDVAR (column) and IDVARVAL (column value)
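Since this issue is still open, the following Python sketch only illustrates the first option, where the subject ID equals the ISA-TAB Source Name. The record layout (dictionaries keyed by STUDYID, DOMAIN and USUBJID), the function name and all values are assumptions for the example, not an agreed mapping.

```python
def link_subjects(source_names, sdtm_records, subject_key="USUBJID"):
    """Group SDTM records under the ISA-TAB Source Name they refer to.

    Records whose subject ID is not a known Source Name are returned
    separately so that they can be reviewed.
    """
    linked = {name: [] for name in source_names}
    unmatched = []
    for record in sdtm_records:
        subject = record.get(subject_key)
        if subject in linked:
            linked[subject].append(record)
        else:
            unmatched.append(record)
    return linked, unmatched


# Hypothetical records from an external file.
records = [
    {"STUDYID": "S1", "DOMAIN": "LB", "USUBJID": "animal-01"},
    {"STUDYID": "S1", "DOMAIN": "LB", "USUBJID": "animal-99"},
]
linked, unmatched = link_subjects(["animal-01", "animal-02"], records)
```

If the IDs cannot be made identical, the alternative noted above (an extra subject ID column) would replace the direct lookup with a lookup through that column.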
Publication and next workshop
- Publication suggested content
- Rationale and use case for ISA-TAB
- History from MAGE-TAB to ISA-TAB
- Present it as a format (and interface to a format), not a standard
  -> Describe scripts making it interoperable with other formats
- Examples of implementations to date
  -> Tools/systems that have output/input in this format
  -> Also (simply) more real examples from communities posted on this site
- Start writing at the end of this year, to submit early next year; journal to be decided later
- Next workshop would be a users meeting (in 2009)
- To fix minor issues, recommendations and ambiguities, and to share development approaches, components etc.