Linking research papers and research data: possibilities for a generic solution

Provided by: graham53


1
WWW2006 repositories workshop
  • Linking research papers and research data:
    possibilities for a generic solution

2
Outputs and benefits
  • The identification of:
    • workflows, norms and perceived problems in the
      use of source and output repositories
    • common attributes across disciplines
  • A generic technical specification for functional
    enhancements to source and output repositories,
    identified from a survey of active researchers
  • A pilot system that demonstrates the linking of
    holdings in a source repository (the UK Data
    Archive) to research papers stored in output
    repositories
  • The ability more conclusively to track the use
    and influence of one's published research
  • A structured means of surveying research
    publications and their associated source data
    across an entire discipline or within a specific
    research theme
  • An environment with added value: output
    repositories that link to their sources, and
    source repositories that link to their outputs,
    will expand the opportunities for dissemination
    of research and scholarship.

3
Constituency
  • Seven scientific disciplines surveyed
  • Archaeology, Astronomy, Biochemistry,
    Biosciences, Chemistry, Physics, Social Policy
  • Academic researchers (staff and PGs), independent
    researchers, government
  • 3,700 e-mail invitations despatched
  • 377 online respondents (10%)

4
Response rates
Discipline     Respondents  % of respondents
Astronomy               64              16.9
Archaeology             63              16.7
Physics                 63              16.7
Social Policy           62              16.5
Biochemistry            44              11.7
Biosciences             42              11.1
Chemistry               39              10.4
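
The figures on this slide and the previous one can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, assuming the second number in each row is that discipline's share of all respondents; the variable names are illustrative, and recomputed shares may differ from the slide's figures in the last digit.

```python
# Respondent counts per discipline, as reported on the response-rates slide.
respondents = {
    "Astronomy": 64,
    "Archaeology": 63,
    "Physics": 63,
    "Social Policy": 62,
    "Biochemistry": 44,
    "Biosciences": 42,
    "Chemistry": 39,
}
INVITATIONS = 3700  # e-mail invitations despatched (Constituency slide)

total = sum(respondents.values())
response_rate = 100 * total / INVITATIONS

# Each discipline's share of all respondents, in percent.
shares = {name: 100 * count / total for name, count in respondents.items()}

print(f"{total} respondents, overall response rate {response_rate:.1f}%")
for name, share in shares.items():
    print(f"{name:<14}{share:5.1f}%")
```

The counts sum to the 377 respondents quoted on the Constituency slide, which is where the rounded 10% response rate comes from.
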
5
Endorsement - 1
The value of direct links from source to output data

Response               University academic staff  University research assistant  PG student  Contract researcher  Independent researcher  Other  Totals
Significant advantage                         85                             18          33                   11                       2     26     175
Useful                                        78                              9          41                    5                       4      9     146
Interesting                                   24                              4           5                    3                       0      5      41
Of no interest                                 9                              0           0                    0                       0      1      10
Not sure                                       7                              0           7                    0                       1      2      17
Other                                          1                              1           0                    0                       0      1       3
Totals                                       204                             32          86                   19                       7     44     392
6
Endorsement - 2
The value of direct links from output to source data

Response               University academic staff  University research assistant  PG student  Contract researcher  Independent researcher  Other  Totals
Significant advantage                         83                             19          39                   12                       3     19     175
Useful                                        84                              9          35                    5                       3     15     151
Interesting                                   16                              2           6                    1                       1      3      29
Of no interest                                 7                              0           0                    0                       0      1       8
Not sure                                       2                              0           2                    0                       0      1       5
Other                                          5                              1           1                    0                       0      2       9
Totals                                       197                             31          83                   18                       7     41     377
7
Who uses source repositories?
Repository                        Submitters  % per discipline
Archaeology Data Service                  29                45
Brookhaven National Laboratories           4                 6
CERN                                      13                21
GenBank                                   23                51
National Crystallography Service           8                20
Protein Structures                        19                47
SuperCOSMOS                                3                 5
UK Data Archive                           11                18
UniProt                                    2                 4
Other                                     99
8
Frequency of submission
9
Who uses the Other source repositories and what
are they?
  • Archaeology
    • English Heritage, Portable Antiquities Database
  • Astronomy (40 of total Other)
    • NED (NASA/IPAC Extragalactic Database), CDS
      (Centre de Données Stellaires), ADS (Harvard),
      SIMBAD, NASA data archives, Advanced Camera for
      Surveys Science Archive, Very Large Array
      Archive, European Southern Observatory Archive,
      etc.
  • Biosciences
    • BioMagResBank

10
Source data formats
11
The 76 significant others?
  • latex.cc source code, .cif (crystallographic
    data), .pdb, .mtz, .pool, .root, .raw, .swf,
    .fla, .raw, .mpg, binary files, ChemDraw cdx,
    XWIN-NMR files, .ps files, .fla, .swf, MassLynx
    files, Mathematica, derived data in PAW-format
    ntuples, raw mass spectrometry data, Kanga, X-ray
    diffraction data, KaleidaGraphs, ATLAS.ti
    hermeneutic unit files, C/shell scripts,
    Fourier induction decay files, spectra, TeX
    source (math), etc.

12
Who assigns metadata?
Who assigns metadata to your research data?                  Academic staff  Research assistant  Postgraduate student  Contracting researcher  Independent researcher  Other  Totals
I decide which terms to use and I assign them                           118                  15                    47                      11                       5     16     212
Research colleague(s) assign metadata on the team's behalf               34                   6                     6                       3                       1      5      55
Research support staff assign metadata on the team's behalf              13                   3                     1                       3                       0      2      22
Metadata are assigned by library/information services staff               4                   0                     0                       0                       0      0       4
Metadata are assigned by the repository administrators                   29                   2                     0                       2                       0      4      37
Metadata are generated automatically                                     31                   9                     9                       4                       0     10      63
It is not known who assigns metadata                                     30                   8                    23                       0                       1      6      68
Other                                                                    12                   5                     9                       1                       1      9      37
Totals                                                                  271                  48                    95                      24                       8     52     498
13
Archaeology respondents refer in some cases to the
use of thesauri, Dublin Core, etc.
14
Key metadata
15
The Other metadata
  • Some examples
  • Archaeological period, artefact material,
    artefact type, conservation method
  • Celestial object, position and observation date
  • Chemical entity, chemical identifier (InChI)
  • Description of the instrument operating mode
  • Description of GIS processes applied, min/max
    co-ordinates, cell resolution for raster data
  • Description of experimental conditions under
    which the data was generated
  • Experimental method used
  • Protein sequence

16
Output repositories
What level of searching do you normally find sufficient when using an output repository?

Level of searching                                 % of responses  Responses
Simple - e.g. author, title, keyword, date                   59.3        223
Advanced, using a range of fields and identifiers            21.5         81
Employing Boolean logic                                       7.2         27
Using a subject thesaurus or subject headings                 1.3          5
No preference                                                 8.2         31
Other (please specify)                                        2.4          9
17
Evolving strategy
  • These early indications from the StORe
    questionnaire confirm a strategy in which:
    • the pilot middleware could provide a broad core
      generic solution
    • the middleware must be capable of accepting a
      limited number of discipline-specific add-ons
    • a standard platform for metadata can be
      established to reflect a large proportion of
      practices and needs.
  • In addition, further analysis is determining that:
    • cross-discipline data requirements must be met
      for output and source data
    • a range of different attitudes to data sharing
      will have to be supported by effective validation
      if repositories are to be accepted and effective
    • improved online support is expected to be the
      most appropriate and economical means of meeting
      expectations for help
    • there are indications of a considerable lack of
      awareness of repositories amongst academic staff
      and postgraduates.
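
The strategy on this slide (a generic core, a limited number of discipline-specific add-ons, and links that can be followed in both directions) might be modelled roughly as follows. This is a minimal sketch and not the StORe middleware: the class, the field names, the identifiers and the example records are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class RepositoryItem:
    """One holding in a source or output repository (hypothetical model)."""
    identifier: str
    title: str
    kind: str  # "source" (research data) or "output" (research paper)
    # Generic core metadata shared by every discipline.
    core: Dict[str, str] = field(default_factory=dict)
    # Discipline-specific add-ons, keyed by discipline name.
    addons: Dict[str, Dict[str, str]] = field(default_factory=dict)
    # Identifiers of linked items held in the other kind of repository.
    links: Set[str] = field(default_factory=set)


def link(source: RepositoryItem, output: RepositoryItem) -> None:
    """Record the link on both items so it can be followed either way."""
    if source.kind != "source" or output.kind != "output":
        raise ValueError("expected a source item followed by an output item")
    source.links.add(output.identifier)
    output.links.add(source.identifier)


dataset = RepositoryItem(
    identifier="source:example-0001",  # made-up identifier
    title="Example excavation dataset",
    kind="source",
    core={"creator": "A. Researcher", "date": "2006"},
    addons={"archaeology": {"period": "Roman", "artefact_material": "bronze"}},
)
paper = RepositoryItem(
    identifier="output:example-0042",  # made-up identifier
    title="Example paper based on the dataset",
    kind="output",
    core={"creator": "A. Researcher", "date": "2006"},
)

link(dataset, paper)
print(sorted(dataset.links))
print(sorted(paper.links))
```

Keeping the add-ons in a separate mapping leaves the core fields identical across disciplines, which is one way to read the slide's "broad core generic solution"; other designs are equally possible.
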

18
  • Questions?