Breaking and remaking peer review with the SPIRES databases: Our Experience

Description:

Entire literature of High-Energy Physics (HEP) Many papers from related fields ... Provide an imprimatur of quality both for the cognoscenti and the amateurs. May 2003 ...

Slides: 57
Provided by: wwwspires

1
Breaking and remaking peer review with the SPIRES
databases: Our Experience
  • Travis Brooks
  • SPIRES Scientific Databases Manager
  • Stanford Linear Accelerator Center
  • Pat Kreitz
  • Director, Technical Information Services
  • Stanford Linear Accelerator Center
  • Thanks to Ann Redfield, Michael Peskin, Louise
    Addis, Heath O'Connell, and Georgia Row for
    useful input.

2
Topics
  • Part I
  • History and current situation of SPIRES, arXiv,
    and Journals
  • Part II
  • Citation counting: our experiences and views
  • Part III
  • Speculation for the future

3
Part I
  • Some history, some current data,
  • and some guesses

4
What is SPIRES?
  • Bibliographic records for over half a million
    papers
  • Entire literature of High-Energy Physics (HEP)
  • Many papers from related fields
  • Citations for e-prints and journal articles
  • Over 25,000 searches a day
  • Main site and personnel at SLAC
  • DESY, FNAL, Durham U., Kyoto U., IHEP (Moscow)

5
arXiv
  • Since 1991
  • Makes full-text available for download
  • Links to SPIRES citation lists
  • Allows revisions
  • Divides content into hep-th, hep-ph, hep-ex and
    many other categories

6
hep-th vs. hep-ex
  • Sharp distinction between theory and experiment
  • Different from other disciplines
  • Difference between the publishing cultures of the
    HEP theorist and the HEP experimentalist

7
th vs. ex Publishing
  • Experiment
  • Large collaborations (>500 authors)
  • Difficult to referee
  • Reporting results
  • Theory (my focus)
  • Small collaborations (<10 authors)
  • Self-contained papers
  • Conversational
  • hep-th and hep-ph similar

8
hep-th (Pr)eprints: A Timeline
  • Mid 1960s: preprints sent by authors to select
    groups
  • 1969: SLAC library began the ppf (Preprints in
    Particles and Fields) list
  • Created demand for distribution
  • Legitimized preprints/preprint libraries
  • Led to anti-ppf list

9
hep-th (Pr)eprints: A Timeline
  • 1974: SPIRES-HEP database indexed preprints
  • Allowed more general, worldwide, distribution and
    retrieval of preprint titles
  • Still needed papers by mail
  • Preprints used conversationally
  • On WWW in 1991

10
hep-th (Pr)eprints: A Timeline
  • 1991: arXiv.org allowed immediate and universal
    electronic access to full-text of preprints
  • Preprints became eprints
  • Demise of all HEP journals predicted

11
Preprints not new
  • arXiv is a logical extension of the movement
    towards preprints, not a bolt from the blue
  • Preprints have a long history of use
  • Preprints are more easily distributed today

12
History of hep-th arXiv
  • arXiv is busy
  • Over 90% of papers published in Phys. Rev. D
    after 1995 were submitted to arXiv
  • But authors still publish!
  • 75% of hep-th papers (prior to 2002) have been
    published

13
When are eprints published?
  • Difference between Phys. Rev. D publication time
    and eprint appearance time
  • 6,000 articles from June 1997-2003
  • Mode at 5 months
  • 17 negative times not shown
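
A minimal sketch of how a lag distribution like the one on this slide could be computed. The date pairs below are invented for illustration and are not SPIRES data; the helper name `lag_in_months` is likewise hypothetical, and measuring the lag in whole calendar months is an assumption about the binning.

```python
from collections import Counter
from datetime import date


def lag_in_months(eprint: date, published: date) -> int:
    """Whole calendar months from eprint appearance to journal publication."""
    return (published.year - eprint.year) * 12 + (published.month - eprint.month)


# Hypothetical (eprint date, journal date) pairs, invented for illustration.
records = [
    (date(1998, 1, 10), date(1998, 6, 1)),
    (date(1999, 3, 5), date(1999, 8, 20)),
    (date(2000, 7, 1), date(2001, 2, 15)),
    (date(2001, 11, 30), date(2001, 10, 1)),  # journal date precedes eprint
    (date(2002, 2, 14), date(2002, 5, 1)),
]

lags = [lag_in_months(eprint, published) for eprint, published in records]

# Histogram of non-negative lags; negative lags are counted separately,
# mirroring the "negative times not shown" note on the slide.
histogram = Counter(lag for lag in lags if lag >= 0)
negative_count = sum(1 for lag in lags if lag < 0)
mode_lag, mode_count = histogram.most_common(1)[0]

print("lags (months):", lags)
print("mode:", mode_lag, "months;", "negative lags:", negative_count)
```

With real records, the same histogram would show where the mode falls and how many papers reached the journal before the eprint appeared.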

14
When are they published?
  • What caused the negative times?
  • Are the large delays from testing the waters?
  • Do researchers wait for peer review to determine
    if an article is worth reading?

15
When are papers read?
  • Q: When does most citing occur?
  • A: Plot the citations a published hep-th article
    receives after its arXiv submission
  • 8000 published papers in sample
  • Includes citations from journal papers and arXiv
    papers (essentially the same set)
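
One way to build such a plot, sketched under stated assumptions: each paper carries its arXiv submission date and the dates of the papers citing it, and citations are binned by whole months since submission. The paper labels and all dates are made up for illustration, and `months_between` is a hypothetical helper.

```python
from collections import Counter
from datetime import date


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end."""
    return (end.year - start.year) * 12 + (end.month - start.month)


# Hypothetical data: paper label -> (arXiv submission date, citing-paper dates).
papers = {
    "paper-A": (
        date(2000, 1, 15),
        [date(2000, 2, 1), date(2000, 3, 10), date(2000, 3, 20), date(2000, 9, 1)],
    ),
    "paper-B": (
        date(2001, 5, 2),
        [date(2001, 6, 1), date(2001, 7, 9), date(2002, 1, 5)],
    ),
}

# Count citations by months elapsed since each paper's own submission date,
# so that papers submitted in different years are aligned on one axis.
citations_by_month = Counter(
    months_between(submitted, cited_on)
    for submitted, citing_dates in papers.values()
    for cited_on in citing_dates
)
peak_month, peak_count = citations_by_month.most_common(1)[0]

for month in sorted(citations_by_month):
    print(f"month {month:2d}: {'#' * citations_by_month[month]}")
print("peak at month", peak_month)
```

Comparing the peak month with the journal lag is what supports an inference about whether readers wait for the journal version.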

16
Eprints, not journals
  • Journal lag time: 5 months
  • Citation peak occurs after eprint release, not
    journal release
  • Inference: HEP theorists don't wait for the
    journal.

17
Current hep-th situation
  • Researchers read the arXiv to find out the latest
    scientific information
  • They base their work on what is in the arXiv
  • Scientific priority is given by arXiv time stamp,
    not journal submission date
  • They barely notice if it is published

18
HEP theorists' viewpoint
  • arXiv is for immediate communication
  • A running scientific conversation
  • Overheard about a paper not sent to hep-ph:
  • "He didn't publish it, he just sent it to Phys.
    Rev. D"

19
Journals Irrelevant?
  • 75% of hep-th papers (prior to 2002) have been
    published
  • Correlation between large cite counts and
    publication
  • Journals are still very much alive

20
Why do authors publish? (4 guesses)
  • 1-Inertia
  • There is no other system as developed or as
    trusted
  • Journals are ingrained in researchers' psyches
  • But journals don't appear to be going away
    (quickly)

21
Why do authors publish?
  • 2-Feedback
  • Refereeing is useful for this paper and the next
  • The paper is already on arXiv while it is being
    refereed
  • But arXiv submissions generate comments and
    revisions as well

22
Why do authors publish?
  • 3-Professional Advancement
  • Do tenured/secure faculty publish fewer of their
    eprints?
  • Anecdotally: Witten has seven papers with 50+
    cites that exist as eprints only
  • In general: an interesting question to think about
  • If professional advancement is the sole purpose
    of peer-review, could we not do better?
  • Are we using the peer review process as a
    substitute for performance evaluation?

23
Why do authors publish?
  • 4-Archival value
  • Do authors believe that arXiv is a good archive?
  • Will arXiv-only eprints still be around
    (readable, accessible) in 100 years?
  • Perception, not reality, matters here
  • E-only journals appear no different
  • Centralization, not media, should be the concern

24
Part II
  • Cite counts and the future

25
Cite Counting
  • Cite counts present a data-driven picture of the
    hep-th eprint culture
  • Much work already (by many here today)
  • Cites to HEP eprints from journal articles are
    high and rising (Brown 2001, Youngen 1998,
    others)
  • arXiv impact factor is similar to journals
    (Fabbrichesi and Montolli, 2001)
  • Many other studies (often using SPIRES-HEP data)

26
Cite Counting
  • Cite counting for bibliometric purposes seems
    reasonable (perhaps)
  • Cite counting for peer review purposes?
  • Services like SPIRES (free) and ISI (fee) make
    cite counts available to other researchers,
    hiring committees, and tenure review boards.

27
Cite Counts = Peer Review?
  • Are citations the electronic answer to refereed
    journals?
  • Currently the only answer
  • Only one widely available
  • But not a very good answer
  • arXiv + SPIRES cite counts are not Phys. Rev.
    Lett.

28
Cites: Pros and Cons
  • SPIRES has been making citations available for
    over 25 years
  • We have noticed a few things about the process
  • Some good
  • Some bad
  • Some merely interesting

29
Advantage-Dynamic
  • Cite counts change with the field
  • Classics
  • New papers
  • Newly discovered classics
  • Ex: Weinberg's Standard Model paper
  • Few cites initially
  • Over 5,000 now
  • Ex: M. Peskin's topcite reviews

30
Advantage-Fast
  • Cite counts begin immediately after appearance
  • Electronic publishing means peer review is the
    lag time
  • Lag time makes journals archivists rather than
    communicators
  • Led to the replacement of this function by
    arXiv/SPIRES/etc.

31
Advantage-Easy
  • SPIRES tracks citations with 4 staff members
  • Total staff is about 8
  • We are not that technically sophisticated
  • We are not even especially clever!
  • Still it is non-trivial

32
Disadvantage-Accuracy
  • Speed, ease rely on electronic processing
  • Accuracy or speed?
  • Reference lists in a paper change over an
    article's life
  • What counts as a cite?
  • Which version of the paper?

33
Disadvantage-Relevance
  • Theory: Citations are a measure of what scientists
    read
  • But does citing = reading?
  • Simkin & Roychowdhury (cond-mat/0212043 and
    cond-mat/0305150)
  • Students, general public

34
Disadvantage-Relevance
  • Theory: Cites are a mark of quality
  • What about brilliant papers out of the
    mainstream?
  • Are papers really even referenced for scientific
    reasons?
  • Or are they referenced for sociological reasons?
  • Or are references simply copied?

35
Disadvantage-Relevance
  • Tongue-in-cheek reasons for not citing prior work
    (humorous, but not far off)
  • "If it's old, foreign, or old and foreign"
  • "They don't cite us either"
  • "Rain forest preservation through paper-saving"
  • "I figured if you're smart enough to read this
    paper, you already knew that!"
  • from The Scientist

36
Interesting-Importance
  • People take it seriously
  • Funding, careers, reputations, etc. are perceived
    to depend in some way on SPIRES citation data

37
Interesting-Importance
  • We receive 50 emails a day, most of them
    revolving around incorrect, incomplete, or
    missing references
  • Usually from an author whose paper was cited but
    missed
  • Often marked URGENT
  • Occasionally with panicked explanations including
    the date that the review committee is meeting
  • Sometimes accusing SPIRES of sabotage, or
    otherwise expressing outrage at a missed citation

38
Importance is helpful
  • Importance shows that cite counting is useful (or
    at least used!)
  • Users of the information are motivated to help
    maintain it
  • SPIRES is almost open source
  • We help eliminate authors' typos, they help
    eliminate our errors

39
helpful
  • SPIRES can replace bad cites with the correct
    ones
  • Corrects our errors
  • Corrects author errors
  • Even helps limit propagation of errors
  • Ex: a Witten article with 1,300 cites had 100
    incorrect cites, all the same typo
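
A sketch of why a single recurring typo is cheap to repair once it has been identified: one correction rule fixes every reference list that copied it. This is an illustration only, not a description of how SPIRES actually stores or corrects references. The journal strings, the `corrections` table, and `apply_corrections` are all hypothetical.

```python
from collections import Counter


def apply_corrections(cited_refs, corrections):
    """Replace each known-bad reference string with its corrected form."""
    return [corrections.get(ref, ref) for ref in cited_refs]


# Hypothetical reference strings extracted from citing papers.
cited = (
    ["J.Hypo.Phys.12,345"] * 3    # correct form
    + ["J.Hypo.Phys.12,354"] * 2  # same typo, copied from paper to paper
    + ["Other.J.7,89"]            # unrelated reference, left untouched
)

# One rule covers every occurrence of the repeated typo.
corrections = {"J.Hypo.Phys.12,354": "J.Hypo.Phys.12,345"}

before = Counter(cited)
after = Counter(apply_corrections(cited, corrections))

print("before:", before["J.Hypo.Phys.12,345"])
print("after: ", after["J.Hypo.Phys.12,345"])
```

The hard part in practice is discovering that the bad string is a typo at all, which is where corrections sent in by authors help.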

40
but also worrisome
  • Responsibility lies with the maintainers of the
    citation counts
  • Previously in the hands of referees and editors
  • Self-citation
  • Boost counts artificially
  • Deception
  • We have had it happen

41
Citation Counts: Summary
  • We do it, and it works
  • Fast, Easy, and Fluid
  • Valued by the Community
  • It is more than imperfect
  • Relevance and Accuracy
  • Does not yet replace traditional peer review

42
Part III
  • What would it take to truly change peer review?

43
To change peer review
  • Stakeholders in the peer review system
  • Editors
  • Referees
  • Authors
  • Readers
  • Fundamental differences between disciplines
  • hep-th and hep-ex are different in their adoption
    of eprints

44
To change peer review
  • Functions of peer review when divorced from
    communication
  • One must replace (or discard) all of these
  • Metrics for papers
  • Metrics for scientists
  • Metrics for truth?

45
Peer review = good science?
  • Peer review gives a seal of approval
  • Laypeople
  • Medicine, Environmental Science, etc.
  • Refereeing process is filled with examples of
    weakness
  • Yet it feels fundamentally sound
  • Publishers have taken this role of vetting
    science

46
Truth is more complex
  • Community acceptance determines scientific truth
  • Yesterday's sensation, today's calibration
  • The test of time is longer than the 6-month lag
    time for journal articles
  • Immediacy is needed for communication and
    conversation
  • But deliberation is needed for context and
    community judgment

47
An Opportunity
  • Place an article in the context of the
    surrounding work
  • Reference linking only a baby step
  • Degree to which a finding has been verified or
    contradicted by earlier or later work
  • Ex: M. Peskin's Topcites reviews at SLAC
  • The numbers are amusing
  • Context is the real value

48
Context
  • Another example: Particle Data Group
  • Reports data from all HEP experiments
  • Sorts and combines data
  • References to comments on validity
  • References to interpretations of the data

49
PDG Example

50
Opportunities
  • Intense scrutiny not possible for journals
  • Context is important
  • Amazon and Google
  • Personalized and dynamic
  • Citebase
  • Torii

51
A New System
  • Any new system would need to do (at least) the
    following
  • React to changes in the scientific world: you
    cannot read the same paper twice
  • Provide context as well as content
  • Be fast and easy enough to keep up with
    scientific conversations taking place on
    arXiv(es)
  • Provide an imprimatur of quality both for the
    cognoscenti and the amateurs

52
Summary
  • SLAC-SPIRES and arXiv helped transform the hep-th
    publishing environment
  • Journals play no role in communication
  • Journals are still widely used
  • Citation counting played a part in this
    transition
  • Counting is not a complete solution to peer
    review
  • New models of peer review are farther away
  • Should be richer than any current example

53
Why HEP Theory?
  • No proprietary/patent issues
  • Papers can be verified by hand, by any
    knowledgeable reader
  • Work is like a continuing dialog, each paper
    sparking new, creative ideas

54
Same basic style
  • Note that the basic publication style has not
    really changed
  • HEP Theory has not moved away from papers written
    by a few authors to more complex
    technology-enabled collaborations

55
Other Fields
  • HEP experiment has had more radical changes in
    working style
  • World's largest database (>600 TB)
  • Worldwide data processing grid
  • Close to 1000 authors on a paper
  • Technology used to push pre-paper scientific
    collaboration to new levels
  • Other fields might retain traditional journal
    roles while using unpublished research as
    additions and expansions rather than substitutes

56
Conclusions
  • HEP theorists have universally adopted eprints as
    the means of intra-field communication
  • Peer-reviewed journals are still heavily used,
    but for different purposes
  • The needs of HEP theorists were very close to the
    traditional publication model