Reinventing Science Librarianship - PowerPoint PPT Presentation

Description:

Title: PowerPoint Presentation Author: cablake Last modified by: Laura Iandoli Created Date: 1/1/1601 12:00:00 AM Document presentation format: On-screen Show (4:3) – PowerPoint PPT presentation

Transcript and Presenter's Notes

1
Reinventing Science Librarianship
  • Education for New Roles
  • Catherine Blake
  • cablake@email.unc.edu
  • http://www.ils.unc.edu/cablake
  • University of North Carolina at Chapel Hill

2
Source: The DCC Curation Lifecycle Model
3
Creation
  • Jupiter has moons
  • Galileo, Sidereus Nuncius, 1610
  • Relative sizes of the Earth, Sun and Moon
  • Aristarchus, 3rd century BC
  • This image: 10th century AD

Source: Wikipedia
4
Creation
  • Little Dipper microarray processors
  • Biology/pharmacology
  • The first beam in the Large Hadron Collider at
    CERN was successfully steered around the full 27
    kilometers of the world's most powerful particle
    accelerator

Source: http://www.scigene.com/products/little_dipper.html
http://mediaarchive.cern.ch/MediaArchive/Photo/Public/2008/0809002/0809002_01/0809002_01-A5-at-72-dpi.jpg
5
Acquisition and Collection
  • Data acquired directly from scientists
  • Heterogeneous formats
  • multi-media
  • annotations on a spreadsheet
  • Varying quality
  • experimental settings
  • Student vs verified data

6
Identification and Cataloging
  • Collectively identifying resources
  • Group think
  • Social bookmarking
  • Participatory cataloging
  • E.g., UNC photographs

7
Storage and Preservation
  • Storage
  • 92% on magnetic media
  • About 5 exabytes of new information produced in
    2002 on print, film, magnetic, and optical
    storage media
  • Preservation
  • Heterogeneous
  • Changing hardware
  • Changing software

Image Source: http://www.cray.com/products/index.html
http://www2.sims.berkeley.edu/research/projects/how-much-info-2003/
8
Barriers to access removed
  • Environment
  • New sources of information (scientists,
    granting agencies)
  • NIH-mandated access
  • Consequences
  • No single point of access
  • Different levels of access required
  • HIPAA compliance
  • Maintaining cultural norms

9
Use and Reuse
  • Data and Text Mining
  • Use data collected for a different purpose
  • E.g., a side effect of one drug becomes the
    purpose of another
  • Information Synthesis
  • Combine speculative information
  • Literature Based Discovery
  • Uncover transitive connections from text
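
The "transitive connections" idea can be sketched in a few lines: if term A co-occurs with B in some documents, and B co-occurs with C in others, but A and C never appear together, then A and C are candidates for an undiscovered link. The term pairs below are invented for illustration and the function names are hypothetical; a real system would extract co-occurrences from a literature database and rank the candidates.

```python
from collections import defaultdict

# Hypothetical term pairs that co-occur in (imaginary) article abstracts.
cooccurrences = [
    ("fish oil", "blood viscosity"),
    ("blood viscosity", "Raynaud's syndrome"),
    ("fish oil", "platelet aggregation"),
    ("platelet aggregation", "Raynaud's syndrome"),
    ("aspirin", "platelet aggregation"),
]


def build_graph(pairs):
    """Build an undirected co-occurrence graph: term -> set of neighbouring terms."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def transitive_candidates(graph, start):
    """Return terms two steps away from `start` that never co-occur with it
    directly, mapped to the intermediate terms that link them."""
    candidates = defaultdict(set)
    for b in graph[start]:
        for c in graph[b]:
            if c != start and c not in graph[start]:
                candidates[c].add(b)
    return dict(candidates)


graph = build_graph(cooccurrences)
hidden = transitive_candidates(graph, "fish oil")
for term, links in sorted(hidden.items()):
    print(f"fish oil -> {sorted(links)} -> {term}")
```

Each printed line is only a hypothesis for a scientist to evaluate, not a finding.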

10
Data Oriented Roles
  • Data Consultant
  • Share best practices for organizing and
    sharing data
  • Data Distributor
  • Scientists control the data, distributor makes
    the data available to others
  • Data Manager
  • Manager organizes and keeps the data

11
New Roles
  • Data Service Provider
  • Data conversion and pre-processing
  • Data and Text Analyst
  • Scientist provides the data, analyst applies
    visualization, data and text mining tools.
  • Embedded Roles (Data Scientist)
  • Information workflow

12
Data Oriented Roles
  • Information organization
  • Conceptual Modeling
  • Create and understand
  • ER diagrams
  • UML diagrams
  • Concept maps

13
Reference Model for an Open Archival Information System
Source: nost.gsfc.nasa.gov/isoas/presentations/oais_tutorial_200005.ppt
14
Data Oriented Roles
  • Conceptual → relational models
  • Good database design
  • Normalization
  • Methods to enforce
  • data quality
  • referential integrity
  • Ongoing maintenance
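
As a rough illustration of "methods to enforce data quality and referential integrity", the sketch below uses Python's built-in `sqlite3` module. The `experiment` and `measurement` tables and their columns are hypothetical, chosen only to show a normalized two-table design in which the database itself rejects bad rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical normalized schema: each measurement points at exactly one experiment.
conn.executescript("""
CREATE TABLE experiment (
    experiment_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL UNIQUE
);
CREATE TABLE measurement (
    measurement_id INTEGER PRIMARY KEY,
    experiment_id  INTEGER NOT NULL REFERENCES experiment(experiment_id),
    value          REAL NOT NULL CHECK (value >= 0)   -- simple data-quality rule
);
""")

conn.execute("INSERT INTO experiment (experiment_id, name) VALUES (1, 'assay A')")
conn.execute("INSERT INTO measurement (experiment_id, value) VALUES (1, 0.42)")


def try_insert(sql):
    """Run an INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError as err:
        return f"rejected: {err}"


# Referential integrity: experiment 99 does not exist.
orphan = try_insert("INSERT INTO measurement (experiment_id, value) VALUES (99, 1.0)")
# Data quality: negative values violate the CHECK constraint.
negative = try_insert("INSERT INTO measurement (experiment_id, value) VALUES (1, -5.0)")

print(orphan)
print(negative)
row_count = conn.execute("SELECT COUNT(*) FROM measurement").fetchone()[0]
```

Putting the rules in the schema rather than in each application means every tool that writes to the database is held to the same standard.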

15
New Roles
  • Text Mining: A case study
  • All text is not created equal
  • Things that get in the way
  • Page breaks
  • Figures
  • Tables
  • Special characters
  • Implications for preservation

16
Human readable form (PDF)
17
Data Services Case Study
18
Machine readable form
  • ></TABLE
  • ><P
  • >Scientists engage in the discovery process more
    than any other user population, yet their
    day-to-day activities are often elusive. The
    development of accurate models often requires
    that a scientist resolve conflicting
    evidence.</P
  • ><P
  • >One activity that consumes much of a scientists'
    time is <I
  • >synthesis</I
  • >, <IMG
  • SRC="/giflibrary/12/ldquo.gif"
  • BORDER="0">the dialectic combination of thesis
    and antithesis into a higher stage of truth<IMG
    SRC="/giflibrary/12/rdquo.gif"
  • BORDER="0"> (<I
  • >Merriam-Webster's Collegiate Dictionary</I
  • >, <A
  • HREF="BIB24"
  • >2004</A
  • >). This dictionary definition reflects the
    alternative viewpoints that often occur when
    multiple empirical studies explore the same
    phenomena. The synthesis activity results in an
    overall finding&nbsp;-&nbsp;a higher stage of
    truth&nbsp;-&nbsp;which scientists achieve by

19
First phase pre-processing
  • ></TABLE>
  • <P>Scientists engage in the discovery process
    more than any other user population, yet their
    day-to-day activities are often elusive. The
    development of accurate models often requires
    that a scientist resolve conflicting
    evidence.</P>
  • <P>One activity that consumes much of a
    scientists' time is <I>synthesis</I>, <IMG
    SRC="/giflibrary/12/ldquo.gif" BORDER="0">the
    dialectic combination of thesis and antithesis
    into a higher stage of truth<IMG
    SRC="/giflibrary/12/rdquo.gif" BORDER="0">
    (<I>Merriam-Webster's Collegiate Dictionary</I>,
    <A HREF="BIB24">2004</A>). This dictionary
    definition reflects the alternative viewpoints
    that often occur when multiple empirical studies
    explore the same phenomena. The synthesis
    activity results in an overall finding&nbsp;-&nbsp;
    a higher stage of truth&nbsp;-&nbsp;which
    scientists achieve by
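
One plausible reading of this first phase is that tags broken across line boundaries are stitched back together so each tag sits on one line. The sketch below assumes that reading; `rejoin_tags` is a hypothetical helper, not the presenter's actual tool, and it would mis-handle attribute values that themselves contain `>`.

```python
import re


def rejoin_tags(raw):
    """Collapse line breaks and extra whitespace that fall inside <...> tags."""
    def fix(match):
        tag = " ".join(match.group(0).split())   # newlines/runs of spaces -> one space
        return tag.replace(" >", ">")            # '<P >' -> '<P>'
    # [^<>] also matches newlines, so a tag spanning several lines is one match.
    return re.sub(r"<[^<>]*>", fix, raw)


# Abbreviated sample in the style of the machine-readable slide.
raw = (
    '<P\n'
    '>One activity is <I\n'
    '>synthesis</I\n'
    '>, <IMG\n'
    'SRC="/giflibrary/12/ldquo.gif"\n'
    'BORDER="0">the dialectic combination (<A\n'
    'HREF="BIB24"\n'
    '>2004</A\n'
    '>).</P\n'
    '>'
)

joined = rejoin_tags(raw)
print(joined)
```

Text outside the tags is left untouched, so the step is safe to run before any content-level processing.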

20
Second phase pre-processing
  • Add Identifiers
  • break paragraphs into sentences
  • Add document, section, paragraph, sentence IDs
  • Replacements
  • symbols, references
  • Output
  • Identifiers: One activity that consumes much of a
    scientists' time is synthesis the dialectic
    combination of thesis and antithesis into a
    higher stage of truth _BIB_24.
  • Identifiers: This dictionary definition reflects
    the alternative viewpoints that often occur when
    multiple empirical studies explore the same
    phenomena.
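
The steps listed on this slide might look roughly like the following. This is a sketch under assumptions: the ID format, the `_BIB_n` token pattern, and the sentence-splitting rule are all guesses at a reasonable design, and `second_phase` is a hypothetical name. The output on the slide suggests the original tool made somewhat different replacements.

```python
import html
import re


def second_phase(paragraph_html, doc_id, section_id, paragraph_id):
    """Replace symbols and references, strip markup, split into sentences,
    and attach a document/section/paragraph/sentence identifier to each."""
    text = paragraph_html
    # Symbols: quote-mark images become plain quote characters.
    text = re.sub(r'<IMG[^>]*ldquo\.gif[^>]*>', '"', text)
    text = re.sub(r'<IMG[^>]*rdquo\.gif[^>]*>', '"', text)
    # References: citation links become tokens such as _BIB_24.
    text = re.sub(r'<A\s+HREF="#?BIB(\d+)"\s*>.*?</A>', r'_BIB_\1', text)
    # Drop the remaining tags and decode entities such as &nbsp;.
    text = re.sub(r'<[^>]+>', '', text)
    text = html.unescape(text).replace('\xa0', ' ')
    # Naive split: sentence-ending punctuation, whitespace, then a capital letter.
    sentences = re.split(r'(?<=[.!?])\s+(?=[A-Z])', text.strip())
    return [
        (f"{doc_id}-{section_id}-{paragraph_id}-{i}", sentence)
        for i, sentence in enumerate(sentences, start=1)
    ]


sample = (
    '<P>One activity is <I>synthesis</I>, '
    '<IMG SRC="/giflibrary/12/ldquo.gif" BORDER="0">the dialectic combination'
    '<IMG SRC="/giflibrary/12/rdquo.gif" BORDER="0"> '
    '(<A HREF="BIB24">2004</A>). '
    'This definition reflects alternative viewpoints.</P>'
)

result = second_phase(sample, "D1", "S1", "P2")
for sentence_id, sentence in result:
    print(sentence_id, sentence)
```

Abbreviations such as "e.g." or "Dr." would defeat this splitting rule, which is one reason custom pre-processing per collection is usually needed.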

21
Text Analytics
  • Clustering
  • Categorization
  • Association Rules
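
Of the three techniques, association rules are the easiest to show in miniature. The sketch below computes support and confidence for single-term rules over a handful of invented documents; the index terms, thresholds, and function names are all assumptions for illustration, and production tools use more efficient algorithms such as Apriori.

```python
from itertools import combinations

# Hypothetical documents, each reduced to the set of index terms assigned to it.
documents = [
    {"microarray", "gene expression", "clustering"},
    {"microarray", "gene expression"},
    {"gene expression", "clustering"},
    {"microarray", "gene expression", "proteomics"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def find_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Return (lhs, rhs, support, confidence) for each rule lhs -> rhs that
    meets both thresholds, considering pairs of single terms only."""
    items = sorted(set().union(*transactions))
    found = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support({lhs}, transactions)
            if confidence >= min_confidence:
                found.append((lhs, rhs, pair_support, confidence))
    return found


rules = find_rules(documents)
for lhs, rhs, s, c in rules:
    print(f"{lhs} -> {rhs}  (support={s:.2f}, confidence={c:.2f})")
```

Support says how often the terms appear together; confidence says how often the right-hand term appears given the left-hand one.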

22
Visualization
NCI-funded research 1995-2001
23
Embedded Roles
24
Embedded Roles
  • Workflow
  • Deep understanding
  • Data formats
  • Access norms
  • Reward structures
  • Custom pre-processing

25
Closing Remarks
  • Not everyone will have every skill
  • Existing skills that will remain critical
  • Strong ties to faculty
  • Strong negotiating skills
  • Knowledge of standards and resources
  • The roles exist, but it's not clear where they
    will live within an institution

The ability to think like someone within a
discipline