What Is The Paraffin-Embedded Tissue Archive Project?

Provided by: davidfens


1
What Is The Paraffin-Embedded Tissue
Archive Project?
  • Joint venture of the Biomedical Informatics
    Facility, the Department of Pathology and
    Laboratory Medicine and the Abramson Cancer
    Center
  • Business Agreement with AIM Document Management
    to scan up to 600,000 Surgical Pathology Reports
    (SPRs) of paraffin-block archived tissues
  • Plan to populate a relational database schema,
    caTIES, with information from the scanned SPRs
  • Collaboration with the caTIES development team at
    the University of Pittsburgh to add TIFF images
    of pathology reports into the caTIES database

2
Paraffin-Embedded Tissue Archive Project: Why Is
It Important To caBIG?
  • There are vast quantities of paraffin-embedded
    tissue in archives within Surgical Pathology
    Departments throughout the country
  • The UPenn Department of Pathology and Laboratory
    Medicine has approximately 600,000
    paraffin-embedded tissue samples that date back
    to the 1940s
  • Microfiche will be scanned to create TIFF images
    of all corresponding pathology reports, and
    accession/subject IDs and full text will be
    extracted using OCR (Optical Character
    Recognition)
  • caTIES and De-ID will be used to chunk pathology
    reports and parse specific text strings
  • The resulting dataset will be uploaded to the
    caTIES database, allowing use of the caTIES query
    interface and data model
  • Conversion of the microfiche information for
    uploading into the caTIES relational database is
    required to expose these tissue samples to the
    caBIG community
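
The chunking step named above can be illustrated with a minimal sketch. This is not the caTIES or De-ID implementation; it only shows one way OCRed report text could be split into sections. The section headers and the function name `chunk_report` are hypothetical, since the slides do not describe the layout of the SPRs.

```python
import re

# Hypothetical section headers. The slides do not list the real SPR layout,
# which likely varies across decades of reports.
SECTION_HEADERS = (
    "CLINICAL HISTORY",
    "GROSS DESCRIPTION",
    "MICROSCOPIC DESCRIPTION",
    "DIAGNOSIS",
)

# A header is assumed to sit alone on its line, optionally followed by a colon.
HEADER_RE = re.compile(
    r"^[ \t]*(" + "|".join(re.escape(h) for h in SECTION_HEADERS) + r")[ \t]*:?[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def chunk_report(text):
    """Split OCRed report text into a {header: body} dictionary."""
    chunks = {}
    matches = list(HEADER_RE.finditer(text))
    for i, match in enumerate(matches):
        # Each body runs from the end of its header to the start of the next one.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        chunks[match.group(1).upper()] = text[match.end():end].strip()
    return chunks
```

Text that appears before the first recognized header (for example, a report letterhead) is ignored by this sketch. OCR errors inside a header would cause that section to be merged into the previous one, which is one reason a poor-quality roll is worth testing separately.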

3
Paraffin-Embedded Tissue Archive Project: How Do
We Plan To Achieve The Project Goals?
Borrowed from john.curtin.edu.au/society/justice
4
Paraffin-Embedded Tissue Archive Project: What
Has Been Accomplished Thus Far?
  • All of Penn's paraffin-related SPRs have been
    scanned by AIM Document Corp
  • All accession numbers for these SPRs have been
    manually keyed to ensure accuracy
  • Over 430,000 scanned SPRs have been received
  • For each of these SPRs we have received 3 files:
    a TIFF image file of the SPR, a Word document
    file of the OCRed SPR, and a text file of the
    manually entered accession number from the SPR
  • A detailed QA analysis of over 1,500 reports has
    been conducted
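
Because each SPR is delivered as three files, a simple completeness check is one plausible QA step. The sketch below assumes that the three files for a report share a base name and use the extensions `.tif`, `.doc`, and `.txt`; neither the naming scheme nor the function name `missing_files` comes from the slides.

```python
from collections import defaultdict
from pathlib import PurePath

# Assumed extensions for the TIFF image, the OCRed Word document,
# and the keyed accession-number text file.
EXPECTED = {".tif", ".doc", ".txt"}


def missing_files(filenames):
    """Return {base name: sorted list of missing extensions} for incomplete SPRs."""
    seen = defaultdict(set)
    for name in filenames:
        path = PurePath(name)
        # Group by base name; compare extensions case-insensitively.
        seen[path.stem].add(path.suffix.lower())
    return {
        stem: sorted(EXPECTED - extensions)
        for stem, extensions in seen.items()
        if EXPECTED - extensions
    }
```

An empty result means every base name in the delivery has all three files.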

5
Paraffin-Embedded Tissue Archive Project: What's
Next?
  • Test Run
  • Run a roll rated good quality and a roll rated
    poor quality through caTIES to see how well
    they are coded
  • Populate the caTIES schema with all of the
    Paraffin files
  • Run all of the files through the caTIES pipeline
  • Conduct quality assurance testing on the coded
    files
  • Set up a link between the OCRed files and the
    TIFF image files
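
The last step, linking OCRed files to their TIFF images, could be sketched as a small lookup table keyed by accession number. This is only an illustration using SQLite; the table `spr_link`, its columns, and both function names are hypothetical and are not part of the caTIES schema. It also assumes accession numbers are unique, which the slides do not state.

```python
import sqlite3


def build_link_table(records):
    """Create an in-memory table of (accession, ocr_path, tiff_path) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE spr_link ("
        "accession TEXT PRIMARY KEY, "  # assumes one report per accession number
        "ocr_path TEXT NOT NULL, "
        "tiff_path TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO spr_link VALUES (?, ?, ?)", records)
    conn.commit()
    return conn


def tiff_for(conn, accession):
    """Return the TIFF path linked to an accession number, or None if absent."""
    row = conn.execute(
        "SELECT tiff_path FROM spr_link WHERE accession = ?", (accession,)
    ).fetchone()
    return row[0] if row else None
```

With such a link in place, a user who finds a report through a text query could be shown the original scanned image, which matters when the OCR quality is poor.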

6
Paraffin-Embedded Tissue Archive Project: Good
Quality OCRed File
7
Paraffin-Embedded Tissue Archive Project: Poor
Quality OCR File