Digitization Workflow Management System for Massive Digitization Projects

The 2nd International Conference on Universal Digital Library 2006 (ICUDL 2006)

1
Digitization Workflow Management System for Massive Digitization Projects
The 2nd International Conference on Universal Digital Library 2006 (ICUDL 2006)
Mohamed Yakout (mohamed.yakout_at_bibalex.org)
Noha Adly (noha.adly_at_bibalex.org)
Magdy Nagi (magdy.nagi_at_bibalex.org)
  • Bibliotheca Alexandrina
  • November 19, 2006

2
Goals
  • Automate, track and manage the digitization
    workflow.
  • Flexibility in defining digitization workflow
    Phases.
  • Support dynamic evolution and deviations, with
    history tracking.
  • Flexible integration with the LIS and the Library
    Digital Repository.
  • Accept external, partially digitized Jobs and start
    them in the proper Phase within the digitization
    workflow.
  • Simultaneous management of multiple projects with
    a diversity of materials (books, journals,
    manuscripts, audio, video, slides, etc.)

3
Related Work
  • Manual workflow management using several software
    packages (MS Excel, MS SharePoint, MS Project)
  • Simple workflow tracking systems with limited
    capabilities
  • Software that integrates several digitization
    activities (digital capturing, image processing,
    OCRing, etc.) in one package
      • DOCWorks from CCS
      • BookRestorer from i2s
      • OUPS
  • Limitations
      • Tightly coupled with certain tools; other tools
        cannot easily be integrated.
      • No resource management (e.g. Workstations and
        users).
      • Lack of project and collection management.
      • Manual file handling between the storage server
        and clients.
      • Workflow exceptions, dynamic evolution and
        deviations can only be handled through manual
        intervention.

4
System Data Model
5
System Data Model
  • The object being digitized, for example:
      • A book by Naguib Mahfouz
      • Photos of an event
      • A map of Alexandria
      • Sheet music by Omar Khayrat

6
System Data Model
  • All types of materials in the system
      • Book
      • Manuscripts
      • Map
      • Journals
      • Audio
      • Video

7
System Data Model
  • A task that should be applied within the
    digitization process
      • Scanning
      • Processing
      • OCRing
      • Encoding
      • Publishing
      • Zipping for archiving

8
System Data Model
  • The system users with several roles
  • Digital lab operators
  • Shift operators
  • Administrator

9
System Data Model
  • Represents a logical grouping of Jobs, for example:
      • Nasser
      • AlexMed
      • AMEEL

10
System Data Model
  • The computer used to perform the Phase
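
The entities described on the data model slides can be pictured as a handful of record types. The Python sketch below is illustrative only: the class names and every field (JobType, Phase, Project, User, Workstation, Job and their attributes) are assumptions drawn from the slide text, not the actual DWMS schema, and the example Job at the end pairs a material with a project arbitrarily.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class JobType:
    """A kind of material, e.g. Book, Manuscripts, Map."""
    name: str


@dataclass
class Phase:
    """A task applied during digitization, e.g. Scanning or OCRing."""
    name: str


@dataclass
class Project:
    """A logical grouping of Jobs."""
    name: str


@dataclass
class User:
    """A system user; `role` is a hypothetical free-text field."""
    name: str
    role: str


@dataclass
class Workstation:
    """The computer used to perform a Phase."""
    hostname: str


@dataclass
class Job:
    """The object being digitized."""
    title: str
    job_type: JobType
    project: Project
    current_phase: Optional[Phase] = None
    completed_phases: List[str] = field(default_factory=list)


# Arbitrary example values, chosen only to exercise the types.
job = Job("Map of Alexandria", JobType("Map"), Project("AlexMed"))
job.current_phase = Phase("Scanning")
```

Keeping the current Phase on the Job itself, rather than on the Project, matches the slides' description of Jobs moving individually through the workflow.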

11
System Architecture
12
System Architecture
13
System Architecture
14
System Handlers
<Phase Name="Book Arabic OCR">
  <PrePhase>
    <Physical Mode="UnRestricted">
      <Folder Name="OTIFF" Create="false" ToDestination="false" NewName="OTIFF" Mode="Restircted">
        <File Name="OriginalFiles" Type="tif" Count="" ToDestination="false" Compare=""/>
      </Folder>
      . .
    </Physical>
  </PrePhase>
  <PostPhase>
    <Physical Mode="UnRestricted">
      <Folder Name="TXT" Create="false" ToDestination="true" NewName="TXT" Mode="Restircted">
        <File Name="" Type="frf" Count="1" ToDestination="true" Compare=""/>
        <File Name="" Type="art" Count="1" ToDestination="true" Compare=""/>
      </Folder>
    </Physical>
    <Database>
      <Field Name="Font" DisplayName="Font Family "/>
      <Field Name="LrnPage" DisplayName="Learn Page "/>
      . .
    </Database>
    <ReflectionCall Method="packageName.doSomething"/>
  </PostPhase>
</Phase>
  • XML Phases Definition Handler
  • Pre-Phase and Post-Phase
  • Physical section
  • Database section
  • Reflection Call
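
To make the role of the Physical section more concrete, here is a minimal Python sketch of how a handler might read a Phase definition and check a Job folder against it. It is not the DWMS implementation. The function names `expected_counts` and `check_folder` are hypothetical, and the interpretation of the attributes (that `Type` is a file extension, that a numeric `Count` means "exactly this many files", and that an empty `Count` means "no constraint") is an assumption.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Excerpt of the Post-Phase definition shown on the slide.
PHASE_XML = """
<Phase Name="Book Arabic OCR">
  <PostPhase>
    <Physical Mode="UnRestricted">
      <Folder Name="TXT" Create="false" ToDestination="true" NewName="TXT" Mode="Restircted">
        <File Name="" Type="frf" Count="1" ToDestination="true" Compare=""/>
        <File Name="" Type="art" Count="1" ToDestination="true" Compare=""/>
      </Folder>
    </Physical>
  </PostPhase>
</Phase>
"""


def expected_counts(phase_xml, section, folder):
    """Return {extension: count} for File rules that state a numeric Count."""
    root = ET.fromstring(phase_xml)
    rules = {}
    path = f"./{section}/Physical/Folder[@Name='{folder}']/File"
    for file_rule in root.findall(path):
        count = file_rule.get("Count", "")
        if count.isdigit():  # assumption: empty Count means "no constraint"
            rules[file_rule.get("Type")] = int(count)
    return rules


def check_folder(filenames, rules):
    """Return a list of problems; an empty list means the folder passes."""
    found = Counter(
        name.rsplit(".", 1)[-1].lower() for name in filenames if "." in name
    )
    return [
        f"expected {want} .{ext} file(s), found {found.get(ext, 0)}"
        for ext, want in rules.items()
        if found.get(ext, 0) != want
    ]


rules = expected_counts(PHASE_XML, "PostPhase", "TXT")
problems = check_folder(["book.frf"], rules)
```

With the single file `book.frf`, `problems` reports the missing `.art` file; a folder containing both files yields an empty list.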

18
System Architecture
19
System Architecture
20
System Architecture
21
System Architecture
22
System Modules
  • Check-In
      • Plug-in based, for integration.
      • Creates the Job in the system.
      • Assigns the Job to any Phase.
  • Check-Out
      • Uses the Java Reflection Call section of the XML
        Phases Definition.
      • Ingests the Job's digital objects into the
        repository.
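
The Reflection Call idea, naming a method in configuration and resolving it at run time, can be sketched as follows. The slides describe a Java Reflection Call; this is a Python analogue using `importlib`, offered only as an illustration. The helper name `resolve_call` is hypothetical, and a standard-library function stands in for `packageName.doSomething` so that the example runs without any project code.

```python
import importlib


def resolve_call(dotted_name):
    """Resolve a 'module.function' string to a callable at run time."""
    module_name, _, attr = dotted_name.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.function', got {dotted_name!r}")
    module = importlib.import_module(module_name)
    target = getattr(module, attr)
    if not callable(target):
        raise TypeError(f"{dotted_name} is not callable")
    return target


# Stand-in for Method="packageName.doSomething" in the Phase definition.
handler = resolve_call("posixpath.basename")
result = handler("/archive/job42/TXT/book.frf")
```

Because the method name lives in the Phase definition rather than in code, a different check-out target could in principle be selected by editing the XML alone.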

23
System Architecture
24
System Modules
  • Phases Manager
      • Request a new Job.
      • Download the Job's folders and files.
      • Submit the Job back to the system to continue
        with other Phases.
      • Reject a Job, recommend another Phase, and
        specify the reasons.
      • Redirect a Job from the default Phase Sequence.
      • Provide information at the file level to help
        solve problems.
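
The operator actions listed above (request, submit, reject with a recommended Phase and a reason) can be sketched as a small state holder that records every action in a history. This is a hedged illustration, not the Phases Manager itself: the class `TrackedJob`, its fields and the shape of the history entries are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class TrackedJob:
    name: str
    phase: str                      # Phase the Job is currently waiting in
    operator: Optional[str] = None  # who has requested the Job, if anyone
    # Each entry: (action, phase at the time, free-text detail)
    history: List[Tuple[str, str, str]] = field(default_factory=list)

    def _log(self, action, detail=""):
        self.history.append((action, self.phase, detail))

    def request(self, operator):
        """An operator takes the Job for work in its current Phase."""
        if self.operator is not None:
            raise RuntimeError(f"{self.name} is already held by {self.operator}")
        self.operator = operator
        self._log("request", operator)

    def submit(self, next_phase):
        """Hand the Job back so that it continues with another Phase."""
        self._log("submit", f"-> {next_phase}")
        self.phase, self.operator = next_phase, None

    def reject(self, recommended_phase, reason):
        """Send the Job to a recommended Phase and record the reason."""
        self._log("reject", f"-> {recommended_phase}: {reason}")
        self.phase, self.operator = recommended_phase, None


job = TrackedJob("job-001", phase="OCRing")
job.request("operator-a")
job.reject("Scanning", "page 12 is skewed")
```

Logging before the Phase changes keeps the Phase in which each action happened, which is what a later workflow-tracking report would need.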

25
System Modules (Cont'd)
  • Reporting
      • Workflow Tracking
      • Pending Items
      • Late Jobs
      • Operator rates
      • Build Customized Report
  • Archiving
      • On different media of different sizes and on
        online storage
  • Administration
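
As an illustration of the Late Jobs and operator rate reports, the sketch below computes both from a list of tracking records. Everything here is hypothetical: the record layout, the notion of a per-Phase due date, and the sample rows are assumptions made for the example and are not taken from DWMS.

```python
from collections import Counter
from datetime import date

# Hypothetical records: (job, phase, operator, due date, finish date or None)
RECORDS = [
    ("job-001", "Scanning", "operator-a", date(2006, 11, 1), date(2006, 10, 30)),
    ("job-002", "OCRing", "operator-b", date(2006, 11, 5), None),
    ("job-003", "Scanning", "operator-a", date(2006, 11, 3), date(2006, 11, 6)),
]


def late_jobs(records, today):
    """Jobs finished after their due date, or still open past it."""
    late = []
    for job, _phase, _operator, due, finished in records:
        effective = finished if finished is not None else today
        if effective > due:
            late.append(job)
    return late


def finished_per_operator(records):
    """Count of completed Phase runs per operator."""
    return dict(Counter(op for _j, _p, op, _d, fin in records if fin is not None))


late = late_jobs(RECORDS, today=date(2006, 11, 19))
rates = finished_per_operator(RECORDS)
```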

26
BA Digitization Workflow
28
Quality Assurance
  • Supported at two different stages
  • QA information is maintained at the file level as
    a Job moves from one Phase to another.
  • A QA Phase is defined in the Digitization Phase
    Sequence as the last Phase before Archiving.

29
Achieving Flexibility Using DWMS
  • The defined Phase Sequence for a Job Type is a
    guide, rather than a prescription.
  • A Phase may or may not be part of the Phase
    Sequence; the operator can assign the Job to any
    of the defined Phases.
  • Jobs can be forwarded dynamically to another
    Phase in the Phase Sequence.
  • Changes in the Phase Sequence affect both current
    and new Jobs in the system, leading to natural
    process evolution.
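
The "guide, rather than a prescription" behaviour can be sketched by looking up the suggested next Phase only when it is needed, and letting an operator override it. The sequence contents and the function names `default_next` and `next_phase` are illustrative assumptions, not DWMS code.

```python
# Shared, editable Phase Sequences keyed by Job Type (contents are illustrative).
PHASE_SEQUENCES = {
    "Book": ["Scanning", "Processing", "OCRing", "QA", "Archiving"],
}


def default_next(job_type, current_phase):
    """Suggest the next Phase from the sequence as it stands right now."""
    seq = PHASE_SEQUENCES[job_type]
    if current_phase not in seq:
        return None  # off-sequence Job: no suggestion, the operator decides
    i = seq.index(current_phase)
    return seq[i + 1] if i + 1 < len(seq) else None


def next_phase(job_type, current_phase, forward_to=None):
    """An operator's explicit choice overrides the default sequence."""
    if forward_to is not None:
        return forward_to
    return default_next(job_type, current_phase)


before = next_phase("Book", "Processing")                 # follows the guide
forwarded = next_phase("Book", "Processing", "Encoding")  # operator deviates
PHASE_SEQUENCES["Book"].insert(2, "Encoding")             # the sequence evolves
after = next_phase("Book", "Processing")                  # in-flight Job sees the change
```

Because the sequence is consulted at lookup time rather than copied into each Job, an edit to it applies to Jobs already in the system as well as to new ones.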

30
Job Life Cycle
31
Future Work
  • Check-out plug-in for Fedora.
  • Check-in plug-ins will be implemented to support
    various metadata standard formats (MODS, DC, VAR,
    etc.).
  • Enhance the software interface with graphical
    tools to help design and follow the digitization
    process.

32
Thank You
  • mohamed.yakout_at_bibalex.org