Title: Digitization of the Lester S. Levy Collection of Sheet Music
1. Digitization of the Lester S. Levy Collection of Sheet Music
- Ichiro Fujinaga
- McGill University
- with
- Michael Droettboom, Karl MacMillan, G. Sayeed Choudhury, Tim DiLauro, Mark Patton, Teal Anderson
- Levy Project II
- Digital Knowledge Center
- Sheridan Libraries
- Johns Hopkins University
2. Contents
- Levy Project
- Levy Sheet Music Collection
- Digital Workflow Management
- Optical Music Recognition
- Gamera
- Guido / NoteAbility
3. Lester S. Levy Collection
4. Lester S. Levy Collection (levysheetmusic.mse.jhu.edu)
- North American sheet music (1780-1960)
- Digitized 29,000 pieces (130,000 sheets)
- Began in 1994
- Includes The Star-Spangled Banner and Yankee Doodle
6. Lester S. Levy Collection (levysheetmusic.mse.jhu.edu)
- Database of:
  - metadata
  - images of music (8-bit gray)
  - lyrics (first lines of verse and chorus)
  - color images of cover sheets (32-bit)
7. Web Demo
8. Digital Workflow Management
- Reduce the manual intervention for large-scale digitization projects
- Creation of data repository (text, image, sound)
- Optical Music Recognition (OMR)
- Gamera
- XML-based metadata
  - composer, lyricist, arranger, performer, artist, engraver, lithographer, dedicatee, and publisher
  - cross-references for various forms of names, pseudonyms
  - authoritative versions of names and subject terms
- Music and lyric search engines
- Analysis toolkit
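As a rough illustration of the XML-based metadata listed above, the sketch below builds one record with Python's standard `xml.etree.ElementTree` and looks a person up by any form of their name. The element and attribute names (`piece`, `name`, `role`, `authoritative`, `variant`) and the sample values are hypothetical, chosen only for this example; they are not the Levy Project's actual schema.

```python
import xml.etree.ElementTree as ET

def build_record(title, names):
    """Build a hypothetical metadata record.

    names: list of (role, authoritative_name, [variant forms]) tuples.
    """
    piece = ET.Element("piece")
    ET.SubElement(piece, "title").text = title
    for role, authoritative, variants in names:
        name = ET.SubElement(piece, "name", role=role)
        ET.SubElement(name, "authoritative").text = authoritative
        for variant in variants:
            # Cross-references: other spellings or pseudonyms of the same person.
            ET.SubElement(name, "variant").text = variant
    return piece

def find_by_any_name(piece, query):
    """Return the roles whose authoritative name or any variant equals query."""
    roles = []
    for name in piece.findall("name"):
        forms = [child.text for child in name]
        if query in forms:
            roles.append(name.get("role"))
    return roles

# Made-up sample data, for illustration only.
record = build_record(
    "Example Title",
    [("composer", "Doe, Jane", ["J. Doe", "Jane Doe"]),
     ("publisher", "Example Music Co.", [])],
)
xml_text = ET.tostring(record, encoding="unicode")
```

Searching for a variant such as `"J. Doe"` resolves to the same `composer` entry as the authoritative form, which is the point of keeping cross-references next to the authoritative name.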
9. Optical Music Recognition (OMR)
- Trainable open-source OMR system in development since 1984
- Staff recognition and removal
- Lyric removal
- Stems and notehead removal
- Music symbol classifier
- Score reconstruction
- Lyric classifier?
- Optical Character Recognition (OCR)
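The staff recognition and removal step listed above can be illustrated with a horizontal projection: rows whose black-pixel count is close to the image width are treated as staff lines, and their pixels are erased unless a symbol continues directly above or below. This is a minimal sketch on a binary image stored as lists of 0/1. The function names and the 0.8 fill threshold are assumptions for illustration, and this is not necessarily the method used in the OMR system described here.

```python
def find_staff_rows(image, min_fill=0.8):
    """Return indices of rows where at least min_fill of the pixels are black (1)."""
    width = len(image[0])
    return [y for y, row in enumerate(image) if sum(row) >= min_fill * width]

def remove_staff_rows(image, staff_rows):
    """Erase staff-row pixels unless a symbol continues directly above or below."""
    height = len(image)
    staff = set(staff_rows)
    result = [row[:] for row in image]
    for y in staff_rows:
        for x, pixel in enumerate(image[y]):
            if not pixel:
                continue
            above = y > 0 and (y - 1) not in staff and image[y - 1][x]
            below = y < height - 1 and (y + 1) not in staff and image[y + 1][x]
            if not (above or below):
                result[y][x] = 0
    return result

# Toy page: one staff line (row 1) crossed by a vertical stroke at x = 3.
page = [
    [0, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
rows = find_staff_rows(page)
cleaned = remove_staff_rows(page, rows)
```

After removal the vertical stroke stays connected, which matters because the later symbol classifier works on whole connected shapes.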
10. The problem
- Suitable OCR for lyrics not found
- Commercial OCR systems are often inadequate for non-standard documents
- The market for specialized recognition of historical documents is very small
- Researchers performing document recognition often re-invent the basic image processing wheel
11. The solution
- Provide easy-to-use tools to allow domain experts (people with specialized knowledge of a collection) to create custom recognition applications
- Generalize OMR for structured documents
12. Introducing Gamera
- Framework for creation of structured document recognition systems
- Designed for domain experts
- Image processing tools (filters, binarizations, ...)
- Document segmentation and analysis
- Symbol segmentation and classification
- Syntactical and semantic analysis
- Generalized Algorithms and Methods for Enhancement and Restoration of Archives
13. Features of Gamera
- Portability (Unix, Windows, Mac)
- Extensibility (Python and C++ plugins)
- Easy to use (experts and programmers)
- Open source
- Graphical user interface
- Interactive / Batchable (scripts)
14. Gamera Interface (screenshot in Linux)
15. Gamera Interface (screenshot in Linux)
16. Histogram (screenshot in Linux)
17. Thresholding (screenshot in Linux)
18. Thresholding (screenshot in Linux)
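The histogram and thresholding views in the screenshots above correspond to a simple computation that can be sketched in a few lines. The example below counts gray levels and picks a global threshold with Otsu's method. Otsu's method is one common choice and an assumption here: the slides do not say which thresholding algorithm was shown. The pixel values are made up.

```python
def histogram(pixels, levels=256):
    """Count how many pixels take each gray level 0..levels-1."""
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    return counts

def otsu_threshold(counts):
    """Return the level t maximizing between-class variance (classes: <= t and > t)."""
    total = sum(counts)
    total_sum = sum(level * n for level, n in enumerate(counts))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the dark class
    sum0 = 0    # sum of gray levels in the dark class
    for t, n in enumerate(counts):
        w0 += n
        sum0 += t * n
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mean0 = sum0 / w0
        mean1 = (total_sum - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t

def binarize(pixels, t):
    """Map dark pixels (<= t) to 1 (ink) and light pixels to 0 (paper)."""
    return [1 if p <= t else 0 for p in pixels]

gray = [10, 12, 11, 200, 210, 205, 13, 198]   # made-up 8-bit pixel values
t = otsu_threshold(histogram(gray))
ink = binarize(gray, t)
```

The threshold lands between the dark cluster and the light cluster, so the four dark pixels become ink and the rest become paper.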
19. Staff removal: Lute tablature
21. Classifier: Lute (screenshot in Linux)
22. Staff removal: Neumes
23. Classifier: Neumes (screenshot in Linux)
24. Greek example
25. GUIDO Music Notation Format (H. Hoos, K. Renz, J. Kilian)
- A formal language for score-level representation
- Plain text: readable, platform independent
- Extensible and flexible
- Adequate representation
- NoteServer Web/Windows
- GUIDO/XML
- NoteAbility (K. Hamel)
26. NoteAbility Demo
27. Conclusions
- Levy Collection
- Searchable Metadata
- Online images (public domain) of music and cover
- Digital Workflow Management
- Optical Music Recognition
- Gamera for domain experts
- Includes an easy-to-use interactive environment for experimentation
- Beta version available on Linux
- OS X and Windows version in preparation
28. Acknowledgements
- National Science Foundation
- National Endowment for the Humanities
- Institute of Museum and Library Services
- The Levy Family
29. OMR Classifier
- Connected-component analysis
- Feature extraction, e.g.:
  - Width, height, aspect ratio
  - Number of holes
  - Central moments
- k-nearest neighbor classifier
- Genetic algorithm
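The classifier stages listed above can be sketched end to end on a toy binary image: label connected components, compute a few of the named features (width, height, aspect ratio, number of holes), and classify each component with a k-nearest-neighbor vote. This is an illustrative sketch, not the project's implementation. The function names, the connectivity choice (8-connected ink, 4-connected background), and the training vectors are assumptions made up for the example, and central moments are left out for brevity.

```python
from collections import Counter, deque

N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]

def flood(grid, start, target, seen, neighbors):
    """Collect all cells equal to target that are reachable from start."""
    queue, cells = deque([start]), []
    seen.add(start)
    while queue:
        y, x = queue.popleft()
        cells.append((y, x))
        for dy, dx in neighbors:
            ny, nx = y + dy, x + dx
            if (0 <= ny < len(grid) and 0 <= nx < len(grid[0])
                    and (ny, nx) not in seen and grid[ny][nx] == target):
                seen.add((ny, nx))
                queue.append((ny, nx))
    return cells

def connected_components(image):
    """Return a list of components, each a list of (y, x) black pixels."""
    seen, comps = set(), []
    for y, row in enumerate(image):
        for x, p in enumerate(row):
            if p == 1 and (y, x) not in seen:
                comps.append(flood(image, (y, x), 1, seen, N8))
    return comps

def features(cells):
    """Return [width, height, aspect ratio, number of holes] for one component."""
    ys = [y for y, _ in cells]
    xs = [x for _, x in cells]
    h = max(ys) - min(ys) + 1
    w = max(xs) - min(xs) + 1
    # Crop with a one-pixel white border so the outside forms a single region.
    crop = [[0] * (w + 2) for _ in range(h + 2)]
    for y, x in cells:
        crop[y - min(ys) + 1][x - min(xs) + 1] = 1
    seen = set()
    flood(crop, (0, 0), 0, seen, N4)          # mark the outside background
    holes = 0
    for y in range(h + 2):
        for x in range(w + 2):
            if crop[y][x] == 0 and (y, x) not in seen:
                flood(crop, (y, x), 0, seen, N4)   # one enclosed white region
                holes += 1
    return [w, h, w / h, holes]

def knn_classify(vector, training, k=1):
    """Majority label among the k training vectors nearest to vector."""
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    nearest = sorted(training, key=lambda item: dist(vector, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Toy image: a 3x3 ring on the left, a vertical bar on the right.
image = [
    [1, 1, 1, 0, 1],
    [1, 0, 1, 0, 1],
    [1, 1, 1, 0, 1],
]
# Made-up training vectors: [width, height, aspect ratio, holes].
training = [([3, 3, 1.0, 1], "ring"), ([1, 3, 1 / 3, 0], "bar")]
labels = [knn_classify(features(c), training) for c in connected_components(image)]
```

In a trainable system the `training` list would come from symbols labeled by a domain expert rather than being written by hand.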
30. Overall Architecture for OMR
- Image File
- Staff removal, Segmentation
- Recognition: k-NN classifier
- Output: Symbol name
- Optimization (off-line): Genetic algorithm, k-NN classifier
- Knowledge base: Feature vectors
- Best weight vector
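The off-line optimization stage in this architecture can be illustrated as follows: a genetic algorithm searches for a feature weight vector that maximizes the leave-one-out accuracy of a weighted nearest-neighbor classifier over the knowledge base. The sketch below is a toy version under assumed settings (population size, mutation scale, 1-NN, fixed random seed) with made-up feature vectors and labels. It is not the system's actual optimizer.

```python
import random

def weighted_dist(a, b, w):
    """Squared Euclidean distance with one weight per feature."""
    return sum(wi * (p - q) ** 2 for wi, p, q in zip(w, a, b))

def loo_accuracy(weights, data):
    """Leave-one-out accuracy of a weighted 1-NN classifier on data."""
    correct = 0
    for i, (vec, label) in enumerate(data):
        others = data[:i] + data[i + 1:]
        nearest = min(others, key=lambda item: weighted_dist(vec, item[0], weights))
        correct += nearest[1] == label
    return correct / len(data)

def evolve_weights(data, n_features, generations=30, pop_size=12, rng=None):
    """Search for a weight vector with high leave-one-out accuracy."""
    rng = rng or random.Random(0)
    # Start from uniform weights plus random candidates.
    population = [[1.0] * n_features] + [
        [rng.random() for _ in range(n_features)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=lambda w: loo_accuracy(w, data), reverse=True)
        parents = population[:pop_size // 2]        # keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features) if n_features > 1 else 0
            child = a[:cut] + b[cut:]               # one-point crossover
            j = rng.randrange(n_features)
            child[j] = max(0.0, child[j] + rng.gauss(0, 0.3))   # mutation
            children.append(child)
        population = parents + children
    return max(population, key=lambda w: loo_accuracy(w, data))

# Made-up knowledge base: feature 0 separates the classes, feature 1 is noise.
knowledge_base = [
    ([0.10, 9.0], "notehead"), ([0.20, 1.0], "notehead"), ([0.15, 5.0], "notehead"),
    ([0.90, 8.5], "rest"), ([0.80, 1.5], "rest"), ([0.85, 5.5], "rest"),
]
best = evolve_weights(knowledge_base, n_features=2)
```

Because the fitter half of each generation is carried over unchanged, the best accuracy in the population never decreases, so the returned vector is at least as good as the uniform weights it started from. The resulting vector plays the role of the "best weight vector" that the on-line k-NN recognition step would then use.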
31. Architecture of Gamera
- Graphical user interface (wxWindows)
- Plugins (Python)
- Plugins (C++)
- GAMERA core (C++)
32. GUIDO: An example
\beamsOff \clef<"treble"> \key<"D">
f1/8. g1/16 a1/4. d21/8 d1/4. c1/8
e11/2 _1/4 f1/8. g1/16 c21/4. b11/8
a1/4. g1/8 e1/2 f1/4 f1/8. g1/16
a1/4. d21/8 d1/4. c1/8 e11/2 _1/4
f1/8 g c21/4. b11/8 a1/4. c1/8 ,