Title: Organization of and Searching in Musical Information (a.k.a. Music Representation, Searching, and Retrieval)
- Donald Byrd
- School of Informatics & School of Music
- Indiana University
- 16 January 2007
Overview
- 1. Introduction and Motivation
- 2. Basic Representations
- 3. Why is Musical Information Hard to Handle?
- 4. Music vs. Text and Other Media
- 5. OMRAS and Other Projects
- 6. Summary
1. Introduction and Motivation
- Three basic forms (representations) of music are important
  - Audio: most important for most people (general public)
    - All Music Guide (www.allmusicguide.com) has info on well over 230,000 CDs
  - MIDI files: often best or essential for some musicians, especially for pop, rock, film/TV
    - Hundreds of thousands of MIDI files on the Web
  - CMN (Conventional Music Notation): often best, sometimes essential for musicians (even amateurs) and music researchers
    - Music holdings of Library of Congress: over 10M items
    - Includes over 6M pieces of sheet music and tens/hundreds of thousands of scores of operas, symphonies, etc.: all notation, especially Conventional Music Notation (CMN)
- Differences among the forms are profound
2. Basic Representations of Music & Audio
Audio (e.g., CD, MP3): like speech
Time-stamped Events (e.g., MIDI file): like unformatted text
Music Notation: like text with complex formatting
Basic Representations of Music & Audio
                    Audio                                          Time-stamped Events                   Music Notation
Common examples     CD, MP3 file                                   Standard MIDI File                    Sheet music
Unit                Sample                                         Event                                 Note, clef, lyric, etc.
Explicit structure  none                                           little (partial voicing information)  much (complete voicing information)
Avg. rel. storage   2000                                           1                                     10
Convert to left     -                                              OK job: easy; good job: hard          OK job: easy; good job: hard
Convert to right    1 note: pretty easy; other: hard or very hard  OK job: hard                          -
Ideal for           music; bird/animal sounds                      music                                 music
The Four Parameters of Notes
- Four basic parameters of a definite-pitched musical note
  - 1. pitch: how high or low the sound is; perceptual analog of frequency
  - 2. duration: how long the note lasts
  - 3. loudness: perceptual analog of amplitude
  - 4. timbre or tone quality
- Above is decreasing order of importance for most Western music
- ...and decreasing order of explicitness in CMN!
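The four parameters can be pictured as the fields of a simple note record. The sketch below is only an illustration: the `Note` class, its field encodings (MIDI note numbers, quarter-note durations, MIDI-style velocity) and the `pitch_to_frequency` helper are assumptions made for the example, not something the slides define. The conversion assumes equal temperament with A above middle C at 440 Hz.

```python
from dataclasses import dataclass


@dataclass
class Note:
    """One definite-pitched note, described by the four basic parameters."""
    pitch: int       # MIDI note number (assumed encoding; 69 = A above middle C)
    duration: float  # length in quarter notes (assumed unit)
    loudness: int    # MIDI-style velocity 0-127, a rough proxy for loudness
    timbre: str      # instrument or playing-technique label, e.g. "violin"


def pitch_to_frequency(pitch: int, a4_hz: float = 440.0) -> float:
    """Map a MIDI note number to a frequency in Hz (equal temperament assumed)."""
    return a4_hz * 2.0 ** ((pitch - 69) / 12)


middle_c = Note(pitch=60, duration=1.0, loudness=80, timbre="violin")
a4 = Note(pitch=69, duration=2.0, loudness=64, timbre="violin")

print(pitch_to_frequency(a4.pitch))                  # 440.0
print(round(pitch_to_frequency(middle_c.pitch), 2))  # 261.63
```

Pitch and duration get precise numeric fields here, while timbre is only a free-text label. That mirrors the point above about decreasing explicitness.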
How to Read Music Without Really Trying
- CMN shows at least six aspects of music
  - NP1. Pitches (how high or low): on vertical axis
  - NP2. Durations (how long): indicated by note/rest shapes
  - NP3. Loudness: indicated by signs like p, mf, etc.
  - NP4. Timbre (tone quality): indicated with words like violin, pizzicato, etc.
  - Start times: on horizontal axis
  - Voicing: mostly indicated by staff; in complex cases also shown by stem direction, beams, etc.
- See "Essentials of Music Reading" musical example.
How People Find Text Information
- What the user wants is almost always concepts
- But the computer can only recognize words
How Computers Find Text Information
- Stemming, stopping, and query expansion are all tricks to increase precision & recall (i.e., to avoid false positives & false negatives) in the face of synonyms, variant forms of words, etc.
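As a reminder of what these tricks are trying to improve, here is a minimal sketch of how precision and recall are computed for a single query. The function name and the document IDs are made up for illustration. Precision drops with false positives (retrieved but not relevant) and recall drops with false negatives (relevant but not retrieved).

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = fraction of retrieved items that are relevant
    recall    = fraction of relevant items that were retrieved
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query result: 3 documents returned, 2 of them relevant,
# out of 4 relevant documents in the whole collection.
relevant = {"d1", "d2", "d3", "d4"}
retrieved = {"d1", "d2", "d7"}

p, r = precision_recall(retrieved, relevant)
print(f"precision = {p:.2f}, recall = {r:.2f}")  # precision = 0.67, recall = 0.50
```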
3. Why is Musical Information Hard to Handle?
- 1. Units of meaning: not clear there are any, assuming music even has meaning! (all representations)
- 2. Polyphony: "parallel" independent voices, something like characters in a play (all representations)
- 3. Recognizing notes (audio only)
- 4. Other reasons
  - Musician-friendly I/O is difficult
  - Diversity: of styles of music, of people interested in music
Units of Meaning (Problem 1)
- Handling text information: nearly always via words
  - What we want is concepts; what we have is words
- Not clear anything in music is analogous to words
  - No explicit delimiters (like Chinese)
  - Experts don't agree on "word" boundaries (unlike Chinese)
- Music is always art => "meaning" much more subtle!
- Are notes like words?
  - No. Relative, not absolute, pitch is important
- Are pitch intervals like words?
  - No. They're too low level: more like characters
Units of Meaning (Problem 1)
- Are pitch intervals like words?
  - No. They're too low level: more like characters
- Are pitch-interval sequences like words?
  - In some ways, but
    - Ignores rhythm
    - Ignores relationships between voices (harmony)
    - Probably little correlation with semantics
- Are chords like words? (Christy Keele)
  - If so, chord progressions may be like sentences
  - In some ways, but ignores melody & rhythm, most relevant for tonal music, etc.
- Anyway, in much music, pitch isn't important, and/or notes aren't important!
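A small sketch of why relative rather than absolute pitch matters, and of what a pitch-interval sequence is. The encoding (MIDI note numbers, intervals in semitones) and the function name are assumptions for illustration. A melody and its transposition share no absolute pitches but have identical interval sequences, and the interval sequence says nothing about rhythm.

```python
def pitch_intervals(pitches):
    """Return the successive pitch intervals, in semitones, of a monophonic melody."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


# The same four-note figure starting on C and transposed up a fifth to G.
theme_in_c = [60, 62, 64, 60]
theme_in_g = [p + 7 for p in theme_in_c]

print(set(theme_in_c) & set(theme_in_g))  # set(): no note in common
print(pitch_intervals(theme_in_c))        # [2, 2, -4]
print(pitch_intervals(theme_in_c) == pitch_intervals(theme_in_g))  # True
```

Because durations are not part of the input at all, two melodies with the same pitches but completely different rhythms would also produce the same list. This is the "ignores rhythm" limitation noted above.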
Independent Voices in Music (Problem 2)
J.S. Bach: "St. Anne" Fugue, beginning
Independent Voices in Text
- MARLENE. What I fancy is a rare steak. Gret?
- ISABELLA. I am of course a member of the / Church of England.
- GRET. Potatoes.
- MARLENE. I haven't been to church for years. / I like Christmas carols.
- ISABELLA. Good works matter more than church attendance.
- --Caryl Churchill: "Top Girls" (1982), Act 1, Scene 1
Performance (time goes from left to right)
M: What I fancy is a rare steak. Gret?                                                    I haven't been...
I:                                     I am of course a member of the Church of England.
G:                                                                    Potatoes.
Music Notation vs. Audio
- Relationship between notation and its sound is very subtle
- Not at all one symbol <=> one symbol
  - Notes w/ornaments (trills, etc.) are one => many
  - All symbols but notes are one => zero!
  - Bach F-major Toccata example
- Style-dependent
  - Swing (jazz), dotting (baroque art music)
  - Improvisation (baroque art music, jazz)
  - "Events" (20th-century art music)
- How well-defined is style-dependent
- Interpretation is difficult even for musicians
  - Can take 50-90% of lesson time for performance students
Music Perception and Music IR
- Salience is affected by texture, loudness, etc.
  - Inner voices in orchestral music rarely salient
- Streaming effects and cross-voice matching
  - produced by timbre: Wessel's illusion (Ex. 1, 2)
  - produced by register: Telemann example (Ex. 3)
- Octave identities, timbre and texture
  - Beethoven "Hammerklavier" Sonata example (Ex. 4, 5)
  - Affects pitch-interval matching
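One way to see how octave identity can affect pitch-interval matching is sketched below. This is a toy illustration only: the melodies are invented (they are not the examples on the slide), and reducing intervals modulo 12 is just one assumed way a matcher might treat octave-displaced notes as equivalent.

```python
def pitch_intervals(pitches):
    """Successive intervals in semitones (octave-sensitive)."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def pitch_class_intervals(pitches):
    """Successive intervals reduced modulo 12, so octave displacement is ignored."""
    return [(b - a) % 12 for a, b in zip(pitches, pitches[1:])]


original = [60, 62, 64]   # C, D, E in one register
displaced = [60, 74, 64]  # same note names, but the D is an octave higher

print(pitch_intervals(original))    # [2, 2]
print(pitch_intervals(displaced))   # [14, -10]
print(pitch_class_intervals(original) == pitch_class_intervals(displaced))  # True
```

The trade-off is that folding octaves also discards contour: a rising major seventh (+11) and a falling minor second (-1) both reduce to 11, so this kind of matching gains octave-tolerant hits at the cost of more false positives.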
4. Music vs. Text and Other Media
                  Least explicit structure              Medium explicit structure  Most explicit structure   Salience increasers
Music             audio                                 events                     notation                  loud, thin texture
Text              audio (speech)                        ordinary written text      text with markup          headlining: large, bold, etc.
Images            photo, bitmap                         PostScript                 drawing-program file      bright color
Video w/o sound   videotape                             MPEG?                      Premiere file             motion, etc.
Biological data   DNA sequences, 3D protein structures  MEDLINE abstracts          ??
Features of Music & Text Analogies
- Simultaneous independent voices and texture
  - Analogy in text: characters in a play
- Chords within a voice
  - Analogy in text: character in a play writing something visible to the audience while saying something different out loud
- Rhythm
  - Analogy in text: rhythm in poetry
- Notes and intervals
  - Note pitches rarely important
  - Intervals more significant, but still very low-level
  - Analogy in text: interval = (very roughly!) letter, not word
Features of Text & Music Analogies
- Words
  - Analogy in music: for practical purposes, none
- Sentences
  - Analogy in music: phrases (but much less explicit)
- Paragraphs
  - Analogy in music: sections of a movement (but less explicit)
- Chapters
  - Analogy in music: movements
Course Overview
- II. Organization of Musical Information (music representation)
  - What we want is concepts; what we have is words
  - Audio, MIDI, notation
- III. Finding Musical Information
  - A Similarity Scale for Content-Based Music IR
- IV. Musical Similarity and Finding Music by Content
- V. Finding music via Metadata
  - Digital music libraries (Variations2), iTunes, etc.
  - Music recommender systems
1. Programming in R: No Problem!
- R is very interactive: can use as powerful calculator
- Assignments will be fairly simple
- Much help available from Don & other students
- Why R?
  - NOT because it's great for statistics!
  - easy to do simple things with it, including graphs and handling audio files
  - probably not good for complex programs
  - free, available for all popular operating systems
  - very interactive => easy to experiment
  - has good documentation
  - In use in other Music Informatics classes; standardizing is good
1. Rudiments of R
- Originally for statistics; good for far more
- How to get R
  - Web site: http://cran.us.r-project.org/
  - Versions for Linux, Mac OS X, Windows
  - Already on STC Windows machines; will be in M373
- Tutorial
  - http://xavier.informatics.indiana.edu/craphael/teach/symbolic_music/
- Can use R interactively as a powerful graphing, musicing, etc. calculator
  - but it's not perfect: sometimes very cryptic
Typke's MIR System Survey
- Rainer Typke's "MIR Systems: A Survey of Music Information Retrieval Systems" lists many systems
  - http://mirsystems.info/
- Commercial system: Shazam
- Some research systems can be used over the Web, incl.
  - C-Brahms
  - Meldex/Greenstone
  - Mu-seek
  - MusicSurfer
  - Musipedia/Tuneserver/Melodyhound
  - QBH at NYU
  - Themefinder
Machinery to Evaluate Music-IR Research
- Problem: how do we know if one system is really better than another, or an earlier version?
- Solution: standardized tasks, databases, evaluation
  - In use for speech recognition, text IR, question answering, etc.
  - Important example: TREC (Text Retrieval Conference)
- For music IR, we now have...
  - IMIRSEL (International Music Information Retrieval Systems Evaluation Laboratory) project
    - http://www.music-ir.org/evaluation/
  - MIREX (Music IR Evaluation eXchange): modeled on TREC
    - 2005: audio only
    - 2006: audio and symbolic
Collections (a.k.a. Databases) (1 of 2)
- Collections are improving, but very slowly
- For research: poor to fair
  - "Candidate Music IR Test Collections"
    - http://mypage.iu.edu/donbyrd/MusicTestCollections.HTML
  - Representation: CMN vs. CMN
- For practical use: pathetic (symbolic) to good (pop audio)
  - Most are commercial, especially audio
- Very little free/public domain
  - especially audio! (cf. RWC)
- IPR issues are a total mess
Collections (a.k.a. Databases) (2 of 2)
- Why is so little available?
  - Symbolic form: no efficient way to enter
    - Solution: OMR? AMR? Both are research challenges
  - Music is an art!
    - Cf. "Searching CMN" slides: chicken & egg problem
  - IPR issues are a total mess
6. Summary (1 of 2)
- Basic representations of music: audio, events, notation
- Fundamental difference: amount of explicit structure
- Have very different characteristics => each is by far best for some users and/or applications
- Converting to reduce structure much easier than to add
- Music in all forms very hard to handle, mostly because of
  - Units of meaning problem
  - Polyphony
- Both problems are much less serious with text
6. Summary (2 of 2)
- Projects include
  - Audio-based via recognition of polyphonic music (OMRAS, query-by-humming, etc.)
  - CMN-based: monophonic query vs. polyphonic database (emphasis on UI) (OMRAS)
  - Style-genre identification from audio
  - Creative applications: music IR for improvisation, etc.
- Machinery to evaluate research is coming along (MIREX)
- Collections
  - for research: poor to fair
  - for practical use: pathetic (symbolic) to good (pop audio)
  - improving, but...
- Serious problems with IPR as well as technology