Locating Cover Songs and Alternate Performances in Databases of Raw Audio


1
Locating Cover Songs and Alternate Performances
in Databases of Raw Audio
  • Robert Turetsky
  • rjt72_at_columbia.edu
  • Advent Workshop
  • May 17, 2002

2
Technology enables liquid music
  • Production
  • Consumption
  • Distribution

3
Content-Based Analysis: Motivation
  • Search on file-sharing systems (e.g. KaZaA)
    involves meta-data
      • Meta-data prone to errors, omission, distortion
      • Only works if user already knows what to look for
  • Musical Content Analysis means:
      • Query by humming
      • Query by segment/prototype
      • Recommendation engines and artist discovery
      • Machine feedback/collaboration in composition
  • Locating cover songs is a first step

4
Locating Cover Songs: Prior Work
  • Query By Humming
      • Mature field (kiosks, applets) but limited to
        monophonic music or manually transcribed
        polyphonic music
  • Jonathan Foote (FX Palo Alto)
      • ARTHUR (2000): aligns RMS energy. Works only on
        orchestral music; pop music has less dynamic
        range.
      • Content-Based Retrieval of Music and Audio
        (1997): measures acoustic similarity, not
        equivalence.
  • Cheng Yang (Stanford)
      • Music Database Retrieval Based on Spectral
        Similarity (2001): aligns MFCCs at points of high
        energy using DTW.
      • MACS (2001): aligns estimates of pitch
        likelihood. Indexing. Bad alignments discarded
        after linearity filter.

5
Why is locating cover songs so difficult?
  • Alternate performances can vary
      • Studio vs. live
      • Tempo (non-linear time shifting)
      • Pitch transposition
      • Production technique, acoustic character
      • Additions (e.g. audience interaction)
      • Alternate lyrics (e.g. Don't Cry versions I and
        II)
  • Cover versions, artist re-interpretations
      • Vocalist, instrumentation, ornamentation
      • Entire character changes (e.g. Layla, dance
        remixes)
  • Yet we still know these songs are the same!

6
System Overview
  • Preprocessing: Locate Section Breaks, Identify
    Summary Sections, Pitch Extraction
  • Query: Tonic Estimation, Alignment

7
Phase 1: Locate Section Breaks
  • Employ Foote's Similarity Matrix
  • Theory: Windows of the same section will have similar
    features. Windows of different sections will
    have different features.
  • Similarity Matrix: Cosine distance between every
    pair of fixed-width windows of the song
  • Novelty Score: measure of newness, computed as the
    correlation with a checkerboard matrix.
  • Section breaks are peaks in the Novelty Score.
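
The Phase 1 steps can be sketched roughly as follows. This is a minimal illustration, not the presenter's implementation: the slides do not say which features are used, so the feature matrix is an assumption, as are the names similarity_matrix, checkerboard_kernel, novelty_score and the half_width parameter. The sketch uses cosine similarity (higher means more alike), which is the usual reading of the cosine measure in this context.

```python
import numpy as np

def similarity_matrix(features):
    """Cosine similarity between every pair of feature windows (one window per row)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against all-zero windows
    return unit @ unit.T

def checkerboard_kernel(half_width):
    """+1 in the two within-section quadrants, -1 in the two cross-section quadrants."""
    signs = np.concatenate([-np.ones(half_width), np.ones(half_width)])
    return np.outer(signs, signs)

def novelty_score(sim, half_width):
    """Correlate the checkerboard kernel along the main diagonal of sim."""
    n = sim.shape[0]
    kernel = checkerboard_kernel(half_width)
    padded = np.pad(sim, half_width, mode="constant")  # zero-pad so edges are defined
    score = np.zeros(n)
    for i in range(n):
        # Patch covers original windows i - half_width .. i + half_width - 1.
        patch = padded[i:i + 2 * half_width, i:i + 2 * half_width]
        score[i] = np.sum(patch * kernel)
    return score
```

Peaks of the returned score are candidate section breaks. Because of the zero padding, values within half_width windows of either end of the song are unreliable and would normally be ignored when picking peaks.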

8
Phase 2: Summary Segments
  • Motivation: Only transcribe and align salient
    segments
  • Measure of salience: Repetition
  • Method: Search for largest off-diagonal line in
    Similarity Matrix for each segment to measure
    extent of repetition (score)
  • Summary segment is the most repeated section. Prune
    rows/columns of similar sections in score matrix.
    Repeat until 45-75 sec of audio is kept
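
One way to read the select-and-prune loop above is as a greedy procedure over a section-by-section score matrix. The sketch below assumes that matrix has already been computed (the off-diagonal line search is not shown) and that scores are normalized to the range 0 to 1. The function name pick_summary_sections, the similar_thresh cutoff, and the use of row sums as the measure of "most repeated" are all assumptions made for illustration.

```python
import numpy as np

def pick_summary_sections(score, durations, min_total=45.0, max_total=75.0,
                          similar_thresh=0.5):
    """Greedily keep the most repeated sections until min_total seconds are kept."""
    score = np.array(score, dtype=float)
    np.fill_diagonal(score, 0.0)             # a section does not count as repeating itself
    active = np.ones(len(durations), dtype=bool)
    chosen, total = [], 0.0
    while active.any() and total < min_total:
        # Repetition of each active section against the other active sections.
        row_sums = np.where(active, (score * active).sum(axis=1), -np.inf)
        best = int(np.argmax(row_sums))
        similar = score[best] >= similar_thresh
        active[best] = False
        if total + durations[best] <= max_total:
            chosen.append(best)
            total += durations[best]
            active &= ~similar               # prune rows/columns of similar sections
    return chosen, total
```

For a four-section song where sections 0 and 2 repeat each other strongly and sections 1 and 3 repeat each other, the loop keeps one section from each pair.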

[Figure: score matrix with rows and columns labeled Sec 1 to Sec 4, with callouts for Section 1 and Section 4]

9
Phase 3: Pitch Extraction
  • Multi-pitch extraction algorithm based on Klapuri
    et al., 2001.
  • Works well, except in the presence of drums.

[Figure: block diagram with stages Noise Suppression, Predominant Pitch Estimation, Estimate Pitched Sound Characteristics, Estimate Voices and Iterate, Remove Found Sound from Mixture; axis label: Time]

10
Phase 3: MPE Details
  • Noise Reduction: RASTA-style filter
  • Predominant pitch estimation: Fuzzy search for
    harmonic peaks
  • Spectral Smoothing to estimate sound parameters
  • Resynthesis
  • Repeat on mixture after removal
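
The estimate, remove, repeat loop can be illustrated with a deliberately crude stand-in. This is not the Klapuri et al. algorithm: there is no noise suppression, no spectral smoothing and no resynthesis. Instead, the predominant pitch is picked by a 1/h-weighted harmonic sum over candidate fundamentals, and "removal" simply zeroes the harmonic bins of the found pitch. The function names, the candidate grid and the weighting are assumptions chosen only to show the shape of the iteration.

```python
import numpy as np

def harmonic_bins(f0, bin_hz, n_bins, n_harmonics):
    """Indices of the spectrum bins nearest to the first harmonics of f0."""
    bins = np.round(np.arange(1, n_harmonics + 1) * f0 / bin_hz).astype(int)
    return bins[bins < n_bins]

def predominant_f0(mag, bin_hz, candidates, n_harmonics=8):
    """Candidate whose harmonics carry the most 1/h-weighted magnitude."""
    best_f0, best_salience = None, -1.0
    for f0 in candidates:
        bins = harmonic_bins(f0, bin_hz, len(mag), n_harmonics)
        weights = 1.0 / np.arange(1, len(bins) + 1)
        salience = float(np.sum(mag[bins] * weights))
        if salience > best_salience:
            best_f0, best_salience = float(f0), salience
    return best_f0

def iterative_pitches(signal, sample_rate, n_voices, candidates, n_harmonics=8):
    """Estimate one pitch at a time, removing each found sound before the next pass."""
    mag = np.abs(np.fft.rfft(signal))
    bin_hz = sample_rate / len(signal)
    found = []
    for _ in range(n_voices):
        f0 = predominant_f0(mag, bin_hz, candidates, n_harmonics)
        found.append(f0)
        # Crude removal: zero the harmonic bins (this also deletes shared partials).
        mag[harmonic_bins(f0, bin_hz, len(mag), n_harmonics)] = 0.0
    return found
```

Zeroing bins also wipes out partials that the found note shares with other notes, which is the kind of over-removal that estimating each sound's spectral envelope before subtraction is meant to limit.
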
11
Phase 4-5: Query-time alignment
  • Exhaustively align summary segments
  • Two alignments needed: Pitch and Time
  • Pitch Alignment: Tonic Estimation
      • Align two piano rolls at point of maximum
        cross-correlation between note histograms
  • Temporal Alignment: Dynamic Programming (Dynamic
    Time Warp)
      • Currently investigating different weights for
        rewarding note matches, penalizing mismatches
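
Both alignments can be sketched on simplified input. The code below assumes each summary segment has already been reduced to a monophonic sequence of MIDI note numbers, one per frame, which is a simplification of a piano roll. The function names and the single mismatch_cost weight are hypothetical; the slides say the match and mismatch weights were still under investigation.

```python
import numpy as np

def estimate_transposition(notes_a, notes_b, n_pitches=128):
    """Semitone shift to add to notes_b so its note histogram best matches notes_a."""
    hist_a = np.bincount(notes_a, minlength=n_pitches)
    hist_b = np.bincount(notes_b, minlength=n_pitches)
    xcorr = np.correlate(hist_a, hist_b, mode="full")
    # Index 0 of the full correlation corresponds to lag -(len(hist_b) - 1).
    return int(np.argmax(xcorr)) - (len(hist_b) - 1)

def dtw_cost(seq_a, seq_b, mismatch_cost=1.0):
    """Dynamic time warping cost: 0 for a note match, mismatch_cost otherwise."""
    n, m = len(seq_a), len(seq_b)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = 0.0 if seq_a[i - 1] == seq_b[j - 1] else mismatch_cost
            acc[i, j] = local + min(acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
    return float(acc[n, m])

def align(notes_a, notes_b):
    """Pitch-align notes_b to notes_a, then return (shift, DTW cost)."""
    shift = estimate_transposition(notes_a, notes_b)
    shifted = [p + shift for p in notes_b]
    return shift, dtw_cost(notes_a, shifted)
```

A low cost after transposition suggests the two segments carry the same melody at different tempos or in different keys. In an exhaustive query, align would be called for every pair of summary segments.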

12
Locating Cover Songs: Future Work
  • Indexing scheme, other alignment techniques to
    improve speed of query
  • Thematic extraction to find only melody or
    harmony lines
  • Include Beat Tracking as part of score
  • Investigate harmonic analysis (identifying chord
    structure) for better features
  • Speech recognition on lyrics???