Title: A Wavelet-Based Approach to the Discovery of Themes and Motives in Melodies
1 A Wavelet-Based Approach to the Discovery of Themes and Motives in Melodies
- Gissel Velarde and David Meredith
- Aalborg University, Department of Architecture, Design and Media Technology
- EuroMAC, September 2014
2 We present
- A computational method submitted to the MIREX 2014 Discovery of Repeated Themes & Sections task
- The results on the monophonic version of the JKU Patterns Development Database
3 Ground truth: Bach's Fugue BWV 889
4 Ground truth: Chopin's Mazurka Op. 24, No. 4
5 The idea behind the method
- In the context of pattern discovery in monophonic pieces
  - With a good melodic structure in terms of segments, it should be possible to gather similar segments into clusters and rank their salience within the piece.
6 Considerations
- A good melodic structure in terms of segments
  - Is one that is closer to the ground-truth analysis (see Collins, 2014)
  - It specifies certain segments or patterns
  - These patterns can be overlapping and hierarchical
7 Considerations
- We also consider other aspects of the problem:
  - representation,
  - segmentation,
  - measuring similarity,
  - clustering of segments and
  - ranking segments according to salience
8 The method
- Follows and extends our approach to melodic segmentation and classification based on filtering with the Haar wavelet (Velarde, Weyde and Meredith, 2013)
- Uses the idea of computing a similarity matrix for window connectivity information from a generic motif discovery algorithm for sequential data (Jensen, Styczynski, Rigoutsos and Stephanopoulos, 2006)
9 Wavelet transform
- A family of functions is obtained by translations and dilations of the mother wavelet
- The wavelet coefficients of the pitch vector v for scale s and shift u are defined as the inner product of v with the scaled and shifted wavelet
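The formulas on this slide did not survive in the transcript. As a hedged reconstruction, the standard wavelet-transform definitions that match the wording above are given here; the symbol names \psi_{s,u} and w_v are our assumptions, and the Haar mother wavelet is taken from the method description (Velarde et al. 2013).

```latex
% Family of wavelets obtained by dilation (scale s) and translation (shift u)
\psi_{s,u}(t) = \frac{1}{\sqrt{s}}\,\psi\!\left(\frac{t-u}{s}\right), \qquad s > 0

% Wavelet coefficient of the pitch vector v at scale s and shift u
w_v(s,u) = \langle v, \psi_{s,u} \rangle = \sum_{t} v[t]\,\psi_{s,u}(t)

% Haar mother wavelet
\psi(t) =
\begin{cases}
  1  & 0 \le t < \tfrac{1}{2} \\
  -1 & \tfrac{1}{2} \le t < 1 \\
  0  & \text{otherwise}
\end{cases}
```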
10 Representation (Velarde et al. 2013)
New representation
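A minimal sketch of the three representations considered in this work (normalized pitch signal, wavelet coefficients, and their modulus). It is not the authors' implementation: the function names, the mean-subtraction normalization, and the padding at the end of the signal are all assumptions made for illustration. The scale is given in samples and is assumed to be even.

```python
import math

SAMPLES_PER_QN = 16  # samples per quarter note (qn)


def sample_melody(notes, rate=SAMPLES_PER_QN):
    """Turn (midi_pitch, duration_in_qn) pairs into a sampled pitch signal."""
    signal = []
    for pitch, duration in notes:
        signal.extend([float(pitch)] * round(duration * rate))
    return signal


def normalize(signal):
    """Assumed normalization: subtract the mean pitch of the signal."""
    mean = sum(signal) / len(signal)
    return [x - mean for x in signal]


def haar_coefficients(signal, scale):
    """Inner product of the signal with a Haar wavelet at every shift.

    `scale` is the wavelet support in samples (assumed even).
    Assumption: the signal is padded at the end by repeating its last value.
    """
    half = scale // 2
    padded = signal + [signal[-1]] * scale
    norm = 1.0 / math.sqrt(scale)
    coeffs = []
    for u in range(len(signal)):
        first_half = sum(padded[u:u + half])              # wavelet is +1 here
        second_half = sum(padded[u + half:u + 2 * half])  # wavelet is -1 here
        coeffs.append(norm * (first_half - second_half))
    return coeffs


# Hypothetical three-note melody: C4 and D4 for 1 qn each, then E4 for 2 qn.
pitch_signal = normalize(sample_melody([(60, 1), (62, 1), (64, 2)]))
wavelet_coeffs = haar_coefficients(pitch_signal, scale=SAMPLES_PER_QN)  # scale of 1 qn
wavelet_modulus = [abs(c) for c in wavelet_coeffs]
```

With this sign convention a rising step in the melody produces negative coefficients and a falling step produces positive ones; the modulus discards that direction.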
11 First stage segmentation (Velarde et al. 2013)
New segmentation
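The three segmentation options used in this work (constant segmentation, wavelet zero-crossings, modulus maxima) can be sketched as boundary detectors over a list of wavelet coefficients. This is an illustrative sketch only; the function names and the exact handling of zeros and plateaus are assumptions, not the authors' definitions.

```python
def constant_boundaries(n_samples, segment_length):
    """Boundaries every `segment_length` samples."""
    return list(range(0, n_samples, segment_length))


def zero_crossing_boundaries(coeffs):
    """Boundaries where the coefficients change sign (zeros are skipped)."""
    bounds, last_sign = [0], 0
    for i, c in enumerate(coeffs):
        sign = (c > 0) - (c < 0)
        if sign != 0:
            if last_sign != 0 and sign != last_sign:
                bounds.append(i)
            last_sign = sign
    return bounds


def modulus_maxima_boundaries(coeffs):
    """Boundaries at local maxima of the absolute coefficients."""
    modulus = [abs(c) for c in coeffs]
    bounds = [0]
    for i in range(1, len(modulus) - 1):
        if modulus[i] > modulus[i - 1] and modulus[i] >= modulus[i + 1]:
            bounds.append(i)
    return bounds


def split(signal, bounds):
    """Cut the signal into segments at the given boundary indices."""
    ends = bounds[1:] + [len(signal)]
    return [signal[a:b] for a, b in zip(bounds, ends)]
```

Each detector returns start indices, so the same `split` helper can be applied to whichever representation is being segmented.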
12 Segmentation
- First stage segmentation: constant segmentation, wavelet zero-crossings or modulus maxima
- Comparison: distance matrix given a measure
- Concatenation: the distance matrix is binarized given a threshold; contiguous similar diagonal segments are concatenated
- Comparison: distance matrix given a measure
- Clustering: agglomerative clusters from an agglomerative hierarchical cluster tree
- Ranking: the criterion is the sum of the length of occurrences
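The first comparison and the concatenation step can be sketched as follows. This is our reading of the slide, not the authors' code: we assume equal-length segments, a city-block distance, that "similar" means a distance at or below the threshold, and that concatenation joins runs of similar cells along a diagonal of the binarized matrix. All function names are hypothetical.

```python
def cityblock(a, b):
    """City-block (L1) distance between two equal-length segments."""
    return sum(abs(x - y) for x, y in zip(a, b))


def distance_matrix(segments, dist=cityblock):
    """Pairwise distances between all segments."""
    n = len(segments)
    return [[dist(segments[i], segments[j]) for j in range(n)] for i in range(n)]


def binarize(matrix, threshold):
    """True where two segments are considered similar."""
    return [[d <= threshold for d in row] for row in matrix]


def diagonal_runs(similar):
    """Maximal runs of True cells along each diagonal above the main one.

    A run (i, j, k) means that segments i..i+k-1 resemble segments j..j+k-1,
    so each group of k contiguous segments can be concatenated into one.
    """
    n = len(similar)
    runs = []
    for offset in range(1, n):
        i = 0
        while i + offset < n:
            if similar[i][i + offset]:
                start = i
                while i + offset < n and similar[i][i + offset]:
                    i += 1
                runs.append((start, start + offset, i - start))
            else:
                i += 1
    return runs


# Hypothetical toy input: the pattern A B C A B D, each letter one short segment.
segments = [[0, 0], [1, 1], [2, 2], [0, 0], [1, 1], [5, 5]]
runs = diagonal_runs(binarize(distance_matrix(segments), threshold=0))
```

In the toy input the only run found is `(0, 3, 2)`: segments 0 and 1 (A B) recur as segments 3 and 4, so each pair would be concatenated into a single longer segment before the second comparison.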
13 Parameter combinations
- We tested the following parameter combinations
  - MIDI pitch
  - Sampling rate: 16 samples per quarter note (qn)
  - Representation
    - normalized pitch signal, wavelet coefficients, wavelet coefficients modulus
    - Scale of representation: 1 qn
  - Segmentation
    - constant segmentation, zero crossings, modulus maxima
    - Scale of segmentation: 1 and 4 qn
  - Threshold for concatenation: 0, 0.1, 1
  - Distances
    - city-block, Euclidean, DTW
  - Agglomerative clusters from an agglomerative hierarchical cluster tree
    - Number of clusters: 7
  - Ranking criterion: sum of the length of occurrences
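The clustering and ranking settings listed above can be illustrated with a small sketch. The slides do not state the linkage method, so single linkage is an assumption here, as are the function names; the toy example uses 3 clusters rather than 7 to stay readable.

```python
def agglomerate(dist, n_clusters):
    """Merge the two closest clusters (single linkage, assumed) until
    only `n_clusters` remain. `dist` is a square distance matrix."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters.pop(b))
    return clusters


def rank_by_total_length(clusters, lengths):
    """Order clusters by the summed length of their members, longest first."""
    return sorted(clusters, key=lambda c: sum(lengths[i] for i in c), reverse=True)


# Hypothetical data: five segments described by one feature value each,
# with their lengths in samples.
values = [0.0, 0.1, 5.0, 5.2, 9.0]
lengths = [16, 16, 32, 32, 48]
dist = [[abs(x - y) for y in values] for x in values]
ranked = rank_by_total_length(agglomerate(dist, n_clusters=3), lengths)
```

Here the clusters are `[0, 1]`, `[2, 3]` and `[4]`, and the ranking places `[2, 3]` first because its occurrences add up to the greatest length.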
14 Evaluation
- As described at MIREX 2014: Discovery of Repeated Themes & Sections
  - establishment precision, establishment recall, and establishment F1 score
  - occurrence precision, occurrence recall, and occurrence F1 score
  - three-layer precision, three-layer recall, and three-layer F1 score
  - runtime, first five target proportion and first five precision
  - standard precision, recall, and F1 score
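As a worked illustration of the standard measures, the sketch below computes precision, recall and F1 from counts. We assume the usual definitions, with n_P the number of ground-truth patterns and n_Q the number of patterns returned by the algorithm; the function names and the count of matched patterns are hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def standard_scores(n_matched, n_ground_truth, n_output):
    """Standard precision, recall and F1 from pattern counts."""
    precision = n_matched / n_output if n_output else 0.0
    recall = n_matched / n_ground_truth if n_ground_truth else 0.0
    return precision, recall, f1_score(precision, recall)


# Hypothetical case: 1 exactly matched pattern, n_P = 3, n_Q = 7.
precision, recall, f1 = standard_scores(n_matched=1, n_ground_truth=3, n_output=7)
```

This hypothetical case gives a precision of about 0.14, a recall of about 0.33 and an F1 of 0.2. The establishment, occurrence and three-layer measures relax the exact-match requirement and are not covered by this sketch.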
15 Results
- On the JKU Patterns Development Database, monophonic version
  - J. S. Bach's Fugue BWV 889,
  - Beethoven's Sonata Op. 2, No. 1, Movement 3,
  - Chopin's Mazurka Op. 24, No. 4,
  - Gibbons's Silver Swan, and
  - Mozart's Sonata K.282, Movement 2.
- We selected the best combinations according to representation and segmentation.
16 Results
Fig 1. Mean F1 score: mean(F1_est, F1_occ(c=.75), F1_3, F1_occ(c=.5)).
17 Results
Fig 2. Standard F1 score.
18 Results
Fig 3. Mean runtime per piece.
19 Our MIREX Submissions: VM1 and VM2
- Combinations selected based on
  - mean F1 score: mean(F1_est, F1_occ(c=.75), F1_3, F1_occ(c=.5))
  - standard F1 score
- VM1 differs from VM2 in the following parameters
  - Normalized pitch signal representation,
  - Constant segmentation at the scale of 1 qn,
  - Threshold for concatenation 0.1.
- VM2 differs from VM1 in the following parameters
  - Wavelet coefficients representation filtered at the scale of 1 qn,
  - Modulus maxima segmentation at the scale of 4 qn,
  - Threshold for concatenation 1.
20 Our MIREX Submissions
Piece n_P n_Q P_est R_est F1_est P_occ(c=.75) R_occ(c=.75) F1_occ(c=.75) P_3 R_3 F1_3 Runtime(s) FFTP_est FFP P_occ(c=.5) R_occ(c=.5) F1_occ(c=.5) P R F1
Bach 3 7 0.87 0.95 0.91 0.63 0.72 0.67 0.51 0.65 0.57 8.5 0.95 0.6 0.63 0.72 0.67 0.14 0.33 0.2
Beethoven 7 7 0.92 0.92 0.92 0.98 0.98 0.98 0.86 0.91 0.88 31 0.76 0.8 0.89 0.93 0.91 0.57 0.57 0.57
Chopin 4 7 0.53 0.86 0.66 0.66 0.86 0.75 0.48 0.7 0.57 34.2 0.68 0.47 0.46 0.83 0.6 0 0 0
Gibbons 8 7 0.95 0.95 0.95 0.66 0.93 0.77 0.85 0.79 0.82 17.76 0.77 0.79 0.66 0.93 0.77 0.29 0.25 0.27
Mozart 9 7 0.92 0.79 0.85 0.82 0.96 0.88 0.79 0.69 0.73 23.61 0.67 0.73 0.72 0.92 0.81 0.57 0.44 0.5
mean 6.2 7 0.84 0.89 0.86 0.75 0.89 0.81 0.7 0.75 0.71 23.01 0.77 0.68 0.67 0.87 0.75 0.31 0.32 0.31
SD 2.59 0 0.17 0.07 0.12 0.15 0.11 0.12 0.19 0.1 0.14 10.34 0.11 0.14 0.15 0.09 0.12 0.26 0.22 0.23
Table 1. Results of VM1 on the JKU Patterns Development Database.
Piece n_P n_Q P_est R_est F1_est P_occ(c=.75) R_occ(c=.75) F1_occ(c=.75) P_3 R_3 F1_3 Runtime(s) FFTP_est FFP P_occ(c=.5) R_occ(c=.5) F1_occ(c=.5) P R F1
Bach 3 7 0.56 0.65 0.6 0.89 0.43 0.58 0.39 0.41 0.4 5.07 0.59 0.37 0.56 0.46 0.5 0 0 0
Beethoven 7 7 0.9 0.9 0.9 0.79 0.89 0.84 0.82 0.86 0.84 5.54 0.67 0.75 0.83 0.9 0.86 0 0 0
Chopin 4 7 0.58 0.86 0.69 0.69 0.83 0.75 0.53 0.78 0.64 5.83 0.65 0.44 0.67 0.65 0.66 0 0 0
Gibbons 8 7 0.92 0.88 0.9 0.79 0.84 0.82 0.81 0.73 0.77 2.22 0.7 0.76 0.72 0.69 0.71 0.14 0.13 0.13
Mozart 9 7 0.83 0.71 0.77 0.93 0.93 0.93 0.77 0.63 0.69 5.7 0.56 0.68 0.84 0.88 0.86 0 0 0
mean 6.2 7 0.76 0.8 0.77 0.82 0.78 0.78 0.66 0.68 0.67 4.87 0.63 0.6 0.72 0.71 0.72 0.03 0.03 0.03
SD 2.59 0 0.17 0.11 0.13 0.09 0.2 0.13 0.19 0.17 0.17 1.51 0.06 0.18 0.12 0.18 0.15 0.06 0.06 0.06
Table 2. Results of VM2 on the JKU Patterns Development Database.
- Three-layer F1 (χ²(1) = 1.8, p = 0.1797) -> no significant difference
- Standard F1 (χ²(1) = 4, p = 0.045) -> VM1 preferred
- Runtime (χ²(1) = 5, p = 0.0253) -> VM2 preferred
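The slides do not name the statistical test behind these comparisons. The reported values are consistent with a Friedman test on two related samples over the five pieces, which is an assumption on our part. For two treatments, the tie-corrected Friedman statistic reduces to (wins_a - wins_b)^2 / (wins_a + wins_b), with ties dropped. The sketch below applies this to the per-piece values from Tables 1 and 2; the function name is hypothetical.

```python
import math


def two_sample_friedman(a, b):
    """Chi-squared statistic (1 degree of freedom) and p-value for two
    related samples, counting only which sample is larger per piece."""
    wins_a = sum(x > y for x, y in zip(a, b))
    wins_b = sum(x < y for x, y in zip(a, b))
    if wins_a + wins_b == 0:
        return 0.0, 1.0  # all ties: no evidence of a difference
    stat = (wins_a - wins_b) ** 2 / (wins_a + wins_b)
    p_value = math.erfc(math.sqrt(stat / 2))  # chi-squared tail, 1 d.o.f.
    return stat, p_value


# Per-piece values (Bach, Beethoven, Chopin, Gibbons, Mozart) from Tables 1 and 2.
f1_3_vm1, f1_3_vm2 = [0.57, 0.88, 0.57, 0.82, 0.73], [0.40, 0.84, 0.64, 0.77, 0.69]
std_f1_vm1, std_f1_vm2 = [0.2, 0.57, 0, 0.27, 0.5], [0, 0, 0, 0.13, 0]
runtime_vm1, runtime_vm2 = [8.5, 31, 34.2, 17.76, 23.61], [5.07, 5.54, 5.83, 2.22, 5.7]

three_layer = two_sample_friedman(f1_3_vm1, f1_3_vm2)
standard = two_sample_friedman(std_f1_vm1, std_f1_vm2)
runtime = two_sample_friedman(runtime_vm1, runtime_vm2)
```

The statistic is symmetric, so it only indicates whether the two submissions differ; which one is preferred follows from the direction of the per-piece differences (higher F1 is better, lower runtime is better).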
21 Example: Bach's Fugue BWV 889, prototypical pattern
22 Observations
- The segmentation stage makes the most difference in the results, among the parameters tested
- In the first stage segmentation
  - The size of the scale affects the results for standard measures and runtimes
- In the first comparison
  - Zero-crossings segmentation works best with DTW
  - DTW is much more expensive to compute
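To make the cost difference concrete, here is a minimal sketch of the three distances tested. City-block and Euclidean need one pass over two equal-length segments, while this textbook dynamic time warping (DTW) fills a table with one cell per pair of samples. The DTW variant shown (absolute difference as local cost, no window constraint) is an assumption; the slides do not specify which variant was used.

```python
import math


def cityblock(a, b):
    """City-block (L1) distance; one pass over equal-length segments."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Euclidean (L2) distance; one pass over equal-length segments."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Dynamic time warping cost; fills a len(a) x len(b) table."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical contours: `stretched` repeats the first sample of `contour`.
contour = [0, 1, 2, 1, 0]
stretched = [0, 0, 1, 2, 1, 0]
shifted = [0, 0, 1, 2, 1]
```

For these toy contours, `dtw(contour, stretched)` is 0 because warping absorbs the repeated sample, whereas `cityblock(contour, shifted)` is 4 and `dtw(contour, shifted)` is 1.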
23 Observations
- In the comparison (after segmentation), city-block is dominant
  - DTW in the comparison after segmentation is not in the best combinations
  - Maybe because there is no ritardando or accelerando in this dataset and/or representation
- For standard measures and a smaller segmentation scale
  - Pitch signal works better than wavelet representation
- For non-standard measures and a larger segmentation scale
  - Modulus maxima performs slightly better than zero-crossings and constant segmentation
24 Conclusions
- Our novel wavelet-based method outperforms the methods reported by Meredith (2013) and Nieto and Farbood (2013) on the monophonic version of the JKU PDD training dataset, scoring higher on precision, recall and F1 score, and reporting faster runtimes.
25 Conclusions
- The segmentation stage makes the most difference in the results, among the parameters tested
- A small scale for first stage segmentation should be preferable for higher values of the standard measures, and a large scale should be preferable for shorter runtimes
- City-block should be preferable after segmentation
26 References
- [1] T. Collins. MIREX 2014 competition: Discovery of repeated themes and sections, 2014. http://www.music-ir.org/mirex/wiki/2014:Discovery_of_Repeated_Themes_%26_Sections. Accessed on 12 May 2014.
- [2] K. Jensen, M. Styczynski, I. Rigoutsos and G. Stephanopoulos. A generic motif discovery algorithm for sequential data. Bioinformatics, 22(1), pp. 21-28, 2006.
- [3] D. Meredith. COSIATEC and SIATECCompress: Pattern discovery by geometric compression. Competition on Discovery of Repeated Themes and Sections, MIREX 2013, Curitiba, Brazil, 2013.
- [4] O. Nieto and M. Farbood. Discovering musical patterns using audio structural segmentation techniques. Competition on Discovery of Repeated Themes and Sections, MIREX 2013, Curitiba, Brazil, 2013.
- [5] G. Velarde, T. Weyde and D. Meredith. An approach to melodic segmentation and classification based on filtering with the Haar-wavelet. Journal of New Music Research, 42(4), pp. 325-345, 2013.