Model reference

This section describes the implementation of the models provided by crema.

Chord recognition

The chord recognition model is based on the structured prediction model of McFee and Bello [1]. The implementation here has been enhanced to support inversion (bass) tracking, and predicts chords out of an effective vocabulary of 602 classes. Chord class names are based on an extended version of Harte’s [2] grammar: N corresponds to “no-chord” and X corresponds to out-of-gamut chords (usually power chords).

[1] McFee, Brian, and Juan Pablo Bello. “Structured training for large-vocabulary chord recognition.” In ISMIR, 2017.
[2] Harte, Christopher, Mark B. Sandler, Samer A. Abdallah, and Emilia Gómez. “Symbolic Representation of Musical Chords: A Proposed Syntax for Text Annotations.” In ISMIR, vol. 5, pp. 66-71, 2005.
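
To illustrate the label grammar, here is a short plain-Python sketch (the labels follow Harte's syntax as described above; the glosses are informal):

# Illustrative labels from the extended Harte grammar used by the model.
examples = {
    "C:maj": "C major triad",
    "A:min7": "A minor seventh chord",
    "G:7/3": "G dominant seventh with its third in the bass (an inversion)",
    "N": "no-chord",
    "X": "out-of-gamut chord (often a power chord)",
}
for label, meaning in examples.items():
    print(f"{label:8s}{meaning}")
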
class crema.models.chord.ChordModel

Methods

outputs([filename, y, sr])            Return the model outputs (e.g., class likelihoods)
predict([filename, y, sr, outputs])   Chord prediction
transform([filename, y, sr])          Feature transformation

outputs(filename=None, y=None, sr=None)

Return the model outputs (e.g., class likelihoods)

Parameters:
filename : str (optional)

Path to audio file

y, sr : np.ndarray, number > 0 (optional)

Audio buffer and sample rate

Note: At least one of `filename` or `y, sr` must be provided.

Returns:
outputs : dict, {str: np.ndarray}

Each key corresponds to an output name, and the value is the model’s output for the given input
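
A minimal usage sketch (the audio path is hypothetical, and the output names and array shapes depend on the trained model):

import crema

model = crema.models.chord.ChordModel()

# Run the network and collect its raw outputs.
out = model.outputs(filename='song.ogg')  # hypothetical path

# Each key names one model output; values are per-frame prediction arrays.
for name, array in out.items():
    print(name, array.shape)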

predict(filename=None, y=None, sr=None, outputs=None)

Chord prediction

Parameters:
filename : str (optional)

Path to the audio file to analyze

y, sr : np.ndarray, number > 0 (optional)

Audio signal in memory to analyze

outputs : dict, {str: np.ndarray} (optional)

Pre-computed model outputs, as given by ChordModel.outputs.

Note: At least one of `filename`, `y, sr`, or `outputs` must be provided.

Returns:
jams.Annotation, namespace='chord'

The chord estimate for the given signal.

Examples

>>> import crema
>>> import librosa
>>> model = crema.models.chord.ChordModel()
>>> chord_est = model.predict(filename=librosa.util.example_audio_file())
>>> chord_est
<Annotation(namespace='chord',
            time=0,
            duration=61.4,
            annotation_metadata=<AnnotationMetadata(...)>,
            data=<45 observations>,
            sandbox=<Sandbox(...)>)>
>>> chord_est.to_dataframe().head(5)
       time  duration  value  confidence
0  0.000000  0.092880  E:maj    0.336977
1  0.092880  0.464399    E:7    0.324255
2  0.557279  1.021678  E:min    0.448759
3  1.578957  2.693515  E:maj    0.501462
4  4.272472  1.486077  E:min    0.287264
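
Because predict accepts pre-computed outputs, the network's forward pass can be run once and decoded separately. A sketch continuing the example above:

>>> out = model.outputs(filename=librosa.util.example_audio_file())
>>> chord_est = model.predict(outputs=out)
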
transform(filename=None, y=None, sr=None)

Feature transformation

Parameters:
filename : str (optional)

Path to audio file

y, sr : np.ndarray, number > 0 (optional)

Audio buffer and sample rate

Note: At least one of `filename` or `y, sr` must be provided.
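
A minimal sketch of feature extraction from an in-memory signal (the audio path is hypothetical, and the structure of the returned features is not documented here):

import librosa
import crema

model = crema.models.chord.ChordModel()

# Load the signal, then compute the model's input features from it.
y, sr = librosa.load('song.ogg')  # hypothetical path
features = model.transform(y=y, sr=sr)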