Model reference
This section describes the implementation of models provided by crema.
Chord recognition
The chord recognition model is based on the structured prediction model of McFee and Bello [1]. The implementation here has been enhanced to support inversion (bass) tracking, and predicts chords out of an effective vocabulary of 602 classes. Chord class names are based on an extended version of Harte’s [2] grammar: N corresponds to “no-chord” and X corresponds to out-of-gamut chords (usually power chords).
[1] McFee, Brian, and Juan Pablo Bello. "Structured training for large-vocabulary chord recognition." In ISMIR, 2017.
[2] Harte, Christopher, Mark B. Sandler, Samer A. Abdallah, and Emilia Gómez. "Symbolic Representation of Musical Chords: A Proposed Syntax for Text Annotations." In ISMIR, vol. 5, pp. 66-71, 2005.
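To make the extended Harte grammar concrete, here is a minimal sketch of splitting a label such as E:min/b3 into root, quality, and bass. This is an illustrative parser, not crema's implementation; the special tokens N and X are passed through as-is.

```python
import re

def split_chord(label):
    """Split a Harte-style chord label into (root, quality, bass).

    Illustrative only -- not crema's parser. "N" (no-chord) and
    "X" (out-of-gamut) are returned as bare tokens.
    """
    if label in ("N", "X"):
        return label, None, None
    m = re.match(r"([A-G][#b]?):?([^/]*)/?(.*)", label)
    root = m.group(1)
    quality = m.group(2) or "maj"   # bare root implies major
    bass = m.group(3) or "1"        # no slash implies root position
    return root, quality, bass

print(split_chord("E:min/b3"))  # ('E', 'min', 'b3')
print(split_chord("C"))         # ('C', 'maj', '1')
print(split_chord("N"))         # ('N', None, None)
```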
class crema.models.chord.ChordModel

    Methods

        outputs([filename, y, sr])            Return the model outputs (e.g., class likelihoods)
        predict([filename, y, sr, outputs])   Chord prediction
        transform([filename, y, sr])          Feature transformation

    outputs(filename=None, y=None, sr=None)

        Return the model outputs (e.g., class likelihoods).

        Parameters:
            filename : str (optional)
                Path to the audio file
            y, sr : np.ndarray, number > 0 (optional)
                Audio buffer and sample rate

            Note: at least one of filename or (y, sr) must be provided.

        Returns:
            outputs : dict, {str: np.ndarray}
                Each key corresponds to an output name, and the value is the model's output for the given input.
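As a sketch of how the returned dictionary might be consumed, the snippet below iterates over per-frame likelihood arrays. The key names and array shapes here are illustrative assumptions, not crema's exact output names.

```python
import numpy as np

n_frames = 100  # number of analysis frames (depends on the input audio)

# Hypothetical shape of the dict returned by ChordModel.outputs();
# key names and dimensions are assumptions for illustration only.
outputs = {
    "chord_root": np.zeros((n_frames, 13)),   # 12 roots + no-chord (assumed)
    "chord_bass": np.zeros((n_frames, 13)),   # bass pitch class + no-chord (assumed)
    "chord_pitch": np.zeros((n_frames, 12)),  # chroma-like pitch activations (assumed)
}

for name, likelihoods in outputs.items():
    print(name, likelihoods.shape)
```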
    predict(filename=None, y=None, sr=None, outputs=None)

        Chord prediction.

        Parameters:
            filename : str (optional)
                Path to the audio file to analyze
            y, sr : np.ndarray, number > 0 (optional)
                Audio signal in memory to analyze
            outputs : dict, {str: np.ndarray} (optional)
                Pre-computed model outputs, as given by ChordModel.outputs

            Note: at least one of filename, (y, sr), or outputs must be provided.

        Returns:
            jams.Annotation, namespace='chord'
                The chord estimate for the given signal.
        Examples

        >>> import crema
        >>> import librosa
        >>> model = crema.models.chord.ChordModel()
        >>> chord_est = model.predict(filename=librosa.util.example_audio_file())
        >>> chord_est
        <Annotation(namespace='chord', time=0, duration=61.4,
                    annotation_metadata=<AnnotationMetadata(...)>,
                    data=<45 observations>, sandbox=<Sandbox(...)>)>
        >>> chord_est.to_dataframe().head(5)
               time  duration  value  confidence
        0  0.000000  0.092880  E:maj    0.336977
        1  0.092880  0.464399    E:7    0.324255
        2  0.557279  1.021678  E:min    0.448759
        3  1.578957  2.693515  E:maj    0.501462
        4  4.272472  1.486077  E:min    0.287264
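The per-observation dataframe shown above lends itself to simple aggregation, for example totaling the duration spent on each chord label. The sketch below uses mock data mirroring the five rows printed in the example, so it runs without crema or audio.

```python
import pandas as pd

# Mock data shaped like ChordModel.predict(...).to_dataframe(),
# copied from the five example rows above.
df = pd.DataFrame({
    "time":       [0.000000, 0.092880, 0.557279, 1.578957, 4.272472],
    "duration":   [0.092880, 0.464399, 1.021678, 2.693515, 1.486077],
    "value":      ["E:maj", "E:7", "E:min", "E:maj", "E:min"],
    "confidence": [0.336977, 0.324255, 0.448759, 0.501462, 0.287264],
})

# Total time spent on each chord label, longest first.
totals = df.groupby("value")["duration"].sum().sort_values(ascending=False)
print(totals)
```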
    transform(filename=None, y=None, sr=None)

        Feature transformation.