Convolutional and Recurrent Estimators for Music Analysis¶
The crema package provides pre-trained statistical models for a variety of music analysis tasks. All tasks are provided under a unified interface, accessed through the Analyzer functionality.
Currently, only chord recognition is supported, but more features will be introduced over time.
The crema analyzer can operate either on audio files stored on disk or on audio buffers stored in numpy arrays. The results of the analysis are stored in a JAMS object.
Installation¶
crema can be installed directly from GitHub by issuing the following command:
pip install -e git+https://github.com/bmcfee/crema.git
or from PyPI by:
pip install crema
Quick start¶
The simplest way to apply crema is via the command line:
python -m crema.analyze -o my_song.jams /path/to/my_song.ogg
Note
Any audio format supported by librosa will work here.
This command will apply all analyzers to my_song.ogg and store the outputs as my_song.jams.
For processing multiple recordings, the above approach is inefficient because it re-instantiates the models on every invocation. If you need to process a batch of recordings, it is better to do so directly in Python:
from crema.analyze import analyze
jam1 = analyze(filename='song1.ogg')
jam1.save('song1.jams')
jam2 = analyze(filename='song2.ogg')
jam2.save('song2.jams')
...
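For larger collections, the same pattern extends naturally to a loop; a minimal sketch, assuming the recordings live in a single directory (the path and the .ogg extension are illustrative):

import os
from crema.analyze import analyze

audio_dir = '/path/to/recordings'  # hypothetical dataset directory
for fname in sorted(os.listdir(audio_dir)):
    if not fname.endswith('.ogg'):
        continue
    path = os.path.join(audio_dir, fname)
    # Reusing analyze() within one process avoids re-instantiating
    # the models for every file (per the note above)
    jam = analyze(filename=path)
    jam.save(os.path.splitext(path)[0] + '.jams')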
API Reference¶
Analyzer¶
CREMA analyzer interface
crema.analyze.analyze(filename=None, y=None, sr=None)¶
Analyze a recording for all tasks.
Parameters:
- filename : str, optional
  Path to audio file
- y : np.ndarray, optional
- sr : number > 0, optional
  Audio buffer and sampling rate
Note
At least one of filename or y, sr must be provided.
Returns:
- jam : jams.JAMS
  A JAMS object containing all estimated annotations
Examples
>>> from crema.analyze import analyze
>>> import librosa
>>> jam = analyze(filename=librosa.ex('brahms'))
>>> jam
<JAMS(file_metadata=<FileMetadata(...)>, annotations=[1 annotation], sandbox=<Sandbox(...)>)>
>>> # Get the chord estimates
>>> chords = jam.annotations['chord', 0]
>>> chords.to_dataframe().head(5)
        time  duration    value  confidence
0   0.000000  3.622313    G:min    0.767835
1   3.622313  1.207438  C:min/5    0.652179
2   4.829751  1.207438    D:maj    0.277913
3   6.037188  4.086712    G:min    0.878656
4  10.123900  1.486077   D#:maj    0.608746
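The same function also accepts an in-memory signal; a minimal sketch, assuming the audio has already been loaded with librosa (the filename is illustrative):

import librosa
from crema.analyze import analyze

# Load the recording at its native sampling rate
y, sr = librosa.load('my_song.ogg', sr=None)

# Pass the buffer and sampling rate instead of a filename
jam = analyze(y=y, sr=sr)
jam.save('my_song.jams')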
Model reference¶
This section describes the implementation of models provided by crema.
Chord recognition¶
The chord recognition model is based on the structured prediction model of McFee and Bello [1]. The implementation here has been enhanced to support inversion (bass) tracking, and predicts chords out of an effective vocabulary of 602 classes. Chord class names are based on an extended version of Harte’s [2] grammar: N corresponds to “no-chord” and X corresponds to out-of-gamut chords (usually power chords).
[1] McFee, Brian, and Juan Pablo Bello. “Structured training for large-vocabulary chord recognition.” In ISMIR, 2017.
[2] Harte, Christopher, Mark B. Sandler, Samer A. Abdallah, and Emilia Gómez. “Symbolic Representation of Musical Chords: A Proposed Syntax for Text Annotations.” In ISMIR, vol. 5, pp. 66-71, 2005.
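To see how this vocabulary surfaces in practice, the predicted labels (including the special N and X classes) can be tabulated from an analysis; a minimal sketch building on the analyze example above:

import librosa
from crema.analyze import analyze

jam = analyze(filename=librosa.ex('brahms'))
chords = jam.annotations['chord', 0]

# Count how often each chord label occurs in the estimate
print(chords.to_dataframe()['value'].value_counts())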
class crema.models.chord.ChordModel¶
Methods
- outputs([filename, y, sr]) : Return the model outputs (e.g., class likelihoods)
- predict([filename, y, sr, outputs]) : Chord prediction
- transform([filename, y, sr]) : Feature transformation
outputs(filename=None, y=None, sr=None)¶
Return the model outputs (e.g., class likelihoods)
Parameters:
- filename : str, optional
  Path to audio file
- y, sr : optional
  Audio buffer and sample rate
Note
At least one of filename or y, sr must be provided.
Returns:
- outputs : dict, {str: np.ndarray}
  Each key corresponds to an output name, and the value is the model’s output for the given input
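Since the output names depend on the trained model, a minimal sketch can discover them at run time rather than assuming them:

import crema
import librosa

model = crema.models.chord.ChordModel()
outputs = model.outputs(filename=librosa.ex('brahms'))

# Each key names an output head; each value is that head's activation array
for name, array in outputs.items():
    print(name, array.shape)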
predict(filename=None, y=None, sr=None, outputs=None)¶
Chord prediction
Parameters:
- filename : str
  Path to the audio file to analyze
- y, sr : np.ndarray, number > 0
  Audio signal in memory to analyze
- outputs : dict {str: np.ndarray}
  Pre-computed model outputs, as given by ChordModel.outputs.
Note
At least one of filename, y, sr, or outputs must be provided.
Returns:
- jams.Annotation, namespace='chord'
  The chord estimate for the given signal.
Examples
>>> import crema
>>> import librosa
>>> model = crema.models.chord.ChordModel()
>>> chord_est = model.predict(filename=librosa.util.example_audio_file())
>>> chord_est
<Annotation(namespace='chord', time=0, duration=61.4, annotation_metadata=<AnnotationMetadata(...)>, data=<45 observations>, sandbox=<Sandbox(...)>)>
>>> chord_est.to_dataframe().head(5)
       time  duration  value  confidence
0  0.000000  0.092880  E:maj    0.336977
1  0.092880  0.464399    E:7    0.324255
2  0.557279  1.021678  E:min    0.448759
3  1.578957  2.693515  E:maj    0.501462
4  4.272472  1.486077  E:min    0.287264
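Pre-computed outputs can be passed back to predict to avoid re-running the network, e.g. when decoding the same recording more than once; a minimal sketch:

import crema
import librosa

model = crema.models.chord.ChordModel()

# Run the network once and cache the raw activations
outputs = model.outputs(filename=librosa.ex('brahms'))

# Decode chords from the cached outputs; the audio is not re-processed
chord_est = model.predict(outputs=outputs)
print(chord_est.to_dataframe().head())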
transform(filename=None, y=None, sr=None)¶
Feature transformation
Utilities¶
CREMA utilities
crema.utils.base(filename)¶
Identify a file by its basename:
/path/to/base.name.ext => base.name
Parameters:
- filename : str
  Path to the file
Returns:
- base : str
  The base name of the file
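A usage sketch following the docstring’s own example:

>>> from crema.utils import base
>>> base('/path/to/base.name.ext')
'base.name'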
crema.utils.get_ann_audio(directory)¶
Get a list of annotations and audio files from a directory.
This also validates that the lengths match and are paired properly.
Parameters:
- directory : str
  The directory to search
Returns:
- pairs : list of tuples (audio_file, annotation_file)
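A minimal sketch of iterating over the matched pairs (the dataset path is illustrative):

from crema.utils import get_ann_audio

# Each pair holds one audio file and its corresponding annotation file
for audio_file, annotation_file in get_ann_audio('/path/to/dataset'):
    print(audio_file, '->', annotation_file)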
crema.utils.git_version()¶
Return the git revision as a string
Returns:
- git_version : str
  The current git revision
crema.utils.increment_version(filename)¶
Increment a model version identifier.
Parameters:
- filename : str
  The file containing the model version
Returns:
- model_version : str
  The new model version. This version will also be written out to filename.
crema.utils.load_h5(filename)¶
Load data from an hdf5 file created by save_h5.
Parameters:
- filename : str
  Path to the hdf5 file
Returns:
- data : dict
  The key-value data stored in filename
See also: save_h5
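A minimal sketch of loading and inspecting such a file (the filename is illustrative):

from crema.utils import load_h5

# Recover the dictionary that was stored with save_h5
data = load_h5('features.h5')
for key, value in data.items():
    print(key, type(value))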