
@sammlapp
Last active February 6, 2023 16:57
Sharing models trained in OpenSoundscape

Recommended steps for publicly sharing a model trained with OpenSoundscape

  • Upload the saved model object to a public location such as Box, OneDrive, etc.

    • Save the model object in two different ways to improve usability:

    (1) Use model.save() in OpenSoundscape to save the entire model object. An object saved this way can only be re-loaded with the same version of OpenSoundscape.

    (2) To enable use with other OpenSoundscape versions, save the network weights (the state dict) together with additional model information in a dictionary with torch.save. For instance:

    import torch

    # `model` is the trained OpenSoundscape model object
    torch.save(
        {
            'weights': model.network.state_dict(),  # network parameters only
            'classes': model.classes,
            'architecture': model.architecture_name,
            'sample_duration': model.preprocessor.sample_duration,
            'single_target': model.single_target,
            'sample_shape': [224, 224, 1],  # [height, width, channels]
        },
        '/home/sml161/trained_models/opso37_dec2022_augment_best.resnet50_weights',
    )
    

    If you're wondering why it's necessary to save in two different ways, see this PyTorch article

    NOTE: If you include custom classes or functions defined in your training script, the model object will not load unless those custom classes/functions are defined first. This makes it extra important to share your training script.

    • Include a written description of the model, with links to the training scripts and to a script showing how to load the model and predict with it
  • Share the training scripts and an example script for loading the model and running prediction on GitHub or another public repository

    • Include plenty of comments explaining your decisions and your data
    • Include performance metrics on a validation set, and explain what the validation set contains and how it was created (a validation set is a good test of model transferability if it is a completely different set of audio from the training set, rather than a random subset of the training data)
    • Explain training data and how it was obtained, cleaned and/or filtered
    • Include an exported .yml of your Python environment (e.g., conda env export > environment.yml)
  • If possible, share training and validation data, or at least a detailed description of how training data was obtained and modified/filtered

  • Feel free to post in the OpenSoundscape discussion area to announce your shared model
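
As a companion to saving option (2) above, here is a minimal sketch of how someone receiving the weights dictionary might rebuild the model. The helper names `missing_keys` and `load_shared_model` are hypothetical, and the sketch assumes the installed OpenSoundscape version exposes a `CNN` class whose constructor accepts the `architecture`, `classes`, `sample_duration`, `single_target`, and `sample_shape` keyword arguments. Check this against the version you are using.

```python
# Keys written by the torch.save example in option (2).
REQUIRED_KEYS = (
    "weights",
    "classes",
    "architecture",
    "sample_duration",
    "single_target",
    "sample_shape",
)


def missing_keys(checkpoint):
    """Return the required keys that are absent from a loaded checkpoint dictionary."""
    return [key for key in REQUIRED_KEYS if key not in checkpoint]


def load_shared_model(path):
    """Rebuild an OpenSoundscape CNN from the weights dictionary stored at `path`."""
    # Imported here so the key check above can be used without these packages installed.
    import torch
    from opensoundscape import CNN

    checkpoint = torch.load(path, map_location="cpu")
    missing = missing_keys(checkpoint)
    if missing:
        raise KeyError(f"checkpoint is missing keys: {missing}")

    # Assumption: the installed CNN constructor accepts these keyword arguments.
    model = CNN(
        architecture=checkpoint["architecture"],
        classes=checkpoint["classes"],
        sample_duration=checkpoint["sample_duration"],
        single_target=checkpoint["single_target"],
        sample_shape=checkpoint["sample_shape"],
    )
    # Copy the trained parameters into the freshly created network.
    model.network.load_state_dict(checkpoint["weights"])
    return model
```

A shared example script could call `load_shared_model` on the downloaded file and then run prediction on a list of audio files with the model's prediction method. Because only the weights and a few plain Python values are stored, this route does not depend on the pickled model object matching the installed OpenSoundscape version.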
