sammlapp/model_sharing_guidelines.md

## model_sharing_guidelines.md

      
Recommended steps for publicly sharing a model trained with OpenSoundscape


Upload the saved model object to a public location such as Box, OneDrive, etc.

Save the model object in two different ways to improve usability:

(1) Use model.save() in OpenSoundscape to save the entire model object, which can be reloaded only with the same version of OpenSoundscape that saved it.
(2) To enable use with other OpenSoundscape versions, save the weights dictionary and additional model information as a dictionary with torch.save. For instance:
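```python
import torch

# `model` is the trained OpenSoundscape model object.
# Save the weights plus the settings needed to rebuild the model later.
torch.save(
    {
        'weights': model.network.state_dict(),
        'classes': model.classes,
        'architecture': model.architecture_name,
        'sample_duration': model.preprocessor.sample_duration,
        'single_target': model.single_target,
        'sample_shape': [224, 224, 1],  # [height, width, channels]
    },
    '/home/sml161/trained_models/opso37_dec2022_augment_best.resnet50_weights',
)
```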

If you're wondering why it's necessary to save in two different ways, see this PyTorch article.
NOTE: If you include custom classes or functions defined in your training script, the model object will not load unless those custom classes/functions are defined first. This makes it extra important to share your training script.
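
A dictionary saved the second way can be turned back into a model roughly as follows. This is a minimal sketch, not a definitive recipe: it assumes the `CNN` class in your OpenSoundscape version accepts the keyword arguments `architecture`, `classes`, `sample_duration`, `single_target`, and `sample_shape`, and that `architecture` may be given as a name string. The helper names `missing_keys` and `load_shared_model` are made up for illustration. Check the `CNN` signature in the version you have installed before relying on it.

```python
# Keys written by the torch.save example in these guidelines
EXPECTED_KEYS = {
    "weights", "classes", "architecture",
    "sample_duration", "single_target", "sample_shape",
}

def missing_keys(checkpoint):
    """Return a sorted list of expected keys that are absent from the dict."""
    return sorted(EXPECTED_KEYS - set(checkpoint))

def load_shared_model(path):
    """Rebuild a model from a weights dictionary (hypothetical helper)."""
    # Imported here so the key check above works without these packages.
    import torch
    from opensoundscape import CNN

    checkpoint = torch.load(path, map_location="cpu")
    missing = missing_keys(checkpoint)
    if missing:
        raise KeyError(f"checkpoint is missing keys: {missing}")

    # Assumption: the CNN constructor takes these keyword arguments.
    model = CNN(
        architecture=checkpoint["architecture"],
        classes=checkpoint["classes"],
        sample_duration=checkpoint["sample_duration"],
        single_target=checkpoint["single_target"],
        sample_shape=checkpoint["sample_shape"],
    )
    # Copy the trained weights into the freshly built network.
    model.network.load_state_dict(checkpoint["weights"])
    return model
```

Preprocessing settings other than `sample_duration` are not stored in the dictionary, so anything you changed on the preprocessor during training would need to be set again after loading, which is one more reason to share the training script.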

Include a written description of the model, with links to the training scripts and to a script showing how to load the model and predict with it


Share the training scripts and an example script for loading the model and running prediction on GitHub or another public repository

Include plenty of comments explaining your decisions and your data
Include performance metrics on a validation set, and explain what the validation set contains and how it was created (a validation set is a good test of model transferability if it is a completely different set of audio from the training set, rather than a random subset of the training data)
Explain training data and how it was obtained, cleaned and/or filtered
Include an exported .yml of your Python environment (e.g., conda env export > environment.yml)
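
To illustrate the point above about validation sets: one way to keep validation audio completely separate from training audio is to hold out whole groups of recordings (for example, entire sites) instead of sampling files at random. The sketch below uses only the Python standard library. The file names, the `<site>_<id>.wav` naming scheme, and the helper names `split_by_group` and `site_of` are hypothetical, so adapt the grouping rule to however your own recordings are organized.

```python
def split_by_group(files, group_of, validation_groups):
    """Send every file whose group is in validation_groups to validation."""
    validation_groups = set(validation_groups)
    train, validation = [], []
    for f in files:
        if group_of(f) in validation_groups:
            validation.append(f)
        else:
            train.append(f)
    return train, validation

def site_of(name):
    """Hypothetical rule: the site is the part of the name before '_'."""
    return name.split("_")[0]

# Made-up file names, for illustration only
files = ["siteA_001.wav", "siteA_002.wav", "siteB_001.wav", "siteC_001.wav"]
train, validation = split_by_group(files, site_of, validation_groups={"siteC"})

# No site contributes audio to both sides of the split
train_sites = {site_of(f) for f in train}
validation_sites = {site_of(f) for f in validation}
assert train_sites.isdisjoint(validation_sites)
```

Describing the grouping rule alongside the reported metrics lets readers judge how well the validation results reflect transfer to new audio.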


If possible, share training and validation data, or at least a detailed description of how training data was obtained and modified/filtered


Feel free to post in the OpenSoundscape discussion area to announce your shared model