Last active
August 20, 2016 08:04
GSoC 2016 - Dataset Creation Toolkit
The Dataset Creation Toolkit GSoC Project
This summer I worked for the MetaBrainz Foundation to add functionality to the AcousticBrainz server and client. The project, developed during the Google Summer of Code period, is still a work in progress, but part of it is already published in the official repository.
The basic idea of the project is described in the project proposal published on the GSoC website and in the MetaBrainz community forum at this link:
https://community.metabrainz.org/t/gsoc-2016-acousticbrainz-dataset-creation-toolkit/10583
The main code is contained in Pull Request 189, which is waiting to be merged into the master branch of the AcousticBrainz server. This code allows the AcousticBrainz client to submit datasets that contain samples without an MBID, and asks MessyBrainz to generate a MessyBrainz ID (msid) to keep track of the fact that the recording is not related to any MBID. The details of the PR and the related comments are available at this link:
https://github.com/metabrainz/acousticbrainz-server/pull/189
The link with all the commits:
https://github.com/hellska/acousticbrainz-server/commits/dataset_creation_toolkit?author=hellska
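The submission idea described above (use the MBID when the recording has one, otherwise ask MessyBrainz for an msid) can be sketched roughly as follows. This is only an illustration and not the code in the pull request: the function names resolve_gid and fake_request_msid are hypothetical, and the stand-in generator below does not reflect how MessyBrainz actually derives its IDs.

```python
import uuid
from typing import Callable, Dict, Optional, Tuple


def resolve_gid(
    metadata: Dict[str, str],
    mbid: Optional[str],
    request_msid: Callable[[Dict[str, str]], str],
) -> Tuple[str, str]:
    """Return (gid, gid_type) for one recording of a dataset.

    request_msid is an assumed callable that stands in for the
    request to MessyBrainz; its real interface is not shown here.
    """
    if mbid:
        # The recording is already known to MusicBrainz.
        return str(uuid.UUID(mbid)), "mbid"
    # No MBID: ask the (hypothetical) MessyBrainz helper for an msid.
    return str(uuid.UUID(request_msid(metadata))), "msid"


def fake_request_msid(metadata: Dict[str, str]) -> str:
    """Offline stand-in used only to make the sketch runnable."""
    key = "|".join(f"{k}={metadata[k]}" for k in sorted(metadata))
    return str(uuid.uuid5(uuid.NAMESPACE_URL, key))


if __name__ == "__main__":
    print(resolve_gid({"artist": "a", "title": "t"}, None, fake_request_msid))
```

Passing the MessyBrainz request in as a callable keeps the sketch independent of any real network API, which is why a fake can be swapped in for testing.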
In the course of the project I had to perform a major schema change to accept different kinds of UUID. This code adds a field to the lowlevel table to mark the gid type. At the moment we accept only two gid types:
1 MusicBrainz IDs (mbid)
2 MessyBrainz IDs (msid)
The field, named gid_type, is created as an enum so that the accepted gid types can be extended in the future. The code for this task is already published in the main repository, and the details can be found here:
https://github.com/metabrainz/acousticbrainz-server/pull/194
The link with all the commits:
https://github.com/hellska/acousticbrainz-server/commits/submission_type?author=hellska
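To illustrate why an enum is convenient for the gid type, here is a minimal Python sketch that mirrors the idea. It is not the server code: only the names gid_type, mbid and msid come from the description above, while GidType, LowLevelRecord and make_record are hypothetical names chosen for this example.

```python
import uuid
from dataclasses import dataclass
from enum import Enum


class GidType(Enum):
    """Hypothetical Python mirror of the gid_type enum."""

    MBID = "mbid"  # MusicBrainz ID
    MSID = "msid"  # MessyBrainz ID
    # A future gid type would be added here as one more member.


@dataclass(frozen=True)
class LowLevelRecord:
    """Illustrative stand-in for one row of the lowlevel table."""

    gid: uuid.UUID
    gid_type: GidType


def make_record(gid: str, gid_type: str) -> LowLevelRecord:
    # uuid.UUID raises ValueError for a malformed UUID string, and
    # GidType(...) raises ValueError for a label that is not a member.
    return LowLevelRecord(gid=uuid.UUID(gid), gid_type=GidType(gid_type))


if __name__ == "__main__":
    print(make_record("12345678-1234-5678-1234-567812345678", "msid"))
```

Because the accepted values are listed in one place, extending the set of gid types means adding a member rather than changing the type of the field.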
I also made a very simple correction to the Vagrant VM installation process, submitted in another pull request. This fix was necessary to use the new version of the waf tool to install two relevant libraries, Essentia and Gaia, in the Vagrant VM. The code is contained in this pull request:
https://github.com/metabrainz/acousticbrainz-server/pull/191
This link shows the commit for this specific task:
https://github.com/hellska/acousticbrainz-server/commits/fix_hl_extractor_install?author=hellska
The client side has also been modified to permit the submission of datasets of items without an MBID. All the code written is collected in another pull request, which can be seen in detail at the following link:
https://github.com/MTG/acousticbrainz-client/pull/42
The link with all the commits:
https://github.com/hellska/acousticbrainz-client/commits/dataset_creation_toolkit_client?author=hellska
This is all the code written during the GSoC period, and the project is still a work in progress.
:Dan