In npm you can express the basic information about a project and its dependencies in the following manner:
{
  "name": "dahut",
  "version": "1.42.17",
  "dependencies": {
    "cryptozoology": "^0.9.2",
    "jsdom": "1.2.7 || >=1.2.9 <2.0.0",
    "web-verse": "1.0.1"
  }
}
The manner in which the version specifiers on the right of the dependency map are resolved is detailed in npm semver.
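To make the matching rules concrete, here is a minimal sketch of how the specifiers above could be evaluated. It is not npm's implementation: it assumes plain MAJOR.MINOR.PATCH versions (no pre-release tags, no wildcards, no hyphen ranges, no space between operator and version), and the function names `parse`, `caret_upper`, `comparator_ok` and `satisfies` are made up for illustration.

```python
import re

def parse(version):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints (assumption: no pre-release tags)."""
    m = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version.strip())
    if not m:
        raise ValueError(f"unsupported version: {version!r}")
    return tuple(int(p) for p in m.groups())

def caret_upper(v):
    """Exclusive upper bound for ^v: bump the left-most non-zero component."""
    major, minor, patch = v
    if major > 0:
        return (major + 1, 0, 0)
    if minor > 0:
        return (0, minor + 1, 0)
    return (0, 0, patch + 1)

def comparator_ok(v, comp):
    """Test one comparator such as '^0.9.2', '~0.9.0', '>=1.2.9', '<2.0.0' or '1.0.1'."""
    m = re.fullmatch(r"(\^|~|>=|<=|>|<|=)?(\S+)", comp)
    op, target = m.group(1) or "=", parse(m.group(2))
    if op == "^":
        return target <= v < caret_upper(target)
    if op == "~":
        # ~x.y.z allows patch-level changes only
        return target <= v < (target[0], target[1] + 1, 0)
    return {
        "=": v == target,
        ">=": v >= target,
        "<=": v <= target,
        ">": v > target,
        "<": v < target,
    }[op]

def satisfies(version, range_spec):
    """True if version matches any '||' alternative; comparators within one alternative are ANDed."""
    v = parse(version)
    return any(
        all(comparator_ok(v, c) for c in alternative.split())
        for alternative in range_spec.split("||")
    )

print(satisfies("0.9.5", "^0.9.2"))                   # True
print(satisfies("0.10.0", "^0.9.2"))                  # False
print(satisfies("1.2.8", "1.2.7 || >=1.2.9 <2.0.0"))  # False
```

With these rules the `jsdom` entry accepts 1.2.7 and anything from 1.2.9 up to, but not including, 2.0.0, while skipping 1.2.8.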
We have a use case concerning data citations. Today, researchers who publish data sets don't get much credit for it; notably, they do not get cited in the way that they would if they had published an article. This is in part because people don't really know how to cite data properly. One of the issues is versioning. Data sets get updated (some of them a lot). If you provide no way of addressing a version, you lose reproducibility. But if the relationship is too strict, you lose the ability to state that your analysis can be expected to be resilient to various degrees of change in the data. Labelling data sets with semver and matching them with npm semver ranges is a good fit.
My immediate concern is the dependency from article to data, but the same applies to dependencies from articles to software (and then between any pair of software, data, article, and likely a bunch of other things).
The kind of thing I had in mind was (note that this is assuming that isBasedOnUrl gets generalised to isBasedOn; there may be a better way to capture the idea):
For data:
{
  "@type": "Dataset",
  "isBasedOn": [
    {
      "@type": "DependencyRole",
      "isBasedOn": {
        "@type": "DataDownload",
        "contentUrl": "http://..."
      },
      "matchVersion": "^1.1.0"
    }
  ]
}
For code:
{
  "@type": "SoftwareSourceCode",
  "isBasedOn": [
    {
      "@type": "DependencyRole",
      "isBasedOn": {
        "@type": "SoftwareSourceCode",
        "codeRepository": "http://..."
      },
      "matchVersion": "~0.9.0"
    }
  ]
}
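As a rough sketch of how a consumer might use such markup, the snippet below walks the DependencyRole entries of a document and picks the highest published version that each matchVersion allows. Everything beyond the proposed properties is an assumption made for illustration: the helper names `as_tuple`, `bounds` and `best_match`, the list of available versions, and the restriction to `^x.y.z` and `~x.y.z` specifiers without pre-release tags.

```python
import json

def as_tuple(version):
    """Turn 'MAJOR.MINOR.PATCH' into a comparable tuple of ints."""
    return tuple(int(p) for p in version.split("."))

def bounds(match_version):
    """Return (lower, exclusive upper) for a '^x.y.z' or '~x.y.z' specifier."""
    op, low = match_version[0], as_tuple(match_version[1:])
    if op == "~":
        high = (low[0], low[1] + 1, 0)
    elif op == "^":
        if low[0] > 0:
            high = (low[0] + 1, 0, 0)
        elif low[1] > 0:
            high = (0, low[1] + 1, 0)
        else:
            high = (0, 0, low[2] + 1)
    else:
        raise ValueError(f"unsupported specifier: {match_version!r}")
    return low, high

def best_match(role, available):
    """Pick the highest available version allowed by a DependencyRole, or None."""
    low, high = bounds(role["matchVersion"])
    candidates = [v for v in available if low <= as_tuple(v) < high]
    return max(candidates, key=as_tuple, default=None)

# The data example from above, with its elided URL left as-is.
doc = json.loads("""
{
  "@type": "Dataset",
  "isBasedOn": [
    {
      "@type": "DependencyRole",
      "isBasedOn": {"@type": "DataDownload", "contentUrl": "http://..."},
      "matchVersion": "^1.1.0"
    }
  ]
}
""")

# Hypothetical list of versions the data publisher has released.
published = ["1.0.4", "1.1.0", "1.3.2", "2.0.0"]

for role in doc["isBasedOn"]:
    print(role["isBasedOn"]["@type"], best_match(role, published))
# prints: DataDownload 1.3.2
```

Here 2.0.0 is rejected because a major bump signals a breaking change to the data, which is exactly the resilience statement the matchVersion property is meant to carry.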