Similar to the way the file robots.txt controls the scraping of pages on a web server, the file robots.json in a source code repository should control whether machine learning algorithms are allowed to process and exploit the source code.
For example:
{ "disallow": "*" }
Would not permit any source code to be used.
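A consumer of the repository could check this blanket rule before touching any file. A minimal sketch in Python, assuming the policy has already been parsed into a dictionary (the function name allows_any_use is hypothetical):

def allows_any_use(policy: dict) -> bool:
    # False when the policy forbids all use of the repository's source code.
    return policy.get("disallow") != "*"

print(allows_any_use({"disallow": "*"}))  # False: nothing may be used
print(allows_any_use({}))                 # True: no blanket rule present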
More granular access could be granted or denied based on directories:
{ "disallow": { "paths": ["src/models"] } }
should be interpreted as not allowing files matching src/models/**
to be used.
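A tool honoring this rule might expand each listed directory into a src/models/** style match and test candidate file paths against it. A minimal sketch in Python, assuming repository-relative POSIX-style paths (the helper name path_is_disallowed is hypothetical):

from pathlib import PurePosixPath

def path_is_disallowed(policy: dict, file_path: str) -> bool:
    # True if file_path falls under any disallowed directory.
    rules = policy.get("disallow")
    if rules == "*":
        return True
    if not isinstance(rules, dict):
        return False
    for prefix in rules.get("paths", []):
        # Equivalent to matching prefix/**: the file lies inside the directory.
        if PurePosixPath(file_path).is_relative_to(prefix):
            return True
    return False

print(path_is_disallowed({"disallow": {"paths": ["src/models"]}}, "src/models/net.py"))  # True
print(path_is_disallowed({"disallow": {"paths": ["src/models"]}}, "docs/readme.md"))     # False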
Further control could be granted based on contributor name:
{ "disallow": { "contributors": ["borisj"] } }
should be interpreted as not allowing the use of any files that have been modified by the borisj user.
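Checking this rule requires the file's modification history, which could be read from git. A minimal sketch in Python that shells out to git log, assuming contributor names correspond to git author names (the function name modified_by_disallowed_contributor is hypothetical):

import subprocess

def modified_by_disallowed_contributor(policy: dict, file_path: str, repo_dir: str = ".") -> bool:
    # True if any disallowed contributor appears in the file's git history.
    blocked = set(policy.get("disallow", {}).get("contributors", []))
    if not blocked:
        return False
    # List every author name that has touched the file.
    out = subprocess.run(
        ["git", "-C", repo_dir, "log", "--format=%an", "--", file_path],
        capture_output=True, text=True, check=True,
    ).stdout
    return any(author in blocked for author in out.splitlines())

# e.g. modified_by_disallowed_contributor({"disallow": {"contributors": ["borisj"]}}, "src/app.py")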
Access control could also be based on commit date:
{ "disallow": { "before": "2006-08-14T02:34:56-06:00" } }
A robots.json file that cannot be parsed as JSON should be interpreted as disallowing any use.
For example, a robots.json file containing:
epstein didn't kill himself
should be interpreted as { "disallow": "*" }
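A loader could enforce this fail-closed behaviour by falling back to the blanket rule whenever parsing fails. A minimal sketch in Python (the function name load_policy is hypothetical):

import json

def load_policy(path: str = "robots.json") -> dict:
    # Load robots.json, treating any unparseable content as "disallow everything".
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except (json.JSONDecodeError, UnicodeDecodeError):
        # Fail closed: anything that is not valid JSON forbids all use.
        return {"disallow": "*"}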