dvolk/robots.md

## robots.md

      
robots.json - machine learning access control proposal

Introduction

Similar to the way the file robots.txt controls the scraping of pages on a web server, the file robots.json in a source code repository should control whether machine learning algorithms are allowed to process and exploit the source code.
Examples

For example, a robots.json file containing:
{ "disallow": "*" }

would not permit any source code to be used.
More granular access could be granted or denied based on directories:
{ "disallow": { "paths": ["src/models"] } }

should be interpreted as not allowing files matching src/models/** to be used.
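A minimal sketch of how a tool might evaluate the path rule, written in Python. The function name `path_disallowed` is hypothetical, and the sketch assumes each entry in `paths` names a directory whose entire subtree is off limits, which is how the `src/models/**` reading above is taken here.

```python
from pathlib import PurePosixPath


def path_disallowed(policy: dict, file_path: str) -> bool:
    """Return True if file_path falls under a disallowed directory."""
    disallow = policy.get("disallow")
    if disallow == "*":
        # A bare "*" disallows everything in the repository.
        return True
    if not isinstance(disallow, dict):
        return False
    target = PurePosixPath(file_path)
    for entry in disallow.get("paths", []):
        # Assumption: "src/models" means src/models/**, i.e. any file
        # located anywhere below that directory.
        if PurePosixPath(entry) in target.parents:
            return True
    return False


policy = {"disallow": {"paths": ["src/models"]}}
print(path_disallowed(policy, "src/models/net/layer.py"))  # True
print(path_disallowed(policy, "src/models_old/x.py"))      # False
```

Comparing against the parent directories, rather than doing a plain string prefix test, keeps a sibling such as `src/models_old` from matching by accident.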
Further control could be granted based on contributor name:
{ "disallow": { "contributors": ["borisj"] } }

should be interpreted as not allowing the use of any files that have been modified by the borisj user.
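The contributor rule could be checked along these lines. This is only a sketch: `contributor_disallowed` is a hypothetical name, and the set of users who have modified a file is assumed to be gathered elsewhere (for instance from the repository history), which the proposal does not specify.

```python
from typing import Iterable


def contributor_disallowed(policy: dict, file_contributors: Iterable[str]) -> bool:
    """Return True if any listed contributor has modified the file."""
    disallow = policy.get("disallow")
    if disallow == "*":
        return True
    if not isinstance(disallow, dict):
        return False
    blocked = set(disallow.get("contributors", []))
    # The file is off limits as soon as one blocked user has touched it.
    return not blocked.isdisjoint(file_contributors)


policy = {"disallow": {"contributors": ["borisj"]}}
print(contributor_disallowed(policy, ["alice", "borisj"]))  # True
print(contributor_disallowed(policy, ["alice"]))            # False
```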
Access control could also be based on commit date:
{ "disallow": { "before": "2006-08-14T02:34:56-06:00" } }
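The proposal does not spell out the exact semantics of `before`. The Python sketch below assumes that a commit is disallowed when its timestamp is strictly earlier than the cutoff. The name `commit_disallowed` is hypothetical, and the commit time is assumed to be supplied as a timezone-aware `datetime`.

```python
from datetime import datetime, timezone


def commit_disallowed(policy: dict, commit_time: datetime) -> bool:
    """Return True if commit_time is earlier than the "before" cutoff."""
    disallow = policy.get("disallow")
    if disallow == "*":
        return True
    if not isinstance(disallow, dict) or "before" not in disallow:
        return False
    # The cutoff is an ISO 8601 timestamp with a UTC offset.
    cutoff = datetime.fromisoformat(disallow["before"])
    # Assumption: strictly earlier commits are disallowed.
    # commit_time must be timezone-aware for this comparison to work.
    return commit_time < cutoff


policy = {"disallow": {"before": "2006-08-14T02:34:56-06:00"}}
early = datetime(2006, 8, 14, 8, 0, 0, tzinfo=timezone.utc)
late = datetime(2006, 8, 14, 9, 0, 0, tzinfo=timezone.utc)
print(commit_disallowed(policy, early))  # True
print(commit_disallowed(policy, late))   # False
```

The cutoff in the example corresponds to 08:34:56 UTC, so a commit at 08:00 UTC falls before it and one at 09:00 UTC does not.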

A robots.json file that cannot be parsed as JSON should be interpreted as disallowing any use.
For example, a robots.json file containing:
epstein didn't kill himself

should be interpreted as { "disallow": "*" }
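The fallback rule could be implemented roughly as follows. This is a sketch in Python with a hypothetical function name, `parse_policy`. Treating valid JSON that is not an object (such as a bare number) as disallow-all is an added assumption that the proposal does not state.

```python
import json

DISALLOW_ALL = {"disallow": "*"}


def parse_policy(text: str) -> dict:
    """Parse robots.json content, falling back to disallow-all."""
    try:
        policy = json.loads(text)
    except json.JSONDecodeError:
        # Unparseable content disallows any use.
        return dict(DISALLOW_ALL)
    # Assumption (not in the proposal): valid JSON that is not an
    # object is also treated as disallow-all.
    if not isinstance(policy, dict):
        return dict(DISALLOW_ALL)
    return policy


print(parse_policy("epstein didn't kill himself"))  # {'disallow': '*'}
```

Failing closed means a damaged or deliberately malformed file can never be read as granting permission.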