@theresaanna
Last active September 23, 2016 20:10
Alpha Software Inventory Schema - DC civic.json style
{
  "agency": "DOABC",
  "projects": [
    {
      "status": "Alpha",
      "vcs": "git",
      "repository": "https://github.com/presidential-innovation-fellows/mygov",
      "homepage": "https://agency.gov/project-homepage",
      "downloadURL": "https://agency.gov/project/dist.tar.gz",
      "name": "mygov",
      "description": "A Platform for Connecting People and Government",
      "tags": [
        "platform",
        "government",
        "connecting",
        "people"
      ],
      "languages": [
        "java",
        "python"
      ],
      "contact": {
        "email": "project@agency.gov",
        "name": "Project Coordinator Name",
        "twitter": "https://twitter.com/projectname",
        "phone": "2025551313"
      },
      "partners": [
        {
          "name": "DOXYZ",
          "email": "project@doxyz.gov"
        }
      ],
      "license": "https://path.to/license" OR null,
      "openSourceProject": 1,
      "governmentWideReuseProject": 0,
      "exemption": null,
      "updated": {
        "lastCommit": "2016-04-30",
        "metadataLastUpdated": "2016-04-13",
        "sourceCodeLastModified": "2016-04-12"
      }
    }
  ]
}
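
As a rough illustration of how a consumer might sanity-check one project entry from the example above, here is a minimal Python sketch. The function name `find_problems` is hypothetical, and the two rules it applies (flags are 0 or 1, every value under `updated` is an ISO date string such as `2016-04-30`) are assumptions read off the example values, not requirements stated in this gist.

```python
from datetime import date


def find_problems(project):
    """Return a list of human-readable problems found in one project entry."""
    problems = []

    # Assumption: the two flags are 0/1 integers, as in the example entry.
    for flag in ("openSourceProject", "governmentWideReuseProject"):
        if project.get(flag) not in (0, 1):
            problems.append(f"{flag} should be 0 or 1")

    # Assumption: every value under "updated" is an ISO date string.
    for key, value in project.get("updated", {}).items():
        try:
            date.fromisoformat(value)
        except (TypeError, ValueError):
            problems.append(f"updated.{key} is not an ISO date")

    return problems
```

An entry shaped like the example yields an empty list; anything else yields one message per failed rule, which an agency could review before publishing its file.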
@mattbailey0

I love this. Both versions are great. Of the two, I think this one is stronger overall and, unless otherwise noted here, I think all of the diffs here are better than in the other one. Some questions/thoughts:

  • tags: I think there was some discussion on this in the Issue, but would this be a provided taxonomy? I think that would be best. If there isn't an obvious existing taxonomy to source from, I can provide one that might work, which we're developing for another OFCIO effort.
  • contact: one thing worth noting is that we'll be encouraging agencies to create something like a "code@agency.gov" email inbox rather than using individuals for their POC for code. In the case that they do that, the "name" field may be confusing.
  • contact: do we also want to allow agencies to provide phone, twitter..?
  • license: I really like the idea of providing a URL to the license.md file (or equiv) instead of providing strings/UIDs for licenses. What happens for non-OSS codebases?
  • updated: is this for the metadata or for the code?

@theresaanna
Author

@mattbailey0 I agree that this one is stronger.

  • tags: that's a great question. I don't know that we need to decide just yet. I could see us going either way. On one hand, free tagging lets users use the system in ways we might not envision now. For example, someone commented on the discussion thread with a suggestion of adding a platform field so that, if an agency is looking for a solution built on top of Salesforce, they can search for that. That's something we could potentially solve with tagging. Though of course, having a defined taxonomy would make for cleaner data.
  • contact: I added phone number and twitter fields. Are there any others that might be appropriate here, too? Name would be optional. When I write the post, I'm going to spell out the optional and required fields. I changed the example to reflect a project email.
  • license: I had that thought, too. I'm thinking of a blended approach, where that field takes either a string or array of strings, for a URL or license names respectively.
  • updated: I'm going to expand that field into an object. Please tell me what you think of the above solution.
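
To make the blended license idea above concrete, here is a minimal Python sketch of how a consumer might read such a field. The function name `normalize_license` and the output shape (`url` plus `names`) are hypothetical, chosen only for illustration; the thread does not define them.

```python
def normalize_license(value):
    """Map a blended license field onto one shape: {"url": ..., "names": [...]}."""
    if value is None:
        # null is allowed by the example schema.
        return {"url": None, "names": []}
    if isinstance(value, str):
        # A single string is treated as a URL to the license text.
        return {"url": value, "names": []}
    if isinstance(value, list) and all(isinstance(name, str) for name in value):
        # An array of strings is treated as a list of license names.
        return {"url": None, "names": list(value)}
    raise ValueError("license must be null, a URL string, or a list of names")
```

Every consumer of the file would need a branch like this, which is the cost of the blended approach compared with a URL-only field.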

@okamanda

okamanda commented Sep 20, 2016

This is coming together well!

  • partners: This was a really thoughtful addition given that there are likely a number of scenarios where agencies share responsibility for a project. Does this also open up the possibility of duplicate entries? If GSA is partnering with OMB on a project, do they both list the project and reference each other as partners? If so, does that create a problem?
  • contact: I know that phone and twitter aren't PII, but are there FISMA implications for collecting this info and displaying it on a government site? I don't have strong opinions about which fields should be used, but I could see why it would be valuable to have another form of contact besides e-mail.
  • license: I think a blended approach here might actually be a little confusing, and think the URL is a much more inclusive option. Not only is there a URL for the text of all the major open source licenses (CC0, GNU, MIT, APACHE, etc.), but for agencies that have their own original, custom or blended licenses, a URL can point visitors to the full text.
  • updated: Creating an array of dates/timestamps is a good idea. In terms of format, I think the date is probably fine. I can't think of a scenario in which a user searching for software would need to see the hour that a change was made. But I don't understand the difference between lastCommit and sourceCodeLastModified.
  • tags: I'd be interested in what a taxonomy would look like in this instance. A lot of platforms, like StackOverflow, allow you to tag your question by choosing from a list of popular tags. But in our case, how would an agency be able to choose from a universe of tags/keywords while generating their inventory? Does that force them to manually tag each repository in their inventory or is there some other way to automate that process?
  • exemption field: I'm not sure what this means. Is it a good idea to tie the Exemption field to the content of the policy? I assume this policy will change?
  • inventory.json: I prefer 'code.json' because it's on brand, but don't have strong feelings.
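
One partial answer to the tagging question above is to let agencies tag freely and have a tool report which tags fall outside a provided taxonomy. The Python sketch below assumes such a taxonomy exists as a plain set of terms; the function name `split_tags` and the sample terms are hypothetical.

```python
def split_tags(tags, taxonomy):
    """Separate a project's tags into those inside the taxonomy and those outside it."""
    # Compare case-insensitively so "Platform" and "platform" match.
    known = {term.lower() for term in taxonomy}
    recognized = [tag for tag in tags if tag.lower() in known]
    unrecognized = [tag for tag in tags if tag.lower() not in known]
    return recognized, unrecognized


# Hypothetical taxonomy, used with the tags from the example entry.
recognized, unrecognized = split_tags(
    ["platform", "government", "connecting", "people"],
    {"platform", "government"},
)
```

The unrecognized list could feed a review step rather than a hard failure, which keeps some of the flexibility of free tagging.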

@theresaanna
Author

@okamanda

  • partners: I had envisioned both GSA and OMB listing the project, and we could potentially link them up somehow in the UI. I can't envision a problem, but that doesn't mean that there isn't one!
  • contact: that's a good question. Worth asking Jez about?
  • license: I didn't love the blended approach myself. I thought the URL option would be prohibitive, but I see your point. I will simplify this field.
  • updated: lastCommit is specifically for projects in VCS, whereas sourceCodeLastModified encompasses projects outside of VCS. Too confusing? Should we remove lastCommit and keep sourceCodeLastModified or a renamed version of it?
  • exemption: The value of the field would be a number from 1 to 5, corresponding to the exemptions in the policy. Is this a bad idea?
  • inventory.json: I chose a different name so that there wouldn't be confusion with data.gov inventories, but I see that folks feel the opposite way, so I'll change it back.
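
If both partner agencies list the same project, one way a UI could link the entries is by matching on the `repository` URL. The Python sketch below illustrates that under the assumption that partners record an identical URL; the function name `shared_projects` is hypothetical, and nothing in this thread settles how the linking would actually work.

```python
from collections import defaultdict


def shared_projects(inventories):
    """Map each repository URL listed by more than one agency to those agencies."""
    agencies_by_repo = defaultdict(list)
    for inventory in inventories:
        agency = inventory["agency"]
        for project in inventory.get("projects", []):
            repo = project.get("repository")
            # Skip entries without a repository and repeats within one agency.
            if repo and agency not in agencies_by_repo[repo]:
                agencies_by_repo[repo].append(agency)
    return {
        repo: agencies
        for repo, agencies in agencies_by_repo.items()
        if len(agencies) > 1
    }
```

Entries that share a repository could then be shown as one project with several owning agencies instead of as duplicates.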

@ckaran

ckaran commented Sep 23, 2016

Would it be better to have a simple version field instead of the updated field? I was thinking it would be a string that follows the Semantic Versioning guidelines, which would make it easier for fully automated systems to detect if important changes have occurred.
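
A minimal Python sketch of how an automated system might use such a version string: it compares the MAJOR.MINOR.PATCH core of two versions and reports which component changed. The function names are hypothetical, and as a simplifying assumption the sketch ignores pre-release and build metadata (anything after `-` or `+`).

```python
def parse_version(version):
    """Return (major, minor, patch) from a Semantic Versioning string."""
    # Drop build metadata ("+...") and pre-release ("-...") suffixes.
    core = version.split("+", 1)[0].split("-", 1)[0]
    major, minor, patch = (int(part) for part in core.split("."))
    return major, minor, patch


def change_level(old, new):
    """Name the most significant component that differs, or None if the cores match."""
    names = ("major", "minor", "patch")
    for name, a, b in zip(names, parse_version(old), parse_version(new)):
        if a != b:
            return name
    return None
```

A crawler could then flag only entries where `change_level` returns `"major"`, since Semantic Versioning reserves major bumps for incompatible changes.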
