Data Commons aims to simplify data science by linking data from a variety of sources into a single knowledge graph, reducing the data cleaning needed before modeling. The purpose of this project was to make the knowledge graph more accessible to end users and to make it easier for developers to contribute.
In my proposal, I initially noted the following pain points in the Data Commons documentation:
- The directions for adding data sets in the 'Get Involved' section were short and unclear.
- The tutorials section only offered Python notebooks, with no reference to other Data Commons API wrappers.
- At the time of application, Data Commons did not offer any tools or visualizations built using its knowledge graph.
The goals addressing the first two pain points were retained with some changes. We rewrote the dataset contribution guidelines, providing additional information on how to contribute datasets to the knowledge graph. We also added tutorial material for the API's Sheets wrapper. This was a change from the initial plan to focus on the knowledge graph's R wrapper, made after the team re-focused its engineering resources.
With the release of Data Commons' Place Explorer tool, the original third goal of building a sample application with the API was rendered moot. We therefore pivoted the project to restructure the endpoint documentation as a whole, providing examples for every endpoint across all the API wrappers.
Our first documentation update was to the dataset contribution pages. We created the following pull requests to address this need:
- datacommonsorg/docsite#61
- datacommonsorg/docsite#63
- datacommonsorg/docsite#64
- datacommonsorg/docsite#72
- datacommonsorg/docsite#77
- datacommonsorg/docsite#86
NOTE: These PRs no longer correspond to what is published on the Data Commons main documentation site. After Google incorporated the API into search results for demographic information, the product team wanted to temporarily scale back community dataset contributions.
Next, we updated and restructured the examples and information for all endpoints, methods, and formulae that Data Commons provides through its REST API and through its Python and Sheets wrappers. We made these pull requests:
- datacommonsorg/docsite#87
- datacommonsorg/docsite#95
- datacommonsorg/docsite#96
- datacommonsorg/docsite#97
- datacommonsorg/docsite#98
- datacommonsorg/docsite#102
- datacommonsorg/docsite#103
- datacommonsorg/docsite#104
- datacommonsorg/docsite#112
- datacommonsorg/docsite#138
- datacommonsorg/docsite#143
- datacommonsorg/docsite#156
- datacommonsorg/docsite#106
- datacommonsorg/docsite#107
- datacommonsorg/docsite#108
- datacommonsorg/docsite#110
- datacommonsorg/docsite#113
- datacommonsorg/docsite#116
- datacommonsorg/docsite#117
- datacommonsorg/docsite#136
- datacommonsorg/docsite#141
- datacommonsorg/docsite#125
- datacommonsorg/docsite#127
- datacommonsorg/docsite#129
- datacommonsorg/docsite#130
- datacommonsorg/docsite#132
- datacommonsorg/docsite#153
I also created the general cleanup PR datacommonsorg/docsite#154.
The results of these PRs can be seen on the Data Commons main documentation site in the API section.
In this phase, I tried two different approaches to reformatting the API docs, using Swagger and Redoc.ly. We eventually decided to move away from these approaches, since they didn't produce a consistent visual design across the REST, Python, and Google Sheets docs. However, we were able to incorporate a new plugin for tabbing between different examples of REST endpoint usage, including sample code in JavaScript presented using JSFiddle.
In the final months of the season, we worked on creating tutorial material for the Google Sheets wrapper for Data Commons. Here are the PRs created in connection with that:
And here are the links to the final tutorials:
- https://docs.datacommons.org/tutorials/sheets_latitude.html
- https://docs.datacommons.org/tutorials/sheets_sleep.html
- https://docs.datacommons.org/tutorials/sheets_covid.html
In addition, I brought entirely new content to the Data Commons documentation by writing the glossary (https://docs.datacommons.org/glossary.html) and style guide (https://github.com/datacommonsorg/docsite/blob/master/STYLE_GUIDE.md). The style guide in particular brought industry standards to the project documentation, creating a foundation for Data Commons' future approach to it.
PRs associated with this (as well as other fixes):
- datacommonsorg/docsite#99
- datacommonsorg/docsite#131
- datacommonsorg/docsite#134
- datacommonsorg/docsite#135
- datacommonsorg/docsite#73
- datacommonsorg/docsite#72
- datacommonsorg/docsite#71
Google Season of Docs was an incredible opportunity to grow in my technical writing abilities. I saw noticeable improvement in my ability to create documentation that uses both formal and informal style conventions to communicate challenging technical concepts effectively. I also learned about the costs and benefits of deviating from established formats when I tried to move away from our existing patterns for API documentation and towards new approaches using Redoc.ly and Swagger. That experiment gave me technical experience and a new perspective on the power of Data Commons' relatively simple information design. Finally, my soft skills improved substantially over the course of the project, particularly my ability to connect with people holding diverse perspectives and to create content that is meaningful at each of their levels.