@nikhilkrishna
Last active April 14, 2020 04:20
Draft document for an approach towards managing data for Coronasafe

Overview

The Covid-19 pandemic is a rapidly growing public health crisis that has to be managed at multiple levels, from the individual to the community and society as a whole. Doing this effectively requires access to accurate and up-to-date information.

However, in building a data management system that captures this data, care should be taken to ensure that:

  1. The data that is being made available is relevant and useful to those audiences that need it.
  2. The data is controlled so that only the relevant audiences have access to it and this access is audited regularly.
  3. The data is securely stored with appropriate mechanisms to ensure it can be restored in the event of data loss.
  4. The data that is stored is consistent and of high enough quality to be useful for data analysis.
  5. The handling of the data meets the regulatory and privacy requirements of the society whose data is being captured.

Balancing short term priorities against more strategic goals

The nature of the Covid-19 pandemic demands a rapid response. The priority should be to make tools available to the population to gather the information that public health professionals need to manage the situation as it evolves.

Start with gathering consistent data

The focus during the data gathering phase should be on ensuring that the data being gathered is consistent and meets the minimum information requirements of the audience using it.

Tools for gathering this data should make it easy for people to provide the required information quickly and accurately. The use of metadata and automation to reduce the amount of data entry should be explored, keeping in mind that only the absolute minimum information needed should be captured.
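As a rough illustration of what "consistent" and "minimum" could mean at the point of entry, here is a minimal Python sketch. The field names (`district_code`, `reported_on`, `status`) and the status vocabulary are assumptions made up for this example; the real set would have to come from the public health professionals using the data.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical controlled vocabulary; the real list is a policy decision.
ALLOWED_STATUSES = {"no_symptoms", "symptomatic", "tested_positive", "recovered"}

# Hypothetical minimum field set for one report.
REQUIRED_FIELDS = ("district_code", "reported_on", "status")


@dataclass(frozen=True)
class Report:
    """A minimal report record; field names are illustrative only."""
    district_code: str
    reported_on: date
    status: str


def parse_report(raw: dict) -> Report:
    """Validate raw form input and return a normalised Report, or raise ValueError."""
    missing = [name for name in REQUIRED_FIELDS if not raw.get(name)]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")

    # Normalise free-text status to the controlled vocabulary's spelling.
    status = raw["status"].strip().lower().replace(" ", "_")
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"unknown status: {raw['status']!r}")

    return Report(
        district_code=raw["district_code"].strip().upper(),
        # Dates are expected in ISO format (YYYY-MM-DD); anything else raises ValueError.
        reported_on=date.fromisoformat(raw["reported_on"]),
        status=status,
    )
```

Rejecting or normalising input at capture time is usually cheaper than cleaning it up later, and a record type with only three fields makes it obvious when someone proposes collecting more than the minimum.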

Build a data strategy and implement it

While this effort is going on, a strategic effort should be taken up to build tools (and set up procedures) to clean up and manage the data so that it meets the data requirements outlined earlier. While not everything can be automated, an effort should be made to leverage automation as much as possible.

Create and publish your data strategy document, so that the society impacted by it is clear on how its data is being managed and secured.

Data management

Consistency is key to achieving efficiency. The organisation therefore needs a standard way of recording information so that when users and systems are presented with the data, they understand what it means and how to process it. Specific aspects of data management include data standardisation, data cataloguing, a common vocabulary, metadata, modelling, and master and reference data.

Security and Compliance

Users should only have access to the data they need. The strategy should address not only access control but also how data is kept secure and how security is embedded in day-to-day operations.

As far as possible, avoid gathering Personally Identifiable Information (PII); rely instead on hashed keys and randomly generated unique identifiers to relate data together.
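One possible way to do this is sketched below using only the Python standard library. It swaps a plain hash for a keyed hash (HMAC-SHA256), because a plain hash of a low-entropy value such as a phone number can be reversed by simply trying every possible number. The function names and the idea of a single project-wide `secret_key` are assumptions for illustration, not a prescribed design.

```python
import hashlib
import hmac
import secrets
import uuid


def new_record_id() -> str:
    """Return a random identifier that carries no information about the person."""
    return str(uuid.uuid4())


def hashed_key(identifier: str, secret_key: bytes) -> str:
    """Return a keyed hash (HMAC-SHA256) of an identifier.

    The same identifier and key always give the same output, so records can be
    related to each other without storing the raw identifier.
    """
    normalised = identifier.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalised, hashlib.sha256).hexdigest()


# Assumption: in a real deployment the key would be loaded from a secrets
# manager and kept stable; generating it per run is only for demonstration.
key = secrets.token_bytes(32)
link_key = hashed_key("example-identifier", key)
record_id = new_record_id()
```

The hashed key is only as private as the secret key: anyone holding the key can recompute the hash for a known identifier, so access to the key needs the same care as access to the raw data would.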

Data Governance

Data governance is a system for defining who within an organisation has authority and control over data assets and how those data assets may be used. It encompasses the people, processes, and technologies required to manage and protect data assets.

For access and identity management, try to rely on open security standards (OpenID, OAuth) and open source tools to manage secrets.

Procedures (ideally tools) should be designed to securely and completely delete all data and metadata of individuals.

Have a clear policy and criteria for purging data (time based, on request, or based on some other business rule) as part of your data strategy, and publish it.
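To make the time-based and on-request criteria concrete, here is a small Python sketch. The 90-day retention period and the `collected_at` and `deletion_requested` field names are hypothetical; the actual values are policy decisions that belong in the published strategy.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention period; the real value is a policy decision.
RETENTION = timedelta(days=90)


def split_for_purge(records, now=None, retention=RETENTION):
    """Split records into (keep, purge) lists.

    A record is purged when it is older than the retention period or when
    the individual has asked for deletion.
    """
    now = now or datetime.now(timezone.utc)
    keep, purge = [], []
    for record in records:
        expired = now - record["collected_at"] > retention
        if expired or record.get("deletion_requested", False):
            purge.append(record)
        else:
            keep.append(record)
    return keep, purge
```

This only selects what should be purged. Actually deleting the data securely and completely, including backups and derived metadata, is a separate procedure that this sketch does not cover.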

If you want to retain data for analysis and research, create a protocol describing how the privacy of the individual will be ensured and how identifying information will be removed.
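One step such a protocol might include is sketched below in Python: dropping directly identifying fields and coarsening others before data is handed to researchers. The field names and the ten-year age bands are assumptions chosen for illustration.

```python
# Hypothetical list of directly identifying fields to drop before research use.
IDENTIFYING_FIELDS = {"name", "phone", "address"}


def deidentify(record: dict) -> dict:
    """Return a copy of the record with identifying fields removed
    and the exact age replaced by a ten-year age band."""
    cleaned = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
    if "age" in cleaned:
        low = (cleaned.pop("age") // 10) * 10
        cleaned["age_band"] = f"{low}-{low + 9}"
    return cleaned
```

Removing direct identifiers is not, on its own, a guarantee of anonymity, since combinations of the remaining fields can still single out an individual in a small population. The protocol should therefore also say who reviews a dataset before release.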
