@DavidLemayian
Forked from munenedenis/DataDeskManual.doc
Created February 6, 2019 06:42
Data Desk Manual

Introduction

This document gives guidelines on how to assess an organization's need for a data division and how to set up a data desk. It also addresses the following issues:

  • Needs Assessment Framework
  • Data Desk Terms of Reference
  • Setup strategies
  • Resources required
  • Barriers envisioned in setting up
  • Modus operandi
  • Feedback Methodology
  • Data Desk Training / Capacity Building

This document shall be compiled by crowdsourcing input from members of the Open Source Community who have been exposed to, and are involved in, data journalism and the running of data departments. Input shall be coordinated and compiled through GitHub and the Code 4 Kenya Initiative.

Building Capacity for Data Journalism

Journalists working in many newsrooms find it a big challenge to seek out data to back up their stories. An even bigger challenge is how to process, analyze and visualize the data in ways that allow them to tell compelling stories. The skills to do this are not part of the journalist’s skillset. As we move to a more data-driven culture globally, the need to add these analytical skills to newsrooms has become more urgent. Some of these skills, like front-end web programming, will still not become part of the journalist’s ‘Swiss Army knife’ of trade tools, but the skillset can be made available to those working in the newsroom to improve how they tell their stories.

Over the last few years, data desks have begun to evolve within media organizations to provide the skills and tools that empower data journalism. In the West, notable data desks at The Guardian, the LA Times and the New York Times have proved to add value to the content generation process of their organizations. Each of these organisations takes a different approach to setting up its desk and to the way the desk interacts with the rest of the organisation.

Complementing Government Open Data

Governments around the world are increasingly publishing open data. With legislation such as Freedom of Information Acts already in place in many countries to support the opening up of data, focus is now shifting to standards of data and ensuring consistent quality.

Media organisations have generated their own data within their stories and reports but have not organized it into usable, analysis-ready formats. These reports have to be mined for data, and the data organised in a manner that can be used by various programs and analysis tools. Sharing this mined data will be the second step towards a much-needed Government Open Data Checker. The media house can issue parallel statistics on education, health, accidents, economic indicators and so on, and act as a checker for government data. In addition, as infomediaries request data that has not been published on the official government open data portal, new government data hosted on these corporate open data portals shall become yet another source for verifying government open data. Such structures result in better transparency from government and cleaner data published to the government portal.
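
The checking role described above can be sketched in code. This is only a minimal illustration, assuming that both the media house and the government portal publish each indicator as a simple name-to-value mapping. The function name `flag_discrepancies`, the 5% tolerance and the sample figures are all hypothetical and are not taken from any real portal.

```python
def flag_discrepancies(media_stats, government_stats, tolerance=0.05):
    """Return indicators where the two sources differ by more than `tolerance`.

    Both arguments map an indicator name to a numeric value. The relative
    difference is measured against the government figure. Indicators that
    are missing from the government data are skipped, since there is
    nothing to compare them with.
    """
    flagged = {}
    for indicator, media_value in media_stats.items():
        if indicator not in government_stats:
            continue
        gov_value = government_stats[indicator]
        if gov_value == 0:
            # Avoid dividing by zero: any non-zero media value is a mismatch.
            relative_diff = None
            differs = media_value != 0
        else:
            relative_diff = abs(media_value - gov_value) / abs(gov_value)
            differs = relative_diff > tolerance
        if differs:
            flagged[indicator] = {
                "media": media_value,
                "government": gov_value,
                "relative_diff": relative_diff,
            }
    return flagged


# Made-up figures, for illustration only.
media = {"road_accidents": 3200, "primary_enrolment": 10_100_000}
government = {"road_accidents": 2900, "primary_enrolment": 10_000_000}

for name, detail in flag_discrepancies(media, government).items():
    print(name, detail)
```

With these sample figures only `road_accidents` is flagged, because the two enrolment figures are within the assumed tolerance. A flagged indicator is a prompt for a journalist to investigate, not proof that either source is wrong.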

Hosting a Data Desk

The data desk resides within the organisation but is designed as a data agency providing data services to both the media organisation (in-house) and external partners and programs. Partners (especially content generators for programs and features) would be supported by the data desk as a resource they can use to meet their data needs.

Ideally, once the data desk is set up, the media organization will have journalists with the skills to curate and analyze data. The team at the data desk assists with the daily organisation of data, scraping data from sources, building the code needed to use the data in telling a story, designing visualizations and infographics, mashing up data from different sources to find trends, and sharing the datasets with the general public through data catalogues.
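
As one illustration of mashing up data from different sources, the sketch below joins two small datasets on a shared key. It assumes each source has already been loaded as a list of dictionaries; the function name `mash_up`, the field names and the figures are hypothetical.

```python
def mash_up(primary, secondary, key):
    """Combine rows from two sources that share the same value for `key`.

    Rows are dictionaries. Only rows whose key appears in both sources are
    kept (an inner join). Where both rows carry the same field, the value
    from `primary` wins.
    """
    lookup = {row[key]: row for row in secondary}
    combined = []
    for row in primary:
        match = lookup.get(row[key])
        if match is not None:
            combined.append({**match, **row})
    return combined


# Made-up figures, for illustration only.
health = [
    {"county": "County A", "clinics": 120},
    {"county": "County B", "clinics": 45},
]
population = [
    {"county": "County A", "population": 400_000},
    {"county": "County C", "population": 150_000},
]

for row in mash_up(health, population, "county"):
    # A derived figure that neither source contains on its own.
    row["people_per_clinic"] = round(row["population"] / row["clinics"])
    print(row)
```

Only "County A" appears in both sources, so it is the only row in the combined output. In practice, most of the effort goes into making the key values match across sources (spelling, abbreviations, boundary changes) before a join like this is meaningful.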

It is hoped that active data desks within media organizations will shift the data asymmetry that currently exists in favor of supply towards one that is more demand driven.

1 Introduction

1.1 The lack of synergy in the journalism field

Given their deadlines and workload, most journalists have neither the time nor the skills to analyse large datasets. The few who do work on data analysis tend to do only basic analysis and draw on few data sources.

1.2 Building bridges between analytics and the newsroom

When covering a story, journalists usually go to professionals and sector analysts for insight and for a holistic view of the issues. Some of this analysis can be done in-house, giving journalists a more holistic view of the issues and reinforcing their reporting.

2 Data journalism

“Data visualisation is often the expression of data journalism, but the process of digging through the data to find the stories that matter, that is at its heart.” Simon Rogers, The Guardian

3 Visualisation

Visualisations are the most easily understood part of data journalism. The pretty pictures and interactive images engage the users and targeted audience, yet they account for only a small part of data journalism. However, you cannot ignore the message that visualisations carry or how quickly they are able to engage and inform audiences.

3.1 Samples of visualisations

Insert samples of visualisations and the type of visualisation to be used depending on the circumstances.

4 Working with data

4.1 Finding data

Freedom of Information (FOI) statutes have been enacted in many democratic states. Under such statutes, citizens are generally entitled to access data generated by projects and offices run on taxpayers' money. Some journalists have access to existing datasets from their contacts, and some data is published by one government office or another. These datasets can be compiled, organised, assigned proper metadata and archived in a central repository for journalists to use when investigating and analysing stories.

4.2 Open data

Governments are disseminating their data through open data portals, but the uptake of these datasets, especially in developing countries, has not been as fast as previously envisaged. One purpose of data journalism and of establishing a data desk is to flip the asymmetry of data supply: the current setup favours the supplier rather than being demand driven.

4.3 Opening up data

The data that is mined and organised within the media organisation can be catalogued within the in-house repository. This data, once verified, can be opened up to the public. As an independent content accumulator, the media organisation is able to produce datasets that can be used to match and check the government data.
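
A minimal sketch of such an in-house catalogue follows. It assumes each dataset is described by a small metadata record and that only records marked as verified are exposed to the public. The class names `DatasetRecord` and `Catalogue`, their fields and the sample entries are hypothetical, and a real repository would need persistent storage.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Metadata describing one dataset held in the repository."""
    title: str
    source: str
    tags: list = field(default_factory=list)
    verified: bool = False  # set to True once the desk has checked the data


class Catalogue:
    """An in-memory catalogue of dataset records."""

    def __init__(self):
        self._records = []

    def add(self, record):
        self._records.append(record)

    def search(self, tag):
        """Return every record carrying `tag`, ignoring letter case."""
        wanted = tag.lower()
        return [r for r in self._records
                if wanted in (t.lower() for t in r.tags)]

    def public_records(self):
        """Return only the records that are safe to open up to the public."""
        return [r for r in self._records if r.verified]


# Made-up entries, for illustration only.
catalogue = Catalogue()
catalogue.add(DatasetRecord("School enrolment", "newsroom reports",
                            tags=["Education"], verified=True))
catalogue.add(DatasetRecord("Clinic counts", "reporter contacts",
                            tags=["health"]))

print([r.title for r in catalogue.search("education")])
print([r.title for r in catalogue.public_records()])
```

Keeping the `verified` flag on the record, rather than in a separate list, means that every route to publication can apply the same check.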

5 Functions of the data desk

5.1 Terms of Reference for a Data Desk

1. Support the day-to-day needs of journalists to enable them to write better articles that are well researched and backed up with appropriate data.
2. Receive any data journalists gather as part of their research, clean it and upload it to the data portal.
3. Act as the host component for a central data repository for all data that the organization holds.
4. Facilitate training of journalists on data and analytical basics.
5. Guide the media house on data acquisition and licensing issues.
6. Extract data from sources that aren’t in reusable, shareable formats, and coordinate data collection into the data portal.
7. Acquire datasets from organizations publishing data publicly and provide them in a tagged, categorized and searchable manner through the data management system.
8. Manipulate datasets for purposes of analysis.
9. Identify and manage appropriate standards, policies and procedures for data management within the organization.
10. Develop new ways for users to consume and experience the datasets, acting on feedback.
11. Create visualizations for the journalists, and develop and/or deploy tools that empower journalists to do so themselves.
12. Write, edit and review content for use on the data portal as well as other publishing channels used by the organization and its partners.
13. Educate and inform internal and external audiences about the various tools and resources related to open data and data journalism; market and promote the data portals through networking and presentation opportunities, including the use of social media.
14. Engage audiences in a dialogue around the data and its use, and encourage greater involvement of key communities such as development researchers and academics, policy-makers and software developers.
15. Proactively monitor activity, and engage and contribute as appropriate, in relevant online discussion groups.
16. Maintain a data journalism calendar of events, initiating, planning and organizing relevant internal and external events on data journalism to link ideas, sponsors, partners and data enthusiasts.
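
The cleaning step in item 2 can be illustrated with a short sketch. It assumes the journalist's data arrives as CSV text with untidy headers, stray whitespace, blank lines and numbers written with thousands separators. The function names `clean_rows` and `_clean_cell` and the sample data are hypothetical, and real cleaning rules would depend on the source.

```python
import csv
import io


def clean_rows(raw_csv):
    """Parse CSV text and return cleaned rows as a list of dictionaries."""
    reader = csv.reader(io.StringIO(raw_csv))
    # Drop lines that are empty or contain only whitespace.
    rows = [row for row in reader if any(cell.strip() for cell in row)]
    if not rows:
        return []
    # Normalise headers: trimmed, lower case, underscores for spaces.
    headers = [h.strip().lower().replace(" ", "_") for h in rows[0]]
    return [
        {header: _clean_cell(cell) for header, cell in zip(headers, row)}
        for row in rows[1:]
    ]


def _clean_cell(cell):
    """Trim a cell and convert it to a number where that is possible."""
    text = cell.strip()
    if text == "":
        return None  # record missing values explicitly
    candidate = text.replace(",", "")  # "1,204" becomes "1204"
    try:
        return int(candidate)
    except ValueError:
        pass
    try:
        return float(candidate)
    except ValueError:
        return text  # not numeric: keep the trimmed original text


# Made-up data, for illustration only.
raw = (
    " County , Road Accidents \n"
    'County A,"1,204"\n'
    "\n"
    " County B ,\n"
)

for record in clean_rows(raw):
    print(record)
```

Missing values are kept as `None` rather than guessed, so that anyone analysing the data later can see where the source was incomplete.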

6 Assembling the Desk

6.1 Staffing

The data desk can be assembled through two known methods, or a hybrid of both:

  • Horizontal recruitment from within: getting journalists with particular skills and putting them together to form a complete unit.
  • Vertical recruitment of external help: bringing people in to sit at a desk and work on nothing else but the data.
  • Hybrid approach: doing both, depending on in-house capacity.

Depending on the organisation's capacity, its strategic plan and the skill sets accumulated in-house, there is an opportunity to outsource some of the functions of the data desk. The key positions of the data desk are:

  • Data analyst
  • IT developer (programme coder)
  • Graphic designer / visualisation expert

The number of individuals per position depends on the workload and on the type of requests and deliverables of the desk.

6.2 Infrastructure

  • Computers
  • Server space
  • Minimum space
  • Design software
  • Data analysis tools
  • Backup servers
  • Access to current filing system

7 Administration of the data desk

7.1 Functionality within the organisation structure

7.2 Request handling / prioritising

7.3 Reporting protocol

7.4 Hierarchical structure within the organisation

8 Implementation phasing

8.1 How to prioritise the immediate needs of the organisation based on a) budget, b) scope of work and c) strategic plan

8.2 External support systems if necessary

9 Innovation channels

9.1 What leeway can be given to allow innovation

9.2 Entering competitions and challenges

9.3 In-house innovation challenges
