Kio Stark kiostark

Keybase proof

I hereby claim:

  • I am kiostark on github.
  • I am kio (https://keybase.io/kio) on keybase.
  • I have a public key ASCts-M6KzxjVNxj2Am6EgKFd2KdGkDvHfuMKYhgJcGVYAo

To claim this, I am signing this object:

Journalism can be a high-risk activity, and some stories are a lot riskier than others. In part one we covered the digital security precautions that everyone in news organizations should take. If one of your colleagues uses weak passwords or clicks on a phishing link, more sophisticated efforts are wasted. But assuming that everyone you are working with is already up to speed on basic computer security practice, there's a lot more you can do to provide security for a specific, sensitive story.

This work begins with thinking through what it is you have to protect, and from whom. This is called threat modeling and is the first step in any security analysis. The goal is to construct a picture -- in some ways no more than an educated guess -- of what you're up against. There are many ways to do this, but this post is structured around four basic questions.

  • What do you want to keep private?
  • Who wants to know?
  • What can

[IN GENERAL: This is excellent at outlining what the decisions are that you'll have to make, but I don't feel like there is quite enough on how to make these decisions about how much to do. It's all in the form of "you have to assess and decide." Which I know is the case, but it would be great if there was a little more "at a glance" guidance. Maybe a checklist at the end of each section? Not sure how to solve this. Just keep in mind as you make a next pass.]

Journalism can be a high-risk profession, and some stories are a lot riskier than others[LET'S not repeat same opening line as first one]. In a previous post [[link to first part]], we covered the digital security precautions that every journalist should take. If one of your colleagues uses weak passwords or clicks on a phishing link, more sophisticated efforts are wasted. But assuming that everyone you are working with is already up to speed on basic computer security practice, there's a lot more you can do to provide security for a specific, sensitive

You got the documents. Now what?

[omg documents.png]

Congratulations! Your Freedom of Information request finally yielded a big brown envelope in the mail. You are the lucky recipient of a juicy leak. You've managed to scrape all the PDFs from that stone-age government portal. Now all you have to do is the reporting.

Would that it were so easy. Your next steps depend on what you've got and what you're trying to do. You might have one page or one million pages. You could be starting with a tall stack of paper or a CSV file or anything in between. Maybe you already know exactly what you're looking for, or maybe that anonymous tip was maddeningly non-specific. In the course of my work on the Overview document-mining software I've seen just about every problem that journalists can have with a document-driven story. These are the stories of unreadable formats, heaps of paper, and late nights reading. This post is organized as a sort of flowchart, a series of questions y

You got the documents. Now what?

[omg documents.png]

Congratulations! Your Freedom of Information request finally yielded a big brown envelope in the mail. You are the proud owner of a juicy leak. You've managed to scrape all the PDFs from that stone-age open government portal. Now all you have to do is the reporting.

In the course of my work on the Overview document-mining software I've seen just about every problem that journalists can have with a document-driven story. These are the stories of unreadable formats, heaps of paper, and late nights reading.

When you're the proud owner of a brand new document dump, the next steps depend on what you've got and what you're trying to do. You might have one page or one million pages. You could be starting with a tall stack of paper or a CSV file or anything in between. Maybe you already know exactly what you're looking for, or maybe that anonymous tip was so non-specific you don't know where to start. This post is organized as a

kiostark / styleguide.md
Last active December 20, 2015 06:08 — forked from kissane/styleguide.md

Style Book

We're currently using Chicago, but we can switch to AP if there's a good reason. (Like, say, Chicago annoying our entire readership.)

Style Basics & Deviations From the Stylebook

  • We don't wrap article titles within text in quotes, but we do link to them on first usage
  • We don't italicize the names of publications in article text
  • We don't cap "The" in publication titles in article text, but we do in Organization entries
  • Commas and periods go inside closing quotation marks

kiostark / gist:6023960
Last active December 19, 2015 21:58 — forked from jstray/gist:6003431

Title: Don't Let Your Data Fool You

The job of a data journalist is to turn data into a story by finding some sort of pattern. If you start with a spreadsheet of cancer rates, the story might be "people living near oil refineries had three times the rate of lung cancer." Or it might not be, because you could be misinterpreting the data in some way. Think of headlines like "crime rates fall," "humans are causing climate change," or "countries with more guns have more deaths by firearms." What exactly are these headlines claiming, and are these stories true?

Data doesn't speak for itself, or the data journalist would not be needed. Instead it must be interpreted. This is the process of selecting and obtaining the relevant data, finding the interesting facts or patterns, putting them in context, and explaining what they mean. But there are many ways this process can go wrong and, sorry to say, professional journalists s