Brendan O'Brien b5

@b5
b5 / audit_full.json
Created April 28, 2017 16:21
epa dataset audit
This file has been truncated, but you can view the full file.
{
"name": "EPA",
"descendants": 18476,
"descendantsDownloadedOnce": 6860,
"descendantsArchivedOnce": 6842,
"children": [
{
"name": "https",
"descendants": 15278,
"descendantsDownloadedOnce": 5865,
@b5
b5 / kiwix.go
Last active May 11, 2017 20:43
Scrape Kiwix Urls
// Scrape info from kiwix service. This takes a little while to execute.
package main
import (
"encoding/json"
"github.com/PuerkitoBio/goquery"
"io/ioutil"
"log"
"net/http"
"os"
@b5
b5 / collection.json
Created September 21, 2017 16:49
EPA Endangered Species Act Page
{ "name" : "Protecting Endangered Species from Pesticides",
"url" : "https://www.epa.gov/endangered-species",
"description" : "Main EPA Endangered Species Act & Pesticides Web Pages"
}
package main
import (
"bufio"
"flag"
"fmt"
"github.com/PuerkitoBio/purell"
"io"
"net/url"
"os"

Keybase proof

I hereby claim:

  • I am b5 on github.
  • I am bfive (https://keybase.io/bfive) on keybase.
  • I have a public key ASAxwz2PapxPUS8GzFKzteRZ15MFoKkJjbxtjMP2Qw98tQo

To claim this, I am signing this object:

@b5
b5 / collection.json
Last active November 4, 2017 19:12 — forked from titaniumbones/collection.json
extract_href output from https://nccwsc.usgs.gov/acccnrs, "Advisory Committee on Climate Change and Natural Resource Science (ACCCNRS)"
{
"name": "Advisory Committee on Climate Change and Natural Resource Science (ACCCNRS)",
"url": "https://nccwsc.usgs.gov/acccnrs",
"description": "The Advisory Committee on Climate Change and Natural Resource Science (ACCCNRS) was established in 2013 to advise the Secretary of the Interior on the operations of the U.S. Geological Survey (USGS) National Climate Change and Wildlife Science Center (NCCWSC) and the Department of the Interior (DOI) Climate Science Centers (CSCs). ACCCNRS was composed of 25 members that represented (1) State and local governments, including state membership entities; (2) Nongovernmental organizations, including those whose primary mission is professional and scientific and those whose primary mission is conservation and related scientific and advocacy activities; (3) American Indian tribes and other Native American entities; (4) Academia; (5) Landowners, businesses, and organizations representing landowners or businesses. In 2015, ACCCNRS released its 2015 Report to the Secre
@b5
b5 / collection.json
Last active November 4, 2017 19:33 — forked from titaniumbones/collection.json
extract_href output from https://www.fws.gov/endangered/education/wonderful.html, "Wierd & Wonderful wildlife"
{
"name": "Wierd & Wonderful wildlife",
"url": "https://www.fws.gov/endangered/education/wonderful.html",
"description": "'There are many threatened and endangered species that you have probably never heard of. Here, you can learn about 14 weird and wonderful species that are currently endangered, threatened, or of special concern. Learning about these rare species can be fun! Choose a game and good luck!' Links mostly to old flash games. Not long for this world."
}
@b5
b5 / cr_to_crlf_replacer.go
Created February 27, 2018 00:18
Dealing with Solo Carriage Returns in csv.Reader
package main
import (
"bufio"
"bytes"
"fmt"
"io"
"encoding/csv"
)
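The preview above stops at the import block, so the gist's own implementation is not shown here. As a minimal sketch of the idea named in the title, and not the author's code, the following wraps an io.Reader so that every carriage return not already followed by a line feed becomes CRLF before the bytes reach csv.Reader. The names crlfReader, newCRLFReader, and parseCSV are hypothetical and chosen only for this illustration.

```go
package main

import (
	"bufio"
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// crlfReader (hypothetical name) rewrites every solo '\r' as "\r\n".
type crlfReader struct {
	src     *bufio.Reader
	pending bool // true when a '\n' must be emitted before reading more input
}

func newCRLFReader(r io.Reader) *crlfReader {
	return &crlfReader{src: bufio.NewReader(r)}
}

// Read copies bytes from the source one at a time, inserting '\n' after
// any '\r' that is not already followed by '\n'.
func (c *crlfReader) Read(p []byte) (int, error) {
	n := 0
	for n < len(p) {
		if c.pending {
			p[n] = '\n'
			n++
			c.pending = false
			continue
		}
		b, err := c.src.ReadByte()
		if err != nil {
			if n > 0 {
				return n, nil // hand back what we have; the error resurfaces on the next call
			}
			return 0, err
		}
		p[n] = b
		n++
		if b == '\r' {
			// Peek at the next byte without consuming it. At end of input
			// the trailing '\r' is also treated as a solo carriage return.
			next, err := c.src.Peek(1)
			if err != nil || next[0] != '\n' {
				c.pending = true
			}
		}
	}
	return n, nil
}

// parseCSV (hypothetical helper) parses s after normalizing solo carriage returns.
func parseCSV(s string) ([][]string, error) {
	return csv.NewReader(newCRLFReader(strings.NewReader(s))).ReadAll()
}

func main() {
	// The first record ends in a solo '\r', the second in "\r\n".
	records, err := parseCSV("name,count\rkiwi,3\r\nemu,5\r")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(records), records)
}
```

This sketch also rewrites carriage returns that appear inside quoted fields, which may or may not be wanted, and reading byte by byte favors clarity over throughput.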
@b5
b5 / skylark_transformations_tutorial.md
Created June 11, 2018 15:57
Qri Skylark Transformations Tutorial

Qri ("query") is about datasets. A transformation is a repeatable script for generating a dataset. Skylark is a scripting language from Google that feels a lot like Python. This package implements Skylark as a transformation syntax. Skylark transformations are about as close as one can get to the full power of a programming language as a transformation syntax. Often you need this degree of control to generate a dataset.

Typical examples of a skylark transformation include:

  • combining paginated calls to an API into a single dataset
  • downloading unstructured data from the internet to extract structured data
  • re-shaping raw input data before saving a dataset
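The first item in the list above, combining paginated calls into one dataset, has roughly the following shape. This sketch is written in Go, the language of the other snippets on this page, rather than in Skylark, and it does not use any Qri API. The names pageFunc, collectPages, and fakeAPI are hypothetical, and fakeAPI serves in-memory pages in place of real HTTP requests.

```go
package main

import "fmt"

// pageFunc is a hypothetical stand-in for one call to a paginated API.
// It returns the rows on that page and whether another page follows.
type pageFunc func(page int) (rows []string, more bool, err error)

// collectPages calls fetch for page 1, 2, 3, ... and appends every row
// to a single slice until fetch reports that no pages remain.
func collectPages(fetch pageFunc) ([]string, error) {
	var all []string
	for page := 1; ; page++ {
		rows, more, err := fetch(page)
		if err != nil {
			return nil, fmt.Errorf("page %d: %w", page, err)
		}
		all = append(all, rows...)
		if !more {
			return all, nil
		}
	}
}

// fakeAPI serves three in-memory pages in place of real HTTP requests.
func fakeAPI(page int) ([]string, bool, error) {
	pages := [][]string{{"a", "b"}, {"c", "d"}, {"e"}}
	if page < 1 || page > len(pages) {
		return nil, false, fmt.Errorf("no such page")
	}
	return pages[page-1], page < len(pages), nil
}

func main() {
	rows, err := collectPages(fakeAPI)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(rows)
}
```

Passing the fetch step in as a function keeps the accumulation loop independent of how a page is actually retrieved.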

We're excited about skylark for a few reasons:

  • python syntax - many people working in data science these days write python, we like that, skylark likes that. dope.
  • deterministic subset of python - unlike python, skylark removes properties that reduce introspection into code behaviour
@b5
b5 / config.json
Created December 17, 2018 21:19
walk test config
{
"Badger": {
"Dir": "badger",
"ValueDir": "badger",
"SyncWrites": true,
"TableLoadingMode": 1,
"ValueLogLoadingMode": 2,
"NumVersionsToKeep": 1,
"MaxTableSize": 67108864,
"LevelSizeMultiplier": 10,