Skip to content

Instantly share code, notes, and snippets.

@hellais
Created August 3, 2018 14:55
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save hellais/8308cca261023cb0fb68f46a90094e2a to your computer and use it in GitHub Desktop.
Save hellais/8308cca261023cb0fb68f46a90094e2a to your computer and use it in GitHub Desktop.
CREATE TABLE `results` (
`id` INTEGER PRIMARY KEY AUTOINCREMENT,
-- This can be one of "websites", "im", "performance", "middlebox".
`test_group_name` VARCHAR(255),
-- We use a different start_time and runtime, because we want to also have
-- data to measure the overhead of creating a report and other factors that
-- go into the test.
-- That is to say: `SUM(runtime) FROM measurements` will always be <=
-- `runtime FROM results` (most times <)
`start_time` DATETIME,
`runtime` REAL,
-- This is a flag used to indicate if the result is done or is currently running.
`is_done` TINYINT(1),
`data_usage_up` INTEGER,
`data_usage_down` INTEGER
);
CREATE TABLE `measurements` (
`id` INTEGER PRIMARY KEY AUTOINCREMENT,
-- This can be one of:
-- facebook_messenger
-- telegram
-- whatsapp
-- http_header_field_manipulation
-- http_invalid_request_line
-- dash
-- ndt
`test_name` VARCHAR(255),
`start_time` DATETIME,
`runtime` REAL,
-- For the purpose of populating the probe information in the results
-- views, you should pick the first measurement in the JOIN sorted by
-- start_time.
-- You don't have the guarantee that every (ip, asn, country, network_name)
-- is the same in a "measurement set" associated to a "result".
`ip` VARCHAR(255),
`asn` VARCHAR(16),
`country` VARCHAR(2),
`network_name` VARCHAR(255),
-- Note for golang: we used to have state be one of `done` and `active`, so
-- this is equivalent to done being true or false.
-- `state` TEXT,
`is_done` TINYINT(1),
-- The reason to have a dedicated is_uploaded flag, instead of just using
-- is_upload_failed, is that we may not have uploaded the measurement due
-- to a setting.
`is_uploaded` TINYINT(1),
-- This is the measurement failed to run and the user should be offerred to
-- re-run it.
`is_failed` TINYINT(1),
`failure_msg` VARCHAR(255),
`is_upload_failed` TINYINT(1),
`upload_failure_msg` VARCHAR(255),
-- This is the server-side report_id returned by the collector. By using
-- report_id & input, you can query the api to fetch this measurement.
-- Ex.
-- GET https://api.ooni.io/api/v1/measurements?input=$INPUT&report_id=$REPORT_ID
-- Extract the first item from the `result[]` list and then fetch:
-- `measurement_url` to get the JSON of this measurement row.
-- These two values (`report_id`, `input`) are useful to fetch a
-- measurement that has already been processed by the pipeline, to
-- implement cleanup of already uploaded measurements.
`report_id` VARCHAR(255),
`input` VARCHAR(255),
-- This is not yet a feature of the collector, but we are planning to add
-- this at some point in the near future.
-- See: https://github.com/ooni/pipeline/blob/master/docs/ooni-uuid.md &
-- https://github.com/ooni/pipeline/issues/48
-- @dark should be use a INT(64) or some VARCHAR for the canonical hex representation?
`measurement_id` INT(64),
-- This indicates in the case of a websites test, that a site is likely
-- blocked, or for an IM test if the IM tests says the app is likely
-- blocked, or if a middlebox was detected.
-- You can `JOIN` a `COUNT()` of this value in the results view to get a count of
-- blocked sites or blocked IM apps
`anomaly` TINYINT(1),
-- This is an opaque JSON structure, where we store some of the test_keys
-- we need for the measurement details views and some result views (ex. the
-- upload/download speed of NDT, the reason for blocking of a site,
-- etc.)
`test_keys` JSON,
-- The cross table reference to JOIN the two tables together.
`result_id` INTEGER REFERENCES `results` (`id`) ON DELETE SET NULL ON UPDATE CASCADE
-- This is a variable used internally to track the path to the on-disk
-- measurements.json. It may make sense to write one file per entry by
-- hooking MK and preventing it from writing to a file on disk which may
-- have many measurements per file.
`report_file` VARCHAR(255),
);
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment