@tmtmtmtm
Created May 28, 2018 11:39
#!/usr/bin/env ruby
# frozen_string_literal: true
require 'csv'
require 'pry'
require 'rest-client'
#---------------------------------------------------------------------------
# Find all members of the 37th Eduskunta with no P4100 (parliamentary group)
# and add one from
# https://fi.wikipedia.org/wiki/Luettelo_vaalikauden_2015–2019_kansanedustajista
#
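# Takes the output of the scraper at
# https://github.com/everypolitician-scrapers/finland-eduskunta-2015-wikipedia
# stored as wikipedia.csv (the `wikidata` and `party_wikidata` columns are used)
#
# Emits commands to feed to PositionStatements
# https://github.com/everypolitician/position_statements
# one tab-separated line per membership:
#   <member ID>  P39  <statement ID>  P4100  <party ID>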
#---------------------------------------------------------------------------
WIKIDATA_SPARQL_URL = 'https://query.wikidata.org/sparql'
def sparql(query)
  result = RestClient.get WIKIDATA_SPARQL_URL, accept: 'text/csv', params: { query: query }
  CSV.parse(result.body, headers: true, header_converters: :symbol)
rescue RestClient::Exception => e
  raise "Wikidata query #{query} failed: #{e.message}"
end
query = <<~SPARQL
  SELECT ?item ?ps WHERE {
    ?item p:P39 ?ps .
    ?ps ps:P39/wdt:P279* wd:Q17592486 ; pq:P2937 wd:Q20253302 .
    FILTER NOT EXISTS { ?ps pq:P4100 [] }
  }
SPARQL
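# Index the scraped rows by each member's Wikidata ID
wikipedia = CSV.table('wikipedia.csv').group_by { |r| r[:wikidata] }

sparql(query).map(&:to_h).each do |row|
  # Results are full entity URIs; the ID is the final path segment
  id = row[:item].split('/').last
  wp_row = wikipedia[id] or next # skip members missing from the scraped list
  commands = [id, 'P39', row[:ps].split('/').last, 'P4100', wp_row.first[:party_wikidata]]
  puts commands.join("\t")
end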