@inokappa
Last active Nov 24, 2015
A rough script that loads the CSV files of the PM2.5 monitoring data (overseas) (http://www2.env.go.jp/pm25monitoring/download.html) into Elasticsearch
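# coding: utf-8
require 'csv'
require 'json'
require 'net/http'

# Earlier experiment, kept for reference: read the CSV directly and skip the header row.
# CSV.open("data/201507.csv", encoding: "Shift_JIS:UTF-8") do |f|
#   f.each_with_index do |item, i|
#     next if i == 0
#     p item
#   end
# end

# POST one JSON document to the pm25 index (type: abroad) and print the response.
def post_to_elasticsearch(json)
  uri = URI.parse('http://localhost:9200/pm25/abroad')
  http = Net::HTTP.new(uri.host, uri.port)
  req = Net::HTTP::Post.new(uri.request_uri)
  req['Content-Type'] = 'application/json'
  req.body = json
  res = http.request(req)
  puts "code -> #{res.code}"
  puts "msg -> #{res.message}"
  puts "body -> #{res.body}"
end

# The two arguments are concatenated into the file name: data/<ARGV[0]><ARGV[1]>.csv
abort "usage: ruby #{$0} ARG1 ARG2 (reads data/<ARG1><ARG2>.csv)" unless ARGV.size == 2

header = %w[CHECK_TIME CHECK_POINT DATE TIME VALUE]
file = "data/#{ARGV[0]}#{ARGV[1]}.csv"

# Each row (header line already removed) holds CHECK_POINT, DATE (Y/M/D), TIME (hour), VALUE.
File.open(file, 'rb:Shift_JIS:UTF-8', undef: :replace) do |f|
  CSV.new(f).each do |row|
    # Zero-pad the date parts; to_i keeps a value such as "08" from being rejected as invalid octal.
    date = row[1].split('/').map { |n| format('%02d', n.to_i) }
    # Hour 24 is written as hour 0 of the same date (the date itself is not advanced).
    hour = row[2] == '24' ? 0 : row[2].to_i
    check_time = "#{date.join('-')} #{format('%02d', hour)}:00:00"
    row.unshift(check_time)
    # Pair each header name with its value and send the row as one document.
    post_to_elasticsearch(header.zip(row).to_h.to_json)
  end
end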
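
The script maps hour 24 to hour 0 without changing the date. If hour 24 in the source files means the end of the day (an assumption, not something the gist states), the timestamp would land a day early. The sketch below shows one possible way to roll the date forward instead. `check_time_with_rollover` is a hypothetical helper, not part of the original script.

```ruby
require 'date'

# Hypothetical helper (not in the original script): build the CHECK_TIME string
# from a "Y/M/D" date and an hour in 1..24, treating hour 24 as 00:00 of the next day.
def check_time_with_rollover(date_str, hour_str)
  year, month, day = date_str.split('/').map(&:to_i)
  hour = hour_str.to_i
  date = Date.new(year, month, day)
  if hour == 24
    date += 1 # advance to the following calendar day
    hour = 0
  end
  format('%s %02d:00:00', date.strftime('%Y-%m-%d'), hour)
end

puts check_time_with_rollover('2013/7/31', '24') # prints 2013-08-01 00:00:00
puts check_time_with_rollover('2013/7/1', '9')   # prints 2013-07-01 09:00:00
```

The output keeps the same "YYYY-MM-dd HH:mm:ss" shape that the script produces, so it could stand in for the `check_time` line if the end-of-day reading turns out to be correct.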

inokappa commented Jul 9, 2015

Register the mapping in advance, as follows.

curl -XPUT 'localhost:9200/pm25' -d '
{
  "mappings" : {
    "abroad" : {
      "properties" : {
        "DATE" : {
          "type" : "string"
        },
        "TIME" : {
          "type" : "string"
        },
        "VALUE" : {
          "type" : "long"
        },
        "CHECK_POINT" : {
          "type" : "string"
        },
        "CHECK_TIME" : {
          "type" : "date",
          "format" : "YYYY-MM-dd HH:mm:ss"
        }
      }
    }
  }
}'
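
The same registration could also be done from Ruby, in the style of the loader script. This is only a sketch: `MAPPING`, `build_mapping_request`, and `register_mapping` are hypothetical names, and the host and port are taken from the curl command above.

```ruby
require 'json'
require 'net/http'

# The mapping from the curl command, expressed as a Ruby hash.
MAPPING = {
  mappings: {
    abroad: {
      properties: {
        DATE: { type: 'string' },
        TIME: { type: 'string' },
        VALUE: { type: 'long' },
        CHECK_POINT: { type: 'string' },
        CHECK_TIME: { type: 'date', format: 'YYYY-MM-dd HH:mm:ss' }
      }
    }
  }
}.freeze

# Hypothetical helper: build (but do not send) the PUT request that creates the index.
def build_mapping_request(uri)
  req = Net::HTTP::Put.new(uri.request_uri)
  req['Content-Type'] = 'application/json'
  req.body = JSON.generate(MAPPING)
  req
end

# Hypothetical helper: send the request and return the HTTP response.
def register_mapping(uri)
  Net::HTTP.start(uri.host, uri.port) { |http| http.request(build_mapping_request(uri)) }
end

# Usage (needs a running Elasticsearch, so it is left commented out):
# res = register_mapping(URI.parse('http://localhost:9200/pm25'))
# puts "code -> #{res.code}"
```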

inokappa commented Jul 9, 2015

The first line of each data file is a header row, so delete it before loading.

mkdir data
cd data
wget http://www2.env.go.jp/pm25monitoring/data/csv/H_2013.zip
unzip H_2013.zip
for i in 2013*; do sed -i '1d' "$i"; done
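
As an alternative to editing the files with sed, the header could be skipped at read time, much like the commented-out experiment in the script. A minimal sketch follows. `each_data_row` is a hypothetical helper, and the sample column names and values are made up for illustration only.

```ruby
require 'csv'
require 'stringio'

# Hypothetical helper: yield every data row of a CSV stream, skipping the first (header) line.
def each_data_row(io)
  CSV.new(io).each_with_index do |row, i|
    next if i.zero? # the first line is the header
    yield row
  end
end

# Stand-in for a downloaded file; the contents are invented for this example.
sample = StringIO.new("POINT,DATE,TIME,VALUE\nA,2013/1/1,1,10\nA,2013/1/1,2,12\n")
rows = []
each_data_row(sample) { |row| rows << row }
p rows
```

With this approach the downloaded files stay untouched, so the script can be run again on a fresh download without the sed step.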

inokappa commented Nov 24, 2015

curl -XPUT 'localhost:9200/pm25' -d '
{
  "mappings" : {
    "kyushu" : {
      "properties" : {
        "town_name" : {
          "type" : "string"
        },
        "mon_st_name" : {
          "type" : "string"
        },
        "PM2_5" : {
          "type" : "long"
        },
        "TEMP" : {
          "type" : "long"
        },
        "CHECK_TIME" : {
          "type" : "long"
        },
        "CHECK_DATE_TIME" : {
          "type" : "date",
          "format" : "YYYY-MM-dd kk:mm:ss"
        }
      }
    }
  }
}'