@shibacow
Created September 13, 2017 17:02
Bulk-downloads the law data from e-Gov法令検索 (e-Gov Law Search), http://elaws.e-gov.go.jp/download/lawdownload.html, waiting 10 seconds between downloads.
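#!/usr/bin/env python
# -*- coding:utf-8 -*-
import os
import re
import shutil
import time
from datetime import datetime

import requests
from pyquery import PyQuery as pq

src = 'http://elaws.e-gov.go.jp/download/lawdownload.html'
dst = 'http://elaws.e-gov.go.jp/download/'


def down(nowd, f):
    # Stream one zip file from the download directory into nowd.
    d = dst + f
    print(d)
    res = requests.get(d, stream=True)
    res.raise_for_status()  # stop instead of saving an error page as a zip
    dstf = os.path.join(nowd, f)
    with open(dstf, 'wb') as fp:
        shutil.copyfileobj(res.raw, fp)


def main():
    # Fetch the download page and select the table that lists the files.
    d = pq(url=src)
    k = d('table#sclTbl')
    # Save into a directory named after today's date.
    nowd = datetime.now().strftime('%Y-%m-%d')
    if not os.path.isdir(nowd):
        os.mkdir(nowd)
    for p in k('a'):
        p = pq(p)
        c = p.attr('href') or ''  # anchors without an href give None
        # Raw string with an escaped dot, so only names like "123.zip" match.
        s = re.search(r"javascript:lawdata_download\('(\d+\.zip)'\)", c)
        if s:
            f = s.group(1)
            down(nowd, f)
            time.sleep(10)  # sleep 10 seconds


if __name__ == '__main__':
    main()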
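
The filename extraction done in main() can be tried on its own, without touching the network. The following is a minimal sketch, assuming the links on the download page have the form `javascript:lawdata_download('123.zip')`, which is what the script's regular expression expects. The helper name `extract_zip_names` and the sample hrefs are hypothetical and exist only for illustration.

```python
import re

# Pattern modelled on the one in the script: it captures the zip file name
# passed to the JavaScript download call.
ZIP_LINK = re.compile(r"javascript:lawdata_download\('(\d+\.zip)'\)")


def extract_zip_names(hrefs):
    """Return the zip file names found in an iterable of href values."""
    names = []
    for href in hrefs:
        # Anchors without an href yield None, so fall back to an empty string.
        match = ZIP_LINK.search(href or '')
        if match:
            names.append(match.group(1))
    return names


# Hypothetical sample hrefs, not taken from the real page.
sample = [
    "javascript:lawdata_download('101.zip')",
    "lawdownload.html",
    None,
    "javascript:lawdata_download('2050.zip')",
]
print(extract_zip_names(sample))
```

Keeping the matching in a small pure function like this makes it possible to check the pattern against a saved copy of the page before starting a run that sleeps 10 seconds per file.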
@shibacow

Required packages:

pyquery==1.2.9
requests==2.9.1

They can be installed with:

pip install -U pyquery
pip install -U requests
