@saper
Created April 10, 2015 09:58
Script started on Fri Apr 10 11:57:02 2015
$ rm -rf ~/dump/exp
$ mkdir ~/dump/exp
$ python dumpgenerator.py --xml --path ~/dump/exp --force http://pl.wikimedia.org/
Checking API... http://pl.wikimedia.org/w/api.php
API is OK: http://pl.wikimedia.org/w/api.php
Checking index.php... http://pl.wikimedia.org/w/index.php
index.php is OK
PLEASE, DO NOT USE THIS SCRIPT TO DOWNLOAD WIKIMEDIA PROJECTS!
Download the dumps from http://dumps.wikimedia.org
#########################################################################
# Welcome to DumpGenerator 0.3.0-alpha by WikiTeam (GPL v3) #
# More info at: https://github.com/WikiTeam/wikiteam #
#########################################################################
#########################################################################
# Copyright (C) 2011-2014 WikiTeam #
# This program is free software: you can redistribute it and/or modify #
# it under the terms of the GNU General Public License as published by #
# the Free Software Foundation, either version 3 of the License, or #
# (at your option) any later version. #
# #
# This program is distributed in the hope that it will be useful, #
# but WITHOUT ANY WARRANTY; without even the implied warranty of #
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the #
# GNU General Public License for more details. #
# #
# You should have received a copy of the GNU General Public License #
# along with this program. If not, see <http://www.gnu.org/licenses/>. #
#########################################################################
Analysing http://pl.wikimedia.org/w/api.php
Warning!: "/usr/home/saper/dump/exp" path exists
There is a dump in "/usr/home/saper/dump/exp", probably incomplete.
If you choose resume, to avoid conflicts, the parameters you have chosen in the current session will be ignored
and the parameters available in "/usr/home/saper/dump/exp/config.txt" will be loaded.
Do you want to resume ([yes, y], [no, n])? y
No config file found. I can't resume. Aborting.
$ python dumpgenerator.py --xml --path ~/dump/exp --force http://pl.wikimedia.org/
Checking API... http://pl.wikimedia.org/w/api.php
API is OK: http://pl.wikimedia.org/w/api.php
Checking index.php... http://pl.wikimedia.org/w/index.php
index.php is OK
PLEASE, DO NOT USE THIS SCRIPT TO DOWNLOAD WIKIMEDIA PROJECTS!
Download the dumps from http://dumps.wikimedia.org
#########################################################################
# Welcome to DumpGenerator 0.3.0-alpha by WikiTeam (GPL v3) #
# More info at: https://github.com/WikiTeam/wikiteam #
#########################################################################
#########################################################################
# Copyright (C) 2011-2014 WikiTeam #
# This program is free software: you can redistribute it and/or modify #
# it under the terms of the GNU General Public License as published by #
# the Free Software Foundation, either version 3 of the License, or #
# (at your option) any later version. #
# #
# This program is distributed in the hope that it will be useful, #
# but WITHOUT ANY WARRANTY; without even the implied warranty of #
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the #
# GNU General Public License for more details. #
# #
# You should have received a copy of the GNU General Public License #
# along with this program. If not, see <http://www.gnu.org/licenses/>. #
#########################################################################
Analysing http://pl.wikimedia.org/w/api.php
Warning!: "/usr/home/saper/dump/exp" path exists
There is a dump in "/usr/home/saper/dump/exp", probably incomplete.
If you choose resume, to avoid conflicts, the parameters you have chosen in the current session will be ignored
and the parameters available in "/usr/home/saper/dump/exp/config.txt" will be loaded.
Do you want to resume ([yes, y], [no, n])? n
You have selected: NO
Trying to use path "/usr/home/saper/dump/exp-2"...
Trying generating a new dump into a new directory...
Loading page titles from namespaces = all
Excluding titles from namespaces = None
18 namespaces found
Retrieving titles in the namespace 0
.... 1876 titles retrieved in the namespace 0
Retrieving titles in the namespace 1
. 193 titles retrieved in the namespace 1
Retrieving titles in the namespace 2
.. 774 titles retrieved in the namespace 2
Retrieving titles in the namespace 3
..^CTraceback (most recent call last):
  File "dumpgenerator.py", line 2031, in <module>
    main()
  File "dumpgenerator.py", line 2023, in main
    createNewDump(config=config, other=other)
  File "dumpgenerator.py", line 1595, in createNewDump
    getPageTitles(config=config, session=other['session'])
  File "dumpgenerator.py", line 381, in getPageTitles
    for title in titles:
  File "dumpgenerator.py", line 248, in getPageTitlesAPI
    r = session.post(url=config['api'], data=params)
  File "/usr/lib64/python2.7/site-packages/requests/sessions.py", line 498, in post
    return self.request('POST', url, data=data, **kwargs)
  File "/usr/lib64/python2.7/site-packages/requests/sessions.py", line 456, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib64/python2.7/site-packages/requests/sessions.py", line 559, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib64/python2.7/site-packages/requests/adapters.py", line 327, in send
    timeout=timeout
  File "/usr/lib64/python2.7/site-packages/requests/packages/urllib3/connectionpool.py", line 493, in urlopen
    body=body, headers=headers)
  File "/usr/lib64/python2.7/site-packages/requests/packages/urllib3/connectionpool.py", line 319, in _make_request
    httplib_response = conn.getresponse(buffering=True)
  File "/usr/lib64/python2.7/httplib.py", line 1067, in getresponse
    response.begin()
  File "/usr/lib64/python2.7/httplib.py", line 409, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.7/httplib.py", line 365, in _read_status
    line = self.fp.readline(_MAXLINE + 1)
  File "/usr/lib64/python2.7/socket.py", line 476, in readline
    data = self._sock.recv(self._rbufsize)
KeyboardInterrupt
$ ls -l ~/dump/exp*
/usr/home/saper/dump/exp:
total 0
/usr/home/saper/dump/exp-2:
total 92
-rw-r--r-- 1 saper wheel 353 Apr 10 11:57 config.txt
-rw-r--r-- 1 saper wheel 88677 Apr 10 11:57 plwikimediaorg_w-20150410-titles.txt
$
Script done on Fri Apr 10 11:57:57 2015