Oleksandr Podolskyi podolskyi

  • Ukraine
@podolskyi
podolskyi / datetime.py
Created March 22, 2016 01:31
Manipulate datetime objects in Python. Convert to and from strings.
import datetime
# datetime object to string
datetime.datetime.today().strftime("%m/%d/%Y %H:%M")
# from string to datetime object
# the format string must match the input: %b = abbreviated month name, %d = day, %Y = year
datetime.datetime.strptime('Mar 22, 2016 00:00', "%b %d, %Y %H:%M")
# add one day
datetime.datetime.today() + datetime.timedelta(days=1)
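The three operations can also be chained as a round trip. This is a minimal sketch, assuming a fixed date in place of `today()` so that the result is reproducible; the variable names `moment`, `as_text`, `parsed`, and `next_day` are illustrative and not part of the original snippet.

```python
import datetime

# A fixed moment is used instead of today() so the values are reproducible.
moment = datetime.datetime(2016, 3, 22, 0, 0)

# datetime object to string
as_text = moment.strftime("%m/%d/%Y %H:%M")

# string back to a datetime object; the format string mirrors the one used above
parsed = datetime.datetime.strptime(as_text, "%m/%d/%Y %H:%M")

# add one day
next_day = parsed + datetime.timedelta(days=1)

print(as_text)
print(next_day.isoformat())
```

Reusing one format string for both `strftime` and `strptime` keeps the two directions consistent, since `strptime` raises `ValueError` when the string does not match the format.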
@podolskyi
podolskyi / curl_proxy.sh
Created March 28, 2016 08:01
curl with proxy
# -x, --proxy <[protocol://][user:password@]proxyhost[:port]>
#
# Use the specified HTTP proxy.
# If the port number is not specified, it is assumed at port 1080.
curl -x http://proxy_server:proxy_port --proxy-user username:password -L http://url
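A rough Python counterpart of the command above, using only the standard library. This is a sketch under assumptions: the helper name `build_proxy_opener`, the port `8080`, and the credentials are placeholders, and nothing is sent over the network until `opener.open(...)` is called.

```python
import urllib.parse
import urllib.request


def build_proxy_opener(proxy_host, proxy_port, username, password):
    """Return an opener that routes HTTP(S) requests through an authenticated proxy.

    Hypothetical helper for illustration; it is not part of the original gist.
    """
    # Percent-encode the credentials so characters such as '@' or ':' stay unambiguous.
    user = urllib.parse.quote(username, safe="")
    secret = urllib.parse.quote(password, safe="")
    proxy_url = "http://{}:{}@{}:{}".format(user, secret, proxy_host, proxy_port)
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    # Openers built this way also follow redirects by default, similar to curl's -L.
    return urllib.request.build_opener(handler)


# Placeholder values mirroring the curl example; replace them with real ones.
opener = build_proxy_opener("proxy_server", 8080, "username", "password")
# response = opener.open("http://url")  # would perform the request via the proxy
```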
@podolskyi
podolskyi / install.sh
Last active April 10, 2016 07:58
Installing components
sudo apt-get update
sudo apt-get install -y python-dev python-pip libxml2-dev libxslt1-dev zlib1g-dev libffi-dev libssl-dev
pip install Scrapy
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv EA312927
echo "deb http://repo.mongodb.org/apt/ubuntu trusty/mongodb-org/3.2 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.2.list
sudo apt-get update
sudo apt-get install -y mongodb-org
sudo service mongod start
pip install pymongo
@podolskyi
podolskyi / message_queue_pipeline.py
Created May 20, 2016 11:40 — forked from azizmb/message_queue_pipeline.py
Scrapy pipeline to enqueue scraped items to a message queue using carrot
from scrapy.xlib.pydispatch import dispatcher
from scrapy import signals
from scrapy.exceptions import DropItem
from scrapy.utils.serialize import ScrapyJSONEncoder
from carrot.connection import BrokerConnection
from carrot.messaging import Publisher
from twisted.internet.threads import deferToThread
@podolskyi
podolskyi / git command.markdown
Created May 22, 2016 17:09 — forked from nasirkhan/git command.markdown
`git` discard all local changes/commits and pull from upstream

git discard all local changes/commits and pull from upstream

git reset --hard origin/master

git pull origin master

@podolskyi
podolskyi / python_start
Created June 26, 2016 21:16
Install libraries for Python
sudo apt-get install -y python-dev python-pip libxml2-dev libxslt1-dev zlib1g-dev libffi-dev libssl-dev
pip install virtualenv
@podolskyi
podolskyi / txspider.py
Created July 26, 2016 15:10 — forked from rmax/txspider.py
Using twisted deferreds in a scrapy spider!
$ scrapy runspider txspider.py
2016-07-05 23:11:39 [scrapy] INFO: Scrapy 1.1.0 started (bot: scrapybot)
2016-07-05 23:11:39 [scrapy] INFO: Overridden settings: {}
2016-07-05 23:11:40 [scrapy] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats', 'scrapy.extensions.logstats.LogStats']
2016-07-05 23:11:40 [scrapy] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
@podolskyi
podolskyi / myspider.py
Created July 26, 2016 15:10 — forked from rmax/myspider.py
An example of a Scrapy spider returning a Twisted deferred.
from scrapy import Spider, Item, Field
from twisted.internet import defer, reactor
class MyItem(Item):
    url = Field()
class MySpider(Spider):
@podolskyi
podolskyi / celery-crontab.py
Created July 30, 2016 09:28 — forked from alexanderjulo/celery-crontab.py
celery crontab example
from celery.schedules import crontab
from flask.ext.celery import Celery
CELERYBEAT_SCHEDULE = {
    # executes every night at 4:20
    'every-night': {
        'task': 'user.checkaccounts',
        'schedule': crontab(hour=4, minute=20)
    }
}
@podolskyi
podolskyi / ValueError_locale.py
Created August 22, 2016 10:01
Mac OS X: ValueError: unknown locale: UTF-8 in Python
# If you hit this error on Mac OS X, a quick fix is to add these lines to your ~/.bash_profile:
export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8