https://movie.douban.com/subject/4097192/
https://movie.douban.com/subject/4111387/
https://movie.douban.com/subject/4230999/
https://movie.douban.com/subject/4238954/
https://movie.douban.com/subject/4818648/
https://movie.douban.com/subject/4825012/
https://movie.douban.com/subject/4826344/
https://movie.douban.com/subject/4827529/
https://movie.douban.com/subject/5060845/
https://movie.douban.com/subject/5061207/
"""
There are cases where jobs can fail abruptly in such a way that Spidermon
(or any other extension that runs at the end of a Scrapy crawl) won't run.
In these situations, we won't be alerted that something happened: Spidermon
didn't run at the end, so it generates no alerts, and Scrapy Cloud
won't warn about the failure either.

This script aims to help identify those jobs.
To use it (either locally or in Scrapy Cloud), put the following script
in your project:

.. code-block:: python
https://www.justwatch.com/us/tv-show/the-bear
https://www.justwatch.com/us/tv-show/the-boys
https://www.justwatch.com/us/tv-show/the-wheel-of-time
https://www.justwatch.com/us/movie/no-one-will-save-you
https://www.justwatch.com/us/tv-show/family-guy
https://www.justwatch.com/us/tv-show/wilderness
https://www.justwatch.com/us/tv-show/what-we-do-in-the-shadows
https://www.boxofficemojo.com/title/tt26907957/
https://www.boxofficemojo.com/title/tt10638522/
https://www.boxofficemojo.com/title/tt15837338/
https://www.imdb.com/title/tt26907957/
https://www.imdb.com/title/tt10638522/
https://www.imdb.com/title/tt15837338/
https://www.goodreads.com/book/show/22837718-qualia-the-purple
https://www.goodreads.com/book/show/57916643-the-year-s-midnight
https://www.goodreads.com/book/show/31312596-letters-from-a-shipwreck-in-the-sea-of-suns-and-moons
https://www.goodreads.com/book/show/44539716-the-nothing-within
https://www.goodreads.com/book/show/60286274-the-reyes-incident
https://www.goodreads.com/book/show/42348385-the-narrows
https://www.goodreads.com/book/show/56135545-the-spark
https://www.goodreads.com/book/show/55962500-legacy-of-the-brightwash
https://www.goodreads.com/book/show/33965336-seek-the-throat-from-which-we-sing
https://www.goodreads.com/book/show/60214731-into-the-fire
https://www.imdb.com/title/tt0141842/
from hubstorage import HubstorageClient

hs = HubstorageClient('[REDACTED]')
project = hs.get_project('1887')


def examine_logs(job):
    n_dataloss_requests = 0
    n_failed_dataloss_requests = 0
    crawlera_enabled = int(job.metadata['scrapystats'].get('crawlera/request', 0))
from collections import defaultdict  # needed for the defaultdict(Shelf) below

from hubstorage import HubstorageClient

hs = HubstorageClient('<API_KEY>')


class Shelf():
    def __init__(self):
        self.children = defaultdict(Shelf)
        self.products = 0

    def __iter__(self):
        for child in self.children.values():
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import pprint
import argparse
from itertools import groupby
from operator import itemgetter
from w3lib.url import url_query_cleaner