@sergiomario
Created June 23, 2020 01:23
+++ b/processing/data_collection/gazette/pipelines.py
@@ -43,7 +43,9 @@ class PostgreSQLPipeline:
 class GazetteDateFilteringPipeline:
     def process_item(self, item, spider):
         if hasattr(spider, "start_date"):
-            if spider.start_date > item.get("date"):
+            import datetime
+            year, month, day = spider.start_date.split('-')
+            if datetime.date(int(year), int(month), int(day)) > item.get("date"):
                 raise DropItem("Droping all items before {}".format(spider.start_date))
         return item
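
The patch above converts `spider.start_date` from a `YYYY-MM-DD` string into a `datetime.date` before comparing it with the item's date, presumably because the value reaches the spider as a string. A minimal sketch of the same comparison as a standalone helper follows; the name `is_before_start_date` is hypothetical and not part of the patched project, and it assumes the item date is already a `datetime.date`.

```python
import datetime


def is_before_start_date(start_date, item_date):
    """Return True when item_date falls before start_date.

    Hypothetical helper, not part of the patched project.
    start_date may be a 'YYYY-MM-DD' string or a datetime.date;
    item_date is assumed to be a datetime.date.
    """
    if isinstance(start_date, str):
        # strptime raises ValueError on malformed input instead of
        # failing later with a less obvious unpacking error.
        start_date = datetime.datetime.strptime(start_date, "%Y-%m-%d").date()
    return start_date > item_date
```

With this helper, the condition in `process_item` could read `if is_before_start_date(spider.start_date, item.get("date")):`, and the module-level import replaces the `import datetime` inside the method.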