version: 0.2
phases:
  pre_build:
    commands:
      - echo Logging in to Amazon ECR...
      - $(aws ecr get-login --no-include-email --region $AWS_DEFAULT_REGION)
  build:
    commands:
      - echo Build started on `date`
version: '3'
services:
  selenium:
    image: selenium/standalone-firefox
    ports:
      - 4444:4444
  postgres:
    image: postgres:alpine
    environment:
      POSTGRES_USER: user
# Debian-based image assumed here, because apt-get is used below (Alpine images ship apk instead)
FROM python:3.7-slim
WORKDIR /usr/src/app
COPY requirements.txt ./
RUN apt-get update && apt-get install -y tesseract-ocr-all
RUN pip install --upgrade pip && \
    pip install --no-cache-dir -r requirements.txt
COPY . .
CMD [ "python3", "./script.py" ]
const areas = [
  // Center
  "Cantonment area",
  "Domlur",
  "Indiranagar",
  "Jeevanbheemanagar",
  "Malleswaram",
  "K.R. Market",
  "Sadashivanagar",
# SCORES: 2 4 6 8 10 "/html/body/table/tr[2]/td[2]/div/div[5]/table/tr[x]/td[8]/div/text()"
# SUMMARY: 1 3 5 7 9
def parse_row(row_number):
    package_xpath = "/html/body/table/tr[2]/td[2]/div/h1/a/text()"
    id_xpath = f"/html/body/table/tr[2]/td[2]/div/div[5]/table/tr[ { int(row_number) } ]/td[2]/text()"
    idhref_xpath = f"/html/body/table/tr[2]/td[2]/div/div[5]/table/tr[ { int(row_number) } ]/td[2]/a/@href"
    score_xpath = f"/html/body/table/tr[2]/td[2]/div/div[5]/table/tr[ { int(row_number) } ]/td[8]/text()"
    summary_xpath = f"/html/body/table/tr[2]/td[2]/div/div[5]/table/tr[ {int(row_number)+1} ]/td/text()"
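
The row pairing noted above (scores on even rows, each summary on the row directly below) can be sketched as a small helper. This is only an illustration under assumptions: `BASE`, `row_xpaths`, and `score_rows` are hypothetical names that do not appear in the original file, and the table layout is taken from the XPath strings in `parse_row`.

```python
# Hypothetical helper names; only the XPath layout comes from parse_row above.
BASE = "/html/body/table/tr[2]/td[2]/div/div[5]/table"


def row_xpaths(row_number):
    """Build the XPath strings for one score row and the summary row below it."""
    row = int(row_number)  # accept "2" as well as 2
    return {
        "id": f"{BASE}/tr[{row}]/td[2]/text()",
        "idhref": f"{BASE}/tr[{row}]/td[2]/a/@href",
        "score": f"{BASE}/tr[{row}]/td[8]/text()",
        # The summary lives in the row that follows the score row.
        "summary": f"{BASE}/tr[{row + 1}]/td/text()",
    }


def score_rows(last_row):
    """Yield the even row numbers (2, 4, 6, ...) that hold scores."""
    return range(2, last_row + 1, 2)
```

Converting `row_number` once, before adding 1, keeps the helper working when the caller passes the row as a string.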
# SCORES: 2 4 6 8 10 "/html/body/table/tr[2]/td[2]/div/div[5]/table/tr[x]/td[8]/div/text()"
# SUMMARY: 1 3 5 7 9
def parse_row(row_number):
    package_xpath = "/html/body/table/tr[2]/td[2]/div/h1/a/text()"
    id_xpath = f"/html/body/table/tr[2]/td[2]/div/div[5]/table/tr[ { int(row_number) } ]/td[2]/text()"
    idhref_xpath = f"/html/body/table/tr[2]/td[2]/div/div[5]/table/tr[ { int(row_number) } ]/td[2]/a/@href"
    score_xpath = f"/html/body/table/tr[2]/td[2]/div/div[5]/table/tr[ { int(row_number) } ]/td[8]/text()"
    # Convert before adding 1 so a string row_number does not raise TypeError
    summary_xpath = f"/html/body/table/tr[2]/td[2]/div/div[5]/table/tr[ {int(row_number)+1} ]/td/text()"
# SCORES: 2 4 6 8 10 "/html/body/table/tr[2]/td[2]/div/div[5]/table/tr[x]/td[8]/div/text()"
# SUMMARY: 1 3 5 7 9
def parse_row(row_number):
    id_xpath = f"/html/body/table/tr[2]/td[2]/div/div[5]/table/tr[ { int(row_number) } ]/td[2]/text()"
    idhref_xpath = f"/html/body/table/tr[2]/td[2]/div/div[5]/table/tr[ { int(row_number) } ]/td[2]/a/@href"
    score_xpath = f"/html/body/table/tr[2]/td[2]/div/div[5]/table/tr[ { int(row_number) } ]/td[8]/text()"
    # Convert before adding 1 so a string row_number does not raise TypeError
    summary_xpath = f"/html/body/table/tr[2]/td[2]/div/div[5]/table/tr[ {int(row_number)+1} ]/td/text()"
    f = {
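# scrapy.contrib.exporter was removed in newer Scrapy releases; exporters now live in scrapy.exporters
from scrapy.exporters import BaseItemExporter
import sqlite3


class SqliteItemExporter(BaseItemExporter):
    def __init__(self, file, **kwargs):
        self._configure(kwargs)
        # Open the SQLite database at the path of the file object handed to the exporter
        self.conn = sqlite3.connect(file.name)
        self.conn.text_factory = str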
# -*- coding: utf-8 -*-
# Define here the models for your scraped items
#
# See documentation in:
# https://doc.scrapy.org/en/latest/topics/items.html
import scrapy | |
# -*- coding: utf-8 -*-
import sqlite3
import csv


class MedcrawlPipeline(object):
    def process_item(self, item, spider):
        return item


class CveDetails(object):