Amit Patel amitpatelx

@amitpatelx
amitpatelx / subscription.rb
Last active August 23, 2021 17:03
Sync Subscription and Transaction - Status
FR = DONE
UK = DONE
COM = DONE
BE = DONE
NL = DONE
ES = DONE
AU = DONE
IE = DONE
FI = DONE
TR = DONE
amitpatelx / spider_report.txt
Created October 6, 2020 10:49
Sample Spider Report
******* Summary *******
🏷 File Name : RobertsforsbostaderSpider
🗒 Total Items : 4
💥 Total Errors : 23
******* Attributewise Summary *******
+--------------+------------+
amitpatelx / README.md
Created October 2, 2020 08:19
Spiders > Development and Deployment Processes (WIP)

This page outlines the practices we follow for development and deployment.

Branches

  1. Create branches from the master branch unless explicitly told otherwise.
  2. Every spider should have its own branch.
  3. Pull the latest changes from the master/parent branch before creating a branch from it.
  4. The branch name should be self-explanatory, preferably the same as the name of the spider.
  5. Avoid creating nested branches from working branches. Plan properly before you start coding so that nested branching is not needed.
amitpatelx / reva_media_schema.json
Last active November 1, 2021 15:50
Reva Media JSON schema for property details
{
  "$id": "https://www.revamedia.dk//schemas/myschema.json",
  "description": "Schema to validate generated JSON by spiders",
  "type": "object",
  "properties": {
    "external_source": {
      "type": "string",
      "minLength": 6,
      "pattern": "^[a-zA-Z0-9_]*$"
    },
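As a rough illustration of what the `external_source` rule in this preview enforces, the plain-Ruby sketch below checks the same two constraints (`minLength` of 6 and the character pattern) without a JSON Schema library. The method name `valid_external_source?` and both constants are hypothetical and are not part of the schema or the spiders. Ruby's `^` and `$` match at line boundaries, so the sketch uses `\A` and `\z` to anchor the whole string.

```ruby
# Hypothetical helper mirroring the "external_source" constraints in the schema preview.
EXTERNAL_SOURCE_PATTERN = /\A[a-zA-Z0-9_]*\z/.freeze
EXTERNAL_SOURCE_MIN_LENGTH = 6

# Returns true only for strings of at least 6 characters made up of
# ASCII letters, digits, and underscores.
def valid_external_source?(value)
  return false unless value.is_a?(String)

  value.length >= EXTERNAL_SOURCE_MIN_LENGTH && value.match?(EXTERNAL_SOURCE_PATTERN)
end
```

A real spider pipeline would more likely validate the whole item against the schema with a dedicated validator; this only shows what a passing value looks like.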
amitpatelx / <country>_spider.rb
Last active September 21, 2020 12:41
Extract property details
# frozen_string_literal: true
class FranceSpider < ApplicationSpider
  APARTMENT_TYPES = %w(lejlighed appartement apartment piso flat atico penthouse duplex t1 t2 t3 t4 t5 t6)
  def apartment_types
    APARTMENT_TYPES
  end
  HOUSE_TYPES = %w(hus chalet bungalow maison house home villa)
amitpatelx / application_spider.rb
Created September 21, 2020 07:56
Set city, zipcode, address from lat and long using Geocoder
# Utility method to set city, zipcode and address from geolocation
# +item+ - a hash which should have latitude and longitude set
# +skip_city+ - set to true to skip setting city from geolocation
# +skip_zipcode+ - set to true to skip setting zipcode from geolocation
# +skip_address+ - set to true to skip setting address from geolocation
def self.set_location_details!(item:, skip_city: false, skip_zipcode: false, skip_address: false)
  results = Geocoder.search([item[:latitude], item[:longitude]])
  if results.present?
    unless skip_city
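The preview is cut off here, so the rest of the method is not shown. The sketch below is one way such a method could continue, not the author's implementation. It assumes the item keys `:city`, `:zipcode`, and `:address`, and it adds a hypothetical `lookup:` keyword plus a `FakeResult` struct so the example runs without the Geocoder gem or network access. Geocoder result objects do expose `city`, `postal_code`, and `address` readers, which is what the struct imitates.

```ruby
# Hypothetical stand-in for a geocoding result (city, postal_code, address readers).
FakeResult = Struct.new(:city, :postal_code, :address)

# Sketch only: `lookup:` is an assumption, injected in place of Geocoder.search.
# The item keys (:city, :zipcode, :address) are assumed from the doc comments.
def set_location_details!(item:, lookup:, skip_city: false, skip_zipcode: false, skip_address: false)
  result = lookup.call([item[:latitude], item[:longitude]]).first
  return item if result.nil? # nothing found: leave the item untouched

  item[:city] = result.city unless skip_city
  item[:zipcode] = result.postal_code unless skip_zipcode
  item[:address] = result.address unless skip_address
  item
end

# Usage with a fake lookup that always returns one result.
fake_lookup = ->(_coords) { [FakeResult.new('Paris', '75001', '1 Rue de Rivoli, 75001 Paris')] }
item = { latitude: 48.8606, longitude: 2.3376 }
set_location_details!(item: item, lookup: fake_lookup, skip_address: true)
```

In the real spider the lookup would be `Geocoder.search`, as in the preview above.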
amitpatelx / after.rb
Created September 15, 2020 15:50
Stop creating Array in method which is called multiple times
APARTMENT_TYPES = %w[apartment flat penthouse duplex maisonette]
HOUSE_TYPES = %w[house home villa bungalow detached terraced]
def extract_property_type(details)
  return :studio if details.include?('studio')
  APARTMENT_TYPES.each do |type|
    return :apartment if details.include?(type)
  end
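The preview stops before the method ends. A minimal, self-contained sketch of the same idea (build the arrays once as constants rather than inside a method that is called many times) might look like the following. The `:house` branch and the `nil` fallback are assumptions, since the original is truncated before that point; `.freeze` and `any?` are stylistic choices here, not the author's code.

```ruby
# Built once at load time and frozen, instead of allocating arrays on every call.
APARTMENT_TYPES = %w[apartment flat penthouse duplex maisonette].freeze
HOUSE_TYPES = %w[house home villa bungalow detached terraced].freeze

# Assumption: house matching mirrors apartment matching, and nil is returned
# when no keyword matches. The original snippet is cut off before this part.
def extract_property_type(details)
  return :studio if details.include?('studio')
  return :apartment if APARTMENT_TYPES.any? { |type| details.include?(type) }
  return :house if HOUSE_TYPES.any? { |type| details.include?(type) }

  nil
end
```

Because the check is a plain substring match, the order of the branches decides the result when a description contains keywords from more than one list.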
amitpatelx / README.md
Created July 2, 2020 06:10
How do PDF files work, and why is it hard to convert them into plain text?

How do PDF files work?

PDF files display text correctly wherever they are viewed because they carry their typographic information (the look and position of each individual letter) with them. Fonts used in the document are embedded in the PDF file and are used after distribution to reconstruct the document. The display does not depend on the required font files being available on the viewing machine, nor on the language of its operating system.

PDF documents present their pages as images, so the ability to change the underlying text is limited. Most PDF files can still be searched because the file has two layers: an image layer that is presented on-screen and, behind it, usually a text layer that can be matched to the characters displayed on the screen.

When the starting point for a PDF file is a set of images or a scanning process, this text layer is not present and the result is an image-only PDF. When the starting point is an editable document, the text layer can be created and the PDF is called 'Normal'.

amitpatelx / rails_helper.rb
Created September 6, 2019 08:15
Configure DatabaseCleaner for RSpec
RSpec.configure do |config|
  config.before(:suite) do
    DatabaseCleaner.strategy = :transaction
    DatabaseCleaner.clean_with(:truncation)
  end
  config.around(:each) do |example|
    DatabaseCleaner.cleaning do
      example.run
amitpatelx / .rspec
Created September 6, 2019 08:12
rspec dot file configuration
--require spec_helper
--order rand
--format documentation