@SurendraTamang
Last active February 4, 2020 13:04
Introduction


Scrapy

Scrapy is an open source web scraping framework. Crawling means following links and downloading pages, and scraping means extracting the data from those pages. Scrapy manages requests, parses HTML, collects data and saves it in our desired format.
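To make the difference between crawling and scraping concrete, here is a minimal sketch that uses only the Python standard library, not Scrapy's own API. The `PageParser` class, the sample HTML and the `https://example.com/` base URL are assumptions made up for illustration. Collecting the links is the crawling side (where to go next), and collecting the heading text is the scraping side (the data we want).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageParser(HTMLParser):
    """Collects links (crawling) and <h1> text (scraping) from one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []      # URLs a crawler would visit next
        self.headings = []   # data a scraper would keep
        self._in_h1 = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL
                self.links.append(urljoin(self.base_url, href))
        elif tag == "h1":
            self._in_h1 = True

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False

    def handle_data(self, data):
        if self._in_h1 and data.strip():
            self.headings.append(data.strip())


# Hypothetical page content, used instead of a real download
SAMPLE_HTML = """
<html><body>
  <h1>Quotes to Scrape</h1>
  <a href="/page/2/">Next</a>
  <a href="https://example.com/about">About</a>
</body></html>
"""

parser = PageParser("https://example.com/")
parser.feed(SAMPLE_HTML)
print(parser.links)
print(parser.headings)
```

Scrapy takes care of both halves for us, plus the downloading, scheduling and exporting that this sketch leaves out.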

We can install it by typing

		pip install scrapy

Once the installation is done we can start our project by typing

		scrapy startproject <our_project_name>

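Since Scrapy is a framework, this command creates a folder with the same name as our project. Inside it are scrapy.cfg and a Python package containing items.py, middlewares.py, pipelines.py, settings.py and a spiders folder. The five main pieces we deal with are listed below. The first three are generated in the project, while the last two are internal Scrapy components rather than generated files:

  1. Spiders (Folder) – This is where we put our spiders
  2. Pipelines (pipelines.py) – Process the items the spiders return, for example cleaning or storing them
  3. Middlewares (middlewares.py) – Hooks that sit around requests and responses
  4. Engine – Coordinates the data flow between all the other components
  5. Scheduler – Queues the requests that the engine will process next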
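As a sketch of what goes into the Pipelines part of the list above: an item pipeline in Scrapy is a plain Python class with a `process_item(self, item, spider)` method. The class name `CleanTitlePipeline` and the `title` field are assumptions chosen for illustration, and the item here is an ordinary dict so the example runs without a crawl.

```python
class CleanTitlePipeline:
    """Hypothetical pipeline: collapses extra whitespace in the 'title' field."""

    def process_item(self, item, spider):
        title = item.get("title")
        if isinstance(title, str):
            # Collapse runs of spaces/newlines into single spaces
            item["title"] = " ".join(title.split())
        # A pipeline returns the item so the next pipeline can use it
        return item


# Calling the pipeline by hand; during a crawl Scrapy makes this call
# for every item a spider yields. spider=None is a stand-in here.
cleaned = CleanTitlePipeline().process_item(
    {"title": "  Hello \n  Scrapy "}, spider=None
)
print(cleaned)
```

To have Scrapy run such a class, it would go in pipelines.py and be enabled through the `ITEM_PIPELINES` setting in settings.py, for example `{"<our_project_name>.pipelines.CleanTitlePipeline": 300}`, where the number sets the order in which pipelines run.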