Scrapy is an open-source web scraping framework. In its terminology, crawling means downloading pages (and following links to discover more of them), while scraping means extracting data from the downloaded pages. It manages requests, parses HTML, collects data, and saves it to our desired format.
We can install it by typing
pip install scrapy
Once the installation is done, we can start our project by typing
scrapy startproject <our_project_name>
Since it is a framework, this command builds a folder with the same name as our project. The five pieces it generates inside that folder are:
- spiders (Folder) – This is where we put our spiders
- items.py – Defines the containers for the data we scrape
- pipelines.py – Processes and saves scraped items
- middlewares.py – Hooks into request and response handling
- settings.py – Project configuration
(The Engine and the Scheduler are internal Scrapy components that route requests between these parts; they are part of the framework itself, not generated files.)