@ThaHobbyist
Last active October 17, 2021 14:22
Basics of a Search Engine and Setup guide

SEARCH ENGINE BASICS

A search engine is a software system designed to carry out web searches. It searches the web systematically for the information specified in a textual search query.

In this project, we will build a search engine that shows search results fetched from a few selected websites.

APPROACH FOR BUILDING THE SEARCH ENGINE:

A search engine performs four basic processes:

  • Crawling
  • Indexing
  • Searching
  • Ranking

CRAWLING:

Web search engines get their information by crawling from site to site. The crawler is given an entry point (a starting URL), from which it collects links and text data and stores them in the database.
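
The link-and-text collection step can be sketched with Python's standard library alone. This is only an illustration, not the project's actual crawler: the names `LinkCollector` and `extract_links` and the sample URL are assumptions made for the example, and it parses an HTML string instead of fetching pages over the network.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects link targets and text from one HTML page (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the URL of the page.
                self.links.append(urljoin(self.base_url, href))

    def handle_data(self, data):
        stripped = data.strip()
        if stripped:
            self.text_parts.append(stripped)


def extract_links(base_url, html):
    """Return (links, text) found in the given HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links, " ".join(parser.text_parts)


# Sample page standing in for a fetched entry point.
sample_html = '<html><body><h1>Home</h1><a href="/about">About us</a></body></html>'
links, text = extract_links("https://example.com/", sample_html)
print(links)
print(text)
```

A full crawler would fetch each collected link in turn, skip URLs it has already visited, and store the extracted text in the database.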

INDEXING:

Indexing means associating the data found on each web page with the domain it was found on and the HTML fields it appeared in. How the data is stored in the database has a major effect on the efficiency of the search engine.
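
One possible way to picture this association is an in-memory inverted index, sketched below. The function `build_index`, the `title` and `body` field names, and the sample pages are assumptions for illustration; the project itself stores its data in a database rather than a Python dictionary.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse


def build_index(pages):
    """Map each word to the (domain, url, field) locations where it occurs."""
    index = defaultdict(set)
    for page in pages:
        domain = urlparse(page["url"]).netloc
        for field in ("title", "body"):
            for word in re.findall(r"[a-z0-9]+", page[field].lower()):
                index[word].add((domain, page["url"], field))
    return index


# Hypothetical crawled pages.
pages = [
    {"url": "https://example.com/a", "title": "Python basics",
     "body": "Learn Python step by step"},
    {"url": "https://example.org/b", "title": "Flask guide",
     "body": "Build a web app with Flask and Python"},
]
index = build_index(pages)
print(sorted(index["python"]))
```

Looking up a word then returns every page and field that contains it without rescanning the page text, which is why the storage layout matters for efficiency.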

SEARCHING:

As the name implies, searching means querying the database for results relevant to the search query.
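
As a rough sketch of this step, the example below matches query words against stored page text. The list of dictionaries stands in for the database, and `tokenize`, `search`, and the sample documents are hypothetical names and data chosen for the example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(documents, query):
    """Return the URLs of documents that contain at least one query word."""
    terms = set(tokenize(query))
    return [doc["url"] for doc in documents if terms & set(tokenize(doc["text"]))]


# Hypothetical stored pages standing in for the database.
documents = [
    {"url": "https://example.com/python", "text": "Python is a programming language"},
    {"url": "https://example.com/flask", "text": "Flask is a web framework for Python"},
    {"url": "https://example.com/git", "text": "Git is a version control system"},
]
print(search(documents, "python framework"))
```

The results come back in storage order here; putting them in a useful order is the job of the ranking step.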

RANKING:

Ranking means ordering the search results from the previous step by their relevance to the user. A better ranking system gives a better search experience.
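
A very simple relevance measure, used here purely as an illustration, is to count how often the query words occur in each page. The functions `tokenize` and `rank` and the sample documents are assumptions for the sketch; real ranking systems combine many more signals.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank(documents, query):
    """Order matching documents by how often the query words occur in them."""
    terms = set(tokenize(query))
    scored = []
    for doc in documents:
        counts = Counter(tokenize(doc["text"]))
        score = sum(counts[term] for term in terms)
        if score > 0:
            scored.append((score, doc["url"]))
    # Highest score first; ties are broken alphabetically by URL.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for _, url in scored]


# Hypothetical stored pages.
documents = [
    {"url": "https://example.com/flask", "text": "Flask is a web framework for Python"},
    {"url": "https://example.com/python", "text": "Python tutorial: learn Python, practice Python"},
    {"url": "https://example.com/git", "text": "Git is a version control system"},
]
print(rank(documents, "python web"))
```

Pages that never mention a query word get a score of zero and are left out of the results.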

TECHNOLOGIES USED IN THE PROJECT

  • PYTHON PROGRAMMING LANGUAGE
  • MONGODB (text-based searching is straightforward in MongoDB)
  • FLASK FRAMEWORK

You are welcome to use alternatives to any of the technologies mentioned above.

Project Setup in Windows

The basic requirements for this project are:

  1. Git
  2. A Text Editor
  3. Python
  4. MongoDB

Git installation

  1. Go to the Git downloads page.

  2. Download the installer for Windows and run it.

  3. Keep all options at their defaults and finish the installation.

Text Editor

A text editor is needed to write the actual code. There are many good text editors and IDEs like Notepad++, Visual Studio Code, PyCharm, Atom, Sublime Text Editor. For this project, we will use Visual Studio Code.

  1. Go to the Visual Studio Code download page.

  2. Select the version that matches your operating system and download the installer.

  3. Run the installer and install Visual Studio Code with the default options.

  4. In Visual Studio Code, install the Python extension from the Extensions Marketplace.

Python

  1. Go to the Python downloads page.
  2. Download the latest version of Python available on the site and run the installer.
  3. Check the 'Add Python 3.10 to PATH' option (the version number shown matches the release you downloaded).
  4. Click on 'Install Now'.

MongoDB

  1. Go to the MongoDB website and, under the Products tab, open the Community Server section.

  2. Select the latest version, your platform (Windows), and the package type (.msi), then download it.

  3. Run the installer. Keep the default settings and complete the installation.

  4. Open System Properties and select 'Environment Variables'. Select 'Path' and click 'Edit'.

  5. Click 'New' and add the path to the 'bin' folder of your MongoDB installation. With the default settings, the path should be 'C:\Program Files\MongoDB\Server\5.0\bin'.

  6. Save the change. The MongoDB installation is now complete.

Setup in Linux

MongoDB installation on Linux (Ubuntu)

We are going to install and run MongoDB on the local system. To do that, follow the instructions below.

  1. Head over to https://docs.mongodb.com/manual/installation/

  2. Click on 'Install MongoDB Community Edition on Ubuntu'.

  3. Follow the instructions on that page to install MongoDB on your computer.

  4. Start the installed MongoDB by following the run instructions in the MongoDB documentation.

Python Setup

Check whether Python is already installed:

    python3 --version

If the output looks like:

    Python 3.x.x

skip the next part. Otherwise, run the commands below for your distribution.

Debian/Ubuntu

    sudo apt-get update && sudo apt-get upgrade -y
    sudo apt-get install software-properties-common
    sudo apt-get install python3

Fedora/rpm based distros

    sudo dnf install python3 python3-devel
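
On either platform, a short Python script can confirm that the required tools are reachable from the command line once installation is done. This is a sketch rather than part of the original setup: the function name `check_tools` is hypothetical, and it assumes the Git and MongoDB server executables are named `git` and `mongod`.

```python
import shutil
import sys


def check_tools(tools=("git", "mongod")):
    """Map each command name to its full path on PATH, or None if not found."""
    return {tool: shutil.which(tool) for tool in tools}


print("Python version:", sys.version.split()[0])
for tool, path in check_tools().items():
    print(f"{tool}: {path if path else 'not found on PATH'}")
```

If a tool is reported as not found, revisit the PATH step for that tool and open a new terminal before running the script again.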