@corinneling
Last active February 24, 2022 23:21
Node web scraper with axios and cheerio

Node JS Web Scraper

This is part of the first Node web scraper I created with axios and cheerio. I took out all of the logic, since I only wanted to showcase how a basic setup for a Node.js web scraper would look.

const cheerio = require('cheerio'),
      axios = require('axios'),
      url = `<url goes here>`;

axios.get(url)
    .then((response) => {
        // Load the returned HTML so it can be queried with jQuery-style selectors
        let $ = cheerio.load(response.data);
        // Log the href attribute of every anchor tag on the page
        $('a').each(function (i, e) {
            let link = $(e).attr('href');
            console.log(link);
        });
    }).catch(function (e) {
        console.log(e);
    });
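
The href values logged by the loop above are raw attribute values, so they can be relative paths, fragments, non-web schemes such as mailto:, or undefined for anchors without an href. As a minimal sketch of one way to clean them up, the helper below resolves each value against a base URL using Node's built-in URL class. The name resolveLinks, the sample array, and the example.com base URL are assumptions made for illustration and are not part of the original scraper.

```javascript
// Hypothetical helper (not part of the original gist): turn scraped href
// values into de-duplicated absolute http(s) URLs.
function resolveLinks(hrefs, baseUrl) {
  const resolved = new Set();
  for (const href of hrefs) {
    if (!href) continue; // anchors without an href yield undefined
    let absolute;
    try {
      absolute = new URL(href, baseUrl); // resolves relative paths against baseUrl
    } catch (err) {
      continue; // skip values that cannot be parsed as a URL
    }
    if (absolute.protocol === 'http:' || absolute.protocol === 'https:') {
      resolved.add(absolute.href);
    }
  }
  return [...resolved];
}

// Sample input: the kinds of values $(e).attr('href') can return.
const sampleHrefs = ['/about', 'https://example.com/contact', undefined, 'mailto:hi@example.com', '/about'];
console.log(resolveLinks(sampleHrefs, 'https://example.com/blog/'));
```

Inside the scraper, the values could be pushed into an array in the .each() callback and passed to a helper like this once the loop finishes, with the scraped page's url as the base.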
@Divyanshusah

k

@cb-adarsh

Thanks for this 🔥

@lucakim27

Appreciate it so much!

@tecibernetica

thanks

@webb24h

webb24h commented Feb 24, 2022

Very nice!!

If you're wondering what the axios response object contains, see the response schema: https://axios-http.com/docs/res_schema

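// Crawl a page and inspect each field of the axios response.
// Response schema: https://axios-http.com/docs/res_schema
const axios = require('axios');
const cheerio = require('cheerio');

// URL to fetch
const url = 'http://amazon.com';

axios.get(url)
  .then((res) => {
    const body = res.data;             // response body (the page HTML here)
    const statusCode = res.status;     // HTTP status code
    const statusText = res.statusText; // HTTP status message
    const headers = res.headers;       // response headers
    const request = res.request;       // request that produced this response
    const config = res.config;         // config axios used for the request

    // Load the HTML into cheerio for jQuery-style querying
    const $ = cheerio.load(body);

    // Print each field
    console.log(body);
    console.log(statusCode);
    console.log(statusText);
    console.log(headers);
    console.log(config);
    console.log(request);
  })
  .catch((e) => {
    console.log(e);
  });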
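
One practical use of those response fields is checking them before handing the body to cheerio. The following is only a sketch: isParsableHtml is a hypothetical helper, not part of axios or cheerio, and the two objects are hand-written stand-ins that assume the documented response shape (status, headers, data) with lower-cased header names.

```javascript
// Hypothetical helper (not an axios API): decide whether a response looks
// like an HTML page that is worth passing to cheerio.load().
function isParsableHtml(res) {
  const okStatus = res.status >= 200 && res.status < 300;
  // Assumes header names are lower-cased, as in the axios response schema.
  const contentType = String((res.headers && res.headers['content-type']) || '');
  return okStatus && contentType.toLowerCase().includes('text/html');
}

// Stand-in objects shaped like an axios response; no network request is made.
const htmlRes = {
  status: 200,
  headers: { 'content-type': 'text/html; charset=UTF-8' },
  data: '<a href="/">home</a>',
};
const jsonRes = {
  status: 200,
  headers: { 'content-type': 'application/json' },
  data: {},
};

console.log(isParsableHtml(htmlRes)); // true
console.log(isParsableHtml(jsonRes)); // false
```

With default settings axios already rejects responses outside the 2xx range, so the status check mainly matters when validateStatus has been customized; the content-type check is the part that guards against parsing JSON or binary data as HTML.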



