Asked 30+ days ago
4 Apr 2017
Views 360
jessica

jessica posted

How to parse a website with Node.js?

How can I scrape a website with Node.js?

Mitul Dabhi
answered Dec 13 '17 20:17

Two Node.js modules will help you parse a website:
cheerio (parses the HTML)
request (fetches the page)


Install the cheerio and request modules with npm:

    npm install request
    npm install cheerio


Load request and cheerio in your Node script:

    const request = require('request');
    const cheerio = require('cheerio');


Then request the page and parse the response:

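    request('http://www.example.com', function (error, response, html) {
        if (!error && response.statusCode === 200) {
            // Parse the fetched HTML into a jQuery-like object
            const $ = cheerio.load(html);
            // Log the text of each <script> element
            $('script').each(function (i, element) {
                const a = $(this);
                // alert() does not exist in Node, so log to the console instead
                console.log(a.text());
            });
        }
    });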


The request callback receives the response body, and cheerio lets you read whatever you need from it.

    var $ = cheerio.load(html);

cheerio.load parses the HTML. You can then use cheerio's other accessor functions to query it, much as you would with jQuery.

Some examples of cheerio usage:

    $('#data').children().first().text()

This returns the text of the first child of the element whose id is data.

Use find to search for elements inside a selection:

    $('#data').find('li').length

This finds the li elements inside the element whose id is data and returns how many there are.


Mitul Dabhi
answered Dec 13 '17 20:17

To parse a website in Node.js, you can also use the osmosis module.

Install osmosis:

    npm install osmosis

The code below uses osmosis to fetch jquery.org:

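    const osmosis = require('osmosis');

    osmosis
        .get('www.jquery.org')   // fetch the page
        .log(console.log)        // send log messages to the console
        .error(console.log)      // send errors to the console
        .debug(console.log);     // send debug messages to the console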

Crawling a website in Node.js is straightforward with osmosis.