Stories from the Web Crawling trenches in Node.js

The Complete Playwright Cheatsheet

Author: Mohan Ganesan

Date: Dec 21, 2023

Playwright is a Node.js library for cross-browser end-to-end testing across Chromium, Firefox, and WebKit.

The Complete Puppeteer Cheatsheet

Author: Mohan Ganesan

Date: Dec 6, 2023

Puppeteer is a Node.js library for automating UI testing, scraping, and screenshot testing using headless Chrome.

The Ultimate Cheerio Web Scraping Cheat Sheet

Author: Mohan Ganesan

Date: Oct 31, 2023

Cheerio is a fast, flexible library for parsing and querying HTML in Node.js, widely used for web scraping. This cheat sheet provides a comprehensive reference of its syntax and capabilities.

Capturing Screenshots with Puppeteer - An advanced guide

Author: Mohan Ganesan

Date: Jan 9, 2024

Puppeteer is a Node.js library for controlling headless Chrome, ideal for web scraping and automation tasks. It allows you to automate browser actions, capture screenshots, and perform advanced tasks like emulating mobile devices and simulating network conditions.

How to Build a Super Simple HTTP Proxy in JavaScript in just 20 lines of code

Author: Mohan Ganesan

Date: Oct 1, 2023

Build a basic proxy server with JavaScript using Node.js http and request modules. Avoid IP blocking with a rotating proxy service.

Scraping Wikipedia Pages with Node.js

Author: Mohan Ganesan

Date: Dec 6, 2023

Scrape Wikipedia using Node.js with axios and cheerio to extract structured data for various use cases.

Web Scraping All The Images From a Website in Node.js

Author: Mohan Ganesan

Date: Dec 13, 2023

Automate data collection from websites using web scraping with Node.js, axios, and cheerio. Extract dog breed information and images from a Wikipedia page.

Scraping Reddit Posts in Node.js

Author: Mohan Ganesan

Date: Jan 9, 2024

Guide to scraping image URLs from a Reddit page using Node.js, focusing on identifying and extracting post blocks with images and metadata.

Fetching Data in JavaScript with urllib

Author: Mohan Ganesan

Date: Feb 6, 2024

Use the urllib library in JavaScript to fetch data from URLs, including JSON APIs, in web browsers and Node.js environments.

Web Scraping New York Times News Headlines with Node.js

Author: Mohan Ganesan

Date: Dec 6, 2023

Scrape New York Times articles using Node.js modules like request and cheerio to extract structured data for various applications.

Using Proxies in Axios in Node.js for Web Scraping in 2024

Author: Mohan Ganesan

Date: Jan 9, 2024

Configure proxies for Node.js web scraping using Axios library. Learn about proxy options, authentication, rotating proxies, environment variables, custom logic, and proxy services like Proxies API.

Scraping Yelp Business Listings in NodeJS

Author: Mohan Ganesan

Date: Dec 6, 2023

Learn how to scrape business listings from Yelp using web scraping techniques and premium proxies with Node.js and Axios.

Making Asynchronous HTTP Requests with request.post() in Node.js

Author: Mohan Ganesan

Date: Feb 3, 2024

The request.post() method in Node.js is asynchronous and non-blocking; its response can be handled with callbacks, promises, or the async library.

How to create an API?

Author: Mohan Ganesan

Date: May 7, 2024

APIs allow software applications to communicate. This guide shows how to create a REST API using Node.js and Express.
