Stories from the Web Crawling trenches in Perl module

The Ultimate HTML::Parser Perl Cheat Sheet

Author: Mohan Ganesan

Date: Oct 31, 2023

HTML::Parser is a Perl module for parsing HTML/XML documents and extracting/manipulating their content.

The Ultimate HTML::TreeBuilder Cheatsheet in Perl

Author: Mohan Ganesan

Date: Oct 31, 2023

HTML::TreeBuilder is a Perl module for parsing and manipulating HTML and XML documents into a tree structure.

Tired of getting blocked while scraping the web?

ProxiesAPI handles headless browsers and rotates proxies for you.
Get access to 1,000 free API credits, no credit card required!