Stories from the Web Crawling trenches in HTMLParser

The Ultimate HTMLParser Cheatsheet

Author: Mohan Ganesan

Date: Oct 31, 2023

HTMLParser is an Objective-C wrapper for libxml2 that allows parsing HTML documents. It provides an event-driven interface like NSXMLParser.

Conda and BeautifulSoup: Streamlining Python Dependency Management and Web Scraping

Author: Mohan Ganesan

Date: Oct 6, 2023

Conda and BeautifulSoup simplify dependency management and web scraping in Python by creating separate environments and providing easy HTML/XML navigation.

Tired of getting blocked while scraping the web?

ProxiesAPI handles headless browsers and rotates proxies for you.
Get access to 1,000 free API credits, no credit card required!