Stories from the Web Crawling trenches in HTML parser

The Complete Python HTML Parser Cheatsheet

Author: Mohan Ganesan

Date: Jan 9, 2024

The Python HTML parser allows you to parse HTML and XML documents and extract data. This article provides a comprehensive guide on how to use the parser effectively.

The Ultimate KSoup Cheatsheet for Kotlin

Author: Mohan Ganesan

Date: Oct 31, 2023

KSoup is an HTML parser for Kotlin that provides a convenient DSL for extracting and manipulating data from HTML documents.

The Complete HTML Agility Pack Cheat Sheet in VB

Author: Mohan Ganesan

Date: Oct 31, 2023

HTML Agility Pack is an HTML parser for .NET that allows easy manipulation and data extraction from HTML documents.

Downloading Images from a Website with CSharp and HtmlAgilityPack

Author: Mohan Ganesan

Date: Oct 15, 2023

Learn how to use C# and HtmlAgilityPack to download images from a Wikipedia page and extract data from HTML tables.

Scraping Booking.com Property Listings in Kotlin in 2023

Author: Mohan Ganesan

Date: Oct 15, 2023

Learn how to scrape property listings from Booking.com using Kotlin, Ktor, and kotlinx.html. Extract details like property name, location, ratings, etc.

Tired of getting blocked while scraping the web?

ProxiesAPI handles headless browsers and rotates proxies for you.
Get access to 1,000 free API credits, no credit card required!