Date: Oct 31, 2023
Nokogiri is a powerful HTML/XML parsing and scraping library for Ruby. This cheat sheet covers its extensive capabilities.
Date: Oct 31, 2023
select.rs is a robust HTML/XML scraping library for Rust. This cheat sheet covers its features, including installation, loading documents, selecting nodes, traversing nodes, extracting/modifying nodes, creating/inserting/removing nodes, output formats, caching and persistence, headless browsers, validation, encoding, advanced selectors, caching and performance, common recipes, troubleshooting, and ecosystem libraries.
Date: Oct 31, 2023
Gumbo is an HTML5 parsing library in C++ that allows for easy manipulation and extraction of HTML. It provides various functions for selecting, traversing, and manipulating nodes in the DOM.
Date: Oct 22, 2023
Python Requests is a popular library for making HTTP requests. Despite confusion caused by AWS, it remains actively maintained and supports the latest Python versions.
Date: Feb 20, 2024
The socket module in Python is a built-in interface for networking and inter-process communication. It is not a third-party library and can be imported freely without extra installation steps.
Date: Feb 5, 2024
The Origins of BeautifulSoup: Mark Pilgrim's Powerful Web Scraping Library. Created in 2004, BeautifulSoup is a popular and powerful library for web scraping and handling HTML/XML in Python.
Date: Feb 5, 2024
BeautifulSoup is a library in Python for parsing, navigating, and searching HTML and XML documents.
ProxiesAPI handles headless browsers and rotates proxies for you.
Get access to 1,000 free API credits, no credit card required!