Stories from the Web Crawling trenches in text processing

Extracting URLs from Text in Python

Author: Mohan Ganesan

Date: Feb 20, 2024

When working with text data in Python, you can use regular expressions and the urllib module to detect and validate URLs. This article provides examples and tips for effectively detecting links in text.

How to Build a Super Simple HTTP Proxy in Perl in just 20 lines of code

Author: Mohan Ganesan

Date: Oct 1, 2023

Build a basic HTTP proxy server in Perl using less than 20 lines of code. Use rotating proxy service to avoid IP blocking.

Tired of getting blocked while scraping the web?

ProxiesAPI handles headless browsers and rotates proxies for you.
Get access to 1,000 free API credits, no credit card required!