Stories from the Web Crawling trenches in web content

Automating Downloads in Python with urllib and wget

Author: Mohan Ganesan

Date: Feb 8, 2024

Python provides modules like urllib and wget for programmatically downloading files and web content. urllib is part of Python's standard library and provides more control, while wget is a feature-rich command line tool with advanced capabilities. Both can be used together for different downloading tasks.

Is a web scraper a bot?

Author: Mohan Ganesan

Date: Feb 20, 2024

Web scrapers extract specific data from sites, while web bots interact with full site contents and flows. The program specifics depend on your particular needs and constraints.

Tired of getting blocked while scraping the web?

ProxiesAPI handles headless browsers and rotates proxies for you.
Get access to 1,000 free API credits, no credit card required!