Stories from the Web Crawling Trenches: Session

Web Scraping Websites with Login Example Using Python

Author: Mohan Ganesan

Date: Oct 4, 2023

Analyze the login form, craft the payload, post the login request, use a session to stay logged in, keep credentials out of your code, and scrape data from restricted pages.
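
The flow in this summary can be sketched end to end without touching a real site. The snippet below is only an illustration built on assumptions: it uses the standard library (`urllib` plus `http.cookiejar`) rather than whatever client the article uses, and the `DemoSite` handler, the `/login` and `/members` paths, the `sessionid` cookie, and the `SCRAPER_USER` / `SCRAPER_PASS` environment variables are all hypothetical stand-ins.

```python
import http.cookiejar
import os
import threading
import urllib.parse
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Keep credentials out of the source: read them from the environment.
# The variable names and fallback values are made up for this demo.
USERNAME = os.environ.get("SCRAPER_USER", "demo")
PASSWORD = os.environ.get("SCRAPER_PASS", "s3cret")


class DemoSite(BaseHTTPRequestHandler):
    """Tiny local stand-in for a site with a login form and a restricted page."""

    def _reply(self, status, body, cookie=None):
        data = body.encode()
        self.send_response(status)
        if cookie:
            self.send_header("Set-Cookie", cookie)
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_POST(self):
        # Parse the form-encoded login payload.
        length = int(self.headers.get("Content-Length", 0))
        form = urllib.parse.parse_qs(self.rfile.read(length).decode())
        if form.get("username") == [USERNAME] and form.get("password") == [PASSWORD]:
            self._reply(200, "logged in", cookie="sessionid=abc123; Path=/")
        else:
            self._reply(403, "bad credentials")

    def do_GET(self):
        # The restricted page is served only when the session cookie comes back.
        if "sessionid=abc123" in self.headers.get("Cookie", ""):
            self._reply(200, "members-only data")
        else:
            self._reply(401, "please log in")

    def log_message(self, *args):
        pass  # keep the demo output quiet


server = HTTPServer(("127.0.0.1", 0), DemoSite)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"

# The cookie jar plays the role of the "session": it stores the cookie set at
# login and sends it back on every later request made through this opener.
jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

payload = urllib.parse.urlencode({"username": USERNAME, "password": PASSWORD}).encode()
with opener.open(f"{base}/login", data=payload) as resp:
    login_body = resp.read().decode()
with opener.open(f"{base}/members") as resp:
    page_body = resp.read().decode()

server.shutdown()
server.server_close()
print(login_body, "|", page_body)
```

Against a real site, the field names in the payload would come from inspecting the login form, and many forms also carry hidden fields (such as a CSRF token) that must be sent along.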

Persisting Sessions with Httpx in Python

Author: Mohan Ganesan

Date: Feb 5, 2024

A guide to using Httpx's session support to maintain state and persist cookies across multiple requests in Python.

Making HTTP Requests in Python Without Caching

Author: Mohan Ganesan

Date: Feb 3, 2024

Caching of Python requests can be bypassed by setting cache-control headers, using sessions, or cache busting, which is useful when testing APIs or during development.
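
Two of the techniques named here, header control and cache busting, fit in a few lines. This is a rough sketch using only the standard library; `bust_cache`, the `_` parameter name, and the example URL are hypothetical and not taken from the article.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Request headers asking caches along the way to revalidate with the origin
# instead of answering from a stored copy.
NO_CACHE_HEADERS = {"Cache-Control": "no-cache", "Pragma": "no-cache"}


def bust_cache(url, param="_"):
    """Return url with a unique query parameter appended (hypothetical helper)."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    # A nanosecond timestamp makes the URL differ on each call, so a cache
    # keyed on the full URL treats it as a new resource.
    query.append((param, str(time.time_ns())))
    return urlunsplit(parts._replace(query=urlencode(query)))


fresh_url = bust_cache("https://api.example.com/items?page=2")
print(fresh_url)
```

With the requests library installed, the two pieces would typically be combined as `requests.get(fresh_url, headers=NO_CACHE_HEADERS)`.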

Setting Cookies Early with aiohttp Requests

Author: Mohan Ganesan

Date: Feb 22, 2024

Set cookies early in aiohttp requests so they are included from the first request, which prevents unexpected errors or being bounced to a login page.

Troubleshooting Stale Data in Python Requests

Author: Mohan Ganesan

Date: Feb 3, 2024

Check for client-side caching in requests and disable it. Make sure the server is not caching responses. Use sessions for APIs that require statefulness.

Persisting Cookies from Initial Request in Python Requests

Author: Mohan Ganesan

Date: Feb 3, 2024

Save and reuse cookies in Python requests. Use cookies for session state and authentication. Save cookies to a variable, or use a session for automatic cookie persistence.
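
The two options mentioned here look roughly like this with the requests library. To keep the sketch runnable offline, the cookie is seeded by hand and the requests are only prepared, not sent; the `sessionid` value and the example.com URL are placeholders. In real use the cookie would come from the first response, and a `Session` would pick it up automatically.

```python
import requests

# Option 1: keep the cookies in a variable and pass them along by hand.
# Placeholder value; normally something like dict(first_response.cookies).
saved = {"sessionid": "abc123"}
manual = requests.Request(
    "GET", "https://example.com/account", cookies=saved
).prepare()

# Option 2: let a Session own the cookie jar. Cookies stored in
# session.cookies are attached to every request made through the session.
session = requests.Session()
session.cookies.set("sessionid", "abc123", domain="example.com", path="/")
automatic = session.prepare_request(
    requests.Request("GET", "https://example.com/account")
)

# Both prepared requests carry the same Cookie header.
print(manual.headers["Cookie"])
print(automatic.headers["Cookie"])
```

The session approach tends to be less error-prone once more than a couple of requests are involved, since nothing has to be threaded through by hand.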

Managing Cookies in aiohttp for Effective Web Scraping

Author: Mohan Ganesan

Date: Mar 3, 2024

Properly managing cookies is essential for robust and efficient web scraping with Python's aiohttp library. Take control of cookie persistence, security settings, and expiration to build reliable crawlers.
