Stories from the Web Crawling trenches in cookies

Persisting Cookies with Python Requests for Effective Web Scraping

Author: Mohan Ganesan

Date: Oct 22, 2023

Cookies allow web scrapers to store and send session data. Python Requests library provides cookie persistence with Sessions, serialization, and rotating User Agents.

Mastering Sessions Cookies with Python Requests

Author: Mohan Ganesan

Date: Oct 22, 2023

Cookies and sessions are essential for effective web scraping. Python's Requests library makes it easy to leverage sessions and cookies for robust scraping. Learn how to create a session, persist cookies, set custom cookies, and more. By mastering session techniques, you can scrape complex sites requiring authentication and state management.

Persisting Sessions with Httpx in Python

Author: Mohan Ganesan

Date: Feb 5, 2024

Guide on utilizing Httpx's session support to maintain state and persist cookies across multiple requests in Python.

Setting Cookies Early with aiohttp Requests

Author: Mohan Ganesan

Date: Feb 22, 2024

Set cookies early in aiohttp requests to ensure proper inclusion and prevent unexpected errors or login pages.

Setting Cookies in aiohttp Requests

Author: Mohan Ganesan

Date: Mar 3, 2024

Set cookies in Python aiohttp requests to handle sessions, authorization, or preferences. aiohttp seamlessly handles cookies for easy automation and scripting.

Managing Cookies in aiohttp for Effective Web Scraping

Author: Mohan Ganesan

Date: Mar 3, 2024

Properly managing cookies is essential for robust and efficient web scraping with Python aiohttp library. Take control of cookie persistence, security settings, and expiration to build robust crawlers.

Automate Website Logins with Python Requests

Author: Mohan Ganesan

Date: Feb 3, 2024

Logging into websites made easy with Python's requests module. Replicate login process, handle response codes, automate workflows.

Persisting Cookies from Initial Request in Python Requests

Author: Mohan Ganesan

Date: Feb 3, 2024

Save and re-use cookies in Python requests. Use cookies for session state and authentication. Save cookies to variable or use a session for automatic cookie persistence.

Why Aiohttp Client Session Cookies May Not Persist Between Requests

Author: Mohan Ganesan

Date: Feb 22, 2024

aiohttp client sessions do not persist cookies between requests by default. Reusing the same client session can maintain the state and prevent unexpected issues.

Mastering Urllib Sessions in Python for Effective Web Scraping

Author: Mohan Ganesan

Date: Feb 8, 2024

Urllib sessions allow persisting specific parameters across multiple requests. This is very useful for web scraping authenticated sites or sites that track browser state.

How do websites detect web scraping?

Author: Mohan Ganesan

Date: Feb 20, 2024

Websites use detection methods like traffic patterns, browser fingerprints, cookies, and user agents to catch scrapers. Tips to avoid detection include slowing down requests, rotating IPs, using real browser user agents, and maintaining sessions/cookies.

Tired of getting blocked while scraping the web?

ProxiesAPI handles headless browsers and rotates proxies for you.
Get access to 1,000 free API credits, no credit card required!