Stories from the Web Crawling trenches in Scrapy

Web Scraping with Python & ChatGPT

Author: Mohan Ganesan

Date: Sep 25, 2023

Web scraping is the process of extracting data from websites. Python and ChatGPT can assist in web scraping tasks. Popular libraries include Beautiful Soup, Scrapy, Selenium, and Requests.

What are the limitations of BeautifulSoup?

Author: Mohan Ganesan

Date: Feb 5, 2024

BeautifulSoup is a Python library for parsing and extracting data from HTML and XML documents. It struggles with modern JavaScript sites and cannot bypass most bot protections. CSS selectors and navigation logic can get complex. Consider alternatives like Scrapy, Puppeteer, or Playwright for professional web scraping.

Scraping Websites Without Requests: 4 Python Alternatives

Author: Mohan Ganesan

Date: Feb 3, 2024

The Python Requests module is a popular, easy way to download web pages and scrape data. But what if you need an alternative? Here are 5 good options to scrape websites without Requests.

Web Scraping in Python: A Comparison of Beautiful Soup, Selenium, and Scrapy

Author: Mohan Ganesan

Date: Oct 4, 2023

Web scraping with Python using Beautiful Soup, Selenium, and Scrapy. Each tool serves a different niche, from simple extraction to browser automation and large-scale scraping.

Scrapy vs BeautifulSoup: How to Choose the Right Web Scraping Tool

Author: Mohan Ganesan

Date: Oct 6, 2023

Scrapy and BeautifulSoup are popular Python tools for web scraping. Scrapy is optimized for large-scale crawling and structured data extraction, while BeautifulSoup is better for targeted data extraction from specific pages. Combining both libraries can leverage their respective strengths.

Is Scrapy free?

Author: Mohan Ganesan

Date: Feb 5, 2024

Scrapy is an open source web crawling and web scraping framework written in Python. It provides versatile crawling capabilities and has a thriving community.

Is Scrapy faster than BeautifulSoup?

Author: Mohan Ganesan

Date: Feb 5, 2024

Scrapy is a faster dedicated web scraping framework while BeautifulSoup excels at parsing HTML/XML.

BeautifulSoup vs Scrapy: A Web Scraper's Experience-Based Comparison

Author: Mohan Ganesan

Date: Jan 9, 2024

Web scraping with BeautifulSoup and Scrapy: parsing vs crawling, JavaScript rendering, and data extraction. Combine tools for successful scraping.

Should I learn Selenium or Scrapy?

Author: Mohan Ganesan

Date: Feb 5, 2024

Automating tests with Selenium saves time and reduces errors, while Scrapy is better for large scale web scraping.

Tired of getting blocked while scraping the web?

ProxiesAPI handles headless browsers and rotates proxies for you.
Get access to 1,000 free API credits, no credit card required!