Stories from the Web Crawling trenches in PHP

Downloading Images from URLs in PHP

Author: Mohan Ganesan

Date: May 5, 2024

Learn different methods to download images from URLs using PHP, including file_get_contents, cURL, fopen, fwrite, Guzzle, and Imagick.

Scrape Any Website with OpenAI Function Calling in PHP

Author: Mohan Ganesan

Date: Sep 25, 2023

Web scraping with OpenAI in PHP allows for resilient data extraction from websites, adapting to changes in HTML structure. Extracted product data can be processed and stored.

Using Proxies in file_get_contents in PHP in 2024

Author: Mohan Ganesan

Date: Jan 9, 2024

Proxying web requests in PHP using stream_context_create and file_get_contents. Adding authentication for secure proxies. Advanced HTTP options through stream contexts. Debugging common PHP proxy problems. Scraping via cURL. Leveraging Proxy-as-a-Service for robust web scraping with Proxies API.

Downloading Images from a Website with PHP and DOM

Author: Mohan Ganesan

Date: Oct 15, 2023

Learn how to use PHP and the DOM extension to download images from a Wikipedia page and extract data from HTML tables. Use Proxies API for scraping at scale.

The Ultimate DOMDocument Cheat Sheet for PHP

Author: Mohan Ganesan

Date: Oct 31, 2023

DOMDocument allows manipulating HTML/XML documents in PHP. This cheat sheet is a comprehensive reference for working with DOMDocument.

Web Scraping with PHP & ChatGPT

Author: Mohan Ganesan

Date: Sep 25, 2023

Web scraping in PHP using ChatGPT for code generation and explanations. PHP libraries like Goutte and DOMDocument are popular for data extraction. ChatGPT assists in generating code snippets and improving prompts for better results.

Scraping All The Images From a Website in PHP

Author: Mohan Ganesan

Date: Dec 13, 2023

Scrape dog breed data from a Wikipedia page using PHP, parse HTML, send HTTP requests, extract data, and download images. Overcome IP blocking with a rotating proxy service.

Scraping Multiple Pages in PHP with Simple HTML DOM

Author: Mohan Ganesan

Date: Oct 15, 2023

Web scraping in PHP using the Simple HTML DOM library to extract data from multiple pages. Proxies API can help with challenges like CAPTCHAs and IP blocks.

Scraping Data from Wikipedia with PHP

Author: Mohan Ganesan

Date: Dec 6, 2023

Web scraping is the process of extracting data from websites automatically. This article demonstrates how to scrape Wikipedia using PHP and cURL to get data on the Presidents of the United States.

Overcoming CAPTCHAs When Web Scraping with PHP

Author: Mohan Ganesan

Date: Feb 20, 2024

Web scraping guide: handling CAPTCHAs with PHP. Use a CAPTCHA-solving service, browser automation, or a proxy service. Consider ethical concerns.

Scraping Reddit Posts with PHP

Author: Mohan Ganesan

Date: Jan 9, 2024

Web scraping with PHP to extract data from Reddit using DOM parsing, CSS selectors, and cURL.

Scraping Real Estate Listings From Realtor with PHP

Author: Mohan Ganesan

Date: Jan 9, 2024

Learn how to scrape real estate listings from Realtor.com using PHP and cURL. Extract data using DOMDocument and XPath.

Scraping Yelp Business Listings with PHP

Author: Mohan Ganesan

Date: Dec 6, 2023

Web scraping guide for extracting data from Yelp business listings using PHP and XPath.
