Scraping eBay Listings with PHP and DOMDocument in 2023

Oct 5, 2023 · 3 min read

eBay has millions of active listings at any time. In this tutorial, we'll go through how to scrape and extract data from eBay listings using PHP and the DOMDocument class.

Setup

We'll need PHP with the DOM extension installed. Make sure the following modules are enabled:

  • DOM
  • libxml
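
To start, create a PHP file. DOMDocument is a built-in class provided by the DOM extension, so there is no file to include:

    <?php

    // DOMDocument and DOMXPath are built in (DOM extension), so no include is needed.
    // The rest of the code in this tutorial goes in this file.

    ?>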
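
Before going further, it can help to confirm that the modules listed above are actually enabled. The following is a minimal sketch using PHP's built-in extension_loaded(); the missingExtensions() helper name is only illustrative and is not part of the tutorial's script:

```php
<?php
// Return the names from $required that are not loaded in this PHP build.
// missingExtensions() is a hypothetical helper written for this example.
function missingExtensions(array $required): array
{
    return array_values(array_filter(
        $required,
        fn (string $name): bool => !extension_loaded($name)
    ));
}

$missing = missingExtensions(['dom', 'libxml']);

if ($missing) {
    echo 'Missing extensions: ' . implode(', ', $missing) . PHP_EOL;
} else {
    echo 'DOM and libxml are available.' . PHP_EOL;
}
```

Running php -m on the command line lists the loaded modules as well, which is a quick alternative when you only need a one-off check.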
    

We'll also define the eBay search URL and a User-Agent header:

    $url = "https://www.ebay.com/sch/i.html?_nkw=baseball";

    $userAgent = "Mozilla/5.0 ..."; // Replace with your browser's user agent string
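
The URL above hard-codes the search keyword. If you want to search for something else, one option is to build the query string with http_build_query(), which takes care of URL encoding. This is only a sketch: the buildSearchUrl() helper is hypothetical, and _nkw is the one parameter taken from the URL above.

```php
<?php
// Build a search URL for an arbitrary keyword.
// buildSearchUrl() is an illustrative helper, not part of any eBay API.
function buildSearchUrl(string $keyword): string
{
    $base = 'https://www.ebay.com/sch/i.html';

    // http_build_query() encodes spaces and special characters for us.
    return $base . '?' . http_build_query(['_nkw' => $keyword]);
}

echo buildSearchUrl('baseball') . PHP_EOL;       // https://www.ebay.com/sch/i.html?_nkw=baseball
echo buildSearchUrl('baseball cards') . PHP_EOL; // https://www.ebay.com/sch/i.html?_nkw=baseball+cards
```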
    

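Fetch the Listings Page

Use file_get_contents() with a stream context that sets the User-Agent header to fetch the HTML content:

    $html = file_get_contents($url, false, stream_context_create(
      [
        'http' => [
          'header' => "User-Agent: $userAgent"
        ]
      ]
    ));

    // file_get_contents() returns false on failure, so stop before trying to parse
    if ($html === false) {
      exit("Failed to fetch $url" . PHP_EOL);
    }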
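
The request above relies on the default stream settings. Below is a sketch of a slightly more defensive context, assuming you also want a timeout and an Accept-Language header. Both are additions that the snippet above does not use, and the buildContext() helper and its values are illustrative.

```php
<?php
// Create an HTTP stream context with a User-Agent, a language hint and a timeout.
// buildContext() is a hypothetical helper; adjust the values to your needs.
function buildContext(string $userAgent, float $timeoutSeconds = 10.0)
{
    return stream_context_create([
        'http' => [
            'method'  => 'GET',
            'header'  => "User-Agent: $userAgent\r\nAccept-Language: en-US,en;q=0.9\r\n",
            'timeout' => $timeoutSeconds, // Give up if the server takes too long to respond
        ],
    ]);
}

$context = buildContext('Mozilla/5.0 ...');

// Inspect the options that will be sent with the request.
$options = stream_context_get_options($context);
echo $options['http']['header'];
```

The context is passed as the third argument, as in file_get_contents($url, false, $context). The call returns false on failure, so it is worth checking the result before parsing.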
    

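Next, load the HTML into a DOMDocument object to parse it. Real-world pages are rarely valid HTML, so we tell libxml to collect parse warnings instead of printing them:

    libxml_use_internal_errors(true); // Suppress warnings from malformed markup
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();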
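
loadHTML() is forgiving about broken markup, but by default it reports each problem as a PHP warning. Here is a small sketch of collecting those messages with libxml_use_internal_errors() and libxml_get_errors() instead. The HTML fragment is made up for illustration, and which messages you see depends on your libxml version.

```php
<?php
// A deliberately sloppy fragment: the <p> and <span> are never closed.
$fragment = '<html><body><div><p>Hello <span>world</div></body></html>';

// Collect parse problems internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($fragment);

$errors = libxml_get_errors(); // Array of LibXMLError objects (may be empty)
libxml_clear_errors();
libxml_use_internal_errors($previous); // Restore the earlier setting

echo 'Paragraphs found: ' . $doc->getElementsByTagName('p')->length . PHP_EOL;

foreach ($errors as $error) {
    echo 'libxml: ' . trim($error->message) . ' (line ' . $error->line . ')' . PHP_EOL;
}
```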
    

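Extract Listing Data

With the document loaded, we can use DOMXPath to find elements by class name. The class names below reflect eBay's markup at the time of writing and may change, so the queries match with contains() instead of comparing the whole class attribute:

    $xpath = new DOMXPath($doc);
    $listings = $xpath->query('//div[contains(@class, "s-item__info")]');

    foreach ($listings as $listing) {

      // string() yields '' when an element is missing, so absent fields don't raise errors
      $title = trim($xpath->evaluate('string(.//div[contains(@class, "s-item__title")])', $listing));

      // Stored as $link so the search $url defined earlier is not overwritten
      $link = $xpath->evaluate('string(.//a[contains(@class, "s-item__link")]/@href)', $listing);

      $price = trim($xpath->evaluate('string(.//span[contains(@class, "s-item__price")])', $listing));

      // Extract other fields like details, seller, shipping, etc.

      echo $title . PHP_EOL;
      echo $link . PHP_EOL;
      echo $price . PHP_EOL;
    }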
    

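We use XPath queries to find elements by class and extract their text or attribute values. Passing $listing as the second argument makes each query relative to that listing, so the fields stay grouped per item.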
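
To try the extraction step without hitting eBay, the same idea can be run against a small inline document. This is a sketch under stated assumptions: the sample markup is invented and only mimics the class names used in this tutorial, and firstText() is a hypothetical helper that returns null when a field is missing.

```php
<?php
// Invented markup that mimics the class names used in this tutorial.
$sampleHtml = <<<'HTML'
<html><body>
<div class="s-item__info clearfix">
  <div class="s-item__title">Signed baseball</div>
  <a class="s-item__link" href="https://www.ebay.com/itm/1">View</a>
  <span class="s-item__price">$25.00</span>
</div>
<div class="s-item__info clearfix">
  <div class="s-item__title">Vintage glove</div>
  <a class="s-item__link" href="https://www.ebay.com/itm/2">View</a>
</div>
</body></html>
HTML;

// Return the trimmed text of the first node matching $expr, or null if none match.
function firstText(DOMXPath $xpath, string $expr, DOMNode $context): ?string
{
    $nodes = $xpath->query($expr, $context);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->textContent);
}

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($sampleHtml);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

$rows = [];
foreach ($xpath->query('//div[contains(@class, "s-item__info")]') as $listing) {
    // Each query is relative to the current listing node.
    $rows[] = [
        'title' => firstText($xpath, './/div[contains(@class, "s-item__title")]', $listing),
        'link'  => firstText($xpath, './/a[contains(@class, "s-item__link")]/@href', $listing),
        'price' => firstText($xpath, './/span[contains(@class, "s-item__price")]', $listing),
    ];
}

print_r($rows);
```

The second sample listing has no price element, so its price comes back as null rather than causing an error. Handling missing fields this way matters on real result pages, where not every listing carries every field.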

Print Results

Inside the loop, print each field and then a separator line so the listings are easy to tell apart:

    echo str_repeat("=", 50) . PHP_EOL; // Separator
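
Echoing fields is fine for a quick look. If you want to reuse the data, one option is to collect each listing into an array and encode the result as JSON. The sketch below uses made-up sample rows in place of scraped values:

```php
<?php
// Sample rows standing in for scraped listings (values are invented).
$rows = [
    ['title' => 'Signed baseball', 'link' => 'https://www.ebay.com/itm/1', 'price' => '$25.00'],
    ['title' => 'Vintage glove', 'link' => 'https://www.ebay.com/itm/2', 'price' => null],
];

// JSON_UNESCAPED_SLASHES keeps URLs readable; JSON_PRETTY_PRINT adds indentation.
$json = json_encode($rows, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);

echo $json . PHP_EOL;
```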
    

Full Code

Here is the full code to scrape eBay listings with PHP. The class names reflect eBay's markup at the time of writing and may need updating:

    <?php

    $url = "https://www.ebay.com/sch/i.html?_nkw=baseball";

    $userAgent = "Mozilla/5.0 ..."; // Replace with your browser's user agent string

    $html = file_get_contents($url, false, stream_context_create([
      'http' => [
         'header' => "User-Agent: $userAgent"
      ]
    ]));

    // file_get_contents() returns false on failure, so stop before trying to parse
    if ($html === false) {
      exit("Failed to fetch $url" . PHP_EOL);
    }

    libxml_use_internal_errors(true); // Suppress warnings from malformed markup
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $listings = $xpath->query('//div[contains(@class, "s-item__info")]');

    foreach ($listings as $listing) {

      // string() yields '' when an element is missing, so absent fields don't raise errors
      $title = trim($xpath->evaluate('string(.//div[contains(@class, "s-item__title")])', $listing));

      // Stored as $link so the search $url above is not overwritten
      $link = $xpath->evaluate('string(.//a[contains(@class, "s-item__link")]/@href)', $listing);

      $price = trim($xpath->evaluate('string(.//span[contains(@class, "s-item__price")])', $listing));

      $details = trim($xpath->evaluate('string(.//div[contains(@class, "s-item__subtitle")])', $listing));

      $seller = trim($xpath->evaluate('string(.//span[contains(@class, "s-item__seller-info-text")])', $listing));

      $shipping = trim($xpath->evaluate('string(.//span[contains(@class, "s-item__shipping")])', $listing));

      $location = trim($xpath->evaluate('string(.//span[contains(@class, "s-item__location")])', $listing));

      $sold = trim($xpath->evaluate('string(.//span[contains(@class, "s-item__quantity-sold")])', $listing));

      echo $title . PHP_EOL;
      echo $link . PHP_EOL;
      echo $price . PHP_EOL;
      echo $details . PHP_EOL;
      echo $seller . PHP_EOL;
      echo $shipping . PHP_EOL;
      echo $location . PHP_EOL;
      echo $sold . PHP_EOL;

      echo str_repeat("=", 50) . PHP_EOL;
    }

    ?>
    

This covers the basics of scraping eBay with PHP.
