Scraping eBay Listings with Perl and WWW::Mechanize in 2023

Oct 5, 2023 · 3 min read

eBay is one of the largest online marketplaces with millions of active listings at any given time. In this tutorial, we'll walk through how to scrape and extract key data from eBay listings using Perl and the WWW::Mechanize module.

Setup

We'll need two modules: WWW::Mechanize to fetch pages and HTML::TreeBuilder to parse them. Both can be installed via CPAN:

cpan WWW::Mechanize HTML::TreeBuilder

We'll also set the starting eBay URL and a user agent string:

use WWW::Mechanize;

my $url = 'https://www.ebay.com/sch/i.html?_nkw=baseball';

my $user_agent = 'Mozilla/5.0...';

Replace the user agent with your browser's user agent string.
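The search term lives in the _nkw query parameter of the URL above. To search for other keywords without escaping them by hand, the URL can be built with the URI module, which is installed alongside WWW::Mechanize as a dependency. This is a minimal sketch: build_search_url and its %extra argument are hypothetical names introduced here, and any additional parameter names should be checked against eBay's own URLs.

```perl
use strict;
use warnings;
use URI;

# Base search endpoint, taken from the URL used in this tutorial.
my $SEARCH_BASE = 'https://www.ebay.com/sch/i.html';

# Build a search URL for $keyword. %extra holds optional additional query
# parameters (hypothetical; verify real parameter names before relying on them).
sub build_search_url {
    my ($keyword, %extra) = @_;
    my $uri = URI->new($SEARCH_BASE);

    # query_form() escapes keys and values, so spaces and symbols are safe.
    $uri->query_form(
        _nkw => $keyword,
        map { $_ => $extra{$_} } sort keys %extra,
    );
    return $uri->as_string;
}

print build_search_url('baseball cards'), "\n";
```

The returned string can be assigned to $url and used exactly like the hard-coded address.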

Fetch the Listings Page

We can use WWW::Mechanize to make the HTTP request:

my $mech = WWW::Mechanize->new( agent => $user_agent );

$mech->get( $url );

my $html = $mech->content();

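The constructor sets the user agent for every request made through $mech, get() fetches the page, and content() returns its HTML as a string. Because WWW::Mechanize enables autocheck by default, a failed request makes the script die with an error message.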
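To handle a failed request in code rather than have the script stop, one option is to turn autocheck off and inspect the response yourself. The following is a sketch under assumptions: fetch_html is a hypothetical helper name, and the 30-second timeout is an arbitrary choice.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Fetch $url and return its HTML, or nothing (undef in scalar context)
# if the request did not succeed.
sub fetch_html {
    my ($url, $user_agent) = @_;

    my $mech = WWW::Mechanize->new(
        agent     => $user_agent,
        autocheck => 0,     # do not die on HTTP errors; we check below
        timeout   => 30,    # seconds; passed through to LWP::UserAgent
    );

    my $response = $mech->get($url);

    unless ( $mech->success ) {
        warn sprintf "Request for %s failed: %s\n", $url, $response->status_line;
        return;
    }

    return $mech->content;
}

# Example call (needs network access, so it is left commented out):
# my $html = fetch_html('https://www.ebay.com/sch/i.html?_nkw=baseball', 'Mozilla/5.0...');
```

A caller can then decide whether to retry, skip the page, or stop, depending on whether a defined value came back.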

Extract Listing Data

We can use HTML::TreeBuilder to parse the HTML:

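use HTML::TreeBuilder;

# new_from_content() parses the whole document and finalizes the tree
# (a bare parse() would also need a call to eof()).
my $tree = HTML::TreeBuilder->new_from_content($html);

# Match a class name as one whitespace-separated token, so elements that
# carry several classes (for example "s-item__info clearfix") still match.
# A plain string criterion would have to equal the whole attribute value.
sub class_re {
  my ($name) = @_;
  return qr/(?<!\S)\Q$name\E(?!\S)/;
}

my @listings = $tree->look_down(
  class => class_re('s-item__info')
);

foreach my $listing (@listings) {

  # In scalar context look_down() returns the first match or undef,
  # so check each element before calling a method on it.
  my $title_el = $listing->look_down( class => class_re('s-item__title') );
  my $link_el  = $listing->look_down( class => class_re('s-item__link') );
  my $price_el = $listing->look_down( class => class_re('s-item__price') );

  my $title = defined $title_el ? $title_el->as_text()   : '';
  my $url   = defined $link_el  ? $link_el->attr('href') : '';
  my $price = defined $price_el ? $price_el->as_text()   : '';

  # Get other fields like seller, shipping, etc

}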

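look_down() returns every matching element in list context and only the first match in scalar context. For each listing, as_text() gives an element's text and attr('href') gives the link target.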
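Selectors are easier to check against a small fixed fragment than against a live page. The sketch below uses made-up sample markup (not copied from eBay) to show two behaviours of look_down(): a plain string criterion must equal the entire class attribute, and a listing may lack a field altogether. class_token and text_of are hypothetical helper names.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Made-up sample markup: two listings, the second one without a price.
my $html = <<'HTML';
<div class="s-item__info clearfix">
  <a class="s-item__link" href="https://www.ebay.com/itm/1"><span class="s-item__title">Signed baseball</span></a>
  <span class="s-item__price">$12.99</span>
</div>
<div class="s-item__info clearfix">
  <a class="s-item__link" href="https://www.ebay.com/itm/2"><span class="s-item__title">Vintage glove</span></a>
</div>
HTML

# Regex matching $name as one whitespace-separated class token.
sub class_token {
    my ($name) = @_;
    return qr/(?<!\S)\Q$name\E(?!\S)/;
}

# Text of the first descendant carrying $class, or '' if there is none.
sub text_of {
    my ($node, $class) = @_;
    my $el = $node->look_down( class => class_token($class) );
    return defined $el ? $el->as_text : '';
}

my $tree = HTML::TreeBuilder->new_from_content($html);

# String criterion: compared against the whole attribute value.
my @exact  = $tree->look_down( class => 's-item__info' );
# Regex criterion: matches the token inside "s-item__info clearfix".
my @tokens = $tree->look_down( class => class_token('s-item__info') );

printf "exact match: %d, token match: %d\n", scalar @exact, scalar @tokens;

my @rows = map {
    +{ title => text_of( $_, 's-item__title' ),
       price => text_of( $_, 's-item__price' ) }
} @tokens;

printf "%s | %s\n", $_->{title}, $_->{price} || 'n/a' for @rows;

$tree->delete;    # free the parse tree
```

On this fragment the string criterion finds no listings while the token regex finds both, and the missing price comes back as an empty string instead of causing a fatal error.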

Print Results

We can print the extracted data. These statements go inside the foreach loop, where $title, $url, and $price are in scope:

print "Title: $title\n";
print "URL: $url\n";
print "Price: $price\n";

print "=" x 50, "\n"; # Separator

This will output each listing's info.
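Labelled lines are fine for a quick look, but a structured format is easier to feed into other tools. As one possible variation, each listing can be stored as a hash reference and written out as one JSON object per line with the core JSON::PP module. The records below are made-up placeholders, and to_json_lines is a hypothetical helper name.

```perl
use strict;
use warnings;
use JSON::PP;

# Placeholder records shaped like the fields extracted in this tutorial.
my @records = (
    { title => 'Signed baseball', url => 'https://www.ebay.com/itm/1', price => '$12.99' },
    { title => 'Vintage glove',   url => 'https://www.ebay.com/itm/2', price => '' },
);

# canonical() sorts keys so the output is stable between runs;
# utf8() makes encode() return UTF-8 encoded bytes.
my $json = JSON::PP->new->canonical->utf8;

# Serialize each record as one JSON object followed by a newline.
sub to_json_lines {
    my (@items) = @_;
    return join '', map { $json->encode($_) . "\n" } @items;
}

print to_json_lines(@records);
```

In the scraper itself, the loop would push a hash reference per listing onto @records instead of printing, and the output would be written once at the end.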

Full Code

Here is the full code to scrape eBay listings:

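use strict;
use warnings;
use WWW::Mechanize;
use HTML::TreeBuilder;

my $url = 'https://www.ebay.com/sch/i.html?_nkw=baseball';

my $user_agent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36';

# Match a class name as one whitespace-separated token, so elements that
# carry several classes still match.
sub class_re {
  my ($name) = @_;
  return qr/(?<!\S)\Q$name\E(?!\S)/;
}

# Return the first descendant of $node carrying $class, or undef.
sub first_with_class {
  my ($node, $class) = @_;
  return scalar $node->look_down( class => class_re($class) );
}

# Return the text of that element, or an empty string if it is missing.
sub text_of {
  my ($node, $class) = @_;
  my $el = first_with_class($node, $class);
  return defined $el ? $el->as_text() : '';
}

my $mech = WWW::Mechanize->new( agent => $user_agent );
$mech->get( $url );   # dies on a failed request (autocheck is on by default)

my $html = $mech->content();

# new_from_content() parses the whole document and finalizes the tree.
my $tree = HTML::TreeBuilder->new_from_content($html);

my @listings = $tree->look_down(
  class => class_re('s-item__info')
);

foreach my $listing (@listings) {

  my $title = text_of($listing, 's-item__title');

  # The link is an attribute, not text, so it is handled separately.
  my $link_el = first_with_class($listing, 's-item__link');
  my $url     = defined $link_el ? $link_el->attr('href') : undef;
  $url = '' unless defined $url;

  my $price         = text_of($listing, 's-item__price');
  my $details       = text_of($listing, 's-item__subtitle');
  my $seller_info   = text_of($listing, 's-item__seller-info-text');
  my $shipping_cost = text_of($listing, 's-item__shipping');
  my $location      = text_of($listing, 's-item__location');
  my $sold          = text_of($listing, 's-item__quantity-sold');

  print "Title: $title\n";
  print "URL: $url\n";
  print "Price: $price\n";
  print "Details: $details\n";
  print "Seller Info: $seller_info\n";
  print "Shipping Cost: $shipping_cost\n";
  print "Location: $location\n";
  print "Sold: $sold\n";

  print "=" x 50, "\n";
}

# Free the memory held by the parse tree.
$tree->delete;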
