The world of web scraping is varied and complex, and Proxies API sits at one of its most crucial junctions: it lets web scrapers and crawlers bypass IP blocks through a single API endpoint that routes each request through our rotating pool of more than 20 million high-speed proxies.
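To make the single-endpoint idea concrete, here is a minimal sketch of how such a call is typically shaped. The endpoint URL and parameter names (`auth_key`, `url`) are illustrative assumptions, not the service's documented API; check the Proxies API docs for the real ones.

```python
import urllib.parse

def build_request_url(target_url: str, auth_key: str) -> str:
    """Build a single-endpoint request asking the proxy service
    to fetch target_url on the caller's behalf.

    Endpoint and parameter names are assumptions for illustration.
    """
    query = urllib.parse.urlencode({"auth_key": auth_key, "url": target_url})
    return "http://api.proxiesapi.com/?" + query

# The returned URL can be fetched with any HTTP client; the service
# picks a different proxy from its pool for each call, so repeated
# requests appear to come from different IPs.
print(build_request_url("https://example.com/page", "YOUR_KEY"))
```

The point of the pattern is that the scraper's own code never manages proxies: it makes one HTTP call per page, and rotation happens server-side.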
One of the questions we get most frequently is how we differ from services like OctoParse or Diffbot. Often it is like comparing apples and oranges. Still, when we send this comparison table to a customer's developer team, their CXO, or their marketing or SEO team, they can typically tell quite quickly whether we are the more convenient service for them.
So here is how we are different from PySpider.
PySpider is useful if you want to crawl and spider at massive scale. It has a web UI for monitoring crawling projects, supports database integrations out of the box, uses message queues, and comes ready with support for a distributed architecture. This library is a beast.
You can do complex operations, like prioritizing certain crawls over others:

```python
def index_page(self, response):
    self.crawl('', callback=self.index_page)
    self.crawl('', callback=self.detail_page, priority=1)
You can also set delayed crawls. This one, using queues, crawls the page after 30 minutes:

```python
import time

def on_start(self):
    self.crawl('', callback=self.callback, exetime=time.time() + 30*60)
And this one automatically recrawls a page every 5 hours:

```python
def on_start(self):
    self.crawl('', callback=self.callback, age=5*60*60, auto_recrawl=True)
PySpider vs. Proxies API
| | Proxies API | PySpider |
|---|---|---|
| Who is it for? | Developers | Developers |
| Cost | 1,000 free calls; starts at $49/month | Open source |
| Size of project | Small, medium, enterprise | Small, medium, enterprise |
| Ease of setup | Single API call for everything | Manual setup |