May 5th, 2020
Kimura Framework Vs. Proxies API

The world of web scraping is varied and complex, and Proxies API sits at one of its most crucial junctions. We let web scrapers and crawlers bypass IP blocks by calling a single API endpoint that routes each request through our pool of more than 20 million high-speed rotating proxies.

Example:

curl "http://api.proxiesapi.com/?auth_key=YOUR_KEY&url=URL"
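The same call can be made from Ruby. The following is a minimal sketch using only the standard library, assuming the endpoint and the auth_key and url parameters shown in the curl example; the helper names proxies_api_url and fetch_via_proxies_api are hypothetical and exist only for illustration. The main point is that the target URL should be percent-encoded so that its own query string is not mistaken for parameters of the API call.

```ruby
require "uri"
require "net/http"

# Endpoint taken from the curl example above.
PROXIES_API_ENDPOINT = "http://api.proxiesapi.com/"

# Hypothetical helper: builds the request URI, percent-encoding the
# target URL so its own "?" and "&" characters survive intact.
def proxies_api_url(auth_key, target_url)
  uri = URI(PROXIES_API_ENDPOINT)
  uri.query = URI.encode_www_form(auth_key: auth_key, url: target_url)
  uri
end

# Hypothetical helper: performs the GET request and returns the response body.
# It is defined here but not called, so loading this file makes no network request.
def fetch_via_proxies_api(auth_key, target_url)
  Net::HTTP.get(proxies_api_url(auth_key, target_url))
end

# Show the URL that would be requested for a target with its own query string.
puts proxies_api_url("YOUR_KEY", "https://example.com/page?a=1&b=2")
```

To fetch a page, call fetch_via_proxies_api with a real key in place of YOUR_KEY.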

One of the questions we get most often is how we differ from services like Octoparse or Diffbot. Often it is like comparing apples and oranges. Still, when we send this comparison table to a customer's development team, CXO, marketing team, or SEO team, they can usually tell quite quickly whether our service is a convenient fit for them.

So here is how we are different from Kimura Framework.

Kimura Framework (published as the kimurai gem) is a brilliantly simple Ruby-based framework that can render JavaScript and works out of the box with headless Chromium and Firefox.

Here is how simple it is to work with infinite-scroll web pages:

# infinite_scroll_spider.rb
require 'kimurai'

class InfiniteScrollSpider < Kimurai::Base
  @name = "infinite_scroll_spider"
  @engine = :selenium_chrome
  @start_urls = [""]

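  def parse(response, url:, data: {})
    # XPath matching one heading per post; the match count shows how many posts are loaded.
    posts_headers_path = "//article/h2"
    count = response.xpath(posts_headers_path).count

    loop do
      # Scroll down, then give the page time to load the next batch of posts.
      browser.execute_script("window.scrollBy(0,10000)")
      sleep 2
      # Re-read the page from the browser after scrolling.
      response = browser.current_response

      new_count = response.xpath(posts_headers_path).count
      if count == new_count
        # No new posts appeared, so the end of the feed has been reached.
        logger.info "> Pagination is done" and break
      else
        count = new_count
        logger.info "> Continue scrolling, current count is #{count}..."
      end
    end

    # Collect the text of every post heading that is now on the page.
    posts_headers = response.xpath(posts_headers_path).map(&:text)
    logger.info "> All posts from page: #{posts_headers.join('; ')}"
  end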
end

InfiniteScrollSpider.crawl!

Link: https://github.com/vifreefly/kimuraframework
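The stopping rule in the spider above (keep scrolling until the number of matched headings stops growing) can be looked at separately from the browser. The sketch below is plain Ruby and is not part of Kimura Framework: the helper name scroll_until_stable and the max_rounds cap are assumptions added for illustration. The cap is a safety net that the original loop does not have, so a feed that never stops growing cannot keep the spider scrolling forever.

```ruby
# Hypothetical helper, not part of the framework: repeats a "scroll" step
# until the item count returned by the block stops growing, or until
# max_rounds attempts have been made.
def scroll_until_stable(initial_count, max_rounds: 50)
  count = initial_count
  max_rounds.times do
    new_count = yield                    # caller scrolls, re-reads the page, returns the count
    return count if new_count == count   # nothing new loaded: pagination is done
    count = new_count
  end
  count                                  # cap reached: return the last count seen
end

# Simulated page: the count grows from 10 to 20 to 30, then stays at 30.
simulated_counts = [20, 30, 30].each
final_count = scroll_until_stable(10) { simulated_counts.next }
puts final_count
```

Inside a spider, the block would run the scroll script, sleep, and return the new XPath match count, exactly as the loop in parse does.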

Kimura Framework vs. Proxies API

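| Aspect           | Proxies API                          | Kimura Framework          |
|------------------|--------------------------------------|---------------------------|
| Who is it for?   | Developers                           | Developers                |
| Cost             | 1000 free calls; starts at $49 pm    | Open source               |
| API access       | Yes                                  | Yes                       |
| Size of project  | Enterprise, medium, small            | Enterprise, medium, small |
| Easy to set up   | Single API call for everything       | Manual setup              |
| Product/Service  | Product                              | Product                   |
| Rotating proxies | Yes                                  | No                        |
| Single API?      | Yes                                  | No                        |
| Desktop app      | No                                   | No                        |
| Visual scraping  | No                                   | No                        |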