How to Build a Super Simple HTTP Proxy in R in just 20 lines of code

Oct 1, 2023 · 3 min read

R provides excellent capabilities for statistical computing, but it can also be used for some basic network programming thanks to packages like httpuv. Here we'll build a basic HTTP proxy server in R in fewer than 20 lines of code.

First we load the required packages:

library(httpuv)
library(httr)

The httpuv package gives us a simple HTTP server, and httr allows us to make HTTP requests.

Next we define a request handler function to process each request:

handler <- function(req){

  # PATH_INFO holds the request path, e.g. "/http://example.com"
  url <- parseUrl(req$PATH_INFO)

  # Fetch the target URL on the client's behalf
  response <- GET(url)

  return(list(
    status=200L,
    headers=list('Content-Type' = 'text/plain'),
    # httpuv expects the body as a string or a raw vector
    body=content(response, as = "text", encoding = "UTF-8")
  ))

}

The parseUrl function extracts the URL from the request path by dropping the leading slash:

parseUrl <- function(path){
  return(sub("^/", "", path))
}

We use httr's GET to fetch the proxied URL and return the response body as text. The status code and content type are hard-coded to keep the example short.

Finally, we start the server and hand it the handler. httpuv takes the handler through the app argument of startServer, as a list with a call element:

proxy <- httpuv::startServer("127.0.0.1", 8080,
  app = list(call = handler)
)

This will listen on port 8080 of the local machine for incoming connections.
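
The path handling can be checked on its own, without starting a server. The sketch below reuses parseUrl and adds isProxyable, a hypothetical guard that is not part of the original code, to show which request paths turn into usable URLs. Browsers, for instance, tend to request /favicon.ico, which is not something GET can fetch.

```r
# Same helper as in the article: drop the leading slash from the request path.
parseUrl <- function(path) {
  sub("^/", "", path)
}

# Hypothetical guard (an assumption, not in the original code):
# accept only targets that start with http:// or https:// and name a host.
isProxyable <- function(url) {
  grepl("^https?://[^/]+", url)
}

# A few request paths the proxy might receive
paths <- c("/http://example.com", "/https://example.com/a/b", "/favicon.ico", "/")
urls <- parseUrl(paths)

# Show each path, the URL extracted from it, and whether it looks fetchable
print(data.frame(path = paths, url = urls, ok = isProxyable(urls)))
```

A handler could return a 400 response for any path where the guard is FALSE instead of passing it to GET.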

The full proxy code is:

library(httpuv)
library(httr)

# Extract the target URL from the request path
parseUrl <- function(path){
  return(sub("^/", "", path))
}

handler <- function(req){

  url <- parseUrl(req$PATH_INFO)

  response <- GET(url)

  return(list(
    status=200L,
    headers=list('Content-Type' = 'text/plain'),
    # httpuv expects the body as a string or a raw vector
    body=content(response, as = "text", encoding = "UTF-8")
  ))

}

# The handler is attached at startup through the app argument
proxy <- httpuv::startServer("127.0.0.1", 8080,
  app = list(call = handler)
)

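This implements a simple HTTP proxy in R in fewer than 20 lines of code using the httpuv and httr packages. In an interactive session the server keeps answering requests in the background until you call httpuv::stopServer(proxy).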
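
The handler assumes that every upstream fetch succeeds. As a minimal sketch of one way to soften that, the code below wraps the fetch in tryCatch and answers with a 502 status when it fails. makeHandler and its fetch argument are hypothetical names introduced here for illustration. Stub functions stand in for httr so the logic can be run offline.

```r
# Wrap any fetch function in an httpuv-style handler.
# `fetch` is a hypothetical argument: it takes a URL and returns the body
# as a single string, or signals an error. With httr it could be
#   function(url) httr::content(httr::GET(url), as = "text", encoding = "UTF-8")
makeHandler <- function(fetch) {
  function(req) {
    url <- sub("^/", "", req$PATH_INFO)
    # Turn a failed fetch into a 502 response instead of letting the error escape
    result <- tryCatch(
      list(status = 200L, body = fetch(url)),
      error = function(e) list(status = 502L, body = conditionMessage(e))
    )
    list(
      status = result$status,
      headers = list("Content-Type" = "text/plain"),
      body = result$body
    )
  }
}

# Stub fetchers stand in for the network
okFetch <- function(url) paste("fetched", url)
badFetch <- function(url) stop("could not resolve host")

req <- list(PATH_INFO = "/http://example.com")
okResponse <- makeHandler(okFetch)(req)
badResponse <- makeHandler(badFetch)(req)

print(okResponse$status)
print(badResponse$status)
```

The function returned by makeHandler takes the request and returns a response list, so it can be passed to startServer as the call element of app.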

This is great as a learning exercise, but the proxy server itself is prone to getting blocked because it sends every request from a single IP. If you need a proxy that handles thousands of fetches every day, using a professional rotating proxy service to rotate IPs is almost a must.

Otherwise, you tend to get IP blocked a lot by automatic location, usage, and bot detection algorithms.
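
To make the idea of rotation concrete, here is a minimal round-robin sketch in base R. The addresses are placeholders from a documentation-only IP range, and makeRotator is a hypothetical helper, not part of any package. A real rotating service does far more than cycle through a fixed list.

```r
# Placeholder proxy addresses (TEST-NET documentation range, not real proxies)
proxies <- c("192.0.2.10", "192.0.2.11", "192.0.2.12")

# Return a function that hands out the next proxy on every call,
# wrapping around to the first one after the last.
makeRotator <- function(pool) {
  i <- 0L
  function() {
    i <<- i %% length(pool) + 1L
    pool[[i]]
  }
}

nextProxy <- makeRotator(proxies)

# Four requests cycle through the three proxies and start over
picked <- vapply(1:4, function(k) nextProxy(), character(1))
print(picked)

# With httr, each request could then go out through the chosen proxy,
# assuming the proxies listen on port 8080:
#   httr::GET(url, httr::use_proxy(nextProxy(), 8080))
```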

Our rotating proxy service, Proxies API, provides a simple API built to deal with IP blocking. It offers:

  • Millions of high-speed rotating proxies located all over the world
  • Automatic IP rotation
  • Automatic User-Agent string rotation (which simulates requests from different, valid web browsers and browser versions)
  • Automatic CAPTCHA solving

Hundreds of our customers have successfully solved the headache of IP blocks with a simple API.

The whole thing can be accessed through a simple API, as shown below, from any programming language.

In fact, you don't even have to take the pain of loading Puppeteer, as we render JavaScript behind the scenes. You can just get the data and parse it in any language, like Node or PHP, or with any framework, like Scrapy or Nutch. In all these cases you can call the URL with render support like so:

curl "http://api.proxiesapi.com/?key=API_KEY&render=true&url=https://example.com"
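
The same request URL can be assembled in R. The sketch below uses only base R. buildProxiesApiUrl is a hypothetical helper, and percent-encoding the target URL is an assumption on our part (the curl example passes it unencoded), so check the service's documentation before relying on it.

```r
# Build the request URL for the API call shown in the curl example.
# Hypothetical helper; parameter names key, url and render come from that example.
buildProxiesApiUrl <- function(target, key, render = FALSE) {
  params <- c(key = key, url = target)
  if (render) params <- c(params, render = "true")
  # Percent-encode each value so the target URL survives as a single parameter
  values <- vapply(params, utils::URLencode, character(1), reserved = TRUE)
  query <- paste0(names(params), "=", values, collapse = "&")
  paste0("http://api.proxiesapi.com/?", query)
}

apiUrl <- buildProxiesApiUrl("https://example.com", "API_KEY", render = TRUE)
print(apiUrl)

# The request itself could then be sent with httr:
#   response <- httr::GET(apiUrl)
#   html <- httr::content(response, as = "text", encoding = "UTF-8")
```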
    
    

We have a running offer of 1000 API calls completely free. Register and get your free API Key.
