The Definitive Guide to Handling Proxies in Go in 2024

Jan 9, 2024 · 7 min read

Dealing with proxies in Go doesn't need to be hard. I'm going to walk you through everything you need to know, from basic setup to advanced techniques for privacy and performance.

Let's start with the basics...

What is a Proxy Server?

A proxy server acts as an intermediary between your application and the wider internet. Instead of connecting directly to websites, your requests get routed through the proxy.

This is useful when you need to:

  • Scrape sites that block traffic from certain IPs
  • Hide your real IP address
  • Bypass regional restrictions

Without a proxy, sites can easily detect and block scraping bots, so proxies become essential for successful web scraping.

    Let's say you try to scrape a site without one:

    resp, err := http.Get("http://example.com")
    // Illustrative outcome: IP blocked after 24 requests
    

    The firewall noticed your distinct IP making repeated requests and shut you down.

    But with a proxy...

    client := clientWithProxy("http://123.45.6.7:8080")
    resp, err := client.Get("http://example.com")
    // Success - proxies disguise scrapers
    

    Now your code connects through a different IP, avoiding that block.

    So how do we actually configure proxies in Go?

    Setting an HTTP Proxy

    Go uses the net/http package for making HTTP requests. This supports proxies via environment variables and custom transports/clients.

    Here are the main ways to set a proxy:

    1. HTTP_PROXY Environment Variable

    The easiest way is to set the HTTP_PROXY environment variable:

    export HTTP_PROXY="http://123.45.6.7:8080"
    

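    Go programs that use the default transport (or any transport whose Proxy is http.ProxyFromEnvironment) will now send plain http:// requests through that proxy. Requests to localhost are still sent directly.

    Pro Tip: Set HTTPS_PROXY as well to cover https:// requests, and NO_PROXY to list hosts that should bypass the proxy. Authentication is supported via http://user:pass@ip:port.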

    But this applies globally - what if we want to isolate proxy usage to specific clients?

    2. Custom HTTP Transport

    We can create a custom transport that routes through a proxy:

    proxyURL, err := url.Parse("http://123.45.6.7:8080")
    if err != nil {
      log.Fatal(err) // malformed proxy URL
    }
    transport := &http.Transport{Proxy: http.ProxyURL(proxyURL)}
    
    client := &http.Client{Transport: transport}
    resp, err := client.Get("http://example.com")
    

    Now only that client goes through the proxy. Much more flexible!

    You can also set a TLS configuration (TLSClientConfig) on the transport, for example when the proxy itself is reached over https://.

    3. Set Default Transport Proxy

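    To apply a proxy globally without env vars, replace the default transport:

    proxyURL, err := url.Parse("http://123.45.6.7:8080")
    if err != nil {
      log.Fatal(err)
    }
    
    // Clone keeps the default timeouts and connection settings
    transport := http.DefaultTransport.(*http.Transport).Clone()
    transport.Proxy = http.ProxyURL(proxyURL)
    http.DefaultTransport = transport
    

    Any client without its own Transport, including http.Get, now goes through that proxy. (Out of the box, http.DefaultTransport uses http.ProxyFromEnvironment, which is why the environment variables work without any code.)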

    So in summary:

  • Env Var: Quick and dirty proxying
  • Custom Client: For isolated scrapers
  • Default Transport: Global proxy

    Now let's look at some more interesting use cases beyond basic setup...

    Leveraging Proxies for Security & Privacy

    Proxies have some cool security applications:

    Make Anonymous Scraping Requests

    Ever scrape sites you weren't exactly "supposed" to? 😅

    Proxies let you hide your real IP address so no one knows the requests came from you:

    resp, err := clientWithProxy("http://123.45.6.7:8080").Get("http://privateSiteImScraping.com")
    
    // Site sees request coming from proxy IP rather than my real IP
    

    No more looking over your shoulder for angry scrapees knocking at your door!

    Route Traffic Through Tor for Maximum Anonymity

    For the truly paranoid, you can tunnel your web scraping through Tor to anonymize on a whole other level:

    proxyURL, err := url.Parse("socks5://127.0.0.1:9050")
    if err != nil {
      log.Fatal(err)
    }
    
    transport := &http.Transport{Proxy: http.ProxyURL(proxyURL)}
    client := &http.Client{Transport: transport}
    
    resp, err := client.Get("http://example.com")
    
    // Requests now routed through the Tor network 🕵️
    

    This connects through the Tor SOCKS proxy running locally on port 9050. The IP that shows up server-side will be a Tor exit node rather than your own address.

    Bypass Regional Restrictions

    Sometimes scraping sites works fine from your home country...until you travel overseas.

    Then you might notice sites blocking traffic from foreign IPs. Proxies come to the rescue yet again:

    // Traveling abroad
    resp, err := http.Get("http://example.com")
    // Site blocks my foreign IP
    
    euProxy := "http://123.123.123.123:8080" // Placeholder for an EU residential proxy
    client := clientWithProxy(euProxy)
    
    resp, err = client.Get("http://example.com")
    // Works! Site thinks I'm still in Europe
    

    This spoofs your location by routing traffic through proxies in specific regions, giving you access no matter where you're located physically.

    So proxies grant increased privacy, anonymity, and access. Now let's talk performance...

    Leveraging Proxy Caching & Performance

    An often overlooked benefit of proxies is improved speed and efficiency.

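    This stems from the ability of caching proxies to store content, fulfilling future requests for the same data locally instead of fetching it again externally. Keep in mind that a forward proxy can only cache what it can read, which in practice means plain http:// responses; https:// traffic passes through it as an encrypted tunnel.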

    Think of it like storing takeout in your fridge. Sure, you could go back to the restaurant every time you're hungry. But it's faster to just heat it up from your kitchen.

    The same concept applies here:

    1. Initial request -> proxy fetches data from site
    2. Proxy stores response in cache
    3. Later request -> proxy returns cached data instead of fetching from site again
    

    This saves precious time and bandwidth. For sites that change slowly like Wikipedia, the savings add up:

    // First request
    resp, err := client.Get("http://en.wikipedia.org/wiki/cats")
    // 120 ms (actually downloads page)
    
    // ...some time later
    
    // Second request
    resp, err = client.Get("http://en.wikipedia.org/wiki/cats")
    // 5 ms (served from cache) 🚀
    

    95% faster on the repeat request! Of course, it's smart to eventually invalidate stale cache entries.

    But clever use of proxy caches takes load off the sites you interact with. And everyone wins 🏆 when systems run faster.

    Now that you know your way around proxies in Go, let's talk about handling issues that come up...

    Common Issues & Troubleshooting

    Proxy connections seem simple on the surface, but in practice it's often not smooth sailing.

    Here are some common proxy pitfalls and how to address them:

    Credentials Not Working Through Proxy

    You set up an authenticated proxy expecting seamless usage, but then get hit with 407 Proxy Authentication Required errors. Ouch!

    This usually means the credentials your code sends don't match what the proxy expects. Double-check that the username and password in your proxy URL match those provided by your network admin or proxy provider, and that any special characters in them are URL-encoded. When in doubt, regenerate fresh credentials.

    Too Many Requests From Proxy IP

    Another sadly common occurrence:

    resp, err := client.Get("http://example.com")
    // IP blocked after 100 requests
    

    The proxy IP itself is getting banned for making too many requests!

    This happens because multiple clients funnel through the same outbound IP, overwhelming sites with traffic from one spot.

    Some fixes:

  • Rotate proxies from a large pool to distribute requests
  • Limit request rates in code per proxy
  • Use a smart proxy service that handles rotation for you (more below 👇)
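
The first fix, rotating through a pool, can be sketched with the Proxy hook on http.Transport, which Go calls for each request. This is a simple round-robin under stated assumptions: the rotatingProxy helper is hypothetical, the addresses are placeholders, and nothing here checks whether a proxy is healthy or already banned.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"net/http"
	"net/url"
	"sync/atomic"
)

// rotatingProxy returns a function for http.Transport.Proxy that cycles
// through addrs in round-robin order, advancing once per request.
func rotatingProxy(addrs []string) (func(*http.Request) (*url.URL, error), error) {
	if len(addrs) == 0 {
		return nil, errors.New("proxy pool is empty")
	}
	pool := make([]*url.URL, 0, len(addrs))
	for _, addr := range addrs {
		u, err := url.Parse(addr)
		if err != nil {
			return nil, fmt.Errorf("parse %q: %w", addr, err)
		}
		pool = append(pool, u)
	}
	var next uint64
	return func(*http.Request) (*url.URL, error) {
		n := atomic.AddUint64(&next, 1) - 1 // safe for concurrent requests
		return pool[n%uint64(len(pool))], nil
	}, nil
}

func main() {
	// Placeholder addresses; substitute proxies you actually control or rent.
	pick, err := rotatingProxy([]string{
		"http://123.45.6.7:8080",
		"http://123.45.6.8:8080",
		"http://123.45.6.9:8080",
	})
	if err != nil {
		log.Fatal(err)
	}
	client := &http.Client{Transport: &http.Transport{Proxy: pick}}
	_ = client // client.Get(...) would now cycle through the pool

	// Show the rotation order without touching the network.
	for i := 0; i < 4; i++ {
		u, _ := pick(nil)
		fmt.Println(u)
	}
}
```

Per-proxy rate limiting could be layered on top of the same hook, for example by tracking the last-used time for each pool entry.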

    Of course, manually wrangling proxies adds complexity. Wouldn't it be nice if something handled the gritty details for you?

    Well, I happen to run Proxies API, a paid proxy solution aimed at developers.

    It provides an API for proxy requests that handles:

  • Automatic IP and user-agent rotation
  • JavaScript rendering
  • CAPTCHA solving
  • High availability through server redundancy

    You don't deal with credential management, blocked IPs, or other proxy headaches.

    It's as simple as:

    curl "http://api.proxiesapi.com/?key=xxx&render=1&url=http://example.com"
    

    And you get back scraped content straight away, already rendered.

    I worked on this problem for years before reaching a solution robust enough to sell. So I understand firsthand the proxy struggles developers face.

    But now Proxies API solves them instantly so you can focus on building, rather than proxy wrangling.

    We offer 1000 free API calls/month so you can test it out with no commitment. Grab your API key to try it.
