urllib get

Feb 8, 2024 · 2 min read

The urllib package in Python's standard library provides a simple interface for fetching data over HTTP and HTTPS. With just a few lines of code, you can make GET and POST requests to access web pages and APIs.

Making a Basic GET Request

The simplest GET request with urllib looks like:

import urllib.request

with urllib.request.urlopen('http://example.com') as response:
   html = response.read()

This opens the URL, downloads the response content, and stores it in the html variable as a bytes object.

We wrap the call in a with block so the connection automatically closes after we're done, even if there's an error. This avoids tying up resources.
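Because read() returns bytes, you'll usually want to decode the payload before treating it as text. A minimal sketch, assuming the server sends UTF-8 (real responses may declare a different charset in their headers):

```python
import urllib.request

with urllib.request.urlopen('http://example.com') as response:
    raw = response.read()        # raw payload as a bytes object

text = raw.decode('utf-8')       # assumes UTF-8; check the Content-Type header in practice
print(text[:20])
```

In practice you can inspect `response.headers.get_content_charset()` rather than hard-coding the encoding.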

Handling Response Info

The response object returned by urlopen carries useful metadata such as the status code and headers:

print(response.status)       # 200
print(response.getheaders()) # List of (name, value) tuples

We can check the status to see if the request was successful and access the headers to check content type, encoding, and more.
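To look up one header rather than the whole list, getheader() takes a name and matches it case-insensitively; a short sketch, assuming example.com responds normally:

```python
import urllib.request

with urllib.request.urlopen('http://example.com') as response:
    # getheader() fetches a single header value by name (case-insensitive)
    content_type = response.getheader('Content-Type')
    status = response.status

print(status, content_type)
```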

Adding Request Parameters

To pass data in a GET request, we append it as query parameters:

import urllib.parse
import urllib.request

url = 'http://example.com/search?' + urllib.parse.urlencode({'q': 'python'})
with urllib.request.urlopen(url) as response:
    data = response.read()

Here we use urllib.parse.urlencode() to encode the params into a query string that gets appended to the URL.
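urlencode() also escapes characters that aren't safe in URLs, which is the main reason to prefer it over manual string concatenation. A quick illustration, no network required:

```python
import urllib.parse

# Spaces and special characters are escaped automatically
query = urllib.parse.urlencode({'q': 'hello world', 'lang': 'en'})
print(query)  # q=hello+world&lang=en

url = 'http://example.com/search?' + query
```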

Key Takeaways

  • urllib makes HTTP requests simple with just the standard library
  • Use a with block to ensure connections are closed properly
  • Check status codes and headers on the response
  • Pass query parameters by encoding a dict into a string
  • Next, try adding headers and data for POST requests to interact with more complex web APIs.
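As a pointer for that last step: passing an encoded body via the data argument of Request is what switches urllib from GET to POST. A minimal sketch, where the endpoint URL is an illustrative placeholder:

```python
import urllib.parse
import urllib.request

# Encode form fields into bytes; providing `data` makes urllib send a POST
payload = urllib.parse.urlencode({'name': 'python'}).encode('ascii')

req = urllib.request.Request(
    'http://example.com/api',  # placeholder endpoint
    data=payload,
    headers={'Content-Type': 'application/x-www-form-urlencoded'},
)
print(req.get_method())  # urllib reports POST once data is attached
```

Passing the Request object to urlopen() then sends the POST and returns a response you can handle exactly as in the GET examples above.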
