Handling Responses with urllib in Python

Feb 6, 2024 · 2 min read

The urllib module in Python provides useful functionality for fetching data from URLs. Once you make a request to a web server using urllib, you get back a response object that contains the data from the server. Properly handling this response is important for robust code.

When you make a request with urllib, such as:

import urllib.request

response = urllib.request.urlopen('http://example.com')

you get back a response object of type http.client.HTTPResponse. This object contains all the data that was returned by the web server:

  • response.status - The HTTP status code returned by the server, e.g. 200, 404, 500 etc.
  • response.reason - The reason phrase returned by the server e.g. "OK", "Not Found" etc.
  • response.headers - A dictionary-like object containing the response headers sent by the server.
  • response.readlines() - Reads the response body and returns it as a list of lines.
  • response.read() - Reads the full response body and returns it as a bytes object.
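
These attributes are easy to see in action. Here is a minimal, self-contained sketch that runs urlopen against a throwaway local test server (so it works without touching the network) and inspects the response:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# A tiny local server so the example runs without network access.
class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Silence per-request logging to keep output clean
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

response = urllib.request.urlopen(f"http://127.0.0.1:{server.server_port}/")
print(response.status)                        # 200
print(response.reason)                        # OK
print(response.headers.get("Content-Type"))   # text/html; charset=utf-8
server.shutdown()
```

Note that response.headers supports .get() lookups just like a dictionary, which is the usual way to read a single header.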
So for example, you could print the headers from the response with:

print(response.headers)

And read the entire response body with:

data = response.read()

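One detail worth stressing: read() returns raw bytes, not a string. To get text you need to decode those bytes, ideally using the charset the server declared in its headers. A sketch of that, again against a local test server so it is self-contained:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = "Hello, urllib!".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

response = urllib.request.urlopen(f"http://127.0.0.1:{server.server_port}/")
raw = response.read()                                  # bytes
charset = response.headers.get_content_charset() or "utf-8"
text = raw.decode(charset)                             # str
print(text)                                            # Hello, urllib!
server.shutdown()
```

get_content_charset() reads the charset parameter from the Content-Type header; falling back to "utf-8" when the server does not declare one is a common (if imperfect) default.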
It's good practice to check the status code to make sure you got back a successful response. (Note that urlopen raises an exception for error statuses like 404 or 500, so failures usually surface in an except block rather than here.)

if response.status == 200:
    # Success!
    ...
else:
    # An error occurred
    ...

And handle any exceptions that may occur. HTTPError is a subclass of URLError, so it must be caught first:

import urllib.error

try:
    response = urllib.request.urlopen(url)
except urllib.error.HTTPError as e:
    # The server responded with an error status (404, 500, ...)
    print(e.code, e.reason)
except urllib.error.URLError as e:
    # The request never completed (DNS failure, refused connection, ...)
    print(e.reason)

Properly handling the response allows you to write robust code that can deal with errors from the server, read response data correctly, and take appropriate actions based on different status codes. The HTTPResponse object gives you all the information you need to handle the result of your urllib request.
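
Putting all of the above together, here is a sketch of a small helper that fetches a URL, handles both exception types, and returns decoded text. (The name fetch_text is just illustrative, not part of urllib; the example uses a local test server so it runs without network access.)

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

def fetch_text(url):
    """Return (status, decoded body), or (status, None) for HTTP error statuses."""
    try:
        with urllib.request.urlopen(url) as response:
            charset = response.headers.get_content_charset() or "utf-8"
            return response.status, response.read().decode(charset)
    except urllib.error.HTTPError as e:
        # The server responded, but with an error status (404, 500, ...)
        return e.code, None
    except urllib.error.URLError as e:
        # The request never completed (DNS failure, refused connection, ...)
        raise RuntimeError(f"request failed: {e.reason}") from e

# Local test server: /ok returns 200, anything else returns 404.
class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/ok":
            body = b"all good"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, *args):
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"

ok_status, ok_text = fetch_text(base + "/ok")            # (200, "all good")
missing_status, missing_text = fetch_text(base + "/missing")  # (404, None)
print(ok_status, ok_text)
print(missing_status, missing_text)
server.shutdown()
```

Using urlopen as a context manager (the with statement) ensures the underlying connection is closed promptly, which matters once you make more than a handful of requests.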
