Retrieving and Parsing Text from URLs with Python's urllib

Feb 8, 2024 · 2 min read

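The urllib package in Python's standard library provides tools for retrieving content from URLs (urllib.request), handling the errors that can occur along the way (urllib.error), and working with URLs themselves (urllib.parse). Because it ships with Python, there is nothing extra to install.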

Fetching Text Content

To fetch text content from a URL, you can use urllib.request.urlopen():

import urllib.request

with urllib.request.urlopen('') as response:
    html = response.read()

This opens the URL, downloads the response content as bytes, and stores it in the html variable.
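
Because read() returns bytes, you usually need to decode the result before treating it as text. The following is a minimal sketch of one way to do that, assuming the server may or may not declare a charset in its Content-Type header. The helper name fetch_text and its default_encoding and timeout parameters are illustrative choices, not part of urllib.

```python
import urllib.request


def fetch_text(url, default_encoding="utf-8", timeout=10):
    """Fetch a URL and return its body decoded to str (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()  # bytes, exactly as sent by the server
        # Prefer the charset declared in the Content-Type header, if any.
        charset = response.headers.get_content_charset() or default_encoding
    # errors="replace" keeps going if a few bytes do not match the charset.
    return raw.decode(charset, errors="replace")
```

Calling fetch_text() with a page URL of your choice returns a str that you can search, split, or pass on to a parser.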

You can also read line by line by treating the response as a file object:

with urllib.request.urlopen('') as response:
    for line in response:
        # Each line is a bytes object; decode it to get text
        print(line.decode('utf-8').rstrip())
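
Line-by-line iteration also yields bytes, one line at a time, with the line ending still attached. As a rough sketch, the hypothetical helper read_lines below collects the decoded lines into a list. It assumes the content is UTF-8 unless you pass another encoding.

```python
import urllib.request


def read_lines(url, encoding="utf-8"):
    """Return the response body as a list of text lines (hypothetical helper)."""
    lines = []
    with urllib.request.urlopen(url) as response:
        for raw_line in response:  # each item is a bytes object
            # Decode to str and drop the trailing newline characters.
            lines.append(raw_line.decode(encoding).rstrip("\r\n"))
    return lines
```

Iterating this way avoids holding the raw body and the decoded copy in memory at the same time, which can help with large responses.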

Parsing Text

Once you have retrieved the text content, you may want to parse it to extract relevant information.

For example, to parse HTML you can use a parser like Beautiful Soup. To parse JSON, you can use the built-in json module.
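
Beautiful Soup is a third-party package, so to stay dependency-free the sketch below swaps in the standard library's html.parser module instead. It pulls the text of the first title element out of an HTML string. TitleParser and extract_title are hypothetical names chosen for this illustration.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        # Only the first <title> is recorded; later ones are ignored.
        if tag == "title" and self.title is None:
            self._in_title = True
            self.title = ""

    def handle_data(self, data):
        if self._in_title:
            self.title += data

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False


def extract_title(html_text):
    """Return the page title as str, or None if there is no <title>."""
    parser = TitleParser()
    parser.feed(html_text)
    parser.close()
    return parser.title.strip() if parser.title is not None else None
```

The input is a str, so decode the bytes returned by urlopen() before passing them in. For anything beyond simple extraction, a dedicated parser such as Beautiful Soup is likely to be more convenient.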

Here's an example parsing JSON from a URL:

import json
import urllib.request 

with urllib.request.urlopen("") as url:
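    # Read the raw bytes, decode them to text, then parse the JSON
    data = json.loads(url.read().decode("utf-8"))

This fetches the JSON data, decodes the bytes to text, and parses it into a Python object (typically a dict) with json.loads(). From there you can access a key's value as you would with any dict.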
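
The same steps can be wrapped in small reusable functions. This is only a sketch: fetch_json and get_field are hypothetical helpers, and the UTF-8 fallback is an assumption for servers that do not declare a charset.

```python
import json
import urllib.request


def fetch_json(url, timeout=10):
    """Fetch a URL and parse its body as JSON (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # Fall back to UTF-8 when the Content-Type header has no charset.
        charset = response.headers.get_content_charset() or "utf-8"
        text = response.read().decode(charset)
    return json.loads(text)


def get_field(data, key, default=None):
    """Return data[key] if data is a dict containing key, else default."""
    if isinstance(data, dict):
        return data.get(key, default)
    return default
```

Using get_field() rather than indexing directly avoids a KeyError when the response does not contain the key you expected.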

Handling Errors

Make sure to wrap calls to urlopen() in try/except blocks to handle errors gracefully:

import urllib.error
import urllib.request

try:
    with urllib.request.urlopen('') as response:
        html = response.read()
except urllib.error.URLError as e:
    print(f"URL Error: {e.reason}")

This catches common failures such as connection problems and DNS lookup errors. It also catches HTTP error responses, because urllib.error.HTTPError is a subclass of URLError.
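
If you want to tell the failure types apart, you can catch HTTPError before URLError. The sketch below is one possible arrangement, not the only one: safe_fetch is a hypothetical helper, and returning a (body, error) pair is a design choice made for this example.

```python
import socket
import urllib.error
import urllib.request


def safe_fetch(url, timeout=10):
    """Return (body_bytes, None) on success or (None, message) on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read(), None
    except urllib.error.HTTPError as e:
        # The server answered, but with an error status such as 404 or 500.
        return None, f"HTTP error {e.code}: {e.reason}"
    except urllib.error.URLError as e:
        # No usable response: DNS failure, refused connection, unknown scheme.
        return None, f"URL error: {e.reason}"
    except socket.timeout:
        # Reading the body took longer than the timeout allows.
        return None, "Timed out"
```

HTTPError has to come first: since it is a subclass of URLError, the more general handler would otherwise catch it and the status code would be lost.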

Overall, urllib offers a straightforward way to programmatically access text content from the web in Python without needing third-party libraries.
