URL Encoding and Decoding in Python

Feb 6, 2024 · 2 min read

When working with URLs in Python, you may encounter percent-encoded characters like %20 or %2F. These encoded characters allow URLs to contain special characters like spaces that would otherwise break the URL syntax. To handle these properly, we need to understand URL encoding and decoding.

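URL encoding (also called percent-encoding) replaces a character with a percent sign followed by the two-digit hexadecimal value of each of its bytes. For example, a space character (byte 0x20) is encoded as %20. This lets a URL carry spaces and other special characters without them being mistaken for URL syntax.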
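To make the format concrete, here is a minimal sketch that percent-encodes a single character by hand. The helper name percent_encode_char is hypothetical and exists only for illustration; in real code you would normally rely on the standard library instead.

```python
def percent_encode_char(ch: str) -> str:
    """Percent-encode every UTF-8 byte of ch (illustration only)."""
    # Each byte becomes "%" followed by two uppercase hex digits.
    return "".join(f"%{byte:02X}" for byte in ch.encode("utf-8"))


space_encoded = percent_encode_char(" ")
slash_encoded = percent_encode_char("/")

print(space_encoded)  # %20
print(slash_encoded)  # %2F
```

Note that this helper encodes everything it is given, including characters that do not need encoding, so it only demonstrates the notation.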

In Python, the urllib.parse module provides tools for encoding and decoding URLs. The key functions are:

from urllib.parse import quote, unquote

some_string = "a b/c"  # any text you want to place in a URL
encoded = quote(some_string)  # Encode the string: "a%20b/c"
decoded = unquote(encoded)  # Decode the encoded string: "a b/c"

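quote() takes a string and returns a percent-encoded version. By default it leaves letters, digits, and the characters _ . - ~ and / unchanged, and encodes everything else. unquote() does the reverse: it takes an encoded string and returns the original decoded string.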
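One detail is easier to see in code: quote() accepts a safe parameter listing characters that should not be encoded, and its default value is "/". The sketch below uses a made-up file path to compare the default behavior with safe="", which also encodes the slash as %2F.

```python
from urllib.parse import quote, unquote

path = "reports/2024 Q1.pdf"  # hypothetical example value

# Default: "/" is considered safe, so only the space is encoded.
default_encoded = quote(path)

# safe="" means no reserved character is exempt, so "/" becomes %2F.
strict_encoded = quote(path, safe="")

print(default_encoded)  # reports/2024%20Q1.pdf
print(strict_encoded)   # reports%2F2024%20Q1.pdf

# Either form decodes back to the original text.
print(unquote(strict_encoded) == path)  # True
```

The default suits whole paths, where slashes are separators. Using safe="" is usually the better fit when the string is a single path segment or a value that may itself contain a slash.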

Here's a practical example:

from urllib.parse import quote, unquote

text = "Hello world"
encoded = quote(text) 
# encoded is now "Hello%20world"

decoded = unquote(encoded)
# decoded is back to the original "Hello world"

As you can see, quote() encoded the space character into %20, and unquote() decoded it back.
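The same round trip works for characters beyond the space. As a small sketch with a made-up input string, the example below shows that quote() encodes non-ASCII characters as their UTF-8 bytes and also escapes reserved characters such as & and ?.

```python
from urllib.parse import quote, unquote

text = "café & tea?"  # hypothetical example value

encoded = quote(text)
# "é" is two bytes in UTF-8 (C3 A9), so it becomes two escapes.
print(encoded)  # caf%C3%A9%20%26%20tea%3F

decoded = unquote(encoded)
print(decoded == text)  # True
```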

A common reason to use this in Python is building URLs programmatically. For query strings, urlencode() encodes a whole dictionary of parameters in one call. For example:

from urllib.parse import urlencode

query_args = {"name": "John Doe"}
encoded_args = urlencode(query_args)
# encoded_args is "name=John+Doe": urlencode() uses quote_plus() by default,
# so the space is encoded as + rather than %20
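To round this out, here is a sketch of building a complete URL and reading its parameters back. The base URL and parameter values are made up for illustration; urlencode(), urlsplit(), and parse_qs() all come from urllib.parse.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

base_url = "https://example.com/search"  # hypothetical endpoint
params = {"name": "John Doe", "tag": "a&b"}  # hypothetical values

# Default behavior: spaces become "+", "&" inside a value becomes %26.
query = urlencode(params)
url = f"{base_url}?{query}"
print(url)  # https://example.com/search?name=John+Doe&tag=a%26b

# Pass quote_via=quote if you want %20 instead of "+" for spaces.
query_pct = urlencode(params, quote_via=quote)
print(query_pct)  # name=John%20Doe&tag=a%26b

# parse_qs() decodes the query string; each value is a list because
# a key may appear more than once in a query string.
parsed = parse_qs(urlsplit(url).query)
print(parsed)  # {'name': ['John Doe'], 'tag': ['a&b']}
```

Encoding each value this way keeps an & or = inside a value from being read as a separator between parameters.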

In summary, URL encoding and decoding let you handle spaces and special characters in URLs safely. Python's urllib.parse makes both directions easy: use quote() and unquote() for individual strings, and urlencode() for query parameters.
