Encoding URLs with Python's urllib

Feb 6, 2024 · 2 min read

When building URLs in Python, you may need to include special characters or spaces that can cause issues if left unencoded. Thankfully, Python's urllib library provides a simple way to encode these characters.

The urllib.parse.urlencode() function takes a dictionary (or a sequence of two-element tuples) and converts it into a URL-encoded query string that can be appended to a URL. Here's a quick example:

import urllib.parse

params = {'q': 'python urllib', 'type': 'articles', 'sort': 'relevance'}
url = 'https://www.example.com/search?' + urllib.parse.urlencode(params)

print(url)

This would print:

https://www.example.com/search?q=python+urllib&type=articles&sort=relevance

The key things urlencode does:

  • Converts spaces to + signs
  • Converts special characters to their percent-encoded hexadecimal form (for example, & becomes %26)

So behind the scenes, it converts the space in "python urllib" to a + sign.
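
To make both conversions concrete, here is a minimal sketch. The parameter names and values are made up for illustration; quote_via is an optional urlencode() argument that swaps the default + encoding of spaces for %20.

```python
from urllib.parse import quote, urlencode

# Hypothetical parameters chosen to include a reserved character (&)
# and a non-ASCII character (é).
params = {'q': 'fish & chips', 'lang': 'café'}

# Default behaviour: spaces become +, other special characters become %XX.
encoded = urlencode(params)
print(encoded)  # q=fish+%26+chips&lang=caf%C3%A9

# With quote_via=quote, spaces are encoded as %20 instead of +.
encoded_pct = urlencode(params, quote_via=quote)
print(encoded_pct)  # q=fish%20%26%20chips&lang=caf%C3%A9
```

Non-ASCII text is encoded as UTF-8 bytes by default, which is why é becomes %C3%A9.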

Some common use cases for urlencode:

  • Building search URLs with query parameters
  • Passing non-ASCII text in URLs
  • Including spaces or special characters in URLs

A few things to watch out for:

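  • Browsers and servers impose practical limits on URL length, so encoding a very long string may exceed them
  • Some characters like / and ? have special meaning in URLs. urlencode() escapes them inside parameter values, but other parts of a URL, such as path segments, need to be encoded separately with urllib.parse.quote()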
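
As a rough illustration of how / and ? are treated, the sketch below encodes one made-up value three ways. The value and the 'next' parameter name are assumptions chosen for the example.

```python
from urllib.parse import quote, urlencode

# Hypothetical value containing characters that act as URL delimiters.
value = 'docs/faq?page=2'

# Inside a query string, urlencode escapes /, ? and = in the value.
in_query = urlencode({'next': value})
print(in_query)  # next=docs%2Ffaq%3Fpage%3D2

# quote() leaves / unescaped by default, which suits whole paths.
as_path = quote(value)
print(as_path)  # docs/faq%3Fpage%3D2

# Pass safe='' to escape / as well, e.g. for a single path segment.
as_segment = quote(value, safe='')
print(as_segment)  # docs%2Ffaq%3Fpage%3D2
```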

To decode an encoded query string back to a dictionary or a list of pairs, use urllib.parse.parse_qs() or urllib.parse.parse_qsl():

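import urllib.parse

url = "https://www.example.com/search?q=python+urllib&type=articles"
# parse_qs expects only the query string, so split it off the URL first
query = urllib.parse.urlsplit(url).query
print(urllib.parse.parse_qs(query))  # {'q': ['python urllib'], 'type': ['articles']}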
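
For comparison, here is a small sketch of parse_qsl(), which returns a list of (key, value) pairs rather than a dictionary of lists. It reuses the example URL above and assumes each key appears only once, so the pairs can be collapsed into a plain dict and re-encoded.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

url = "https://www.example.com/search?q=python+urllib&type=articles"
query = urlsplit(url).query  # 'q=python+urllib&type=articles'

# parse_qsl keeps the original order and yields one pair per parameter.
pairs = parse_qsl(query)
print(pairs)  # [('q', 'python urllib'), ('type', 'articles')]

# With unique keys, the pairs convert to a dict and encode back unchanged.
params = dict(pairs)
round_trip = urlencode(params)
print(round_trip == query)  # True
```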

This covers the basics of encoding parameters into URLs with Python. The urllib.parse module contains many other helpful functions for manipulating URLs. When building URLs, be mindful of encoding special characters correctly.
