Accessing URLs Requiring Authentication with Python's urllib

When accessing web URLs that require authentication, Python's urllib package provides a simple way to supply credentials and access protected resources. Whether you need to pull data from a website or an API endpoint, you can send HTTP Basic authentication credentials with the request using only the standard library.

Here's a quick example accessing a protected URL:

import base64
import urllib.request

username = 'myusername'
password = 'mypassword'

url = 'https://api.example.com/data'

# Basic auth expects base64("username:password"), not a URL-encoded query string
credentials = f'{username}:{password}'.encode('utf-8')
token = base64.b64encode(credentials).decode('ascii')

request = urllib.request.Request(url)
request.add_header('Authorization', f'Basic {token}')

# Sends the request; raises urllib.error.HTTPError on an error status such as 401
response = urllib.request.urlopen(request)
data = response.read()

We join the username and password with a colon, Base64-encode the result using base64.b64encode, and add an Authorization header to the request with the encoded credentials.
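
The Basic scheme expects the header value to be the word Basic followed by the Base64 encoding of username:password. Here is a small sketch of that encoding step on its own. The name basic_auth_header is a hypothetical helper, not part of urllib, and the sample credentials are the well-known example from the HTTP authentication RFCs:

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header (hypothetical helper)."""
    # Join with a colon, then Base64-encode the UTF-8 bytes
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


print(basic_auth_header("Aladdin", "open sesame"))
# prints: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

The returned string can be passed straight to request.add_header('Authorization', ...).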

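Because the credentials travel in the Authorization header of every request, there are no cookies or sessions to manage. Always use HTTPS for this: Basic credentials are only encoded, not encrypted, so anyone who can read the traffic can recover them.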
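
If you would rather let urllib answer the server's authentication challenge itself, the standard library also provides HTTPBasicAuthHandler. The following is a minimal sketch, assuming the server replies with a 401 and a WWW-Authenticate header before credentials are sent. The name build_auth_opener is a hypothetical helper, and the URL is the placeholder used in the example above:

```python
import urllib.request


def build_auth_opener(top_level_url, username, password):
    """Return an opener that answers Basic auth challenges (hypothetical helper)."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # realm=None: use these credentials for any realm under top_level_url
    password_mgr.add_password(None, top_level_url, username, password)
    handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
    return urllib.request.build_opener(handler)


opener = build_auth_opener('https://api.example.com/', 'myusername', 'mypassword')
# To fetch a protected page (this line performs a network request):
# with opener.open('https://api.example.com/data') as response:
#     data = response.read()
```

Servers that never send a 401 challenge will not trigger this handler, in which case setting the Authorization header by hand is the more reliable choice.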

Tips

Use a context manager to automatically close the response:

with urllib.request.urlopen(request) as response:
    data = response.read()

If authentication fails, urlopen raises urllib.error.HTTPError (typically with status code 401). Catch it to handle invalid credentials.
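
A sketch of what that handling might look like. The function name fetch and its opener parameter are hypothetical, introduced here only so the network call can be swapped out, and treating 401 and 403 as "bad credentials" is an assumption about the server:

```python
import urllib.error
import urllib.request


def fetch(request, opener=urllib.request.urlopen):
    """Return the response body, or None if the server rejects the credentials."""
    try:
        with opener(request) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        # Assumption: 401/403 mean the credentials were refused
        if err.code in (401, 403):
            print(f"Authentication failed: {err.code} {err.reason}")
            return None
        # Any other HTTP error is not an auth problem, so let it propagate
        raise
```

Calling fetch(request) with the request built earlier returns the body on success and None when the credentials are rejected.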

Scenarios

Accessing APIs that require an API key

Pulling reports from a web app that requires login

Scraping data from a website that uses basic auth protection

Using urllib for basic access authentication provides a simple way to supply credentials for restricted URLs. With a few lines of code, you can access protected resources and data.
