Bypassing Captcha with Selenium and Anti-Captcha Services

Oct 4, 2023 · 5 min read

Captcha challenges are a common headache when trying to automate web interactions using Selenium. Thankfully, anti-captcha services provide a straightforward way to bypass captcha protections programmatically. This guide will walk through the key steps to get around captchas using Python, Selenium, and Anti-Captcha.

Overview of Captcha and Anti-Captcha Providers

Captchas (short for Completely Automated Public Turing test to tell Computers and Humans Apart) are used on many websites to keep bots and automated scripts from abusing services. They typically ask users to solve a visual or audio challenge to verify they are human.


Anti-captcha services use real humans to solve captcha tests behind the scenes. For a small payment, they will interpret captcha images or audio prompts and return the correct solution to your code. Well-known anti-captcha services include Anti-Captcha, 2Captcha, and DeathByCaptcha.

Using anti-captcha services, you can automatically send captcha challenges to be solved and bypass the protections they enforce.

Retrieving the Captcha Site Key

The first step is to retrieve the site key, which is stored in the data-sitekey attribute of the captcha element on the page. This key tells the anti-captcha service which captcha to solve.

Using Selenium in Python, you can extract the captcha site key like this:

captcha_site_key = browser.find_element(By.XPATH, '//*[@id="recaptcha-demo"]').get_attribute('outerHTML')

cleaned_site_key = captcha_site_key.split('" data-callback')[0].split('data-sitekey="')[1]

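The site key is parsed from the element's outer HTML with two split() calls: the first keeps everything before '" data-callback', and the second keeps what follows 'data-sitekey="'. This only works when the data-callback attribute comes immediately after data-sitekey in the markup, so check the page source if you adapt it to another site.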
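Because the split() approach depends on attribute order, a parser-based lookup may be more robust. The following is a minimal sketch using only the standard library; SiteKeyParser, extract_site_key, and the sample markup are hypothetical names and data introduced for illustration, not part of the original script.

```python
from html.parser import HTMLParser


class SiteKeyParser(HTMLParser):
    """Collects the first data-sitekey attribute found in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.site_key = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; keep only the first match.
        if self.site_key is None:
            for name, value in attrs:
                if name == "data-sitekey" and value:
                    self.site_key = value
                    break


def extract_site_key(html):
    """Return the data-sitekey value from html, or None if it is absent."""
    parser = SiteKeyParser()
    parser.feed(html)
    return parser.site_key


# Hypothetical markup resembling a captcha container; the key is a placeholder.
sample_html = (
    '<div id="recaptcha-demo" class="g-recaptcha" '
    'data-callback="onSuccess" data-sitekey="EXAMPLE_KEY"></div>'
)
print(extract_site_key(sample_html))
```

In the Selenium script, the outerHTML string could be passed to extract_site_key() in place of the two split() calls. Reading the attribute directly with get_attribute('data-sitekey') on the element should also work when the attribute sits on the element you located.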

Configuring the Anti-Captcha Client

Next, instantiate the solver (the recaptchaV2Proxyless class from the anticaptchaofficial package), then set your API key, the target website URL, and the cleaned site key:

solver = recaptchaV2Proxyless()
solver.set_verbose(1)
solver.set_key(os.environ["anticaptcha_api_key"])
solver.set_website_url(captcha_url)
solver.set_website_key(cleaned_site_key)

The set_verbose(1) line prints status updates so you can monitor the progress.
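Reading the key with os.environ[...] raises a bare KeyError when the variable is missing. As a small sketch, the helper below fails with a clearer message instead; load_api_key is a hypothetical name, and the environ parameter exists only so the function can be exercised without touching the real environment.

```python
import os


def load_api_key(env_var="anticaptcha_api_key", environ=None):
    """Return the API key stored in env_var, raising a clear error if unset."""
    environ = os.environ if environ is None else environ
    # Treat a missing variable and a blank one the same way.
    key = environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Anti-Captcha API key"
        )
    return key
```

It would be used as solver.set_key(load_api_key()) in place of the direct os.environ lookup.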

Solving Captcha and Inserting the Response

With the client configured, you can programmatically solve the captcha like this:

captcha_response = solver.solve_and_return_solution()

if captcha_response != 0:
  print("Captcha responded: "+captcha_response)
else:
  print("failed with error: "+solver.error_code)
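A single failed attempt does not have to end the script. The sketch below wraps the call in a simple retry loop, relying only on the behavior shown above (a return value of 0 on failure and an error_code attribute). The function name, the attempt count, and the delay are assumptions chosen for illustration.

```python
import time


def solve_with_retries(solver, max_attempts=3, delay_seconds=5.0, sleep=time.sleep):
    """Call solver.solve_and_return_solution() until it yields a token.

    Returns the token string, or None if every attempt returned 0.
    """
    for attempt in range(1, max_attempts + 1):
        token = solver.solve_and_return_solution()
        if token != 0:
            return token
        print(f"Attempt {attempt} failed with error: {solver.error_code}")
        # Pause between attempts, but not after the last one.
        if attempt < max_attempts:
            sleep(delay_seconds)
    return None
```

The sleep parameter is injectable so the loop can be tested without real delays; in normal use it can be left at its default.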

If successful, captcha_response will contain the captcha solution token. You can then insert this into the page using Selenium:

browser.execute_script('document.getElementById("g-recaptcha-response").innerHTML = arguments[0]', captcha_response)

This places the token in the g-recaptcha-response field, which the form submits as proof that the captcha was solved.
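The injection step can be wrapped in a small helper that refuses to run with a failed result and reports a missing field. This is a sketch under two assumptions: the field is a textarea with the id g-recaptcha-response, as in the standard reCAPTCHA v2 widget, and driver is any object with Selenium's execute_script(script, *args) method. The helper name and script constant are hypothetical.

```python
SET_TOKEN_SCRIPT = """
var field = document.getElementById("g-recaptcha-response");
if (!field) { return false; }
field.innerHTML = arguments[0];
field.value = arguments[0];
return true;
"""


def inject_captcha_token(driver, token):
    """Write the solver token into the captcha response field."""
    # The solver returns 0 on failure, so reject anything but a real string.
    if not isinstance(token, str) or not token:
        raise ValueError("expected a non-empty token string from the solver")
    found = driver.execute_script(SET_TOKEN_SCRIPT, token)
    if not found:
        raise RuntimeError("g-recaptcha-response field not found on the page")
```

Setting value as well as innerHTML is a defensive choice for textarea elements. With the script from this guide, the call would be inject_captcha_token(browser, captcha_response).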

Submitting the Captcha-Protected Form

Finally, you can locate and click the submit button to complete the captcha-protected form submission:

browser.find_element(By.XPATH, '//*[@id="recaptcha-demo-submit"]').click()

And that's it! The anti-captcha service will seamlessly solve the captcha challenges for you, enabling automated form submissions.

Here is the full code example with descriptive variable names:

import os
import sys
import time

from anticaptchaofficial.recaptchav2proxyless import recaptchaV2Proxyless
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager

# Selenium 4 takes the driver path through a Service object
browser = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
captcha_url = "https://www.google.com/recaptcha/api2/demo"

browser.get(captcha_url)
time.sleep(10)  # crude wait for the captcha widget to load

# Parse the site key out of the captcha container's outer HTML
captcha_site_key = browser.find_element(By.XPATH, '//*[@id="recaptcha-demo"]').get_attribute('outerHTML')
cleaned_site_key = captcha_site_key.split('" data-callback')[0].split('data-sitekey="')[1]
print(cleaned_site_key)

# Configure the Anti-Captcha client
solver = recaptchaV2Proxyless()
solver.set_verbose(1)
solver.set_key(os.environ["anticaptcha_api_key"])
solver.set_website_url(captcha_url)
solver.set_website_key(cleaned_site_key)

captcha_response = solver.solve_and_return_solution()

if captcha_response != 0:
    print("Captcha responded: "+captcha_response)
else:
    # Without a token there is nothing to submit, so stop here
    print("failed with error: "+solver.error_code)
    browser.quit()
    sys.exit(1)

# Unhide the response field, write the token into it, then hide it again
browser.execute_script('var element=document.getElementById("g-recaptcha-response"); element.style.display="";')
browser.execute_script("""document.getElementById("g-recaptcha-response").innerHTML = arguments[0]""", captcha_response)
browser.execute_script('var element=document.getElementById("g-recaptcha-response"); element.style.display="none";')

browser.find_element(By.XPATH, '//*[@id="recaptcha-demo-submit"]').click()
time.sleep(10)  # leave time for the result page to load
browser.quit()

Using these techniques, you can leverage anti-captcha services to bypass captcha protections in your web automation scripts. The human solvers handle the challenges behind the scenes, removing the captcha obstacle.

Rather than building and managing your own captcha solving infrastructure, services like Proxies API handle all of this complexity for you.

With Proxies API, you make a simple API request with the target URL. It will handle:

  • Rotating proxies and IP addresses
  • Rotating user agents
  • Solving captchas
  • Running JavaScript

It then returns the rendered HTML, so there is no need to orchestrate the numerous steps required for reliable captcha solving.

For example:

curl "http://api.proxiesapi.com/?key=API_KEY&render=true&url=https://targetpage.com"

This takes care of all the headaches of automation. No proxies, browsers, or captcha solving services to manage.

Proxies API offers 1000 free API calls to get started. Check it out if you need to integrate robust captcha solving and proxy rotation in your projects.
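To issue the same request from Python, the URL can be assembled with the standard library so that the target address is percent-encoded. This is only a sketch: the endpoint and the key, render, and url parameters are taken from the curl example above, while build_proxiesapi_url is a hypothetical helper, and it assumes the service decodes an encoded url parameter in the usual way.

```python
from urllib.parse import urlencode


def build_proxiesapi_url(api_key, target_url, render=True):
    """Build the request URL using the parameters from the curl example."""
    params = {"key": api_key}
    if render:
        params["render"] = "true"
    # urlencode percent-encodes the target URL so its own query string survives.
    params["url"] = target_url
    return "http://api.proxiesapi.com/?" + urlencode(params)


print(build_proxiesapi_url("API_KEY", "https://targetpage.com"))
# prints http://api.proxiesapi.com/?key=API_KEY&render=true&url=https%3A%2F%2Ftargetpage.com
```

The resulting string can be passed to any HTTP client, for example urllib.request.urlopen.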
