How to Bypass PerimeterX in 2024

Apr 30, 2024 · 6 min read

PerimeterX is a powerful bot detection and mitigation system. It protects websites from automated attacks and unwanted scraping. As a web scraper or developer, you may have encountered PerimeterX blocking your requests. In this article, we'll explore how PerimeterX works and various methods to bypass it in 2024.

How does PerimeterX work?

PerimeterX uses a combination of techniques to detect and block bots:

  • JavaScript challenge: It injects a JavaScript snippet into the website that analyzes user behavior.
  • Fingerprinting: It collects browser and device fingerprints to identify unique users.
  • Machine learning: It uses ML algorithms to detect anomalies and suspicious patterns.

    When a request is deemed suspicious, PerimeterX blocks it and displays an error page.

    Popular PerimeterX Errors

    Here are some common PerimeterX error messages you might encounter:

  • "Access to this page has been denied."
  • "Please contact the site administrator, if you believe you have received this message in error."
  • "Blocked by PerimeterX"

    These errors indicate that PerimeterX has identified your request as coming from a bot.
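    To react to these blocks programmatically, it helps to recognize them first. Here's a minimal sketch of a detector: the marker strings come from the block-page messages above plus the `_pxCaptcha` cookie name PerimeterX commonly uses, and the 403 status is typical but not guaranteed, so treat these as heuristics rather than an official list.

```python
# Heuristic detector for a PerimeterX block response. The markers below
# are common observations (block-page text and the _pxCaptcha cookie name),
# not an exhaustive or official list.
BLOCK_MARKERS = (
    "Access to this page has been denied",
    "Blocked by PerimeterX",
    "_pxCaptcha",  # cookie/script name used in PerimeterX's CAPTCHA flow
)

def looks_blocked(status_code: int, body: str) -> bool:
    # Block pages are usually served with a 403, but check the body too,
    # since some sites return a 200 with a challenge page
    return status_code == 403 or any(marker in body for marker in BLOCK_MARKERS)

print(looks_blocked(200, "<html>Blocked by PerimeterX</html>"))  # True
print(looks_blocked(200, "<html>Welcome!</html>"))               # False
```

    Hooking a check like this into your scraper lets you switch proxies or back off as soon as a block appears, instead of parsing garbage.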

    How it detects bots

    PerimeterX employs various techniques to detect bots:

  • User behavior analysis: It monitors mouse movements, keystrokes, and other user interactions.
  • IP reputation: It checks if the IP address has a history of malicious activity.
  • Request patterns: It analyzes request frequency, headers, and payloads for anomalies.

    By combining these signals, PerimeterX can accurately identify and block bot traffic.
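    Some of these signals can be softened on the client side. As an illustrative sketch (the User-Agent strings and delay range below are arbitrary examples, not tuned values), spacing requests with random jitter and sending browser-like headers makes the request pattern less uniform:

```python
import random
import time

# Example browser User-Agent strings - keep these current with real
# browser releases if you use this approach
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
]

def browser_headers() -> dict:
    """Build a plausible browser header set with a random User-Agent."""
    return {
        "User-Agent": random.choice(USER_AGENTS),
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Language": "en-US,en;q=0.9",
    }

def jittered_pause(low: float = 2.0, high: float = 6.0) -> None:
    """Sleep a random interval so request timing is not perfectly regular."""
    time.sleep(random.uniform(low, high))

print(browser_headers()["Accept-Language"])  # en-US,en;q=0.9
```

    Calling `jittered_pause()` between requests and `browser_headers()` on each one won't defeat behavioral analysis on its own, but it removes the most obvious machine-like regularities.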

    Methods to bypass PerimeterX

    Here are some effective methods to bypass PerimeterX in 2024:

    1. Use rotating proxies

    Rotating proxies are a great way to avoid IP-based detection. They allow you to make requests from different IP addresses, making it harder for PerimeterX to track your activity.

    Here's an example of using rotating proxies with Python's requests library:

    import itertools

    import requests

    # Hypothetical proxy endpoints - substitute your provider's credentials
    proxy_pool = itertools.cycle([
        'http://user:pass@proxy1.example.com',
        'http://user:pass@proxy2.example.com',
    ])

    # Each request goes out through the next proxy in the pool
    for _ in range(2):
        proxy = next(proxy_pool)
        response = requests.get('https://example.com',
                                proxies={'http': proxy, 'https': proxy})
    

    Using a professional rotating proxy service like Proxies API can simplify the process and handle complexities like CAPTCHA solving and user-agent rotation.

    2. Use headless browsers

    Headless browsers run JavaScript and can simulate real user interactions, making them effective against PerimeterX. Popular tools for driving them include Puppeteer, Selenium, and Playwright.

    Here's an example of using Puppeteer to bypass PerimeterX:

    const puppeteer = require('puppeteer');

    (async () => {
      // A visible (non-headless) window is harder for PerimeterX to
      // fingerprint than the default headless mode
      const browser = await puppeteer.launch({ headless: false });
      const page = await browser.newPage();
      await page.goto('https://example.com');
      const content = await page.content();
      console.log(content);
      await browser.close();
    })();
    

    3. Use CAPTCHA bypass

    PerimeterX may present CAPTCHAs to verify human users. Using a CAPTCHA solving service can help automate the process.

    Here's an example of using the 2captcha API to solve CAPTCHAs:

    import base64
    import time

    import requests

    api_key = 'YOUR_API_KEY'
    captcha_url = 'https://example.com/captcha.png'

    # Download the CAPTCHA image and base64-encode it: the 2Captcha
    # base64 method expects the image data itself, not its URL
    image_b64 = base64.b64encode(requests.get(captcha_url).content).decode()

    response = requests.post('http://2captcha.com/in.php',
                             data={'key': api_key, 'method': 'base64', 'body': image_b64})
    captcha_id = response.text.split('|')[1]

    while True:
        time.sleep(5)  # poll politely; solving takes a few seconds
        response = requests.get(
            f'http://2captcha.com/res.php?key={api_key}&action=get&id={captcha_id}')
        if response.text.startswith('OK|'):
            captcha_solution = response.text.split('|')[1]
            break

    # Use the captcha_solution in your request
    

    4. Scrape Google Cache

    If the website allows search-engine indexing, you can retrieve Google's cached copy instead of the live page, which sidesteps PerimeterX entirely. Keep in mind that cached copies can be stale or missing, and Google has been winding down its page cache.

    Here's an example of scraping Google Cache using Python:

    import requests
    from bs4 import BeautifulSoup

    url = 'https://example.com'
    cache_url = f'http://webcache.googleusercontent.com/search?q=cache:{url}'

    response = requests.get(cache_url)
    soup = BeautifulSoup(response.text, 'html.parser')

    # Extract the cached content ('main-content' is a placeholder id;
    # adjust the selector to the target site's actual markup)
    cached_content = soup.find('div', {'id': 'main-content'}).text
    print(cached_content)
    

    5. Solve the PerimeterX JavaScript challenge

    PerimeterX injects a JavaScript challenge that needs to be solved to prove you're a human. You can use a headless browser or a specialized library to execute the JavaScript and extract the required parameters.

    Publicly available libraries that solve the PerimeterX challenge directly are scarce, so the most reliable route is to let a real browser execute it. Here's a sketch using Puppeteer that loads the page, lets the challenge script run, and collects the _px* cookies it sets, which you can then reuse in subsequent requests:

    const puppeteer = require('puppeteer');

    (async () => {
      const browser = await puppeteer.launch({ headless: false });
      const page = await browser.newPage();
      await page.goto('https://example.com', { waitUntil: 'networkidle2' });

      // PerimeterX stores its tokens in cookies prefixed with _px;
      // capture them once the challenge script has run
      const cookies = await page.cookies();
      const pxCookies = cookies.filter(c => c.name.startsWith('_px'));
      console.log(pxCookies);

      await browser.close();
    })();
    

    Conclusion

    Bypassing PerimeterX requires a combination of techniques and tools. Rotating proxies, headless browsers, CAPTCHA solving, scraping Google Cache, and solving JavaScript challenges are effective methods to overcome PerimeterX's bot detection.

    However, it's important to note that these methods may still be prone to detection if not implemented carefully. Using a professional rotating proxy service like Proxies API can simplify the process and handle complexities behind the scenes.

    With Proxies API, you can make requests through millions of high-speed rotating proxies located worldwide. It offers automatic IP rotation, user-agent string rotation, and CAPTCHA solving, making it a reliable solution for bypassing PerimeterX.

    Here's an example of using Proxies API in any programming language:

    curl "http://api.proxiesapi.com/?key=API_KEY&render=true&url=https://example.com"
    

    Proxies API offers 1000 API calls completely free. Register and get your free API key here.

    FAQs

    What is PerimeterX used for?

    PerimeterX is used to protect websites from automated attacks, bot traffic, and unwanted scraping. It helps secure websites by detecting and mitigating malicious bot activities.

    What does PerimeterX do?

    PerimeterX employs various techniques to detect and block bot traffic. It analyzes user behavior, fingerprints devices, and uses machine learning algorithms to identify suspicious patterns. When a request is deemed malicious, PerimeterX blocks it and displays an error page.

    What is PerimeterX Bot Defender?

    Bot Defender is PerimeterX's flagship bot protection product. It provides advanced bot detection and mitigation capabilities to safeguard websites from automated threats.

    What is PerimeterX Code Defender?

    PerimeterX Code Defender is a client-side security solution that protects websites from malicious JavaScript code injection and client-side attacks. It monitors and sanitizes client-side code execution to prevent unauthorized modifications and data exfiltration.

    What is Cloudflare bot management?

    Cloudflare Bot Management is a service provided by Cloudflare that helps websites identify and mitigate bot traffic. It uses advanced algorithms and machine learning to detect and block malicious bots while allowing legitimate traffic to pass through.

    Who acquired PerimeterX?

    In 2022, PerimeterX merged with HUMAN Security, a bot protection and fraud prevention company; the combined business operates under the HUMAN brand.

    What are some other anti-bot services?

    Some other popular anti-bot services include:

  • Akamai Bot Manager
  • Cloudflare Bot Management
  • Imperva Bot Protection
  • DataDome Bot Protection
  • ShieldSquare Bot Prevention

    These services offer similar capabilities to PerimeterX, focusing on detecting and mitigating bot traffic to protect websites.

    Is it legal to scrape PerimeterX protected pages?

    The legality of scraping PerimeterX protected pages depends on various factors, such as the website's terms of service, the purpose of scraping, and the applicable laws in your jurisdiction. It's important to review the website's robots.txt file and terms of service to understand their scraping policies. If scraping is prohibited, it's advisable to seek permission from the website owner before proceeding.
