In this article, we will learn how to scrape property listings from Booking.com using Elixir. We will use the HTTPoison library to fetch the HTML and Floki to parse it and extract details such as property name, location, and rating.
Prerequisites
To follow along, you will need Elixir installed, along with Mix, the build tool that ships with it.
Adding Dependencies
We will use HTTPoison for sending requests and Floki for HTML parsing.
Add them to the deps function in mix.exs:
def deps do
  [
    {:httpoison, "~> 1.8"},
    {:floki, "~> 0.30.0"}
  ]
end
Run mix deps.get to download and compile them.
Importing Libraries
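Both libraries can be called through their module names, so imports are optional. To call get without the HTTPoison prefix, import it with the arity you call. A call that passes a URL, headers, and options needs get/3:
import HTTPoison, only: [get: 3]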
Defining URL

Define the target URL:
url = "https://www.booking.com/searchresults.en-gb.html?ss=New+York&checkin=2023-03-01&checkout=2023-03-05&group_adults=2"
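The query string in this URL can also be assembled from its parts instead of being typed by hand. The following is a minimal sketch using the standard library's URI.encode_query; the SearchUrl module and its build function are illustrative names, not part of the original script.

```elixir
defmodule SearchUrl do
  # Hypothetical helper (not in the original script): builds the search URL
  # from its query parameters.
  @base "https://www.booking.com/searchresults.en-gb.html"

  def build(destination, checkin, checkout, adults) do
    # A keyword list keeps the parameters in the order given.
    # The default :www_form encoding turns spaces into "+".
    query =
      URI.encode_query(
        ss: destination,
        checkin: checkin,
        checkout: checkout,
        group_adults: adults
      )

    @base <> "?" <> query
  end
end

url = SearchUrl.build("New York", "2023-03-01", "2023-03-05", 2)
IO.puts(url)
```

Building the URL this way makes it easier to change the destination or dates without worrying about escaping.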
Setting User Agent
Set the User Agent header:
user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36"
Fetching the Page
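Make the GET request and pull the HTML out of the response. HTTPoison.get returns an {:ok, response} tuple on success, so we pattern match on it:
headers = [{"User-Agent", user_agent}]
{:ok, response} = HTTPoison.get(url, headers)
html = response.body
The User Agent is sent as an ordinary request header.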
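HTTPoison.get returns a tagged tuple, either {:ok, response} or {:error, error}, so a failed request or a non-200 status can be handled explicitly rather than crashing on a match. A minimal sketch follows; FetchResult.body_or_error is a hypothetical helper, and it matches on plain map keys so it can be tried without making a network call.

```elixir
defmodule FetchResult do
  # Hypothetical helper: turns the result of HTTPoison.get into
  # {:ok, body} or {:error, reason}.
  # HTTPoison.Response and HTTPoison.Error are structs, so matching on
  # their map keys (status_code, body, reason) works for them too.
  def body_or_error({:ok, %{status_code: 200, body: body}}), do: {:ok, body}

  def body_or_error({:ok, %{status_code: status}}),
    do: {:error, {:unexpected_status, status}}

  def body_or_error({:error, %{reason: reason}}), do: {:error, reason}
end

# Stand-in values shaped like HTTPoison results, for illustration only.
IO.inspect(FetchResult.body_or_error({:ok, %{status_code: 200, body: "<html></html>"}}))
IO.inspect(FetchResult.body_or_error({:ok, %{status_code: 403, body: ""}}))
IO.inspect(FetchResult.body_or_error({:error, %{reason: :timeout}}))
```

In the scraper, the result of the request could be piped straight into this function before parsing.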
Parsing the HTML
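Parse the HTML with Floki. Floki.parse_document returns an {:ok, document} tuple, so we match on it to get the parsed tree:
{:ok, page} = Floki.parse_document(html)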
Extracting Cards

Get the div elements whose data-testid attribute is property-card:
cards = page |> Floki.find("div[data-testid='property-card']")
This returns a list with one node per property card.
Processing Each Card
Loop through the cards:
cards |> Enum.each(fn card ->
  # Extract data from card
end)
Inside the function, we can extract details from each card.
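Enum.each runs a function for its side effects and returns :ok, which suits printing. If the extracted values need to be kept, Enum.map returns them as a list instead. A small sketch of the difference, using plain maps as stand-ins for the parsed cards (the sample data is made up for illustration):

```elixir
# Stand-in data: each "card" is a plain map here, where the real script
# would have a parsed HTML node.
cards = [
  %{name: "Hotel A", location: "Manhattan"},
  %{name: "Hotel B", location: "Brooklyn"}
]

# Enum.each/2 calls the function for every element and returns :ok.
:ok = Enum.each(cards, fn card -> IO.puts(card.name) end)

# Enum.map/2 returns a new list built from each function result.
names = Enum.map(cards, fn card -> card.name end)
IO.inspect(names)
```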
Extracting Title
Get the text of the h3 element, which holds the property name:
title = card |> Floki.find("h3") |> Floki.text()
Extracting Location
Get the text of the address span:
location = card |> Floki.find("span[data-testid='address']") |> Floki.text()
Extracting Rating
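Get the aria-label attribute of the rating div:
rating = card |> Floki.find("div.e4755bbd60") |> Floki.attribute("aria-label") |> List.first()
Floki.attribute returns a list of values, so List.first takes the single one we want. The div is selected by class name. Class names like e4755bbd60 appear to be auto-generated and may change, so check the live markup before relying on them.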
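The aria-label value is a string, so it needs parsing before it can be compared or sorted as a number. The sketch below assumes the label looks like "Scored 8.5", which is an assumption about the page rather than something the script guarantees; Rating.parse is a hypothetical helper name.

```elixir
defmodule Rating do
  # Hypothetical helper: pulls the first number out of a label such as
  # "Scored 8.5" (an assumed format; check the live markup).
  def parse(label) when is_binary(label) do
    with [number] <- Regex.run(~r/\d+(?:\.\d+)?/, label),
         {value, _rest} <- Float.parse(number) do
      {:ok, value}
    else
      _ -> :error
    end
  end

  # Anything that is not a string (for example nil when the attribute
  # is missing) is treated as "no rating".
  def parse(_), do: :error
end

IO.inspect(Rating.parse("Scored 8.5"))
IO.inspect(Rating.parse(nil))
```

Returning {:ok, value} or :error keeps cards without a rating from crashing the loop.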
Extracting Review Count
Get the text of the review count div:
review_count = card |> Floki.find("div.abf093bdfe") |> Floki.text()
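The review count arrives as text rather than a number. Assuming it looks something like "1,234 reviews" (an assumed format, not confirmed by the script), a small helper can turn it into an integer. ReviewCount.parse is a hypothetical name used only for this sketch.

```elixir
defmodule ReviewCount do
  # Hypothetical helper: extracts the integer from text such as
  # "1,234 reviews" (an assumed format).
  def parse(text) when is_binary(text) do
    # Match a digit followed by any mix of digits and thousands separators.
    case Regex.run(~r/\d[\d,]*/, text) do
      [digits] ->
        {:ok, digits |> String.replace(",", "") |> String.to_integer()}

      nil ->
        :error
    end
  end
end

IO.inspect(ReviewCount.parse("1,234 reviews"))
IO.inspect(ReviewCount.parse(""))
```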
Extracting Description
Get description div text:
description = card |> Floki.find("div.d7449d770c") |> Floki.text()
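Text pulled out of nested markup can carry stray newlines and repeated spaces, since the text of several child nodes is joined together. A minimal cleanup sketch using only the standard library; TextClean.squish is a hypothetical helper name.

```elixir
defmodule TextClean do
  # Hypothetical helper: collapses runs of whitespace (including newlines)
  # into single spaces and trims both ends.
  def squish(text) when is_binary(text) do
    text
    |> String.split()
    |> Enum.join(" ")
  end
end

IO.puts(TextClean.squish("  Entire apartment \n  1 bedroom   1 bathroom "))
```

The same helper could be applied to the title and location before printing them.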
Printing the Data
Print out the extracted details:
IO.puts("Name: #{title}")
IO.puts("Location: #{location}")
IO.puts("Rating: #{rating}")
IO.puts("Review Count: #{review_count}")
IO.puts("Description: #{description}")
Full Script
Here is the complete scraping script:
# All calls are module-qualified, so no imports are needed.
url = "https://www.booking.com/searchresults.en-gb.html?ss=New+York&checkin=2023-03-01&checkout=2023-03-05&group_adults=2"
user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36"
# HTTPoison.get returns {:ok, response}; the User Agent is sent as a header.
{:ok, response} = HTTPoison.get(url, [{"User-Agent", user_agent}])
html = response.body
# Floki.parse_document returns {:ok, document}.
{:ok, page} = Floki.parse_document(html)
cards = page |> Floki.find("div[data-testid='property-card']")
cards |> Enum.each(fn card ->
  title = card |> Floki.find("h3") |> Floki.text()
  location = card |> Floki.find("span[data-testid='address']") |> Floki.text()
  # Floki.attribute returns a list; take the first value.
  rating = card |> Floki.find("div.e4755bbd60") |> Floki.attribute("aria-label") |> List.first()
  review_count = card |> Floki.find("div.abf093bdfe") |> Floki.text()
  description = card |> Floki.find("div.d7449d770c") |> Floki.text()
  IO.puts("Name: #{title}")
  IO.puts("Location: #{location}")
  IO.puts("Rating: #{rating}")
  IO.puts("Review Count: #{review_count}")
  IO.puts("Description: #{description}")
end)
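This script fetches a Booking.com results page and extracts key details from each listing using Elixir. The same approach carries over to other sites that serve their listings as static HTML, with the selectors adjusted to match.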
While these examples are great for learning, scraping production-level sites can pose challenges like CAPTCHAs, IP blocks, and bot detection. Rotating proxies and automated CAPTCHA solving can help.
Proxies API offers a simple API for rendering pages with built-in proxy rotation, CAPTCHA solving, and evasion of IP blocks. You can fetch rendered pages in any language without configuring browsers or proxies yourself.
This allows scraping at scale without headaches of IP blocks. Proxies API has a free tier to get started. Check out the API and sign up for an API key to supercharge your web scraping.
With Proxies API combined with Elixir libraries like HTTPoison and Floki, you can scrape data at scale without getting blocked.
