How to Scrape Booking.com Hotel Data with Python in 2026

Booking.com lists over 28 million accommodation options in more than 220 countries and territories. Whether you're building a price comparison tool, monitoring competitor rates, or collecting review data for hospitality research, Booking.com is one of the richest travel data sources available.

It's also one of the hardest sites to scrape. This guide covers what actually works in 2026 — the internal API endpoints, browser automation techniques, Datadome bypass strategies, ThorData proxy integration, and a complete price monitoring pipeline.

Booking.com's Anti-Bot Stack

Before writing any code, understand what you're up against. Booking.com uses a multi-layered defense:

  1. Datadome (primary): Enterprise-grade bot detection that analyzes TLS fingerprints, mouse movements, JavaScript execution patterns, canvas fingerprints, and WebGL characteristics. This is the main gatekeeper — it's triggered before your request even reaches Booking's own servers in many cases.

  2. Dynamic page rendering: Prices and availability load asynchronously via XHR/fetch calls after initial page load. A simple requests.get() returns a page shell without any pricing data.

  3. Session-based pricing: Booking.com adjusts displayed prices based on cookies, browsing history, device type, and apparent location. The same hotel shows different rates to different "users."

  4. CAPTCHA challenges: After behavioral anomalies are detected, Datadome serves interactive CAPTCHA challenges that require real user interaction to solve.

  5. Rate limiting: Aggressive throttling kicks in at roughly 20-30 requests per minute from a single IP, with the threshold even lower from datacenter IPs.
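
Given that throttling threshold, request pacing is worth building in from the start. A minimal pacing helper is sketched below; the per-minute budget and jitter range are conservative assumptions on our part, not documented limits:

```python
import random
import time


class RequestPacer:
    """Keep a scraper safely under a per-minute request budget, with jitter
    so requests don't land on a fixed, bot-like cadence."""

    def __init__(self, max_per_minute: int = 10, jitter: tuple = (0.5, 2.5)):
        self.min_interval = 60.0 / max_per_minute  # seconds between requests
        self.jitter = jitter
        self.last_request = 0.0

    def wait(self) -> float:
        """Sleep just long enough to stay under budget; return seconds slept."""
        target = self.min_interval + random.uniform(*self.jitter)
        elapsed = time.monotonic() - self.last_request
        delay = max(0.0, target - elapsed)
        if delay:
            time.sleep(delay)
        self.last_request = time.monotonic()
        return delay
```

Call `pacer.wait()` before each page navigation; a budget of 10/minute leaves comfortable headroom under the observed 20-30/minute threshold.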

Understanding the Data Available

Before diving into code, know what's on each page type:

Search results (/searchresults.html):
- Hotel name, star rating, thumbnail image
- Starting price for selected dates
- Review score and review count
- Distance from city center
- Free cancellation flag
- Availability indicators

Property pages (/hotel/{country}/{slug}.html):
- Full description and facilities list
- Room types with individual pricing
- Photo gallery
- Location map coordinates
- Check-in/check-out policies
- Aggregate review score breakdown

Review pages (/reviewlist.html):
- Individual guest reviews with scores
- Reviewer nationality, room type, trip type
- Positive and negative text sections
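
The search-card fields map naturally onto a small record type. A sketch (the field names here are this guide's, not Booking.com's):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SearchResultHotel:
    """One property card from a /searchresults.html page."""
    name: Optional[str] = None
    price_numeric: Optional[float] = None
    review_score: Optional[float] = None
    review_count: Optional[int] = None
    distance_from_center: Optional[str] = None
    free_cancellation: bool = False
    url: Optional[str] = None
```

A typed record makes downstream storage and comparison code easier to keep consistent than passing raw dicts around.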

Setup

pip install playwright beautifulsoup4   # sqlite3 ships with Python's standard library
playwright install chromium

Approach 1: Intercepting the Search XHR Endpoint

Booking.com's search results page fires an AJAX call that returns structured JSON. By intercepting this with Playwright, you bypass HTML parsing entirely.

import asyncio
import random
import re
import sqlite3
from datetime import datetime
from typing import Dict, List, Optional

from bs4 import BeautifulSoup


STEALTH_SCRIPT = """
Object.defineProperty(navigator, 'webdriver', { get: () => undefined });
Object.defineProperty(navigator, 'plugins', { get: () => [1, 2, 3, 4, 5] });
Object.defineProperty(navigator, 'languages', { get: () => ['en-US', 'en'] });
window.chrome = { runtime: {} };
const origGetParameter = WebGLRenderingContext.prototype.getParameter;
WebGLRenderingContext.prototype.getParameter = function(param) {
    if (param === 37445) return 'Intel Inc.';
    if (param === 37446) return 'Intel Iris OpenGL Engine';
    return origGetParameter.call(this, param);
};
"""


async def scrape_booking_search(
    destination: str,
    checkin: str,
    checkout: str,
    adults: int = 2,
    rooms: int = 1,
    currency: str = "USD",
    proxy_server: Optional[str] = None,
) -> Dict:
    """Scrape Booking.com search results using Playwright with network interception.

    Returns both raw API data (if intercepted) and parsed DOM results.
    """
    intercepted_hotels = []
    raw_api_responses = []

    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        launch_opts = {
            "headless": True,
            "args": [
                "--no-sandbox",
                "--disable-blink-features=AutomationControlled",
                "--disable-dev-shm-usage",
            ],
        }
        if proxy_server:
            launch_opts["proxy"] = {"server": proxy_server}

        browser = await p.chromium.launch(**launch_opts)
        context = await browser.new_context(
            user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36",
            viewport={"width": 1920, "height": 1080},
            locale="en-US",
            timezone_id="America/Chicago",
        )
        await context.add_init_script(STEALTH_SCRIPT)

        page = await context.new_page()

        # Intercept API responses
        async def handle_response(response):
            url = response.url
            if "booking.com" in url and response.status == 200:
                ct = response.headers.get("content-type", "")
                if "json" in ct:
                    try:
                        body = await response.json()
                        raw_api_responses.append({"url": url, "data": body})
                        if "results" in body and isinstance(body["results"], list):
                            intercepted_hotels.extend(body["results"])
                    except Exception:
                        pass

        page.on("response", handle_response)

        search_url = (
            f"https://www.booking.com/searchresults.html"
            f"?ss={destination}"
            f"&checkin={checkin}&checkout={checkout}"
            f"&group_adults={adults}&no_rooms={rooms}"
            f"&selected_currency={currency}"
        )

        await page.goto(search_url, wait_until="networkidle", timeout=60000)

        # Wait for content to load
        try:
            await page.wait_for_selector('[data-testid="property-card"]', timeout=15000)
        except Exception:
            pass  # Continue even if selector not found

        await page.wait_for_timeout(3000)

        # Parse DOM results as fallback/supplement
        html = await page.content()
        dom_hotels = parse_hotel_cards_from_html(html)

        await browser.close()

    return {
        "intercepted_hotels": intercepted_hotels,
        "dom_hotels": dom_hotels,
        "raw_api_responses": len(raw_api_responses),
        "destination": destination,
        "checkin": checkin,
        "checkout": checkout,
    }

Parsing Search Result HTML

When network interception doesn't catch the JSON, parse the DOM directly. Booking.com uses data-testid attributes that are more stable than class names.

def parse_hotel_cards_from_html(html: str) -> List[Dict]:
    """Parse hotel cards from Booking.com search results HTML."""
    soup = BeautifulSoup(html, "html.parser")
    hotels = []

    for card in soup.select('[data-testid="property-card"]'):
        # Core elements with data-testid attributes (more stable than class names)
        name_el = card.select_one('[data-testid="title"]')
        price_el = card.select_one('[data-testid="price-and-discounted-price"]')
        score_el = card.select_one('[data-testid="review-score"]')
        link_el = card.select_one('a[data-testid="title-link"]')
        address_el = card.select_one('[data-testid="address"]')
        distance_el = card.select_one('[data-testid="distance"]')

        # Price extraction
        price_text = price_el.get_text(strip=True) if price_el else ""
        price_numeric = None
        price_match = re.search(r"\$?([\d,]+\.?\d*)", price_text.replace(",", ""))
        if price_match:
            try:
                price_numeric = float(price_match.group(1))
            except ValueError:
                pass

        # Score breakdown
        score_text = score_el.get_text(" ", strip=True) if score_el else ""
        score_numeric = None
        count_numeric = None
        score_match = re.search(r"(\d+\.?\d*)", score_text)
        if score_match:
            try:
                score_numeric = float(score_match.group(1))
            except ValueError:
                pass
        count_match = re.search(r"([\d,]+)\s+review", score_text.replace(",", ""))
        if count_match:
            try:
                count_numeric = int(count_match.group(1))
            except ValueError:
                pass

        # Free cancellation badge
        cancellation_el = card.select_one('[data-testid="cancellation-flexibility"]')
        free_cancel = (
            cancellation_el is not None
            and "free" in cancellation_el.get_text(strip=True).lower()
        )

        # Breakfast included badge
        breakfast_el = card.select_one('[data-testid="breakfast-badge"]')

        hotels.append({
            "name": name_el.get_text(strip=True) if name_el else None,
            "price_display": price_text,
            "price_numeric": price_numeric,
            "review_score": score_numeric,
            "review_count": count_numeric,
            "address": address_el.get_text(strip=True) if address_el else None,
            "distance_from_center": distance_el.get_text(strip=True) if distance_el else None,
            "url": link_el.get("href") if link_el else None,
            "free_cancellation": free_cancel,
            "breakfast_included": breakfast_el is not None,
        })

    return hotels
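
The price and review-count extraction inside that loop reduces to two small helpers that can be tested in isolation against the messy display strings Booking.com actually serves. A standalone sketch of the same regex logic:

```python
import re
from typing import Optional


def parse_price(price_text: str) -> Optional[float]:
    """Extract a numeric price from a display string like '$1,184'."""
    m = re.search(r"([\d.]+)", price_text.replace(",", ""))
    return float(m.group(1)) if m else None


def parse_review_count(score_text: str) -> Optional[int]:
    """Extract the review count from text like 'Scored 8.7 1,204 reviews'."""
    m = re.search(r"(\d+)\s+review", score_text.replace(",", ""))
    return int(m.group(1)) if m else None
```

Stripping commas before matching keeps the patterns simple and avoids `float("1,184")` errors on thousands separators.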

Scraping Individual Hotel Pages

For detailed property data — amenities, room types, full description, coordinates:

async def scrape_hotel_detail(
    hotel_url: str,
    checkin: str,
    checkout: str,
    proxy_server: Optional[str] = None,
) -> Dict:
    """Scrape full hotel detail page."""
    from playwright.async_api import async_playwright

    # Always include dates in URL for pricing
    if "checkin=" not in hotel_url:
        sep = "&" if "?" in hotel_url else "?"
        hotel_url = f"{hotel_url}{sep}checkin={checkin}&checkout={checkout}&group_adults=2&no_rooms=1&selected_currency=USD"

    async with async_playwright() as p:
        launch_opts = {"headless": True, "args": ["--no-sandbox"]}
        if proxy_server:
            launch_opts["proxy"] = {"server": proxy_server}

        browser = await p.chromium.launch(**launch_opts)
        context = await browser.new_context(
            user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36",
            viewport={"width": 1920, "height": 1080},
            locale="en-US",
        )
        await context.add_init_script(STEALTH_SCRIPT)

        page = await context.new_page()
        await page.goto(hotel_url, wait_until="networkidle", timeout=60000)
        await page.wait_for_timeout(2000)

        # Extract structured data via page evaluation
        data = await page.evaluate("""
            () => {
                const get = (sel) => {
                    const el = document.querySelector(sel);
                    return el ? el.innerText.trim() : null;
                };
                const getAttr = (sel, attr) => {
                    const el = document.querySelector(sel);
                    return el ? el.getAttribute(attr) : null;
                };
                const getAll = (sel) => Array.from(document.querySelectorAll(sel)).map(e => e.innerText.trim()).filter(Boolean);

                // Room types from availability table
                const rooms = Array.from(document.querySelectorAll('.hprt-table tbody tr, [data-block="room_block"]')).slice(0, 15).map(row => {
                    const name = row.querySelector('.hprt-roomtype-icon-link, [data-testid="room-type-name"]');
                    const price = row.querySelector('.prco-valign-middle-helper, .bui-price-display__value');
                    const max_occ = row.querySelector('.c-occupancy-icons');
                    return {
                        name: name ? name.innerText.trim() : '',
                        price: price ? price.innerText.trim() : '',
                        max_occupancy: max_occ ? max_occ.innerText.trim() : '',
                    };
                }).filter(r => r.name);

                return {
                    name: get('h2.pp-header__title') || get('[itemprop="name"]'),
                    description: get('#property_description_content, .hp-desc-highlighted, [data-testid="property-description"]'),
                    address: get('.hp_address_subtitle, [data-testid="PropertyHeaderDescription"]'),
                    review_score: parseFloat(get('.d10a6220b4, .b5cd09854e') || '0') || null,
                    review_count: parseInt((get('.d935416c47, .bui-review-score__text') || '0').replace(/[^\d]/g, '')) || 0,
                    latitude: parseFloat(getAttr('[data-atlas-latlng]', 'data-atlas-latlng')?.split(',')[0]) || null,
                    longitude: parseFloat(getAttr('[data-atlas-latlng]', 'data-atlas-latlng')?.split(',')[1]) || null,
                    facilities: getAll('.hp_facilities li, [data-testid="property-facilities-group"] li').slice(0, 40),
                    rooms: rooms,
                    star_class: document.querySelectorAll('.b_star_icon, .hp_hotel_star').length || null,
                    check_in_from: get('.policies_room .bui-key-value__definition:first-child'),
                    check_out_until: get('.policies_room .bui-key-value__definition:last-child'),
                };
            }
        """)

        # Extract JSON-LD Hotel schema
        ld_data = await page.evaluate("""
            () => {
                for (const s of document.querySelectorAll('script[type="application/ld+json"]')) {
                    try {
                        const d = JSON.parse(s.textContent);
                        if (d['@type'] === 'Hotel' || d['@type'] === 'LodgingBusiness') return d;
                    } catch(e) {}
                }
                return null;
            }
        """)

        if ld_data:
            data["schema_rating"] = ld_data.get("aggregateRating", {})
            data["schema_amenities"] = [
                a.get("name") for a in ld_data.get("amenityFeature", [])
                if isinstance(a, dict)
            ][:20]

        data["url"] = hotel_url
        data["checkin"] = checkin
        data["checkout"] = checkout
        data["scraped_at"] = datetime.utcnow().isoformat()

        await browser.close()

    return data

Extracting Reviews

Booking.com reviews load via paginated requests. The review list endpoint accepts offset parameters.

async def scrape_reviews(
    hotel_pagename: str,
    max_reviews: int = 100,
    proxy_server: Optional[str] = None,
) -> List[Dict]:
    """Scrape hotel reviews via Playwright."""
    from playwright.async_api import async_playwright

    reviews = []
    offset = 0
    per_page = 25

    async with async_playwright() as p:
        launch_opts = {"headless": True, "args": ["--no-sandbox"]}
        if proxy_server:
            launch_opts["proxy"] = {"server": proxy_server}

        browser = await p.chromium.launch(**launch_opts)
        context = await browser.new_context(
            user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36",
            locale="en-US",
        )
        page = await context.new_page()

        while offset < max_reviews:
            url = (
                f"https://www.booking.com/reviewlist.html"
                f"?pagename={hotel_pagename}&offset={offset}&rows={per_page}"
                f"&sort=f_recent_desc&cc1="
            )

            await page.goto(url, wait_until="networkidle", timeout=30000)
            await page.wait_for_timeout(1500)

            review_cards = await page.query_selector_all(
                "li.review_list_new_item_block, [data-testid='review-block']"
            )

            if not review_cards:
                break

            for card in review_cards:
                try:
                    score_el = await card.query_selector(".bui-review-score__badge, [data-testid='review-score']")
                    title_el = await card.query_selector(".c-review-block__title, [data-testid='review-title']")
                    pos_el = await card.query_selector(".c-review__body, [data-testid='review-positive']")
                    neg_el = await card.query_selector(".c-review__body:last-of-type, [data-testid='review-negative']")
                    date_el = await card.query_selector(".c-review-block__date, [data-testid='review-date']")
                    author_el = await card.query_selector("[data-testid='reviewer-name'], .bui-avatar-block__title")
                    country_el = await card.query_selector("[data-testid='reviewer-country'], .bui-avatar-block__subtitle")
                    room_el = await card.query_selector(".c-review-block__room-info, [data-testid='room-type']")

                    reviews.append({
                        "score": await score_el.inner_text() if score_el else "",
                        "title": await title_el.inner_text() if title_el else "",
                        "positive": await pos_el.inner_text() if pos_el else "",
                        "negative": await neg_el.inner_text() if neg_el else "",
                        "date": await date_el.inner_text() if date_el else "",
                        "author": await author_el.inner_text() if author_el else "",
                        "country": await country_el.inner_text() if country_el else "",
                        "room_type": await room_el.inner_text() if room_el else "",
                    })
                except Exception:
                    continue

            offset += per_page
            print(f"  Reviews: {len(reviews)}/{max_reviews}")
            await asyncio.sleep(random.uniform(2.0, 4.0))

        await browser.close()

    return reviews
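
Once reviews are collected, a small aggregation step turns them into summary metrics. A sketch that assumes the dict shape produced by scrape_reviews above (the metric names are ours):

```python
from typing import Dict, List


def summarize_reviews(reviews: List[Dict]) -> Dict:
    """Aggregate scraped review dicts into summary statistics."""
    scores = []
    for r in reviews:
        try:
            # Booking.com locales may use comma decimals, e.g. '8,7'
            scores.append(float(str(r.get("score", "")).replace(",", ".").strip()))
        except ValueError:
            continue
    n_negative = sum(1 for r in reviews if r.get("negative", "").strip())
    return {
        "count": len(reviews),
        "avg_score": round(sum(scores) / len(scores), 2) if scores else None,
        "pct_with_complaint": round(100 * n_negative / len(reviews), 1) if reviews else 0.0,
    }
```

The complaint percentage (reviews with non-empty negative text) is often a more sensitive quality signal than the average score alone.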

Price Monitoring Pipeline

Hotel prices fluctuate constantly — sometimes multiple times per day. Systematic tracking reveals pricing patterns and arbitrage opportunities.

def init_price_db(path: str = "booking_prices.db") -> sqlite3.Connection:
    """Initialize price monitoring database."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")

    conn.executescript("""
        CREATE TABLE IF NOT EXISTS hotels (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            booking_url TEXT UNIQUE,
            pagename TEXT,
            name TEXT,
            city TEXT,
            star_rating INTEGER,
            avg_review_score REAL,
            review_count INTEGER,
            latitude REAL,
            longitude REAL,
            added_at TEXT DEFAULT CURRENT_TIMESTAMP
        );

        CREATE TABLE IF NOT EXISTS price_checks (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            hotel_id INTEGER,
            checkin TEXT,
            checkout TEXT,
            price_usd REAL,
            currency TEXT,
            room_type TEXT,
            free_cancellation INTEGER DEFAULT 0,
            checked_at TEXT DEFAULT CURRENT_TIMESTAMP,
            FOREIGN KEY (hotel_id) REFERENCES hotels(id)
        );

        CREATE TABLE IF NOT EXISTS price_alerts (
            hotel_id INTEGER,
            checkin TEXT,
            target_price REAL,
            current_price REAL,
            alert_triggered INTEGER DEFAULT 0,
            created_at TEXT DEFAULT CURRENT_TIMESTAMP,
            PRIMARY KEY (hotel_id, checkin)
        );

        CREATE INDEX IF NOT EXISTS idx_checks_hotel ON price_checks(hotel_id, checkin, checked_at DESC);
        CREATE INDEX IF NOT EXISTS idx_hotels_city ON hotels(city);
    """)

    conn.commit()
    return conn


def log_price_check(
    conn: sqlite3.Connection,
    hotel_id: int,
    checkin: str,
    checkout: str,
    price: Optional[float],
    currency: str = "USD",
    room_type: str = "",
    free_cancel: bool = False,
):
    """Log a price check result."""
    conn.execute(
        """INSERT INTO price_checks
           (hotel_id, checkin, checkout, price_usd, currency, room_type, free_cancellation, checked_at)
           VALUES (?,?,?,?,?,?,?,?)""",
        (hotel_id, checkin, checkout, price, currency, room_type, int(free_cancel), datetime.utcnow().isoformat())
    )
    conn.commit()

    # Check if any alerts should fire
    alerts = conn.execute(
        "SELECT rowid, target_price FROM price_alerts WHERE hotel_id=? AND checkin=? AND alert_triggered=0",
        (hotel_id, checkin)
    ).fetchall()
    for alert_id, target in alerts:
        if price and price <= target:
            print(f"[ALERT] Hotel {hotel_id}: price ${price} hit target ${target} for {checkin}")
            conn.execute(
                "UPDATE price_alerts SET alert_triggered=1, current_price=? WHERE rowid=?",
                (price, alert_id)
            )
            conn.commit()


def get_price_statistics(
    conn: sqlite3.Connection,
    hotel_id: int,
    checkin: str,
) -> Dict:
    """Compute price statistics for a hotel/date combination."""
    rows = conn.execute(
        """SELECT price_usd, checked_at FROM price_checks
           WHERE hotel_id=? AND checkin=? AND price_usd IS NOT NULL
           ORDER BY checked_at ASC""",
        (hotel_id, checkin)
    ).fetchall()

    if not rows:
        return {}

    prices = [r[0] for r in rows]
    return {
        "hotel_id": hotel_id,
        "checkin": checkin,
        "current_price": prices[-1],
        "min_price": min(prices),
        "max_price": max(prices),
        "avg_price": round(sum(prices) / len(prices), 2),
        "price_checks": len(prices),
        "first_checked": rows[0][1],
        "last_checked": rows[-1][1],
        "price_trend": "down" if len(prices) > 1 and prices[-1] < prices[0] else "up" if len(prices) > 1 and prices[-1] > prices[0] else "flat",
    }
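
The trend logic is easy to verify in isolation. Here is a standalone, in-memory demo that mirrors a cut-down version of the price_checks schema above (the sample prices and timestamps are made up):

```python
import sqlite3

# In-memory table mirroring the columns the trend query needs
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE price_checks (hotel_id INTEGER, checkin TEXT, price_usd REAL, checked_at TEXT)"
)
samples = [(210.0, "d1"), (195.0, "d2"), (188.0, "d3")]  # hypothetical observations
conn.executemany(
    "INSERT INTO price_checks VALUES (1, '2026-07-14', ?, ?)", samples
)

# Same ordering and comparison as get_price_statistics
prices = [row[0] for row in conn.execute(
    "SELECT price_usd FROM price_checks "
    "WHERE hotel_id=1 AND checkin='2026-07-14' ORDER BY checked_at ASC"
)]
trend = "down" if prices[-1] < prices[0] else "up" if prices[-1] > prices[0] else "flat"
```

Note the trend only compares first and last observations; a monotonicity or moving-average check would catch zig-zag pricing that this misses.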

ThorData Proxy Integration

Datadome's IP reputation database flags virtually all datacenter IP ranges on sight. Residential proxies are non-negotiable for any serious Booking.com scraping.

ThorData provides rotating residential proxies with city-level geo-targeting. This matters for Booking.com specifically because prices are localized — a UK user searching for a Paris hotel sees different rates than a US user.

class ThorDataProxyPool:
    """ThorData residential proxy pool with geo-targeting."""

    def __init__(self, username: str, password: str):
        self.username = username
        self.password = password
        self.host = "gate.thordata.com"
        self.port = 9000

    def get_proxy(self, country: str = "US", session_id: Optional[str] = None) -> str:
        user = f"{self.username}-country-{country.upper()}"
        if session_id:
            user = f"{user}-session-{session_id}"
        return f"http://{user}:{self.password}@{self.host}:{self.port}"

    def get_rotating(self, country: str = "US") -> str:
        """Fresh IP per request — best for search pages."""
        return self.get_proxy(country)

    def get_sticky(self, session_id: str, country: str = "US") -> str:
        """Same IP for a browsing session — best for property pages."""
        return self.get_proxy(country, session_id=session_id)

    def match_country_to_destination(self, destination_city: str) -> str:
        """Return appropriate country code for a destination."""
        europe = ["paris", "rome", "barcelona", "amsterdam", "berlin", "madrid", "prague", "lisbon"]
        uk = ["london", "edinburgh", "manchester", "birmingham"]
        us = ["new york", "los angeles", "miami", "chicago", "las vegas"]

        dest_lower = destination_city.lower()
        if any(c in dest_lower for c in europe):
            return random.choice(["DE", "FR", "NL", "ES"])
        if any(c in dest_lower for c in uk):
            return "GB"
        if any(c in dest_lower for c in us):
            return "US"
        return "US"  # Default


async def run_booking_scraper_with_thordata(
    destinations: List[Dict],
    thordata_user: str,
    thordata_pass: str,
    db_path: str = "booking_prices.db",
) -> Dict:
    """Full scraping run with ThorData proxy rotation."""
    pool = ThorDataProxyPool(thordata_user, thordata_pass)
    conn = init_price_db(db_path)
    stats = {"destinations": 0, "hotels": 0, "prices": 0, "errors": 0}

    for dest in destinations:
        city = dest["city"]
        checkin = dest["checkin"]
        checkout = dest["checkout"]

        country = pool.match_country_to_destination(city)
        proxy = pool.get_rotating(country)

        print(f"\n[{city}] {checkin} → {checkout} (proxy: {country})")

        result = await scrape_booking_search(
            city, checkin, checkout,
            proxy_server=proxy,
        )

        hotels = result.get("intercepted_hotels") or result.get("dom_hotels", [])
        print(f"  Found {len(hotels)} hotels")

        for hotel in hotels:
            name = hotel.get("hotel_name") or hotel.get("name", "Unknown")
            price = hotel.get("min_total_price") or hotel.get("price_numeric")

            # Upsert hotel record
            existing = conn.execute(
                "SELECT id FROM hotels WHERE name=? AND city=?", (name, city)
            ).fetchone()

            if existing:
                hotel_id = existing[0]
            else:
                cursor = conn.execute(
                    "INSERT INTO hotels (name, city, star_rating, avg_review_score, review_count) VALUES (?,?,?,?,?)",
                    (
                        name[:200], city,
                        hotel.get("class") or hotel.get("star_rating"),
                        hotel.get("review_score"),
                        hotel.get("review_nr") or hotel.get("review_count"),
                    )
                )
                hotel_id = cursor.lastrowid
                conn.commit()

            if price:
                log_price_check(conn, hotel_id, checkin, checkout, price)
                stats["prices"] += 1

            stats["hotels"] += 1

        stats["destinations"] += 1
        await asyncio.sleep(random.uniform(10.0, 20.0))

    conn.close()
    return stats

Complete Usage Example

import asyncio


async def main():
    DESTINATIONS = [
        {"city": "Paris",     "checkin": "2026-07-14", "checkout": "2026-07-17"},
        {"city": "Barcelona", "checkin": "2026-08-01", "checkout": "2026-08-04"},
        {"city": "Amsterdam", "checkin": "2026-06-20", "checkout": "2026-06-23"},
    ]

    # With ThorData (recommended for production)
    # stats = await run_booking_scraper_with_thordata(
    #     DESTINATIONS,
    #     thordata_user="YOUR_USER",
    #     thordata_pass="YOUR_PASS",
    # )

    # Without proxy (for testing — will get blocked quickly)
    for dest in DESTINATIONS:
        result = await scrape_booking_search(
            dest["city"], dest["checkin"], dest["checkout"]
        )
        hotels = result.get("intercepted_hotels") or result.get("dom_hotels", [])
        print(f"{dest['city']}: {len(hotels)} hotels")
        for h in hotels[:3]:
            name = h.get("hotel_name") or h.get("name", "Unknown")
            price = h.get("min_total_price") or h.get("price_numeric")
            score = h.get("review_score")
            print(f"  {name} — ${price} (score: {score})")

        await asyncio.sleep(15)


if __name__ == "__main__":
    asyncio.run(main())

Practical Tips

Currency control: Add selected_currency=USD to all search URLs to normalize pricing across scrapes for consistent comparison.
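
Setting the currency by string concatenation breaks when a URL already carries a selected_currency parameter. A small helper that sets or overwrites it cleanly (the function name is ours):

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def force_currency(url: str, currency: str = "USD") -> str:
    """Set (or overwrite) selected_currency on any Booking.com URL."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    query["selected_currency"] = currency  # overwrite if already present
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Run every URL through this before scraping and all stored prices stay in one currency.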

Date flexibility matters: Booking.com shows different availability for different date ranges. Standardize your comparison windows — run all monitoring with the same lead time (e.g., always check prices for "45 days from now").
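
Computing the fixed-lead-time window can be centralized in one helper so every monitoring run uses identical dates (the function name is ours):

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def monitoring_window(
    lead_days: int = 45,
    nights: int = 3,
    today: Optional[date] = None,
) -> Tuple[str, str]:
    """Return (checkin, checkout) ISO dates at a fixed lead time from today."""
    base = today or date.today()
    checkin = base + timedelta(days=lead_days)
    checkout = checkin + timedelta(days=nights)
    return checkin.isoformat(), checkout.isoformat()
```

Because lead time itself affects hotel pricing, holding it constant is what makes day-over-day price comparisons meaningful.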

Mobile vs desktop pricing: Booking.com sometimes offers different prices on mobile. Compare by varying viewport and UA string. Use iPhone 14 Pro user agent with 390x844 viewport to simulate mobile.
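
For the mobile comparison, the relevant Playwright context settings can be kept in one place. The UA string and metrics below are common iPhone values, not Booking-specific requirements:

```python
# Context kwargs for a mobile browsing profile; pass to browser.new_context(**MOBILE_CONTEXT)
MOBILE_CONTEXT = {
    "user_agent": (
        "Mozilla/5.0 (iPhone; CPU iPhone OS 17_5 like Mac OS X) "
        "AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.5 "
        "Mobile/15E148 Safari/604.1"
    ),
    "viewport": {"width": 390, "height": 844},
    "is_mobile": True,          # affects meta-viewport handling
    "has_touch": True,          # enables touch events
    "device_scale_factor": 3,   # retina-class pixel density
}
```

Scrape the same hotel/date pair once with this context and once with the desktop context, then diff the logged prices.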

Proxy rotation strategy: rotate to a fresh IP on every search-results request. For property pages, where you simulate browsing through room options, use sticky sessions (same IP for 5-10 minutes) so the traffic looks like one real visitor.

Affiliate API alternative: If you're building a legitimate travel comparison product, the Booking.com Affiliate Partner API provides official access to property data, pricing, and availability. The application process takes 2-4 weeks, but you get clean JSON responses, no Datadome battles, and much higher rate limits. Worth it for commercial applications where reliability matters more than flexibility.

ThorData's residential proxy network with geo-targeting makes sustained Booking.com scraping reliable — their residential pool has the clean Datadome reputation scores that datacenter IPs categorically lack.