How to Scrape OpenTable Reservations in 2026 (Availability, Reviews, Waitlists)
OpenTable's API is locked behind a partner program that requires a restaurant POS integration. For anyone building reservation analytics, availability trackers, or restaurant research tools, that's a non-starter.
But OpenTable's website leaks plenty of structured data if you know where to look. Their frontend hits internal REST endpoints that return JSON — availability slots, review scores, waitlist status, and more. This guide covers every angle: the internal API, anti-bot evasion, pagination, data storage, and how to run this at scale without burning your IPs.
What Data You Can Extract
From OpenTable's public pages and internal API:
- Restaurant name, cuisine, price range, neighborhood
- Available reservation slots by date and party size
- Booking windows (how far out you can reserve)
- Review scores (overall, food, service, ambiance)
- Individual review text and dates
- Waitlist availability and estimated wait times
- Dining points and promotion eligibility
- Photos, hours, and dress code
- Popular tags and top review keywords
- Special experiences (chef's table, tasting menus, etc.)
What's off-limits: actual bookings, customer data, or anything requiring authentication.
Discovering the Internal API
Open any restaurant page on OpenTable and watch your browser's Network tab. Filter by XHR/Fetch. The key endpoints:
GET /dapi/fe/gql — GraphQL endpoint for restaurant details
GET /restref/api/availability — reservation slots
GET /restref/api/reviews — paginated reviews
GET /restref/api/waitlist/status — waitlist info
The availability endpoint is the most valuable. It takes a restaurant ID, date, party size, and time — and returns open slots in JSON.
To find a restaurant ID: visit a restaurant page, right-click, View Page Source, and search for "rid" or "restaurantId". It's a 5-7 digit integer. The slug in the URL (e.g., le-bernardin-new-york) maps to a numeric ID.
Anti-Bot Measures
OpenTable's defenses are moderate but effective:
- Akamai Bot Manager: Handles fingerprinting and behavioral analysis. It generates sensor data that gets sent back to verify you're a real browser. It checks TLS fingerprints, JavaScript environment properties, and mouse/keyboard event patterns.
- Rate limiting: Aggressive per-IP limits, especially on the availability endpoint. More than ~20 requests/minute from one IP triggers soft blocks.
- Cookie requirements: Requests without valid session cookies get redirected to the homepage.
- JavaScript rendering: Some content loads dynamically via client-side JS (though the API endpoints themselves return JSON directly).
- CAPTCHA challenges: Appear after sustained scraping from the same IP.
The Akamai layer is the main obstacle. Datacenter IPs (AWS, GCP, DigitalOcean) get caught almost immediately. You need residential proxies that rotate per request — Akamai's detection is weaker against residential traffic because it can't distinguish scrapers from real diners checking reservations.
ThorData's residential proxies work well here since their pool covers enough geographic diversity to match OpenTable's US-heavy user base. They support sticky sessions (maintaining the same IP across a session for cookie consistency) and rotating sessions (new IP per request for anonymity).
Setting Up Your Environment
pip install httpx beautifulsoup4
(sqlite3 ships with Python's standard library, so it needs no install.)
Required headers for all requests:
HEADERS = {
"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
"AppleWebKit/537.36 (KHTML, like Gecko) "
"Chrome/126.0.0.0 Safari/537.36",
"Accept": "application/json, text/plain, */*",
"Accept-Language": "en-US,en;q=0.9",
"Accept-Encoding": "gzip, deflate, br",
"Referer": "https://www.opentable.com/",
"Origin": "https://www.opentable.com",
"Sec-Fetch-Dest": "empty",
"Sec-Fetch-Mode": "cors",
"Sec-Fetch-Site": "same-origin",
"sec-ch-ua": '"Not/A)Brand";v="8", "Chromium";v="126", "Google Chrome";v="126"',
"sec-ch-ua-mobile": "?0",
"sec-ch-ua-platform": '"macOS"',
}
Skipping the Sec-Fetch-* and sec-ch-ua headers is one of the most common mistakes. They're part of the browser fingerprint that Akamai validates.
Working Python Code: Availability Scraper
import httpx
from datetime import date, timedelta
import time
import random
PROXY_URL = "http://USER:[email protected]:9000"
BASE_URL = "https://www.opentable.com"
def create_client(proxy_url: str | None = None) -> httpx.Client:
"""Create an httpx client with browser-realistic settings."""
kwargs = {
"headers": HEADERS,
"timeout": 20,
"follow_redirects": True,
}
if proxy_url:
kwargs["proxy"] = proxy_url
client = httpx.Client(**kwargs)
# Warm up the session: visit the homepage to get cookies
client.get(f"{BASE_URL}/")
return client
def get_availability(
client: httpx.Client,
restaurant_id: int,
check_date: str,
party_size: int = 2,
time_slot: str = "19:00"
) -> list[dict]:
"""
Fetch available reservation slots.
check_date format: YYYY-MM-DD
time_slot format: HH:MM (24h)
"""
url = f"{BASE_URL}/restref/api/availability"
params = {
"rid": restaurant_id,
"dt": f"{check_date}T{time_slot}",
"ps": party_size,
"include": "suggestions",
"format": "datetime",
}
resp = client.get(url, params=params)
resp.raise_for_status()
data = resp.json()
slots = []
for slot in data.get("availability", {}).get("slots", []):
slots.append({
"time": slot["dateTime"],
"type": slot.get("type", "standard"),
"token": slot.get("token"),
"points": slot.get("loyaltyPoints", 0),
"experience": slot.get("experienceTitle"),
})
return slots
def get_reviews(
client: httpx.Client,
restaurant_id: int,
page: int = 1,
page_size: int = 25,
sort_by: str = "newest"
) -> dict:
"""Fetch paginated reviews for a restaurant."""
url = f"{BASE_URL}/restref/api/reviews"
params = {
"rid": restaurant_id,
"page": page,
"pageSize": page_size,
"sortBy": sort_by, # newest, highest, lowest, most_helpful
}
resp = client.get(url, params=params)
resp.raise_for_status()
data = resp.json()
reviews = []
for r in data.get("reviews", []):
reviews.append({
"rating_overall": r.get("overallRating"),
"rating_food": r.get("foodRating"),
"rating_service": r.get("serviceRating"),
"rating_ambiance": r.get("ambianceRating"),
"rating_value": r.get("valueRating"),
"text": r.get("text", ""),
"date": r.get("reviewDate"),
"dining_date": r.get("diningDate"),
"diner_name": r.get("displayName", "Anonymous"),
"party_size": r.get("partySize"),
"occasion": r.get("occasion"),
"helpful_count": r.get("helpfulCount", 0),
})
return {
"total": data.get("totalResults", 0),
"page": page,
"reviews": reviews,
}
def get_restaurant_details(client: httpx.Client, slug: str) -> dict:
"""Fetch restaurant metadata from the GraphQL endpoint."""
url = f"{BASE_URL}/dapi/fe/gql"
payload = {
"operationName": "RestaurantProfile",
"variables": {"slug": slug},
"query": """
query RestaurantProfile($slug: String!) {
restaurant(slug: $slug) {
id
name
cuisine
priceRange
neighborhood
city
state
overallRating
reviewCount
topReviewKeywords
dressCode
hours {
dayOfWeek
open
close
}
photos {
url
caption
}
}
}
"""
}
headers_gql = {**HEADERS, "Content-Type": "application/json"}
resp = client.post(url, json=payload, headers=headers_gql)
resp.raise_for_status()
return resp.json()["data"]["restaurant"]
Scanning Availability Across Dates
A common use case: finding the next available reservation at a popular restaurant. This function scans the next 30 days across multiple party sizes and time slots:
def find_next_available(
client: httpx.Client,
restaurant_id: int,
party_size: int = 2,
days_ahead: int = 30,
time_slots: list[str] | None = None
) -> list[dict]:
"""Scan multiple dates and return all available slots."""
if time_slots is None:
time_slots = ["12:00", "18:00", "19:00", "20:00", "21:00"]
results = []
today = date.today()
for offset in range(1, days_ahead + 1):
check = (today + timedelta(days=offset)).isoformat()
day_slots = []
for slot_time in time_slots:
try:
slots = get_availability(client, restaurant_id, check, party_size, slot_time)
day_slots.extend(slots)
except httpx.HTTPStatusError as e:
if e.response.status_code == 429:
print(f"Rate limited on {check}/{slot_time}, backing off...")
time.sleep(random.uniform(30, 60))
# skip other errors
time.sleep(random.uniform(1.5, 3.0))
if day_slots:
# Deduplicate by time
seen = set()
unique_slots = []
for s in day_slots:
if s["time"] not in seen:
seen.add(s["time"])
unique_slots.append(s)
results.append({"date": check, "slots": unique_slots})
print(f"{check}: {len(unique_slots)} available slots")
else:
print(f"{check}: no availability")
return results
Scraping All Reviews with Pagination
For building a reviews dataset, you'll want all pages:
def scrape_all_reviews(
client: httpx.Client,
restaurant_id: int,
max_reviews: int = 500
) -> list[dict]:
"""Fetch all reviews across pages up to max_reviews."""
all_reviews = []
page = 1
page_size = 25
while len(all_reviews) < max_reviews:
result = get_reviews(client, restaurant_id, page=page, page_size=page_size)
reviews = result["reviews"]
if not reviews:
break
all_reviews.extend(reviews)
total = result["total"]
print(f"Page {page}: {len(reviews)} reviews (total: {total})")
if page * page_size >= total or page * page_size >= max_reviews:
break
page += 1
time.sleep(random.uniform(2, 4))
return all_reviews[:max_reviews]
Bulk Discovery: Finding Restaurant IDs
For building a city-wide dataset, you need to discover restaurant IDs at scale. OpenTable's search endpoint returns paginated results:
def search_restaurants(
client: httpx.Client,
metro_id: int,
cuisine: str | None = None,
price_range: int | None = None,
page: int = 1
) -> dict:
"""
Search for restaurants in a metro area.
Metro IDs: 4 = New York, 13 = Los Angeles, 3 = Chicago, etc.
"""
url = "https://www.opentable.com/s/"
params = {
"metroId": metro_id,
"page": page,
"pageSize": 20,
}
if cuisine:
params["cuisine"] = cuisine
if price_range:
params["price"] = price_range
resp = client.get(url, params=params)
resp.raise_for_status()
data = resp.json()
restaurants = []
for r in data.get("restaurants", []):
restaurants.append({
"id": r.get("id"),
"name": r.get("name"),
"slug": r.get("urlText"),
"cuisine": r.get("cuisine"),
"price_range": r.get("priceBand"),
"neighborhood": r.get("neighborhood"),
"rating": r.get("statistics", {}).get("reviews", {}).get("ratings", {}).get("overall", {}).get("rating"),
"review_count": r.get("statistics", {}).get("reviews", {}).get("ratings", {}).get("overall", {}).get("reviewCount"),
})
return {
"total": data.get("total", 0),
"page": page,
"restaurants": restaurants,
}
def discover_all_restaurants(
client: httpx.Client,
metro_id: int,
max_restaurants: int = 500
) -> list[dict]:
"""Paginate through search results to get all restaurants in a metro."""
all_restaurants = []
page = 1
while len(all_restaurants) < max_restaurants:
result = search_restaurants(client, metro_id, page=page)
batch = result["restaurants"]
if not batch:
break
all_restaurants.extend(batch)
total = result["total"]
print(f"Page {page}: found {len(batch)} restaurants (total: {total})")
if page * 20 >= total:
break
page += 1
time.sleep(random.uniform(2, 4))
return all_restaurants[:max_restaurants]
Storing Results in SQLite
Track availability over time to detect patterns (e.g., when a popular restaurant releases new slots):
import sqlite3
import json
from datetime import datetime, timezone
def init_db(path: str = "opentable.db") -> sqlite3.Connection:
conn = sqlite3.connect(path)
conn.executescript("""
CREATE TABLE IF NOT EXISTS restaurants (
id INTEGER PRIMARY KEY,
name TEXT,
slug TEXT UNIQUE,
cuisine TEXT,
price_range INTEGER,
neighborhood TEXT,
city TEXT,
state TEXT,
overall_rating REAL,
review_count INTEGER,
raw_json TEXT,
fetched_at TEXT
);
CREATE TABLE IF NOT EXISTS availability_snapshots (
id INTEGER PRIMARY KEY AUTOINCREMENT,
restaurant_id INTEGER,
check_date TEXT,
party_size INTEGER,
slots_json TEXT,
recorded_at TEXT,
FOREIGN KEY(restaurant_id) REFERENCES restaurants(id)
);
CREATE TABLE IF NOT EXISTS reviews (
id INTEGER PRIMARY KEY AUTOINCREMENT,
restaurant_id INTEGER,
rating_overall INTEGER,
rating_food INTEGER,
rating_service INTEGER,
rating_ambiance INTEGER,
text TEXT,
review_date TEXT,
dining_date TEXT,
diner_name TEXT,
FOREIGN KEY(restaurant_id) REFERENCES restaurants(id)
);
CREATE INDEX IF NOT EXISTS idx_avail_restaurant_date
ON availability_snapshots(restaurant_id, check_date);
""")
conn.commit()
return conn
def save_availability_snapshot(
conn: sqlite3.Connection,
restaurant_id: int,
check_date: str,
party_size: int,
slots: list[dict]
) -> None:
conn.execute(
"INSERT INTO availability_snapshots (restaurant_id, check_date, party_size, slots_json, recorded_at) "
"VALUES (?, ?, ?, ?, ?)",
(restaurant_id, check_date, party_size, json.dumps(slots),
datetime.now(timezone.utc).isoformat())
)
conn.commit()
def get_availability_history(
conn: sqlite3.Connection,
restaurant_id: int,
check_date: str
) -> list[dict]:
"""Get all snapshots for a specific restaurant/date combo."""
rows = conn.execute(
"SELECT recorded_at, slots_json FROM availability_snapshots "
"WHERE restaurant_id = ? AND check_date = ? ORDER BY recorded_at",
(restaurant_id, check_date)
).fetchall()
return [{"recorded_at": r[0], "slots": json.loads(r[1])} for r in rows]
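With history in hand, detecting slot releases is just a diff between consecutive snapshots. A minimal sketch that consumes the output of get_availability_history() above:

```python
def newly_released_slots(history: list[dict]) -> list[dict]:
    """Compare consecutive availability snapshots and report slots that
    appeared between polls, i.e. times the restaurant released.

    history: chronological list of {"recorded_at": ..., "slots": [...]}
    dicts, as returned by get_availability_history().
    """
    releases = []
    for prev, curr in zip(history, history[1:]):
        prev_times = {s["time"] for s in prev["slots"]}
        for slot in curr["slots"]:
            if slot["time"] not in prev_times:
                releases.append({
                    "detected_at": curr["recorded_at"],
                    "slot_time": slot["time"],
                })
    return releases
```

Run this per restaurant/date pair; clusters of detections at the same hour of day suggest when the restaurant's release window falls.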
Scaling Without Getting Blocked
OpenTable's rate limits are stricter than most restaurant platforms. Key rules for production-scale scraping:
Request pacing:
- Minimum 2-3 seconds between availability requests
- Minimum 1-2 seconds between review page requests
- Use random.uniform() for jitter — consistent intervals look robotic
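One way to encode those pacing rules is a small per-endpoint pacer. The delay values below just restate the minimums above; the jitter range is a judgment call, not an OpenTable-specified number.

```python
import random
import time

# Minimum per-endpoint delays from the pacing rules above (seconds).
MIN_DELAY = {"availability": 2.0, "reviews": 1.0}

class Pacer:
    """Enforce a jittered minimum interval between requests per endpoint."""

    def __init__(self) -> None:
        self._last: dict[str, float] = {}

    def wait(self, endpoint: str) -> None:
        base = MIN_DELAY.get(endpoint, 1.0)
        delay = base + random.uniform(0.5, 1.5)  # jitter so the cadence never repeats
        elapsed = time.monotonic() - self._last.get(endpoint, 0.0)
        if elapsed < delay:
            time.sleep(delay - elapsed)
        self._last[endpoint] = time.monotonic()
```

Call pacer.wait("availability") before each availability request; because the pacer tracks elapsed time, it only sleeps for whatever portion of the interval your own processing hasn't already consumed.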
IP management:
- Never use datacenter IPs (AWS, GCP, Azure, etc.) — Akamai pre-blocks these
- Use residential proxies that rotate per request for review scraping
- Use sticky sessions (same IP for a session) for availability checking
- ThorData's residential proxies support both modes via username suffix: user-rotate vs user-session-abc123
Session management:
- Create a new httpx.Client() for each session
- Visit the homepage first to get cookies before hitting API endpoints
- Don't reuse sessions across proxy rotations
Caching:
- Restaurant metadata changes rarely — cache for 24 hours
- Reviews update slowly — pull at most once per day
- Availability is the only data worth polling frequently (every 15-60 minutes)
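The metadata cache falls out of the SQLite schema for free: the fetched_at column already records when each restaurant row was written, so a staleness check is a single query.

```python
import sqlite3
from datetime import datetime, timezone, timedelta

def needs_refresh(conn: sqlite3.Connection, restaurant_id: int,
                  max_age_hours: float = 24.0) -> bool:
    """Return True if a restaurant's cached metadata is missing or
    older than max_age_hours, based on the fetched_at column."""
    row = conn.execute(
        "SELECT fetched_at FROM restaurants WHERE id = ?", (restaurant_id,)
    ).fetchone()
    if row is None or row[0] is None:
        return True
    age = datetime.now(timezone.utc) - datetime.fromisoformat(row[0])
    return age > timedelta(hours=max_age_hours)
```

Guard the metadata fetch with it (if needs_refresh(conn, rid): ...) and you avoid re-scraping restaurants you already saw today.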
Error handling with exponential backoff:
import time
import random
def resilient_request(
client: httpx.Client,
url: str,
params: dict | None = None,
max_retries: int = 5
) -> httpx.Response | None:
for attempt in range(max_retries):
try:
resp = client.get(url, params=params)
if resp.status_code == 429:
wait = (2 ** attempt) + random.uniform(0, 2)
print(f"Rate limited. Waiting {wait:.1f}s (attempt {attempt + 1})")
time.sleep(wait)
continue
if resp.status_code == 403:
print("Blocked — rotate IP and session")
return None
resp.raise_for_status()
return resp
except httpx.ConnectError:
print(f"Connection error on attempt {attempt + 1}")
time.sleep(2 ** attempt)
except httpx.TimeoutException:
print(f"Timeout on attempt {attempt + 1}")
time.sleep(2)
return None
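When resilient_request() gives up (a hard block), the remedy is a fresh session on a new IP. A dependency-injected sketch of that loop, written so it can be wired to the create_client() and resilient_request() helpers above (e.g. make_client=lambda: create_client(PROXY_URL)); the names passed in are your choice, not a fixed API:

```python
def fetch_with_rotation(make_client, fetch, url: str, max_sessions: int = 3):
    """Retry a request across fresh proxy sessions after a hard block.

    make_client: zero-argument factory returning a client
    fetch: callable (client, url) -> response or None

    Each new client comes up with fresh cookies and, behind a rotating
    proxy, a fresh residential IP.
    """
    for _ in range(max_sessions):
        client = make_client()
        try:
            resp = fetch(client, url)
            if resp is not None:
                return resp
            # None means blocked or retries exhausted: fall through and
            # start over with a brand-new session.
        finally:
            client.close()
    return None
```

Keeping the client factory and fetch function as parameters also makes the retry loop trivially testable with fakes.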
Complete Working Example
Here's a full script that discovers restaurants in a city, tracks their availability, and stores everything in SQLite:
import httpx
import sqlite3
import time
import random
import json
from datetime import date, timedelta, datetime, timezone
PROXY_URL = "http://USER:[email protected]:9000"
def main():
conn = init_db("opentable.db")
with create_client(PROXY_URL) as client:
# Step 1: Discover restaurants in New York (metro 4)
print("Discovering restaurants...")
restaurants = discover_all_restaurants(client, metro_id=4, max_restaurants=100)
print(f"Found {len(restaurants)} restaurants\n")
# Step 2: For each restaurant, check next 7 days of availability
for rest in restaurants:
rid = rest["id"]
name = rest["name"]
print(f"\nChecking: {name} (ID: {rid})")
# Save restaurant metadata
conn.execute("""
INSERT OR REPLACE INTO restaurants
(id, name, slug, cuisine, price_range, neighborhood, raw_json, fetched_at)
VALUES (?, ?, ?, ?, ?, ?, ?, ?)
""", (
rid, name, rest.get("slug"), rest.get("cuisine"),
rest.get("price_range"), rest.get("neighborhood"),
json.dumps(rest), datetime.now(timezone.utc).isoformat()
))
conn.commit()
# Check availability for next 7 days
today = date.today()
for offset in range(1, 8):
check_date = (today + timedelta(days=offset)).isoformat()
try:
slots = get_availability(client, rid, check_date, party_size=2)
save_availability_snapshot(conn, rid, check_date, 2, slots)
print(f" {check_date}: {len(slots)} slots")
except Exception as e:
print(f" {check_date}: error — {e}")
time.sleep(random.uniform(2, 4))
if __name__ == "__main__":
main()
Waitlist Data
The waitlist endpoint gives current wait times for walk-in diners:
def get_waitlist_status(client: httpx.Client, restaurant_id: int) -> dict:
url = f"{BASE_URL}/restref/api/waitlist/status"
params = {"rid": restaurant_id}
resp = client.get(url, params=params)
if resp.status_code == 404:
return {"available": False} # restaurant doesn't use waitlist
resp.raise_for_status()
data = resp.json()
return {
"available": data.get("waitlistEnabled", False),
"current_wait_minutes": data.get("currentWaitTime"),
"party_size_options": data.get("partySizeOptions", []),
"quote_updated_at": data.get("quoteUpdatedAt"),
}
Not all restaurants use OpenTable's waitlist feature. For those that don't, expect either a 404 (handled above) or a JSON response with waitlistEnabled set to false.
Common Gotchas
Restaurant IDs vs slugs: The availability and reviews endpoints require the numeric ID. The GraphQL endpoint accepts the slug. Don't confuse them.
Date/time format: The availability endpoint expects YYYY-MM-DDTHH:MM (ISO 8601 without timezone). Sending a UTC timestamp with Z or +00:00 breaks the request.
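Concretely, build the dt parameter from a plain date plus a local time string, with no timezone suffix:

```python
from datetime import date

# Local date and time, ISO 8601, no timezone suffix.
dt_param = f"{date(2026, 9, 19).isoformat()}T19:00"
print(dt_param)  # 2026-09-19T19:00
```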
Party size limits: Most restaurants configure minimum and maximum party sizes. Requesting party=1 at a restaurant that requires minimum 2 returns an empty slots array, not an error.
Empty responses vs blocked requests: When Akamai blocks you, the response is often a valid 200 with an HTML challenge page — not a 403. Always check that resp.json() actually parses without error, and that the response contains your expected keys.
Review pagination limits: OpenTable caps reviews at a total offset around 2000-2500 regardless of review count. If a restaurant has 5,000 reviews, you can only get the most recent ~2,500.
Timezone handling: Availability slots return in the restaurant's local timezone. A restaurant in New York returns 2026-09-19T19:00:00-04:00. Parse with datetime.fromisoformat(), which handles offset strings like this from Python 3.7 onward (only the Z suffix requires 3.11+), or use python-dateutil for older versions.
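For example, parsing that New York timestamp keeps the local hour while still letting you normalize to UTC for storage:

```python
from datetime import datetime, timezone

slot = datetime.fromisoformat("2026-09-19T19:00:00-04:00")
print(slot.hour)                        # 19, the restaurant's local hour
print(slot.astimezone(timezone.utc))    # 2026-09-19 23:00:00+00:00
```

Store UTC in the database, but compare against local hours when reasoning about dinner-time demand.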
Legal Notes
OpenTable's data is publicly visible — anyone can check availability without logging in. Scraping for personal research or analytics is generally permissible under the legal framework established in hiQ v. LinkedIn (public data). However, OpenTable's Terms of Service prohibit automated access, so commercial products built on this data carry legal risk. Never store or re-publish personal data from reviews (names, dining occasions), and keep your request rates reasonable to avoid disrupting their service.