How to Scrape OpenTable Reservations in 2026 (Availability, Reviews, Waitlists)
OpenTable's API is locked behind a partner program that requires a restaurant POS integration. For anyone building reservation analytics, availability trackers, or restaurant research tools, that's a non-starter.
But OpenTable's website leaks plenty of structured data if you know where to look. Their frontend hits internal REST endpoints that return JSON — availability slots, review scores, waitlist status, and more. This guide covers every angle: the internal API, anti-bot evasion, pagination, data storage, and how to run this at scale without burning your IPs.
What Data You Can Extract
From OpenTable's public pages and internal API:
- Restaurant name, cuisine, price range, neighborhood
- Available reservation slots by date and party size
- Booking windows (how far out you can reserve)
- Review scores (overall, food, service, ambiance)
- Individual review text and dates
- Waitlist availability and estimated wait times
- Dining points and promotion eligibility
- Photos, hours, and dress code
- Popular tags and top review keywords
- Special experiences (chef's table, tasting menus, etc.)
What's off-limits: actual bookings, customer data, or anything requiring authentication.
Discovering the Internal API
Open any restaurant page on OpenTable and watch your browser's Network tab. Filter by XHR/Fetch. The key endpoints:
GET /dapi/fe/gql — GraphQL endpoint for restaurant details
GET /restref/api/availability — reservation slots
GET /restref/api/reviews — paginated reviews
GET /restref/api/waitlist/status — waitlist info
The availability endpoint is the most valuable. It takes a restaurant ID, date, party size, and time — and returns open slots in JSON.
To find a restaurant ID: visit a restaurant page, right-click, View Page Source, and search for "rid" or "restaurantId". It's a 5-7 digit integer. The slug in the URL (e.g., le-bernardin-new-york) maps to a numeric ID.
Anti-Bot Measures
OpenTable's defenses are moderate but effective:
- Akamai Bot Manager: Handles fingerprinting and behavioral analysis. It generates sensor data that gets sent back to verify you're a real browser. It checks TLS fingerprints, JavaScript environment properties, and mouse/keyboard event patterns.
- Rate limiting: Aggressive per-IP limits, especially on the availability endpoint. More than ~20 requests/minute from one IP triggers soft blocks.
- Cookie requirements: Requests without valid session cookies get redirected to the homepage.
- JavaScript rendering: Some content loads dynamically via client-side JS (though the API endpoints themselves return JSON directly).
- CAPTCHA challenges: Appear after sustained scraping from the same IP.
The Akamai layer is the main obstacle. Datacenter IPs (AWS, GCP, DigitalOcean) get caught almost immediately. You need residential proxies that rotate per request — Akamai's detection is weaker against residential traffic because it can't distinguish scrapers from real diners checking reservations.
ThorData's residential proxies work well here since their pool covers enough geographic diversity to match OpenTable's US-heavy user base. They support sticky sessions (maintaining the same IP across a session for cookie consistency) and rotating sessions (new IP per request for anonymity).
Setting Up Your Environment
pip install httpx beautifulsoup4
(sqlite3 ships with Python's standard library, so it needs no install.)
Required headers for all requests:
HEADERS = {
"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
"AppleWebKit/537.36 (KHTML, like Gecko) "
"Chrome/126.0.0.0 Safari/537.36",
"Accept": "application/json, text/plain, */*",
"Accept-Language": "en-US,en;q=0.9",
"Accept-Encoding": "gzip, deflate, br",
"Referer": "https://www.opentable.com/",
"Origin": "https://www.opentable.com",
"Sec-Fetch-Dest": "empty",
"Sec-Fetch-Mode": "cors",
"Sec-Fetch-Site": "same-origin",
"sec-ch-ua": '"Not/A)Brand";v="8", "Chromium";v="126", "Google Chrome";v="126"',
"sec-ch-ua-mobile": "?0",
"sec-ch-ua-platform": '"macOS"',
}
Skipping the Sec-Fetch-* and sec-ch-ua headers is one of the most common mistakes. They're part of the browser fingerprint that Akamai validates.
Working Python Code: Availability Scraper
import httpx
from datetime import date, timedelta
import time
import random
PROXY_URL = "http://USER:[email protected]:9000"
BASE_URL = "https://www.opentable.com"
def create_client(proxy_url: str | None = None) -> httpx.Client:
"""Create an httpx client with browser-realistic settings."""
kwargs = {
"headers": HEADERS,
"timeout": 20,
"follow_redirects": True,
}
if proxy_url:
kwargs["proxy"] = proxy_url
client = httpx.Client(**kwargs)
# Warm up the session: visit the homepage to get cookies
client.get(f"{BASE_URL}/")
return client
def get_availability(
client: httpx.Client,
restaurant_id: int,
check_date: str,
party_size: int = 2,
time_slot: str = "19:00"
) -> list[dict]:
"""
Fetch available reservation slots.
check_date format: YYYY-MM-DD
time_slot format: HH:MM (24h)
"""
url = f"{BASE_URL}/restref/api/availability"
params = {
"rid": restaurant_id,
"dt": f"{check_date}T{time_slot}",
"ps": party_size,
"include": "suggestions",
"format": "datetime",
}
resp = client.get(url, params=params)
resp.raise_for_status()
data = resp.json()
slots = []
for slot in data.get("availability", {}).get("slots", []):
slots.append({
"time": slot["dateTime"],
"type": slot.get("type", "standard"),
"token": slot.get("token"),
"points": slot.get("loyaltyPoints", 0),
"experience": slot.get("experienceTitle"),
})
return slots
def get_reviews(
client: httpx.Client,
restaurant_id: int,
page: int = 1,
page_size: int = 25,
sort_by: str = "newest"
) -> dict:
"""Fetch paginated reviews for a restaurant."""
url = f"{BASE_URL}/restref/api/reviews"
params = {
"rid": restaurant_id,
"page": page,
"pageSize": page_size,
"sortBy": sort_by, # newest, highest, lowest, most_helpful
}
resp = client.get(url, params=params)
resp.raise_for_status()
data = resp.json()
reviews = []
for r in data.get("reviews", []):
reviews.append({
"rating_overall": r.get("overallRating"),
"rating_food": r.get("foodRating"),
"rating_service": r.get("serviceRating"),
"rating_ambiance": r.get("ambianceRating"),
"rating_value": r.get("valueRating"),
"text": r.get("text", ""),
"date": r.get("reviewDate"),
"dining_date": r.get("diningDate"),
"diner_name": r.get("displayName", "Anonymous"),
"party_size": r.get("partySize"),
"occasion": r.get("occasion"),
"helpful_count": r.get("helpfulCount", 0),
})
return {
"total": data.get("totalResults", 0),
"page": page,
"reviews": reviews,
}
def get_restaurant_details(client: httpx.Client, slug: str) -> dict:
"""Fetch restaurant metadata from the GraphQL endpoint."""
url = f"{BASE_URL}/dapi/fe/gql"
payload = {
"operationName": "RestaurantProfile",
"variables": {"slug": slug},
"query": """
query RestaurantProfile($slug: String!) {
restaurant(slug: $slug) {
id
name
cuisine
priceRange
neighborhood
city
state
overallRating
reviewCount
topReviewKeywords
dressCode
hours {
dayOfWeek
open
close
}
photos {
url
caption
}
}
}
"""
}
headers_gql = {**HEADERS, "Content-Type": "application/json"}
resp = client.post(url, json=payload, headers=headers_gql)
resp.raise_for_status()
return resp.json()["data"]["restaurant"]
Scanning Availability Across Dates
A common use case: finding the next available reservation at a popular restaurant. This function scans the next 30 days across multiple party sizes and time slots:
def find_next_available(
client: httpx.Client,
restaurant_id: int,
party_size: int = 2,
days_ahead: int = 30,
time_slots: list[str] | None = None
) -> list[dict]:
"""Scan multiple dates and return all available slots."""
if time_slots is None:
time_slots = ["12:00", "18:00", "19:00", "20:00", "21:00"]
results = []
today = date.today()
for offset in range(1, days_ahead + 1):
check = (today + timedelta(days=offset)).isoformat()
day_slots = []
for slot_time in time_slots:
try:
slots = get_availability(client, restaurant_id, check, party_size, slot_time)
day_slots.extend(slots)
except httpx.HTTPStatusError as e:
if e.response.status_code == 429:
print(f"Rate limited on {check}/{slot_time}, backing off...")
time.sleep(random.uniform(30, 60))
# skip other errors
time.sleep(random.uniform(1.5, 3.0))
if day_slots:
# Deduplicate by time
seen = set()
unique_slots = []
for s in day_slots:
if s["time"] not in seen:
seen.add(s["time"])
unique_slots.append(s)
results.append({"date": check, "slots": unique_slots})
print(f"{check}: {len(unique_slots)} available slots")
else:
print(f"{check}: no availability")
return results
Scraping All Reviews with Pagination
For building a reviews dataset, you'll want all pages:
def scrape_all_reviews(
client: httpx.Client,
restaurant_id: int,
max_reviews: int = 500
) -> list[dict]:
"""Fetch all reviews across pages up to max_reviews."""
all_reviews = []
page = 1
page_size = 25
while len(all_reviews) < max_reviews:
result = get_reviews(client, restaurant_id, page=page, page_size=page_size)
reviews = result["reviews"]
if not reviews:
break
all_reviews.extend(reviews)
total = result["total"]
print(f"Page {page}: {len(reviews)} reviews (total: {total})")
if page * page_size >= total or page * page_size >= max_reviews:
break
page += 1
time.sleep(random.uniform(2, 4))
return all_reviews[:max_reviews]
Bulk Discovery: Finding Restaurant IDs
For building a city-wide dataset, you need to discover restaurant IDs at scale. OpenTable's search endpoint returns paginated results:
def search_restaurants(
client: httpx.Client,
metro_id: int,
cuisine: str | None = None,
price_range: int | None = None,
page: int = 1
) -> dict:
"""
Search for restaurants in a metro area.
Metro IDs: 4 = New York, 13 = Los Angeles, 3 = Chicago, etc.
"""
url = "https://www.opentable.com/s/"
params = {
"metroId": metro_id,
"page": page,
"pageSize": 20,
}
if cuisine:
params["cuisine"] = cuisine
if price_range:
params["price"] = price_range
resp = client.get(url, params=params)
resp.raise_for_status()
data = resp.json()
restaurants = []
for r in data.get("restaurants", []):
restaurants.append({
"id": r.get("id"),
"name": r.get("name"),
"slug": r.get("urlText"),
"cuisine": r.get("cuisine"),
"price_range": r.get("priceBand"),
"neighborhood": r.get("neighborhood"),
"rating": r.get("statistics", {}).get("reviews", {}).get("ratings", {}).get("overall", {}).get("rating"),
"review_count": r.get("statistics", {}).get("reviews", {}).get("ratings", {}).get("overall", {}).get("reviewCount"),
})
return {
"total": data.get("total", 0),
"page": page,
"restaurants": restaurants,
}
def discover_all_restaurants(
client: httpx.Client,
metro_id: int,
max_restaurants: int = 500
) -> list[dict]:
"""Paginate through search results to get all restaurants in a metro."""
all_restaurants = []
page = 1
while len(all_restaurants) < max_restaurants:
result = search_restaurants(client, metro_id, page=page)
batch = result["restaurants"]
if not batch:
break
all_restaurants.extend(batch)
total = result["total"]
print(f"Page {page}: found {len(batch)} restaurants (total: {total})")
if page * 20 >= total:
break
page += 1
time.sleep(random.uniform(2, 4))
return all_restaurants[:max_restaurants]
Storing Results in SQLite
Track availability over time to detect patterns (e.g., when a popular restaurant releases new slots):
import sqlite3
import json
from datetime import datetime, timezone
def init_db(path: str = "opentable.db") -> sqlite3.Connection:
conn = sqlite3.connect(path)
conn.executescript("""
CREATE TABLE IF NOT EXISTS restaurants (
id INTEGER PRIMARY KEY,
name TEXT,
slug TEXT UNIQUE,
cuisine TEXT,
price_range INTEGER,
neighborhood TEXT,
city TEXT,
state TEXT,
overall_rating REAL,
review_count INTEGER,
raw_json TEXT,
fetched_at TEXT
);
CREATE TABLE IF NOT EXISTS availability_snapshots (
id INTEGER PRIMARY KEY AUTOINCREMENT,
restaurant_id INTEGER,
check_date TEXT,
party_size INTEGER,
slots_json TEXT,
recorded_at TEXT,
FOREIGN KEY(restaurant_id) REFERENCES restaurants(id)
);
CREATE TABLE IF NOT EXISTS reviews (
id INTEGER PRIMARY KEY AUTOINCREMENT,
restaurant_id INTEGER,
rating_overall INTEGER,
rating_food INTEGER,
rating_service INTEGER,
rating_ambiance INTEGER,
text TEXT,
review_date TEXT,
dining_date TEXT,
diner_name TEXT,
FOREIGN KEY(restaurant_id) REFERENCES restaurants(id)
);
CREATE INDEX IF NOT EXISTS idx_avail_restaurant_date
ON availability_snapshots(restaurant_id, check_date);
""")
conn.commit()
return conn
def save_availability_snapshot(
conn: sqlite3.Connection,
restaurant_id: int,
check_date: str,
party_size: int,
slots: list[dict]
) -> None:
conn.execute(
"INSERT INTO availability_snapshots (restaurant_id, check_date, party_size, slots_json, recorded_at) "
"VALUES (?, ?, ?, ?, ?)",
(restaurant_id, check_date, party_size, json.dumps(slots),
datetime.now(timezone.utc).isoformat())
)
conn.commit()
def get_availability_history(
conn: sqlite3.Connection,
restaurant_id: int,
check_date: str
) -> list[dict]:
"""Get all snapshots for a specific restaurant/date combo."""
rows = conn.execute(
"SELECT recorded_at, slots_json FROM availability_snapshots "
"WHERE restaurant_id = ? AND check_date = ? ORDER BY recorded_at",
(restaurant_id, check_date)
).fetchall()
return [{"recorded_at": r[0], "slots": json.loads(r[1])} for r in rows]
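With history in hand, detecting slot releases is just a diff between consecutive snapshots. A minimal sketch that consumes the output of get_availability_history() above:

```python
def newly_released_slots(history: list[dict]) -> list[dict]:
    """Compare consecutive availability snapshots and report slots that
    appeared between polls, i.e. times the restaurant released.

    history: chronological list of {"recorded_at": ..., "slots": [...]}
    dicts, as returned by get_availability_history().
    """
    releases = []
    for prev, curr in zip(history, history[1:]):
        prev_times = {s["time"] for s in prev["slots"]}
        for slot in curr["slots"]:
            if slot["time"] not in prev_times:
                releases.append({
                    "detected_at": curr["recorded_at"],
                    "slot_time": slot["time"],
                })
    return releases
```

Run this per restaurant/date pair; clusters of detections at the same hour of day suggest when the restaurant's release window falls.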
Scaling Without Getting Blocked
OpenTable's rate limits are stricter than most restaurant platforms. Key rules for production-scale scraping:
Request pacing:
- Minimum 2-3 seconds between availability requests
- Minimum 1-2 seconds between review page requests
- Use random.uniform() for jitter — consistent intervals look robotic
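One way to encode those pacing rules is a small per-endpoint pacer. The delay values below just restate the minimums above; the jitter range is a judgment call, not an OpenTable-specified number.

```python
import random
import time

# Minimum per-endpoint delays from the pacing rules above (seconds).
MIN_DELAY = {"availability": 2.0, "reviews": 1.0}

class Pacer:
    """Enforce a jittered minimum interval between requests per endpoint."""

    def __init__(self) -> None:
        self._last: dict[str, float] = {}

    def wait(self, endpoint: str) -> None:
        base = MIN_DELAY.get(endpoint, 1.0)
        delay = base + random.uniform(0.5, 1.5)  # jitter so the cadence never repeats
        elapsed = time.monotonic() - self._last.get(endpoint, 0.0)
        if elapsed < delay:
            time.sleep(delay - elapsed)
        self._last[endpoint] = time.monotonic()
```

Call pacer.wait("availability") before each availability request; because the pacer tracks elapsed time, it only sleeps for whatever portion of the interval your own processing hasn't already consumed.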
IP management:
- Never use datacenter IPs (AWS, GCP, Azure, etc.) — Akamai pre-blocks these
- Use residential proxies that rotate per request for review scraping
- Use sticky sessions (same IP for a session) for availability checking
- ThorData's residential proxies support both modes via username suffix: user-rotate vs user-session-abc123
Session management:
- Create a new httpx.Client() for each session
- Visit the homepage first to get cookies before hitting API endpoints
- Don't reuse sessions across proxy rotations
Caching:
- Restaurant metadata changes rarely — cache for 24 hours
- Reviews update slowly — pull at most once per day
- Availability is the only data worth polling frequently (every 15-60 minutes)
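The metadata cache falls out of the SQLite schema for free: the fetched_at column already records when each restaurant row was written, so a staleness check is a single query.

```python
import sqlite3
from datetime import datetime, timezone, timedelta

def needs_refresh(conn: sqlite3.Connection, restaurant_id: int,
                  max_age_hours: float = 24.0) -> bool:
    """Return True if a restaurant's cached metadata is missing or
    older than max_age_hours, based on the fetched_at column."""
    row = conn.execute(
        "SELECT fetched_at FROM restaurants WHERE id = ?", (restaurant_id,)
    ).fetchone()
    if row is None or row[0] is None:
        return True
    age = datetime.now(timezone.utc) - datetime.fromisoformat(row[0])
    return age > timedelta(hours=max_age_hours)
```

Guard the metadata fetch with it (if needs_refresh(conn, rid): ...) and you avoid re-scraping restaurants you already saw today.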
Error handling with exponential backoff:
import time
import random
def resilient_request(
client: httpx.Client,
url: str,
params: dict | None = None,
max_retries: int = 5
) -> httpx.Response | None:
for attempt in range(max_retries):
try:
resp = client.get(url, params=params)
if resp.status_code == 429:
wait = (2 ** attempt) + random.uniform(0, 2)
print(f"Rate limited. Waiting {wait:.1f}s (attempt {attempt + 1})")
time.sleep(wait)
continue
if resp.status_code == 403:
print("Blocked — rotate IP and session")
return None
resp.raise_for_status()
return resp
except httpx.ConnectError:
print(f"Connection error on attempt {attempt + 1}")
time.sleep(2 ** attempt)
except httpx.TimeoutException:
print(f"Timeout on attempt {attempt + 1}")
time.sleep(2)
return None
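When resilient_request() gives up (a hard block), the remedy is a fresh session on a new IP. A dependency-injected sketch of that loop, written so it can be wired to the create_client() and resilient_request() helpers above (e.g. make_client=lambda: create_client(PROXY_URL)); the names passed in are your choice, not a fixed API:

```python
def fetch_with_rotation(make_client, fetch, url: str, max_sessions: int = 3):
    """Retry a request across fresh proxy sessions after a hard block.

    make_client: zero-argument factory returning a client
    fetch: callable (client, url) -> response or None

    Each new client comes up with fresh cookies and, behind a rotating
    proxy, a fresh residential IP.
    """
    for _ in range(max_sessions):
        client = make_client()
        try:
            resp = fetch(client, url)
            if resp is not None:
                return resp
            # None means blocked or retries exhausted: fall through and
            # start over with a brand-new session.
        finally:
            client.close()
    return None
```

Keeping the client factory and fetch function as parameters also makes the retry loop trivially testable with fakes.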
Complete Working Example
Here's a full script that discovers restaurants in a city, tracks their availability, and stores everything in SQLite:
import httpx
import sqlite3
import time
import random
import json
from datetime import date, timedelta, datetime, timezone
PROXY_URL = "http://USER:[email protected]:9000"
def main():
conn = init_db("opentable.db")
with create_client(PROXY_URL) as client:
# Step 1: Discover restaurants in New York (metro 4)
print("Discovering restaurants...")
restaurants = discover_all_restaurants(client, metro_id=4, max_restaurants=100)
print(f"Found {len(restaurants)} restaurants\n")
# Step 2: For each restaurant, check next 7 days of availability
for rest in restaurants:
rid = rest["id"]
name = rest["name"]
print(f"\nChecking: {name} (ID: {rid})")
# Save restaurant metadata
conn.execute("""
INSERT OR REPLACE INTO restaurants
(id, name, slug, cuisine, price_range, neighborhood, raw_json, fetched_at)
VALUES (?, ?, ?, ?, ?, ?, ?, ?)
""", (
rid, name, rest.get("slug"), rest.get("cuisine"),
rest.get("price_range"), rest.get("neighborhood"),
json.dumps(rest), datetime.now(timezone.utc).isoformat()
))
conn.commit()
# Check availability for next 7 days
today = date.today()
for offset in range(1, 8):
check_date = (today + timedelta(days=offset)).isoformat()
try:
slots = get_availability(client, rid, check_date, party_size=2)
save_availability_snapshot(conn, rid, check_date, 2, slots)
print(f" {check_date}: {len(slots)} slots")
except Exception as e:
print(f" {check_date}: error — {e}")
time.sleep(random.uniform(2, 4))
if __name__ == "__main__":
main()
Waitlist Data
The waitlist endpoint gives current wait times for walk-in diners:
def get_waitlist_status(client: httpx.Client, restaurant_id: int) -> dict:
url = f"{BASE_URL}/restref/api/waitlist/status"
params = {"rid": restaurant_id}
resp = client.get(url, params=params)
if resp.status_code == 404:
return {"available": False} # restaurant doesn't use waitlist
resp.raise_for_status()
data = resp.json()
return {
"available": data.get("waitlistEnabled", False),
"current_wait_minutes": data.get("currentWaitTime"),
"party_size_options": data.get("partySizeOptions", []),
"quote_updated_at": data.get("quoteUpdatedAt"),
}
Not all restaurants use OpenTable's waitlist feature. For those that don't, expect either a 404 (handled above) or a JSON response with waitlistEnabled set to false.
Common Gotchas
Restaurant IDs vs slugs: The availability and reviews endpoints require the numeric ID. The GraphQL endpoint accepts the slug. Don't confuse them.
Date/time format: The availability endpoint expects YYYY-MM-DDTHH:MM (ISO 8601 without timezone). Sending a UTC timestamp with Z or +00:00 breaks the request.
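Concretely, build the dt parameter from a plain date plus a local time string, with no timezone suffix:

```python
from datetime import date

# Local date and time, ISO 8601, no timezone suffix.
dt_param = f"{date(2026, 9, 19).isoformat()}T19:00"
print(dt_param)  # 2026-09-19T19:00
```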
Party size limits: Most restaurants configure minimum and maximum party sizes. Requesting party=1 at a restaurant that requires minimum 2 returns an empty slots array, not an error.
Empty responses vs blocked requests: When Akamai blocks you, the response is often a valid 200 with an HTML challenge page — not a 403. Always check that resp.json() actually parses without error, and that the response contains your expected keys.
Review pagination limits: OpenTable caps reviews at a total offset around 2000-2500 regardless of review count. If a restaurant has 5,000 reviews, you can only get the most recent ~2,500.
Timezone handling: Availability slots return in the restaurant's local timezone. A restaurant in New York returns 2026-09-19T19:00:00-04:00. Parse with datetime.fromisoformat(), which handles offset strings like this from Python 3.7 onward (only the Z suffix requires 3.11+), or use python-dateutil for older versions.
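For example, parsing that New York timestamp keeps the local hour while still letting you normalize to UTC for storage:

```python
from datetime import datetime, timezone

slot = datetime.fromisoformat("2026-09-19T19:00:00-04:00")
print(slot.hour)                        # 19, the restaurant's local hour
print(slot.astimezone(timezone.utc))    # 2026-09-19 23:00:00+00:00
```

Store UTC in the database, but compare against local hours when reasoning about dinner-time demand.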
Legal Notes
OpenTable's data is publicly visible — anyone can check availability without logging in. Scraping for personal research or analytics is generally permissible under the legal framework established in hiQ v. LinkedIn (public data). However, OpenTable's Terms of Service prohibit automated access, so commercial products built on this data carry legal risk. Never store or re-publish personal data from reviews (names, dining occasions), and keep your request rates reasonable to avoid disrupting their service.