How to Scrape Costco Deals in 2026 (Weekly Offers, Kirkland Data, Warehouse Pricing)
Costco doesn't have a public API. Their website is one of the most hostile to scrapers in all of e-commerce. But the data is valuable — warehouse pricing, Kirkland brand comparisons, and weekly deal rotations are gold for price comparison tools and consumer research.
Here's how to get it done.
What Data You Can Extract
From Costco's public website:
- Product name, description, item number
- Online price and member-only pricing
- Weekly deal prices and expiration dates
- Kirkland Signature product catalog
- Product ratings and review counts
- Category and department classification
- Product images and specifications
- Availability (online vs. in-warehouse)
- Shipping costs and delivery estimates
What's behind the login wall: in-store-only prices (differ from online), purchase history, warehouse-specific inventory levels.
Why Costco Is Harder Than Other Retailers
Most e-commerce sites are scrape-friendly compared to Costco. Here's what makes it challenging:
- Imperva (Incapsula): Costco uses Imperva's advanced bot management. It runs JavaScript challenges, behavioral analysis, and device fingerprinting before serving any page.
- Mandatory cookies: The site sets multiple tracking cookies during the JS challenge. Without them, every request returns a block page.
- Dynamic rendering: Product pages load pricing via XHR calls after the initial HTML loads. A simple GET request returns a shell without prices.
- Aggressive IP blocking: Costco blocks entire datacenter IP ranges. Even some residential IPs with poor reputation scores get flagged.
- Member-wall for some products: Certain pricing is only visible to logged-in members.
This is where proxy quality makes the biggest difference. Cheap datacenter proxies won't even get past the Imperva challenge. You need clean residential IPs that haven't been flagged. I've had consistent results with ThorData's residential proxy network — their IPs pass Imperva's reputation checks, which is the hardest part of scraping Costco.
Working Python Code
This scraper uses Playwright to handle JavaScript rendering, combined with residential proxies to bypass Imperva:
import asyncio
from playwright.async_api import async_playwright

PROXY_CONFIG = {
    "server": "http://proxy.thordata.com:9000",
    "username": "USER",
    "password": "PASS",
}

async def scrape_costco_search(query: str, max_pages: int = 3) -> list[dict]:
    """Search Costco and extract product listings."""
    products = []
    async with async_playwright() as p:
        browser = await p.chromium.launch(
            headless=True,
            proxy=PROXY_CONFIG,
        )
        context = await browser.new_context(
            user_agent=(
                "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
                "AppleWebKit/537.36 (KHTML, like Gecko) "
                "Chrome/126.0.0.0 Safari/537.36"
            ),
            viewport={"width": 1440, "height": 900},
        )
        page = await context.new_page()

        # Navigate and wait for the Imperva challenge to resolve
        url = f"https://www.costco.com/CatalogSearch?dept=All&keyword={query}"
        await page.goto(url, wait_until="networkidle", timeout=30000)
        await page.wait_for_timeout(3000)  # extra wait for JS-rendered prices

        for page_num in range(max_pages):
            # Extract product data from the rendered page
            items = await page.evaluate("""
                () => {
                    const products = [];
                    document.querySelectorAll('[automation-id="productList"] .product').forEach(el => {
                        const name = el.querySelector('.description')?.textContent?.trim();
                        const priceEl = el.querySelector('.price');
                        const price = priceEl?.textContent?.trim()?.replace(/[^0-9.]/g, '');
                        const ratingEl = el.querySelector('.ratings .value');
                        const rating = ratingEl?.textContent?.trim();
                        const reviewEl = el.querySelector('.ratings .count');
                        const reviews = reviewEl?.textContent?.replace(/[^0-9]/g, '');
                        const link = el.querySelector('a.description')?.href;
                        const img = el.querySelector('img.product-img')?.src;
                        const itemNum = el.querySelector('.item-num')?.textContent?.replace('Item ', '');
                        if (name && price) {
                            products.push({
                                name, price: parseFloat(price),
                                rating: rating ? parseFloat(rating) : null,
                                reviews: reviews ? parseInt(reviews, 10) : 0,
                                url: link, image: img,
                                item_number: itemNum?.trim(),
                            });
                        }
                    });
                    return products;
                }
            """)
            products.extend(items)

            # Advance to the next results page, if any
            next_btn = await page.query_selector('a[aria-label="Next"]')
            if not next_btn or page_num == max_pages - 1:
                break
            await next_btn.click()
            await page.wait_for_load_state("networkidle")
            await page.wait_for_timeout(2000)

        await browser.close()
    return products
async def scrape_product_details(product_url: str) -> dict:
    """Scrape full details from a Costco product page."""
    async with async_playwright() as p:
        browser = await p.chromium.launch(
            headless=True,
            proxy=PROXY_CONFIG,
        )
        context = await browser.new_context(
            user_agent=(
                "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
                "AppleWebKit/537.36 (KHTML, like Gecko) "
                "Chrome/126.0.0.0 Safari/537.36"
            ),
        )
        page = await context.new_page()
        await page.goto(product_url, wait_until="networkidle", timeout=30000)
        await page.wait_for_timeout(3000)

        details = await page.evaluate("""
            () => {
                const name = document.querySelector('h1[automation-id="productName"]')
                    ?.textContent?.trim();
                const price = document.querySelector('[automation-id="productPrice"]')
                    ?.textContent?.trim()?.replace(/[^0-9.]/g, '');
                const desc = document.querySelector('#product-detail-description')
                    ?.textContent?.trim();
                const specs = {};
                document.querySelectorAll('.product-info-specs tr').forEach(row => {
                    const key = row.querySelector('th')?.textContent?.trim();
                    const val = row.querySelector('td')?.textContent?.trim();
                    if (key && val) specs[key] = val;
                });
                const images = [];
                document.querySelectorAll('.product-image-carousel img').forEach(img => {
                    if (img.src) images.push(img.src);
                });
                const shipping = document.querySelector('.shipping-info')
                    ?.textContent?.trim();
                return {
                    name, price: price ? parseFloat(price) : null,
                    description: desc, specifications: specs,
                    images, shipping_info: shipping,
                };
            }
        """)
        await browser.close()
    return details
async def scrape_weekly_deals() -> list[dict]:
    """Scrape current weekly deals / coupon book items."""
    async with async_playwright() as p:
        browser = await p.chromium.launch(
            headless=True,
            proxy=PROXY_CONFIG,
        )
        context = await browser.new_context(
            user_agent=(
                "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
                "AppleWebKit/537.36 (KHTML, like Gecko) "
                "Chrome/126.0.0.0 Safari/537.36"
            ),
        )
        page = await context.new_page()
        await page.goto(
            "https://www.costco.com/warehouse-savings.html",
            wait_until="networkidle",
            timeout=30000,
        )
        await page.wait_for_timeout(3000)

        deals = await page.evaluate("""
            () => {
                const items = [];
                document.querySelectorAll('.product').forEach(el => {
                    const name = el.querySelector('.description')?.textContent?.trim();
                    const original = el.querySelector('.strike-price')
                        ?.textContent?.replace(/[^0-9.]/g, '');
                    const sale = el.querySelector('.sale-price, .price')
                        ?.textContent?.replace(/[^0-9.]/g, '');
                    const savings = el.querySelector('.savings')
                        ?.textContent?.trim();
                    const validThru = el.querySelector('.valid-dates')
                        ?.textContent?.trim();
                    if (name) {
                        items.push({
                            name,
                            original_price: original ? parseFloat(original) : null,
                            sale_price: sale ? parseFloat(sale) : null,
                            savings: savings,
                            valid_through: validThru,
                        });
                    }
                });
                return items;
            }
        """)
        await browser.close()
    return deals
if __name__ == "__main__":
    async def main():
        # Search for Kirkland products
        print("Searching: Kirkland Signature\n")
        products = await scrape_costco_search("kirkland signature", max_pages=2)
        for p in products[:10]:
            rating = f" ★{p['rating']}" if p["rating"] else ""
            print(f"  ${p['price']:.2f} — {p['name'][:70]}{rating}")
        print(f"\n  Total found: {len(products)} products")

        # Get weekly deals
        print("\nWeekly deals:\n")
        deals = await scrape_weekly_deals()
        for d in deals[:10]:
            if d["sale_price"] is None:
                continue  # skip deals whose sale price didn't parse
            save = f" (save {d['savings']})" if d["savings"] else ""
            print(f"  ${d['sale_price']:.2f} — {d['name'][:60]}{save}")

    asyncio.run(main())
Installing Dependencies
pip install playwright && playwright install chromium
Playwright is necessary here because Costco's Imperva protection requires full JavaScript execution. Lighter approaches like httpx alone won't work.
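Even with Playwright, some sessions still land on a challenge or block page instead of real content. A cheap heuristic on the rendered HTML lets you bail out and rotate IPs instead of parsing garbage. The marker strings below are common Imperva/Incapsula tells seen in the wild, not an exhaustive or guaranteed list; tune them against the responses you actually receive:

```python
def looks_blocked(html: str) -> bool:
    """Heuristic check for an Imperva/Incapsula challenge or block page.

    Marker strings are assumptions based on commonly observed block
    pages, not a guaranteed signature. A real product page is tens of
    kilobytes, so a tiny response is suspicious on its own.
    """
    markers = (
        "_Incapsula_Resource",    # challenge script injected by Incapsula
        "Request unsuccessful",   # classic Imperva block-page text
        "Incapsula incident ID",
    )
    return any(m in html for m in markers) or len(html) < 2000

print(looks_blocked("<html>Request unsuccessful.</html>"))  # True
```

Run this check on `await page.content()` after every navigation before handing the HTML to your extraction code.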
Kirkland Brand Price Tracking
Kirkland Signature is Costco's private label and one of the most interesting datasets to track. Prices change infrequently, but when they do it often signals broader supply-chain shifts.
Build a simple tracker by running the search scraper weekly and storing results:
import sqlite3
from datetime import date

def save_prices(products: list[dict], db_path: str = "costco_prices.db"):
    conn = sqlite3.connect(db_path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS prices (
            item_number TEXT, name TEXT, price REAL,
            date TEXT, PRIMARY KEY (item_number, date)
        )
    """)
    for p in products:
        if p.get("item_number"):
            conn.execute(
                "INSERT OR REPLACE INTO prices VALUES (?, ?, ?, ?)",
                (p["item_number"], p["name"], p["price"], date.today().isoformat()),
            )
    conn.commit()
    conn.close()
Practical Tips
- Run during US business hours — your traffic blends in with real shoppers
- Limit to 50-100 pages per session — Imperva tracks session behavior over time
- Rotate browser fingerprints — vary viewport size, timezone, and language
- Don't scrape member-only pages — accessing authenticated content without permission creates legal risk
- Cache product pages — Costco's inventory changes weekly at most, no need for daily scrapes
- Handle Imperva retries — if you get a challenge page, wait 10 seconds and retry once before rotating IPs
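The retry tip above can be wrapped around any navigation. This sketch assumes a Playwright-style page object and takes any predicate you supply for detecting a challenge page; it waits roughly ten seconds with jitter, retries once, then raises so your caller can rotate IPs:

```python
import asyncio
import random

async def goto_with_retry(page, url: str, is_blocked, retries: int = 1,
                          base_wait: float = 10.0) -> str:
    """Load a URL, retrying on a challenge page before giving up.

    `page` is assumed to be a Playwright page and `is_blocked` any
    callable that inspects the rendered HTML and returns True on a
    challenge/block page -- both are yours to wire in.
    """
    for attempt in range(retries + 1):
        await page.goto(url, wait_until="networkidle", timeout=30000)
        await page.wait_for_timeout(3000)
        html = await page.content()
        if not is_blocked(html):
            return html
        # Challenge page: back off with jitter before the retry
        await asyncio.sleep(base_wait + random.uniform(0, base_wait * 0.3))
    raise RuntimeError(f"still blocked after {retries + 1} attempts: {url}")
```

If the retry also fails, rotate to a fresh residential IP and start a new browser context rather than hammering the same session.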
Tracking Price Changes Over Time
Costco's pricing model differs from most retailers: prices change infrequently, and a new price typically holds until the next deliberate adjustment rather than fluctuating daily. Here is a SQLite-backed tracker:
import sqlite3
from datetime import date, datetime

def init_price_tracker(db_path: str = "costco_prices.db") -> sqlite3.Connection:
    conn = sqlite3.connect(db_path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS products (
            item_number TEXT PRIMARY KEY,
            name TEXT,
            url TEXT,
            category TEXT,
            image_url TEXT,
            first_seen TEXT,
            last_seen TEXT
        );
        CREATE TABLE IF NOT EXISTS price_history (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            item_number TEXT,
            price REAL,
            sale_price REAL,
            savings TEXT,
            valid_through TEXT,
            recorded_at TEXT,
            FOREIGN KEY (item_number) REFERENCES products(item_number)
        );
        CREATE TABLE IF NOT EXISTS price_alerts (
            item_number TEXT PRIMARY KEY,
            target_price REAL,
            email TEXT,
            active INTEGER DEFAULT 1
        );
        CREATE INDEX IF NOT EXISTS idx_price_item
            ON price_history(item_number, recorded_at DESC);
    """)
    conn.commit()
    return conn
def record_price(
    conn: sqlite3.Connection,
    item: dict,
    sale_price: float | None = None,
    savings: str | None = None,
    valid_through: str | None = None,
):
    """Record a price observation for a product."""
    now = date.today().isoformat()

    # Upsert the product record
    conn.execute(
        """INSERT INTO products (item_number, name, url, image_url, first_seen, last_seen)
           VALUES (?, ?, ?, ?, ?, ?)
           ON CONFLICT(item_number) DO UPDATE SET
               name=excluded.name, last_seen=excluded.last_seen""",
        (item.get("item_number"), item.get("name"), item.get("url"),
         item.get("image"), now, now)
    )

    # Only append to history when the price actually changed
    last_price = conn.execute(
        """SELECT price FROM price_history
           WHERE item_number = ?
           ORDER BY recorded_at DESC LIMIT 1""",
        (item.get("item_number"),)
    ).fetchone()
    current_price = item.get("price")
    if last_price is None or last_price[0] != current_price:
        conn.execute(
            """INSERT INTO price_history
               (item_number, price, sale_price, savings, valid_through, recorded_at)
               VALUES (?, ?, ?, ?, ?, ?)""",
            (item.get("item_number"), current_price, sale_price,
             savings, valid_through, datetime.now().isoformat())
        )
    conn.commit()
def get_price_drops(conn: sqlite3.Connection, min_drop_pct: float = 10.0) -> list:
    """Find products where the current price is significantly below the historical average."""
    return conn.execute("""
        WITH latest AS (
            SELECT item_number, price, recorded_at,
                   ROW_NUMBER() OVER (PARTITION BY item_number ORDER BY recorded_at DESC) rn
            FROM price_history
        ),
        historical AS (
            SELECT item_number, AVG(price) AS avg_price
            FROM price_history
            WHERE recorded_at < datetime('now', '-7 days')
            GROUP BY item_number
        )
        SELECT p.name, l.price AS current_price,
               h.avg_price AS historical_avg,
               ROUND((h.avg_price - l.price) / h.avg_price * 100, 1) AS drop_pct
        FROM latest l
        JOIN historical h ON l.item_number = h.item_number
        JOIN products p ON l.item_number = p.item_number
        WHERE l.rn = 1
          AND h.avg_price > 0
          AND (h.avg_price - l.price) / h.avg_price * 100 >= ?
        ORDER BY drop_pct DESC
    """, (min_drop_pct,)).fetchall()
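To eyeball a single item's trajectory, a small helper over the same `price_history` table is enough (the function name and the sample rows here are my own, for illustration):

```python
import sqlite3

def price_history_for(conn: sqlite3.Connection, item_number: str) -> list[tuple[str, float]]:
    """Return (recorded_at, price) pairs for one item, oldest first."""
    return conn.execute(
        """SELECT recorded_at, price FROM price_history
           WHERE item_number = ?
           ORDER BY recorded_at ASC""",
        (item_number,),
    ).fetchall()

# Usage sketch against an in-memory database with fabricated sample rows
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE price_history (item_number TEXT, price REAL, recorded_at TEXT)")
conn.executemany(
    "INSERT INTO price_history VALUES (?, ?, ?)",
    [("100123", 24.99, "2026-01-05"), ("100123", 21.99, "2026-02-02")],
)
history = price_history_for(conn, "100123")
print(history)  # [('2026-01-05', 24.99), ('2026-02-02', 21.99)]
```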
Category-Based Product Discovery
Instead of searching, browse Costco's category URLs directly for more systematic coverage:
COSTCO_CATEGORIES = {
    "electronics": "https://www.costco.com/electronics.html",
    "grocery": "https://www.costco.com/grocery.html",
    "health-beauty": "https://www.costco.com/health-beauty.html",
    "clothing": "https://www.costco.com/clothing.html",
    "garden": "https://www.costco.com/garden-patio.html",
    "kitchen": "https://www.costco.com/kitchen.html",
    "furniture": "https://www.costco.com/furniture.html",
    "toys": "https://www.costco.com/toys-games.html",
    "sporting-goods": "https://www.costco.com/sports-fitness.html",
    "auto": "https://www.costco.com/auto-accessories.html",
}
async def scrape_category(category_name: str, category_url: str) -> list:
    """Scrape all products from a Costco category page."""
    async with async_playwright() as p:
        browser = await p.chromium.launch(
            headless=True,
            proxy=PROXY_CONFIG,
        )
        context = await browser.new_context(
            user_agent=(
                "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
                "AppleWebKit/537.36 (KHTML, like Gecko) "
                "Chrome/126.0.0.0 Safari/537.36"
            ),
            viewport={"width": 1440, "height": 900},
        )
        page = await context.new_page()
        await page.goto(category_url, wait_until="networkidle", timeout=30000)
        await page.wait_for_timeout(3000)

        # Scroll to trigger lazy-loaded products
        for _ in range(5):
            await page.evaluate("window.scrollBy(0, window.innerHeight)")
            await page.wait_for_timeout(1500)

        products = await page.evaluate("""
            () => {
                const items = [];
                document.querySelectorAll('[automation-id="productList"] .product').forEach(el => {
                    const name = el.querySelector('.description')?.textContent?.trim();
                    const price = el.querySelector('.price')?.textContent?.replace(/[^0-9.]/g, '');
                    const rating = el.querySelector('.ratings .value')?.textContent?.trim();
                    const link = el.querySelector('a.description')?.href;
                    const itemNum = el.querySelector('.item-num')?.textContent?.replace('Item ', '').trim();
                    const img = el.querySelector('img.product-img')?.src;
                    if (name && price) {
                        items.push({name, price: parseFloat(price), rating, url: link,
                                    item_number: itemNum, image: img});
                    }
                });
                return items;
            }
        """)
        await browser.close()

    # Tag each product with its source category
    for product in products:
        product["category"] = category_name
    return products
async def scrape_all_categories() -> list:
    """Scrape all major Costco categories systematically."""
    import asyncio
    import random

    all_products = []
    for category_name, url in COSTCO_CATEGORIES.items():
        print(f"Scraping category: {category_name}")
        products = await scrape_category(category_name, url)
        all_products.extend(products)
        print(f"  Found {len(products)} products")
        await asyncio.sleep(random.uniform(15, 30))  # polite pause between categories
    return all_products
Kirkland Signature Data Analysis
The Kirkland brand is particularly interesting for pricing analysis. Here is how to build a Kirkland product database:
async def collect_kirkland_products(db_path: str = "costco_prices.db") -> list:
    """Collect all currently listed Kirkland Signature products."""
    products = await scrape_costco_search("kirkland signature", max_pages=5)
    # Filter to Kirkland-only
    kirkland = [p for p in products if "kirkland" in p.get("name", "").lower()]

    conn = init_price_tracker(db_path)
    for product in kirkland:
        if product.get("item_number"):
            record_price(conn, product)
    conn.close()

    print(f"Collected {len(kirkland)} Kirkland products")
    return kirkland
def analyze_kirkland_by_category(products: list) -> dict:
    """Group Kirkland products by category and compute pricing stats."""
    import statistics

    # Assign each product to a category by keyword match on its name
    categories = {}
    category_keywords = {
        "organic": ["organic"],
        "supplements": ["vitamin", "supplement", "omega", "probiotic", "protein"],
        "snacks": ["snack", "nut", "chip", "cracker", "trail mix", "popcorn"],
        "beverages": ["water", "coffee", "tea", "juice", "olive oil"],
        "household": ["laundry", "detergent", "paper", "trash", "zip"],
        "personal_care": ["shampoo", "conditioner", "soap", "dental", "floss"],
        "meat": ["chicken", "beef", "salmon", "shrimp", "turkey"],
        "dairy": ["butter", "cheese", "yogurt", "milk", "cream"],
    }
    for product in products:
        name_lower = product.get("name", "").lower()
        assigned = "other"
        for cat, keywords in category_keywords.items():
            if any(kw in name_lower for kw in keywords):
                assigned = cat
                break
        categories.setdefault(assigned, []).append(product.get("price", 0))

    analysis = {}
    for cat, prices in categories.items():
        valid_prices = [p for p in prices if p > 0]
        if valid_prices:
            analysis[cat] = {
                "count": len(valid_prices),
                "min": min(valid_prices),
                "max": max(valid_prices),
                "median": round(statistics.median(valid_prices), 2),
                "avg": round(statistics.mean(valid_prices), 2),
            }
    return dict(sorted(analysis.items(), key=lambda x: x[1]["count"], reverse=True))
Integrating with Price Alert Services
For personal use, connect price drops to notifications:
import smtplib
from email.mime.text import MIMEText

def send_price_alert(
    product_name: str,
    item_number: str,
    current_price: float,
    original_price: float,
    product_url: str,
    to_email: str,
    from_email: str,
    smtp_password: str,
):
    """Send an email notification when a tracked product drops in price."""
    savings_pct = round((original_price - current_price) / original_price * 100, 1)
    savings_amt = round(original_price - current_price, 2)

    subject = f"Price Drop: {product_name[:50]} ({savings_pct}% off)"
    body = f"""
Price Alert: {product_name}

Current Price: ${current_price:.2f}
Previous Price: ${original_price:.2f}
Savings: ${savings_amt:.2f} ({savings_pct}% off)

Product URL: {product_url}
Item Number: {item_number}
""".strip()

    msg = MIMEText(body)
    msg["Subject"] = subject
    msg["From"] = from_email
    msg["To"] = to_email

    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.login(from_email, smtp_password)
        server.sendmail(from_email, to_email, msg.as_string())

def check_and_alert(db_path: str = "costco_prices.db", alert_email: str | None = None):
    """Check for price drops on watched items and send alerts."""
    conn = sqlite3.connect(db_path)
    drops = get_price_drops(conn, min_drop_pct=15.0)
    for drop in drops:
        name, current_price, historical_avg, drop_pct = drop
        print(f"Drop: {name} - ${current_price:.2f} (was ~${historical_avg:.2f}, {drop_pct}% off)")
        if alert_email:
            product_url = conn.execute(
                "SELECT url FROM products WHERE name = ?", (name,)
            ).fetchone()
            url = product_url[0] if product_url else ""
            send_price_alert(
                product_name=name,
                item_number="",
                current_price=current_price,
                original_price=historical_avg,
                product_url=url,
                to_email=alert_email,
                from_email="[email protected]",
                smtp_password="app_password",
            )
    conn.close()
Automating with a Cron Job
Run the scraper daily and let it build a price history automatically:
# Add to crontab: crontab -e
# Run every day at 6am
0 6 * * * cd /home/user/costco-tracker && python3 scrape.py >> logs/scrape.log 2>&1
# scrape.py -- daily scraper script
# Assumes the functions defined earlier in this article (init_price_tracker,
# record_price, get_price_drops, scrape_weekly_deals, scrape_costco_search)
# live in importable modules alongside this file.
import asyncio
from datetime import date

async def daily_scrape():
    conn = init_price_tracker("costco_prices.db")
    print(f"Daily Costco scrape: {date.today().isoformat()}")

    # Weekly deals
    deals = await scrape_weekly_deals()
    print(f"Weekly deals: {len(deals)} items")
    for deal in deals:
        if deal.get("sale_price"):
            record_price(
                conn, deal,
                sale_price=deal.get("sale_price"),
                savings=deal.get("savings"),
                valid_through=deal.get("valid_through"),
            )

    # Kirkland products
    kirkland = await scrape_costco_search("kirkland signature", max_pages=3)
    print(f"Kirkland products: {len(kirkland)} items")
    for product in kirkland:
        if product.get("item_number"):
            record_price(conn, product)

    # Check for price drops
    drops = get_price_drops(conn, min_drop_pct=20.0)
    if drops:
        print(f"\nPrice drops found: {len(drops)}")
        for drop in drops:
            print(f"  {drop[0]}: ${drop[1]:.2f} (was ${drop[2]:.2f}, -{drop[3]}%)")

    conn.close()
    print("Done.")

asyncio.run(daily_scrape())
Legal Notes
Costco's terms prohibit automated access. Scraping publicly visible product listings and pricing for personal research is generally low-risk legally. Building a competing retail service with their data is not. Key boundaries:
- Publicly visible prices and product names: generally safe for personal research
- Member-only prices or data behind authentication: avoid completely
- Bulk redistribution of their product catalog: high legal risk
- Price comparison tools for personal use: low risk with reasonable request rates