Scraping IKEA Product Data and Prices Across Countries (2026)

IKEA operates separate storefronts for over 50 countries, but all of them draw from the same internal API infrastructure. The search backend at sik.search.blue.cdtapps.com accepts straightforward JSON requests, and product detail pages expose structured data from predictable REST endpoints.

What makes IKEA uniquely interesting as a scraping target is the price divergence across countries. The KALLAX shelving unit costs roughly 40% more in Australia than in Poland. The BILLY bookcase has a 60% spread between Norway and the US. These gaps are persistent and predictable — flat-pack furniture is heavy to ship, so IKEA prices locally rather than harmonizing globally.

This guide covers the full picture: API endpoints, anti-bot defenses, cross-country price comparison, store availability checking, and a complete SQLite-backed pipeline.

What Data Is Available

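Through IKEA's internal APIs you can extract article numbers, product and type names, current and previous prices with currency codes, ratings and review counts, product page and image URLs, category paths, and per-store stock quantities with restock dates.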

The article number is the key that ties everything together across countries. Find a product in Germany, note its article number, then query that same number against the US, UK, and Australian stores to compare prices.
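As a rough illustration of using the article number as the join key, the sketch below groups per-country rows into one record per article. The row shape mirrors the dicts produced by the search function later in this guide; the helper name `group_prices_by_article`, the second article number, and all prices are made up for the example.

```python
from collections import defaultdict


def group_prices_by_article(rows: list[dict]) -> dict[str, dict[str, tuple[float, str]]]:
    """Map article_number -> {country: (price, currency)}."""
    grouped: dict[str, dict[str, tuple[float, str]]] = defaultdict(dict)
    for row in rows:
        # Normalise "402.821.81" and "40282181" to the same key
        article = str(row["article_number"]).replace(".", "")
        if row.get("price") is None:
            continue  # no price listed in this market
        grouped[article][row["country"]] = (float(row["price"]), row["currency"])
    return dict(grouped)


# Illustrative rows only: the prices and the second article number are invented.
rows = [
    {"article_number": "402.821.81", "country": "de", "price": 39.99, "currency": "EUR"},
    {"article_number": "40282181", "country": "us", "price": 49.99, "currency": "USD"},
    {"article_number": "12345678", "country": "us", "price": 69.0, "currency": "USD"},
]
by_article = group_prices_by_article(rows)
```

Keying on the dot-free form means rows collected from different storefronts line up even when one source reports the formatted number.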

Anti-Bot Architecture

IKEA deploys layered defenses that vary by country storefront:

Akamai Bot Manager. Most IKEA country storefronts run behind Akamai, which performs TLS fingerprinting, IP reputation scoring, and behavioral analysis. Datacenter IPs trigger challenges quickly. Akamai also evaluates request cadence: back-to-back search queries without delays are a fast path to blocking.

Client-side JavaScript challenges. Some country stores (notably the US and UK) serve browser challenges before loading catalog content. These verify that a real browser executed JavaScript. Pure HTTP clients without correct TLS fingerprints fail these.

Geo-IP enforcement. This is the biggest practical obstacle for cross-country work. IKEA's CDN serves different catalog data depending on origin IP geography. The internal APIs enforce geographic origin more aggressively than the HTML pages. To pull US pricing, your requests must originate from US IPs.

Rate limiting. The search API rate-limits by IP. A safe ceiling is one request every 2-4 seconds per IP. Beyond that, responses start returning empty result sets before explicit 429s appear.
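One way to hold that ceiling when several coroutines share a proxy is a small per-key throttle. The sketch below is an assumption about how you might structure it, not something IKEA documents: `PerKeyThrottle` is a hypothetical helper, and the key can be any string that identifies the outgoing IP (for example the proxy URL).

```python
import asyncio
import random
import time


class PerKeyThrottle:
    """Enforce a randomized minimum gap between calls that share a key (e.g. one proxy IP)."""

    def __init__(self, min_gap: float = 2.0, max_gap: float = 4.0):
        self.min_gap = min_gap
        self.max_gap = max_gap
        self._next_allowed: dict[str, float] = {}
        self._locks: dict[str, asyncio.Lock] = {}

    async def wait(self, key: str) -> None:
        # One lock per key so callers on the same IP queue up in order
        lock = self._locks.setdefault(key, asyncio.Lock())
        async with lock:
            delay = self._next_allowed.get(key, 0.0) - time.monotonic()
            if delay > 0:
                await asyncio.sleep(delay)
            # Pick the gap before the next call on this key
            gap = random.uniform(self.min_gap, self.max_gap)
            self._next_allowed[key] = time.monotonic() + gap
```

Calling `await throttle.wait(proxy_url or "direct")` before each request keeps every IP at one request per 2-4 seconds while still letting different IPs proceed in parallel.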

IKEA's Internal APIs

Two endpoints cover roughly 90% of use cases (search and product detail); a third handles store availability:

Search API:

GET https://sik.search.blue.cdtapps.com/{country}/{lang}/search
    ?q={query}&size=24&start=0&c=sr&v=20

country and lang follow ISO codes: us/en, de/de, gb/en, au/en, pl/pl, se/sv, etc. Response is JSON with a searchResultPage.products.productWindow array containing full product objects.
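To make the parameter layout concrete, here is a small sketch that assembles the search URL for a given results page. It assumes `start` is a result offset rather than a page index, which matches how the pagination code later in this guide advances it; `build_search_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

SEARCH_HOST = "https://sik.search.blue.cdtapps.com"


def build_search_url(query: str, country: str, lang: str, page: int = 0, size: int = 24) -> str:
    """Return the search URL for a zero-based results page."""
    params = {
        "q": query,
        "size": size,
        "start": page * size,  # offset of the first result on this page
        "c": "sr",
        "v": "20",
    }
    return f"{SEARCH_HOST}/{country}/{lang}/search?{urlencode(params)}"


url = build_search_url("kallax shelf", "de", "de", page=2)
# url == "https://sik.search.blue.cdtapps.com/de/de/search?q=kallax+shelf&size=24&start=48&c=sr&v=20"
```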

Product detail API:

GET https://www.ikea.com/{country}/{lang}/product-list/?key={articleNumber}

Returns structured JSON for a single product, including all metadata fields.

Store availability API:

GET https://api.ingka.ikea.com/cia/availabilities/ru/{country}
    ?itemNos={articleNumber}&expand=StoresList,Restocks,SalesLocations

Requires frontend-embedded client credentials — extract them once from DevTools.

Setting Up the Client

import asyncio
import httpx
import json
import random
from typing import Optional

USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36",
]

def build_headers() -> dict:
    return {
        "User-Agent": random.choice(USER_AGENTS),
        "Accept": "application/json",
        "Accept-Language": "en-US,en;q=0.9",
        "Accept-Encoding": "gzip, deflate, br",
        "Referer": "https://www.ikea.com/",
        "Origin": "https://www.ikea.com",
        "Sec-Fetch-Dest": "empty",
        "Sec-Fetch-Mode": "cors",
        "Sec-Fetch-Site": "cross-site",
    }

def build_client(proxy: Optional[str] = None) -> httpx.AsyncClient:
    """
    Build async HTTP client.
    http2=True brings TLS fingerprint closer to a real browser — helps with Akamai.
    """
    return httpx.AsyncClient(
        headers=build_headers(),
        proxy=proxy,  # httpx >= 0.26; the older proxies= argument was removed in 0.28
        timeout=30.0,
        follow_redirects=True,
        http2=True,
    )

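HTTP/2 matters here. Akamai's fingerprinting looks at what the client advertises during the TLS handshake (cipher suites, extensions, and the ALPN protocol list) as well as HTTP/2 connection settings, so a client that negotiates HTTP/2 looks somewhat closer to a real browser than one limited to HTTP/1.1. It does not make httpx indistinguishable from Chrome. Note that http2=True requires the optional h2 dependency (pip install "httpx[http2]").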

Searching Products

async def search_ikea(
    query: str,
    country: str = "us",
    lang: str = "en",
    size: int = 24,
    start: int = 0,
    proxy: Optional[str] = None,
) -> list[dict]:
    """
    Search IKEA product catalog for a country.

    Returns list of product dicts with article numbers, names, prices.
    """
    url = f"https://sik.search.blue.cdtapps.com/{country}/{lang}/search"
    params = {
        "q": query,
        "size": size,
        "start": start,
        "c": "sr",
        "v": "20",
        "rank": "RELEVANCE",
    }

    async with build_client(proxy) as client:
        try:
            resp = await client.get(url, params=params)
            resp.raise_for_status()
            data = resp.json()
        except (httpx.HTTPError, ValueError) as e:
            print(f"  Search failed: {e}")
            return []

    products = (
        data.get("searchResultPage", {})
        .get("products", {})
        .get("productWindow", [])
    )

    results = []
    for p in products:
        article_number = str(p.get("id", "")).replace(".", "")
        results.append({
            "article_number": article_number,
            "name": p.get("name"),
            "type_name": p.get("typeName"),
            "price": p.get("salesPrice", {}).get("numeral"),
            "currency": p.get("salesPrice", {}).get("currencyCode"),
            "price_formatted": p.get("salesPrice", {}).get("formatted"),
            "price_original": p.get("previousPrice", {}).get("numeral") if p.get("previousPrice") else None,
            "rating": p.get("ratingValue"),
            "review_count": p.get("ratingCount"),
            "url": p.get("pipUrl"),
            "image_url": p.get("mainImageUrl"),
            "category_path": p.get("categoryPath"),
            "country": country,
        })

    return results


async def search_ikea_paginated(
    query: str,
    country: str = "us",
    lang: str = "en",
    max_results: int = 100,
    proxy: Optional[str] = None,
) -> list[dict]:
    """Paginate through IKEA search results."""
    all_results = []
    page_size = 24
    start = 0

    while len(all_results) < max_results:
        batch = await search_ikea(query, country, lang, page_size, start, proxy)
        if not batch:
            break
        all_results.extend(batch)
        start += page_size
        await asyncio.sleep(random.uniform(2.0, 4.0))

    return all_results[:max_results]

Cross-Country Price Comparison

This is the most powerful IKEA scraping use case. The same KALLAX 2x2 unit (article number 40282181) has different prices in every market:

COUNTRY_CONFIGS = {
    "us": ("us", "en", "USD"),
    "gb": ("gb", "en", "GBP"),
    "de": ("de", "de", "EUR"),
    "au": ("au", "en", "AUD"),
    "pl": ("pl", "pl", "PLN"),
    "se": ("se", "sv", "SEK"),
    "no": ("no", "no", "NOK"),
    "fr": ("fr", "fr", "EUR"),
    "ca": ("ca", "en", "CAD"),
    "jp": ("jp", "ja", "JPY"),
    "nl": ("nl", "nl", "EUR"),
    "ch": ("ch", "de", "CHF"),
}

# Approximate USD rates — fetch live rates for production use
USD_RATES = {
    "USD": 1.0,
    "GBP": 1.27,
    "EUR": 1.09,
    "AUD": 0.65,
    "PLN": 0.25,
    "SEK": 0.096,
    "NOK": 0.094,
    "CAD": 0.74,
    "JPY": 0.0067,
    "CHF": 1.11,
}


async def fetch_price_for_country(
    article_number: str,
    country_code: str,
    proxy: Optional[str] = None,
) -> Optional[dict]:
    """
    Fetch price for an IKEA article number in a specific country.
    Searches by article number — IKEA's search handles numeric queries.
    """
    country, lang, currency = COUNTRY_CONFIGS[country_code]
    products = await search_ikea(article_number, country, lang, size=5, proxy=proxy)

    clean_number = article_number.replace(".", "")

    for p in products:
        if p.get("article_number", "").replace(".", "") == clean_number:
            price_num = p.get("price")
            if price_num is None:
                return None
            price_usd = float(price_num) * USD_RATES.get(currency, 1.0)
            return {
                "country": country_code,
                "currency": currency,
                "price_local": float(price_num),
                "price_usd": round(price_usd, 2),
                "price_formatted": p.get("price_formatted"),
                "name": p.get("name"),
                "on_sale": p.get("price_original") is not None,
            }
    return None


async def compare_prices_across_countries(
    article_number: str,
    proxy_template: Optional[str] = None,
) -> list[dict]:
    """
    Fetch price across all configured countries.
    For geo-IP enforcement, use country-targeted proxy sessions.
    """
    tasks = [
        fetch_price_for_country(article_number, cc, proxy_template)
        for cc in COUNTRY_CONFIGS
    ]
    results = await asyncio.gather(*tasks, return_exceptions=True)
    valid = [r for r in results if r and not isinstance(r, Exception)]
    valid.sort(key=lambda x: x["price_usd"])
    return valid


async def main():
    # Compare KALLAX 2x2 prices
    article = "40282181"
    prices = await compare_prices_across_countries(article)

    print(f"Price comparison for KALLAX (article {article}):\n")
    for r in prices:
        on_sale = " [SALE]" if r.get("on_sale") else ""
        print(
            f"  {r['country'].upper():4s}  {r['price_formatted']:>12s}"
            f"  (~${r['price_usd']:>7.2f} USD){on_sale}"
        )

    if len(prices) >= 2:
        cheapest = prices[0]
        priciest = prices[-1]
        spread = (priciest["price_usd"] / cheapest["price_usd"] - 1) * 100
        print(f"\n  Cheapest:  {cheapest['country'].upper()} at ${cheapest['price_usd']:.2f}")
        print(f"  Most expensive: {priciest['country'].upper()} at ${priciest['price_usd']:.2f}")
        print(f"  Price spread: {spread:.1f}%")

asyncio.run(main())

Store Availability API

Check stock levels at individual IKEA stores:

# These credentials are embedded in IKEA's frontend JavaScript
# Extract them from DevTools: open any IKEA product page, filter XHR requests,
# look for requests to api.ingka.ikea.com — the headers contain the credentials
INGKA_CLIENT_ID = "b6c117e5-ae61-4ef5-b4cc-e0b1e37f0631"  # Example — extract fresh
INGKA_CLIENT_SECRET = "b6c117e5-ae61-4ef5-b4cc-e0b1e37f0631"  # Placeholder (duplicates the ID above); replace with the value seen in DevTools

async def check_store_availability(
    article_number: str,
    country_code: str,
    proxy: Optional[str] = None,
) -> list[dict]:
    """
    Check stock levels at IKEA stores for an article number.

    Returns list of stores with quantity available and in-stock status.
    """
    url = f"https://api.ingka.ikea.com/cia/availabilities/ru/{country_code}"
    params = {
        "itemNos": article_number.replace(".", ""),
        "expand": "StoresList,Restocks,SalesLocations",
    }

    headers = {
        "User-Agent": random.choice(USER_AGENTS),
        "Accept": "application/json",
        "x-client-id": INGKA_CLIENT_ID,
        "x-client-secret": INGKA_CLIENT_SECRET,
        "Origin": "https://www.ikea.com",
        "Referer": "https://www.ikea.com/",
    }

    async with httpx.AsyncClient(
        headers=headers,
        proxy=proxy,  # httpx >= 0.26; the older proxies= argument was removed in 0.28
        timeout=30.0,
        http2=True,
    ) as client:
        resp = await client.get(url, params=params)
        resp.raise_for_status()
        data = resp.json()

    stores = []
    for availability in data.get("availabilities", []):
        cash_carry = (
            availability.get("buyingOption", {})
            .get("cashCarry", {})
            .get("availability", {})
        )
        store_id = availability.get("classUnitKey", {}).get("classUnitCode")
        quantity = cash_carry.get("quantity") or 0  # treat a missing or null quantity as zero

        stores.append({
            "store_id": store_id,
            "quantity": quantity,
            "in_stock": quantity > 0,
            "restock_date": cash_carry.get("restockDateTime"),
        })

    return sorted(stores, key=lambda s: s["quantity"], reverse=True)

SQLite Schema

import sqlite3

def init_ikea_db(db_path: str = "ikea.db") -> sqlite3.Connection:
    conn = sqlite3.connect(db_path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS products (
            article_number TEXT NOT NULL,
            country TEXT NOT NULL,
            name TEXT,
            type_name TEXT,
            price_local REAL,
            currency TEXT,
            price_usd REAL,
            price_formatted TEXT,
            on_sale INTEGER DEFAULT 0,
            rating REAL,
            review_count INTEGER,
            url TEXT,
            image_url TEXT,
            scraped_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
            PRIMARY KEY (article_number, country)
        );

        CREATE TABLE IF NOT EXISTS store_availability (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            article_number TEXT NOT NULL,
            country TEXT NOT NULL,
            store_id TEXT,
            quantity INTEGER,
            in_stock INTEGER,
            restock_date TEXT,
            checked_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        );

        CREATE INDEX IF NOT EXISTS idx_products_article
            ON products(article_number);

        CREATE INDEX IF NOT EXISTS idx_products_country
            ON products(country);

        CREATE INDEX IF NOT EXISTS idx_availability_article
            ON store_availability(article_number, country);
    """)
    conn.commit()
    return conn


def save_product(conn: sqlite3.Connection, product: dict):
    conn.execute(
        """INSERT OR REPLACE INTO products
           (article_number, country, name, type_name, price_local, currency,
            price_usd, price_formatted, on_sale, rating, review_count, url, image_url)
           VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?)""",
        (
            product.get("article_number"),
            product.get("country"),
            product.get("name"),
            product.get("type_name"),
            product.get("price_local"),
            product.get("currency"),
            product.get("price_usd"),
            product.get("price_formatted"),
            int(product.get("on_sale", False)),
            product.get("rating"),
            product.get("review_count"),
            product.get("url"),
            product.get("image_url"),
        ),
    )
    conn.commit()

Proxy Configuration

Geo-IP enforcement makes residential proxies non-optional for cross-country price comparison. IKEA's CDN checks origin IP against the country store being requested.

ThorData's residential proxies support city-level targeting, which maps cleanly onto IKEA's country API segments. To pull from us/en, target a US city. To pull from de/de, target a German city.

# Standard proxy URL (PROXY_HOST is a placeholder for your provider's gateway host)
PROXY = "http://USER:PASS@PROXY_HOST:9000"

# For country-specific targeting (check ThorData docs for session syntax)
# PROXY_US = "http://USER:PASS@PROXY_HOST:9000"
# PROXY_DE = "http://USER:PASS@PROXY_HOST:9000"

For the comparison function, ideally use a different country-targeted session for each fetch_price_for_country call.
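A minimal sketch of that idea follows. It takes the per-country fetch coroutine as a parameter so the block stays self-contained; in practice you would pass `fetch_price_for_country` from the comparison section. The function name `compare_with_country_proxies` and the shape of `proxies_by_country` are assumptions for illustration, and the proxy URLs you supply depend on your provider's session syntax.

```python
import asyncio
from typing import Awaitable, Callable, Optional

# Signature matches fetch_price_for_country(article_number, country_code, proxy)
FetchFn = Callable[[str, str, Optional[str]], Awaitable[Optional[dict]]]


async def compare_with_country_proxies(
    article_number: str,
    proxies_by_country: dict[str, str],
    fetch: FetchFn,
) -> list[dict]:
    """Call fetch() once per country, routing each call through that country's proxy."""
    countries = list(proxies_by_country)
    results = await asyncio.gather(
        *(fetch(article_number, cc, proxies_by_country[cc]) for cc in countries),
        return_exceptions=True,
    )
    # Keep successful lookups; drop exceptions and countries with no match
    return [r for r in results if isinstance(r, dict)]
```

Because `asyncio.gather` preserves input order, the returned list follows the order of `proxies_by_country`; sort by `price_usd` afterwards if you need a ranking.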

Complete Pipeline

A pipeline that searches a product category across multiple countries and stores results:

async def run_catalog_pipeline(
    search_query: str,
    countries: Optional[list[str]] = None,
    db_path: str = "ikea.db",
    proxy: Optional[str] = None,
    max_results_per_country: int = 50,
):
    """
    Search IKEA for a product across multiple countries and store prices.
    """
    if countries is None:
        countries = list(COUNTRY_CONFIGS.keys())

    conn = init_ikea_db(db_path)

    for country_code in countries:
        country, lang, currency = COUNTRY_CONFIGS[country_code]
        print(f"\nSearching {country_code.upper()}: {search_query}")

        products = await search_ikea_paginated(
            search_query, country, lang,
            max_results=max_results_per_country,
            proxy=proxy,
        )
        print(f"  Found {len(products)} products")

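        for p in products:
            price = p.get("price")
            if price is not None:
                # save_product reads price_local and on_sale, which search results do not carry
                p["price_local"] = float(price)
                p["price_usd"] = round(float(price) * USD_RATES.get(currency, 1.0), 2)
            p["on_sale"] = p.get("price_original") is not None
            save_product(conn, p)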

        # Pause between country queries to respect rate limits
        await asyncio.sleep(random.uniform(3.0, 6.0))

    conn.close()
    print(f"\nPipeline complete. Data saved to {db_path}")


# Run it
asyncio.run(run_catalog_pipeline(
    "bookcase",
    countries=["us", "de", "gb", "se", "pl"],
    proxy=PROXY,
))

Error Handling

import asyncio

async def safe_search(
    query: str,
    country: str,
    lang: str,
    max_retries: int = 3,
    proxy: Optional[str] = None,
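) -> list[dict]:
    """
    Search with retry on failure.

    Note: search_ikea() as written above catches httpx.HTTPError itself and
    returns [], so the except branches below only fire if search_ikea() is
    changed to re-raise. Until then every failure takes the empty-result path.
    """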
    for attempt in range(max_retries):
        try:
            results = await search_ikea(query, country, lang, proxy=proxy)
            if results:
                return results
            # Empty result — may be rate limited
            if attempt < max_retries - 1:
                wait = 2 ** attempt * 10
                print(f"  Empty result on attempt {attempt + 1}, waiting {wait}s")
                await asyncio.sleep(wait)
        except httpx.HTTPStatusError as e:
            if e.response.status_code == 429 and attempt < max_retries - 1:
                await asyncio.sleep(60)
                continue
            raise
        except (httpx.ConnectError, httpx.TimeoutException):
            if attempt < max_retries - 1:
                await asyncio.sleep(10)
                continue
            raise
    return []
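The empty-result wait above doubles on each attempt (10s, 20s, ...). If you run several workers, a capped schedule with jitter avoids having them all retry at the same instant. The helper below is a sketch of that variation, not part of the original retry logic; `backoff_delay` and its default values are assumptions.

```python
import random


def backoff_delay(attempt: int, base: float = 10.0, cap: float = 120.0, jitter: float = 0.25) -> float:
    """Delay in seconds before retry number `attempt` (0-based)."""
    # Exponential growth, clamped so a long outage does not produce hour-long sleeps
    delay = min(cap, base * (2 ** attempt))
    # Spread retries by +/- `jitter` fraction so parallel workers do not retry in lockstep
    return delay * random.uniform(1.0 - jitter, 1.0 + jitter)


# With jitter disabled the schedule is deterministic
schedule = [backoff_delay(n, jitter=0.0) for n in range(5)]
# schedule == [10.0, 20.0, 40.0, 80.0, 120.0]
```

Inside the retry loop this would replace the fixed expression, for example `await asyncio.sleep(backoff_delay(attempt))`.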

Tips and Gotchas

Article numbers and dots. IKEA article numbers sometimes include dots in formatted display (402.821.81) but APIs expect them without dots (40282181). Strip dots before passing to any endpoint.
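A small normalizer keeps this consistent across the codebase. The sketch assumes article numbers are always eight digits, which holds for the examples in this guide but is not guaranteed for every IKEA identifier; both function names are hypothetical.

```python
import re


def normalize_article_number(raw: str) -> str:
    """Strip dots and whitespace; return the bare 8-digit form used in API calls."""
    digits = re.sub(r"[.\s]", "", raw)
    # Assumption: plain article numbers are exactly eight digits
    if not re.fullmatch(r"\d{8}", digits):
        raise ValueError(f"not an 8-digit article number: {raw!r}")
    return digits


def display_article_number(raw: str) -> str:
    """Format as XXX.XXX.XX, the dotted form shown on product pages."""
    digits = normalize_article_number(raw)
    return f"{digits[:3]}.{digits[3:6]}.{digits[6:]}"


api_form = normalize_article_number("402.821.81")   # "40282181"
shown_form = display_article_number("40282181")     # "402.821.81"
```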

Non-English search results. IKEA's search returns results in the local language for non-English locales. For cross-country name matching, compare by article number, not product name string.

Currency conversion. The hardcoded USD rates above are a starting point. For production price comparison, pull live rates from a currency API before each comparison run.
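One detail worth tightening when you switch to live rates: the pipeline code uses `USD_RATES.get(currency, 1.0)`, which silently treats an unknown currency as USD. The sketch below raises instead. Where the rates come from is left open; `to_usd` is a hypothetical helper and the rate table reuses the approximate figures from earlier.

```python
def to_usd(amount: float, currency: str, usd_per_unit: dict[str, float]) -> float:
    """Convert a local-currency amount to USD, failing loudly on unknown currencies."""
    try:
        rate = usd_per_unit[currency]
    except KeyError:
        raise ValueError(f"no USD rate for {currency}") from None
    return round(amount * rate, 2)


# Same approximate figures as USD_RATES above; replace with live rates in production
rates = {"USD": 1.0, "EUR": 1.09, "PLN": 0.25}
eur_in_usd = to_usd(100.0, "EUR", rates)   # 109.0
pln_in_usd = to_usd(200.0, "PLN", rates)   # 50.0
```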

Assembly instruction PDFs. These follow a predictable URL pattern and are hosted on static files without bot protection:

https://www.ikea.com/us/en/assembly_instructions/{article_with_dots}.pdf

Example: https://www.ikea.com/us/en/assembly_instructions/402.821.81.pdf

These can be fetched directly without proxies.

Rate limiting per endpoint. The search API and availability API have separate rate limits. The availability API is more lenient — you can check stock for several stores in quick succession.

Tracking Price Changes Over Time

With periodic scraping, you can build a price history for specific IKEA products. Prices change for promotions, seasonal sales, and currency fluctuations:

def init_ikea_timeseries_db(db_path: str = "ikea_ts.db") -> sqlite3.Connection:
    """Initialize a time-series database for IKEA price tracking."""
    conn = sqlite3.connect(db_path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS price_history (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            article_number TEXT NOT NULL,
            country TEXT NOT NULL,
            name TEXT,
            price_local REAL,
            currency TEXT,
            price_usd REAL,
            on_sale INTEGER DEFAULT 0,
            original_price_local REAL,
            scraped_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        );

        CREATE INDEX IF NOT EXISTS idx_history_article_country
            ON price_history(article_number, country, scraped_at);
    """)
    conn.commit()
    return conn


def record_price_snapshot(
    conn: sqlite3.Connection,
    article_number: str,
    country: str,
    product: dict,
):
    """Store a price observation with timestamp."""
    conn.execute(
        """INSERT INTO price_history
           (article_number, country, name, price_local, currency, price_usd, on_sale)
           VALUES (?,?,?,?,?,?,?)""",
        (
            article_number,
            country,
            product.get("name"),
            product.get("price_local"),
            product.get("currency"),
            product.get("price_usd"),
            int(product.get("on_sale", False)),
        ),
    )
    conn.commit()


def get_price_history(
    conn: sqlite3.Connection,
    article_number: str,
    country: str,
) -> list[dict]:
    """Retrieve full price history for an article in a country."""
    rows = conn.execute(
        """SELECT price_usd, price_local, currency, on_sale, scraped_at
           FROM price_history
           WHERE article_number = ? AND country = ?
           ORDER BY scraped_at ASC""",
        (article_number, country),
    ).fetchall()

    return [
        {
            "price_usd": r[0],
            "price_local": r[1],
            "currency": r[2],
            "on_sale": bool(r[3]),
            "date": r[4],
        }
        for r in rows
    ]
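Once a history exists, a natural next step is to reduce it to the points where the price actually moved. The sketch below works on the list returned by `get_price_history` and compares local prices rather than USD, so exchange-rate drift does not register as a price change. `price_changes` is a hypothetical helper and the sample observations are invented.

```python
def price_changes(history: list[dict]) -> list[dict]:
    """Return one entry per observation whose local price differs from the previous one."""
    changes = []
    previous = None
    for obs in history:
        price = obs["price_local"]
        if price is None:
            continue  # skip observations where no price was captured
        if previous is not None and price != previous:
            pct = round((price / previous - 1) * 100, 1) if previous > 0 else None
            changes.append({"date": obs["date"], "old": previous, "new": price, "pct": pct})
        previous = price
    return changes


# Invented observations for illustration
history = [
    {"price_local": 49.99, "date": "2026-01-01"},
    {"price_local": 49.99, "date": "2026-01-08"},
    {"price_local": 39.99, "date": "2026-01-15"},
]
changes = price_changes(history)
# changes == [{"date": "2026-01-15", "old": 49.99, "new": 39.99, "pct": -20.0}]
```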

Finding Articles with the Biggest Price Spreads

To systematically identify which IKEA products have the highest cross-country price spreads, query your collected data:

def find_biggest_spreads(conn: sqlite3.Connection, min_countries: int = 5) -> list[dict]:
    """Find articles with the largest USD price spread across countries."""
    rows = conn.execute(
        """SELECT
               article_number,
               name,
               COUNT(DISTINCT country) AS country_count,
               MIN(price_usd) AS min_price,
               MAX(price_usd) AS max_price,
               ROUND((MAX(price_usd) / MIN(price_usd) - 1) * 100, 1) AS spread_pct,
               GROUP_CONCAT(country || ':' || ROUND(price_usd, 2), ', ') AS prices
           FROM products
           WHERE price_usd IS NOT NULL AND price_usd > 0  -- avoid dividing by a zero minimum
           GROUP BY article_number
           HAVING country_count >= ?
           ORDER BY spread_pct DESC
           LIMIT 20""",
        (min_countries,),
    ).fetchall()

    return [
        {
            "article_number": r[0],
            "name": r[1],
            "countries": r[2],
            "min_usd": r[3],
            "max_usd": r[4],
            "spread_pct": r[5],
            "prices": r[6],
        }
        for r in rows
    ]


conn = sqlite3.connect("ikea.db")
spreads = find_biggest_spreads(conn, min_countries=5)
for product in spreads[:10]:
    print(f"\n{product['name']} ({product['article_number']})")
    print(f"  Spread: {product['spread_pct']:.1f}% | Min: ${product['min_usd']:.2f} | Max: ${product['max_usd']:.2f}")
    print(f"  Prices: {product['prices']}")

This type of analysis typically surfaces furniture categories (sofas, wardrobes, beds) with the highest spreads because they are bulky and shipping costs vary dramatically by country. Smaller, lighter items like LACK side tables tend to have lower spreads. The POÄNG armchair is a classic high-spread example.
