Your Scraper's TLS Handshake Is Ratting You Out: A Complete Guide to JA3/JA4+ Fingerprinting
You have set up perfect headers. Your User-Agent rotates between Chrome, Firefox, and Safari. You are sending sec-ch-ua hints, Accept-Language headers, even Sec-Fetch-Dest metadata. Your request looks identical to what a real browser sends. And you are still getting blocked on the very first request.
Here is what is actually happening: the site never looked at your HTTP headers. It fingerprinted your TLS handshake before your HTTP request even started. Your beautifully crafted headers are riding inside an encrypted tunnel that already screamed "I am a Python script" during the initial handshake. The anti-bot system made its decision at the transport layer — your application-layer disguise was irrelevant.
This is one of the most common and least understood reasons scrapers fail. I have spent months investigating TLS fingerprinting after debugging a scraper that had perfect headers, rotating residential proxies, randomized timing, and still got blocked 100% of the time on Cloudflare-protected sites. The moment I understood the TLS layer, my success rate went from zero to over 90%. This guide covers everything I learned.
If you are building any kind of web scraper, data collection tool, or automated HTTP client that targets sites with bot protection, understanding TLS fingerprinting is not optional — it is the single most important anti-detection concept you need to master in 2025.
What Is a TLS Fingerprint?
Every HTTPS connection begins with a TLS handshake. Before a single byte of your HTTP request is transmitted, your client and the server negotiate encryption parameters through a series of messages. The very first message your client sends — the ClientHello — contains a wealth of identifying information.
The ClientHello Message
When your HTTP library initiates a TLS connection, it sends a ClientHello that contains:
- TLS version: The maximum TLS version your client supports (typically TLS 1.3)
- Cipher suites: An ordered list of encryption algorithms your client can use
- Extensions: Additional capabilities like Server Name Indication (SNI), supported groups, signature algorithms, ALPN protocols
- Supported groups: The elliptic curves your client supports for key exchange
- Signature algorithms: Which signing algorithms your client accepts
- Compression methods: Usually just "null" in modern clients
The critical insight is that every HTTP library has a unique combination of these parameters. The cipher suite order, extension list, and supported groups are determined by the underlying TLS implementation, not by your application code. You cannot change them by setting HTTP headers.
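You can see this directly in Python: the cipher list a `ssl`-based client offers is set by OpenSSL's defaults, not by your application code. A quick inspection:

```python
import ssl

# The cipher suites a Python client offers come from OpenSSL's defaults,
# shared by requests, httpx, and aiohttp alike. Inspect them directly:
ctx = ssl.create_default_context()
ciphers = [c["name"] for c in ctx.get_ciphers()]

print(f"{len(ciphers)} cipher suites offered by default, e.g.:")
for name in ciphers[:5]:
    print(" ", name)
```

Nothing in `requests` or `httpx` ever touches this list, which is why swapping libraries never changes the fingerprint.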
How JA3 Hashing Works
JA3 (developed by John Althouse, Jeff Atkinson, and Josh Atkins at Salesforce) creates a fingerprint by concatenating five fields from the ClientHello:
JA3 = MD5(TLSVersion,Ciphers,Extensions,EllipticCurves,EllipticCurvePointFormats)
For example, a Python requests library ClientHello might produce:
TLSVersion: 771
Ciphers: 4866-4867-4865-49196-49200-159-52393-52392-52394-49195-49199-158-49188-49192-107-49187-49191-103-49162-49172-57-49161-49171-51-157-156-61-60-53-47-255
Extensions: 0-11-10-35-22-23-13-43-45-51
Curves: 29-23-30-25-24
Point: 0
JA3 Hash: e7d705a3286e19ea42f587b344ee6865
Meanwhile, Chrome 131 produces a completely different hash because it uses BoringSSL (not OpenSSL) and has different cipher preferences:
TLSVersion: 771
Ciphers: 4865-4866-4867-49195-49199-49196-49200-52393-52392-49171-49172-156-157-47-53
Extensions: 0-23-65281-10-11-35-16-5-13-18-51-45-43-27-17513-21
Curves: 29-23-24
Point: 0
JA3 Hash: cd08e31494f9531f560d64c695473da9
These two hashes are completely different. Any server that maintains a database of known JA3 fingerprints can instantly tell that the first connection is a Python script and the second is Chrome.
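The hashing step itself is easy to reproduce. Here is a minimal sketch (field values borrowed from the examples above; a real implementation extracts them from a captured ClientHello):

```python
import hashlib

def ja3_hash(version: int, ciphers: list[int], extensions: list[int],
             curves: list[int], point_formats: list[int]) -> str:
    """JA3: MD5 over the five comma-separated fields, each dash-joined."""
    fields = [
        str(version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    return hashlib.md5(",".join(fields).encode()).hexdigest()

# Truncated field values from the Python example above, for illustration
print(ja3_hash(771, [4866, 4867, 4865], [0, 11, 10], [29, 23], [0]))
```

Note that JA3 hashes the fields in the order the client sent them, so merely reordering the same cipher suites yields a completely different hash.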
JA3S: Server Fingerprinting
JA3S is the server-side counterpart. It fingerprints the server's ServerHello response. Since servers often respond differently depending on the client's capabilities, the JA3/JA3S pair together provides an even more specific identification of both endpoints. This is used more for network security monitoring than bot detection, but it is worth understanding the full picture.
The Fingerprint Landscape in 2025
Here is what common tools and libraries look like to a modern Web Application Firewall:
| Tool / Library | TLS Backend | JA3 Looks Like | Detection Risk |
|---|---|---|---|
| Python requests | OpenSSL (via urllib3) | Python/OpenSSL | Instant block |
| Python httpx | OpenSSL (via h11/httpcore) | Python/OpenSSL | Instant block |
| Python aiohttp | OpenSSL | Python/OpenSSL | Instant block |
| Python urllib3 | OpenSSL | Python/OpenSSL | Instant block |
| Node.js axios/node-fetch | Node.js OpenSSL | Node.js-specific | High |
| Go net/http | Go crypto/tls | Go stdlib | High |
| Rust reqwest | rustls or OpenSSL | Rust-specific | High |
| Java HttpClient | JSSE | Java-specific | High |
| curl (default) | OpenSSL/LibreSSL | curl-specific | High |
| Headless Chrome (Puppeteer) | BoringSSL | Chrome-ish | Medium |
| Headless Chrome (Playwright) | BoringSSL | Chrome-ish | Medium |
| curl_cffi (impersonating) | BoringSSL | Matches target browser | Low |
| Real Chrome browser | BoringSSL | Chrome (authentic) | Very Low |
| Real Firefox browser | NSS | Firefox (authentic) | Very Low |
Notice the pattern: anything built on OpenSSL's default settings gets caught immediately. It is not that OpenSSL is a bad TLS library — it is that its default cipher suite ordering and extension configuration are well-documented and distinctive. Anti-bot vendors maintain databases of thousands of JA3 hashes mapped to specific tools and library versions.
Why All Python HTTP Libraries Look the Same
Python's requests, httpx, and aiohttp all ultimately use OpenSSL for TLS through Python's ssl module. Even though they are different libraries with different APIs, their TLS behavior is identical because they all delegate to the same underlying C library. The JA3 hash is determined by OpenSSL's default configuration, not by the Python library wrapping it.
This means that switching from requests to httpx does not change your TLS fingerprint. You are swapping the HTTP-layer code while keeping the same TLS-layer identity. From a fingerprinting perspective, they are the same client.
# All three produce the SAME JA3 fingerprint:
import requests
import httpx
import aiohttp
# requests uses urllib3 -> OpenSSL
requests.get("https://target.com")
# httpx uses httpcore -> h11/h2 -> OpenSSL
httpx.get("https://target.com")
# aiohttp uses its own connector -> OpenSSL
# async with aiohttp.ClientSession() as session:
# await session.get("https://target.com")
# From the server's perspective, all three connections
# have identical TLS ClientHello messages
Why Headless Chrome Is Not Safe Either
You might think switching to Puppeteer or Playwright solves the problem. After all, they launch a real Chrome binary with a real BoringSSL TLS stack. Better, but not bulletproof.
The Headless Fingerprint Difference
Headless Chrome's JA3 fingerprint is almost identical to regular Chrome, but there are subtle differences:
- TLS extension ordering: In some Chrome versions, headless mode produces a slightly different extension order in the ClientHello. The difference is too small for basic JA3 to catch, but JA4+ and custom fingerprinting pick it up.
- GREASE values: Chrome uses GREASE (Generate Random Extensions And Sustain Extensibility) to insert random cipher suite and extension values. The GREASE patterns can differ between headed and headless instances, and some fingerprinting systems track them.
- ALPN preferences: The Application-Layer Protocol Negotiation extension can differ between headed and headless modes, particularly regarding HTTP/2 vs HTTP/3 preferences.
Version Mismatch Detection
Chrome's JA3 hash changes between versions because cipher preferences and extensions evolve. If your headless Chrome is version 120 but real users are on 131, the version mismatch in the TLS fingerprint is a signal:
# Chrome 120 JA3: abc123...
# Chrome 124 JA3: def456...
# Chrome 131 JA3: ghi789...
# If Cloudflare sees Chrome/131 User-Agent but Chrome/120 JA3,
# that mismatch is a detection signal
This means you need to keep your headless browser updated. A three-month-old Chrome binary has a JA3 that no real user is sending anymore.
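A cheap guard against this class of mismatch is to check that the Chrome major version in your User-Agent agrees with your impersonation target before deploying. A minimal sketch (the function and the target string format are illustrative):

```python
import re

def versions_consistent(user_agent: str, impersonate_target: str) -> bool:
    """Check that the Chrome major version in the User-Agent matches the
    impersonation target (e.g. 'chrome131'), to avoid UA/JA3 mismatches."""
    ua_match = re.search(r"Chrome/(\d+)", user_agent)
    target_match = re.search(r"chrome(\d+)", impersonate_target)
    if not ua_match or not target_match:
        return False
    return ua_match.group(1) == target_match.group(1)

ua = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
      "(KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36")
print(versions_consistent(ua, "chrome131"))  # True
print(versions_consistent(ua, "chrome120"))  # False: the mismatch described above
```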
Beyond JA3: HTTP/2 Fingerprinting
Modern anti-bot systems do not stop at TLS fingerprinting. They also fingerprint your HTTP/2 behavior. When an HTTP/2 connection is established, the client sends a SETTINGS frame with configuration parameters:
SETTINGS Frame:
HEADER_TABLE_SIZE: 65536
ENABLE_PUSH: 0 (or 1)
MAX_CONCURRENT_STREAMS: 1000
INITIAL_WINDOW_SIZE: 6291456
MAX_FRAME_SIZE: 16384
MAX_HEADER_LIST_SIZE: 262144
These values differ between browsers. Chrome, Firefox, and Safari each send distinct SETTINGS values. A Python HTTP/2 client (like httpx with h2) sends different values than any browser. This creates a secondary fingerprint that can be checked alongside JA3.
# Chrome's HTTP/2 SETTINGS (typical):
# HEADER_TABLE_SIZE=65536, INITIAL_WINDOW_SIZE=6291456, MAX_HEADER_LIST_SIZE=262144
# httpx with h2 (typical):
# HEADER_TABLE_SIZE=4096, INITIAL_WINDOW_SIZE=65535, MAX_HEADER_LIST_SIZE=16384
# These differences are detectable and logged by CDN providers
Additionally, browsers send HTTP/2 frames in a specific order with specific priority values (PRIORITY frames, WINDOW_UPDATE timing). This behavior is called the HTTP/2 fingerprint and is tracked by Cloudflare, Akamai, and other CDN/anti-bot providers.
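Conceptually, this kind of detection is just profile matching. A hedged sketch, using the SETTINGS values from the comparison above as a hypothetical profile table:

```python
# Hypothetical profile table; values taken from the comparison above.
KNOWN_H2_PROFILES = {
    "chrome": {"HEADER_TABLE_SIZE": 65536, "INITIAL_WINDOW_SIZE": 6291456,
               "MAX_HEADER_LIST_SIZE": 262144},
    "httpx_h2": {"HEADER_TABLE_SIZE": 4096, "INITIAL_WINDOW_SIZE": 65535,
                 "MAX_HEADER_LIST_SIZE": 16384},
}

def classify_h2_settings(settings: dict) -> str:
    """Match an observed SETTINGS frame against known client profiles."""
    for name, profile in KNOWN_H2_PROFILES.items():
        if all(settings.get(k) == v for k, v in profile.items()):
            return name
    return "unknown"

observed = {"HEADER_TABLE_SIZE": 4096, "INITIAL_WINDOW_SIZE": 65535,
            "MAX_HEADER_LIST_SIZE": 16384}
print(classify_h2_settings(observed))  # httpx_h2
```

Real systems also weigh frame ordering and window-update timing, but the lookup-table core is the same.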
JA4+: The Next Generation
JA4+ is a suite of fingerprinting methods developed by FoxIO that provides much more granular identification than JA3. It includes:
JA4: Improved TLS Client Fingerprinting
JA4 improves on JA3 by:
- Separating cipher suites from extensions, so they can be analyzed independently
- Sorting cipher suites and extensions before hashing, removing ordering as a variable that changes between library versions
- Including the negotiated ALPN protocol
- Using truncated SHA-256 instead of MD5
JA4 format: [type][version][SNI][ciphers_count][extensions_count]_[sorted_ciphers_hash]_[sorted_extensions_hash]
Example: t13d1517h2_8daaf6152771_b0da82dd1658
t = TCP
13 = TLS 1.3
d = destination: domain (vs IP)
15 = 15 cipher suites
17 = 17 extensions
h2 = ALPN: HTTP/2
8daaf6152771 = truncated hash of sorted cipher suites
b0da82dd1658 = truncated hash of sorted extensions
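The construction can be sketched in a few lines. This is a simplified illustration, not a spec-complete JA4 implementation (real JA4 also strips GREASE values and has precise sorting and formatting rules):

```python
import hashlib

def ja4_sketch(tls_version: str, sni_type: str, ciphers: list[int],
               extensions: list[int], alpn: str) -> str:
    """Simplified JA4-style fingerprint: a header of counts, then truncated
    SHA-256 hashes of the SORTED cipher and extension lists."""
    def trunc_sha256(values: list[int]) -> str:
        joined = ",".join(format(v, "04x") for v in sorted(values))
        return hashlib.sha256(joined.encode()).hexdigest()[:12]

    header = f"t{tls_version}{sni_type}{len(ciphers):02d}{len(extensions):02d}{alpn}"
    return f"{header}_{trunc_sha256(ciphers)}_{trunc_sha256(extensions)}"

fp = ja4_sketch("13", "d", [4865, 4866, 4867], [0, 10, 11, 43, 51], "h2")
print(fp)  # header part: t13d0305h2
```

Because the lists are sorted before hashing, two clients offering the same suites in different orders produce the same JA4 hash, which is exactly the stability JA3 lacked.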
JA4H: HTTP Client Fingerprinting
JA4H fingerprints HTTP headers — not just which headers are present, but their exact order. Browsers send headers in a specific order that differs from programmatic HTTP clients:
# Chrome sends headers in this order:
# Host, Connection, sec-ch-ua, sec-ch-ua-mobile, sec-ch-ua-platform,
# Upgrade-Insecure-Requests, User-Agent, Accept, Sec-Fetch-Site, ...
# Python httpx sends headers in this order:
# Host, User-Agent, Accept, Accept-Encoding, Connection, ...
# Even if the header VALUES are identical, the ORDER reveals the client
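The idea reduces to hashing the sequence of header names rather than their values. A simplified, JA4H-inspired sketch:

```python
import hashlib

def header_order_fingerprint(headers: list[str]) -> str:
    """Hash the SEQUENCE of header names (values ignored), similar in
    spirit to JA4H's header-order component."""
    joined = ",".join(h.lower() for h in headers)
    return hashlib.sha256(joined.encode()).hexdigest()[:12]

chrome_order = ["Host", "Connection", "sec-ch-ua", "sec-ch-ua-mobile",
                "User-Agent", "Accept"]
httpx_order = ["Host", "User-Agent", "Accept", "Accept-Encoding", "Connection"]

print(header_order_fingerprint(chrome_order))
print(header_order_fingerprint(httpx_order))
# The same header set sent in a different order also yields a different hash
```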
JA4S, JA4X, JA4SSH
The JA4+ suite also includes:
- JA4S: Server TLS fingerprint (ServerHello analysis)
- JA4X: X.509 certificate fingerprint
- JA4SSH: SSH client/server fingerprint
- JA4T: TCP fingerprint (window size, TTL, options)
Together, these create a multi-dimensional fingerprint that is extremely difficult to spoof comprehensively.
What Actually Works: Practical Solutions
Solution 1: curl_cffi (Best Python Option)
curl_cffi is a Python wrapper around libcurl compiled with BoringSSL. It can impersonate specific browser versions by reproducing their exact TLS ClientHello, including cipher suite order, extension order, GREASE values, and ALPN settings.
from curl_cffi import requests as cffi_requests
import json
class TLSStealthClient:
    """HTTP client with browser-grade TLS fingerprinting."""

    # Supported browser impersonation targets
    BROWSERS = {
        "chrome131": "chrome131",
        "chrome130": "chrome130",
        "chrome124": "chrome124",
        "chrome120": "chrome120",
        "edge131": "edge131",
        "safari18": "safari18_0",
        "firefox132": "firefox132",
    }

    def __init__(
        self,
        browser: str = "chrome131",
        proxy: str | None = None,
        timeout: int = 15,
    ):
        if browser not in self.BROWSERS:
            raise ValueError(
                f"Unknown browser: {browser}. Use one of: {list(self.BROWSERS.keys())}"
            )
        proxy_dict = {"https": proxy, "http": proxy} if proxy else None
        self.session = cffi_requests.Session(
            impersonate=self.BROWSERS[browser],
            proxies=proxy_dict,
            timeout=timeout,
        )
        self.browser = browser

    def get(self, url: str, **kwargs) -> cffi_requests.Response:
        """Send GET request with browser TLS fingerprint."""
        return self.session.get(url, **kwargs)

    def post(self, url: str, **kwargs) -> cffi_requests.Response:
        """Send POST request with browser TLS fingerprint."""
        return self.session.post(url, **kwargs)

    def verify_fingerprint(self) -> dict:
        """Check your TLS fingerprint against a public checker."""
        resp = self.session.get("https://tls.browserleaks.com/json")
        data = resp.json()
        return {
            "ja3_hash": data.get("ja3_hash"),
            "ja3_text": data.get("ja3_text"),
            "user_agent": data.get("user_agent"),
            "akamai_hash": data.get("akamai_hash"),
            "impersonating": self.browser,
        }

    def close(self):
        self.session.close()

    def __enter__(self):
        return self

    def __exit__(self, *args):
        self.close()

# Usage: verify your fingerprint matches Chrome
with TLSStealthClient(browser="chrome131") as client:
    fp = client.verify_fingerprint()
    print(json.dumps(fp, indent=2))

    # Then scrape with the same client
    resp = client.get("https://target-site.com/api/data")
    print(resp.status_code)
Why curl_cffi Works So Well
When you set impersonate="chrome131", curl_cffi does not just change the User-Agent. It reproduces the exact TLS ClientHello that Chrome 131 sends:
- Same cipher suite list in the same order
- Same TLS extensions in the same order
- Same GREASE values and patterns
- Same supported groups and signature algorithms
- Same ALPN protocols (h2, http/1.1)
- Same BoringSSL-specific behaviors
The result is a JA3 hash that is identical to a real Chrome 131 browser. From the server's perspective, there is no TLS-layer difference between your Python script and a real Chrome user.
Combining curl_cffi with Rotating Proxies
For maximum effectiveness, pair curl_cffi with residential proxy rotation:
from curl_cffi import requests as cffi_requests
import random
import time
class StealthScraper:
    """Production scraper with TLS impersonation and proxy rotation."""

    def __init__(self, proxy_url: str):
        self.proxy_url = proxy_url
        self.browsers = ["chrome131", "chrome130", "chrome124"]
        self._new_session()

    def _new_session(self):
        """Create a new session with random browser impersonation."""
        browser = random.choice(self.browsers)
        self.session = cffi_requests.Session(
            impersonate=browser,
            proxies={"https": self.proxy_url, "http": self.proxy_url},
            timeout=15,
        )
        self.request_count = 0

    def get(self, url: str, **kwargs) -> cffi_requests.Response:
        """GET request with automatic session rotation."""
        if self.request_count >= 15:  # Rotate every 15 requests
            self.session.close()
            self._new_session()
        resp = self.session.get(url, **kwargs)
        self.request_count += 1
        return resp

    def close(self):
        self.session.close()

# With ThorData rotating residential proxies
scraper = StealthScraper(
    proxy_url="http://user:[email protected]:9000"
)

# Each request gets a real browser TLS fingerprint + residential IP
resp = scraper.get("https://protected-site.com/data")
print(resp.status_code)
scraper.close()
Using ThorData residential proxies with curl_cffi is particularly effective because you are combining two critical anti-detection layers: genuine browser TLS fingerprints from BoringSSL impersonation plus real residential IP addresses that have high trust scores with CDN providers. This combination defeats both the network-layer (IP reputation) and transport-layer (TLS fingerprint) detection that catches most scrapers.
Solution 2: Use a Real Browser via CDP
If you need full browser rendering anyway, connect to a real (headed) Chrome instance via Chrome DevTools Protocol for a 100% authentic fingerprint:
import subprocess
import websocket
import json
import time
import tempfile
import os
class RealBrowserClient:
    """Control a real Chrome browser for authentic TLS fingerprints."""

    def __init__(self, chrome_path: str | None = None, port: int = 9222):
        self.port = port
        self.chrome_path = chrome_path or self._find_chrome()
        self.process = None
        self.user_data_dir = tempfile.mkdtemp()

    def _find_chrome(self) -> str:
        """Find Chrome binary on the system."""
        candidates = [
            "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
            "/usr/bin/google-chrome",
            "/usr/bin/chromium-browser",
            "C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe",
        ]
        for path in candidates:
            if os.path.exists(path):
                return path
        raise FileNotFoundError("Chrome not found. Specify chrome_path.")

    def start(self):
        """Launch Chrome with remote debugging enabled."""
        self.process = subprocess.Popen([
            self.chrome_path,
            f"--remote-debugging-port={self.port}",
            f"--user-data-dir={self.user_data_dir}",
            "--no-first-run",
            "--no-default-browser-check",
        ])
        time.sleep(3)  # Wait for Chrome to start

    def navigate(self, url: str) -> str:
        """Navigate to a URL and return the page HTML."""
        import httpx

        # Get the WebSocket debugger URL
        resp = httpx.get(f"http://localhost:{self.port}/json")
        pages = resp.json()
        ws_url = pages[0]["webSocketDebuggerUrl"]

        # Connect and navigate
        ws = websocket.create_connection(ws_url)
        ws.send(json.dumps({
            "id": 1,
            "method": "Page.navigate",
            "params": {"url": url},
        }))
        ws.recv()  # Navigation response
        time.sleep(3)  # Wait for page to load

        # Get the page HTML
        ws.send(json.dumps({
            "id": 2,
            "method": "Runtime.evaluate",
            "params": {"expression": "document.documentElement.outerHTML"},
        }))
        result = json.loads(ws.recv())
        html = result["result"]["result"]["value"]
        ws.close()
        return html

    def stop(self):
        """Shut down Chrome."""
        if self.process:
            self.process.terminate()
            self.process.wait()
# Usage
browser = RealBrowserClient()
browser.start()
html = browser.navigate("https://protected-site.com/data")
print(f"Got {len(html)} bytes of HTML")
browser.stop()
The TLS fingerprint is perfectly authentic because it IS a real Chrome browser. The downside is the resource overhead of running a full browser process.
Solution 3: tls-client (Go-based Python Library)
Another option is tls-client, a Python library that uses Go's crypto/tls under the hood to impersonate browsers:
import tls_client
session = tls_client.Session(
    client_identifier="chrome_131",
    random_tls_extension_order=True,
)

# Set headers to match the impersonated browser
session.headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
    "Accept-Encoding": "gzip, deflate, br",
}
resp = session.get("https://protected-site.com")
print(resp.status_code, len(resp.text))
Building a Complete Anti-Detection Stack
TLS fingerprinting is one layer of a multi-layer detection system. Here is how to build a comprehensive anti-detection stack:
from curl_cffi import requests as cffi_requests
import random
import time
import json
from dataclasses import dataclass
@dataclass
class AntiDetectionConfig:
    """Configuration for multi-layer anti-detection."""

    # TLS layer
    browser_impersonation: str = "chrome131"
    rotate_browser_version: bool = True

    # Network layer
    proxy_url: str | None = None
    rotate_proxy_per_request: bool = False

    # HTTP layer
    randomize_header_order: bool = True
    include_sec_ch_headers: bool = True

    # Behavioral layer
    min_delay: float = 2.0
    max_delay: float = 8.0
    max_requests_per_session: int = 20
class AntiDetectionScraper:
    """Multi-layer anti-detection scraper."""

    CHROME_VERSIONS = ["chrome131", "chrome130", "chrome124"]
    USER_AGENTS = {
        "chrome131": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
        "chrome130": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36",
        "chrome124": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
    }

    def __init__(self, config: AntiDetectionConfig):
        self.config = config
        self.request_count = 0
        self._create_session()

    def _create_session(self):
        """Create a new session with fresh fingerprint."""
        browser = (
            random.choice(self.CHROME_VERSIONS)
            if self.config.rotate_browser_version
            else self.config.browser_impersonation
        )
        self.current_browser = browser
        proxy_dict = None
        if self.config.proxy_url:
            proxy_dict = {
                "https": self.config.proxy_url,
                "http": self.config.proxy_url,
            }
        self.session = cffi_requests.Session(
            impersonate=browser,
            proxies=proxy_dict,
            timeout=15,
        )

    def _build_headers(self, url: str) -> dict:
        """Build browser-consistent headers for the current impersonation."""
        ua = self.USER_AGENTS.get(self.current_browser, self.USER_AGENTS["chrome131"])
        headers = {
            "User-Agent": ua,
            "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
            "Accept-Language": "en-US,en;q=0.9",
            "Accept-Encoding": "gzip, deflate, br",
            "DNT": "1",
            "Upgrade-Insecure-Requests": "1",
            "Connection": "keep-alive",
        }
        if self.config.include_sec_ch_headers and "chrome" in self.current_browser:
            version = self.current_browser.replace("chrome", "")
            headers.update({
                "Sec-Ch-Ua": f'"Google Chrome";v="{version}", "Chromium";v="{version}", "Not_A Brand";v="24"',
                "Sec-Ch-Ua-Mobile": "?0",
                "Sec-Ch-Ua-Platform": '"Windows"',
                "Sec-Fetch-Dest": "document",
                "Sec-Fetch-Mode": "navigate",
                "Sec-Fetch-Site": "none",
                "Sec-Fetch-User": "?1",
            })
        return headers

    def get(self, url: str, **kwargs) -> cffi_requests.Response:
        """Make a GET request with full anti-detection measures."""
        # Rotate session if needed
        if self.request_count >= self.config.max_requests_per_session:
            self.session.close()
            self._create_session()
            self.request_count = 0

        # Add delay between requests
        if self.request_count > 0:
            delay = random.triangular(
                self.config.min_delay,
                self.config.max_delay,
                self.config.min_delay + 1.0,
            )
            time.sleep(delay)

        headers = self._build_headers(url)
        if "headers" in kwargs:
            headers.update(kwargs.pop("headers"))
        resp = self.session.get(url, headers=headers, **kwargs)
        self.request_count += 1
        return resp

    def close(self):
        self.session.close()
# Production usage with ThorData proxies
config = AntiDetectionConfig(
    proxy_url="http://user:[email protected]:9000",
    rotate_browser_version=True,
    min_delay=3.0,
    max_delay=10.0,
    max_requests_per_session=15,
)

scraper = AntiDetectionScraper(config)
urls = [
    "https://target-site.com/page/1",
    "https://target-site.com/page/2",
    "https://target-site.com/page/3",
]
for url in urls:
    resp = scraper.get(url)
    print(f"{url}: {resp.status_code} ({len(resp.text)} bytes)")
scraper.close()
How to Verify Your Fingerprint
Before deploying any scraper, verify what you actually look like to the target server. Do not assume — test.
Check Your JA3 Hash
from curl_cffi import requests as cffi_requests
import json
def check_fingerprint(impersonate: str = "chrome131", proxy: str | None = None):
    """Check your TLS fingerprint against multiple services."""
    proxy_dict = {"https": proxy, "http": proxy} if proxy else None
    session = cffi_requests.Session(
        impersonate=impersonate,
        proxies=proxy_dict,
    )

    # Service 1: BrowserLeaks TLS check
    try:
        resp = session.get("https://tls.browserleaks.com/json")
        tls_data = resp.json()
        print("=== BrowserLeaks TLS ===")
        print(f"JA3 Hash: {tls_data.get('ja3_hash')}")
        print(f"JA3 Text: {tls_data.get('ja3_text', '')[:80]}...")
        print(f"Protocol: {tls_data.get('tls_version')}")
        print()
    except Exception as e:
        print(f"BrowserLeaks failed: {e}")

    # Service 2: Scrapfly fingerprint check
    try:
        resp = session.get("https://tools.scrapfly.io/api/fp/ja3")
        fp_data = resp.json()
        print("=== Scrapfly JA3 ===")
        print(f"JA3 Hash: {fp_data.get('ja3_digest')}")
        print(f"JA3N Hash: {fp_data.get('ja3n_digest')}")
        print()
    except Exception as e:
        print(f"Scrapfly failed: {e}")

    session.close()

# Compare Python default vs browser impersonation
print("--- Default Python (httpx) ---")
import httpx
try:
    r = httpx.get("https://tls.browserleaks.com/json")
    d = r.json()
    print(f"JA3 Hash: {d.get('ja3_hash')}")
except Exception as e:
    print(f"Error: {e}")
print()

print("--- curl_cffi Chrome 131 ---")
check_fingerprint("chrome131")
Monitor for Fingerprint Changes
Browser TLS fingerprints change between versions. Set up monitoring to ensure your impersonation stays current:
from datetime import datetime, timezone

KNOWN_FINGERPRINTS = {
    "chrome131": {
        "ja3_hash": None,  # Will be populated on first run
        "last_checked": None,
    },
    "chrome130": {
        "ja3_hash": None,
        "last_checked": None,
    },
}

def update_fingerprint_database(browser: str):
    """Check and record the current JA3 for a browser impersonation."""
    from curl_cffi import requests as cffi_requests

    session = cffi_requests.Session(impersonate=browser)
    resp = session.get("https://tls.browserleaks.com/json")
    data = resp.json()
    session.close()

    current_hash = data.get("ja3_hash")
    stored = KNOWN_FINGERPRINTS.get(browser, {})
    if stored.get("ja3_hash") and stored["ja3_hash"] != current_hash:
        print(f"WARNING: {browser} fingerprint changed!")
        print(f"  Old: {stored['ja3_hash']}")
        print(f"  New: {current_hash}")
        print("  -> Update your curl_cffi library")

    KNOWN_FINGERPRINTS[browser] = {
        "ja3_hash": current_hash,
        "last_checked": datetime.now(timezone.utc).isoformat(),
    }
    return current_hash
Error Handling for TLS-Related Blocks
When your TLS fingerprint triggers a block, you need to detect and handle it properly:
from enum import Enum
from dataclasses import dataclass
class TLSBlockType(Enum):
    FINGERPRINT_MISMATCH = "fingerprint_mismatch"
    CDN_CHALLENGE = "cdn_challenge"
    WAF_BLOCK = "waf_block"
    RATE_LIMIT = "rate_limit"
    CLEAN = "clean"

@dataclass
class TLSBlockDetection:
    block_type: TLSBlockType
    confidence: float  # 0-1
    detail: str

def detect_tls_block(response) -> TLSBlockDetection:
    """Detect if a response indicates TLS-level blocking."""
    # Cloudflare challenge page
    if response.status_code == 403 and "cf-chl-bypass" in response.text:
        return TLSBlockDetection(
            TLSBlockType.CDN_CHALLENGE,
            0.95,
            "Cloudflare challenge page - likely TLS fingerprint mismatch",
        )

    # Cloudflare 1020 error
    if "error code: 1020" in response.text:
        return TLSBlockDetection(
            TLSBlockType.WAF_BLOCK,
            0.9,
            "Cloudflare 1020 Access Denied - WAF rule triggered",
        )

    # Akamai bot detection
    if response.status_code == 403 and "akamai" in response.headers.get("server", "").lower():
        return TLSBlockDetection(
            TLSBlockType.FINGERPRINT_MISMATCH,
            0.85,
            "Akamai 403 - likely bot detection via TLS/HTTP fingerprint",
        )

    # Generic 403 with empty or minimal body
    if response.status_code == 403 and len(response.text) < 500:
        return TLSBlockDetection(
            TLSBlockType.WAF_BLOCK,
            0.7,
            "Generic 403 with minimal body - possible fingerprint block",
        )

    # 429 rate limit
    if response.status_code == 429:
        return TLSBlockDetection(
            TLSBlockType.RATE_LIMIT,
            0.9,
            "Rate limited - may be IP or fingerprint based",
        )

    return TLSBlockDetection(TLSBlockType.CLEAN, 1.0, "No block detected")
def handle_tls_block(detection: TLSBlockDetection) -> str:
    """Return recommended action for a detected block."""
    actions = {
        TLSBlockType.FINGERPRINT_MISMATCH: (
            "Switch to curl_cffi with browser impersonation. "
            "Your current TLS fingerprint is being detected."
        ),
        TLSBlockType.CDN_CHALLENGE: (
            "The CDN is serving a JS challenge. Options: "
            "1) Use curl_cffi to pass TLS check, "
            "2) Use Playwright for JS execution, "
            "3) Use residential proxies to reduce challenge frequency."
        ),
        TLSBlockType.WAF_BLOCK: (
            "WAF is blocking this request. Check: "
            "1) TLS fingerprint matches a real browser, "
            "2) Headers are consistent with impersonated browser, "
            "3) Request rate is within human norms."
        ),
        TLSBlockType.RATE_LIMIT: (
            "Rate limited. Increase delay between requests and "
            "rotate to a fresh residential IP."
        ),
    }
    return actions.get(detection.block_type, "No action needed.")
Real-World Use Cases for TLS Fingerprint Awareness
Price Monitoring on Protected E-commerce Sites
from curl_cffi import requests as cffi_requests
from bs4 import BeautifulSoup
import random
import time

def scrape_protected_prices(
    product_urls: list[str],
    proxy_url: str,
) -> list[dict]:
    """Scrape prices from Cloudflare-protected e-commerce sites."""
    session = cffi_requests.Session(
        impersonate="chrome131",
        proxies={"https": proxy_url, "http": proxy_url},
    )
    prices = []
    for url in product_urls:
        resp = session.get(url)

        # Check for blocks (detect_tls_block from the section above)
        detection = detect_tls_block(resp)
        if detection.block_type != TLSBlockType.CLEAN:
            prices.append({"url": url, "error": detection.detail})
            continue

        soup = BeautifulSoup(resp.text, "lxml")

        # Extract price (adapt selectors to target site)
        price_el = soup.select_one("[data-price], .price, .product-price")
        title_el = soup.select_one("h1, .product-title")
        prices.append({
            "url": url,
            "title": title_el.get_text(strip=True) if title_el else "",
            "price": price_el.get_text(strip=True) if price_el else "N/A",
        })
        time.sleep(random.uniform(2, 5))

    session.close()
    return prices
API Scraping Behind Cloudflare
def scrape_api_behind_cloudflare(
    api_url: str,
    params: dict,
    proxy_url: str | None = None,
) -> dict:
    """Access APIs protected by Cloudflare bot management."""
    import time

    proxy_dict = {"https": proxy_url, "http": proxy_url} if proxy_url else None
    session = cffi_requests.Session(
        impersonate="chrome131",
        proxies=proxy_dict,
    )

    # Some APIs require you to first visit the main page to get cookies
    base_url = api_url.split("/api/")[0] if "/api/" in api_url else api_url.rsplit("/", 1)[0]
    session.get(base_url)  # Get cf_clearance cookie
    time.sleep(1)

    # Now make the API call with the Cloudflare cookies
    resp = session.get(api_url, params=params)
    session.close()
    return resp.json() if resp.status_code == 200 else {"error": resp.status_code}
The Practical Takeaway
If you are building scrapers or data collection tools in 2025, here is what you need to know:
- Stop using `requests`/`httpx`/`aiohttp` for protected sites. Their TLS fingerprint is a neon sign saying "I am a script." No amount of header spoofing will fix this.
- Use `curl_cffi` with browser impersonation for most use cases. It is the best effort-to-results ratio. Install it with `pip install curl-cffi` and add `impersonate="chrome131"` to your requests.
- Keep impersonation versions current. A Chrome 120 fingerprint when everyone is on Chrome 131 is suspicious. Update `curl_cffi` regularly and use the latest browser identifier.
- Test your fingerprint before deploying. Use the verification code above. Do not assume — verify that your JA3 hash matches a real browser.
- Layer your approach: TLS fingerprint is necessary but not sufficient. You still need proper headers, realistic timing, and residential proxies from ThorData or similar providers for serious targets. The combination of genuine browser TLS + residential IP + human-like behavior is what gets you past modern anti-bot systems.
- Understand that this is an arms race. JA3 was just the beginning. JA4+, HTTP/2 fingerprinting, and behavioral analysis are all advancing. The developers who understand the full detection stack have a significant advantage over those who only think about HTTP headers.
The anti-bot industry is getting better at this faster than most scrapers adapt. But with the right tools and understanding, you can build scrapers that reliably access even well-protected sites. The key is working at every layer of the stack — not just the HTTP layer that most tutorials focus on.
Further Reading
- How to Scrape Google Search Results Without Getting Blocked — Practical SERP scraping guide
- httpx vs Playwright: When to Use Each — Decision framework for choosing tools
- Residential vs Datacenter Proxies — Why proxy type matters for detection