Scrape LinkedIn Post Engagement Data with Python (2026)
LinkedIn doesn't give you a real API for engagement data. The official API is scoped to recruiting and advertising — useless if you want post-level analytics for competitor research or content strategy work. So you either pay for an expensive SaaS tool, or you scrape it yourself.
This guide covers both major approaches in depth: hitting LinkedIn's internal Voyager API for structured engagement metrics, and scraping public company pages with Playwright for unauthenticated data collection. Both approaches are production-viable with the right anti-detection setup.
What You Can Extract
Via Voyager API (authenticated):

- Reaction counts by type (Like, Celebrate, Support, Love, Insightful, Funny)
- Comment count and full comment threads (text, author, timestamp)
- Share count
- Post text and embedded media references
- Hashtags used in posts
- Post publication timestamp
- Company and author URN identifiers

Via Playwright HTML scraping (public pages):

- Total reaction count (aggregate, not broken down by type)
- Comment count (visible button label)
- Post text preview (truncated at "see more")
- Hashtags parsed from post text
- Post relative timestamps ("3 days ago", "1 week ago")
- Company display name and follower count visible on page

What requires premium access or is blocked:

- Exact follower count via Voyager (visible on public page but not always in API)
- Full audience analytics (impressions, reach) — requires Company Page admin access
- Individual liker identities — available via API but rate-limited heavily
LinkedIn's Anti-Bot Measures
Before touching code, understand what you're up against.
Login walls. Most engagement data requires authentication. For most content types, reaction counts, comment threads, and share counts are not visible to unauthenticated visitors. The exception is company pages, where basic counts are sometimes visible.
The li_at cookie. LinkedIn's session authentication lives in a cookie called li_at. Every authenticated request needs this. It's tied to your account session and expires or rotates after suspicious activity. If you're doing serious volume, burning a scraping account is a real risk.
CSRF tokens. Voyager API requests require a csrf-token header that matches the JSESSIONID cookie value. LinkedIn validates these together server-side. You cannot fake one without the other.
Rate limiting per session. LinkedIn rate-limits at the session level, not just by IP. Even with fresh IPs, hammering requests from the same li_at value triggers 429s and eventually account flags. Conservative pacing (2-3 seconds between requests) is mandatory.
Browser fingerprinting. LinkedIn's frontend JavaScript collects canvas fingerprints, WebGL renderer strings, screen dimensions, font enumeration, and navigator properties. Headless browsers without stealth configuration get flagged within the first page load.
Voyager endpoint protection. Voyager endpoints return 999 status codes if you hit them without the right headers (x-restli-protocol-version, x-li-lang, x-li-page-instance, x-li-track). These headers must be present and structurally valid.
IP reputation. LinkedIn blocks datacenter IP ranges almost entirely. Residential proxies are required for any sustained scraping. Even residential IPs get flagged if they originate from ranges commonly associated with proxy providers.
Approach 1: Voyager API (Authenticated)
The Voyager API is LinkedIn's internal REST layer. It lives at www.linkedin.com/voyager/api/. You need a valid li_at cookie and matching CSRF token extracted from a browser session.
Extracting credentials from your browser:
1. Open LinkedIn in Chrome, log in normally
2. Open DevTools (F12) → Application → Storage → Cookies → www.linkedin.com
3. Copy the value of li_at cookie
4. Copy the value of JSESSIONID cookie (the CSRF token is this value stripped of quotes)
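The quote-stripping in step 4 is easy to get wrong, so it helps to make the rule explicit. A small helper (the function name is mine, and the session value below is made up, not a real LinkedIn token):

```python
def extract_csrf_token(jsessionid: str) -> str:
    """Derive the csrf-token header value from the raw JSESSIONID cookie.

    The cookie value is stored with surrounding double quotes
    (e.g. '"ajax:1234567890"'); the header wants it without them.
    """
    return jsessionid.strip().strip('"')


# Example with a made-up session value:
print(extract_csrf_token('"ajax:1234567890123456789"'))
# ajax:1234567890123456789
```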
```python
import json
import random
import re
import time
from typing import Optional

import requests


class LinkedInVoyager:
    """Client for LinkedIn's internal Voyager API."""

    BASE = "https://www.linkedin.com/voyager/api"

    def __init__(self, li_at: str, jsessionid: str):
        self.session = requests.Session()
        self.session.cookies.set("li_at", li_at, domain=".linkedin.com")
        self.session.cookies.set("JSESSIONID", jsessionid, domain=".linkedin.com")
        # CSRF token is the JSESSIONID value with surrounding quotes stripped
        csrf = jsessionid.strip('"')
        self.session.headers.update({
            "csrf-token": csrf,
            "x-restli-protocol-version": "2.0.0",
            "x-li-lang": "en_US",
            "x-li-page-instance": "urn:li:page:d_flagship3_feed;ABC123",
            "x-li-track": json.dumps({
                "clientVersion": "1.13.5",
                "mpVersion": "1.13.5",
                "osName": "web",
                "timezoneOffset": 0,
                "timezone": "UTC",
                "deviceFormFactor": "DESKTOP",
                "mpName": "voyager-web",
                "displayDensity": 1,
                "displayWidth": 1920,
                "displayHeight": 1080,
            }),
            "user-agent": (
                "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
                "AppleWebKit/537.36 (KHTML, like Gecko) "
                "Chrome/126.0.0.0 Safari/537.36"
            ),
            "accept": "application/vnd.linkedin.normalized+json+2.1",
            "accept-language": "en-US,en;q=0.9",
            "referer": "https://www.linkedin.com/feed/",
            "sec-fetch-dest": "empty",
            "sec-fetch-mode": "cors",
            "sec-fetch-site": "same-origin",
        })
        self._request_count = 0
        self._last_request_time = 0.0

    def _throttle(self, min_delay: float = 2.0, jitter: float = 1.5):
        """Enforce a randomized minimum delay between requests."""
        elapsed = time.time() - self._last_request_time
        required = min_delay + random.uniform(0, jitter)
        if elapsed < required:
            time.sleep(required - elapsed)
        self._last_request_time = time.time()
        self._request_count += 1

    def get_company_posts(
        self,
        company_urn: str,
        count: int = 20,
        start: int = 0,
    ) -> dict:
        """Fetch posts from a company feed."""
        self._throttle()
        url = f"{self.BASE}/feed/updatesV2"
        params = {
            "variables": (
                f"(start:{start},count:{count},"
                f"feedType:COMPANY_FEED,"
                f"companyUrn:urn%3Ali%3Aorganization%3A{company_urn})"
            ),
            "queryId": "voyagerFeedDashUpdates.COMPANY_FEED",
        }
        resp = self.session.get(url, params=params, timeout=15)
        if resp.status_code == 429:
            retry_after = int(resp.headers.get("Retry-After", 60))
            print(f"Rate limited. Waiting {retry_after}s...")
            time.sleep(retry_after)
            return self.get_company_posts(company_urn, count, start)
        resp.raise_for_status()
        return resp.json()

    def parse_post_metrics(self, raw: dict) -> list[dict]:
        """Extract structured post metrics from a Voyager feed response."""
        posts = []
        elements = raw.get("included", [])
        for el in elements:
            if el.get("$type") != "com.linkedin.voyager.feed.render.UpdateV2":
                continue
            social = el.get("socialDetail", {})
            reaction_summaries = social.get("reactionSummaries", [])
            total_reactions = sum(r.get("count", 0) for r in reaction_summaries)
            reaction_breakdown = {
                r.get("reactionType", "UNKNOWN"): r.get("count", 0)
                for r in reaction_summaries
            }
            # Extract post text
            commentary = el.get("commentary", {})
            text_obj = commentary.get("text", {})
            post_text = text_obj.get("text", "")
            # Extract hashtags from text attributes
            hashtags = []
            for seg in text_obj.get("attributesV2", []):
                if seg.get("type") == "HASHTAG":
                    tag = seg.get("text", "").lstrip("#")
                    if tag:
                        hashtags.append(tag.lower())
            # Activity counts
            activity_counts = social.get("totalSocialActivityCounts", {})
            # URN and timestamp
            update_meta = el.get("updateMetadata", {})
            actor = el.get("actor", {})
            time_text = actor.get("subDescription", {}).get("text", "")
            posts.append({
                "urn": update_meta.get("urn", ""),
                "share_urn": update_meta.get("shareUrn", ""),
                "text": post_text,
                "hashtags": hashtags,
                "reactions_total": total_reactions,
                "reaction_breakdown": reaction_breakdown,
                "comments": activity_counts.get("numComments", 0),
                "shares": activity_counts.get("numShares", 0),
                "posted_at_text": time_text,
            })
        return posts

    def paginate_company_posts(
        self,
        company_urn: str,
        max_posts: int = 100,
    ) -> list[dict]:
        """Collect all posts for a company, up to max_posts."""
        all_posts = []
        start = 0
        page_size = 20
        while len(all_posts) < max_posts:
            raw = self.get_company_posts(company_urn, count=page_size, start=start)
            batch = self.parse_post_metrics(raw)
            if not batch:
                break
            all_posts.extend(batch)
            start += page_size
            print(f"  Collected {len(all_posts)} posts so far...")
        return all_posts[:max_posts]

    def get_post_comments(self, post_urn: str, count: int = 20) -> list[dict]:
        """Fetch comments for a specific post URN."""
        self._throttle()
        encoded_urn = requests.utils.quote(post_urn, safe="")
        url = f"{self.BASE}/feed/comments"
        params = {
            "variables": f"(count:{count},start:0,updateUrn:{encoded_urn})",
            "queryId": "voyagerFeedDashComments.FETCH",
        }
        resp = self.session.get(url, params=params, timeout=15)
        if resp.status_code not in (200, 201):
            return []
        data = resp.json()
        comments = []
        for el in data.get("included", []):
            if "commentV2" not in el.get("$type", ""):
                continue
            text = el.get("commentV2", {}).get("text", {}).get("text", "")
            if text:
                comments.append({
                    "text": text,
                    "urn": el.get("entityUrn", ""),
                })
        return comments
```
Approach 2: HTML Scraping with Playwright
For public company pages where you don't want to risk an account, Playwright with stealth settings extracts basic engagement data. You get totals but not reaction-type breakdowns.
```python
import asyncio
import random
import re

from playwright.async_api import async_playwright


async def scrape_company_page(
    company_slug: str,
    proxy_config: dict | None = None,
    max_posts: int = 15,
) -> list[dict]:
    """
    Scrape engagement metrics from a LinkedIn company's posts page.

    company_slug: the URL slug, e.g. 'openai' from linkedin.com/company/openai
    proxy_config: optional dict with 'server', 'username', 'password'
    """
    async with async_playwright() as p:
        launch_kwargs = {
            "headless": True,
            "args": [
                "--disable-blink-features=AutomationControlled",
                "--no-sandbox",
                "--disable-dev-shm-usage",
            ],
        }
        if proxy_config:
            launch_kwargs["proxy"] = proxy_config
        browser = await p.chromium.launch(**launch_kwargs)
        context = await browser.new_context(
            user_agent=(
                "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
                "AppleWebKit/537.36 (KHTML, like Gecko) "
                "Chrome/126.0.0.0 Safari/537.36"
            ),
            viewport={"width": 1280, "height": 900},
            locale="en-US",
            timezone_id="America/New_York",
        )
        # Suppress common automation-detection signals
        await context.add_init_script("""
            Object.defineProperty(navigator, 'webdriver', { get: () => undefined });
            Object.defineProperty(navigator, 'plugins', { get: () => [1, 2, 3] });
            Object.defineProperty(navigator, 'languages', { get: () => ['en-US', 'en'] });
            window.chrome = { runtime: {} };
        """)
        page = await context.new_page()
        url = f"https://www.linkedin.com/company/{company_slug}/posts/"
        await page.goto(url, wait_until="networkidle", timeout=30000)
        await page.wait_for_timeout(3000)

        posts = []
        post_cards = await page.query_selector_all("div[data-id^='urn:li:activity']")

        def parse_count(raw: str) -> int:
            """Parse LinkedIn's abbreviated count format (e.g., '1.2K', '3,456')."""
            if not raw:
                return 0
            raw = raw.strip().replace(",", "")
            match = re.search(r"([\d.]+)\s*([KkMm]?)", raw)
            if not match:
                return 0
            num = float(match.group(1))
            suffix = match.group(2).upper()
            if suffix == "K":
                return int(num * 1000)
            if suffix == "M":
                return int(num * 1_000_000)
            return int(num)

        for card in post_cards[:max_posts]:
            try:
                # Post text
                text_el = await card.query_selector(
                    ".attributed-text-segment-list__content, "
                    "[class*='break-words'], .feed-shared-text"
                )
                text = (await text_el.inner_text()).strip() if text_el else ""
                hashtags = re.findall(r"#(\w+)", text)

                # Reactions
                reaction_el = await card.query_selector(
                    "[aria-label*='reaction'], "
                    "[class*='reactions-count'], "
                    "button[aria-label*='like']"
                )
                reactions_raw = (
                    await reaction_el.get_attribute("aria-label")
                    or await reaction_el.inner_text()
                ) if reaction_el else "0"
                count_match = re.search(r"([\d,.KkMm]+)", reactions_raw or "0")
                reactions = parse_count(count_match.group(1)) if count_match else 0

                # Comments
                comment_el = await card.query_selector(
                    "button[aria-label*='comment'], "
                    "[class*='comments-count']"
                )
                comments_raw = (await comment_el.inner_text()).strip() if comment_el else "0"
                count_match2 = re.search(r"([\d,.KkMm]+)", comments_raw)
                comments = parse_count(count_match2.group(1)) if count_match2 else 0

                # Timestamp (relative)
                time_el = await card.query_selector(
                    "time, [class*='ago'], span[aria-label*='ago']"
                )
                timestamp = (await time_el.inner_text()).strip() if time_el else None

                posts.append({
                    "company_slug": company_slug,
                    "text": text[:500],  # truncate for storage
                    "hashtags": hashtags,
                    "reactions": reactions,
                    "comments": comments,
                    "timestamp_text": timestamp,
                })
                await page.wait_for_timeout(random.randint(800, 1500))
            except Exception as e:
                print(f"Error extracting post card: {e}")
                continue

        await browser.close()
        return posts


# Run with proxy
proxy = {
    "server": "http://proxy.thordata.com:9000",
    "username": "YOUR_USER",
    "password": "YOUR_PASS",
}
posts = asyncio.run(scrape_company_page("openai", proxy_config=proxy, max_posts=10))
```
Proxy and Anti-Detection Setup
LinkedIn aggressively blocks datacenter IPs. A fresh AWS or DigitalOcean IP gets a 999 status or redirects to a challenge page within minutes of any scraping activity. For any sustained scraping, residential proxies are required.
ThorData's residential proxies handle LinkedIn's geo-checks reliably. Their automatic rotation keeps sessions alive longer by preventing the IP-level rate limits that accumulate with a single residential address. Wire them into either approach:
```python
# For Voyager (requests-based)
voyager = LinkedInVoyager(li_at="YOUR_LI_AT", jsessionid="YOUR_JSESSIONID")
voyager.session.proxies = {
    "http": "http://USER:[email protected]:9000",
    "https": "http://USER:[email protected]:9000",
}

# For Playwright: pass this dict as proxy_config to scrape_company_page()
proxy_config = {
    "server": "http://proxy.thordata.com:9000",
    "username": "USER",
    "password": "PASS",
}
```
Best practices:
- Rotate proxies every 50-100 requests
- Randomize request intervals between 2 and 5 seconds for Voyager, 1.5-4 seconds for Playwright
- Never reuse a li_at cookie across multiple proxy IPs in the same session
- Use geo-consistent proxies (match country to your account's registered location)
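The first two practices can be combined into a small rotation helper. This is a sketch of the 50-100 request rotation window described above; the pool entries are placeholders, not real endpoints:

```python
import itertools
import random


class ProxyRotator:
    """Cycle through a proxy pool, switching after a randomized request budget.

    Randomizing the budget (rather than rotating every N requests exactly)
    avoids a detectable fixed-period rotation signature.
    """

    def __init__(self, proxies: list[str], min_requests: int = 50, max_requests: int = 100):
        self._cycle = itertools.cycle(proxies)
        self._min = min_requests
        self._max = max_requests
        self._budget = 0
        self._current = None

    def get(self) -> str:
        """Return the proxy to use for the next request."""
        if self._budget <= 0:
            self._current = next(self._cycle)
            self._budget = random.randint(self._min, self._max)
        self._budget -= 1
        return self._current


# Placeholder pool
rotator = ProxyRotator([
    "http://USER:[email protected]:9000",
    "http://USER:[email protected]:9000",
])
```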
Engagement Rate Formulas
Consistent formulas matter for cross-company comparison:
```python
import math
from collections import defaultdict
from datetime import datetime


def engagement_rate(
    reactions: int,
    comments: int,
    shares: int,
    followers: int,
) -> float:
    """
    Standard engagement rate: total engagements / follower count.
    Returns a percentage (0-100 scale).
    """
    if followers == 0:
        return 0.0
    total = reactions + comments + shares
    return round(total / followers * 100, 4)


def weighted_engagement_rate(
    reactions: int,
    comments: int,
    shares: int,
    followers: int,
) -> float:
    """
    Weighted engagement rate where comments and shares score higher than reactions.
    Weights: reactions=1, comments=3, shares=5.
    """
    if followers == 0:
        return 0.0
    score = reactions * 1 + comments * 3 + shares * 5
    return round(score / followers * 100, 4)


def reaction_diversity_score(reaction_breakdown: dict) -> float:
    """
    Score based on diversity of reaction types, computed as normalized
    Shannon entropy of the reaction distribution.

    A post with varied reactions (Celebrate, Insightful, etc.) scores higher
    than one with only Likes, indicating deeper emotional resonance.
    """
    if not reaction_breakdown:
        return 0.0
    total = sum(reaction_breakdown.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in reaction_breakdown.values():
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    # Normalize by the maximum entropy for the number of types present
    max_types = len(reaction_breakdown)
    max_entropy = math.log2(max_types) if max_types > 1 else 1.0
    return round(entropy / max_entropy, 4) if max_entropy > 0 else 0.0


def posting_frequency(post_timestamps: list[float]) -> float:
    """
    Average posts per week given a list of Unix timestamps.
    Returns 0.0 if fewer than 2 timestamps are provided.
    """
    if len(post_timestamps) < 2:
        return 0.0
    span_days = (max(post_timestamps) - min(post_timestamps)) / 86400
    if span_days < 1:
        return 0.0
    return round(len(post_timestamps) / (span_days / 7), 2)


def best_posting_day(post_timestamps: list[float]) -> str:
    """
    Identify which day of the week the account posts most often.

    Note: with timestamps alone this ranks days by posting volume;
    ranking by engagement would also require per-post reaction counts.
    Returns a day name (Monday-Sunday), or "unknown" for empty input.
    """
    day_posts = defaultdict(list)
    for ts in post_timestamps:
        day_name = datetime.fromtimestamp(ts).strftime("%A")
        day_posts[day_name].append(ts)
    if not day_posts:
        return "unknown"
    return max(day_posts.items(), key=lambda x: len(x[1]))[0]
```
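A quick sanity check of the two rate formulas with made-up numbers (120 reactions, 30 comments, 10 shares, 50,000 followers):

```python
reactions, comments, shares, followers = 120, 30, 10, 50_000

# Standard rate: (120 + 30 + 10) / 50,000 * 100 = 0.32%
er = round((reactions + comments + shares) / followers * 100, 4)

# Weighted rate: (120*1 + 30*3 + 10*5) / 50,000 * 100 = 0.52%
wer = round((reactions * 1 + comments * 3 + shares * 5) / followers * 100, 4)

print(er, wer)
# 0.32 0.52
```

Note how the weighted rate rewards the same post more heavily because a third of its engagement came from comments and shares.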
Hashtag Performance Analysis
Track which hashtags drive higher engagement across posts:
```python
import statistics
from collections import defaultdict


def analyze_hashtags(posts: list[dict]) -> list[dict]:
    """
    Aggregate engagement metrics by hashtag across all posts.
    Returns a list sorted by average reactions, descending.
    """
    hashtag_stats = defaultdict(lambda: {
        "uses": 0,
        "total_reactions": 0,
        "total_comments": 0,
        "total_shares": 0,
        "reaction_counts": [],
    })
    for post in posts:
        for tag in post.get("hashtags", []):
            tag = tag.lower().strip()
            if not tag:
                continue
            hashtag_stats[tag]["uses"] += 1
            hashtag_stats[tag]["total_reactions"] += post.get("reactions_total", 0)
            hashtag_stats[tag]["total_comments"] += post.get("comments", 0)
            hashtag_stats[tag]["total_shares"] += post.get("shares", 0)
            hashtag_stats[tag]["reaction_counts"].append(post.get("reactions_total", 0))
    results = []
    for tag, stats in hashtag_stats.items():
        uses = stats["uses"]
        counts = stats["reaction_counts"]
        results.append({
            "hashtag": tag,
            "uses": uses,
            "avg_reactions": round(stats["total_reactions"] / uses, 1),
            "avg_comments": round(stats["total_comments"] / uses, 1),
            "avg_shares": round(stats["total_shares"] / uses, 1),
            "median_reactions": statistics.median(counts) if counts else 0,
            "max_reactions": max(counts) if counts else 0,
        })
    return sorted(results, key=lambda x: x["avg_reactions"], reverse=True)
```
SQLite Storage Schema
```python
import json
import sqlite3
from datetime import datetime, timezone


def init_linkedin_db(db_path: str = "linkedin_engagement.db") -> sqlite3.Connection:
    """Initialize the LinkedIn engagement tracking database."""
    conn = sqlite3.connect(db_path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS companies (
            slug TEXT PRIMARY KEY,
            name TEXT,
            follower_count INTEGER,
            description TEXT,
            industry TEXT,
            scraped_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        );

        CREATE TABLE IF NOT EXISTS posts (
            urn TEXT PRIMARY KEY,
            company_slug TEXT NOT NULL,
            text TEXT,
            hashtags TEXT,            -- JSON array
            reactions_total INTEGER DEFAULT 0,
            reaction_breakdown TEXT,  -- JSON object
            comments INTEGER DEFAULT 0,
            shares INTEGER DEFAULT 0,
            engagement_rate REAL,
            weighted_er REAL,
            posted_at_text TEXT,
            scraped_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
            FOREIGN KEY (company_slug) REFERENCES companies(slug)
        );

        CREATE TABLE IF NOT EXISTS hashtag_stats (
            company_slug TEXT NOT NULL,
            hashtag TEXT NOT NULL,
            uses INTEGER,
            avg_reactions REAL,
            avg_comments REAL,
            computed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
            PRIMARY KEY (company_slug, hashtag)
        );

        CREATE INDEX IF NOT EXISTS idx_posts_company ON posts(company_slug);
        CREATE INDEX IF NOT EXISTS idx_posts_engagement ON posts(engagement_rate DESC);
        CREATE INDEX IF NOT EXISTS idx_hashtag_stats_tag ON hashtag_stats(hashtag);
    """)
    conn.commit()
    return conn


def upsert_post(
    conn: sqlite3.Connection,
    post: dict,
    company_slug: str,
    follower_count: int = 0,
):
    """Insert or update a post with computed engagement metrics."""
    er = engagement_rate(
        post.get("reactions_total", 0),
        post.get("comments", 0),
        post.get("shares", 0),
        follower_count,
    )
    wer = weighted_engagement_rate(
        post.get("reactions_total", 0),
        post.get("comments", 0),
        post.get("shares", 0),
        follower_count,
    )
    conn.execute("""
        INSERT OR REPLACE INTO posts
            (urn, company_slug, text, hashtags, reactions_total,
             reaction_breakdown, comments, shares, engagement_rate,
             weighted_er, posted_at_text, scraped_at)
        VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
    """, (
        post.get("urn") or post.get("text", "")[:50],  # fallback key for HTML-scraped posts
        company_slug,
        post.get("text"),
        json.dumps(post.get("hashtags", [])),
        post.get("reactions_total", post.get("reactions", 0)),
        json.dumps(post.get("reaction_breakdown", {})),
        post.get("comments", 0),
        post.get("shares", 0),
        er,
        wer,
        post.get("posted_at_text"),
        datetime.now(timezone.utc).isoformat(),
    ))
    conn.commit()


def get_top_posts(
    conn: sqlite3.Connection,
    company_slug: str,
    metric: str = "engagement_rate",
    limit: int = 10,
) -> list:
    """Get the top-performing posts for a company, sorted by metric."""
    # The column name is interpolated into the SQL string, so it must be
    # checked against a whitelist; it cannot be a bound parameter.
    valid_metrics = {"engagement_rate", "weighted_er", "reactions_total", "comments"}
    if metric not in valid_metrics:
        metric = "engagement_rate"
    return conn.execute(f"""
        SELECT text, hashtags, reactions_total, comments, shares, {metric}
        FROM posts
        WHERE company_slug = ?
        ORDER BY {metric} DESC
        LIMIT ?
    """, (company_slug, limit)).fetchall()
```
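The whitelist in get_top_posts exists because an ORDER BY column name cannot be passed as a bound parameter; it must be validated before string interpolation. A minimal standalone demonstration of the pattern, using an in-memory database and a toy schema of my own (table and column names here are illustrative only):

```python
import sqlite3


def top_rows(conn: sqlite3.Connection, column: str, limit: int = 3) -> list:
    """Sort by a caller-chosen column, safely.

    The column name is checked against an explicit allow-list before being
    interpolated; anything else (including injection attempts) is rejected.
    """
    allowed = {"score", "views"}
    if column not in allowed:
        raise ValueError(f"unsupported sort column: {column}")
    return conn.execute(
        f"SELECT name, {column} FROM demo ORDER BY {column} DESC LIMIT ?",
        (limit,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (name TEXT, score REAL, views INTEGER)")
conn.executemany("INSERT INTO demo VALUES (?, ?, ?)",
                 [("a", 1.0, 10), ("b", 3.0, 5), ("c", 2.0, 8)])

print(top_rows(conn, "score"))
# [('b', 3.0), ('c', 2.0), ('a', 1.0)]
```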
Use Cases and Applications
Content strategy optimization. Collect 6-12 months of posts from your own company page using the Voyager approach (with your own li_at). Run the hashtag analysis and engagement rate calculations. You will quickly see which content formats (native video, carousels, text-only) and which topic clusters drive the highest weighted engagement. Adjust your posting calendar around what the data shows, not gut feeling.
Competitor benchmarking. Track 5-10 competitor companies weekly. Measure their posting frequency, average engagement rate per format type, and top-performing hashtags. Spot inflection points — when a competitor's engagement rate jumps significantly, dig into what changed in their content mix. Often it is a new content type or a campaign that you can analyze and adapt.
Influencer and partner identification. Pull posts mentioning specific hashtags or topics in your niche using the Voyager hashtag feed endpoint. Rank post authors by weighted engagement rate rather than follower count. High engagement rate on a smaller following often signals a more authentic audience — more valuable for partnership decisions than raw follower numbers.
Sales intelligence. Track posts from prospect companies. Spikes in hiring announcements, product launches, or executive commentary indicate buying signals. A company posting heavily about a pain point you solve is a warm outreach opportunity.
Industry benchmarking. Collect post-level data across 50-100 companies in a vertical. Build a benchmark report showing industry-average engagement rates, top hashtag themes, and posting frequency norms. This kind of benchmark content performs well as gated content or as a foundation for thought leadership.
Ethical Considerations and Rate Limits
LinkedIn's Terms of Service prohibit automated scraping. Understand the risk profile before running this in production:
- Accounts used for Voyager scraping can be restricted or terminated without notice
- Scraping public company pages with Playwright carries lower risk than API extraction
- Scraping competitor company pages carries a different risk and ethics profile than mass-scraping personal profiles
Minimum safe delays: 2.5 seconds between Voyager calls with 1-2 seconds of additional jitter; 1.5 seconds between Playwright page interactions. If you see 429 responses, back off for at least 60 seconds before retrying.
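The fixed 60-second back-off generalizes to exponential back-off with jitter for repeated 429s. A sketch; the doubling schedule and 15-minute cap are my choices, not LinkedIn-documented values:

```python
import random
import time


def backoff_delay(attempt: int, base: float = 60.0, cap: float = 900.0) -> float:
    """Delay in seconds before retry number `attempt` (0-based).

    Doubles the base delay each attempt, caps it, then adds up to 10%
    jitter so parallel workers don't retry in lockstep.
    """
    delay = min(base * (2 ** attempt), cap)
    return delay + random.uniform(0, delay * 0.1)


# In a retry loop, after a 429 response:
#     time.sleep(backoff_delay(attempt))
```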
Do not store personally identifiable data on employees or individual users. Do not use scraped engagement data to build shadow profiles or power unsolicited outreach at scale. The techniques documented here are for content analytics, competitive research, and personal strategy work.