
LinkedIn Jobs Data Without the API: Python Scraping Guide 2026

April 17, 2026 · 11 min read
Contents

- The LinkedIn Jobs API problem
- What job data is still public
- Python scraping: requests + BeautifulSoup
- Handling LinkedIn's bot detection
- Output schema
- When to use a managed scraper
- Conclusion

LinkedIn used to expose a proper Jobs API as part of its Partner Program. As of 2026 that door is closed to almost everyone. The /v2/jobs and /v2/jobSearch endpoints are Partnership-only, and LinkedIn approves a handful of integrations per year -- ATS vendors, Microsoft-owned products, a short list of recruiting platforms. For anyone else, the answer is "no, and don't ask again."

But job posts themselves are still public pages. linkedin.com/jobs/view/<id> and the guest-accessible /jobs/search endpoint both render without a login, and both are scrape-friendly if you send the right headers. Here is what works in April 2026.

The LinkedIn Jobs API problem

The official LinkedIn Talent Solutions developer portal lists three relevant products, all of them gated behind partner approval.

In practice this means: if you want to build a job aggregator, salary comparison tool, recruiting dashboard, or any product that needs bulk job data, the official path is effectively closed. You have three real options -- partner with a data vendor, license feeds from aggregators like Adzuna, or scrape public pages yourself.

Legal note: The hiQ Labs v. LinkedIn ruling (9th Circuit, 2022) held that scraping publicly accessible LinkedIn data is not a CFAA violation. But LinkedIn's Terms of Service still prohibit scraping, and LinkedIn has successfully pursued contract-based claims. Research and personal use are widely tolerated; commercial redistribution is higher risk. Consult a lawyer for production use cases.

What job data is still public

The guest-accessible endpoints return structured HTML with most fields you actually need: job title, company, location, posted date, the job ID, and a canonical URL, plus the full description once you fetch the detail page.

What requires login: applicant counts beyond the first page, employer-specific filters like "Easy Apply only," the company's full headcount, and any salary estimate marked "based on member profiles."

Python scraping: requests + BeautifulSoup

The guest pagination endpoint is the cleanest entry point. Each request returns up to 25 job cards as HTML, and the start parameter paginates in multiples of 25.

# Managed actor call — skip guest tokens, rotating proxies, and brittle selectors
from apify_client import ApifyClient

client = ApifyClient('YOUR_APIFY_TOKEN')
run = client.actor('cryptosignals/linkedin-jobs-scraper').call(
    run_input={'keywords': 'python developer', 'location': 'Remote', 'maxItems': 100}
)

for item in client.dataset(run['defaultDatasetId']).iterate_items():
    print(item)
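
If you would rather call the guest endpoint yourself, here is a minimal DIY sketch. The `jobs-guest` endpoint path and the CSS class names are assumptions drawn from the guest pages described above; verify them against the live markup before relying on them.

```python
# DIY sketch of the guest pagination endpoint. The endpoint path and the
# CSS class names below are assumptions; check them against the live HTML.
import time
from urllib.parse import urlencode

import requests
from bs4 import BeautifulSoup

GUEST_SEARCH = "https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/search"

HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
    ),
    "Accept-Language": "en-US,en;q=0.9",
}


def build_page_url(keywords: str, location: str, start: int = 0) -> str:
    """Each page holds up to 25 job cards; `start` paginates in steps of 25."""
    query = urlencode({"keywords": keywords, "location": location, "start": start})
    return f"{GUEST_SEARCH}?{query}"


def _text(node):
    return node.get_text(strip=True) if node else None


def parse_cards(html: str) -> list:
    """Extract title/company/location/url from one page of job cards."""
    soup = BeautifulSoup(html, "html.parser")
    jobs = []
    for card in soup.select("div.base-card"):
        link = card.select_one("a.base-card__full-link")
        jobs.append({
            "title": _text(card.select_one("h3.base-search-card__title")),
            "company": _text(card.select_one("h4.base-search-card__subtitle")),
            "location": _text(card.select_one("span.job-search-card__location")),
            "url": link["href"] if link else None,
        })
    return jobs


if __name__ == "__main__":
    for start in range(0, 100, 25):  # four pages of 25 cards
        resp = requests.get(build_page_url("python developer", "Remote", start),
                            headers=HEADERS, timeout=15)
        resp.raise_for_status()
        for job in parse_cards(resp.text):
            print(job)
        time.sleep(3)  # stay under ~20 req/min per IP
```

In practice the endpoint tends to return an empty body past the last page, so a real loop should also stop on blank responses.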

For the full job description, follow the url field to linkedin.com/jobs/view/<id>. The detail page embeds a JSON-LD block with the full description, employment type, and posted date:

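
That extraction can be sketched as follows, assuming the detail page embeds a schema.org JobPosting block in a `<script type="application/ld+json">` tag:

```python
# Sketch: pull the schema.org JobPosting JSON-LD block out of a detail page.
import json

from bs4 import BeautifulSoup


def extract_job_posting(html: str):
    """Return selected fields from the first JobPosting JSON-LD block, or None."""
    soup = BeautifulSoup(html, "html.parser")
    for script in soup.find_all("script", type="application/ld+json"):
        try:
            data = json.loads(script.string or "")
        except json.JSONDecodeError:
            continue
        if isinstance(data, dict) and data.get("@type") == "JobPosting":
            return {
                "title": data.get("title"),
                "description": data.get("description"),
                "employmentType": data.get("employmentType"),
                "postedDate": data.get("datePosted"),
                "validThrough": data.get("validThrough"),
            }
    return None
```

Feed it the body of a GET against linkedin.com/jobs/view/&lt;id&gt;, sent with the same browser-like headers you use for search.
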
Tip: The JSON-LD block is populated by LinkedIn's SEO team for Google Jobs indexing. It is the most stable surface on the page -- the surrounding DOM gets reshuffled every few months, but the JSON-LD schema has been unchanged since 2020.

Handling LinkedIn's bot detection

LinkedIn's anti-bot layer (Cloudflare Bot Management plus LinkedIn's own liap cookie tracking) is strict. Practical tactics that still work in 2026:

  1. Rotate User-Agents per session. Use a pool of 10-20 real desktop Chrome/Safari UAs. Mobile UAs get served a different HTML that breaks selectors.
  2. Send real browser headers: Accept, Accept-Language, Accept-Encoding, Sec-Fetch-*. Missing Sec-Fetch headers are the fastest way to get flagged.
  3. Rate limit to ~20 req/min per IP. Above that, expect 429s within a few minutes.
  4. Use residential proxies. Datacenter IPs from AWS, GCP, Azure are blocked wholesale -- LinkedIn maintains an internal block list and updates it daily.
  5. Back off on 999 responses. LinkedIn uses HTTP 999 as a soft-ban signal. Sleep for 10+ minutes on the first 999, rotate IP, then retry.
  6. Persist cookies per session. The liap, bcookie, and JSESSIONID cookies build a reputation -- a session with a 5-minute browsing history before it starts hitting job endpoints looks much more human.
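
The list above can be condensed into a small session helper. This is a sketch, not a guarantee: the UA strings are examples, and the pacing and backoff values simply mirror the numbers in the list.

```python
# Sketch of the tactics above: per-session desktop UA, real browser headers,
# ~20 req/min pacing, and a long backoff on LinkedIn's HTTP 999 soft ban.
# The UA strings and timing values are illustrative starting points.
import random
import time

import requests

UA_POOL = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.4 Safari/605.1.15",
]


def make_session() -> requests.Session:
    """One desktop UA per session; cookies (liap, bcookie, JSESSIONID) persist."""
    s = requests.Session()
    s.headers.update({
        "User-Agent": random.choice(UA_POOL),
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Language": "en-US,en;q=0.9",
        "Accept-Encoding": "gzip, deflate, br",
        "Sec-Fetch-Dest": "document",
        "Sec-Fetch-Mode": "navigate",
        "Sec-Fetch-Site": "none",
    })
    return s


def polite_get(session, url, min_interval=3.0, soft_ban_sleep=600):
    """GET at ~20 req/min; on HTTP 999, back off and retry once."""
    time.sleep(min_interval)
    resp = session.get(url, timeout=15)
    if resp.status_code == 999:  # soft-ban signal
        time.sleep(soft_ban_sleep)  # rotate proxy IP here if you have a pool
        resp = session.get(url, timeout=15)
    return resp
```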

Output schema

A minimal usable schema for downstream analysis -- indexing into a search DB, feeding an alert system, training a salary model:

{
  "jobId": "3847291052",
  "title": "Senior Python Engineer",
  "company": "Stripe",
  "location": "Berlin, Germany (Hybrid)",
  "postedDate": "2026-04-15",
  "url": "https://www.linkedin.com/jobs/view/senior-python-engineer-at-stripe-3847291052"
}

For richer records, merge the search-result row with the JSON-LD block from the detail page -- that gives you description, employmentType, validThrough, and structured jobLocation.address (country, region, locality, postal code).
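
A sketch of that merge, using the search-card dict from the schema above plus standard schema.org JobPosting keys; note that in the wild `jobLocation` can also be a list, which this sketch does not handle.

```python
# Sketch: enrich a search-result record with JSON-LD fields from the detail page.
def merge_job_record(card: dict, json_ld: dict) -> dict:
    """Search-card fields keep identity; JSON-LD contributes the rich fields."""
    addr = (json_ld.get("jobLocation") or {}).get("address") or {}
    return {
        **card,
        "description": json_ld.get("description"),
        "employmentType": json_ld.get("employmentType"),
        "validThrough": json_ld.get("validThrough"),
        "country": addr.get("addressCountry"),
        "region": addr.get("addressRegion"),
        "locality": addr.get("addressLocality"),
        "postalCode": addr.get("postalCode"),
    }
```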

Field         Source                   Always present?
jobId         Search card URL          Yes
title         Search card / JSON-LD    Yes
company       Search card / JSON-LD    Yes
location      Search card              Yes
postedDate    time[datetime] attr      Usually
description   JSON-LD on detail page   Only after detail fetch
salary        Rarely in JSON-LD        <10% of jobs

When to use a managed scraper

The DIY path works for small volumes -- a few hundred jobs per day, one keyword/location combination, best-effort freshness. Above that, the operational overhead of proxy rotation, cookie warmup, 999-response handling, and selector maintenance becomes a real engineering project.

Our LinkedIn Jobs Scraper on Apify handles all of it. You pass keywords, locations, company names, or a raw search URL, and it returns structured JSON: job ID, title, company, location, posted date, description, employment type, and seniority. Pricing is pay-per-result, so a one-time pull of 5,000 listings does not commit you to a monthly contract.

Approach                    Cost                Volume       Reliability
Partnership API             $8k+/yr             Unlimited    High
DIY requests + proxies      Proxy cost          Low-Medium   Brittle
Playwright + residential    Higher proxy cost   Medium       Slower, fewer blocks
Managed actor (Apify)       Pay-per-result      High         High

Conclusion

LinkedIn Jobs is one of the few high-value public datasets that remains technically scrapable in 2026. The guest endpoints haven't moved much in three years, the JSON-LD schema is stable, and the hiQ precedent gives you legal cover for public data. The cost is operational: rate limits, proxy budget, and a willingness to fix selectors when LinkedIn reshuffles the DOM a few times a year.

Build defensively -- persist raw HTML when parsing fails, pin selectors on stable JSON-LD rather than cosmetic class names, and treat 999 responses as a retry signal rather than a failure. Or skip the operational layer entirely and use a managed actor that does it for you.
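
"Persist raw HTML when parsing fails" can be as small as a ten-line wrapper; the directory name here is illustrative.

```python
# Sketch: on a parse failure, save the raw HTML for replay after fixing
# selectors, instead of silently losing the page.
import pathlib
import time

FAILED_DIR = pathlib.Path("failed_pages")  # illustrative location


def parse_or_persist(html: str, parse_fn):
    """Run parse_fn(html); on any exception, dump the HTML and return None."""
    try:
        return parse_fn(html)
    except Exception:
        FAILED_DIR.mkdir(exist_ok=True)
        (FAILED_DIR / f"{int(time.time() * 1000)}.html").write_text(html)
        return None
```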

