Twitter/X Data Without the API: How to Get Tweets in 2026

April 14, 2026 · 10 min read

Contents: The API pricing problem · What data is still public · Approach 1: Nitter instances · Approach 2: X's internal GraphQL · Scraping search results · Full thread extraction · Rate limits and anti-bot handling · Managed scraper option

In February 2023 Twitter killed its free API. The current pricing as of 2026 starts at $100/month for the Basic tier (10,000 tweets), $5,000/month for Pro, and Enterprise quotes in the six figures. For hobby projects, research, or even small commercial tools, those numbers are absurd.

The good news: Twitter (now X) is still a public website. Tweet pages render server-side, the web client hits a JSON GraphQL endpoint, and several community-maintained tools wrap the painful parts. This post walks through what still works in 2026.

The API pricing problem

The official developer portal at developer.x.com lists four tiers:

Free: 1,500 posts/month write-only. Read access is effectively gone.
Basic: $100/mo. 10k tweet reads/month. Useless for anything but tiny tools.
Pro: $5,000/mo. 1M tweets/month, full-archive search.
Enterprise: $42k+/mo, negotiated.

For comparison: the old 2022 Standard v1.1 API gave you ~500k tweets/month for free. Matching that volume today means the Pro tier, because Basic caps out at 10k tweets/month. A single academic research project that used to cost $0 now starts at $60,000/year.
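To make the tier math concrete, here is a quick back-of-the-envelope calculation. It uses only the Basic and Pro numbers from the list above; the function name is just for illustration.

```python
# Monthly price (USD) and monthly read quota for the two paid self-serve tiers.
TIERS = {
    "Basic": (100, 10_000),
    "Pro": (5_000, 1_000_000),
}


def tier_costs(monthly_usd, monthly_tweets):
    """Return (USD per tweet read, USD per year) for one tier."""
    return monthly_usd / monthly_tweets, monthly_usd * 12


for name, (usd, tweets) in TIERS.items():
    per_tweet, per_year = tier_costs(usd, tweets)
    print(f"{name}: ${per_tweet:.3f} per tweet, ${per_year:,} per year")
```

Pro is cheaper per tweet than Basic, but its yearly floor is the $60,000 figure quoted above.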

What data is still public

Everything you see on twitter.com or x.com without logging in is technically public. That includes:

Public user profiles (bio, follower count, join date, pinned tweet)
Individual tweets and their replies/quote counts
Public search results (recent tweets matching a query)
Thread structure (who replied to whom)

What requires login (and therefore real API access): DMs, protected accounts, historical search beyond ~7 days, analytics, and advanced filters. In 2023 X added a login wall for most anonymous browsing, but as of early 2026 the wall is inconsistent -- many routes still serve full content to unauthenticated clients if you send the right headers.

ToS reminder: X's terms prohibit scraping without written permission. Research and personal use are widely tolerated, but commercial products redistributing X data at scale have been sued. Consult a lawyer for commercial deployments.

Approach 1: Nitter instances

Nitter is an open-source alternative frontend for Twitter. It fetches data via the internal guest API and renders clean HTML. For light scraping, hitting a public Nitter instance is the fastest path.
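If you want to try the Nitter route directly, a minimal sketch could look like the following. It assumes the instance serves each tweet body inside a div whose class list includes tweet-content; that matches the Nitter markup I have seen, but treat it as an assumption and check the instance you use. The instance URL in the usage comment is a placeholder, and fetch_timeline and parse_tweets are illustrative names.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class TweetTextParser(HTMLParser):
    """Collects the text inside every <div class="... tweet-content ..."> element."""

    def __init__(self):
        super().__init__()
        self.tweets = []    # finished tweet texts, in page order
        self._depth = 0     # > 0 while we are inside a tweet-content div
        self._chunks = []   # text fragments of the tweet being read

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            self._depth += 1            # nested div inside a tweet body
        elif "tweet-content" in classes:
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                self.tweets.append("".join(self._chunks).strip())

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def parse_tweets(html):
    """Return the tweet texts found in one Nitter timeline page."""
    parser = TweetTextParser()
    parser.feed(html)
    parser.close()
    return parser.tweets


def fetch_timeline(instance, username, timeout=15):
    """Download one profile page from a Nitter instance and parse it."""
    req = Request(f"{instance.rstrip('/')}/{username}",
                  headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(req, timeout=timeout) as resp:
        return parse_tweets(resp.read().decode("utf-8", errors="replace"))


# Offline demo on a hand-written fragment; a real call would be
# fetch_timeline("https://nitter.example", "nasa") with a live instance URL.
SAMPLE = ('<div class="timeline-item"><div class="tweet-content media-body">'
          'Hello <a href="/x">@x</a> world</div></div>')
print(parse_tweets(SAMPLE))
```

The parser is split from the fetch so you can test it against saved HTML, which matters when instances go down mid-run.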

# Managed alternative: an Apify actor call that skips guest tokens, rotating proxies, and brittle selectors
from apify_client import ApifyClient

client = ApifyClient('YOUR_APIFY_TOKEN')
run = client.actor('cryptosignals/twitter-scraper').call(
    run_input={'usernames': ['elonmusk'], 'maxTweets': 100}
)

for item in client.dataset(run['defaultDatasetId']).iterate_items():
    print(item)

Nitter instability: Public Nitter instances come and go. Twitter frequently blocks the guest token endpoint Nitter relies on, causing outages that last days or weeks. Check status.d420.de for live instance health. Running your own instance requires a pool of valid guest tokens, which has become harder since 2024.

Approach 2: X's internal GraphQL

The X web client uses a GraphQL API at https://x.com/i/api/graphql/.... Each operation has a stable-ish hash and a known set of feature flags. You can replicate the calls if you capture a valid guest token and the right headers.
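A rough sketch of that flow is below. Several details are assumptions rather than documented API: the guest activation URL, the screen_name variable for UserByScreenName, and the variables/features query parameters are what the web client has been observed to use and may change. The operation hash and bearer value are placeholders you would capture from the live site.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: guest activation endpoint as observed from the web client.
GUEST_ACTIVATE_URL = "https://api.twitter.com/1.1/guest/activate.json"


def activate_guest_token(bearer, timeout=15):
    """POST to the guest activation endpoint and return the short-lived token."""
    req = Request(GUEST_ACTIVATE_URL, data=b"", method="POST",
                  headers={"Authorization": f"Bearer {bearer}"})
    with urlopen(req, timeout=timeout) as resp:
        return json.load(resp)["guest_token"]


def guest_headers(bearer, guest_token):
    """Headers sent with every GraphQL call made as a guest."""
    return {
        "Authorization": f"Bearer {bearer}",
        "x-guest-token": guest_token,
        "User-Agent": "Mozilla/5.0",
    }


def graphql_url(op_hash, op_name, variables, features):
    """Build the GET URL for one GraphQL operation."""
    query = urlencode({
        "variables": json.dumps(variables, separators=(",", ":")),
        "features": json.dumps(features, separators=(",", ":")),
    })
    return f"https://x.com/i/api/graphql/{op_hash}/{op_name}?{query}"


def call_operation(bearer, guest_token, op_hash, op_name, variables, features,
                   timeout=15):
    """Run one operation and return the decoded JSON body."""
    req = Request(graphql_url(op_hash, op_name, variables, features),
                  headers=guest_headers(bearer, guest_token))
    with urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


# Offline demo: build a URL with a placeholder hash (no network traffic).
print(graphql_url("OPERATION_HASH", "UserByScreenName", {"screen_name": "nasa"}, {}))
```

The feature-flag dictionary is left empty here; in practice the endpoint tends to reject calls that omit flags it expects, so copy the set the browser sends.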

The bearer token the web client sends is the public token embedded in x.com's JavaScript bundle -- not a secret. It has rotated maybe twice since 2016. The guest token endpoint gives you a short-lived session token that goes in the x-guest-token header.

Tip: When X updates a GraphQL operation, the hash in the URL changes. Write a small bootstrap that scrapes the main.js bundle and extracts current hashes rather than hardcoding them.
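One way such a bootstrap might look is sketched here. It assumes the bundle declares each operation as queryId:"...",operationName:"..." in that order, which is how the minified code has looked in the past; if X changes its bundler output, the regex has to change with it. The function name is illustrative.

```python
import re

# Assumption: operations appear as queryId:"<hash>",operationName:"<name>"
# somewhere in the minified bundle.
OPERATION_RE = re.compile(r'queryId:"([^"]+)",operationName:"([^"]+)"')


def extract_operation_hashes(bundle_js):
    """Map operation name -> current hash for every operation in the bundle text."""
    return {name: query_id for query_id, name in OPERATION_RE.findall(bundle_js)}


# Offline demo on a hand-written snippet shaped like the minified bundle.
SAMPLE_BUNDLE = (
    'e.exports={queryId:"abc123",operationName:"TweetDetail",operationType:"query"};'
    'e.exports={queryId:"def456",operationName:"SearchTimeline",operationType:"query"};'
)
print(extract_operation_hashes(SAMPLE_BUNDLE))
```

Run it at startup and again whenever a call starts failing with a not-found style error, then fall back to the last known-good hashes if extraction returns nothing.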

Scraping search results

Search is the most heavily restricted endpoint. The SearchTimeline operation technically works with a guest token, but returns only the last ~7 days and caps at ~100 results. For anything older you need an authenticated session cookie.
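If you do call SearchTimeline as a guest, it helps to encode those two limits in the request builder so you never ask for something the endpoint will silently truncate. The sketch below assumes the operation takes rawQuery, count, querySource, product, and cursor variables; those names come from watching the web client and are not documented. The since: operator is ordinary X search syntax.

```python
from datetime import date, timedelta

GUEST_WINDOW_DAYS = 7    # guests only see roughly the last week
GUEST_RESULT_CAP = 100   # and roughly this many results in total


def search_variables(query, today, count=20, cursor=None):
    """Build the variables dict for one guest SearchTimeline page."""
    since = today - timedelta(days=GUEST_WINDOW_DAYS)
    variables = {
        "rawQuery": f"{query} since:{since.isoformat()}",
        "count": min(count, GUEST_RESULT_CAP),
        "querySource": "typed_query",   # assumed value
        "product": "Latest",            # assumed value: newest first
    }
    if cursor:
        variables["cursor"] = cursor    # pagination token from the previous page
    return variables


print(search_variables("python data science", date(2026, 4, 14), count=500))
```

Passing today in explicitly keeps the function deterministic, which makes it easy to test.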

# Managed alternative: an Apify actor call for search, with no guest tokens, proxies, or selectors to maintain
from apify_client import ApifyClient

client = ApifyClient('YOUR_APIFY_TOKEN')
run = client.actor('cryptosignals/twitter-scraper').call(
    run_input={'searchTerms': ['python data science'], 'maxTweets': 100}
)

for item in client.dataset(run['defaultDatasetId']).iterate_items():
    print(item)

Full thread extraction

Threads are reconstructed via the TweetDetail operation. Pass a tweet ID and you get the full conversation tree. This is often the most valuable data on X -- long threads from domain experts -- and it is fetched in one request.

The response structure has a timeline.instructions[0].entries list. Each entry contains either a single tweet or a "conversationthread" group. Walk these in order to rebuild the thread.
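Walking the entries can be sketched roughly as follows. The nested field names (content.itemContent, content.items[].item.itemContent, tweet_results.result, rest_id, legacy.full_text, legacy.in_reply_to_status_id_str) are assumptions based on responses seen in the wild, so verify them against a captured response before relying on them.

```python
def _extract_tweet(item_content):
    """Pull id, text, and parent id out of one itemContent dict (assumed field names)."""
    result = item_content.get("tweet_results", {}).get("result", {})
    if "rest_id" not in result:
        return None  # cursors and other non-tweet entries carry no tweet payload
    legacy = result.get("legacy", {})
    return {
        "id": result["rest_id"],
        "text": legacy.get("full_text", ""),
        "reply_to": legacy.get("in_reply_to_status_id_str"),
    }


def walk_thread(entries):
    """Flatten the entries list into tweets, preserving the order X returned."""
    tweets = []
    for entry in entries:
        content = entry.get("content", {})
        if entry.get("entryId", "").startswith("conversationthread"):
            # A group: several tweets nested under content.items
            item_contents = [i.get("item", {}).get("itemContent", {})
                             for i in content.get("items", [])]
        else:
            # A single tweet (or a cursor, which _extract_tweet skips)
            item_contents = [content.get("itemContent", {})]
        for item_content in item_contents:
            tweet = _extract_tweet(item_content)
            if tweet:
                tweets.append(tweet)
    return tweets


# Offline demo with a hand-built response fragment.
SAMPLE_ENTRIES = [
    {"entryId": "tweet-1", "content": {"itemContent": {"tweet_results": {"result": {
        "rest_id": "1", "legacy": {"full_text": "Thread start"}}}}}},
    {"entryId": "conversationthread-2", "content": {"items": [
        {"item": {"itemContent": {"tweet_results": {"result": {
            "rest_id": "2",
            "legacy": {"full_text": "Reply", "in_reply_to_status_id_str": "1"}}}}}},
    ]}},
    {"entryId": "cursor-bottom-3", "content": {"value": "opaque-cursor"}},
]
for tweet in walk_thread(SAMPLE_ENTRIES):
    print(tweet)
```

Keeping reply_to on each tweet lets you rebuild the who-replied-to-whom tree later instead of trusting list order alone.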

# Managed alternative: no maintenance, no blocks, no auth headaches.
# This example passes a search query; check the actor's input schema for the tweet-ID field.
from apify_client import ApifyClient

client = ApifyClient('YOUR_APIFY_TOKEN')  # get yours at apify.com

run = client.actor('cryptosignals/twitter-scraper').call(
    run_input={'searchTerms': ['python data science'], 'maxTweets': 100}
)

for item in client.dataset(run['defaultDatasetId']).iterate_items():
    print(item)

Rate limits and anti-bot handling

Guest tokens are rate-limited aggressively. From production tests in early 2026:

~150 requests per guest token before it starts returning 429
Datacenter IPs get flagged within 10-20 requests, often before any real traffic
Residential proxies with aggressive rotation handle ~500-1000 requests/hour safely
The TweetDetail endpoint has tighter limits than UserByScreenName

Practical tactics:

Refresh guest tokens after every ~100 requests, not when you hit 429
Rotate User-Agent and browser fingerprint headers per session
Use residential proxies -- datacenter IPs are a dead end
Back off exponentially on 429 (30s, 60s, 120s) rather than retrying immediately
Cache everything. The same tweet scraped 10 times should be one request
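Three of these tactics (proactive token refresh, exponential backoff, caching) fit in one small wrapper. This is a sketch, not a drop-in client: PoliteFetcher and RateLimited are made-up names, and you supply the functions that perform the HTTP call and mint a new guest token.

```python
import time


class RateLimited(Exception):
    """Raised by the caller-supplied fetch function when it gets HTTP 429."""


def backoff_delays(base=30, retries=3):
    """Exponential delays in seconds: 30, 60, 120 with the defaults."""
    return [base * 2 ** i for i in range(retries)]


class PoliteFetcher:
    def __init__(self, fetch, new_token, refresh_every=100, sleep=time.sleep):
        self._fetch = fetch                  # fetch(url, token) -> response body
        self._new_token = new_token          # new_token() -> fresh guest token
        self._refresh_every = refresh_every
        self._sleep = sleep                  # injectable so tests never wait
        self._token = new_token()
        self._used = 0                       # requests made with the current token
        self._cache = {}                     # url -> response body

    def get(self, url):
        if url in self._cache:
            return self._cache[url]          # same tweet twice = one request
        # Three delayed retries after the first attempt, then give up.
        for delay in backoff_delays() + [None]:
            if self._used >= self._refresh_every:
                # Refresh before the token is exhausted, not after a 429.
                self._token, self._used = self._new_token(), 0
            self._used += 1
            try:
                body = self._fetch(url, self._token)
            except RateLimited:
                if delay is None:
                    raise                    # out of retries
                self._sleep(delay)
                continue
            self._cache[url] = body
            return body
```

The cache here lives in memory; for long-running jobs you would likely back it with disk or a database so restarts do not burn through token budgets again.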

For the proxy infrastructure, ScraperAPI and Bright Data both offer residential rotating IPs with pay-as-you-go pricing -- the residential requirement is the hard constraint on X scraping, and both handle it at scale without you needing to manage the rotation logic.

Managed scraper option

If you just want tweet data without maintaining guest token pools, rotating proxies, and a GraphQL hash bootstrap, a managed scraper is usually the right call.

Our Twitter Scraper actor on Apify handles the operational layer -- you pass usernames, tweet IDs, or search queries and get structured JSON back. Pricing is pay-per-use rather than $100/month minimum, so small projects actually stay small.

For the majority of use cases -- monitoring 50 accounts, extracting a dataset of tweets matching a keyword, building a conversation corpus -- a managed actor is cheaper than the official Basic tier and gives you more data.

Approach	Cost	Volume	Reliability
Official X API Basic	$100/mo min	10k/month	High
Nitter (public instance)	Free	Low	Unstable
GraphQL + guest tokens	Proxy cost	Medium	Brittle
Managed actor (Apify)	Pay-per-use	High	High

Whatever approach you pick, build defensively: hash values rotate, feature flags change, and login walls appear and disappear. Log the raw response when parsing fails, pin known-good operation hashes, and treat the scraper as a moving target rather than a one-and-done integration.
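For the logging advice in particular, a small wrapper around your parser goes a long way. The sketch below uses only the standard library; parse_or_log and the logger name are illustrative.

```python
import json
import logging

logger = logging.getLogger("x_scraper")


def parse_or_log(raw_body, parser):
    """Decode JSON and run parser on it; on failure, log the raw body and return None."""
    try:
        return parser(json.loads(raw_body))
    except (ValueError, KeyError, IndexError, TypeError) as exc:
        # Truncate to 2000 characters so one bad page cannot flood the log.
        logger.error("parse failed (%r); raw response: %.2000s", exc, raw_body)
        return None


# A schema change surfaces as None plus a log line, not as a crash.
print(parse_or_log('{"data": {"id": "1"}}', lambda d: d["data"]["id"]))
print(parse_or_log('{"errors": []}', lambda d: d["data"]["id"]))
```

With the raw body in the log you can diff a failing response against a known-good one and see which field moved.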