Meta's Instagram Graph API requires business verification, a Facebook Page connected to the Instagram account, and approval from Meta for any useful permission scope. Even after approval, you can only read data from accounts you own or manage -- not arbitrary public profiles. For competitive research, influencer analytics, or brand monitoring, the official API is effectively useless.
Instagram's public web pages, however, still render everything you need to know about a public profile. The trick is knowing which internal JSON endpoints the web app uses and how to call them without getting flagged by Meta's anti-bot systems.
- ?__a=1 (the old JSON twin of any profile URL) started requiring authentication.
- The ?__a=1&__d=dis workaround was patched. Anonymous access to profile JSON stopped working.
- What still works is the internal web endpoint called with the x-ig-app-id header. That's what this guide uses.

For any non-private Instagram account, basic profile data and the first 12 posts are still accessible without login.
Private accounts, story views, direct messages, and analytics require auth. Don't attempt to scrape these -- it's both a ToS violation and a CFAA risk in the US.
Instagram's web client uses an internal endpoint at /api/v1/users/web_profile_info/. It returns the full profile JSON including the first 12 posts. No auth required if you send the right headers.
```python
# Managed actor call: skip guest tokens, rotating proxies, and brittle selectors
from apify_client import ApifyClient

client = ApifyClient('YOUR_APIFY_TOKEN')
run = client.actor('cryptosignals/instagram-profile-scraper').call(
    run_input={'usernames': ['nasa'], 'resultsPerUser': 30}
)
for item in client.dataset(run['defaultDatasetId']).iterate_items():
    print(item)
```
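The snippet above goes through a managed actor. For comparison, a direct request to the web_profile_info endpoint might be sketched as follows. This is a minimal sketch, not a confirmed recipe: the host (www.instagram.com) and the username query parameter are assumptions, the function names are hypothetical, and the x-ig-app-id value is not given in this guide, so it is passed in rather than hard-coded.

```python
import json
import urllib.parse
import urllib.request

# Assumed host and query parameter; the guide itself only names the path.
PROFILE_URL = "https://www.instagram.com/api/v1/users/web_profile_info/"


def build_profile_request(username: str, app_id: str, user_agent: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for one public profile."""
    query = urllib.parse.urlencode({"username": username})
    return urllib.request.Request(
        f"{PROFILE_URL}?{query}",
        headers={
            "x-ig-app-id": app_id,      # value the web client sends; supplied by the caller
            "User-Agent": user_agent,   # e.g. a mobile browser UA string
            "Accept": "application/json",
        },
    )


def fetch_profile(username: str, app_id: str, user_agent: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON body. Raises on HTTP or JSON errors."""
    request = build_profile_request(username, app_id, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from the network call makes it easy to inspect the headers in a test, and to swap in a proxy-aware opener later without touching the URL logic.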
If the internal API call starts failing (Meta rotates the endpoint paths periodically), you can fall back to parsing the public profile page HTML. The first 12 posts and basic profile data are embedded in a JavaScript payload called window.__additionalDataLoaded.
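A rough sketch of that fallback, under stated assumptions: it assumes the payload appears in the page as a call of the form window.__additionalDataLoaded('<path>', {...}), with a JSON object as the last argument. The function name is hypothetical, and the exact shape of the call may differ from what Meta currently ships.

```python
import json
from typing import Optional

# Assumed call shape: window.__additionalDataLoaded('<path>', {...json...});
MARKER = "window.__additionalDataLoaded("


def extract_additional_data(html: str) -> Optional[dict]:
    """Return the JSON object passed to __additionalDataLoaded, or None if absent."""
    start = html.find(MARKER)
    if start == -1:
        return None
    # The JSON payload is assumed to start at the first "{" after the marker.
    brace = html.find("{", start + len(MARKER))
    if brace == -1:
        return None
    try:
        # raw_decode parses one JSON value and ignores whatever follows it.
        payload, _ = json.JSONDecoder().raw_decode(html, brace)
    except json.JSONDecodeError:
        return None
    return payload if isinstance(payload, dict) else None
```

Using raw_decode instead of a regular expression avoids having to guess where the object ends, which matters when captions contain braces or parentheses.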
The HTML approach is brittle -- Meta changes the JSON structure on the profile page roughly every 6 months. Keep the parsing logic isolated so you can swap it out when it breaks, and keep the web_profile_info path as your primary route.
The web_profile_info response includes the first 12 posts under edge_owner_to_timeline_media. Each post has a shortcode, caption, like count, comment count, and media URLs.
```python
# Managed actor returns posts already parsed, so there is no need to walk Meta's internal JSON
from apify_client import ApifyClient

client = ApifyClient('YOUR_APIFY_TOKEN')
run = client.actor('cryptosignals/instagram-profile-scraper').call(
    run_input={'usernames': ['nasa'], 'resultsPerUser': 12}
)
for post in client.dataset(run['defaultDatasetId']).iterate_items():
    # Captions can be missing, so fall back to an empty string before slicing
    caption = post.get('caption') or ''
    print(f"[{post['likes']:,} likes] {caption[:60]}")
```
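If you call web_profile_info directly instead, you have to walk the response yourself. The sketch below shows one way to flatten the posts under edge_owner_to_timeline_media. Only that key and shortcode come from this guide; the data.user wrapper and the inner field names (edge_media_to_caption, edge_liked_by, edge_media_to_comment, display_url) are assumptions about the response shape and may not match what the endpoint returns today.

```python
from typing import Any


def extract_posts(profile: dict[str, Any]) -> list[dict[str, Any]]:
    """Flatten timeline edges into simple post dicts. Field names are assumed."""
    user = (profile.get("data") or {}).get("user") or {}
    timeline = user.get("edge_owner_to_timeline_media") or {}
    posts = []
    for edge in timeline.get("edges", []):
        node = edge.get("node", {})
        # Captions are assumed to be nested as a list of edges; posts may have none.
        caption_edges = (node.get("edge_media_to_caption") or {}).get("edges", [])
        caption = caption_edges[0]["node"]["text"] if caption_edges else ""
        posts.append({
            "shortcode": node.get("shortcode"),
            "caption": caption,
            "likes": (node.get("edge_liked_by") or {}).get("count", 0),
            "comments": (node.get("edge_media_to_comment") or {}).get("count", 0),
            "media_url": node.get("display_url"),
        })
    return posts
```

Every lookup uses a default, so a renamed or missing field produces an empty value rather than an exception. That keeps one schema change from taking down the whole run.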
The edge_owner_to_timeline_media.page_info.end_cursor field is a pagination token. To fetch the next batch you'd use the GraphQL endpoint at /graphql/query/ with the cursor and a query hash. That endpoint is heavily rate-limited for logged-out clients in 2026 -- if you need deep post history, use a managed scraper rather than building your own pagination loop.
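To make the cursor mechanics concrete, here is a hedged sketch of reading the token and building the follow-up URL. The has_next_page flag, the host, and the query_hash and variables parameter names (with id, first, and after inside variables) are assumptions about how the GraphQL endpoint is called. The query hash itself is not given in this guide, so it is a caller-supplied argument.

```python
import json
import urllib.parse
from typing import Optional


def next_cursor(timeline: dict) -> Optional[str]:
    """Return the end_cursor if another page is advertised, else None."""
    page_info = timeline.get("page_info") or {}
    if not page_info.get("has_next_page"):  # assumed flag name
        return None
    return page_info.get("end_cursor")


def build_next_page_url(user_id: str, cursor: str, query_hash: str, page_size: int = 12) -> str:
    """Build the GraphQL URL for the next batch of posts (parameter names assumed)."""
    variables = json.dumps(
        {"id": user_id, "first": page_size, "after": cursor},
        separators=(",", ":"),
    )
    query = urllib.parse.urlencode({"query_hash": query_hash, "variables": variables})
    return f"https://www.instagram.com/graphql/query/?{query}"
```

A loop built on these two helpers would stop as soon as next_cursor returns None. Given the rate limits described above, it should also pause between requests.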
- Reels: look for product_type == "clips" on post nodes.
- Tagged posts: served from a separate endpoint (/api/v1/users/<id>/user_tagged_feed/). Often blocked without auth.
- Highlights: listed under edge_highlight_reels, but the individual highlight contents require auth.

Meta's bot detection stack has improved significantly since 2023. What works now:
Legal note: Instagram's ToS prohibits automated scraping. In hiQ v. LinkedIn, the Ninth Circuit held that scraping publicly accessible data likely does not violate the CFAA, but ToS-based contract claims still apply, and hiQ was later found to have breached LinkedIn's user agreement. For commercial products, get legal advice or use a service that handles compliance.
If you're doing competitive analysis, influencer vetting, or brand monitoring and don't want to maintain a proxy pool + endpoint rotation logic, a managed actor is the right call.
Our Instagram Profile Scraper on Apify handles the full pipeline: residential proxy rotation, session management, fallback logic when internal endpoints rotate, and post pagination. You pass a list of usernames and get back structured profile data plus recent posts -- no Meta App Review, no business verification, no Page setup.
| Approach | Cost | Setup | Scope |
|---|---|---|---|
| Meta Graph API | Free | App review + Page | Own accounts only |
| web_profile_info (direct) | Proxy cost | None | Public profiles, first 12 posts |
| Playwright + login | Proxy + accounts | Account farming | More data, risk of bans |
| Managed actor (Apify) | Pay-per-use | API key | Full profile + posts |
For a one-off research project, the web_profile_info endpoint with a mobile UA and a residential proxy is the fastest path. For anything ongoing -- weekly follower tracking, post engagement monitoring, influencer discovery -- a managed scraper absorbs the operational cost of Meta's constantly changing anti-bot measures.
Whatever you build, log raw responses when parsing fails, cache aggressively, and treat the integration as a moving target. Instagram breaks scrapers on a schedule -- roughly every 3-6 months a key field gets renamed or a rate limit tightens.
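The first two habits can be sketched with the standard library alone. This is an illustrative outline, not a hardened implementation: the function names, the file layout, and the assumption that cache keys are safe to use as file names are all hypothetical.

```python
import logging
import time
from pathlib import Path
from typing import Callable, Optional

logger = logging.getLogger("ig_scraper")


def parse_or_dump(raw: str, parser: Callable[[str], dict], dump_dir: Path) -> Optional[dict]:
    """Run parser on raw; on failure, save the raw text for later inspection."""
    try:
        return parser(raw)
    except (ValueError, KeyError, TypeError, IndexError) as exc:
        dump_dir.mkdir(parents=True, exist_ok=True)
        dump_path = dump_dir / f"failed_{time.time_ns()}.txt"
        dump_path.write_text(raw, encoding="utf-8")
        logger.warning("Parse failed (%s); raw response saved to %s", exc, dump_path)
        return None


def cached_fetch(key: str, fetch: Callable[[], str], cache_dir: Path, ttl_seconds: float) -> str:
    """Return a cached body if it is younger than ttl_seconds, else call fetch()."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{key}.txt"  # key is assumed to be file-name safe
    if path.exists() and time.time() - path.stat().st_mtime < ttl_seconds:
        return path.read_text(encoding="utf-8")
    body = fetch()
    path.write_text(body, encoding="utf-8")
    return body
```

Saved raw responses are what let you diff the old and new payloads when a field gets renamed, and the cache keeps repeated runs from spending requests on data you already have.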