@olegopro
Created March 14, 2026 23:05
Instagram Timeline Feed Pagination with instagrapi — Complete Guide


The Problem

get_timeline_feed() from instagrapi accepts a max_id parameter for pagination, but Instagram ignores it and returns the same posts every time. This makes feed pagination appear completely broken.

Related issue: subzeroid/instagrapi#1789

Root Cause

Instagram's feed/timeline/ endpoint requires three things to paginate properly — not just max_id:

| Parameter | Purpose |
| --- | --- |
| max_id | Cursor pointing to the next chunk (from response["next_max_id"]) |
| seen_posts | Comma-separated list of media_ids already shown to the user |
| feed_view_info | JSON array of view telemetry objects for each seen post |

Without seen_posts and feed_view_info, the server has no context about what has already been displayed — so it returns the same algorithmically-ranked posts regardless of the max_id value.
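
One way to confirm the symptom is to compare the media IDs of two consecutive responses. The sketch below uses two hypothetical helpers, media_ids and page_overlap (neither is part of instagrapi), and assumes the feed_items / media_or_ad response layout used in the rest of this guide. The sample pages are made up for illustration.

```python
from typing import Any


def media_ids(raw: dict[str, Any]) -> list[str]:
    """Collect media IDs from a raw feed/timeline/ response dict."""
    ids = []
    for item in raw.get("feed_items", []):
        media = item.get("media_or_ad")
        if media and media.get("id"):
            ids.append(media["id"])
    return ids


def page_overlap(first: dict[str, Any], second: dict[str, Any]) -> float:
    """Fraction of the second page's posts that already appeared in the first."""
    first_ids = set(media_ids(first))
    second_ids = media_ids(second)
    if not second_ids:
        return 0.0
    repeated = sum(1 for mid in second_ids if mid in first_ids)
    return repeated / len(second_ids)


# Made-up sample responses; "other_item" stands in for any non-post feed item.
page_a = {"feed_items": [{"media_or_ad": {"id": "1_10"}}, {"media_or_ad": {"id": "2_10"}}]}
page_b = {"feed_items": [{"media_or_ad": {"id": "1_10"}}, {"media_or_ad": {"id": "2_10"}}]}
page_c = {"feed_items": [{"media_or_ad": {"id": "3_11"}}, {"other_item": {}}]}
```

An overlap close to 1.0 between the first and second response would match the broken behaviour described above, while a value near 0.0 suggests the cursor is being honoured.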

The Clue in instagrapi Source Code

The get_timeline_feed() method in instagrapi/mixins/auth.py contains a commented-out example that hints at the solution:

data = {
    "feed_view_info": "[]",  # e.g. [{"media_id":"2634223601739446191_7450075998","version":24,
    # "media_pct":1.0,"time_info":{"10":63124,"25":63124,"50":63124,"75":63124},"latest_timestamp":1628253523186}]
    # ...
}

The library hardcodes feed_view_info as an empty "[]" and never populates seen_posts. When max_id is provided, the method sets reason to "pagination" but still sends empty view data — making the cursor useless.

The Solution

Do not use get_timeline_feed() for pagination. Instead:

  1. Use get_timeline_feed("cold_start_fetch") for the first page only
  2. For subsequent pages, call cl.private_request("feed/timeline/", data=params, with_signature=False) directly with properly constructed parameters

Step 1: First Page — cold_start_fetch

from instagrapi import Client

cl = Client()
# session_data is a settings dict saved from an earlier login (e.g. via cl.get_settings())
cl.set_settings(session_data)  # restore session

# First page — use the standard method
raw = cl.get_timeline_feed("cold_start_fetch")

feed_items = raw.get("feed_items", [])
next_max_id = raw.get("next_max_id")
more_available = raw.get("more_available", False)

# Extract posts and track what was "seen"
seen_posts = []
for item in feed_items:
    media = item.get("media_or_ad")
    if media:
        seen_posts.append(media["id"])  # format: "pk_userpk"

Step 2: Build Pagination Parameters

The key insight is constructing feed_view_info — a JSON array where each element simulates the user having viewed a post:

import json
import random
import time

def build_view_info(media_id: str) -> dict:
    """Simulate view telemetry for a single post."""
    view_ms = random.randint(5000, 15000)
    return {
        "media_id": media_id,
        "version": 23,
        "media_pct": 1.0,
        "time_info": {
            "10": view_ms,
            "25": view_ms,
            "50": view_ms,
            "75": view_ms,
        },
        "latest_timestamp": int(time.time() * 1000),
    }

def build_pagination_params(cl: Client, max_id: str, seen_posts: list[str]) -> dict:
    """Build the full parameter dict for pagination requests."""
    return {
        # Pagination
        "max_id": max_id,
        "reason": "pagination",
        "is_pull_to_refresh": "0",
        "is_prefetch": "0",

        # View data — THIS is what makes pagination work
        "feed_view_info": json.dumps([build_view_info(mid) for mid in seen_posts]),
        "seen_posts": ",".join(seen_posts),

        # Device & session identifiers (from the Client instance)
        "phone_id": cl.phone_id,
        "device_id": cl.uuid,
        "_uuid": cl.uuid,
        "_csrftoken": cl.token,
        "client_session_id": cl.client_session_id,

        # Device state
        "battery_level": 100,
        "timezone_offset": cl.timezone_offset,
        "is_charging": "1",
        "will_sound_on": "0",

        # Ads params
        "is_async_ads_in_headload_enabled": "0",
        "is_async_ads_double_request": "0",
        "is_async_ads_rti": "0",
        "rti_delivery_backend": "0",
    }
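
Because seen_posts and feed_view_info describe the same posts, it can help to check that the two fields agree before sending a request. The following is a sketch built around a hypothetical helper, check_view_params, which only inspects the two view-data fields built above. It does not reflect any validation Instagram is known to perform.

```python
import json


def check_view_params(params: dict) -> list[str]:
    """Return a list of problems found in the view-data fields (empty if none)."""
    problems = []
    # seen_posts is a comma-separated string; drop empty fragments.
    seen = [mid for mid in params.get("seen_posts", "").split(",") if mid]
    try:
        views = json.loads(params.get("feed_view_info", "[]"))
    except json.JSONDecodeError:
        return ["feed_view_info is not valid JSON"]
    if not seen:
        problems.append("seen_posts is empty")
    # Each telemetry entry should refer to the post at the same position.
    viewed = [view.get("media_id") for view in views]
    if viewed != seen:
        problems.append("feed_view_info and seen_posts list different posts")
    return problems
```

Calling check_view_params(build_pagination_params(cl, next_max_id, seen_posts)) should return an empty list when the parameters were built as shown in this step.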

Step 3: Fetch Next Pages

if next_max_id and more_available:
    params = build_pagination_params(cl, next_max_id, seen_posts)
    raw = cl.private_request("feed/timeline/", data=params, with_signature=False)

    # Process posts
    for item in raw.get("feed_items", []):
        media = item.get("media_or_ad")
        if media:
            seen_posts.append(media["id"])  # accumulate for next page

    next_max_id = raw.get("next_max_id")
    more_available = raw.get("more_available", False)
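
The loop above appends every returned ID to seen_posts. This guide does not establish whether Instagram ever repeats a post across pages. If it does, duplicates would inflate both seen_posts and feed_view_info. A minimal sketch of a de-duplicating accumulator follows; merge_seen is a hypothetical helper name.

```python
def merge_seen(seen_posts: list[str], page_ids: list[str]) -> list[str]:
    """Append unseen IDs to seen_posts in place and return the IDs that were new."""
    known = set(seen_posts)
    fresh = []
    for mid in page_ids:
        if mid not in known:
            known.add(mid)  # also guards against repeats within one page
            fresh.append(mid)
    seen_posts.extend(fresh)
    return fresh
```

The returned list also works as a progress signal: an empty result means the page contained nothing new.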

Complete Minimal Example

import json
import random
import time
from instagrapi import Client

# --- Helpers ---

def build_view_info(media_id: str) -> dict:
    view_ms = random.randint(5000, 15000)
    return {
        "media_id": media_id,
        "version": 23,
        "media_pct": 1.0,
        "time_info": {"10": view_ms, "25": view_ms, "50": view_ms, "75": view_ms},
        "latest_timestamp": int(time.time() * 1000),
    }

def build_pagination_params(cl, max_id, seen_posts):
    return {
        "max_id": max_id,
        "reason": "pagination",
        "is_pull_to_refresh": "0",
        "is_prefetch": "0",
        "feed_view_info": json.dumps([build_view_info(mid) for mid in seen_posts]),
        "seen_posts": ",".join(seen_posts),
        "phone_id": cl.phone_id,
        "device_id": cl.uuid,
        "_uuid": cl.uuid,
        "_csrftoken": cl.token,
        "client_session_id": cl.client_session_id,
        "battery_level": 100,
        "timezone_offset": cl.timezone_offset,
        "is_charging": "1",
        "will_sound_on": "0",
        "is_async_ads_in_headload_enabled": "0",
        "is_async_ads_double_request": "0",
        "is_async_ads_rti": "0",
        "rti_delivery_backend": "0",
    }

def extract_media_ids(feed_items):
    ids = []
    for item in feed_items:
        media = item.get("media_or_ad")
        if media and media.get("id"):
            ids.append(media["id"])
    return ids

# --- Main ---

cl = Client()
with open("session.json") as f:  # close the file once the settings are loaded
    cl.set_settings(json.load(f))

# Page 1
raw = cl.get_timeline_feed("cold_start_fetch")
seen_posts = extract_media_ids(raw.get("feed_items", []))
next_max_id = raw.get("next_max_id")
print(f"Page 1: {len(seen_posts)} posts")

# Pages 2..N
page = 2
while next_max_id and raw.get("more_available"):
    time.sleep(random.uniform(3, 7))  # be polite

    params = build_pagination_params(cl, next_max_id, seen_posts)
    raw = cl.private_request("feed/timeline/", data=params, with_signature=False)

    new_ids = extract_media_ids(raw.get("feed_items", []))
    seen_posts.extend(new_ids)
    next_max_id = raw.get("next_max_id")
    print(f"Page {page}: {len(new_ids)} posts, total seen: {len(seen_posts)}")
    page += 1
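
The while loop above runs until the server stops returning a cursor. As a defensive variation, the sketch below wraps the same idea in a generator that also stops when a cursor repeats or a page limit is reached. iter_pages and its fetch_next callback are hypothetical names; the callback is injected so the loop can be exercised without network access.

```python
from typing import Any, Callable, Iterator

Page = dict[str, Any]


def iter_pages(
    first_page: Page,
    fetch_next: Callable[[str], Page],
    max_pages: int = 10,
) -> Iterator[Page]:
    """Yield pages until the feed ends, the cursor repeats, or max_pages is hit."""
    page = first_page
    used_cursors: set[str] = set()
    for number in range(1, max_pages + 1):
        yield page
        cursor = page.get("next_max_id")
        # Stop before fetching when there is nothing more to request.
        if number == max_pages or not cursor or not page.get("more_available"):
            return
        if cursor in used_cursors:  # the server handed back a cursor already used
            return
        used_cursors.add(cursor)
        page = fetch_next(cursor)
```

In real use, fetch_next could be a small function that sleeps for a few seconds, calls build_pagination_params(cl, cursor, seen_posts), and passes the result to cl.private_request("feed/timeline/", data=params, with_signature=False), as in the example above.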

feed_view_info Object Structure

Each element in the feed_view_info array represents simulated view telemetry:

{
  "media_id": "3210456789012345678_1234567890",
  "version": 23,
  "media_pct": 1.0,
  "time_info": {
    "10": 8234,
    "25": 8234,
    "50": 8234,
    "75": 8234
  },
  "latest_timestamp": 1710432000000
}

| Field | Description |
| --- | --- |
| media_id | Post ID in pk_userpk format (same as media["id"]) |
| version | Telemetry schema version (23 works as of March 2026) |
| media_pct | Fraction of the post visible on screen (1.0 = fully visible) |
| time_info | Milliseconds spent at each visibility threshold (10%, 25%, 50%, 75%) |
| latest_timestamp | Unix timestamp in milliseconds when the post was last viewed |
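
The same structure can be written down as Python types so that a malformed entry is caught before it is serialized. This is only a sketch: ViewInfo, TimeInfo, and missing_fields are hypothetical names, and the field list mirrors the fields documented above rather than any official schema.

```python
import json
from typing import TypedDict

# time_info keys are numeric strings, so the functional TypedDict form is needed.
TimeInfo = TypedDict("TimeInfo", {"10": int, "25": int, "50": int, "75": int})


class ViewInfo(TypedDict):
    media_id: str
    version: int
    media_pct: float
    time_info: TimeInfo
    latest_timestamp: int


def missing_fields(entry: dict) -> list[str]:
    """Names of documented fields that are absent from a view-info entry."""
    expected = list(ViewInfo.__annotations__)
    return [name for name in expected if name not in entry]
```

A type checker can then flag build_view_info results that drift from this shape, and missing_fields gives a quick runtime check for entries loaded from elsewhere.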

Why reason Matters

The reason parameter tells Instagram why the feed is being requested:

| Reason | When to use |
| --- | --- |
| cold_start_fetch | App opened fresh (first feed load) |
| pull_to_refresh | User pulled down to refresh the feed |
| warm_start_fetch | App returned from background |
| pagination | Loading more posts while scrolling (set automatically when max_id is present) |

For initial loads, use cold_start_fetch. For user-initiated refresh, use pull_to_refresh. For pagination, the reason is always pagination (set in build_pagination_params).
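
The reason-related fields can be kept consistent with a small helper. The sketch below uses a hypothetical function, reason_fields. The only value confirmed by this guide is is_pull_to_refresh = "0" for pagination; setting it to "1" for pull_to_refresh is an assumption.

```python
from typing import Literal, Optional

Reason = Literal["cold_start_fetch", "pull_to_refresh", "warm_start_fetch", "pagination"]


def reason_fields(reason: Reason, max_id: Optional[str] = None) -> dict[str, str]:
    """Build the reason-related request fields for a feed request."""
    if reason == "pagination" and not max_id:
        raise ValueError("pagination needs a max_id cursor")
    fields = {
        "reason": reason,
        # Assumption: only a user-initiated refresh sets this flag to "1".
        "is_pull_to_refresh": "1" if reason == "pull_to_refresh" else "0",
    }
    if max_id:
        fields["max_id"] = max_id
    return fields
```

The result can be merged into a larger parameter dict with the ** unpacking syntax, so reason, is_pull_to_refresh, and max_id cannot drift apart.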

Important Notes

  • Rate limiting: Add a delay of 3-7 seconds between pagination requests. Rapid sequential requests are likely to be flagged by Instagram's anti-bot detection.
  • seen_posts growth: The seen_posts list grows with every page. For very long sessions, consider capping it at the most recent 200-300 entries; Instagram does not appear to need the full history.
  • Session reuse: Always reuse sessions (cl.set_settings()) rather than re-logging in. Each login creates a new session state, resetting feed context.
  • Tested with: instagrapi 2.2.1 and 2.3.0 (March 2026).
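
The first two notes can be sketched as small helpers. cap_seen and polite_delay are hypothetical names. The default limit of 300 and the 3-7 second range come from the notes above and are not limits documented by Instagram.

```python
import random


def cap_seen(seen_posts: list[str], limit: int = 300) -> list[str]:
    """Return only the most recent `limit` media IDs (oldest dropped first)."""
    if limit <= 0:
        return []  # seen_posts[-0:] would return the whole list, so handle it here
    return seen_posts[-limit:]


def polite_delay(low: float = 3.0, high: float = 7.0) -> float:
    """Pick a random pause, in seconds, between two pagination requests."""
    return random.uniform(low, high)
```

In the pagination loop, this would mean calling time.sleep(polite_delay()) before each request and passing cap_seen(seen_posts) instead of the full list to build_pagination_params.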

Sources

  • The commented-out feed_view_info structure in instagrapi/mixins/auth.py (line ~216)
  • Reverse-engineering Instagram Android app traffic (parameters, field names, and values)
  • Empirical testing confirming that max_id alone is insufficient and seen_posts + feed_view_info are required