Scraping Dailymotion Video Data and Channel Stats with Python (2026)
Dailymotion flies under the radar compared to YouTube, but it remains one of the world's largest video platforms with over 400 million unique monthly visitors. For data projects — competitive analysis in the video space, content trend tracking, media monitoring, journalism research — Dailymotion is an underexploited source. The platform hosts significant news and sports content that doesn't appear on YouTube due to licensing agreements, particularly European broadcast media.
The good news: Dailymotion still maintains a public Data API that's more generous than most video platforms. The less good news: the API has quirks, undocumented rate limits, and some endpoints return inconsistent data. Geographic content restrictions are a significant complication for cross-market research. This guide covers how to work with the API effectively, handle its limitations, and supplement it with direct page scraping where the API falls short.
What Data Is Available
Through the Dailymotion Data API and page scraping combined:
Video-level data: - Title, description, tags, duration (seconds) - Creation date, last publication date - View counts: total, last hour, last 24 hours, last week - Like counts, bookmark counts - Thumbnail URLs in multiple resolutions - Language and country tags - Channel name and owner info - Content moderation status, adult content flag - Direct embed code and player URL
Channel data: - Subscriber (fan) count - Total video count - Total view count across all videos - Channel description and creation date - Verified badge status - Country and language preferences
Trending data: - Videos trending in each country - Category-level trending - New video feeds by category
Search: - Full text search with sort and filter options - Filter by duration, upload date, language, country - Sort by relevance, date, views, likes
Why Geographic Data Varies So Much
Dailymotion has significant content licensing agreements with major media companies. A soccer highlights clip from a French broadcaster might be available on Dailymotion in France but blocked everywhere else. American news broadcasts may geo-restrict their Dailymotion uploads to US viewers.
This isn't just about available videos — it affects the view count data too. Dailymotion historically reports country-regional view counts for some content categories, and the trending lists vary substantially by country. If you're doing cross-market research, you need to collect data from multiple geographic vantage points.
Anti-Bot Measures
Dailymotion's defenses are moderate compared to platforms like YouTube or Instagram, but they exist:
Rate limiting. The API enforces rate limits per IP and per API key. Unauthenticated requests are limited to roughly 600 requests per 10-minute window. Authenticated requests get approximately 5,000 per 10-minute window. Exceeding the limit returns HTTP 403 with error code limit_reached.
API key requirement for bulk operations. While individual video lookups work without an API key, search and listing endpoints require one for paginated results beyond the first page. Registration at the Dailymotion Partner HQ is free.
Cloudflare protection on the website. The Data API itself sits behind light protection, but the main Dailymotion website uses Cloudflare with JavaScript challenges. Direct HTML scraping of dailymotion.com requires browser automation or challenge-solving.
Geographic restrictions. Some videos and API responses are filtered by the IP's geographic location. A video visible from France may return 404 from a US IP. Particularly common for sports and news content.
Referrer checking. Certain embed and player endpoints validate the referrer header. Requests without a plausible referrer get empty responses.
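Those windowed limits translate directly into request pacing. A minimal sketch, using the approximate figures cited above (600 unauthenticated, 5,000 authenticated per 10-minute window) to compute the steady delay that stays under the cap:

```python
def min_delay_seconds(requests_per_window: int, window_seconds: int = 600) -> float:
    """Smallest delay between requests that keeps a steady stream under a windowed limit."""
    return window_seconds / requests_per_window

# Approximate Dailymotion limits as noted above:
print(min_delay_seconds(600))    # unauthenticated: 1.0s between requests
print(min_delay_seconds(5000))   # authenticated: 0.12s between requests
```

Spacing requests evenly is the simplest strategy; a token bucket would let you burst, but even pacing makes rate-limit 403s rare enough that the retry logic below becomes a safety net rather than a hot path.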
Setting Up API Access
Register an application at the Dailymotion Partner HQ (https://developer.dailymotion.com/api/) to get credentials:
import httpx
import time
import json
import sqlite3
import random
from dataclasses import dataclass, field
from typing import Optional
import threading
API_KEY = "your_api_key"
API_SECRET = "your_api_secret"
BASE_URL = "https://api.dailymotion.com"
class DailymotionClient:
"""
Thread-safe Dailymotion API client with automatic token refresh.
"""
def __init__(self, api_key: str, api_secret: str):
self.api_key = api_key
self.api_secret = api_secret
self._token: Optional[str] = None
self._token_expires: float = 0
self._lock = threading.Lock()
self._session = httpx.Client(
base_url=BASE_URL,
timeout=30,
headers={"User-Agent": "DailymotionDataBot/1.0"},
)
def _refresh_token(self):
resp = httpx.post(
f"{BASE_URL}/oauth/token",
data={
"grant_type": "client_credentials",
"client_id": self.api_key,
"client_secret": self.api_secret,
},
timeout=15,
)
resp.raise_for_status()
data = resp.json()
self._token = data["access_token"]
self._token_expires = time.time() + data.get("expires_in", 3600) - 60
def _get_token(self) -> str:
with self._lock:
if not self._token or time.time() >= self._token_expires:
self._refresh_token()
return self._token
def get(
self,
path: str,
params: Optional[dict] = None,
use_auth: bool = True,
proxy: Optional[str] = None,
retries: int = 3,
) -> dict:
"""Make an authenticated GET request with retry logic."""
full_params = dict(params or {})
if use_auth:
full_params["access_token"] = self._get_token()
for attempt in range(retries):
try:
if proxy:
with httpx.Client(
base_url=BASE_URL,
proxy=proxy,
timeout=30,
) as proxy_client:
resp = proxy_client.get(path, params=full_params)
else:
resp = self._session.get(path, params=full_params)
if resp.status_code == 200:
return resp.json()
elif resp.status_code == 403:
error_data = resp.json()
# Error details nest under the "error" key; match on the payload
# text since the exact field holding the reason varies.
if "limit_reached" in str(error_data.get("error", "")):
wait = 60 * (attempt + 1)
print(f"Rate limit reached. Waiting {wait}s...")
time.sleep(wait)
continue
elif "oauth2" in str(error_data).lower():
# Token expired, refresh and retry
self._token = None
full_params["access_token"] = self._get_token()
continue
else:
return {} # Video not accessible from this region
elif resp.status_code == 404:
return {}
elif resp.status_code == 503:
time.sleep(10 * (attempt + 1))
continue
else:
resp.raise_for_status()
except httpx.TimeoutException:
if attempt == retries - 1:
raise
time.sleep(5 * (attempt + 1))
return {}
def close(self):
self._session.close()
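The token-refresh guard above is worth isolating to see the timing logic. A standalone sketch (the `needs_refresh` helper is illustrative, not part of the client) showing why `_refresh_token` stores `issue_time + expires_in - 60`:

```python
def needs_refresh(token_deadline: float, now: float) -> bool:
    """Mirrors the check in _get_token: refresh once 'now' passes the deadline."""
    return now >= token_deadline

# _refresh_token stores issue_time + expires_in - 60: a 60-second safety
# margin so a token is never sent during its final minute of validity.
issued = 1_000_000.0
deadline = issued + 3600 - 60          # token valid for 1h, refreshed 60s early
print(needs_refresh(deadline, now=issued + 3000))  # False: still comfortably valid
print(needs_refresh(deadline, now=issued + 3541))  # True: inside the safety margin
```

Without the margin, a request could leave your machine with a valid token and arrive at the API after expiry, producing spurious oauth errors under load.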
Fetching Video Metadata
The /video/{id} endpoint accepts a fields parameter:
@dataclass
class DailymotionVideo:
video_id: str
title: str
description: str
tags: list
duration_seconds: int
views_total: int
views_last_hour: int
views_last_24h: int
views_last_week: int
likes_total: int
bookmarks_total: int
created_time: int
updated_time: int
thumbnail_url: str
thumbnail_720_url: str
language: str
country: str
channel_name: str
owner_screenname: str
owner_fans_total: int
is_explicit: bool = False
is_private: bool = False
status: str = "published"
VIDEO_FIELDS = [
"id", "title", "description", "tags", "duration",
"views_total", "views_last_hour", "views_last_24h", "views_last_week",
"likes_total", "bookmarks_total",
"created_time", "updated_time",
"thumbnail_url", "thumbnail_720_url",
"language", "country", "channel.name",
"owner.screenname", "owner.fans_total",
"explicit", "private", "status",
"embed_url", "short_url",
]
def get_video(client: DailymotionClient, video_id: str, proxy: Optional[str] = None) -> Optional[DailymotionVideo]:
"""Fetch full metadata for a single video."""
data = client.get(
f"/video/{video_id}",
params={"fields": ",".join(VIDEO_FIELDS)},
proxy=proxy,
)
if not data or "id" not in data:
return None
return DailymotionVideo(
video_id=data.get("id", video_id),
title=data.get("title", ""),
description=(data.get("description") or "")[:1000],
tags=data.get("tags", []),
duration_seconds=data.get("duration", 0),
views_total=data.get("views_total", 0),
views_last_hour=data.get("views_last_hour", 0),
views_last_24h=data.get("views_last_24h", 0),
views_last_week=data.get("views_last_week", 0),
likes_total=data.get("likes_total", 0),
bookmarks_total=data.get("bookmarks_total", 0),
created_time=data.get("created_time", 0),
updated_time=data.get("updated_time", 0),
thumbnail_url=data.get("thumbnail_url", ""),
thumbnail_720_url=data.get("thumbnail_720_url", ""),
language=data.get("language", ""),
country=data.get("country", ""),
channel_name=data.get("channel.name", "") or data.get("channel", {}).get("name", ""),
owner_screenname=data.get("owner.screenname", "") or data.get("owner", {}).get("screenname", ""),
owner_fans_total=data.get("owner.fans_total", 0) or 0,
is_explicit=data.get("explicit", False),
is_private=data.get("private", False),
status=data.get("status", "published"),
)
def search_videos(
client: DailymotionClient,
query: str,
sort: str = "relevance",
limit: int = 100,
language: Optional[str] = None,
country: Optional[str] = None,
created_after: Optional[int] = None,
proxy: Optional[str] = None,
delay: float = 1.2,
) -> list[DailymotionVideo]:
"""
Search Dailymotion for videos.
sort: 'relevance', 'recent', 'visited' (views), 'rating', 'trending'
"""
fields = [
"id", "title", "duration", "views_total", "views_last_24h",
"created_time", "thumbnail_url", "owner.screenname", "channel.name",
"language", "country", "tags",
]
results = []
page = 1
while len(results) < limit:
params = {
"search": query,
"fields": ",".join(fields),
"sort": sort,
"page": page,
"limit": min(100, limit - len(results)),
}
if language:
params["language"] = language
if country:
params["country"] = country
if created_after:
params["created_after"] = created_after
data = client.get("/videos", params=params, proxy=proxy)
items = data.get("list", [])
if not items:
break
for item in items:
video = DailymotionVideo(
video_id=item.get("id", ""),
title=item.get("title", ""),
description="",
tags=item.get("tags", []),
duration_seconds=item.get("duration", 0),
views_total=item.get("views_total", 0),
views_last_hour=0,
views_last_24h=item.get("views_last_24h", 0),
views_last_week=0,
likes_total=0,
bookmarks_total=0,
created_time=item.get("created_time", 0),
updated_time=0,
thumbnail_url=item.get("thumbnail_url", ""),
thumbnail_720_url="",
language=item.get("language", ""),
country=item.get("country", ""),
channel_name=item.get("channel.name", "") or "",
owner_screenname=item.get("owner.screenname", "") or "",
owner_fans_total=0,
)
results.append(video)
if not data.get("has_more"):
break
page += 1
time.sleep(delay)
return results[:limit]
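The parsing in `get_video` and `search_videos` guards against two response shapes for dotted fields like `channel.name`: a flat key containing a dot, or a nested object. A small helper (hypothetical, not part of the API) centralizes that fallback instead of repeating the `or`-chain:

```python
def get_dotted(data: dict, dotted_key: str, default=""):
    """Read a field like 'channel.name' whether the API returned it as a
    flat dotted key or as a nested object."""
    if dotted_key in data:
        return data[dotted_key]
    node = data
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node

print(get_dotted({"channel.name": "news"}, "channel.name"))       # news
print(get_dotted({"channel": {"name": "news"}}, "channel.name"))  # news
print(get_dotted({}, "channel.name", default="unknown"))          # unknown
```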
Channel Scraping
Pull a channel's metadata and iterate through its video library:
@dataclass
class DailymotionChannel:
screenname: str
fans_total: int
videos_total: int
views_total: int
description: str
created_time: int
verified: bool
country: str
url: str
CHANNEL_FIELDS = [
"screenname", "fans_total", "videos_total",
"views_total", "description", "created_time",
"verified", "country", "url",
"avatar_720_url", "cover_url",
]
def get_channel_stats(
client: DailymotionClient,
channel_name: str,
proxy: Optional[str] = None,
) -> Optional[DailymotionChannel]:
"""Get statistics for a Dailymotion channel."""
data = client.get(
f"/user/{channel_name}",
params={"fields": ",".join(CHANNEL_FIELDS)},
proxy=proxy,
)
if not data or "screenname" not in data:
return None
return DailymotionChannel(
screenname=data.get("screenname", channel_name),
fans_total=data.get("fans_total", 0),
videos_total=data.get("videos_total", 0),
views_total=data.get("views_total", 0),
description=(data.get("description") or "")[:500],
created_time=data.get("created_time", 0),
verified=data.get("verified", False),
country=data.get("country", ""),
url=data.get("url", ""),
)
def get_channel_videos(
client: DailymotionClient,
channel_name: str,
max_videos: int = 500,
sort: str = "recent",
proxy: Optional[str] = None,
delay: float = 1.0,
) -> list[DailymotionVideo]:
"""
Retrieve all videos from a channel.
sort: 'recent', 'visited', 'rating', 'relevance', 'random'
"""
fields = [
"id", "title", "duration", "views_total", "views_last_24h",
"likes_total", "created_time", "tags",
"thumbnail_url", "language", "country", "status",
]
videos = []
page = 1
while len(videos) < max_videos:
data = client.get(
f"/user/{channel_name}/videos",
params={
"fields": ",".join(fields),
"page": page,
"limit": 100,
"sort": sort,
},
proxy=proxy,
)
if not data or data.get("error"):
break
items = data.get("list", [])
if not items:
break
for item in items:
videos.append(DailymotionVideo(
video_id=item.get("id", ""),
title=item.get("title", ""),
description="",
tags=item.get("tags", []),
duration_seconds=item.get("duration", 0),
views_total=item.get("views_total", 0),
views_last_hour=0,
views_last_24h=item.get("views_last_24h", 0),
views_last_week=0,
likes_total=item.get("likes_total", 0),
bookmarks_total=0,
created_time=item.get("created_time", 0),
updated_time=0,
thumbnail_url=item.get("thumbnail_url", ""),
thumbnail_720_url="",
language=item.get("language", ""),
country=item.get("country", ""),
channel_name=channel_name,
owner_screenname=channel_name,
owner_fans_total=0,
status=item.get("status", "published"),
))
if not data.get("has_more"):
break
page += 1
time.sleep(delay)
return videos[:max_videos]
def get_similar_channels(
client: DailymotionClient,
channel_name: str,
limit: int = 10,
) -> list[dict]:
"""Find channels similar to a given one."""
data = client.get(
f"/user/{channel_name}/subscriptions",
params={"fields": "screenname,fans_total,videos_total,verified", "limit": limit},
)
return data.get("list", [])
Trending Videos by Country
Dailymotion's trending endpoint is essential for media monitoring:
AVAILABLE_COUNTRIES = [
"us", "fr", "de", "gb", "es", "it", "br", "mx", "ar", "in",
"au", "ca", "nl", "be", "ch", "pl", "ru", "jp", "kr", "tr",
]
AVAILABLE_CHANNELS = [
"news", "sport", "fun", "music", "videogames",
"tech", "travel", "animals", "auto", "film",
]
def get_trending_videos(
client: DailymotionClient,
country: str = "us",
channel: Optional[str] = None,
limit: int = 50,
proxy: Optional[str] = None,
) -> list[dict]:
"""
Get trending videos for a specific country and optional category.
country: ISO 3166-1 alpha-2 code in lowercase
channel: category name (see AVAILABLE_CHANNELS)
"""
fields = [
"id", "title", "views_total", "views_last_24h", "views_last_week",
"duration", "created_time", "owner.screenname",
"thumbnail_720_url", "language", "channel.name",
]
params = {
"fields": ",".join(fields),
"country": country,
"limit": min(100, limit),
"sort": "trending",
}
if channel:
params["channel"] = channel
data = client.get("/videos", params=params, proxy=proxy)
results = []
for i, item in enumerate(data.get("list", [])[:limit]):
results.append({
"rank": i + 1,
"country": country,
"channel": channel or "all",
"video_id": item.get("id", ""),
"title": item.get("title", ""),
"views_total": item.get("views_total", 0),
"views_last_24h": item.get("views_last_24h", 0),
"views_last_week": item.get("views_last_week", 0),
"duration_seconds": item.get("duration", 0),
"created_time": item.get("created_time", 0),
"owner": item.get("owner.screenname", ""),
"channel_name": item.get("channel.name", ""),
"thumbnail_url": item.get("thumbnail_720_url", ""),
})
return results
def get_multi_country_trending(
client: DailymotionClient,
countries: Optional[list[str]] = None,
channel: Optional[str] = None,
limit_per_country: int = 20,
delay: float = 1.5,
proxy_map: Optional[dict[str, str]] = None,
) -> dict[str, list[dict]]:
"""
Collect trending videos across multiple countries.
proxy_map: dict mapping country code -> proxy URL for geo-targeting.
"""
if countries is None:
countries = ["us", "fr", "de", "gb", "es", "it", "br"]
all_trending = {}
for country in countries:
proxy = proxy_map.get(country) if proxy_map else None
print(f"Fetching trending for {country.upper()}...")
try:
trending = get_trending_videos(
client,
country=country,
channel=channel,
limit=limit_per_country,
proxy=proxy,
)
all_trending[country] = trending
print(f" Got {len(trending)} trending videos")
except Exception as e:
print(f" Error for {country}: {e}")
all_trending[country] = []
time.sleep(delay)
return all_trending
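Once you have per-country lists from `get_multi_country_trending`, a natural follow-up question is how much two markets' trending content overlaps. A sketch using Jaccard similarity over video IDs (the `trending_overlap` helper is an assumption, not part of the API):

```python
def trending_overlap(a: list[dict], b: list[dict]) -> float:
    """Jaccard overlap of two trending lists, compared by video_id."""
    ids_a = {v["video_id"] for v in a}
    ids_b = {v["video_id"] for v in b}
    if not ids_a and not ids_b:
        return 0.0
    return len(ids_a & ids_b) / len(ids_a | ids_b)

fr = [{"video_id": "a"}, {"video_id": "b"}, {"video_id": "c"}]
de = [{"video_id": "b"}, {"video_id": "c"}, {"video_id": "d"}]
print(trending_overlap(fr, de))  # 0.5 (2 shared out of 4 distinct)
```

Low overlap between neighboring markets is usually a sign of the licensing-driven catalog differences described earlier, not a data collection problem.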
Proxy Configuration for Geo-Restricted Content
Geographic restrictions are the primary reason you'd need proxies for Dailymotion scraping. Content libraries vary significantly by country, especially for news and sports.
ThorData's residential proxies offer country-level targeting — route French content requests through French IPs, German content through German IPs. This isn't about evading bot detection (the API is fairly permissive); it's about seeing the same content catalog that users in each country see.
THORDATA_USER = "your_username"
THORDATA_PASS = "your_password"
THORDATA_HOST = "proxy.thordata.com"
THORDATA_PORT = 9001
def make_country_proxy(country_code: str) -> str:
"""
Create a ThorData proxy URL targeting a specific country.
country_code: ISO 3166-1 alpha-2 (e.g., 'fr', 'de', 'us')
"""
# ThorData country targeting via session label
session_label = f"dm-{country_code}-{random.randint(1000, 9999)}"
return (
f"http://{THORDATA_USER}-country-{country_code}-session-{session_label}"
f":{THORDATA_PASS}@{THORDATA_HOST}:{THORDATA_PORT}"
)
# Pre-build proxy map for target countries
COUNTRY_PROXIES = {
country: make_country_proxy(country)
for country in ["us", "fr", "de", "gb", "es", "it", "br", "au"]
}
def get_video_with_geo(
client: DailymotionClient,
video_id: str,
target_country: str = "us",
) -> Optional[DailymotionVideo]:
"""
Fetch video metadata using a country-specific proxy.
Useful for checking whether geo-restricted content is available.
"""
proxy = make_country_proxy(target_country)
video = get_video(client, video_id, proxy=proxy)
return video
def check_video_geo_availability(
client: DailymotionClient,
video_id: str,
countries: Optional[list[str]] = None,
) -> dict[str, bool]:
"""
Check video availability across multiple countries.
Returns dict: country_code -> available (bool)
"""
if countries is None:
countries = ["us", "fr", "de", "gb", "es"]
availability = {}
for country in countries:
proxy = make_country_proxy(country)
try:
data = client.get(
f"/video/{video_id}",
params={"fields": "id,status"},
proxy=proxy,
)
availability[country] = bool(data.get("id"))
except Exception:
availability[country] = False
time.sleep(0.5)
return availability
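The raw `{country: bool}` map from `check_video_geo_availability` is easier to act on once summarized. A small sketch (the `summarize_availability` helper is illustrative) that classifies a video as geo-restricted when it is visible from some vantage points but not others:

```python
def summarize_availability(avail: dict) -> dict:
    """Condense {country: bool} into available/blocked lists plus a flag."""
    available = sorted(c for c, ok in avail.items() if ok)
    blocked = sorted(c for c, ok in avail.items() if not ok)
    return {
        "available_in": available,
        "blocked_in": blocked,
        # Visible somewhere but not everywhere => geo-restricted
        "geo_restricted": bool(available) and bool(blocked),
    }

print(summarize_availability({"fr": True, "us": False, "de": True}))
# {'available_in': ['de', 'fr'], 'blocked_in': ['us'], 'geo_restricted': True}
```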
Storage Schema and Analytics
def setup_dailymotion_db(db_path: str = "dailymotion.db") -> sqlite3.Connection:
conn = sqlite3.connect(db_path)
conn.execute("PRAGMA journal_mode=WAL")
conn.execute("PRAGMA synchronous=NORMAL")
conn.executescript("""
CREATE TABLE IF NOT EXISTS videos (
video_id TEXT PRIMARY KEY,
title TEXT,
description TEXT,
tags TEXT,
channel_name TEXT,
owner_screenname TEXT,
owner_fans_total INTEGER,
duration_seconds INTEGER,
created_time INTEGER,
language TEXT,
country TEXT,
is_explicit INTEGER DEFAULT 0,
status TEXT DEFAULT 'published',
scraped_at TEXT DEFAULT (datetime('now'))
);
CREATE TABLE IF NOT EXISTS video_stats (
id INTEGER PRIMARY KEY AUTOINCREMENT,
video_id TEXT NOT NULL,
views_total INTEGER,
views_last_hour INTEGER,
views_last_24h INTEGER,
views_last_week INTEGER,
likes_total INTEGER,
bookmarks_total INTEGER,
snapshot_date TEXT DEFAULT (date('now')),
captured_at TEXT DEFAULT (datetime('now')),
FOREIGN KEY (video_id) REFERENCES videos(video_id),
UNIQUE (video_id, snapshot_date)
);
CREATE TABLE IF NOT EXISTS channels (
screenname TEXT PRIMARY KEY,
fans_total INTEGER,
videos_total INTEGER,
views_total INTEGER,
description TEXT,
created_time INTEGER,
verified INTEGER DEFAULT 0,
country TEXT,
url TEXT,
scraped_at TEXT DEFAULT (datetime('now'))
);
CREATE TABLE IF NOT EXISTS trending_snapshots (
id INTEGER PRIMARY KEY AUTOINCREMENT,
country TEXT,
channel_name TEXT,
rank_position INTEGER,
video_id TEXT,
title TEXT,
views_last_24h INTEGER,
owner_screenname TEXT,
snapshot_date TEXT DEFAULT (date('now')),
captured_at TEXT DEFAULT (datetime('now')),
FOREIGN KEY (video_id) REFERENCES videos(video_id)
);
CREATE INDEX IF NOT EXISTS idx_videos_channel ON videos(channel_name);
CREATE INDEX IF NOT EXISTS idx_videos_created ON videos(created_time);
CREATE INDEX IF NOT EXISTS idx_stats_video ON video_stats(video_id);
CREATE INDEX IF NOT EXISTS idx_stats_date ON video_stats(snapshot_date);
CREATE INDEX IF NOT EXISTS idx_trending_country ON trending_snapshots(country, snapshot_date);
""")
conn.commit()
return conn
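The `UNIQUE (video_id, snapshot_date)` constraint in `video_stats` is what makes daily snapshots idempotent: paired with `INSERT OR REPLACE`, repeated runs on the same day overwrite rather than duplicate. A trimmed-down, self-contained demonstration of that behavior:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE video_stats (
        video_id TEXT NOT NULL,
        views_total INTEGER,
        snapshot_date TEXT DEFAULT (date('now')),
        UNIQUE (video_id, snapshot_date)
    )
""")
# Two writes for the same video on the same day: the UNIQUE constraint
# makes the second INSERT OR REPLACE overwrite the first, so each video
# keeps exactly one row per calendar day.
for views in (100, 150):
    conn.execute(
        "INSERT OR REPLACE INTO video_stats (video_id, views_total) VALUES (?, ?)",
        ("x1", views),
    )
rows = conn.execute(
    "SELECT views_total FROM video_stats WHERE video_id = 'x1'"
).fetchall()
print(rows)  # [(150,)]
```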
def save_video(conn: sqlite3.Connection, video: DailymotionVideo):
"""Save video metadata and a stats snapshot."""
conn.execute("""
INSERT OR IGNORE INTO videos
(video_id, title, description, tags, channel_name, owner_screenname,
owner_fans_total, duration_seconds, created_time, language, country,
is_explicit, status)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
""", (
video.video_id, video.title, video.description[:500] if video.description else "",
json.dumps(video.tags), video.channel_name, video.owner_screenname,
video.owner_fans_total, video.duration_seconds, video.created_time,
video.language, video.country, int(video.is_explicit), video.status,
))
# Record stats snapshot (one per day)
conn.execute("""
INSERT OR REPLACE INTO video_stats
(video_id, views_total, views_last_hour, views_last_24h, views_last_week,
likes_total, bookmarks_total)
VALUES (?, ?, ?, ?, ?, ?, ?)
""", (
video.video_id, video.views_total, video.views_last_hour,
video.views_last_24h, video.views_last_week,
video.likes_total, video.bookmarks_total,
))
conn.commit()
def save_channel(conn: sqlite3.Connection, channel: DailymotionChannel):
conn.execute("""
INSERT OR REPLACE INTO channels
(screenname, fans_total, videos_total, views_total, description,
created_time, verified, country, url)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
""", (
channel.screenname, channel.fans_total, channel.videos_total,
channel.views_total, channel.description[:500] if channel.description else "",
channel.created_time, int(channel.verified), channel.country, channel.url,
))
conn.commit()
def compute_video_growth(conn: sqlite3.Connection, video_id: str, days: int = 7) -> dict:
"""Calculate view growth rate over the last N days."""
rows = conn.execute("""
SELECT snapshot_date, views_total, views_last_24h
FROM video_stats
WHERE video_id = ?
ORDER BY snapshot_date DESC
LIMIT ?
""", (video_id, days)).fetchall()
if len(rows) < 2:
return {"video_id": video_id, "insufficient_data": True}
latest = rows[0]
oldest = rows[-1]
views_gained = latest[1] - oldest[1] if latest[1] and oldest[1] else 0
days_span = len(rows) - 1
return {
"video_id": video_id,
"total_views_now": latest[1],
"total_views_then": oldest[1],
"views_gained": views_gained,
"days_tracked": days_span,
"avg_daily_views": round(views_gained / days_span, 0) if days_span else 0,
"latest_24h_views": latest[2],
}
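Note one subtlety in `compute_video_growth`: `days_tracked` counts snapshot intervals, not calendar days, so if collection skipped days, `avg_daily_views` is really an average per interval. The core arithmetic, isolated as a sketch over plain tuples (newest first, mirroring the query's `ORDER BY snapshot_date DESC`):

```python
def growth_from_snapshots(snapshots: list) -> dict:
    """snapshots: (snapshot_date, views_total) tuples, newest first."""
    if len(snapshots) < 2:
        return {"insufficient_data": True}
    gained = snapshots[0][1] - snapshots[-1][1]
    intervals = len(snapshots) - 1  # snapshot intervals, not calendar days
    return {"views_gained": gained, "avg_daily_views": gained / intervals}

# Three snapshots spanning a week, with gaps in collection:
snaps = [("2026-01-08", 5400), ("2026-01-07", 5100), ("2026-01-01", 3000)]
print(growth_from_snapshots(snaps))
# {'views_gained': 2400, 'avg_daily_views': 1200.0}
```

With a 7-day calendar span but only 2 intervals, the per-interval average (1200) overstates true daily growth (~343); if that distinction matters, divide by the date difference instead.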
Error Handling and Production Patterns
For production monitoring pipelines, you need comprehensive error handling:
def run_channel_monitoring_pipeline(
channels: list[str],
db_path: str = "dailymotion.db",
country: str = "us",
max_videos_per_channel: int = 200,
) -> dict:
"""
Monitor a list of channels: collect stats and new videos.
"""
client = DailymotionClient(API_KEY, API_SECRET)
conn = setup_dailymotion_db(db_path)
proxy = make_country_proxy(country) if country != "us" else None  # assumes default egress is US
stats = {
"channels_processed": 0,
"channels_failed": 0,
"videos_saved": 0,
"errors": [],
}
for channel_name in channels:
print(f"\nProcessing channel: {channel_name}")
# Get channel stats
try:
channel = get_channel_stats(client, channel_name, proxy=proxy)
if channel:
save_channel(conn, channel)
print(f" Fans: {channel.fans_total:,} | Videos: {channel.videos_total:,}")
except Exception as e:
print(f" Channel stats failed: {e}")
stats["errors"].append({"channel": channel_name, "error": str(e), "stage": "stats"})
time.sleep(1.0)
# Get recent videos
try:
videos = get_channel_videos(
client, channel_name,
max_videos=max_videos_per_channel,
proxy=proxy,
)
print(f" Got {len(videos)} videos")
for video in videos:
try:
save_video(conn, video)
stats["videos_saved"] += 1
except Exception:
pass # Non-fatal; skip this video and continue
except Exception as e:
print(f" Video collection failed: {e}")
stats["errors"].append({"channel": channel_name, "error": str(e), "stage": "videos"})
stats["channels_failed"] += 1
continue
stats["channels_processed"] += 1
# Run trending snapshot
print("\nCapturing trending snapshot...")
try:
trending = get_trending_videos(client, country=country, limit=50, proxy=proxy)
for item in trending:
conn.execute("""
INSERT INTO trending_snapshots
(country, channel_name, rank_position, video_id, title,
views_last_24h, owner_screenname)
VALUES (?, ?, ?, ?, ?, ?, ?)
""", (
country, item.get("channel"), item.get("rank"),
item.get("video_id"), item.get("title"),
item.get("views_last_24h"), item.get("owner"),
))
conn.commit()
print(f" Saved {len(trending)} trending videos")
except Exception as e:
print(f" Trending failed: {e}")
client.close()
conn.close()
print(f"\n=== Pipeline complete ===")
print(f"Channels processed: {stats['channels_processed']}")
print(f"Videos saved: {stats['videos_saved']}")
print(f"Errors: {len(stats['errors'])}")
return stats
def enrich_video_details(
video_ids: list[str],
db_path: str = "dailymotion.db",
delay: float = 1.0,
) -> int:
"""
Enrich video records with full metadata (not just search snippet fields).
Returns count of successfully enriched videos.
"""
client = DailymotionClient(API_KEY, API_SECRET)
conn = sqlite3.connect(db_path)
enriched = 0
for video_id in video_ids:
try:
video = get_video(client, video_id)
if video:
save_video(conn, video)
enriched += 1
except Exception as e:
print(f"Enrichment failed for {video_id}: {e}")
time.sleep(delay)
client.close()
conn.close()
return enriched
Legal Note
Dailymotion's API Terms of Use permit data collection for non-commercial analysis and application development, provided you:
- Don't cache data beyond 24 hours without refreshing
- Don't redistribute video content or build a competing video platform
- Attribute Dailymotion as the source in public-facing uses
- Don't use the data to circumvent monetization systems
The API is more permissive than most video platforms. Review current terms at developer.dailymotion.com before deploying anything commercial. Direct HTML scraping of the website (rather than the API) is explicitly prohibited in their ToS.
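The 24-hour caching rule has a direct operational consequence: rows in the `videos` table need a refresh pass once `scraped_at` ages out. A deterministic sketch of the stale-row query (fixed timestamps here so the example is reproducible; a live pipeline would compare against `datetime('now', '-1 day')`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE videos (video_id TEXT PRIMARY KEY, scraped_at TEXT)")
conn.executemany("INSERT INTO videos VALUES (?, ?)", [
    ("fresh", "2026-01-02 10:00:00"),
    ("stale", "2025-12-30 10:00:00"),
])
# Rows last scraped more than 24 hours before the reference time are
# due for a re-fetch to stay within the caching terms.
stale = conn.execute("""
    SELECT video_id FROM videos
    WHERE scraped_at < datetime('2026-01-02 12:00:00', '-1 day')
""").fetchall()
print(stale)  # [('stale',)]
```

Feeding those IDs into `enrich_video_details` on a daily schedule keeps the cache compliant without re-fetching everything.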
Key Takeaways
- Dailymotion's Data API is one of the more accessible video platform APIs — free registration, reasonably generous rate limits, and granular field selection.
- Always use the fields parameter to request only what you need. Expensive computed fields (like views_last_hour) increase server processing time and likely contribute to rate limit triggers.
- Geographic restrictions are the biggest practical challenge — content libraries vary significantly by country, especially for news and sports.
- ThorData residential proxies with country targeting let you see each market's content catalog as local users see it — essential for cross-market media analysis and comprehensive trending data.
- Authenticate even if you don't strictly need to — the rate limit increase from 600 to 5,000 requests per 10-minute window makes it worthwhile for any sustained collection.
- Build in daily snapshot logic from the start: view counts that seemed stale yesterday become valuable trend data after a month of collection.