Data Sources

This document provides a conceptual overview of the different data sources available for building personalized feeds. These data sources are organized into onchain and offchain graphs that provide the foundation for understanding user behavior, content relationships, and engagement patterns.

Overview

Feed recommendations are built using data from two main graph types:

  1. Onchain Graphs - Data recorded on blockchain networks
  2. Offchain Graphs - Data from social platforms and applications

Together, these graphs enable personalized, relevant content experiences for users across Farcaster, Zora, Polymarket, and other Web3 platforms.


Searchable Indices

The Studio APIs expose four Elasticsearch indices that you can query directly via the Search endpoints. Each index is optimized for a specific data source with its own filterable fields, sort options, and AI labels.

| Index | Content | Description | Key Sort Fields |
|---|---|---|---|
| polymarket-items | Prediction markets | Active and historical Polymarket markets with pricing, volume, liquidity, and AI labels | volume_24hr, liquidity, end_date, created_at |
| polymarket-wallets | Trader profiles | Polymarket wallet profiles with trading stats, PnL, AI-inferred interest labels, and similar wallets | volume, pnl, updated_at |
| farcaster-items | Social posts | Farcaster casts with engagement metrics, AI topic/sentiment labels, and content moderation scores | score_popular, score_trending, item_creation_timestamp |
| zora-coins | NFT coins/tokens | Zora coins with market data, creator info, media metadata, and trading metrics | zora_market_cap, zora_volume_24h, zora_created_at |

Quick test — fetch trending Polymarket markets:

curl -X POST https://api.mbd.xyz/v3/studio/search/filter_and_sort \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "index": "polymarket-items",
    "size": 10,
    "sort_by": { "field": "volume_24hr", "order": "desc" },
    "include": [
      { "filter": "term", "field": "active", "value": true },
      { "filter": "numeric", "field": "liquidity_num", "operator": ">", "value": 10000 }
    ]
  }'

For complete per-index field tables, filter types, and ready-to-use recipes, see the Filter Cookbook.
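
The same request body can also be assembled programmatically. The following is a minimal Python sketch that only builds the JSON payload and does not send it. It assumes the payload shape shown in the curl example; the helper name build_filter_and_sort_payload and the SORT_FIELDS mapping are illustrative and not part of the Studio API. The mapping copies only the key sort fields listed in the table, so the check is deliberately conservative and may reject fields the API actually accepts.

```python
import json

# Key sort fields per index, copied from the table above (may not be exhaustive).
SORT_FIELDS = {
    "polymarket-items": {"volume_24hr", "liquidity", "end_date", "created_at"},
    "polymarket-wallets": {"volume", "pnl", "updated_at"},
    "farcaster-items": {"score_popular", "score_trending", "item_creation_timestamp"},
    "zora-coins": {"zora_market_cap", "zora_volume_24h", "zora_created_at"},
}


def build_filter_and_sort_payload(index, sort_field, order="desc", size=10, include=None):
    """Build a request body shaped like the curl example (hypothetical helper)."""
    if index not in SORT_FIELDS:
        raise ValueError(f"unknown index: {index}")
    if sort_field not in SORT_FIELDS[index]:
        raise ValueError(f"{sort_field!r} is not a listed sort field for {index}")
    return {
        "index": index,
        "size": size,
        "sort_by": {"field": sort_field, "order": order},
        "include": list(include or []),
    }


# Reproduce the body of the curl example above.
payload = build_filter_and_sort_payload(
    "polymarket-items",
    "volume_24hr",
    include=[
        {"filter": "term", "field": "active", "value": True},
        {"filter": "numeric", "field": "liquidity_num", "operator": ">", "value": 10000},
    ],
)
print(json.dumps(payload, indent=2))
```

Validating the sort field before sending catches a mismatched index and field pair (for example, sorting zora-coins by pnl) locally rather than at the API.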


Onchain Graphs

Onchain graphs represent data that is permanently recorded on blockchain networks. This data is transparent, verifiable, and provides a reliable foundation for understanding user behavior across different protocols.

Agents

Agents are the user profiles and wallet addresses that interact with onchain protocols. An agent can be a standard social media profile or a profile associated with a wallet address.

Platform Examples:

  • Farcaster: Farcaster IDs (FIDs) linked to wallet addresses, verified onchain identities, creators, collectors, and participants
  • Zora: Wallet addresses that create, mint, or trade Zora coins (creators and collectors)
  • Polymarket: Wallet addresses that participate in prediction markets (traders, market creators, liquidity providers)

Use Cases: User profiling, behavior analysis, similar user discovery, creator identification, wallet-based personalization

Intent

Intent represents user actions and interactions recorded onchain. These include both user-to-user and user-to-item interactions that indicate preferences, interests, and engagement patterns.

Platform Examples:

  • Farcaster: Onchain social actions, wallet-based interactions, cross-protocol activity (NFT purchases, token transfers)
  • Zora: User-to-item: minting, purchasing, trading, holding coins. User-to-user: following creators, supporting creators' work
  • Polymarket: User-to-item: placing bets, trading positions, providing liquidity, creating markets. User-to-user: following traders, mirroring strategies

Use Cases: Understanding user preferences, predicting future interactions, identifying trending content, building affinity scores
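
One way to picture how intent data could turn into affinity scores is a weighted count of actions per user and target. The sketch below is an assumption for illustration only: the event tuples, the ACTION_WEIGHTS values, and the affinity_scores function are hypothetical and do not describe how the platform actually computes affinity.

```python
from collections import defaultdict

# Hypothetical weights: stronger commitments count more toward affinity.
ACTION_WEIGHTS = {"follow": 1.0, "mint": 2.0, "purchase": 3.0, "trade": 3.0}


def affinity_scores(events):
    """Sum action weights per (user, target) pair; unknown actions count as 0."""
    scores = defaultdict(float)
    for user, action, target in events:
        scores[(user, target)] += ACTION_WEIGHTS.get(action, 0.0)
    return dict(scores)


# Made-up sample events in the form (user, action, target).
events = [
    ("0xabc", "follow", "creator_1"),  # user-to-user
    ("0xabc", "mint", "coin_7"),       # user-to-item
    ("0xabc", "trade", "coin_7"),      # user-to-item
    ("0xdef", "purchase", "coin_7"),   # user-to-item
]
scores = affinity_scores(events)
print(scores)
```

Both user-to-user and user-to-item interactions fit the same (user, action, target) shape, which is why a single accumulator can cover both.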

Assets

Assets represent onchain content or digital items that users interact with. These can be user-generated content (text, images, audio, video posts) or onchain tokens, NFTs, or market contracts.

Platform Examples:

  • Farcaster: Casts (posts) referencing onchain content, NFTs embedded in casts, tokens/coins mentioned or shared, onchain media
  • Zora: Zora coins (NFTs/tokens) with onchain metadata, ownership history, trading data, market data (volume, price, liquidity)
  • Polymarket: Prediction market contracts with onchain resolution conditions, trading history, market data (volume, liquidity, prices), outcomes and payouts

Use Cases: Content discovery, similar asset identification, trending asset detection, market-based filtering and ranking


Offchain Graphs

Offchain graphs represent data from social platforms and applications that is not recorded on blockchain networks. This data provides rich context about user behavior, content engagement, and social relationships.

Impressions

Impressions track when users view or are exposed to content, even if they don't explicitly interact with it. This data shows what content a user has already seen and helps avoid showing the same content again.

Platform Examples:

  • Farcaster: Track which casts a user has viewed, scroll depth, time spent viewing, feed position, cross-app impression data
  • Zora: Track which coins a user has viewed in marketplaces, browsing history, time spent viewing coin details, marketplace navigation patterns
  • Polymarket: Track which markets a user has viewed, market browsing patterns, time spent analyzing market details, market discovery paths

Use Cases: Preventing duplicate content, understanding content discovery patterns, optimizing feed freshness, tracking engagement beyond explicit interactions
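
As a rough illustration of the duplicate-prevention use case, the sketch below removes already-viewed items from a candidate list while keeping the original order. The drop_seen function and the sample IDs are hypothetical; how impressions are actually stored and applied is not specified here.

```python
def drop_seen(candidates, seen_ids):
    """Return candidates in their original order, skipping already-viewed items."""
    seen = set(seen_ids)  # set lookup keeps the filter linear in len(candidates)
    return [item_id for item_id in candidates if item_id not in seen]


# Made-up sample data.
candidates = ["cast_1", "cast_2", "cast_3", "cast_4"]
seen_ids = ["cast_2", "cast_4"]
fresh = drop_seen(candidates, seen_ids)
print(fresh)
```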

Metadata

Metadata is descriptive information about content and users, derived from AI analysis and platform data. It covers content analysis, labeling, and user profiling.

Platform Examples:

  • Farcaster: Content metadata (AI labels for topics/sentiment/emotion/moderation/web3_topics, media embeddings, text analysis, author info). User metadata (user embeddings, profile info, following graph, engagement patterns)
  • Zora: Asset metadata (coin descriptions, creator info, media analysis, market performance data, trading patterns). User metadata (creator profiles, collector preferences, trading behavior)
  • Polymarket: Market metadata (market descriptions, categories, question analysis, performance metrics, resolution conditions). User metadata (trader profiles, trading history, market preferences)

Use Cases: Content filtering and categorization, topic-based recommendations, quality control and moderation, personalization based on content analysis

Catalog

The Catalog represents the comprehensive inventory of all available content and users in the system. It's the complete database that feeds use to generate candidates.

Platform Examples:

  • Farcaster: Complete catalog of all casts (posts), full user profile database with FIDs, channel and community listings, historical content archive, real-time content stream
  • Zora: Complete catalog of all Zora coins, creator and collector profiles, market data and trading history, collection and series information, real-time minting and trading activity
  • Polymarket: Complete catalog of all prediction markets, market categories and tags, trader profiles and histories, market performance data, real-time market activity

Use Cases: Candidate generation for feeds, content discovery and search, user and content resolution, comprehensive data access for filtering


Using Data Sources in Feed Building

Candidate Generation

Feed builders use data sources to define the pool of content:

  • Agents → Filter by specific creators or user segments
  • Assets → Filter by content type, topics, or categories
  • Catalog → Access the full inventory for candidate generation
  • Metadata → Filter by AI labels, topics, sentiment, etc.
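
The filters above can be pictured as predicates applied to the catalog. The following is a minimal in-memory sketch under stated assumptions: the Item class, its fields, and generate_candidates are hypothetical names, and a real feed would query the search indices rather than a Python list.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Hypothetical catalog entry with a creator (agent) and AI labels (metadata)."""
    item_id: str
    creator: str
    labels: set = field(default_factory=set)


def generate_candidates(catalog, creators=None, required_labels=None):
    """Keep items by the given creators (if any) that carry all required labels."""
    required = set(required_labels or [])
    return [
        item
        for item in catalog
        if (creators is None or item.creator in creators)
        and required <= item.labels  # subset test: every required label is present
    ]


# Made-up sample catalog.
catalog = [
    Item("cast_1", "alice", {"crypto", "positive"}),
    Item("cast_2", "bob", {"sports"}),
    Item("cast_3", "alice", {"sports", "positive"}),
]
pool = generate_candidates(catalog, creators={"alice"}, required_labels=["positive"])
print([item.item_id for item in pool])
```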

Ranking

Data sources inform ranking algorithms:

  • Intent → User interaction history informs interest and affinity scores
  • Impressions → Track what users have seen to avoid duplicates
  • Metadata → Content analysis influences relevance scoring
  • Agents → Social graph data powers affinity-based ranking
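
To make the combination of these signals concrete, the sketch below scores each candidate as a weighted sum of creator affinity and content relevance, and skips items the user has already seen. Everything in it is an assumption for illustration: the rank function, the weights, and the sample values are hypothetical and not the platform's ranking algorithm.

```python
def rank(candidates, affinity, relevance, seen, w_affinity=0.6, w_relevance=0.4):
    """Order unseen candidates by a weighted sum of affinity and relevance."""
    scored = []
    for item_id, creator in candidates:
        if item_id in seen:  # impressions: skip already-viewed items
            continue
        score = (
            w_affinity * affinity.get(creator, 0.0)      # intent and agent signals
            + w_relevance * relevance.get(item_id, 0.0)  # metadata signal
        )
        scored.append((score, item_id))
    # Highest score first; ties broken by item ID for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item_id for _, item_id in scored]


# Made-up sample inputs.
candidates = [("cast_1", "alice"), ("cast_2", "bob"), ("cast_3", "carol"), ("cast_4", "alice")]
affinity = {"alice": 0.9, "bob": 0.2}
relevance = {"cast_1": 0.1, "cast_2": 0.8, "cast_3": 0.5, "cast_4": 0.7}
seen = {"cast_4"}
ranked = rank(candidates, affinity, relevance, seen)
print(ranked)
```

A linear blend is only one option; it is used here because it keeps each signal's contribution easy to inspect.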

Personalization

All data sources contribute to personalization:

  • Agents → User profiles and behavior patterns
  • Intent → Interaction history and preferences
  • Assets → Content characteristics and relationships
  • Metadata → AI-derived insights about content and users
  • Impressions → View history and engagement patterns
  • Catalog → Complete context for recommendations

Summary

Feed building leverages data from both onchain and offchain sources:

Onchain Graphs provide verifiable, permanent records of:

  • Agents (users/wallets) and their identities
  • Intent (interactions) recorded on blockchain
  • Assets (content/tokens/markets) with onchain metadata

Offchain Graphs provide rich, contextual data about:

  • Impressions (view and exposure tracking)
  • Metadata (AI analysis and content enrichment)
  • Catalog (complete inventory of users and content)

By combining onchain verifiability with offchain richness, feed builders can create highly personalized, relevant content experiences for users across Farcaster, Zora, Polymarket, and other Web3 platforms.


References

  • Check out our AI pipeline to see what's under the hood.
  • Search guide — Studio API Search endpoints: filter_and_sort, boost, semantic, frequent_values
  • Filter Cookbook — All 12 filter types, per-index filterable fields, AI label taxonomy, and 12 ready-to-use recipes
  • Quickstart — Full 4-stage Studio API pipeline walkthrough