Data Sources

This document provides a conceptual overview of the different data sources available for building personalized feeds. These data sources are organized into onchain and offchain graphs that provide the foundation for understanding user behavior, content relationships, and engagement patterns.

Overview

Feed recommendations are built using data from two main graph types:

  1. Onchain Graphs - Data recorded on blockchain networks
  2. Offchain Graphs - Data from social platforms and applications

Together, these graphs enable personalized, relevant content experiences for users across Farcaster, Zora, Polymarket, and other Web3 platforms.


Onchain Graphs

Onchain graphs represent data that is permanently recorded on blockchain networks. This data is transparent and verifiable, and it provides a reliable foundation for understanding user behavior across different protocols.

Agents

Agents represent user profiles or wallet addresses that interact with onchain protocols. They can be standard social media profiles or profiles associated with wallet addresses.

Platform Examples:

  • Farcaster: Farcaster IDs (FIDs) linked to wallet addresses, verified onchain identities, creators, collectors, and participants
  • Zora: Wallet addresses that create, mint, or trade Zora coins (creators and collectors)
  • Polymarket: Wallet addresses that participate in prediction markets (traders, market creators, liquidity providers)

Use Cases: User profiling, behavior analysis, similar user discovery, creator identification, wallet-based personalization

Intent

Intent represents user actions and interactions recorded onchain. These include both user-to-user and user-to-item interactions that indicate preferences, interests, and engagement patterns.

Platform Examples:

  • Farcaster: Onchain social actions, wallet-based interactions, cross-protocol activity (NFT purchases, token transfers)
  • Zora: User-to-item: minting, purchasing, trading, holding coins. User-to-user: following creators, supporting creators' work
  • Polymarket: User-to-item: placing bets, trading positions, providing liquidity, creating markets. User-to-user: following traders, mirroring strategies

Use Cases: Understanding user preferences, predicting future interactions, identifying trending content, building affinity scores
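
As a rough illustration of how intent data might feed an affinity score, the sketch below sums weighted interactions for each pair of agent and target. The IntentEvent fields, the action names, and the weights are hypothetical; they are not taken from any platform's schema or from a specific feed-building API.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical weights: stronger actions count more toward affinity.
ACTION_WEIGHTS = {"mint": 3.0, "trade": 2.0, "follow": 1.0}


@dataclass(frozen=True)
class IntentEvent:
    agent: str   # wallet address or profile ID performing the action
    target: str  # creator, asset, or market the action is directed at
    action: str  # expected to be a key of ACTION_WEIGHTS


def affinity_scores(events):
    """Sum weighted interactions per (agent, target) pair."""
    scores = defaultdict(float)
    for event in events:
        # Unknown actions contribute nothing rather than raising an error.
        scores[(event.agent, event.target)] += ACTION_WEIGHTS.get(event.action, 0.0)
    return dict(scores)


events = [
    IntentEvent("0xabc", "creator_1", "mint"),
    IntentEvent("0xabc", "creator_1", "follow"),
    IntentEvent("0xabc", "creator_2", "trade"),
]
scores = affinity_scores(events)
```

A production system would likely also decay older interactions and normalize across users; this sketch omits both to keep the idea visible.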

Assets

Assets represent onchain content or digital items that users interact with. These can be either user-generated content (text, images, audio, or video posts) or onchain items such as tokens, NFTs, and market contracts.

Platform Examples:

  • Farcaster: Casts (posts) referencing onchain content, NFTs embedded in casts, tokens/coins mentioned or shared, onchain media
  • Zora: Zora coins (NFTs/tokens) with onchain metadata, ownership history, trading data, market data (volume, price, liquidity)
  • Polymarket: Prediction market contracts with onchain resolution conditions, trading history, market data (volume, liquidity, prices), outcomes and payouts

Use Cases: Content discovery, similar asset identification, trending asset detection, market-based filtering and ranking


Offchain Graphs

Offchain graphs represent data from social platforms and applications that is not recorded on blockchain networks. This data provides rich context about user behavior, content engagement, and social relationships.

Impressions

Impressions track when users view or are exposed to content, even if they don't explicitly interact with it. This data shows what content a user has already seen, which helps avoid showing the same content twice.

Platform Examples:

  • Farcaster: Track which casts a user has viewed, scroll depth, time spent viewing, feed position, cross-app impression data
  • Zora: Track which coins a user has viewed in marketplaces, browsing history, time spent viewing coin details, marketplace navigation patterns
  • Polymarket: Track which markets a user has viewed, market browsing patterns, time spent analyzing market details, market discovery paths

Use Cases: Preventing duplicate content, understanding content discovery patterns, optimizing feed freshness, tracking engagement beyond explicit interactions
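
The duplicate-prevention use case can be sketched as a simple filter over candidate IDs. This is a minimal example under assumed inputs: the function name, the ID format, and the idea of passing impressions as a flat list of IDs are all hypothetical.

```python
def filter_unseen(candidates, seen_ids):
    """Return candidates the user has not been shown, preserving order."""
    seen = set(seen_ids)  # set lookup keeps the filter linear in the candidate count
    return [item_id for item_id in candidates if item_id not in seen]


candidates = ["cast_1", "cast_2", "cast_3", "cast_4"]
seen_ids = ["cast_2", "cast_4"]  # hypothetical impression log for one user
unseen = filter_unseen(candidates, seen_ids)
```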

Metadata

Metadata is descriptive information about content and users, derived from AI analysis and platform data. It covers content analysis, labeling, and user profiling.

Platform Examples:

  • Farcaster: Content metadata (AI labels for topics/sentiment/emotion/moderation/web3_topics, media embeddings, text analysis, author info). User metadata (user embeddings, profile info, following graph, engagement patterns)
  • Zora: Asset metadata (coin descriptions, creator info, media analysis, market performance data, trading patterns). User metadata (creator profiles, collector preferences, trading behavior)
  • Polymarket: Market metadata (market descriptions, categories, question analysis, performance metrics, resolution conditions). User metadata (trader profiles, trading history, market preferences)

Use Cases: Content filtering and categorization, topic-based recommendations, quality control and moderation, personalization based on content analysis

Catalog

Catalog represents the comprehensive inventory of all available content and users in the system. It is the complete database from which feeds generate candidates.

Platform Examples:

  • Farcaster: Complete catalog of all casts (posts), full user profile database with FIDs, channel and community listings, historical content archive, real-time content stream
  • Zora: Complete catalog of all Zora coins, creator and collector profiles, market data and trading history, collection and series information, real-time minting and trading activity
  • Polymarket: Complete catalog of all prediction markets, market categories and tags, trader profiles and histories, market performance data, real-time market activity

Use Cases: Candidate generation for feeds, content discovery and search, user and content resolution, comprehensive data access for filtering


Using Data Sources in Feed Building

Candidate Generation

Feed builders use data sources to define the pool of content:

  • Agents → Filter by specific creators or user segments
  • Assets → Filter by content type, topics, or categories
  • Catalog → Access the full inventory for candidate generation
  • Metadata → Filter by AI labels, topics, sentiment, etc.
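
The filters above can be sketched as a single pass over the catalog. The CatalogItem fields and the generate_candidates function are hypothetical names chosen for illustration; real catalog entries and filter options will differ by platform.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogItem:
    item_id: str
    author: str        # agent that created the item
    content_type: str  # e.g. "cast", "coin", "market"
    labels: set = field(default_factory=set)  # AI-derived topic labels


def generate_candidates(catalog, authors=None, content_type=None, labels=None):
    """Keep catalog items that pass every filter that was supplied."""
    result = []
    for item in catalog:
        if authors is not None and item.author not in authors:
            continue
        if content_type is not None and item.content_type != content_type:
            continue
        # Require at least one overlapping label when a label filter is given.
        if labels is not None and not (item.labels & set(labels)):
            continue
        result.append(item)
    return result


catalog = [
    CatalogItem("cast_1", "alice", "cast", {"art"}),
    CatalogItem("coin_1", "alice", "coin", {"art", "music"}),
    CatalogItem("cast_2", "bob", "cast", {"sports"}),
]
candidates = generate_candidates(catalog, authors={"alice"}, labels={"music"})
```

Leaving a filter as None skips it, so the same function covers both narrow creator-specific pools and the full inventory.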

Ranking

Data sources inform ranking algorithms:

  • Intent → User interaction history informs interest and affinity scores
  • Impressions → Track what users have seen to avoid duplicates
  • Metadata → Content analysis influences relevance scoring
  • Agents → Social graph data powers affinity-based ranking
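
One way these signals might be combined is a score that adds author affinity and content relevance, then down-weights items the user has already seen. The field names, the additive formula, and the seen_penalty value are assumptions made for this sketch, not a description of an actual ranking algorithm.

```python
def rank(candidates, affinity, relevance, seen, seen_penalty=0.5):
    """Sort candidates by a combined score, highest first."""
    def score(item):
        # Intent/Agents supply affinity per author; Metadata supplies relevance per item.
        base = affinity.get(item["author"], 0.0) + relevance.get(item["id"], 0.0)
        # Impressions: already-seen items are down-weighted, not removed.
        return base * seen_penalty if item["id"] in seen else base

    return sorted(candidates, key=score, reverse=True)


candidates = [
    {"id": "a", "author": "x"},
    {"id": "b", "author": "y"},
    {"id": "c", "author": "z"},
]
affinity = {"x": 2.0, "y": 1.0}             # hypothetical affinity per author
relevance = {"a": 0.5, "b": 0.8, "c": 0.3}  # hypothetical relevance per item
seen = {"a"}                                # items the user has already viewed

ranked_ids = [item["id"] for item in rank(candidates, affinity, relevance, seen)]
```

Here item "a" has the highest raw score but drops below "b" once the seen penalty is applied.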

Personalization

All data sources contribute to personalization:

  • Agents → User profiles and behavior patterns
  • Intent → Interaction history and preferences
  • Assets → Content characteristics and relationships
  • Metadata → AI-derived insights about content and users
  • Impressions → View history and engagement patterns
  • Catalog → Complete context for recommendations

Summary

Feed building leverages data from both onchain and offchain sources:

Onchain Graphs provide verifiable, permanent records of:

  • Agents (users/wallets) and their identities
  • Intent (interactions) recorded onchain
  • Assets (content/tokens/markets) with onchain metadata

Offchain Graphs provide rich, contextual data about:

  • Impressions (view and exposure tracking)
  • Metadata (AI analysis and content enrichment)
  • Catalog (complete inventory of users and content)

By combining onchain verifiability with offchain richness, feed builders can create highly personalized, relevant content experiences for users across Farcaster, Zora, Polymarket, and other Web3 platforms.

