Scoring & Ranking

Stages 3-4 of the Studio API pipeline. After Search generates candidates and Features enriches them, these two services rerank and finalize your feed.

Search  →  Features  →  Scoring  →  Ranking
  ↓           ↓            ↓           ↓
Candidates  Enriched     Reranked    Final feed
from ES     with ML      by model    with diversity
            features                 & limits

Base URL: https://api.mbd.xyz/v3/studio
Auth: Authorization: Bearer <your-console-api-key> (get your key from the Embed Console)


Stage 3: Scoring

The Scoring service reranks items using a trained ML model (AWS Personalize). Given a user and items, it returns the same items in a new personalized order.

Endpoint

POST /scoring/ranking_model/{model_name}

Available models: polymarket-rerank-v1

Request

curl -X POST https://api.mbd.xyz/v3/studio/scoring/ranking_model/polymarket-rerank-v1 \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "user_id": "0xf68a281980f8c13828e84e147e3822381d6e5b1b",
    "item_ids": ["1289113", "1288428", "1287500"]
  }'

Request Parameters

Parameter  Type      Required  Description
user_id    string    Yes       Wallet address of the user
item_ids   string[]  Yes       Item IDs to rerank
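
The same request can be assembled from Python. This is a minimal sketch using only the standard library; it builds the request object without sending it. The helper name build_scoring_request is hypothetical and not part of any official SDK.

```python
import json
import urllib.request

BASE_URL = "https://api.mbd.xyz/v3/studio"


def build_scoring_request(api_key, model_name, user_id, item_ids):
    """Build (but do not send) a POST to /scoring/ranking_model/{model_name}."""
    body = json.dumps({"user_id": user_id, "item_ids": list(item_ids)}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/scoring/ranking_model/{model_name}",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


req = build_scoring_request(
    "YOUR_API_KEY",
    "polymarket-rerank-v1",
    "0xf68a281980f8c13828e84e147e3822381d6e5b1b",
    ["1289113", "1288428", "1287500"],
)
# To send it: urllib.request.urlopen(req), then json.load() the response body.
```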

Response

{
  "result": ["1288428", "1289113", "1287500"]
}

The response is the same item IDs in a new order — the model's predicted best-to-worst ranking for this user.

Note: Errors are returned with HTTP 200 and an error field, not HTTP error codes. Always check for response.error.
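
Because a failed call can still arrive with HTTP 200, a client may want to route every response body through one check. The sketch below assumes only what the note states (an error field is present on failure); the exact shape of that field is not documented here, and both StudioAPIError and unwrap_result are hypothetical names.

```python
import json


class StudioAPIError(Exception):
    """Raised when a response body carries an error field."""


def unwrap_result(raw_body):
    """Parse a response body and return its "result", or raise on "error"."""
    payload = json.loads(raw_body)
    # Check the body, not the HTTP status: errors come back with status 200.
    if isinstance(payload, dict) and payload.get("error"):
        raise StudioAPIError(payload["error"])
    return payload["result"]


ranked = unwrap_result('{"result": ["1288428", "1289113", "1287500"]}')
```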


Stage 4: Ranking

The Ranking service applies a 3-stage post-processing pipeline: sort → diversity → limits_by_field. Each stage is optional. Use this to blend multiple signals, enforce topic diversity, and cap repetitive content.

Endpoint

POST /ranking/feed

Request

curl -X POST https://api.mbd.xyz/v3/studio/ranking/feed \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "items": [
      { "item_id": "1288428", "rank_score": 1.0, "topic_score": 0.85, "sem_sim_closest": 0.88, "primary_tag": "Crypto" },
      { "item_id": "1289113", "rank_score": 0.5, "topic_score": 0.72, "sem_sim_closest": 0.72, "primary_tag": "Politics" },
      { "item_id": "1287500", "rank_score": 0.0, "topic_score": 0.65, "sem_sim_closest": 0.65, "primary_tag": "Crypto" }
    ],
    "sort": {
      "method": "sort",
      "params": { "fields": ["rank_score"], "direction": ["desc"] }
    },
    "diversity": {
      "method": "fields",
      "params": { "fields": ["primary_tag"] }
    },
    "limits_by_field": {
      "every_n": 5,
      "rules": [{ "field": "primary_tag", "limit": 2 }]
    }
  }'

Here rank_score carries the scoring model's order, so sorting on it in descending order preserves that order as the base. The request then applies tag-based diversity (round-robin across primary_tag values) and limits (at most 2 markets with the same tag in any window of 5).

Request Parameters

Parameter        Type      Required  Description
items            object[]  Yes       Items with item_id + any numeric fields for sorting
sort             object    No        Sorting configuration (see Sort Methods below)
diversity        object    No        Diversity configuration (see Diversity Methods below)
limits_by_field  object    No        Per-field frequency limits

Sort Methods

Method  What it does                                         Params
sort    Multi-field sort (like SQL ORDER BY)                 { fields: [...], direction: [...] }
linear  Weighted sum → descending order                      [{ field, weight }, ...]
mix     Interleave from multiple sorted lists by percentage  [{ field, percentage, direction? }, ...]

sort — Deterministic multi-field ordering:

{
  "method": "sort",
  "params": {
    "fields": ["volume_24hr", "topic_similarity"],
    "direction": ["desc", "desc"]
  }
}

linear — Weighted combination of signals. Score = sum(field_i * weight_i), sorted descending:

{
  "method": "linear",
  "params": [
    { "field": "topic_score", "weight": 0.5 },
    { "field": "sem_sim_closest", "weight": 0.5 }
  ]
}

Warning: The linear method does not normalize fields to the same scale. If you mix fields with different magnitudes (e.g., volume_24hr in millions vs. sem_sim_closest at 0–1), the large-scale field will dominate regardless of weight. Only use linear with fields that are all on the same 0–1 scale (like topic_score and sem_sim_closest), or use the sort method instead.
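
The effect described in the warning can be reproduced locally. The sketch below implements the stated formula (sum of field value times weight, sorted descending) in plain Python; the two items and their values are made up for illustration, and this is not the service's own code.

```python
def linear_score(item, weights):
    """Weighted sum of raw field values, with no normalization."""
    return sum(item[field] * weight for field, weight in weights)


def linear_sort(items, weights):
    """Order items by their weighted sum, highest first."""
    return sorted(items, key=lambda it: linear_score(it, weights), reverse=True)


items = [
    {"item_id": "a", "sem_sim_closest": 0.95, "volume_24hr": 1_000_000},
    {"item_id": "b", "sem_sim_closest": 0.10, "volume_24hr": 2_000_000},
]

# Even with 99% of the weight on similarity, raw volume decides the order:
# a scores about 10000.94 and b about 20000.10.
weights = [("sem_sim_closest", 0.99), ("volume_24hr", 0.01)]
order = [it["item_id"] for it in linear_sort(items, weights)]
```

Item a is far more similar, yet b comes first because its volume term is four orders of magnitude larger than any similarity term.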

mix — Interleave items from multiple sorted orderings by percentage:

{
  "method": "mix",
  "params": [
    { "field": "topic_similarity", "percentage": 60, "direction": "desc" },
    { "field": "volume_24hr", "percentage": 40, "direction": "desc" }
  ]
}
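
The exact interleaving rule is not spelled out here, so the following is only one plausible reading: keep one sorted ordering per field, and at each position draw from whichever ordering is furthest behind its target percentage, skipping items that were already placed. The default direction of "desc" and the tie-breaking toward the first entry are assumptions of this sketch.

```python
def mix(items, params):
    """Interleave several sorted orderings, skipping items already taken.

    Assumes unique item_id values and positive percentages.
    """
    queues = []
    for p in params:
        reverse = p.get("direction", "desc") == "desc"  # assumed default
        queues.append(sorted(items, key=lambda it: it[p["field"]], reverse=reverse))

    taken = [0] * len(params)
    seen, out = set(), []
    while len(out) < len(items):
        # Pick the ordering that is furthest behind its target share.
        i = min(range(len(params)), key=lambda k: taken[k] / params[k]["percentage"])
        nxt = next(it for it in queues[i] if it["item_id"] not in seen)
        seen.add(nxt["item_id"])
        out.append(nxt)
        taken[i] += 1
    return out


items = [
    {"item_id": "A", "topic_similarity": 0.9, "volume_24hr": 10},
    {"item_id": "B", "topic_similarity": 0.8, "volume_24hr": 20},
    {"item_id": "C", "topic_similarity": 0.7, "volume_24hr": 30},
    {"item_id": "D", "topic_similarity": 0.6, "volume_24hr": 40},
    {"item_id": "E", "topic_similarity": 0.5, "volume_24hr": 50},
]
params = [
    {"field": "topic_similarity", "percentage": 60, "direction": "desc"},
    {"field": "volume_24hr", "percentage": 40, "direction": "desc"},
]
mixed = [it["item_id"] for it in mix(items, params)]
```

With these five items, three slots are filled from the topic_similarity ordering and two from the volume_24hr ordering, matching the 60/40 split.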

Diversity Methods

Method    What it does                             Params
fields    Round-robin over groups by field values  { fields: [...] }
semantic  MMR-style diversity using embeddings     { lambda: 0-1, horizon: >=5 }

fields — Round-robin interleaving ensures no single category dominates:

{
  "method": "fields",
  "params": { "fields": ["category"] }
}
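
Round-robin interleaving can be pictured with a short local sketch. It groups the already-sorted items by a single field and takes one item from each group in turn; how the service combines several fields or orders the groups is not documented here, so first-appearance order is an assumption.

```python
def round_robin(items, field):
    """Interleave items so consecutive picks cycle through the field's values."""
    groups = {}
    for it in items:
        # dicts keep insertion order, so groups appear in first-seen order
        # and each group keeps the incoming (sorted) order of its items.
        groups.setdefault(it[field], []).append(it)

    queues = list(groups.values())
    out = []
    while any(queues):
        for q in queues:
            if q:
                out.append(q.pop(0))
    return out


items = [
    {"item_id": "1", "category": "Crypto"},
    {"item_id": "2", "category": "Crypto"},
    {"item_id": "3", "category": "Politics"},
    {"item_id": "4", "category": "Crypto"},
    {"item_id": "5", "category": "Sports"},
]
diversified = [it["item_id"] for it in round_robin(items, "category")]
```

The three Crypto items no longer sit next to each other at the top: the first pass takes one item from each category, and the remaining Crypto items follow.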

semantic — Maximal Marginal Relevance balances relevance vs. novelty. Items must have an embed field. Lambda closer to 1 = favor relevance; closer to 0 = favor diversity:

{
  "method": "semantic",
  "params": { "lambda": 0.7, "horizon": 10 }
}
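
For intuition about lambda, here is a sketch of textbook Maximal Marginal Relevance over items that carry an embed vector. It is not the service's implementation: the relevance field name is hypothetical, cosine similarity is an assumed choice of metric, and horizon is left out because its exact meaning is not defined in this section.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(items, lam):
    """Greedy MMR: lam * relevance - (1 - lam) * max similarity to picked items."""
    remaining, selected = list(items), []
    while remaining:
        def mmr_score(it):
            redundancy = max(
                (cosine(it["embed"], s["embed"]) for s in selected), default=0.0
            )
            return lam * it["relevance"] - (1 - lam) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


items = [
    {"item_id": "a", "relevance": 0.9, "embed": [1.0, 0.0]},
    {"item_id": "b", "relevance": 0.8, "embed": [1.0, 0.1]},  # near-duplicate of a
    {"item_id": "c", "relevance": 0.6, "embed": [0.0, 1.0]},
]
balanced = [it["item_id"] for it in mmr(items, lam=0.5)]
relevance_only = [it["item_id"] for it in mmr(items, lam=1.0)]
```

At lambda 0.5 the near-duplicate b is pushed behind the less relevant but novel c; at lambda 1.0 the order is pure relevance.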

Limits by Field

Cap how often items with the same field value appear within a sliding window:

{
  "every_n": 5,
  "rules": [{ "field": "category", "limit": 3 }]
}

This means: within every 5 consecutive items, at most 3 can have the same category value.

  • every_n must be >= 2
  • Multiple rules can target different fields
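
The sliding-window rule can be checked locally against any ordered list. The sketch below only verifies the constraint as stated above (every run of every_n consecutive items); it does not reproduce how the service reorders items to satisfy it, and violates_limits is a hypothetical helper name.

```python
from collections import Counter


def violates_limits(items, every_n, rules):
    """Return True if any window of every_n consecutive items breaks a rule."""
    # A list shorter than every_n is treated as a single window.
    for start in range(max(1, len(items) - every_n + 1)):
        window = items[start:start + every_n]
        for rule in rules:
            counts = Counter(it[rule["field"]] for it in window)
            if counts and max(counts.values()) > rule["limit"]:
                return True
    return False


def with_categories(values):
    """Build minimal items from a list of category values."""
    return [{"category": v} for v in values]


rules = [{"field": "category", "limit": 3}]
ok = violates_limits(with_categories(["A", "A", "B", "A", "B", "A"]), 5, rules)
bad = violates_limits(with_categories(["A", "A", "A", "B", "A"]), 5, rules)
```

The first list never has more than 3 items of one category in any window of 5, so ok is False; the second has 4, so bad is True.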

Response

{
  "result": {
    "items": [
      { "item_id": "1288428", "score": 1.0 },
      { "item_id": "1289113", "score": 0.5 },
      { "item_id": "1287500", "score": 0.0 }
    ]
  }
}

Items come back in final order with position-based scores in [0, 1]. First item = 1.0, last item = 0.0.


Data Flow Between Stages

Stage     Input                               Output                           Passed to next stage
Search    Index, filters, sort                hits[] with _id and _source      Item IDs → Features & Scoring
Features  User profile + item IDs             Feature values per item          Feature values → Ranking items
Scoring   User ID + item IDs                  Reranked item ID array           Reranked order → Ranking items
Ranking   Items with numeric fields + config  Final ordered items with scores  Display to user
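
The hand-off from Scoring and Features into Ranking can be sketched as a small glue function. It converts the Scoring order into a position-based rank_score (1.0 for the first item, 0.0 for the last, evenly spaced in between, which matches the values in the Ranking request example) and merges in per-item features. The features_by_id mapping is a hypothetical shape chosen for illustration; the Features response format is not covered in this section.

```python
def build_ranking_items(scored_ids, features_by_id):
    """Turn a Scoring result plus per-item features into Ranking items."""
    n = len(scored_ids)
    items = []
    for pos, item_id in enumerate(scored_ids):
        # Position-based score: first item 1.0, last item 0.0.
        rank_score = 1.0 if n == 1 else 1 - pos / (n - 1)
        items.append(
            {"item_id": item_id, "rank_score": rank_score, **features_by_id.get(item_id, {})}
        )
    return items


scored_ids = ["1288428", "1289113", "1287500"]  # Scoring "result" array
features_by_id = {  # hypothetical shape, keyed by item ID
    "1288428": {"topic_score": 0.85, "primary_tag": "Crypto"},
    "1289113": {"topic_score": 0.72, "primary_tag": "Politics"},
    "1287500": {"topic_score": 0.65, "primary_tag": "Crypto"},
}
ranking_items = build_ranking_items(scored_ids, features_by_id)
```

The resulting list can be sent as the items array of a /ranking/feed request, with rank_score as the sort field.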

All Studio API Endpoints

Stage     Method  Endpoint                                 Purpose
Search    POST    /search/filter_and_sort                  Structured queries with filters + sorting
Search    POST    /search/boost                            Soft relevance tuning with boost multipliers
Search    POST    /search/semantic                         Text or vector similarity search
Search    GET     /search/frequent_values/{index}/{field}  Discover field values before filtering
Features  POST    /features/v1                             Enrich items with ML features
Scoring   POST    /scoring/ranking_model/{model_name}      Rerank with trained ML model
Ranking   POST    /ranking/feed                            Sort + diversity + limits pipeline
Stories   POST    /stories/generate                        Narrative content generation

What's Next

  • Search guide — Stage 1: filter_and_sort, boost, semantic, frequent_values
  • Features — Stage 2: enrich items with ML features
  • Filter Cookbook — All 12 filter types with per-index field tables and recipes
  • Quickstart — Full 4-stage pipeline walkthrough with copy-paste code

Looking for legacy scoring algorithms? They're documented in How Scoring Works.

