Scoring & Ranking

Stages 3 and 4 of the Studio API pipeline. After Search generates candidates and Features enriches them, Scoring reranks the candidates and Ranking finalizes your feed.

Search  →  Features  →  Scoring  →  Ranking
  ↓           ↓            ↓           ↓
Candidates  Enriched     Reranked    Final feed
from ES     with ML      by model    with diversity
            features                 & limits

Base URL: https://api.mbd.xyz/v3/studio
Auth: Authorization: Bearer <your-console-api-key> (get your key from the Embed Console)

Stage 3: Scoring

The Scoring service reranks items using a trained ML model (AWS Personalize). Given a user and items, it returns the same items in a new personalized order.

Endpoint

POST /scoring/ranking_model/{model_name}

Available models: polymarket-rerank-v1

Request

curl -X POST https://api.mbd.xyz/v3/studio/scoring/ranking_model/polymarket-rerank-v1 \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "user_id": "0xf68a281980f8c13828e84e147e3822381d6e5b1b",
    "item_ids": ["1289113", "1288428", "1287500"]
  }'

Request Parameters

Parameter	Type	Required	Description
`user_id`	string	Yes	Wallet address of the user
`item_ids`	string[]	Yes	Item IDs to rerank

Response

{
  "result": ["1288428", "1289113", "1287500"]
}

The response is the same item IDs in a new order — the model's predicted best-to-worst ranking for this user.

Note: Errors are returned with HTTP 200 and an error field in the response body, not with HTTP error status codes. Always check for response.error before reading result.
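Because a failed call still returns HTTP 200, a status check alone will not catch it. The sketch below shows one way to guard against that on an already-parsed JSON body. The names `StudioAPIError` and `unwrap_result` are hypothetical, and the exact shape of the error field is not documented here, so the helper treats any non-null value as a failure.

```python
from typing import Any


class StudioAPIError(Exception):
    """Hypothetical exception for a response body that carries an error field."""


def unwrap_result(body: dict[str, Any]) -> Any:
    # Assumption: any non-null "error" value means the call failed,
    # even though the HTTP status was 200.
    if body.get("error") is not None:
        raise StudioAPIError(str(body["error"]))
    return body["result"]
```

Calling `unwrap_result` on every parsed response keeps the error check in one place instead of repeating it at each call site.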

Stage 4: Ranking

The Ranking service applies a 3-stage post-processing pipeline: sort → diversity → limits_by_field. Each stage is optional. Use this to blend multiple signals, enforce topic diversity, and cap repetitive content.

Endpoint

POST /ranking/feed

Request

curl -X POST https://api.mbd.xyz/v3/studio/ranking/feed \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "items": [
      { "item_id": "1288428", "rank_score": 1.0, "topic_score": 0.85, "sem_sim_closest": 0.88, "primary_tag": "Crypto" },
      { "item_id": "1289113", "rank_score": 0.5, "topic_score": 0.72, "sem_sim_closest": 0.72, "primary_tag": "Politics" },
      { "item_id": "1287500", "rank_score": 0.0, "topic_score": 0.65, "sem_sim_closest": 0.65, "primary_tag": "Crypto" }
    ],
    "sort": {
      "method": "sort",
      "params": { "fields": ["rank_score"], "direction": ["desc"] }
    },
    "diversity": {
      "method": "fields",
      "params": { "fields": ["primary_tag"] }
    },
    "limits_by_field": {
      "every_n": 5,
      "rules": [{ "field": "primary_tag", "limit": 2 }]
    }
  }'

This keeps the scoring model's order as the base sort (rank_score descending), then applies tag-based diversity (round-robin across primary_tag values) and limits (at most 2 markets with the same tag in any window of 5 consecutive items).

Request Parameters

Parameter	Type	Required	Description
`items`	object[]	Yes	Items with `item_id` plus any fields referenced by `sort` (numeric), `diversity`, or `limits_by_field`
`sort`	object	No	Sorting configuration (see Sort Methods below)
`diversity`	object	No	Diversity configuration (see Diversity Methods below)
`limits_by_field`	object	No	Per-field frequency limits

Sort Methods

Method	What it does	Params
`sort`	Multi-field sort (like SQL ORDER BY)	`{ fields: [...], direction: [...] }`
`linear`	Weighted sum → descending order	`[{ field, weight }, ...]`
`mix`	Interleave from multiple sorted lists by percentage	`[{ field, percentage, direction? }, ...]`

sort — Deterministic multi-field ordering:

{
  "method": "sort",
  "params": {
    "fields": ["volume_24hr", "topic_similarity"],
    "direction": ["desc", "desc"]
  }
}
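To make the ORDER BY analogy concrete, here is a minimal local sketch of multi-field ordering. It is not the service's implementation: `multi_field_sort` is a hypothetical helper, and how the API breaks ties or handles missing fields is not stated above.

```python
def multi_field_sort(items, fields, direction):
    """Order items by several fields, the first field being the most significant."""
    ordered = list(items)
    # Python's sort is stable, so sorting by the least significant key first
    # and the most significant key last yields a multi-key ordering.
    for field, d in reversed(list(zip(fields, direction))):
        ordered.sort(key=lambda it: it[field], reverse=(d == "desc"))
    return ordered


items = [
    {"item_id": "a", "volume_24hr": 100, "topic_similarity": 0.2},
    {"item_id": "b", "volume_24hr": 100, "topic_similarity": 0.9},
    {"item_id": "c", "volume_24hr": 500, "topic_similarity": 0.1},
]
ordered = multi_field_sort(items, ["volume_24hr", "topic_similarity"], ["desc", "desc"])
```

Here `c` comes first on volume, and the tie between `a` and `b` is broken by topic_similarity, so the resulting order is c, b, a.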

linear — Weighted combination of signals. Score = sum(field_i * weight_i), sorted descending:

{
  "method": "linear",
  "params": [
    { "field": "topic_score", "weight": 0.5 },
    { "field": "sem_sim_closest", "weight": 0.5 }
  ]
}

Warning: The linear method does not normalize fields to the same scale. If you mix fields with different magnitudes (e.g., volume_24hr in millions vs. sem_sim_closest at 0–1), the large-scale field will dominate regardless of weight. Only use linear with fields that are all on the same 0–1 scale (like topic_score and sem_sim_closest), or use the sort method instead.
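The scale problem is easy to reproduce locally. The sketch below applies the weighted-sum formula from this section to two made-up items. `linear_sort` is a hypothetical helper and the sample values are illustrative only.

```python
def linear_sort(items, params):
    """Sort by sum(field_i * weight_i), descending, with no normalization."""
    def score(it):
        return sum(it[p["field"]] * p["weight"] for p in params)
    return sorted(items, key=score, reverse=True)


items = [
    {"item_id": "a", "volume_24hr": 2_000_000, "sem_sim_closest": 0.10},
    {"item_id": "b", "volume_24hr": 1_000_000, "sem_sim_closest": 0.95},
]
# Even with 99% of the weight on similarity, raw volume decides the order.
params = [
    {"field": "volume_24hr", "weight": 0.01},
    {"field": "sem_sim_closest", "weight": 0.99},
]
ordered = linear_sort(items, params)
```

Item `a` scores roughly 20,000 and item `b` roughly 10,001, so `a` ranks first despite its much lower similarity.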

mix — Interleave items from multiple sorted orderings by percentage:

{
  "method": "mix",
  "params": [
    { "field": "topic_similarity", "percentage": 60, "direction": "desc" },
    { "field": "volume_24hr", "percentage": 40, "direction": "desc" }
  ]
}
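The exact interleaving rule is not spelled out here, so the following is only one plausible reading, offered to build intuition. It assumes each output slot goes to whichever sorted list is furthest behind its target share, skips items already emitted, defaults direction to "desc", and expects unique item_id values. The helper name `mix` is hypothetical.

```python
def mix(items, params):
    """Interleave several sorted views of the same items by target percentage."""
    queues = []
    for p in params:
        # Assumption: direction defaults to "desc" when omitted.
        descending = p.get("direction", "desc") == "desc"
        queues.append(sorted(items, key=lambda it: it[p["field"]], reverse=descending))

    taken = [0] * len(params)  # how many slots each list has filled so far
    seen, out = set(), []
    while len(out) < len(items):
        slot = len(out) + 1
        # Pick the list with the largest shortfall against its percentage.
        k = max(range(len(params)),
                key=lambda i: params[i]["percentage"] * slot - 100 * taken[i])
        item = next(it for it in queues[k] if it["item_id"] not in seen)
        seen.add(item["item_id"])
        taken[k] += 1
        out.append(item)
    return out


items = [
    {"item_id": "A", "topic_similarity": 0.9, "volume_24hr": 10},
    {"item_id": "B", "topic_similarity": 0.8, "volume_24hr": 20},
    {"item_id": "C", "topic_similarity": 0.7, "volume_24hr": 50},
    {"item_id": "D", "topic_similarity": 0.6, "volume_24hr": 40},
    {"item_id": "E", "topic_similarity": 0.5, "volume_24hr": 30},
]
mixed = mix(items, [
    {"field": "topic_similarity", "percentage": 60, "direction": "desc"},
    {"field": "volume_24hr", "percentage": 40, "direction": "desc"},
])
```

Under these assumptions three of the five slots are filled from the topic_similarity ordering and two from the volume_24hr ordering.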

Diversity Methods

Method	What it does	Params
`fields`	Round-robin over groups by field values	`{ fields: [...] }`
`semantic`	MMR-style diversity using embeddings	`{ lambda: 0-1, horizon: >=5 }`

fields — Round-robin interleaving ensures no single category dominates:

{
  "method": "fields",
  "params": { "fields": ["category"] }
}
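As a rough illustration of round-robin interleaving, the sketch below groups items by the listed fields and then takes one item per group in turn, preserving the incoming order inside each group. The order in which the service visits groups is not documented here. This version assumes first-seen order, and `round_robin_by_fields` is a hypothetical name.

```python
from collections import deque


def round_robin_by_fields(items, fields):
    """Take one item from each group in turn until every group is empty."""
    groups = {}
    for it in items:
        key = tuple(it[f] for f in fields)
        groups.setdefault(key, deque()).append(it)

    out = []
    while groups:
        # Iterate over a snapshot of the keys so exhausted groups can be dropped.
        for key in list(groups):
            out.append(groups[key].popleft())
            if not groups[key]:
                del groups[key]
    return out


items = [
    {"item_id": "1", "category": "Crypto"},
    {"item_id": "2", "category": "Crypto"},
    {"item_id": "3", "category": "Crypto"},
    {"item_id": "4", "category": "Politics"},
    {"item_id": "5", "category": "Sports"},
]
diverse = round_robin_by_fields(items, ["category"])
```

The three Crypto items no longer lead the list: the output order is 1, 4, 5, 2, 3.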

semantic — Maximal Marginal Relevance balances relevance vs. novelty. Items must have an embed field. Lambda closer to 1 = favor relevance; closer to 0 = favor diversity:

{
  "method": "semantic",
  "params": { "lambda": 0.7, "horizon": 10 }
}
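The effect of lambda can be seen in a small, generic MMR sketch. This is the textbook greedy formulation, not necessarily what the service runs: it assumes cosine similarity over the embed field, uses rank_score as the relevance signal, and does not model horizon because its exact semantics are not described here.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(items, lam, relevance_field="rank_score"):
    """Greedy MMR: repeatedly pick the item with the best relevance/novelty trade-off."""
    remaining, selected = list(items), []
    while remaining:
        def mmr_score(it):
            # Redundancy is the similarity to the closest already-selected item.
            redundancy = max(
                (cosine(it["embed"], s["embed"]) for s in selected), default=0.0
            )
            return lam * it[relevance_field] - (1 - lam) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


items = [
    {"item_id": "a", "rank_score": 0.90, "embed": [1.0, 0.0]},
    {"item_id": "b", "rank_score": 0.85, "embed": [1.0, 0.0]},  # near-duplicate of a
    {"item_id": "c", "rank_score": 0.60, "embed": [0.0, 1.0]},
]
balanced = mmr(items, lam=0.5)
relevance_only = mmr(items, lam=1.0)
```

With lambda at 0.5 the near-duplicate `b` is pushed behind `c`. With lambda at 1.0 the order is purely by relevance.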

Limits by Field

Cap how often items with the same field value appear within a sliding window:

{
  "every_n": 5,
  "rules": [{ "field": "category", "limit": 3 }]
}

This means: within every 5 consecutive items, at most 3 can have the same category value.

every_n must be >= 2
Multiple rules can target different fields
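A quick way to reason about a limits configuration is to check an ordering against it. The sketch below is a local validator, not the enforcement algorithm the service uses. `satisfies_limits` is a hypothetical helper that slides a window of every_n items over the list and counts field values.

```python
from collections import Counter


def satisfies_limits(items, every_n, rules):
    """Return True if no window of every_n consecutive items breaks any rule."""
    # A list shorter than every_n is checked as a single window.
    for start in range(max(1, len(items) - every_n + 1)):
        window = items[start:start + every_n]
        for rule in rules:
            counts = Counter(it[rule["field"]] for it in window)
            if any(count > rule["limit"] for count in counts.values()):
                return False
    return True


def as_items(categories):
    return [{"category": c} for c in categories]


rules = [{"field": "category", "limit": 3}]
ok = satisfies_limits(as_items(["A", "A", "B", "A", "B", "A"]), 5, rules)
too_many = satisfies_limits(as_items(["A", "A", "A", "B", "A"]), 5, rules)
```

The first list passes because every window of 5 contains exactly three A items. The second fails because its only window contains four.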

Response

{
  "result": {
    "items": [
      { "item_id": "1288428", "score": 1.0 },
      { "item_id": "1289113", "score": 0.5 },
      { "item_id": "1287500", "score": 0.0 }
    ]
  }
}

Items come back in final order with position-based scores in [0, 1]: the first item gets 1.0 and the last item gets 0.0.

Data Flow Between Stages

Stage	Input	Output	Passed to next stage
Search	Index, filters, sort	`hits[]` with `_id` and `_source`	Item IDs → Features & Scoring
Features	User profile + item IDs	Feature values per item	Feature values → Ranking items
Scoring	User ID + item IDs	Reranked item ID array	Reranked order → Ranking items
Ranking	Items with numeric fields + config	Final ordered items with scores	Display to user
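One way to carry the Scoring order into Ranking is to turn each item's position into a numeric rank_score and merge in whatever feature values you collected. The sketch below assumes evenly spaced scores from 1.0 down to 0.0, which matches the sample request in this guide. `build_ranking_items` and the `features_by_id` mapping are hypothetical and do not reflect the actual Features response shape.

```python
def build_ranking_items(reranked_ids, features_by_id):
    """Convert a reranked ID list into items suitable for the Ranking request."""
    n = len(reranked_ids)
    items = []
    for pos, item_id in enumerate(reranked_ids):
        # Evenly spaced: first item 1.0, last item 0.0 (a single item gets 1.0).
        rank_score = 1.0 if n == 1 else 1.0 - pos / (n - 1)
        items.append({
            "item_id": item_id,
            "rank_score": rank_score,
            **features_by_id.get(item_id, {}),
        })
    return items


ranking_items = build_ranking_items(
    ["1288428", "1289113", "1287500"],
    {
        "1288428": {"primary_tag": "Crypto"},
        "1289113": {"primary_tag": "Politics"},
        "1287500": {"primary_tag": "Crypto"},
    },
)
```

The resulting list can be sent as the items array of a /ranking/feed request, with rank_score as the sort field.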

All Studio API Endpoints

Stage	Method	Endpoint	Purpose
Search	POST	`/search/filter_and_sort`	Structured queries with filters + sorting
Search	POST	`/search/boost`	Soft relevance tuning with boost multipliers
Search	POST	`/search/semantic`	Text or vector similarity search
Search	GET	`/search/frequent_values/{index}/{field}`	Discover field values before filtering
Features	POST	`/features/v1`	Enrich items with ML features
Scoring	POST	`/scoring/ranking_model/{model_name}`	Rerank with trained ML model
Ranking	POST	`/ranking/feed`	Sort + diversity + limits pipeline
Stories	POST	`/stories/generate`	Narrative content generation

What's Next

Search guide — Stage 1: filter_and_sort, boost, semantic, frequent_values
Features — Stage 2: enrich items with ML features
Filter Cookbook — All 12 filter types with per-index field tables and recipes
Quickstart — Full 4-stage pipeline walkthrough with copy-paste code

Looking for legacy scoring algorithms? They're documented in How Scoring Works.