SEARCH, SELECT, SCROLL, RECOMMEND, Hybrid Search & Reranking


SEARCH — find similar points

Performs a semantic similarity search: your query text is embedded with the same model used during insert, then Qdrant finds the nearest vectors by cosine distance.

An optional WHERE clause filters the candidate set before similarity ranking so you only get results that match both the semantic query and the payload conditions.
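
The ranking step can be sketched in plain Python. This cosine_similarity helper and the toy vectors are illustrative only; Qdrant computes the distance internally over the indexed embeddings:

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

query = [0.2, 0.8, 0.1]
docs = {"doc_a": [0.1, 0.9, 0.0], "doc_b": [0.9, 0.1, 0.3]}

# Rank documents by similarity to the query, highest first.
ranked = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]), reverse=True)
```

SEARCH returns the top LIMIT points by exactly this kind of score (higher means more similar).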

Syntax:

SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n>
SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n> USING MODEL '<model_name>'
SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n> [USING MODEL '<model>'] WHERE <filter>
SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n> USING HYBRID
SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n> USING HYBRID [FUSION 'rrf|dbsf'] [DENSE MODEL '<model>'] [SPARSE MODEL '<model>'] [WHERE <filter>]
SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n> USING SPARSE [MODEL '<sparse_model>']
SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n> EXACT
SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n> [USING ...] [WHERE <filter>] [RERANK] WITH { hnsw_ef: <n>, exact: true|false, acorn: true|false }
SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n> [USING ...] [WHERE <filter>] RERANK [MODEL '<reranker_model>']

Examples:

Basic search, return top 5 results:

SEARCH articles SIMILAR TO 'machine learning algorithms' LIMIT 5

Search only papers published after 2020:

SEARCH articles SIMILAR TO 'deep learning' LIMIT 10 WHERE year > 2020

Hybrid search (combines dense semantic + sparse BM25 keyword retrieval via RRF by default):

SEARCH articles SIMILAR TO 'attention mechanism' LIMIT 10 USING HYBRID

Sparse-only search (queries only the sparse named vector — useful for pure keyword retrieval):

SEARCH medical_knowledge SIMILAR TO 'beta blocker contraindications' LIMIT 5 USING SPARSE

Sparse scores are unbounded dot products. Unlike dense cosine similarity, which is bounded above by 1.0, sparse vector scores are raw dot products that can exceed 1.0; scores like 8.3 or 14.5 are normal and expected. Do not compare sparse scores to dense cosine scores; they are on different scales.
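
A minimal sketch of why sparse scores behave this way, using toy {token_index: weight} maps (the indices and weights here are made up; real weights come from the sparse embedding model):

```python
def sparse_dot(query, doc):
    # Sparse vectors are {token_index: weight} maps; the score is the
    # dot product over the indices present in both vectors.
    return sum(w * doc[i] for i, w in query.items() if i in doc)

query = {101: 2.1, 205: 1.4, 330: 0.9}
doc = {101: 3.0, 330: 2.5, 999: 1.1}

score = sparse_dot(query, doc)  # 2.1*3.0 + 0.9*2.5 = 8.55
```

Nothing normalizes the weights, so the score grows with term overlap and term importance; 8.55 is a perfectly healthy sparse score.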

Exact search for recall debugging:

SEARCH articles SIMILAR TO 'attention mechanism' LIMIT 10 EXACT

Search with query-time HNSW tuning:

SEARCH articles SIMILAR TO 'attention mechanism' LIMIT 10 WITH { hnsw_ef: 128 }

Output:

Results are displayed as a table with three columns:

 Score  │ ID                                   │ Payload
────────┼──────────────────────────────────────┼──────────────────────────────────
 0.9241 │ 3f2e1a4b-...                         │ {'text': 'Neural networks...', 'author': 'alice'}
 0.8817 │ 7a1b2c3d-...                         │ {'text': 'Attention is all...', 'tags': [...]}

Important: Use the same model for SEARCH as you used for INSERT. Mixing models produces meaningless scores because the vectors live in different spaces.


SELECT — retrieve a point by ID

Fetches a single point's payload by its exact point ID.

Syntax:

SELECT * FROM <collection_name> WHERE id = '<point_id>'
SELECT * FROM <collection_name> WHERE id = <integer_id>

Examples:

SELECT * FROM articles WHERE id = '3f2e1a4b-8c91-4d0e-b123-abc123def456'
SELECT * FROM articles WHERE id = 42

SELECT in this version is intentionally strict:

  • only * projection is supported
  • only WHERE id = ... is supported

Query-Time Search Params (EXACT, WITH)

Use these when you want to debug retrieval quality or tune recall without changing collection-level settings.

Syntax                    Effect
EXACT                     Shorthand for exact KNN search (exact=true)
WITH { hnsw_ef: 128 }     Increase HNSW exploration at query time
WITH { exact: true }      Force exact KNN explicitly
WITH { acorn: true }      Enable ACORN for filtered queries

  • EXACT can appear after LIMIT or after RERANK
  • WITH { ... } can appear after WHERE and/or RERANK
  • Supported WITH keys are only hnsw_ef, exact, and acorn

Examples:

-- Exact KNN baseline
SEARCH articles SIMILAR TO 'programming language' LIMIT 5 EXACT

-- Raise HNSW ef at query time
SEARCH articles SIMILAR TO 'transformers' LIMIT 10 WITH { hnsw_ef: 256 }

-- Filtered search with ACORN
SEARCH articles SIMILAR TO 'RAG' LIMIT 10 WHERE tag = 'li' WITH { acorn: true }

SCROLL — pagination / browsing

Use SCROLL to iterate through points in a collection page by page.

Syntax:

SCROLL FROM <collection_name> LIMIT <n>
SCROLL FROM <collection_name> WHERE <filter> LIMIT <n>
SCROLL FROM <collection_name> AFTER '<point_id>' LIMIT <n>
SCROLL FROM <collection_name> WHERE <filter> AFTER <point_id> LIMIT <n>

Examples:

SCROLL FROM articles LIMIT 50
SCROLL FROM articles WHERE year >= 2024 LIMIT 50
SCROLL FROM articles AFTER 'cursor-id' LIMIT 50

Behavior:

  • Returns points in ID order with payloads.
  • Returns a next_offset cursor when more points are available.
  • Use AFTER <next_offset> to fetch the next page.
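
The cursor protocol can be modeled with a small in-memory sketch; the scroll function below is a hypothetical stand-in for the real implementation, not QQL's actual code:

```python
def scroll(ids, limit, after=None):
    # Returns one page of point IDs in ascending order, plus a
    # next_offset cursor (None once the collection is exhausted).
    ordered = sorted(ids)
    if after is not None:
        ordered = [i for i in ordered if i > after]
    page = ordered[:limit]
    next_offset = page[-1] if len(ordered) > limit else None
    return page, next_offset

page1, cursor = scroll([5, 3, 1, 4, 2], limit=2)            # [1, 2]
page2, cursor = scroll([5, 3, 1, 4, 2], limit=2, after=cursor)  # [3, 4]
```

Each page's next_offset feeds the next AFTER clause; stop when the cursor comes back empty.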

Hybrid Search (USING HYBRID)

Hybrid search combines dense semantic vectors and sparse BM25 keyword vectors in a single query. By default QQL merges the two result sets with Qdrant’s Reciprocal Rank Fusion (RRF) algorithm, and you can optionally switch to DBSF with a FUSION clause.

How it works internally

  1. Both a dense vector (TextEmbedding) and a sparse BM25 vector (SparseTextEmbedding) are generated from your query text.
  2. Qdrant fetches the top candidates from each index independently (prefetch limit = LIMIT × 4).
  3. The two result lists are merged using the selected fusion strategy (RRF by default, or DBSF when requested).
  4. The final top-N results are returned.
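
Steps 2-3 can be sketched as standard Reciprocal Rank Fusion. The k = 60 smoothing constant is the value commonly used in the RRF literature and is an assumption here, not a documented QQL setting:

```python
def rrf_fuse(dense_ids, sparse_ids, k=60):
    # Reciprocal Rank Fusion: each result list contributes 1/(k + rank)
    # per hit (rank is 1-based), so IDs ranked high in either list win.
    scores = {}
    for ranked in (dense_ids, sparse_ids):
        for rank, pid in enumerate(ranked, start=1):
            scores[pid] = scores.get(pid, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["a", "b", "c"]   # top hits from the dense prefetch
sparse = ["c", "a", "d"]  # top hits from the sparse prefetch
fused = rrf_fuse(dense, sparse)
```

Note that "a" wins because it appears near the top of both lists, even though it is not rank 1 in the sparse list; that rank-only robustness is why RRF is the default fusion.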

Step 1: Create a hybrid collection

-- Shorthand (backward compatible)
CREATE COLLECTION articles HYBRID

-- USING form — allows specifying a dense model
CREATE COLLECTION articles USING HYBRID
CREATE COLLECTION articles USING HYBRID DENSE MODEL 'BAAI/bge-base-en-v1.5'

Step 2: Insert with hybrid vectors

INSERT INTO COLLECTION articles VALUES {
  'text': 'Attention is all you need',
  'author': 'Vaswani et al.',
  'year': 2017
} USING HYBRID

Step 3: Search with hybrid retrieval

-- Basic hybrid search
SEARCH articles SIMILAR TO 'transformer architecture' LIMIT 10 USING HYBRID

-- Hybrid search with a WHERE filter
SEARCH articles SIMILAR TO 'attention' LIMIT 10 USING HYBRID WHERE year >= 2017

-- Hybrid with DBSF fusion
SEARCH articles SIMILAR TO 'hybrid retrieval' LIMIT 10 USING HYBRID FUSION 'dbsf'

-- Hybrid with custom dense model
SEARCH articles SIMILAR TO 'embeddings' LIMIT 5
  USING HYBRID DENSE MODEL 'BAAI/bge-base-en-v1.5'

-- Hybrid with both custom models
SEARCH articles SIMILAR TO 'sparse retrieval' LIMIT 5
  USING HYBRID DENSE MODEL 'BAAI/bge-base-en-v1.5' SPARSE MODEL 'prithivida/Splade_PP_en_v1'

Model defaults in hybrid mode

Argument       Default
Dense model    configured default (sentence-transformers/all-MiniLM-L6-v2)
Sparse model   Qdrant/bm25
Fusion         rrf

Dense vs. hybrid — when to use which

Situation                                               Recommendation
Semantic similarity (paraphrasing, synonyms)            Dense only
Exact keyword matching (product codes, names)           Hybrid or BM25-only
General-purpose retrieval (unknown query distribution)  Hybrid
Low latency / small collection                          Dense only

RECOMMEND — retrieve by example IDs

Performs a Qdrant recommendation query using existing point IDs as positive and optional negative examples. Qdrant uses those examples to retrieve nearby points, and QQL automatically excludes the seed IDs from the results.

Syntax:

RECOMMEND FROM <collection_name> POSITIVE IDS (<id>, ...) LIMIT <n>
RECOMMEND FROM <collection_name> POSITIVE IDS (<id>, ...) NEGATIVE IDS (<id>, ...) LIMIT <n>
RECOMMEND FROM <collection_name> POSITIVE IDS (<id>, ...) STRATEGY '<strategy>' LIMIT <n>
RECOMMEND FROM <collection_name> POSITIVE IDS (<id>, ...) LIMIT <n> WHERE <filter>
RECOMMEND FROM <collection_name> POSITIVE IDS (<id>, ...) LIMIT <n> OFFSET <n>
RECOMMEND FROM <collection_name> POSITIVE IDS (<id>, ...) LIMIT <n> SCORE THRESHOLD <f>
RECOMMEND FROM <collection_name> POSITIVE IDS (<id>, ...) LIMIT <n> WITH { exact: true, hnsw_ef: <n> }
RECOMMEND FROM <collection_name> POSITIVE IDS (<id>, ...) LIMIT <n> LOOKUP FROM <collection>
RECOMMEND FROM <collection_name> POSITIVE IDS (<id>, ...) LIMIT <n> LOOKUP FROM <collection> VECTOR '<name>'
RECOMMEND FROM <collection_name> POSITIVE IDS (<id>, ...) LIMIT <n> USING '<vector_name>'

Examples:

Recommend more results like two known articles:

RECOMMEND FROM articles POSITIVE IDS (1001, 1002) LIMIT 5

Recommend similar results while steering away from one bad example:

RECOMMEND FROM articles POSITIVE IDS (1001, 1002) NEGATIVE IDS (1009) LIMIT 5

Use Qdrant’s best_score recommendation strategy:

RECOMMEND FROM articles POSITIVE IDS (1001) STRATEGY 'best_score' LIMIT 10

Recommend only within a filtered subset:

RECOMMEND FROM articles POSITIVE IDS (1001) LIMIT 5 WHERE year >= 2020 AND status = 'published'

Cross-collection recommend (look up example IDs from another collection):

RECOMMEND FROM target_collection
  POSITIVE IDS ('a')
  LOOKUP FROM source_collection VECTOR 'dense'
  LIMIT 5

Full-featured recommend:

RECOMMEND FROM articles
  POSITIVE IDS (1001, 1002)
  NEGATIVE IDS (1009)
  STRATEGY 'best_score'
  LOOKUP FROM other_collection VECTOR 'dense'
  USING 'dense'
  LIMIT 10
  OFFSET 5
  SCORE THRESHOLD 0.5
  WHERE year >= 2020
  WITH { exact: true }

Supported strategies: average_vector, best_score, sum_scores
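
As a rough illustration of the default strategy, here is a simplified average_vector sketch: it averages the positive example vectors and subtracts the average of the negatives. Qdrant's actual formula may differ in detail, so treat this as an approximation, not the real implementation:

```python
def average_vector(positives, negatives):
    # Simplified sketch: mean of positive vectors, minus the mean of
    # negative vectors when any are given. All vectors share one dim.
    dim = len(positives[0])
    avg_pos = [sum(v[i] for v in positives) / len(positives) for i in range(dim)]
    if not negatives:
        return avg_pos
    avg_neg = [sum(v[i] for v in negatives) / len(negatives) for i in range(dim)]
    return [p - n for p, n in zip(avg_pos, avg_neg)]

q = average_vector([[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]])
```

The resulting single query vector is then searched like a normal KNN query, which is why average_vector is cheap; best_score and sum_scores instead score candidates against every example individually.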

Clause order: POSITIVE IDS → NEGATIVE IDS → STRATEGY → LOOKUP FROM → USING → LIMIT → OFFSET → SCORE THRESHOLD → WHERE → WITH


Cross-Encoder Reranking (RERANK)

Appending RERANK to any SEARCH statement activates a second-pass relevance scoring step using a cross-encoder model. Cross-encoders process the (query, document) pair jointly, producing a more accurate relevance score at the cost of extra compute.

How it works internally

  1. Qdrant executes the normal dense or hybrid search, but fetches LIMIT × 4 candidates.
  2. Each candidate’s payload["text"] is paired with the original query text.
  3. The cross-encoder scores all (query, document) pairs in one batch.
  4. Results are sorted descending by cross-encoder score and sliced to LIMIT.
  5. The score column reflects the cross-encoder relevance score (raw logits — higher is more relevant).
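
The pipeline above can be sketched end to end. The toy_score function is a deliberately crude stand-in for the cross-encoder, which returns a learned relevance logit rather than a word-overlap count:

```python
def rerank(query, candidates, score_fn, limit):
    # candidates: (point_id, text) pairs from the first-pass search
    # (LIMIT * 4 of them). Score every pair, sort descending, slice.
    scored = [(score_fn(query, text), pid) for pid, text in candidates]
    scored.sort(reverse=True)
    return [pid for _, pid in scored[:limit]]

def toy_score(query, text):
    # Crude stand-in scorer: count of shared words.
    return len(set(query.split()) & set(text.split()))

hits = [("p1", "neural attention"), ("p2", "cooking pasta"),
        ("p3", "attention mechanism in transformers")]
top = rerank("attention mechanism", hits, toy_score, limit=2)
```

The shape is the point: the first-pass retriever over-fetches, the reranker re-scores jointly, and only the reranked order (and score) reaches the output table.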

Syntax:

SEARCH <name> SIMILAR TO '<query>' LIMIT <n> RERANK
SEARCH <name> SIMILAR TO '<query>' LIMIT <n> RERANK MODEL '<cross_encoder_model>'
SEARCH ... LIMIT n [USING ...] [WHERE ...] RERANK [MODEL '...']

Examples:

Dense search + rerank (default cross-encoder):

SEARCH articles SIMILAR TO 'machine learning for healthcare' LIMIT 5 RERANK

Hybrid search + rerank (best of all three worlds):

SEARCH articles SIMILAR TO 'attention mechanism in transformers' LIMIT 10 USING HYBRID RERANK

Custom cross-encoder model:

SEARCH articles SIMILAR TO 'semantic search' LIMIT 5
  RERANK MODEL 'cross-encoder/ms-marco-MiniLM-L-6-v2'

Default cross-encoder model: cross-encoder/ms-marco-MiniLM-L-6-v2

Model                                   Notes
cross-encoder/ms-marco-MiniLM-L-6-v2    Default. Fast and accurate for passage reranking
cross-encoder/ms-marco-MiniLM-L-12-v2   Larger, higher quality, slower
BAAI/bge-reranker-base                  BGE reranker, strong general-purpose performance
BAAI/bge-reranker-large                 Highest quality BGE reranker, slower

When to use RERANK

Situation                                            Recommendation
High-precision retrieval (legal, medical, research)  Add RERANK
Small LIMIT (top-3 or top-5 results)                 Very effective
Low latency required                                 Skip RERANK (adds ~100–500 ms per batch)
Large collections with keyword-heavy queries         USING HYBRID RERANK

Note on scores: After reranking, the score column shows the cross-encoder’s raw logit (can be any real number, unbounded). Do not compare reranked scores to non-reranked cosine similarity scores.