SEARCH, SELECT, SCROLL, RECOMMEND, Hybrid Search & Reranking
SEARCH — find similar points
Performs a semantic similarity search: your query text is embedded with the same model used during insert, then Qdrant finds the nearest vectors by cosine distance.
An optional WHERE clause filters the candidate set before similarity ranking so you only get results that match both the semantic query and the payload conditions.
Syntax:
SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n>
SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n> USING MODEL '<model_name>'
SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n> [USING MODEL '<model>'] WHERE <filter>
SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n> USING HYBRID
SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n> USING HYBRID [FUSION 'rrf|dbsf'] [DENSE MODEL '<model>'] [SPARSE MODEL '<model>'] [WHERE <filter>]
SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n> USING SPARSE [MODEL '<sparse_model>']
SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n> EXACT
SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n> [USING ...] [WHERE <filter>] [RERANK] WITH { hnsw_ef: <n>, exact: true|false, acorn: true|false }
SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n> [USING ...] [WHERE <filter>] RERANK [MODEL '<reranker_model>']
Examples:
Basic search, return top 5 results:
SEARCH articles SIMILAR TO 'machine learning algorithms' LIMIT 5
Search only papers published after 2020:
SEARCH articles SIMILAR TO 'deep learning' LIMIT 10 WHERE year > 2020
Hybrid search (combines dense semantic + sparse BM25 keyword retrieval via RRF by default):
SEARCH articles SIMILAR TO 'attention mechanism' LIMIT 10 USING HYBRID
Sparse-only search (queries only the sparse named vector — useful for pure keyword retrieval):
SEARCH medical_knowledge SIMILAR TO 'beta blocker contraindications' LIMIT 5 USING SPARSE
Sparse scores are unbounded dot products. Unlike dense cosine similarity, which is bounded to 0–1, sparse vector scores are raw dot products and can exceed 1.0; values like 8.3 or 14.5 are normal. Do not compare sparse scores to dense cosine scores; they are on different scales.
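The scale difference is easy to see with a toy computation (the vectors and weights below are illustrative values only, not produced by any real model):

```python
import math

# Dense: cosine similarity of two vectors is bounded to [-1, 1]
# (and typically [0, 1] for text embeddings).
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Sparse: BM25-style vectors map token ids to positive weights; the score is
# a raw dot product over overlapping tokens and has no upper bound.
def sparse_dot(query, doc):
    return sum(w * doc[t] for t, w in query.items() if t in doc)

dense_score = cosine([0.6, 0.8], [0.8, 0.6])   # ≈ 0.96, bounded
sparse_q = {101: 2.1, 205: 3.4}                # token id -> weight
sparse_d = {101: 1.9, 205: 2.8, 999: 0.5}
sparse_score = sparse_dot(sparse_q, sparse_d)  # ≈ 13.51, well above 1.0
```

Both numbers are legitimate similarity scores; they simply live on different scales, which is why the manual warns against comparing them directly.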
Exact search for recall debugging:
SEARCH articles SIMILAR TO 'attention mechanism' LIMIT 10 EXACT
Search with query-time HNSW tuning:
SEARCH articles SIMILAR TO 'attention mechanism' LIMIT 10 WITH { hnsw_ef: 128 }
Output:
Results are displayed as a table with three columns:
Score │ ID │ Payload
────────┼──────────────────────────────────────┼──────────────────────────────────
0.9241 │ 3f2e1a4b-... │ {'text': 'Neural networks...', 'author': 'alice'}
0.8817 │ 7a1b2c3d-... │ {'text': 'Attention is all...', 'tags': [...]}
Important: Use the same model for SEARCH as you used for INSERT. Mixing models produces meaningless scores because the vectors live in different spaces.
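Conceptually, SEARCH embeds the query and ranks points by cosine similarity, applying any WHERE filter first. A minimal sketch with toy 2-D vectors standing in for real embeddings (in Qdrant the stored vectors come from the model used at INSERT time):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy "collection": id -> (vector, payload)
points = {
    1: ([0.9, 0.1], {"text": "machine learning algorithms"}),
    2: ([0.1, 0.9], {"text": "french cooking recipes"}),
    3: ([0.8, 0.3], {"text": "deep learning"}),
}

def search(query_vec, limit, where=None):
    # The optional payload filter narrows the candidate set
    # before similarity ranking.
    candidates = (
        (cosine(query_vec, vec), pid, payload)
        for pid, (vec, payload) in points.items()
        if where is None or where(payload)
    )
    return sorted(candidates, reverse=True)[:limit]

results = search([1.0, 0.2], limit=2)  # point 1 ranks first, then point 3
```

The warning about mixing models follows directly from this sketch: if the query vector comes from a different model than the stored vectors, the cosine values are still computable but carry no meaning.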
SELECT — retrieve a point by ID
Fetches a single point payload by exact point ID.
Syntax:
SELECT * FROM <collection_name> WHERE id = '<point_id>'
SELECT * FROM <collection_name> WHERE id = <integer_id>
Examples:
SELECT * FROM articles WHERE id = '3f2e1a4b-8c91-4d0e-b123-abc123def456'
SELECT * FROM articles WHERE id = 42
SELECT in this version is intentionally strict:
- only `*` projection is supported
- only `WHERE id = ...` is supported
Query-Time Search Params (EXACT, WITH)
Use these when you want to debug retrieval quality or tune recall without changing collection-level settings.
| Syntax | Effect |
|---|---|
| `EXACT` | Shorthand for exact KNN search (`exact=true`) |
| `WITH { hnsw_ef: 128 }` | Increase HNSW exploration at query time |
| `WITH { exact: true }` | Force exact KNN explicitly |
| `WITH { acorn: true }` | Enable ACORN for filtered queries |
- `EXACT` can appear after `LIMIT` or after `RERANK`
- `WITH { ... }` can appear after `WHERE` and/or `RERANK`
- Supported `WITH` keys are only `hnsw_ef`, `exact`, and `acorn`
-- Exact KNN baseline
SEARCH articles SIMILAR TO 'programming language' LIMIT 5 EXACT
-- Raise HNSW ef at query time
SEARCH articles SIMILAR TO 'transformers' LIMIT 10 WITH { hnsw_ef: 256 }
-- Filtered search with ACORN
SEARCH articles SIMILAR TO 'RAG' LIMIT 10 WHERE tag = 'li' WITH { acorn: true }
SCROLL — pagination / browsing
Use SCROLL to iterate through points in a collection page by page.
Syntax:
SCROLL FROM <collection_name> LIMIT <n>
SCROLL FROM <collection_name> WHERE <filter> LIMIT <n>
SCROLL FROM <collection_name> AFTER '<point_id>' LIMIT <n>
SCROLL FROM <collection_name> WHERE <filter> AFTER <point_id> LIMIT <n>
Examples:
SCROLL FROM articles LIMIT 50
SCROLL FROM articles WHERE year >= 2024 LIMIT 50
SCROLL FROM articles AFTER 'cursor-id' LIMIT 50
Behavior:
- Returns points in ID order with payloads.
- Returns a `next_offset` cursor when more points are available.
- Use `AFTER <next_offset>` to fetch the next page.
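The cursor contract can be sketched as a paging loop over a toy in-memory store. Note the simplification: here the cursor is the last ID of the returned page and `AFTER` is exclusive, whereas Qdrant's actual scroll offset is the ID the next page starts from.

```python
def scroll(store, limit, after=None, where=None):
    """Return one page of (id, payload) plus a next_offset cursor."""
    ids = sorted(pid for pid in store
                 if where is None or where(store[pid]))
    if after is not None:
        ids = [pid for pid in ids if pid > after]  # AFTER is exclusive here
    page = ids[:limit]
    next_offset = page[-1] if len(ids) > limit else None
    return [(pid, store[pid]) for pid in page], next_offset

# Toy collection: ids 1..7 with small payloads
store = {i: {"year": 2020 + i % 3} for i in range(1, 8)}

# Page through the whole collection two points at a time.
cursor, seen = None, []
while True:
    page, cursor = scroll(store, limit=2, after=cursor)
    seen.extend(pid for pid, _ in page)
    if cursor is None:
        break
```

The loop visits every point exactly once because points are returned in ID order and the cursor always advances.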
Hybrid Search (USING HYBRID)
Hybrid search combines dense semantic vectors and sparse BM25 keyword vectors in a single query. By default QQL merges the two result sets with Qdrant’s Reciprocal Rank Fusion (RRF) algorithm, and you can optionally switch to DBSF with a FUSION clause.
How it works internally
- Both a dense vector (`TextEmbedding`) and a sparse BM25 vector (`SparseTextEmbedding`) are generated from your query text.
- Qdrant fetches the top candidates from each index independently (prefetch limit = `LIMIT × 4`).
- The two result lists are merged using the selected fusion strategy (`RRF` by default, or `DBSF` when requested).
- The final top-N results are returned.
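The RRF merge step can be sketched directly. This uses the standard formula score(d) = Σ 1/(k + rank_i(d)) with the commonly used constant k = 60; Qdrant's internal constant may differ, but the ranking behavior is the same:

```python
def rrf_fuse(ranked_lists, k=60):
    """Merge ranked ID lists with Reciprocal Rank Fusion."""
    scores = {}
    for ranking in ranked_lists:
        for rank, point_id in enumerate(ranking, start=1):
            scores[point_id] = scores.get(point_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Dense prefetch and sparse prefetch each return their own ranking
dense_hits  = ["a", "b", "c", "d"]
sparse_hits = ["c", "a", "e"]

fused = rrf_fuse([dense_hits, sparse_hits])  # "a" wins: high in both lists
```

Documents that rank well in both lists ("a", "c") float to the top, while documents seen by only one retriever are ranked lower, which is exactly why hybrid search is robust to queries that favor either keywords or semantics.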
Step 1: Create a hybrid collection
-- Shorthand (backward compatible)
CREATE COLLECTION articles HYBRID
-- USING form — allows specifying a dense model
CREATE COLLECTION articles USING HYBRID
CREATE COLLECTION articles USING HYBRID DENSE MODEL 'BAAI/bge-base-en-v1.5'
Step 2: Insert with hybrid vectors
INSERT INTO COLLECTION articles VALUES {
'text': 'Attention is all you need',
'author': 'Vaswani et al.',
'year': 2017
} USING HYBRID
Step 3: Search with hybrid retrieval
-- Basic hybrid search
SEARCH articles SIMILAR TO 'transformer architecture' LIMIT 10 USING HYBRID
-- Hybrid search with a WHERE filter
SEARCH articles SIMILAR TO 'attention' LIMIT 10 USING HYBRID WHERE year >= 2017
-- Hybrid with DBSF fusion
SEARCH articles SIMILAR TO 'hybrid retrieval' LIMIT 10 USING HYBRID FUSION 'dbsf'
-- Hybrid with custom dense model
SEARCH articles SIMILAR TO 'embeddings' LIMIT 5
USING HYBRID DENSE MODEL 'BAAI/bge-base-en-v1.5'
-- Hybrid with both custom models
SEARCH articles SIMILAR TO 'sparse retrieval' LIMIT 5
USING HYBRID DENSE MODEL 'BAAI/bge-base-en-v1.5' SPARSE MODEL 'prithivida/Splade_PP_en_v1'
Model defaults in hybrid mode
| Argument | Default |
|---|---|
| Dense model | configured default (sentence-transformers/all-MiniLM-L6-v2) |
| Sparse model | Qdrant/bm25 |
| Fusion | rrf |
Dense vs. hybrid — when to use which
| Situation | Recommendation |
|---|---|
| Semantic similarity (paraphrasing, synonyms) | Dense only |
| Exact keyword matching (product codes, names) | Hybrid or BM25-only |
| General-purpose retrieval (unknown query distribution) | Hybrid |
| Low latency / small collection | Dense only |
RECOMMEND — retrieve by example IDs
Performs a Qdrant recommendation query using existing point IDs as positive and optional negative examples. Qdrant uses those examples to retrieve nearby points, and QQL automatically excludes the seed IDs from the results.
Syntax:
RECOMMEND FROM <collection_name> POSITIVE IDS (<id>, ...) LIMIT <n>
RECOMMEND FROM <collection_name> POSITIVE IDS (<id>, ...) NEGATIVE IDS (<id>, ...) LIMIT <n>
RECOMMEND FROM <collection_name> POSITIVE IDS (<id>, ...) STRATEGY '<strategy>' LIMIT <n>
RECOMMEND FROM <collection_name> POSITIVE IDS (<id>, ...) LIMIT <n> WHERE <filter>
RECOMMEND FROM <collection_name> POSITIVE IDS (<id>, ...) LIMIT <n> OFFSET <n>
RECOMMEND FROM <collection_name> POSITIVE IDS (<id>, ...) LIMIT <n> SCORE THRESHOLD <f>
RECOMMEND FROM <collection_name> POSITIVE IDS (<id>, ...) LIMIT <n> WITH { exact: true, hnsw_ef: <n> }
RECOMMEND FROM <collection_name> POSITIVE IDS (<id>, ...) LIMIT <n> LOOKUP FROM <collection>
RECOMMEND FROM <collection_name> POSITIVE IDS (<id>, ...) LIMIT <n> LOOKUP FROM <collection> VECTOR '<name>'
RECOMMEND FROM <collection_name> POSITIVE IDS (<id>, ...) LIMIT <n> USING '<vector_name>'
Examples:
Recommend more results like two known articles:
RECOMMEND FROM articles POSITIVE IDS (1001, 1002) LIMIT 5
Recommend similar results while steering away from one bad example:
RECOMMEND FROM articles POSITIVE IDS (1001, 1002) NEGATIVE IDS (1009) LIMIT 5
Use Qdrant’s best_score recommendation strategy:
RECOMMEND FROM articles POSITIVE IDS (1001) STRATEGY 'best_score' LIMIT 10
Recommend only within a filtered subset:
RECOMMEND FROM articles POSITIVE IDS (1001) LIMIT 5 WHERE year >= 2020 AND status = 'published'
Cross-collection recommend (look up example IDs from another collection):
RECOMMEND FROM target_collection
POSITIVE IDS ('a')
LOOKUP FROM source_collection VECTOR 'dense'
LIMIT 5
Full-featured recommend:
RECOMMEND FROM articles
POSITIVE IDS (1001, 1002)
NEGATIVE IDS (1009)
STRATEGY 'best_score'
LOOKUP FROM other_collection VECTOR 'dense'
USING 'dense'
LIMIT 10
OFFSET 5
SCORE THRESHOLD 0.5
WHERE year >= 2020
WITH { exact: true }
Supported strategies: average_vector, best_score, sum_scores
Clause order: POSITIVE IDS → NEGATIVE IDS → STRATEGY → LOOKUP FROM → USING → LIMIT → OFFSET → SCORE THRESHOLD → WHERE → WITH
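Conceptually, the average_vector strategy builds one query vector from the example points and then runs an ordinary nearest-neighbor search, excluding the seed IDs from the results. The sketch below is a deliberate simplification (mean of positives minus mean of negatives); Qdrant's actual scoring formulas differ in detail:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def average(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def recommend(points, positive_ids, negative_ids=(), limit=5):
    # Simplified average_vector: mean(positives) minus mean(negatives)
    query = average([points[i] for i in positive_ids])
    if negative_ids:
        neg = average([points[i] for i in negative_ids])
        query = [p - n for p, n in zip(query, neg)]
    seeds = set(positive_ids) | set(negative_ids)
    candidates = [(cosine(query, vec), pid)
                  for pid, vec in points.items()
                  if pid not in seeds]  # seed IDs are excluded, as in QQL
    return sorted(candidates, reverse=True)[:limit]

points = {
    1001: [1.0, 0.0], 1002: [0.9, 0.2], 1009: [0.0, 1.0],
    2: [0.95, 0.1], 3: [0.1, 0.95],
}
hits = recommend(points, positive_ids=[1001, 1002],
                 negative_ids=[1009], limit=2)
```

Point 2 (close to the positives) ranks first, and point 3 (close to the negative example) is pushed down, mirroring how negative IDs steer results away.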
Cross-Encoder Reranking (RERANK)
Appending RERANK to any SEARCH statement activates a second-pass relevance scoring step using a cross-encoder model. Cross-encoders process the (query, document) pair jointly, producing a more accurate relevance score at the cost of extra compute.
How it works internally
- Qdrant executes the normal dense or hybrid search, but fetches `LIMIT × 4` candidates.
- Each candidate's `payload["text"]` is paired with the original query text.
- The cross-encoder scores all (query, document) pairs in one batch.
- Results are sorted descending by cross-encoder score and sliced to `LIMIT`.
- The `score` column reflects the cross-encoder relevance score (raw logits; higher is more relevant).
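The pipeline above can be sketched with a stand-in scorer. In the real implementation the pairs go through a cross-encoder such as `cross-encoder/ms-marco-MiniLM-L-6-v2` in one batch; `fake_cross_encoder` here is a hypothetical stand-in that just counts query-term overlap:

```python
def rerank(query, first_pass_hits, limit, score_pairs):
    """Second pass: score (query, text) pairs, sort, slice to limit."""
    # first_pass_hits is the over-fetched candidate list (LIMIT × 4 in QQL)
    pairs = [(query, hit["payload"]["text"]) for hit in first_pass_hits]
    logits = score_pairs(pairs)  # one batch through the scorer
    ranked = sorted(zip(logits, first_pass_hits),
                    key=lambda t: t[0], reverse=True)
    return [hit for _, hit in ranked[:limit]]

# Hypothetical stand-in: counts shared terms. A real cross-encoder returns
# raw logits from a jointly encoded (query, document) pair.
def fake_cross_encoder(pairs):
    return [len(set(q.lower().split()) & set(d.lower().split()))
            for q, d in pairs]

hits = [
    {"id": 1, "payload": {"text": "attention mechanism in transformers"}},
    {"id": 2, "payload": {"text": "cooking with cast iron"}},
    {"id": 3, "payload": {"text": "the attention mechanism explained"}},
]
top = rerank("attention mechanism", hits, limit=2,
             score_pairs=fake_cross_encoder)
```

Swapping `fake_cross_encoder` for a real model changes only the scoring function; the over-fetch, sort, and slice steps are the whole of the reranking control flow.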
Syntax:
SEARCH <name> SIMILAR TO '<query>' LIMIT <n> RERANK
SEARCH <name> SIMILAR TO '<query>' LIMIT <n> RERANK MODEL '<cross_encoder_model>'
SEARCH ... LIMIT n [USING ...] [WHERE ...] RERANK [MODEL '...']
Examples:
Dense search + rerank (default cross-encoder):
SEARCH articles SIMILAR TO 'machine learning for healthcare' LIMIT 5 RERANK
Hybrid search + rerank (best of all three worlds):
SEARCH articles SIMILAR TO 'attention mechanism in transformers' LIMIT 10 USING HYBRID RERANK
Custom cross-encoder model:
SEARCH articles SIMILAR TO 'semantic search' LIMIT 5
RERANK MODEL 'cross-encoder/ms-marco-MiniLM-L-6-v2'
Default cross-encoder model: cross-encoder/ms-marco-MiniLM-L-6-v2
| Model | Notes |
|---|---|
| `cross-encoder/ms-marco-MiniLM-L-6-v2` | Default. Fast and accurate for passage reranking |
| `cross-encoder/ms-marco-MiniLM-L-12-v2` | Larger, higher quality, slower |
| `BAAI/bge-reranker-base` | BGE reranker, strong general-purpose performance |
| `BAAI/bge-reranker-large` | Highest-quality BGE reranker, slower |
When to use RERANK
| Situation | Recommendation |
|---|---|
| High-precision retrieval (legal, medical, research) | Add RERANK |
| Small LIMIT (top-3 or top-5 results) | Very effective |
| Low latency required | Skip RERANK (adds ~100–500 ms per batch) |
| Large collections with keyword-heavy queries | USING HYBRID RERANK |
Note on scores: After reranking, the `score` column shows the cross-encoder's raw logit, which can be any real number (unbounded). Do not compare reranked scores to non-reranked cosine similarity scores.