Vector
ArcadeDB includes a native vector search engine for similarity-based retrieval of embeddings. Vector indexes are fully integrated into the SQL query engine and support ACID transactions, persistent storage, and automatic compaction.
How Vector Search Works
Vector search finds the nearest neighbors to a query vector in high-dimensional space. Instead of exact matching (like SQL WHERE), it finds the most similar items based on a distance or similarity metric.
Typical workflow:
- Generate embeddings from your data using an external model (OpenAI, Sentence Transformers, etc.)
- Store embeddings as vector properties on vertices or documents
- Create a vector index on the property
- Query with vectorNeighbors() to find the k most similar items
LSMVectorIndex Architecture
ArcadeDB’s vector index is built on two foundations:
- LSM Tree storage: ArcadeDB’s proven LSM Tree architecture provides persistent, crash-safe storage with automatic compaction
- JVector 4.0.0: A high-performance vector search library that implements both HNSW (Hierarchical Navigable Small World) and Vamana (DiskANN) graph algorithms
The index stores vectors as a navigable graph where each node connects to its approximate nearest neighbors. Searches traverse this graph, narrowing in on the closest matches efficiently — typically in O(log n) time rather than O(n) brute-force scanning.
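For contrast with the graph traversal described above, the following is a minimal sketch of the exact brute-force scan that an approximate index avoids. It is a generic illustration, not ArcadeDB code; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration: exact k-nearest-neighbour search that
// compares the query against every stored vector.
final class BruteForceKnn {

  // Squared Euclidean distance between two vectors of equal length.
  static double squaredDistance(float[] a, float[] b) {
    double sum = 0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return sum;
  }

  // Returns the positions of the k vectors closest to the query, nearest first.
  static List<Integer> nearest(float[][] vectors, float[] query, int k) {
    List<Integer> ids = new ArrayList<>();
    for (int i = 0; i < vectors.length; i++)
      ids.add(i);
    ids.sort(Comparator.comparingDouble(i -> squaredDistance(vectors[i], query)));
    return ids.subList(0, Math.min(k, ids.size()));
  }

  public static void main(String[] args) {
    float[][] vectors = { {0f, 0f}, {1f, 1f}, {5f, 5f}, {0.9f, 1.2f} };
    System.out.println(nearest(vectors, new float[] {1f, 1f}, 2));
  }
}
```

Every query touches all n vectors here, which is the cost the navigable graph is designed to avoid.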
Flat vs Hierarchical Structure
The index supports two graph structures:
| | Flat (default) | Hierarchical |
|---|---|---|
| Algorithm | Single-layer Vamana graph | Multi-layer HNSW with exponential decay |
| Build speed | Faster | 10-20% slower |
| Disk usage | Baseline | 5-15% larger |
| Best for | < 100K vectors, well-clustered data | 100K+ vectors, 1024+ dimensions, diverse queries |
Enable hierarchical mode with addHierarchy: true in the index metadata.
Similarity Functions
Three distance metrics are available:
| Function | When to Use | Value Range |
|---|---|---|
| COSINE (default) | Text embeddings (BERT, GPT, Sentence Transformers). Direction matters, magnitude does not. | -1 to 1 |
| DOT_PRODUCT | Normalized vectors where speed matters. 10-15% faster than COSINE. | Unbounded |
| EUCLIDEAN | Spatial data, point clouds, continuous measurements. Absolute distance matters. | 0 to infinity |
| If your embeddings are already L2-normalized (unit vectors), use DOT_PRODUCT for best performance — it produces the same ranking as COSINE but skips the normalization step. |
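The equivalence between COSINE and DOT_PRODUCT on unit vectors can be checked numerically. The sketch below is a standalone illustration with hypothetical names, not ArcadeDB's implementation: once both vectors are L2-normalized, the denominator of the cosine formula is 1, so the two scores coincide up to floating-point rounding.

```java
// Hypothetical illustration: cosine similarity equals the dot product
// when both vectors have unit length.
final class CosineVsDot {

  static double dot(float[] a, float[] b) {
    double sum = 0;
    for (int i = 0; i < a.length; i++)
      sum += (double) a[i] * b[i];
    return sum;
  }

  // Cosine similarity: dot product divided by the product of the norms.
  static double cosine(float[] a, float[] b) {
    return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
  }

  // Scales a vector to unit L2 length.
  static float[] l2Normalize(float[] v) {
    double norm = Math.sqrt(dot(v, v));
    float[] out = new float[v.length];
    for (int i = 0; i < v.length; i++)
      out[i] = (float) (v[i] / norm);
    return out;
  }

  public static void main(String[] args) {
    float[] a = l2Normalize(new float[] {3f, 4f});
    float[] b = l2Normalize(new float[] {1f, 2f});
    System.out.println("cosine = " + cosine(a, b));
    System.out.println("dot    = " + dot(a, b));
  }
}
```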
Quantization
Quantization reduces memory usage by compressing vector components at the cost of slight accuracy loss:
| Type | Memory Reduction | Speed | Recall | Use Case |
|---|---|---|---|---|
| NONE | Baseline | Baseline | 100% | Small datasets (< 10K vectors), maximum accuracy |
| INT8 (recommended) | 4x (75% savings) | 10-15% faster | 95-98% | Best balance of speed and accuracy for most workloads |
| BINARY | 32x (97% savings) | 15-20% faster | 85-92% | Massive datasets, approximate search with reranking |
| PRODUCT | 16-64x | Approximate | Varies | Very large datasets (100K+), enables zero-disk-I/O graph construction |
| Use INT8 quantization for most use cases. It provides 4x memory savings with minimal accuracy loss and significantly faster search. Only use NONE for very small datasets where maximum precision matters. |
Why INT8 is faster
Quantization doesn’t just save memory — it fundamentally changes how vectors are read during search.
Without quantization (NONE), each node visited during graph traversal requires a full document lookup: read the record from disk, deserialize it, and extract the vector property. With INT8, vectors are stored in compact contiguous index pages and read directly — no document deserialization needed.
In benchmarks with 500K 384-dimensional vectors (matching the all-MiniLM-L6-v2 embedding model), INT8 reduces mean search latency by roughly 2.2x compared to NONE (3.50 ms vs 1.59 ms):
| Quantization | Mean latency | p95 latency | Vector fetch path |
|---|---|---|---|
| NONE | 3.50 ms | 4.36 ms | Document lookup (random I/O) |
| INT8 | 1.59 ms | 1.94 ms | Index pages (sequential I/O) |
The difference becomes even more significant under memory pressure. With NONE quantization, vector data is 4x larger, evicting more data from memory caches and forcing real disk I/O. INT8 keeps the working set small enough to stay in memory even with constrained resources.
Quantization is transparent — queries work identically regardless of quantization setting. The index automatically quantizes on insert and dequantizes on retrieval.
When using PRODUCT quantization, the graph build uses Product Quantization scores instead of exact vector distances. PQ codes are compact and stay in memory, eliminating disk I/O during graph construction. This is most effective on large datasets (100K+ vectors) where PQ quality is sufficient.
| Quantization | Vector memory | Search speed |
|---|---|---|
| NONE | 156 MB | Baseline |
| INT8 | 39 MB | ~2.5x faster |
| BINARY | 5 MB | ~3x faster (lower recall) |
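The 4x and 32x ratios follow from the storage width per dimension. The sketch below works through that arithmetic under a stated assumption: it counts raw vector payload only (4 bytes per dimension for float32, 1 byte for INT8, 1 bit for BINARY) and ignores graph structure and any per-vector overhead. The class and method names are hypothetical.

```java
// Hypothetical illustration: raw payload size of one vector per quantization type.
final class VectorMemory {

  static long bytesPerVector(String quantization, int dimensions) {
    switch (quantization) {
      case "NONE":
        return 4L * dimensions;          // float32: 4 bytes per dimension
      case "INT8":
        return dimensions;               // 1 byte per dimension
      case "BINARY":
        return (dimensions + 7) / 8;     // 1 bit per dimension, rounded up to whole bytes
      default:
        throw new IllegalArgumentException("Unsupported type: " + quantization);
    }
  }

  public static void main(String[] args) {
    int dimensions = 384;
    for (String type : new String[] {"NONE", "INT8", "BINARY"})
      System.out.println(type + ": " + bytesPerVector(type, dimensions) + " bytes per vector");
  }
}
```

For 384 dimensions this gives 1536, 384, and 48 bytes, which matches the 4x and 32x reductions in the table above.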
Vector Encoding (Pre-Quantized Ingest)
| Available since ArcadeDB v26.5.1. |
The encoding index option controls the wire and document-storage representation of vectors. It is distinct from quantization, which controls the index-internal compression scheme:
| Knob | What it changes | Trade-off |
|---|---|---|
| encoding | What the document column stores and what HTTP clients send | Wire payload, bucket bytes, client round trip |
| quantization | What the HNSW graph compresses internally | Index memory footprint, recall, search latency |
Two encodings are supported:
- FLOAT32 (default): the embedding property is ARRAY_OF_FLOATS (4 bytes per dimension). Backwards-compatible behaviour for every index created prior to v26.5.1.
- INT8: the embedding property is BINARY (one signed byte per dimension). Callers using providers that emit int8 directly (Cohere int8 endpoints, OpenAI text-embedding-3-large reduced precision, Sentence Transformers with int8 quantization) skip a precision-losing client-side int8 → float32 → server round trip. The HTTP payload and document bucket storage shrink 4x.
The HNSW graph still runs on float32 internally; the engine dequantizes on the read path using value / 127.0f (Cohere/OpenAI calibration). Native int8 HNSW is tracked upstream at datastax/jvector#665 — once the JVector contract widens, no schema change will be needed.
-- INT8-encoded ingest. Property MUST be BINARY.
CREATE PROPERTY Doc.embedding BINARY;
CREATE INDEX ON Doc (embedding) LSM_VECTOR METADATA {
"dimensions": 1024,
"similarity": "COSINE",
"encoding": "INT8"
};
The factory rejects two misconfigurations at index-creation time so the failure surfaces immediately rather than as silent mis-scaling later:
- encoding=INT8 with a non-BINARY property (or encoding=FLOAT32 with a BINARY property): the property type and encoding must agree.
- encoding=INT8 combined with quantization=INT8 is redundant: the property is already byte-quantized at the wire level, so the index-internal scalar quantizer would re-quantize the dequantized floats. Pick one (encoding=INT8 for payload/storage savings, quantization=INT8 for index-internal compression), not both.
When a Cohere-style int8 source is not the origin of the bytes, note that Java’s byte range [-128, 127] includes a value the Cohere/OpenAI calibration never emits. The dequantizer clamps -128 up to -127 to keep the result inside [-1, 1] for COSINE; a one-time WARNING is logged the first time a -128 byte is encountered in a process so operators can investigate non-conforming sources.
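The read-path rule described above (divide by 127, clamp -128 up to -127) can be sketched as follows. This is an illustration of the stated arithmetic only, with hypothetical class and method names; it is not the engine's code, and the one-time WARNING log is omitted.

```java
// Hypothetical illustration of the INT8 dequantization rule: value / 127.0f,
// with -128 clamped to -127 so results stay inside [-1, 1].
final class Int8Dequantizer {

  static float dequantize(byte value) {
    int clamped = Math.max(value, -127); // -128 is outside the expected calibration
    return clamped / 127.0f;
  }

  static float[] dequantize(byte[] encoded) {
    float[] out = new float[encoded.length];
    for (int i = 0; i < encoded.length; i++)
      out[i] = dequantize(encoded[i]);
    return out;
  }

  public static void main(String[] args) {
    float[] decoded = dequantize(new byte[] {127, 0, -127, -128});
    for (float f : decoded)
      System.out.println(f);
  }
}
```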
Store Embeddings in an EXTERNAL Property
| Available since ArcadeDB v26.5.1. |
A vector embedding is typically the largest field on a record — a FLOAT32 1,536-dim embedding alone weighs 6 KB, an order of magnitude bigger than the topology (id, label, edge lists) it sits next to.
Stored inline, every traversal, scan, or projection that does not touch the embedding still pages those 6 KB into the buffer cache.
On a graph of a few million vertices that is enough to evict the useful topology working set long before the dataset is exhausted.
Declare the embedding property as EXTERNAL true so the bytes move to a paired external bucket. The main record then carries only an 8-byte pointer; the embedding is loaded lazily, on the queries that actually need it (vector search, similarity scoring, hybrid retrieval). See External property storage for the full description, including the per-bucket pairing, compression, and tiered-storage options.
-- FLOAT32 embedding stored in a paired external bucket.
CREATE PROPERTY Doc.embedding ARRAY_OF_FLOATS (EXTERNAL true);
CREATE INDEX ON Doc (embedding) LSM_VECTOR METADATA {
"dimensions": 1536,
"similarity": "COSINE"
};
-- Same idea with INT8 wire/storage encoding -- still external, 4x smaller payload.
CREATE PROPERTY Doc.embedding BINARY (EXTERNAL true);
CREATE INDEX ON Doc (embedding) LSM_VECTOR METADATA {
"dimensions": 1024,
"similarity": "COSINE",
"encoding": "INT8"
};
When you should opt in:
- The embedding is large compared to the rest of the record (any modern dense embedding qualifies).
- Most of your workload is not vector search: graph traversals, filters by other properties, or full-record reads that project everything except the embedding all benefit immediately.
- You plan to put the heavy payload on a different volume (e.g. embeddings on bulk storage, topology on NVMe). Set arcadedb.externalPropertyBucketPath before creating the property; see Tiered storage.
When inline storage is still the right default:
- Records are tiny and the embedding is your hottest field (every query reads it). The extra pointer lookup is a per-read cost that adds up.
- You are running a vector-only workload where every query projects embedding. There is nothing to save.
To migrate an existing populated type to external storage in one step, use ALTER PROPERTY … EXTERNAL true followed by REBUILD TYPE; see Migrating existing records.
Key Parameters
| Parameter | Default | Purpose |
|---|---|---|
| dimensions | (required) | Must match your embedding model output size |
| similarity | COSINE | Distance metric: COSINE, DOT_PRODUCT, or EUCLIDEAN |
| encoding | FLOAT32 | Wire / document-storage encoding: FLOAT32 (ARRAY_OF_FLOATS property) or INT8 (BINARY property) |
| quantization | | Index-internal compression: NONE, INT8, BINARY, or PRODUCT. INT8 is recommended for most use cases. |
| efSearch | adaptive | Search beam width at query time. Controls recall vs speed trade-off. See efSearch and Adaptive Search. |
| maxConnections | 16 | Connections per node. Higher = better recall, more memory |
| beamWidth | 100 | Search depth during build. Higher = better index quality, slower builds |
| addHierarchy | false | Enable multi-layer HNSW for large/complex datasets |
| | false | Co-locate vectors in graph file for faster retrieval at large scale |
| | true | Build the HNSW graph immediately at index creation time. Set to |
efSearch and Adaptive Search
The efSearch parameter controls how many candidate nodes the search explores in the vector graph. Higher values find more accurate results but take longer.
When efSearch is not explicitly set (either on the index or per-query), ArcadeDB uses an adaptive two-pass strategy:
- First pass: Uses a moderate beam width (2 × k), which is sufficient for most queries on well-clustered data.
- Second pass: If the first pass returns insufficient results, the search automatically widens the beam to 10 × k.
For small indexes (< 10K vectors), the full default efSearch is always used since the cost is negligible.
This adaptive behavior gives you fast queries on easy lookups while still maintaining recall on harder queries — without requiring any tuning.
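The two-pass strategy can be sketched as a small wrapper around any beam search. This is a simplified illustration, not ArcadeDB's implementation: the names are hypothetical, "insufficient results" is assumed here to mean fewer than k hits, and the small-index shortcut is left out.

```java
import java.util.List;
import java.util.function.IntFunction;

// Hypothetical illustration of an adaptive two-pass search.
final class AdaptiveSearch {

  // searchWithBeam runs one graph search with the given beam width
  // and returns its hits, best first.
  static <T> List<T> search(int k, IntFunction<List<T>> searchWithBeam) {
    // First pass: moderate beam of 2 x k.
    List<T> first = searchWithBeam.apply(2 * k);
    if (first.size() >= k)
      return first.subList(0, k);

    // Second pass: assumed trigger is "fewer than k hits"; widen to 10 x k.
    List<T> second = searchWithBeam.apply(10 * k);
    return second.subList(0, Math.min(k, second.size()));
  }

  public static void main(String[] args) {
    // Fake search that only finds enough hits with a wide beam.
    List<String> hits = AdaptiveSearch.<String>search(2,
        beam -> beam < 10 ? List.of("a") : List.of("a", "b", "c"));
    System.out.println(hits);
  }
}
```

Easy queries pay for one narrow pass only; the wider pass runs just when the first one comes up short.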
Setting efSearch
You can set efSearch at three levels:
Per-query (highest priority) — pass as the 4th argument to vectorNeighbors(), either positionally or via the named options map:
-- Higher efSearch for a critical query that needs maximum recall (positional)
SELECT expand(vectorNeighbors('Doc[embedding]', [...], 10, 500))
-- Lower efSearch for a latency-sensitive query (positional)
SELECT expand(vectorNeighbors('Doc[embedding]', [...], 10, 30))
-- Options map form (extensible; also supports `filter`)
SELECT expand(vectorNeighbors('Doc[embedding]', [...], 10, { efSearch: 500 }))
Per-index — set in the index metadata at creation time:
CREATE INDEX ON Doc (embedding) LSM_VECTOR METADATA {
dimensions: 1024,
similarity: 'COSINE',
efSearch: 200
}
Adaptive (default) — when neither per-query nor per-index efSearch is specified, the adaptive strategy described above is used.
| For most workloads, the adaptive default works well. Only set efSearch explicitly if you need consistently high recall regardless of query difficulty, or if you have strict latency requirements. |
Filtered Search
Vector search can be combined with a logical filter on the same type by passing a filter option containing the allowed RIDs. The HNSW traversal restricts itself to that set, so non-matching vectors are skipped without decoding.
-- Find the 10 most similar documents within a specific tenant and category
SELECT vectorNeighbors(
'Document[embedding]',
:queryVector,
10,
{ filter: (SELECT @rid FROM Document WHERE tenantId = 'acme' AND category = 'finance') }
)
The filter value accepts a list of RIDs, RID strings, or any Identifiable. It can be produced by a subquery, a query parameter, or built programmatically.
Very selective filters (only a tiny fraction of records match) can starve the HNSW beam; combine filter with a higher efSearch to preserve recall.
Partition-aware filterable HNSW
When the type uses partitioned(<key>) as its bucket selection strategy and the query’s WHERE clause binds the partition key to a literal, the planner narrows vector.neighbors / vector.sparseNeighbors to a single per-bucket HNSW graph automatically — no filter option needed:
SELECT vector.neighbors('Document[embedding]', :queryVector, 10) AS neighbors
FROM Document
WHERE tenant_id = 'acme'
With the partition strategy in place, this query touches only the HNSW graph for tenant_id = 'acme' instead of fanning out across every bucket. The optimisation kicks in transparently for both dense (vector.neighbors) and sparse (vector.sparseNeighbors) search and composes with explicit filter lists when both are present.
The two approaches are complementary:
- partitioned(<key>) + a WHERE predicate on the key prunes the bucket set before the HNSW traversal even starts. Best for high-cardinality scope keys (tenant, customer, region) where every query naturally carries the predicate.
- filter: (SELECT @rid …) prunes individual RIDs during the HNSW traversal. Best for low-cardinality or query-shaped filters that don’t map cleanly onto a partition key.
For the multi-tenant SaaS case, partition-aware HNSW is usually the right primitive. See Schema design 101: choosing a bucket strategy for picking the partition key.
Multi-Modal Search
A single vertex type can have multiple vector indexes on different properties:
CREATE INDEX ON Product (imageEmbedding) LSM_VECTOR METADATA {dimensions: 512, similarity: 'COSINE'}
CREATE INDEX ON Product (textEmbedding) LSM_VECTOR METADATA {dimensions: 768, similarity: 'COSINE'}
Query each index independently to search by image similarity, text similarity, or combine scores.
Integration with Other Models
Vector search combines naturally with ArcadeDB’s other data models:
- Graph + Vectors: Find similar items, then traverse relationships to discover connected context (Graph RAG pattern)
- Full-text + Vectors: Hybrid search combining keyword matching with semantic similarity (Knowledge Graph pattern)
- Time Series + Vectors: Detect behavioral anomalies by comparing embedding patterns over time (Fraud Detection pattern)
SQL Example
Create a vector index and query it:
-- Create vertex type and property
CREATE VERTEX TYPE Document;
CREATE PROPERTY Document.content STRING;
CREATE PROPERTY Document.embedding ARRAY_OF_FLOATS;
-- Create vector index with 384 dimensions using COSINE similarity
CREATE INDEX ON Document (embedding) LSM_VECTOR METADATA {
dimensions: 384,
similarity: 'COSINE'
};
-- Query for the 10 nearest documents
-- Returns rows with .record (full document) and .distance (0 = identical for COSINE)
SELECT expand(vectorNeighbors('Document[embedding]', $queryVector, 10))
Java Example
Create and query a vector index programmatically:
import com.arcadedb.index.lsm.LSMVectorIndex;
import com.arcadedb.index.lsm.LSMVectorIndexBuilder;
import com.arcadedb.index.vector.VectorSimilarityFunction;
import com.arcadedb.query.sql.executor.ResultSet;
// `database` is an already-open database instance; `queryVector` is assumed
// to be a float[] with 384 elements, matching the index dimensions.
// Create index programmatically
final LSMVectorIndexBuilder builder = new LSMVectorIndexBuilder(
    database,
    "Document",
    new String[]{"embedding"})
    .withDimensions(384)
    .withSimilarity(VectorSimilarityFunction.COSINE)
    .withMaxConnections(16)
    .withBeamWidth(100);
final LSMVectorIndex index = builder.create();
// Query the index using SQL; the vector is bound to the `?` placeholder
final ResultSet resultSet = database.query("sql",
    "SELECT expand(vectorNeighbors('Document[embedding]', ?, 10))",
    queryVector);
Configuration Parameters
When creating LSMVectorIndex instances, the following parameters can be configured:
- dimensions: The dimensionality of the vectors (must match your embedding model output)
- similarity: The distance function for similarity calculation (COSINE, DOT_PRODUCT, EUCLIDEAN, etc.)
- maxConnections: Maximum number of connections per layer in the HNSW graph (default: 16, increase for better recall)
- beamWidth: Beam width for approximate nearest neighbor search (default: 100, increase for more accurate results)
Supported Similarity Functions
| Measure | Name | Type |
|---|---|---|
|
L2 |
|
|
L2 |
|
|
L2 |
For more information on vector embeddings, see the Vector Embeddings section.
Sparse Vector Search
| Available since ArcadeDB v26.5.1. |
In addition to dense embeddings, ArcadeDB supports sparse vector retrieval via the LSM_SPARSE_VECTOR index. Sparse vectors carry only the non-zero dimensions of a high-dimensional space and are produced by learned-sparse retrieval models such as SPLADE, BGE-M3, OpenSearch’s opensearch-neural-sparse-encoding-multilingual-v1, and BM25-as-sparse-vector pipelines.
Sparse retrieval excels where dense embeddings collapse semantically distinct queries into the same point ("freeze card after theft" vs "unfreeze card after travel"), because the per-token weights expose the discriminating terms directly.
Storage layout
The index uses a posting-list inverted structure on the LSM-Tree backbone with composite key (int dim_id, RID rid, float weight). It inherits ACID transactions, WAL, replication, and compaction from the LSM-Tree, just like every other ArcadeDB index. Per-document data lives in two parallel array properties:
- an ARRAY_OF_INTEGERS of non-zero dimension ids, and
- an ARRAY_OF_FLOATS of the matching weights.
Both arrays must have the same length on every record. Weights must be non-negative; negative weights would corrupt the WAND per-dim upper bound used during retrieval and are rejected at write time. All standard sparse models (BM25, SPLADE, BGE-M3, Cohere sparse) emit non-negative weights, so this is not a real constraint in practice.
Schema setup
CREATE DOCUMENT TYPE Doc;
CREATE PROPERTY Doc.tokens ARRAY_OF_INTEGERS;
CREATE PROPERTY Doc.weights ARRAY_OF_FLOATS;
-- LSM_SPARSE_VECTOR index. `dimensions` declares the vocabulary cap (0 = open-ended).
-- `modifier: 'IDF'` enables Robertson-Sparck-Jones IDF weighting at query time.
CREATE INDEX ON Doc (tokens, weights) LSM_SPARSE_VECTOR
METADATA { "dimensions": 105000, "modifier": "IDF" };
Querying
Use vector.sparseNeighbors to retrieve top-K records by sparse dot product:
SELECT expand(`vector.sparseNeighbors`(
'Doc[tokens,weights]',
:queryIndices, :queryValues,
50
))
Options map (5th argument) supports filter (allowed-RIDs whitelist), groupBy, groupSize — see Group-By Retrieval below.
Top-K algorithm
Retrieval uses document-at-a-time WAND with per-dim max_weight upper bounds: per-dim cursors traverse postings in RID order, the algorithm pivots to the smallest RID whose cumulative upper bound can still beat the current K-th best score, and skips cursors below the pivot via direct seeks. Only dot-product similarity is supported (cosine on sparse vectors is conventionally handled by L2-normalizing both the query and stored vectors at insert time).
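To make the scoring function concrete, the sketch below computes the sparse dot product between one query and one document, each given as parallel index and weight arrays. It scores a single pair exhaustively and does not implement WAND pruning; the names are hypothetical, and the validation mirrors the equal-length and non-negative-weight rules stated earlier.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration: dot product of two sparse vectors
// stored as parallel (dimension id, weight) arrays.
final class SparseDot {

  static double dot(int[] queryIdx, float[] queryVal, int[] docIdx, float[] docVal) {
    validate(queryIdx, queryVal);
    validate(docIdx, docVal);

    // Index the query weights by dimension id.
    Map<Integer, Float> query = new HashMap<>();
    for (int i = 0; i < queryIdx.length; i++)
      query.put(queryIdx[i], queryVal[i]);

    // Only dimensions present in both vectors contribute to the score.
    double score = 0;
    for (int i = 0; i < docIdx.length; i++) {
      Float weight = query.get(docIdx[i]);
      if (weight != null)
        score += weight * docVal[i];
    }
    return score;
  }

  // Both arrays must have the same length and weights must be non-negative.
  static void validate(int[] idx, float[] val) {
    if (idx.length != val.length)
      throw new IllegalArgumentException("indices and weights differ in length");
    for (float w : val)
      if (w < 0)
        throw new IllegalArgumentException("negative weight: " + w);
  }

  public static void main(String[] args) {
    double score = dot(new int[] {1, 5, 9}, new float[] {0.5f, 1f, 2f},
                       new int[] {5, 9, 20}, new float[] {2f, 0.25f, 3f});
    System.out.println(score);
  }
}
```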
The MVP works well up to roughly 10M sparse vectors. The scaling story past that point (BlockMax-WAND with per-page bounds, weight quantization, parallel per-segment scoring) is tracked separately.
Hybrid Search with vector.fuse
| Available since ArcadeDB v26.5.1. |
vector.fuse combines two or more ranked sub-pipelines into a single ranked top-K, server-side, in one query. It generalises the dense+sparse hybrid pattern but works with any source that yields (@rid, $score) rows: dense vector.neighbors, sparse vector.sparseNeighbors, full-text SEARCH_INDEX, or any plain SELECT … ORDER BY … LIMIT N.
SELECT expand(`vector.fuse`(
`vector.neighbors`('Doc[dense]', :denseVec, 50),
`vector.sparseNeighbors`('Doc[tokens,weights]', :qIdx, :qVal, 50),
{ fusion: 'RRF', groupBy: 'source_file', groupSize: 1 }
)) LIMIT 10
Fusion strategies
vector.fuse supports three strategies via the fusion option:
| Strategy | When to use |
|---|---|
| RRF | Reciprocal Rank Fusion. |
| DBSF | Distribution-Based Score Fusion (Qdrant 1.11+). Per-source scores normalised to [mean - 3sigma, mean + 3sigma], then weighted sum. Useful when sources produce comparable, roughly Gaussian distributions and you want score magnitude (not just rank) to influence fusion. |
| LINEAR | Per-source min-max normalisation, then weighted sum. Use when you have already-tuned weights from offline relevance experiments. Requires every source row to carry a numeric score. |
vector.fuse assumes "higher = better" for every source. It auto-flips the distance field exposed by vector.neighbors to a similarity at extract time so dense and sparse sources fuse without manual rescaling. If you build a custom source via SELECT @rid, <expr> AS score FROM …, expose score with similarity semantics for LINEAR and DBSF to behave correctly. RRF is rank-only so it tolerates either convention.
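The two score-based normalisations can be sketched for a single source as follows. This is an illustration under stated assumptions, not the server's code: names are hypothetical, the standard deviation is the population form, DBSF output is clamped to [0, 1], and a source whose scores are all equal maps to 1.0.

```java
// Hypothetical illustration of per-source score normalisation
// before a weighted sum (higher score = better).
final class ScoreNormalization {

  // LINEAR-style: min-max normalisation to [0, 1].
  static double[] minMax(double[] scores) {
    double min = Double.POSITIVE_INFINITY;
    double max = Double.NEGATIVE_INFINITY;
    for (double s : scores) {
      min = Math.min(min, s);
      max = Math.max(max, s);
    }
    double[] out = new double[scores.length];
    for (int i = 0; i < scores.length; i++)
      out[i] = max == min ? 1.0 : (scores[i] - min) / (max - min); // assumed degenerate case
    return out;
  }

  // DBSF-style: rescale using [mean - 3 sigma, mean + 3 sigma] as the bounds.
  static double[] dbsf(double[] scores) {
    double mean = 0;
    for (double s : scores)
      mean += s;
    mean /= scores.length;

    double variance = 0;
    for (double s : scores)
      variance += (s - mean) * (s - mean);
    double sigma = Math.sqrt(variance / scores.length); // population form (assumption)

    double lower = mean - 3 * sigma;
    double upper = mean + 3 * sigma;
    double[] out = new double[scores.length];
    for (int i = 0; i < scores.length; i++) {
      double scaled = sigma == 0 ? 1.0 : (scores[i] - lower) / (upper - lower);
      out[i] = Math.min(1.0, Math.max(0.0, scaled)); // clamp outliers (assumption)
    }
    return out;
  }

  public static void main(String[] args) {
    double[] scores = {1.0, 2.0, 3.0};
    System.out.println(java.util.Arrays.toString(minMax(scores)));
    System.out.println(java.util.Arrays.toString(dbsf(scores)));
  }
}
```

Min-max always stretches the observed range to [0, 1], while the 3-sigma bounds keep a single extreme score from compressing the rest of the distribution.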
Weighted fusion
Per-source weights tilt the fused ranking:
-- Dense matters 3x more than sparse for this corpus.
SELECT expand(`vector.fuse`(
`vector.neighbors`('Doc[dense]', :denseVec, 50),
`vector.sparseNeighbors`('Doc[tokens,weights]', :qIdx, :qVal, 50),
{ fusion: 'RRF', weights: [3.0, 1.0] }
)) LIMIT 10
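As a rough illustration of how per-source weights tilt a rank-based fusion, the sketch below implements weighted Reciprocal Rank Fusion, where each source contributes weight / (k + rank) for every record it returns. The exact formula used by vector.fuse is not shown in this document, so treat this form, the constant k = 60, and all names as assumptions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of weighted Reciprocal Rank Fusion.
final class WeightedRrf {

  // rankings: one best-first list of record ids per source.
  // weights:  one weight per source, in the same order.
  // k:        rank-smoothing constant.
  static List<String> fuse(List<List<String>> rankings, double[] weights, int k) {
    Map<String, Double> scores = new LinkedHashMap<>();
    for (int s = 0; s < rankings.size(); s++) {
      List<String> ranking = rankings.get(s);
      for (int pos = 0; pos < ranking.size(); pos++) {
        double contribution = weights[s] / (k + pos + 1); // ranks are 1-based
        scores.merge(ranking.get(pos), contribution, Double::sum);
      }
    }
    // Sort by fused score, highest first.
    List<String> fused = new ArrayList<>(scores.keySet());
    fused.sort(Comparator.comparingDouble((String id) -> scores.get(id)).reversed());
    return fused;
  }

  public static void main(String[] args) {
    List<List<String>> rankings = List.of(
        List.of("#1:0", "#1:1"),   // dense source
        List.of("#1:1", "#1:0"));  // sparse source
    System.out.println(fuse(rankings, new double[] {3.0, 1.0}, 60));
    System.out.println(fuse(rankings, new double[] {1.0, 3.0}, 60));
  }
}
```

With the two sources disagreeing on order, the higher-weighted source decides which record comes first.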
Three-way fusion (dense + sparse + full-text)
SELECT expand(`vector.fuse`(
`vector.neighbors`('Doc[dense]', :denseVec, 100),
`vector.sparseNeighbors`('Doc[tokens,weights]', :qIdx, :qVal, 100),
(SELECT @rid, $score FROM Doc WHERE SEARCH_INDEX('Doc[content]', :keywords) = true),
{ fusion: 'RRF', k: 60 }
)) LIMIT 10
Outer composition
The fused output is itself a result set, so it composes with outer WHERE, ORDER BY, LIMIT, joins, and projections:
-- Score threshold + projection.
SELECT title, source_file, score
FROM (
SELECT expand(`vector.fuse`(
`vector.neighbors`('Doc[dense]', :denseVec, 50),
`vector.sparseNeighbors`('Doc[tokens,weights]', :qIdx, :qVal, 50),
{ fusion: 'RRF' }
))
)
WHERE score >= 0.02
ORDER BY score DESC
LIMIT 10
Group-By Retrieval
| Available since ArcadeDB v26.5.1. |
vector.neighbors, vector.sparseNeighbors, and vector.fuse all accept groupBy and groupSize options for diversification at retrieval time, mirroring Qdrant’s query_points_groups semantics. Common pattern: best chunk per source document, deduped at the index level rather than over-fetched and post-partitioned in the application.
-- Top 10 distinct source files, best matching chunk from each.
SELECT expand(`vector.neighbors`(
'Doc[embedding]', :queryVec, 10,
{ groupBy: 'source_file', groupSize: 1 }
))
The third positional argument (limit / k) becomes the max number of distinct groups when groupBy is set. Total returned rows are bounded by limit * groupSize. groupSize defaults to 1.
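The result shape these options produce can be sketched over an already-ranked candidate list. This illustrates the semantics only (at most limit groups, at most groupSize rows per group), not how the index applies grouping during traversal; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of groupBy / groupSize semantics
// applied to a best-first list of hits.
final class GroupedTopK {

  // Each hit is {rid, groupKey}; hits must be ordered best first.
  static List<String> select(List<String[]> rankedHits, int limit, int groupSize) {
    Map<String, Integer> perGroup = new HashMap<>(); // HashMap accepts a null group key
    List<String> kept = new ArrayList<>();
    for (String[] hit : rankedHits) {
      String rid = hit[0];
      String group = hit[1];
      Integer count = perGroup.get(group);
      if (count == null) {
        if (perGroup.size() >= limit)
          continue;            // no room left for a new group
        count = 0;
      }
      if (count >= groupSize)
        continue;              // this group is already full
      perGroup.put(group, count + 1);
      kept.add(rid);
    }
    return kept;               // never more than limit * groupSize rows
  }

  public static void main(String[] args) {
    List<String[]> hits = List.of(
        new String[] {"#1:0", "a.pdf"},
        new String[] {"#1:1", "a.pdf"},
        new String[] {"#1:2", "b.pdf"},
        new String[] {"#1:3", "c.pdf"},
        new String[] {"#1:4", "b.pdf"});
    System.out.println(select(hits, 2, 1));
    System.out.println(select(hits, 2, 2));
  }
}
```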
groupBy accepts dotted nested-field paths — metadata.author, provenance.source.id, etc. Each segment after the first descends one level via Map-typed properties or embedded documents; a missing segment lands the row in the null group.
groupBy composes with the existing filter option:
SELECT expand(`vector.neighbors`(
'Doc[embedding]', :queryVec, 10,
{ filter: [#5:0, #5:1, #5:2], groupBy: 'source_file', groupSize: 1 }
))
Grouping is integrated into the index traversal proper, not applied as a post-filter on an over-fetched candidate pool:
- Sparse (vector.sparseNeighbors): the BlockMax-WAND DAAT loop runs with a per-group min-heap. A new group opens only while the heap holds fewer than limit distinct keys; within each group, a candidate replaces the group’s worst member when its score is higher. The pruning threshold tightens to the global per-group worst score once every group has reached groupSize, so the BMW pivot can skip whole posting-list regions that cannot beat any group’s current worst member. allowedRIDs is applied inline, with no over-fetch.
- Dense (vector.neighbors): a group-aware Bits filter is plugged into JVector’s HNSW search. Candidates from a full group are rejected before scoring, so the graph traversal stops expanding into regions that cannot contribute new groups. The search budget is sized to limit * groupSize. Because Bits cannot consult scores (it gates eligibility before scoring), best-per-group is approximate: the first groupSize admitted candidates per group are kept, in HNSW visit order. HNSW visits approximately best-first from the entry node so the first encountered are usually among the best, but a strict best-per-group guarantee is not provided; raise efSearch if you need wider coverage.
Pathological combinations of limit and groupSize (e.g. limit=1000, groupSize=1000 would size a per-group state past the cap) are still rejected with an explicit over-fetch budget exceeded error to bound memory.
Further Reading
- Vector Search Tutorial: Step-by-step hands-on guide
- Vector Embeddings How-To: Index creation, tuning, and best practices
- Java Vector API: Programmatic vector index management
- SQL Vector Functions: All 40+ vector SQL functions
- Sparse Vector Search and Hybrid Search with vector.fuse: This section