Vector databases · LLM Switchboard

Zvec (alibaba/zvec) Apache 2.0

An open-source, lightweight, in-process (embedded) vector database that links directly into your application instead of running as a separate server or daemon ('no server, no config, just install'). It is widely framed as 'the SQLite of vector databases.' The codebase is primarily C++ (~80%), with SWIG/Python/C bindings. It builds on Proxima, Alibaba's production vector search engine, which has powered vector retrieval in Taobao search, Alipay face payment, Youku video search, and Alimama advertising for years. Zvec is positioned for on-device/edge RAG and for cases where running Milvus would be overkill, though the docs say it scales from rapid prototyping to billion-scale production systems.

Vendor
Alibaba Group. The underlying Proxima engine was developed by Alibaba DAMO Academy's systems lab (not Tongyi Lab).

Key features
Index types: HNSW, IVF, and DiskANN (a new on-disk index in v0.5.0 that keeps the bulk of the index on disk to cut memory use for large datasets). Supports dense and sparse vectors plus multi-vector queries. Native Full-Text Search (FTS): attach an FTS index to any string field. Hybrid search unifies vector similarity, FTS, and scalar/structured filtering in a single MultiQuery, with a built-in reranker (RRF). Full CRUD and schema evolution. Write-ahead logging (WAL) for crash-safe durability. Concurrency model: multiple processes can read a collection simultaneously, but writes are exclusive to a single process. SDKs for Python (3.10-3.14), Node.js (@zvec/zvec), Go, Rust, and Dart/Flutter. Runs on Linux (x86_64/ARM64), macOS (ARM64), Windows (x86_64), and RISC-V (added in v0.5.0). Performance: claims billion-scale search in milliseconds; Alibaba reports >8,000 QPS on VectorDBBench Cohere 10M (768-dim, int8 + refiner, HNSW m=50, ef-search=118), roughly 2x the prior leaderboard #1 (Zilliz Cloud), with reduced index build time.

Status
Open-sourced in early February 2026 (~Feb 10, 2026); very active. ~12.3k GitHub stars and ~730 forks by mid-2026. Latest release: v0.5.0 (June 12, 2026).

Why notable
A major, credible vendor (Alibaba) is open-sourcing a battle-tested production engine (Proxima) under the permissive Apache 2.0 license, which is rare for an embedded vector DB. It is one of very few embedded/in-process vector DBs to ship DiskANN, native FTS, and hybrid search with no server, directly challenging sqlite-vec, Chroma, and LanceDB in the local/embedded niche while claiming to scale far beyond them. The C++ core, broad multi-language SDKs, and RISC-V support signal serious engineering investment. One of the fastest-rising new vector-DB projects of 2026.
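
The hybrid-search feature above names RRF (Reciprocal Rank Fusion) as the built-in reranker. As a rough illustration of the general technique, the sketch below fuses a vector-search ranking and a full-text ranking in plain Python. It is not Zvec's API or implementation: the function name `rrf_fuse`, the document IDs, and the constant `k=60` (a commonly used default for RRF) are all assumptions made for this example.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over every list it appears in,
    where rank starts at 1. k=60 is an assumed, conventional default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; break ties by ID so output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from two retrievers over the same collection.
vector_hits = ["doc3", "doc1", "doc7"]  # ordered by vector similarity
fts_hits = ["doc1", "doc9", "doc3"]     # ordered by full-text relevance

fused = rrf_fuse([vector_hits, fts_hits])
for doc_id, score in fused:
    print(f"{doc_id}\t{score:.5f}")
```

Because RRF uses only rank positions, it needs no score normalization between the vector and full-text retrievers, which is one reason it is a common choice for hybrid search.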

Pick by where your data already lives and how much you want to operate.
(1) Already on Postgres? Use pgvector: vectors sit beside relational data, you get SQL filters/joins, and it is widely supported by managed Postgres providers. Add pgvectorscale (StreamingDiskANN) when the index outgrows RAM. This is the best default for most SMBs.
(2) Already on MongoDB/Atlas, Redis, or Elastic/OpenSearch? Use that system's built-in vector/kNN, which means one fewer system to run. OpenSearch (Apache 2.0) avoids Elastic's licensing knots, and Elastic/OpenSearch shine when you need true hybrid BM25+vector.
(3) Local-first, embedded, or edge with no server? Use sqlite-vec for maximally portable small datasets (brute-force, roughly 100k vectors or fewer); Chroma for the simplest Python RAG prototype (memory-bound, <~1M); LanceDB when you need disk/object-storage scale (millions to billions) plus multimodal and time-travel; and Zvec (alibaba/zvec) as the standout new option: embedded like SQLite but with HNSW/IVF/DiskANN, native full-text, and hybrid search, scaling far past Chroma while staying Apache 2.0 and serverless.
(4) Want a dedicated, self-hostable, fully open vector DB? Qdrant is the strongest all-rounder (Apache 2.0 with no feature gating, filterable HNSW, single binary, generous free cloud); Weaviate if you want batteries-included AI modules (embeddings, reranking, query agent); Milvus for maximum index flexibility/GPU at billion scale (use Zilliz Cloud Serverless rather than self-hosting the cluster); Vespa for the most demanding hybrid-search/ranking/recommendation workloads (most powerful, steepest curve).
(5) Want zero ops, fully managed? Pinecone for the easiest serverless start (mind the $50/mo minimum and usage-based bill); Turbopuffer for the best cost-at-scale on large, mostly-cold object-storage data; Marqo when you want embedding generation and search unified in one API.
SMB rule of thumb: don't add a new system if your existing database (Postgres, Mongo, Redis, OpenSearch) already does vectors well enough; reach for a dedicated DB (Qdrant/Weaviate/Milvus) only when scale, recall, or filtered-query performance demands it; and for purely local/embedded needs, Zvec, LanceDB, sqlite-vec, and Chroma cover the spectrum from prototype to disk-scale with no infrastructure. License watch-outs: Redis (AGPL vs RSAL feature gating), Elasticsearch (AGPL/SSPL/Elastic License), Pinecone/Turbopuffer/Atlas (proprietary SaaS lock-in).
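
Several of the thresholds above ("outgrows RAM", "memory-bound") come down to simple arithmetic on vector size. The sketch below estimates the storage for the raw vectors only. It deliberately ignores index overhead (graph links, centroids), metadata, and per-database framing, all of which vary by engine, so treat the result as a lower bound rather than a sizing guide. The helper names are hypothetical.

```python
def raw_vector_bytes(num_vectors: int, dim: int, bits_per_component: int = 32) -> int:
    """Bytes needed for the raw vectors alone (no index, no metadata)."""
    total_bits = num_vectors * dim * bits_per_component
    return (total_bits + 7) // 8  # round up to whole bytes


def to_gib(num_bytes: int) -> float:
    """Convert bytes to GiB (2**30 bytes)."""
    return num_bytes / 1024 ** 3


# Illustrative workloads: (label, vector count, dimensions, bits per component).
workloads = [
    ("1M x 768, float32", 1_000_000, 768, 32),
    ("10M x 768, float32", 10_000_000, 768, 32),
    ("10M x 768, int8", 10_000_000, 768, 8),
    ("10M x 768, 1-bit", 10_000_000, 768, 1),
]

for label, n, dim, bits in workloads:
    size = raw_vector_bytes(n, dim, bits)
    print(f"{label}: {size:,} bytes ({to_gib(size):.2f} GiB)")
```

For 1M vectors at 768 dimensions, float32 storage is 3,072,000,000 bytes before any index overhead. Going from 32-bit floats to 1 bit per component is where the "~1/32 size" figure for 1-bit quantization comes from.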

The field

Database	License	Deployment	Index	Best for	SMB fit
Zvec (alibaba/zvec) ↗ Embedded/in-process vector database (with FTS + hybrid search)	Apache 2.0	Embedded	HNSW, IVF, DiskANN (on-disk)	Local-first and edge apps, desktop/CLI tools, RAG prototypes that need to scale, and apps wanting embedded vector + full-text + filtering with no separate service. Backed by Alibaba's proven Proxima engine.	Excellent for SMBs/startups: zero infra cost, no servers to run, Apache 2.0, scales from prototype to billions of vectors. Newest option, so smaller ecosystem/community maturity than pgvector or Chroma.
pgvector / pgvectorscale ↗ PostgreSQL extension (vector search inside your relational DB)	pgvector: PostgreSQL License (BSD-like, OSI). pgvectorscale: PostgreSQL License (OSS).	Self-host (any Postgres) or managed (RDS	HNSW, IVFFlat (pgvector); StreamingDiskANN (pgvectorscale)	Teams already on Postgres who want vectors next to relational data with transactions and SQL joins/filters. pgvectorscale adds DiskANN for memory-efficient scale (Timescale reports ~28x lower p95 latency vs Pinecone s1 at 50M vectors, 99% recall).	Best default for most SMBs — reuse existing Postgres, no new system, cheap, huge ecosystem. v0.8.0 iterative scans fixed filtered-query under-fetch. Add pgvectorscale when datasets outgrow RAM.
Milvus / Zilliz Cloud ↗ Dedicated distributed vector database (cloud-native)	Apache 2.0 (Milvus OSS, under LF AI & Data Foundation). Zilliz Cloud is the proprietary managed service.	Self-host (standalone or distributed	HNSW, IVF (FLAT/PQ/SQ8/RABITQ), FLAT, SCANN, DiskANN, GPU CAGRA	Billion-scale workloads needing maximum index flexibility, GPU acceleration, and tiered storage. 2.6's IVF_RABITQ 1-bit quantization compresses index to ~1/32 size; tiered hot/cold storage cuts cost.	Powerful but operationally heavy to self-host at scale. SMBs should use Zilliz Cloud Serverless (free tier + pay-as-you-go) rather than running the distributed cluster themselves.
Qdrant ↗ Dedicated vector database / search engine (Rust)	Apache 2.0 (full core, no feature gating)	Self-host (single binary	HNSW (custom, with filterable graph traversal)	High-performance filtered vector search — its payload index extends the HNSW graph so metadata filters are applied during traversal (not post-filter), staying fast on complex filtered queries. REST + gRPC APIs.	Very strong SMB fit: fully Apache 2.0 with no paywalled features, easy single-binary self-host, and a generous free managed tier. Tens of thousands of production deployments.
Weaviate ↗ Dedicated vector database (objects + vectors, AI-native)	BSD-3-Clause (open-source core); Weaviate Cloud is the managed offering.	Self-host (open source	HNSW (with RQ-8 compression), Flat, Dynamic (auto-switch flat->HNSW)	AI-native apps wanting built-in modules: hybrid search, an Embedding Service, a Query Agent, and reranking integrated into the database. Auto-scales vector memory with no manual tuning.	Good fit when you want batteries-included AI features and managed embeddings without wiring up a pipeline. Serverless cloud lowers the entry barrier; self-host is free.
Pinecone ↗ Fully managed proprietary serverless vector database	Proprietary (closed source; managed SaaS only)	Managed serverless only (no self-host of the core). BYOC data plane now in public preview on AWS	Proprietary (purpose-built Rust engine; serverless, no index/node tuning exposed)	Teams that want zero ops and automatic read/write scaling for production AI. Adds Pinecone Inference (hosted embedding + reranking), Pinecone Assistant, Dedicated Read Nodes, and native full-text search (preview).	Easiest to start (no infra), but usage-based pricing with a $50/month minimum and per-RU/WU/storage charges can get expensive at scale. Good for SMBs that value time-to-market over cost control; watch the bill.
Chroma ↗ Embedded / lightweight open-source embedding database	Apache 2.0	Embedded (in-process	HNSW (with brute-force/flat for small sets); in-memory focused	Rapid RAG prototyping with a dead-simple Python API, metadata filtering, multi-modal, and persistent local storage. The default 'just works' choice in many LLM tutorials/frameworks.	Excellent for SMB prototypes and small (<~1M vector) workloads on a single machine. Memory-bound — performance degrades beyond ~1M vectors / available RAM, so plan a migration path for scale.
LanceDB ↗ Embedded / serverless multimodal vector + lakehouse DB (Rust)	Apache 2.0	Embedded mode (in-process	IVF-PQ (disk-based), plus HNSW variants; built on the Lance columnar format	Disk-based and multimodal workloads at millions-to-billions of vectors with low RAM, zero-copy access, and automatic table versioning / time-travel queries via the Lance format. Great for AI data lakes.	Strong SMB fit: embedded so no server to run, scales on disk/S3 far past RAM (unlike Chroma), Apache 2.0, and cheap. Slightly newer/lower-level API than Chroma.
Vespa ↗ Dedicated AI search platform (search + vectors + tensors + ranking)	Apache 2.0	Self-host (open source) or managed Vespa Cloud (serverless managed service; handles >500k req	Real-time HNSW (vetted on ann-benchmarks); native tensor/multi-vector support	Large-scale, low-latency hybrid search, recommendation, RAG, and complex ranking — stores and computes on tensors (not just vectors) for multi-modal and learned ranking. Powers Marqo's backend (16ms P50 @ 50M vs 140ms Milvus in their bench).	Most powerful but steepest learning curve and heaviest to operate. Overkill for simple RAG; SMBs needing its power should use Vespa Cloud rather than self-hosting the cluster.
Turbopuffer ↗ Serverless vector + full-text search built on object storage (proprietary)	Proprietary (managed SaaS)	Managed serverless only — no infra to provision; auto horizontal scaling.	Tiered ANN over object storage (S3 cold, NVMe warm, memory hot); proprietary	Cost-efficient large-scale search where data lives on cheap object storage (~$0.02/GB cold). Sub-10ms p50 warm latency; namespaces get faster as queried (auto-tiering). Added BM25 full-text + sparse vectors (April 2026). Production at 2.5T+ docs, 10M+ writes/s.	Great cost story for SMBs with large, mostly-cold datasets and bursty query patterns (10-100x cheaper than traditional vector DBs). Closed-source SaaS, so some lock-in; no self-host.
Redis (vector) ↗ In-memory data store with vector search (Redis Query Engine)	Redis 8+: tri-license — AGPLv3 (open source) OR RSALv2 OR SSPLv1. Redis Cloud/Enterprise are proprietary.	Self-host (Redis 8 OSS) or managed Redis Cloud	FLAT, HNSW, SVS-VAMANA (8-bit quantization). Intel LVQ/LeanVec optimizations only under RSALv2, not in AGPL/SSPL OSS builds.	Apps already using Redis that need ultra-low-latency in-memory vector search with KNN, range queries, and metadata filters — e.g., real-time recommendation, semantic caching, session retrieval.	Convenient if Redis is already in your stack and datasets fit in RAM. RAM-bound means cost rises with scale; licensing (AGPL vs RSAL feature gating) needs a quick check for commercial use.
Elasticsearch / OpenSearch kNN ↗ Search engine with dense-vector / kNN (vectors alongside full-text + BM25)	Elasticsearch: AGPLv3 (OSI) + SSPL + Elastic License v2 (tri-license). OpenSearch: Apache 2.0.	Self-host or managed (Elastic Cloud; Amazon OpenSearch Service; many providers).	HNSW (via Lucene; OpenSearch also supports FAISS and NMSLIB engines)	Teams that need true hybrid keyword (BM25) + vector search at scale with mature filtering, aggregations, and observability/log tooling already in place. OpenSearch is the fully Apache-2.0 fork after Elastic's 2021 license change.	Good if you already run Elastic/OpenSearch for search or logs — vectors come 'for free.' Heavier to operate than a purpose-built vector DB; OpenSearch's Apache 2.0 avoids Elastic licensing concerns.
MongoDB Atlas Vector Search ↗ Document database with integrated vector search	Proprietary (Atlas managed service; MongoDB core is SSPL)	Managed (MongoDB Atlas) primarily; vector search is an Atlas feature.	HNSW (graph-based; configurable m and efConstruction)	Teams already on MongoDB/Atlas wanting vectors next to their documents — supports up to 4096-dim embeddings and automatic scalar/binary quantization (set in the index definition) to cut memory. Voyage AI embeddings integrated.	Strong fit if you already use Atlas — one platform for operational data + vectors, no new system. Atlas-only (not in self-managed Community), so it ties you to the managed service and its pricing.
sqlite-vec ↗ Embedded SQLite extension for vector search	Apache 2.0 / MIT (dual)	Embedded: a single SQLite extension; runs anywhere SQLite does (server	Brute-force (highly optimized, hardware-specific SIMD distance functions auto-selected); no ANN graph yet	Maximally portable, dependency-free vector search for small-to-medium datasets, local/edge/mobile apps, and embedding it inside existing SQLite apps. Supports FLOAT32/FLOAT16/BFLOAT16/INT8/UINT8 and quantization.	Excellent for SMBs/solo devs needing zero-infra vectors in a local app or edge device. Brute-force means it's best below a few hundred thousand vectors; not for high-QPS billion-scale workloads.
Marqo ↗ Open-source tensor search engine (embedding generation + vector search in one API)	Apache 2.0 (open-source core); built on Vespa as its engine.	Self-host (Docker) or managed Marqo Cloud.	Inherits Vespa's HNSW (Marqo auto-designs the underlying Vespa schema)	End-to-end text-and-image (multimodal/tensor) search where you want embedding generation and vector search unified in a single API — no separate embedding pipeline. Marqo 2 redesigned on Vespa for scale/predictability.	Good for SMBs wanting an all-in-one search-with-built-in-embeddings stack and to skip managing models + a vector DB separately. Newer/smaller ecosystem; managed cloud eases ops.
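
Two ideas recur in the table: brute-force (exact) search, as in the sqlite-vec row, and the difference between filtering during search and filtering afterwards, as in the Qdrant and pgvector rows. The toy sketch below shows both with a linear scan in plain Python. It is only an illustration of the under-fetch problem: real engines apply filters inside an ANN index (for example during HNSW traversal) rather than scanning every row, and the data, field names, and function names here are made up for the example.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_knn(query, items, k, predicate=None):
    """Exact k-NN: scan every item, optionally filtering during the scan."""
    candidates = [it for it in items if predicate is None or predicate(it)]
    candidates.sort(key=lambda it: l2(query, it["vec"]))
    return candidates[:k]


def is_english(item):
    return item["lang"] == "en"


# Hypothetical collection: a vector plus one metadata field per item.
items = [
    {"id": 1, "vec": [0.0, 0.1], "lang": "en"},
    {"id": 2, "vec": [0.1, 0.0], "lang": "de"},
    {"id": 3, "vec": [0.2, 0.2], "lang": "de"},
    {"id": 4, "vec": [0.9, 0.9], "lang": "en"},
    {"id": 5, "vec": [1.0, 1.0], "lang": "en"},
]
query = [0.0, 0.0]

# Post-filter: take the top 2 first, then drop non-matching items.
post_filtered = [it for it in brute_force_knn(query, items, k=2) if is_english(it)]

# Filter during search: only matching items compete for the top 2.
pre_filtered = brute_force_knn(query, items, k=2, predicate=is_english)

print("post-filter ids:", [it["id"] for it in post_filtered])
print("filter-during-search ids:", [it["id"] for it in pre_filtered])
```

The post-filter path returns fewer than the requested two results because a nearest neighbor fails the filter, while filtering during the search still fills both slots. That gap is the under-fetch behavior the pgvector and Qdrant rows refer to.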