Vector Databases Explained — How Semantic Search Works
Every time an AI assistant finds the right document for your question, even when the words do not match, vector search is usually doing the work. Most RAG systems, semantic search features and AI memory implementations depend on a vector database, or on a vector index inside a general-purpose database.
Yet most people who use these systems have never needed to understand what a vector database actually is. That is fine for users. For anyone building or evaluating AI systems, it is a gap worth closing.
🔗 Foundation posts
This post builds on RAG — Retrieval Augmented Generation, which explains the full retrieval pipeline; this one goes deeper on the vector database component. It also connects to How Generative AI Works, where embeddings are introduced as part of how transformers process meaning. This post explains them fully.
The problem with keyword search
Traditional keyword search, whether a SQL LIKE query or a classic text index in Elasticsearch, finds documents by matching the words you typed. You search for ‘purchase order’. It finds documents containing the phrase ‘purchase order’.
That works well until the words do not match. If the document says ‘procurement document’ or ‘buying goods request’, a keyword search misses it. The meaning is the same. The words are different.
Vector databases solve this by searching for meaning rather than words. They find documents that are semantically similar to your query — even when the vocabulary is completely different.
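The keyword miss described above is easy to reproduce. The sketch below is a minimal illustration, not how any particular search engine is implemented: the `keyword_search` helper and the three sample documents are invented for this example.

```python
# Sample documents invented for illustration; all three describe the same idea.
documents = [
    "Purchase order 4500012345 approved by finance",
    "Procurement document for office laptops",
    "Buying goods request submitted by the warehouse team",
]


def keyword_search(query: str, docs: list[str]) -> list[str]:
    """Return documents that contain the query as a case-insensitive substring."""
    needle = query.lower()
    return [doc for doc in docs if needle in doc.lower()]


hits = keyword_search("purchase order", documents)
print(hits)  # only the first document matches; the other two are missed
```

The two missed documents mean the same thing as the query, but share no words with it. That gap is what semantic search is meant to close.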
What an embedding is
Before you can search by meaning, you need to represent meaning as something a computer can compare. That is what an embedding does.
An embedding model converts a piece of text (a word, a sentence, a paragraph) into a vector: a list of numbers, typically a few hundred to a few thousand numbers long. The models listed later in this post range from 384 to 3,072. These numbers are not random. They are learned during training so that texts with similar meanings produce numerically similar vectors.
💡 The geometric intuition
Think of each embedding as a position in a vast multi-dimensional space. Texts with similar meanings end up in the same neighbourhood. Texts with different meanings end up far apart. Searching by meaning is then a geometry problem: find the vectors closest to the query vector. That is all semantic search is.
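To make the geometry concrete, here is a toy sketch in two dimensions. The coordinates are hand-assigned for illustration and do not come from any real embedding model; real embeddings have hundreds or thousands of dimensions. Straight-line distance is used here only because it is the easiest to picture.

```python
import math

# Hypothetical 2-D "embeddings", hand-assigned for illustration only.
points = {
    "purchase order": (0.90, 0.10),
    "procurement document": (0.85, 0.20),
    "holiday request": (0.10, 0.90),
}


def rank_by_distance(query_vec, named_points):
    """Return the names ordered from closest to farthest from the query."""
    return sorted(named_points, key=lambda name: math.dist(query_vec, named_points[name]))


# Assumed position for a query such as "buying goods request".
query_vec = (0.88, 0.15)
ranking = rank_by_distance(query_vec, points)
print(ranking)
```

The two purchasing-related texts sit in the same neighbourhood as the query, and the unrelated text ends up last.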
How vector similarity search works
When you send a query to a RAG system, here is what happens at the vector database level:
- Your query text is converted to an embedding using the same model used to index the documents
- The vector database compares your query vector against the stored document vectors, usually through an approximate nearest neighbour (ANN) index rather than a scan of every vector
- It returns the N most similar vectors — the documents whose embeddings are closest to your query
- Those documents are passed to the LLM as context for generating the answer
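The four steps above can be sketched end to end in a few lines. This is a minimal sketch under loud assumptions: `embed` is a toy stand-in built on a hand-written concept lexicon (`CONCEPTS`), not a trained embedding model, the index is a plain list searched by brute force, and the final LLM call is left out.

```python
import math

# Hypothetical concept lexicon standing in for a trained embedding model.
# Dimension 0 = buying, 1 = paperwork, 2 = travel.
CONCEPTS = {
    "purchase": 0, "procurement": 0, "buying": 0,
    "order": 1, "document": 1, "request": 1,
    "flight": 2, "hotel": 2, "travel": 2,
}


def embed(text):
    """Toy embedding: count how many words fall into each concept dimension."""
    vec = [0.0, 0.0, 0.0]
    for word in text.lower().split():
        if word in CONCEPTS:
            vec[CONCEPTS[word]] += 1.0
    return vec


def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


documents = [
    "procurement document for laptops",
    "travel request for a flight and hotel",
    "buying goods request",
]
# Indexing: every document is embedded once with the same embed() function.
index = [(doc, embed(doc)) for doc in documents]


def retrieve(query, n=2):
    query_vec = embed(query)                               # step 1: embed the query
    scored = sorted(index, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)                          # step 2: compare
    return [doc for doc, _ in scored[:n]]                  # step 3: top N


question = "purchase order"
context = retrieve(question)
# Step 4: the retrieved text becomes context for the LLM (call not shown).
prompt = "Answer using only this context:\n" + "\n".join(context) + "\n\nQuestion: " + question
print(prompt)
```

Neither retrieved document contains the words "purchase" or "order", yet both rank above the travel request because they land in the same region of the toy vector space.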
How similarity is measured
| Similarity metric | How it works | Best for |
|---|---|---|
| Cosine similarity | Measures the angle between two vectors — range -1 to 1, higher is more similar. Ignores vector magnitude, measures direction only. | Text and document similarity — the standard for almost all NLP tasks |
| Dot product | Multiplies corresponding elements and sums — similar to cosine but sensitive to vector magnitude | When vector magnitude carries meaningful information alongside direction |
| Euclidean distance | Straight-line distance between two points in vector space — lower distance means more similar | Less common for text — magnitude differences affect results |
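The three metrics in the table are short enough to write out directly. The sketch below uses plain Python lists for readability; the function names and sample vectors are chosen for this example, and production systems would normally rely on optimised numeric libraries instead.

```python
import math


def dot_product(a, b):
    """Multiply corresponding elements and sum the results."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product divided by the product of the two vector lengths."""
    return dot_product(a, b) / (math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b)))


def euclidean_distance(a, b):
    """Straight-line distance between the two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]     # same direction as a, twice the length
c = [-1.0, -2.0, -3.0]  # opposite direction to a

print(cosine_similarity(a, b))   # approximately 1: same direction
print(cosine_similarity(a, c))   # approximately -1: opposite direction
print(dot_product(a, b))         # grows with vector length
print(euclidean_distance(a, b))  # non-zero even though the direction is identical
```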
💡 Why cosine similarity dominates text search
In text embeddings, what matters is the direction of the vector (which concepts it is near), not its magnitude. Cosine similarity measures direction only, ignoring magnitude. This makes it robust for comparing documents of different lengths. It is the most common choice for text search, although defaults vary between vector databases, so check which metric your index is configured with.
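A small numeric example shows why ignoring magnitude matters. The vectors below are invented for illustration: `close_vec` points almost the same way as the query, while `long_vec` points 45 degrees away but is much longer.

```python
import math


def dot_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot_product(a, b) / (math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b)))


query = [1.0, 0.0]
close_vec = [0.9, 0.1]  # nearly the same direction as the query, short
long_vec = [5.0, 5.0]   # 45 degrees away from the query, but long

# Dot product rewards length, so the long vector scores higher.
print(dot_product(query, close_vec), dot_product(query, long_vec))

# Cosine looks at direction only, so the closely aligned vector scores higher.
print(cosine(query, close_vec), cosine(query, long_vec))

# Scaling a vector changes its length but not its cosine similarity.
scaled = [x * 10 for x in close_vec]
print(math.isclose(cosine(query, close_vec), cosine(query, scaled)))
```

When every vector has been normalised to length 1, dot product and cosine similarity give the same ranking, which is why some systems store normalised vectors and use the cheaper dot product.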
How a vector database is different from a SQL database
| Aspect | SQL / Relational Database | Vector Database |
|---|---|---|
| Data type | Structured rows and columns — numbers, strings, dates | Vectors — lists of floating-point numbers representing meaning |
| Query type | Exact match or range, for example WHERE name = 'Rakesh' | Approximate nearest neighbour: find vectors closest to this query vector |
| Search capability | Keyword search only — exact terms | Semantic search — similar meaning, not just same words |
| Index type | B-tree, hash — optimised for exact lookup | ANN index — optimised for fast approximate similarity search |
| Scaling challenge | Volume of rows and query throughput | Number of vectors multiplied by their dimensionality (hundreds to thousands of dimensions per vector) |
| Best for | Transactional systems, structured data, reporting | AI memory, RAG retrieval, semantic search, recommendation systems |
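To give a feel for what an ANN index does, the sketch below uses random-hyperplane hashing, one simple approximate technique. It is an illustration only and is not taken from any database in this post; production systems typically use more sophisticated structures such as HNSW graphs. All names, sizes and the random data are assumptions made for the example.

```python
import math
import random
from collections import defaultdict

random.seed(7)  # fixed seed so the sketch is repeatable
DIMS, N_PLANES = 8, 4

# Each random hyperplane splits the space in two; 4 planes give up to 16 buckets.
planes = [[random.gauss(0, 1) for _ in range(DIMS)] for _ in range(N_PLANES)]


def bucket_key(vec):
    """One boolean per hyperplane: which side of the plane the vector falls on."""
    return tuple(sum(p * v for p, v in zip(plane, vec)) >= 0 for plane in planes)


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


# Random vectors standing in for document embeddings.
vectors = {i: [random.gauss(0, 1) for _ in range(DIMS)] for i in range(1000)}

# Build the index: group vector ids by bucket.
buckets = defaultdict(list)
for vec_id, vec in vectors.items():
    buckets[bucket_key(vec)].append(vec_id)


def ann_search(query, n=3):
    """Score only the vectors that share the query's bucket."""
    candidates = buckets.get(bucket_key(query), [])
    ranked = sorted(candidates, key=lambda i: cosine(query, vectors[i]), reverse=True)
    return ranked[:n], len(candidates)


top_ids, scanned = ann_search(vectors[42])
print(top_ids, f"scanned {scanned} of {len(vectors)} vectors")
```

The search scores only one bucket instead of all 1,000 vectors. The price is that a true neighbour sitting just across a hyperplane can be missed, which is the accuracy-for-speed trade-off that ANN indexes make.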
Vector databases in use in 2026
| Database | Type | Notable for | Used with |
|---|---|---|---|
| SAP HANA Cloud | Built-in vector store in HANA Cloud | Allows SAP customers to add vector storage without a separate database. Full SQL plus vector queries in one system. | SAP AI Core RAG pipelines on BTP |
| Pinecone | Managed cloud-native | Fully managed, scales automatically, simple API — popular for production RAG | OpenAI, Anthropic, LangChain integrations |
| Weaviate | Open-source, managed options | Hybrid search — combines vector and keyword search. Strong for enterprise. | Self-hosted or Weaviate Cloud |
| pgvector | PostgreSQL extension | Adds vector storage to existing PostgreSQL — no new infrastructure needed | Teams already running PostgreSQL |
| Chroma | Open-source, lightweight | Simple to set up — popular for prototyping and development RAG pipelines | LangChain, LlamaIndex |
| Azure AI Search | Managed by Microsoft | Integrated with Azure ecosystem — hybrid vector and keyword search | Azure OpenAI, Microsoft Copilot Stack |
| Qdrant | Open-source, Rust-based | High performance, rich filtering — suited for large-scale production | Self-hosted or Qdrant Cloud |
💡 SAP HANA Cloud as a vector store
SAP added native vector capabilities to HANA Cloud in 2024. For SAP customers, this means you can build RAG pipelines on SAP BTP without adding an external vector database. Your existing HANA Cloud instance stores both your structured business data and your embeddings. SAP AI Core uses HANA Cloud as the default vector store for enterprise RAG scenarios.
Embedding models — the other half of the equation
A vector database is only as good as the embeddings it stores. The quality of semantic search depends heavily on the embedding model you choose.
| Embedding model | Provider | Dimensions | Good for |
|---|---|---|---|
| text-embedding-3-large | OpenAI | 3072 | High-quality general-purpose text — widely used in production |
| text-embedding-3-small | OpenAI | 1536 | Lower cost, faster — good for high-volume, lower-precision applications |
| embed-english-v3.0 | Cohere | 1024 | English-language text; Cohere offers a separate embed-multilingual-v3.0 model for multilingual use |
| all-MiniLM-L6-v2 | Hugging Face (open) | 384 | Small, fast, open-source — popular for local or low-cost deployments |
| SAP Generative AI Hub embeddings | SAP (via AI Core) | Varies | Accessible via SAP AI Core on BTP — integrated with SAP AI infrastructure |
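The dimension column has a direct cost implication. The arithmetic below estimates raw vector storage for one million documents, assuming each number is stored as a 4-byte float32 and ignoring index overhead and metadata. Both the storage format and the corpus size are assumptions for illustration; actual footprints depend on the database.

```python
BYTES_PER_FLOAT32 = 4  # assumption: vectors stored as 32-bit floats


def raw_vector_bytes(num_vectors, dimensions):
    """Raw storage for the vectors alone, excluding index structures and metadata."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32


# Dimensions taken from the table above.
models = {
    "text-embedding-3-large": 3072,
    "text-embedding-3-small": 1536,
    "embed-english-v3.0": 1024,
    "all-MiniLM-L6-v2": 384,
}

for name, dims in models.items():
    gigabytes = raw_vector_bytes(1_000_000, dims) / 1e9
    print(f"{name}: {gigabytes:.2f} GB per million vectors")
```

Under these assumptions, one million 3072-dimension vectors take roughly 12.3 GB before any index is built, eight times the footprint of the 384-dimension model.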
📌 The most common setup mistake
Always use the same embedding model for indexing documents and embedding queries at search time. Vectors from different models are not comparable — they live in different spaces. Mixing models produces meaningless similarity scores. This is the most common mistake when building RAG pipelines, and it produces failures that are hard to diagnose.
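One way to make this mistake fail loudly instead of silently is to record the model name alongside the index and check it on every write and query. The `GuardedIndex` class below is a hypothetical sketch of that idea, not a feature of any specific vector database.

```python
class GuardedIndex:
    """Hypothetical in-memory index that remembers which embedding model built it."""

    def __init__(self, model_name: str, dimensions: int):
        self.model_name = model_name
        self.dimensions = dimensions
        self.vectors: dict[str, list[float]] = {}

    def _check(self, vector, model_name):
        if model_name != self.model_name:
            raise ValueError(
                f"index was built with {self.model_name!r}, got vector from {model_name!r}"
            )
        if len(vector) != self.dimensions:
            raise ValueError(f"expected {self.dimensions} dimensions, got {len(vector)}")

    def add(self, doc_id, vector, model_name):
        self._check(vector, model_name)
        self.vectors[doc_id] = vector

    def check_query(self, vector, model_name):
        """Call before searching; raises if the query came from a different model."""
        self._check(vector, model_name)


index = GuardedIndex("all-MiniLM-L6-v2", 384)
index.add("doc-1", [0.0] * 384, "all-MiniLM-L6-v2")  # placeholder vector

try:
    index.check_query([0.0] * 1536, "text-embedding-3-small")
    mismatch_rejected = False
except ValueError as err:
    mismatch_rejected = True
    print(f"Rejected: {err}")
```

Checking the model name matters more than checking the dimension count: two different models can produce vectors of the same length that are still not comparable.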
At a glance — vector databases
| Concept | One-line summary |
|---|---|
| Embedding | A numerical vector representing the meaning of text — similar meanings produce mathematically close vectors |
| Vector space | The multi-dimensional space where embeddings live — similar texts are geometrically close to each other |
| Semantic search | Finding documents by meaning rather than exact keywords — using vector similarity instead of text matching |
| Cosine similarity | The standard metric for text similarity — measures the angle between vectors, ignoring magnitude |
| ANN (Approximate Nearest Neighbour) | The index type that makes vector search fast at scale — trades perfect accuracy for speed |
| Vector database | A database optimised for storing and searching embeddings at scale |
| SAP HANA Cloud | Supports native vector storage since 2024 — no separate vector DB needed for SAP BTP RAG pipelines |
| Same model rule | Always use the same embedding model for indexing documents and embedding queries — mixing models breaks search |
What to take away
Vector databases are the infrastructure layer that makes semantic AI search possible. They are not a replacement for traditional databases — they are a complementary layer for one specific problem: finding content by meaning at scale.
If you are building or evaluating RAG systems, AI search features or AI memory, understanding vector databases is not optional — it is foundational. The embedding model and vector store are two of the most impactful decisions in any RAG pipeline, and they are the ones most often made without enough thought.
Start with SAP HANA Cloud if you are building on BTP — it removes one infrastructure decision entirely. Understand the same-model rule before you index a single document. And remember: the database is only as good as the embeddings feeding it.
🔗 Related posts on this site
- RAG — Retrieval Augmented Generation: the full RAG architecture; this post covers the vector database layer in depth.
- How Generative AI Works: embeddings are central to how the transformer processes meaning, and that post introduces them in context.
- Fine-Tuning vs Prompt Engineering vs RAG: where vector databases fit in the broader AI customisation picture.
- SAP BTP — The Platform Explained: SAP AI Core and HANA Cloud on BTP are where enterprise vector search and RAG pipelines run.
Published on rakeshnarayan.com — Articles
URL: https://rakeshnarayan.com/articles/vector-databases-explained-how-semantic-search-works/


