
Vector Databases Explained — How Semantic Search Works

Every time an AI assistant finds the right document for your question, even when the words do not match, a vector database is usually doing the work. Most RAG systems, semantic search features and AI memory implementations depend on one.

Yet most people who use these systems have never needed to understand what a vector database actually is. That is fine for users. For anyone building or evaluating AI systems, it is a gap worth closing.

🔗 Foundation posts

This post builds on RAG — Retrieval Augmented Generation, which explains the full retrieval pipeline; this one goes deeper on the vector database component. It also connects to How Generative AI Works, where embeddings are introduced as part of how transformers process meaning. This post explains them fully.

Traditional keyword search, whether it runs on a search index such as Elasticsearch or a SQL LIKE query, finds documents by matching words. You search for ‘purchase order’. It finds documents containing the phrase ‘purchase order’.

That works well until the words do not match. If the document says ‘procurement document’ or ‘buying goods request’, a keyword search misses it. The meaning is the same. The words are different.

Vector databases solve this by searching for meaning rather than words. They find documents that are semantically similar to your query — even when the vocabulary is completely different.
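To make the keyword gap concrete, here is a minimal sketch of a phrase-matching search. The function name `keyword_search` and the sample documents are invented for illustration; real search engines add stemming and ranking, but the core limitation is the same.

```python
def keyword_search(query: str, documents: list[str]) -> list[str]:
    """Return documents that contain the query as a literal phrase (case-insensitive)."""
    needle = query.lower()
    return [doc for doc in documents if needle in doc.lower()]


# Hypothetical documents that all describe the same business concept.
documents = [
    "Purchase order 4500012345 approved",
    "Procurement document awaiting sign-off",
    "Buying goods request from plant 1000",
]

matches = keyword_search("purchase order", documents)
print(matches)  # only the first document is found
```

The second and third documents mean the same thing as the query, but they never appear in the results because no words overlap.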

What an embedding is

Before you can search by meaning, you need to represent meaning as something a computer can compare. That is what an embedding does.

An embedding model converts a piece of text (a word, a sentence, a paragraph) into a vector: a list of numbers, typically a few hundred to a few thousand numbers long (384 to 3,072 for the models covered later in this post). These numbers are not random. They are learned during training so that texts with similar meanings produce numerically similar vectors.

[Figure: an embedding model turning ‘Purchase Order’ and ‘Buying Goods’ into similar vectors and ‘Birthday Cake’ into a very different vector]

💡 The geometric intuition

Think of each embedding as a position in a vast multi-dimensional space. Texts with similar meanings end up in the same neighbourhood. Texts with different meanings end up far apart. Searching by meaning is then a geometry problem: find the vectors closest to the query vector. That is all semantic search is.
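The geometric picture can be sketched with tiny hand-made vectors. The three-dimensional values below are invented for illustration and do not come from any real embedding model, and straight-line distance is used only because it is the simplest notion of "closest".

```python
import math

# Hand-made 3-dimensional "embeddings" (illustrative values, not model output).
toy_embeddings = {
    "purchase order": [0.90, 0.80, 0.10],
    "buying goods": [0.85, 0.75, 0.20],
    "birthday cake": [0.10, 0.20, 0.95],
}


def nearest(name: str) -> str:
    """Return the other text whose vector lies closest to the given text's vector."""
    query = toy_embeddings[name]
    others = [key for key in toy_embeddings if key != name]
    return min(others, key=lambda key: math.dist(query, toy_embeddings[key]))


print(nearest("purchase order"))  # buying goods
```

With real embeddings the space has hundreds or thousands of dimensions, but the search is the same idea: find the stored points nearest to the query point.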

How vector similarity search works

When you send a query to a RAG system, here is what happens at the vector database level:

  • Your query text is converted to an embedding using the same model used to index the documents
  • The vector database compares your query vector against the stored document vectors, usually through an approximate nearest neighbour (ANN) index rather than a scan of every vector
  • It returns the N most similar vectors — the documents whose embeddings are closest to your query
  • Those documents are passed to the LLM as context for generating the answer
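The four steps above can be sketched end to end in a few lines. This is a toy under stated assumptions: `toy_embed` and its `CONCEPTS` lexicon are hypothetical stand-ins for a real embedding model, the search is a brute-force scan rather than an ANN index, and the final prompt string is only a placeholder for the call to an LLM.

```python
import math

# Hypothetical concept lexicon standing in for a trained embedding model.
CONCEPTS = {
    "buying": {"purchase", "procurement", "buying", "order"},
    "baking": {"cake", "birthday", "baking"},
    "travel": {"flight", "hotel", "travel"},
}


def toy_embed(text: str) -> list[float]:
    """Map text to one number per concept: how many of its words belong to it."""
    words = text.lower().split()
    return [float(sum(word in vocab for word in words)) for vocab in CONCEPTS.values()]


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_n(query: str, index: list[tuple[str, list[float]]], n: int = 2):
    """Embed the query with the same function used for indexing, then rank documents."""
    query_vector = toy_embed(query)
    scored = [(cosine(query_vector, vector), doc) for doc, vector in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n]


documents = [
    "procurement document for office chairs",
    "birthday cake recipe",
    "hotel and flight booking",
]
index = [(doc, toy_embed(doc)) for doc in documents]  # indexing step

question = "purchase order"
results = top_n(question, index, n=1)
context = "\n".join(doc for _, doc in results)
prompt = f"Answer using this context:\n{context}\n\nQuestion: {question}"
print(prompt)
```

The query shares no words with the procurement document, yet that document ranks first because both texts map to the same concept dimension.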

How similarity is measured

| Similarity metric | How it works | Best for |
| --- | --- | --- |
| Cosine similarity | Measures the angle between two vectors; range -1 to 1, higher is more similar. Ignores vector magnitude, measures direction only. | Text and document similarity; the standard for almost all NLP tasks |
| Dot product | Multiplies corresponding elements and sums; similar to cosine but sensitive to vector magnitude | When vector magnitude carries meaningful information alongside direction |
| Euclidean distance | Straight-line distance between two points in vector space; lower distance means more similar | Less common for text; magnitude differences affect results |

[Figure: vector similarity search flow: embed the query, search the vector database and return ranked results with similarity scores]

💡 Why cosine similarity dominates text search

In text embeddings, what matters is the direction of the vector (which concepts it is near), not its magnitude. Cosine similarity measures direction only, which makes it robust when comparing documents of different lengths. It is the most common metric for text embeddings, but defaults differ between vector databases, so check which metric your index is configured with.
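A short sketch shows how the three metrics react to magnitude. The two vectors are made up: `long_doc` points in exactly the same direction as `short_doc` but is ten times longer, which is a simplified stand-in for a longer document on the same topic.

```python
import math


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def norm(a: list[float]) -> float:
    return math.sqrt(dot(a, a))


def cosine_similarity(a: list[float], b: list[float]) -> float:
    return dot(a, b) / (norm(a) * norm(b))


def euclidean_distance(a: list[float], b: list[float]) -> float:
    return math.dist(a, b)


def normalize(a: list[float]) -> list[float]:
    """Scale a vector to length 1 so that only its direction remains."""
    length = norm(a)
    return [x / length for x in a]


short_doc = [1.0, 2.0, 3.0]
long_doc = [10.0, 20.0, 30.0]  # same direction, ten times the magnitude

print(cosine_similarity(short_doc, long_doc))   # approximately 1.0: identical direction
print(dot(short_doc, long_doc))                 # 140.0: grows with magnitude
print(euclidean_distance(short_doc, long_doc))  # large, despite identical direction
print(dot(normalize(short_doc), normalize(long_doc)))  # approximately 1.0 again
```

The last line illustrates a useful identity: once vectors are normalized to unit length, the dot product and cosine similarity give the same value.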

How a vector database is different from a SQL database

| Aspect | SQL / Relational Database | Vector Database |
| --- | --- | --- |
| Data type | Structured rows and columns: numbers, strings, dates | Vectors: lists of floating-point numbers representing meaning |
| Query type | Exact match or range, for example WHERE name = ‘Rakesh’ | Approximate nearest neighbour: find vectors closest to this query vector |
| Search capability | Keyword or pattern matching on the literal terms | Semantic search: similar meaning, not just same words |
| Index type | B-tree, hash; optimised for exact lookup | ANN index; optimised for fast approximate similarity search |
| Scaling challenge | Volume of rows and query throughput | Number of vectors combined with their dimensionality (hundreds to thousands of dimensions each) |
| Best for | Transactional systems, structured data, reporting | AI memory, RAG retrieval, semantic search, recommendation systems |
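The ANN index row in the comparison above is easier to grasp with a toy example. The sketch below is loosely in the spirit of an inverted-file (IVF) index: vectors are grouped around centroids, and a query only scans the group whose centroid is nearest. The vectors and centroids are hand-picked assumptions, and production indexes (for example HNSW graphs) are considerably more sophisticated.

```python
import math


def nearest_index(point: list[float], candidates: list[list[float]]) -> int:
    """Return the position of the candidate closest to the point."""
    return min(range(len(candidates)), key=lambda i: math.dist(point, candidates[i]))


# Hypothetical 2-D vectors and hand-picked cluster centroids.
vectors = {
    "a": [0.9, 1.0],
    "b": [1.1, 0.9],
    "c": [3.2, 3.0],
    "d": [5.0, 5.1],
    "e": [5.2, 4.9],
}
centroids = [[1.0, 1.0], [5.0, 5.0]]

# Index build: place each vector in the bucket of its nearest centroid.
buckets: dict[int, list[str]] = {i: [] for i in range(len(centroids))}
for name, vector in vectors.items():
    buckets[nearest_index(vector, centroids)].append(name)


def exact_search(query: list[float]) -> str:
    """Brute force: compare the query against every stored vector."""
    return min(vectors, key=lambda name: math.dist(query, vectors[name]))


def approximate_search(query: list[float]) -> str:
    """Scan only the bucket whose centroid is nearest to the query."""
    bucket = buckets[nearest_index(query, centroids)]
    return min(bucket, key=lambda name: math.dist(query, vectors[name]))


print(exact_search([5.0, 5.0]), approximate_search([5.0, 5.0]))  # d d
print(exact_search([2.9, 2.9]), approximate_search([2.9, 2.9]))  # c b
```

The second query sits near the boundary between the two clusters, so the approximate search misses the true nearest neighbour. That is the accuracy-for-speed trade in miniature; real indexes reduce such misses by probing more than one bucket.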

Vector databases in use in 2026

| Database | Type | Notable for | Used with |
| --- | --- | --- | --- |
| SAP HANA Cloud | Built-in vector store in HANA Cloud | Allows SAP customers to add vector storage without a separate database. Full SQL plus vector queries in one system. | SAP AI Core RAG pipelines on BTP |
| Pinecone | Managed cloud-native | Fully managed, scales automatically, simple API; popular for production RAG | OpenAI, Anthropic, LangChain integrations |
| Weaviate | Open-source, managed options | Hybrid search that combines vector and keyword search. Strong for enterprise. | Self-hosted or Weaviate Cloud |
| pgvector | PostgreSQL extension | Adds vector storage to existing PostgreSQL; no new infrastructure needed | Teams already running PostgreSQL |
| Chroma | Open-source, lightweight | Simple to set up; popular for prototyping and development RAG pipelines | LangChain, LlamaIndex |
| Azure AI Search | Managed by Microsoft | Integrated with Azure ecosystem; hybrid vector and keyword search | Azure OpenAI, Microsoft Copilot Stack |
| Qdrant | Open-source, Rust-based | High performance, rich filtering; suited for large-scale production | Self-hosted or Qdrant Cloud |

💡 SAP HANA Cloud as a vector store

SAP added native vector capabilities to HANA Cloud in 2024. For SAP customers, this means you can build RAG pipelines on SAP BTP without adding an external vector database. Your existing HANA Cloud instance stores both your structured business data and your embeddings. SAP AI Core uses HANA Cloud as the default vector store for enterprise RAG scenarios.

Embedding models — the other half of the equation

A vector database is only as good as the embeddings it stores. The quality of semantic search depends heavily on the embedding model you choose.

| Embedding model | Provider | Dimensions | Good for |
| --- | --- | --- | --- |
| text-embedding-3-large | OpenAI | 3072 | High-quality general-purpose text; widely used in production |
| text-embedding-3-small | OpenAI | 1536 | Lower cost, faster; good for high-volume, lower-precision applications |
| embed-english-v3.0 | Cohere | 1024 | Strong English-language retrieval; for multilingual text Cohere offers embed-multilingual-v3.0 (also 1024 dimensions) |
| all-MiniLM-L6-v2 | sentence-transformers (open, hosted on Hugging Face) | 384 | Small, fast, open-source; popular for local or low-cost deployments |
| SAP Generative AI Hub embeddings | SAP (via AI Core) | Varies | Accessible via SAP AI Core on BTP; integrated with SAP AI infrastructure |

[Figure: the same-embedding-model rule, with document indexing using Model A on the left, query embedding using the same Model A on the right, and a warning about mixing models]

📌 The most common setup mistake

Always use the same embedding model for indexing documents and embedding queries at search time. Vectors from different models are not comparable — they live in different spaces. Mixing models produces meaningless similarity scores. This is the most common mistake when building RAG pipelines, and it produces failures that are hard to diagnose.
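One way to make this mistake fail loudly is to record the model name alongside the index and check it on every write and query. The sketch below is a hypothetical guard, not the API of any particular vector database; the class name `VectorIndex`, its methods and the model names are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class VectorIndex:
    """Toy index that remembers which embedding model produced its vectors."""

    model_name: str
    dimensions: int
    vectors: dict[str, list[float]] = field(default_factory=dict)

    def _check(self, vector: list[float], model_name: str) -> None:
        if model_name != self.model_name:
            raise ValueError(
                f"index was built with {self.model_name!r}, got vector from {model_name!r}"
            )
        if len(vector) != self.dimensions:
            raise ValueError(f"expected {self.dimensions} dimensions, got {len(vector)}")

    def add(self, doc_id: str, vector: list[float], model_name: str) -> None:
        self._check(vector, model_name)
        self.vectors[doc_id] = vector

    def check_query(self, vector: list[float], model_name: str) -> None:
        """Call before searching so a mismatched query fails instead of ranking noise."""
        self._check(vector, model_name)


index = VectorIndex(model_name="model-a", dimensions=3)
index.add("doc-1", [0.1, 0.2, 0.3], model_name="model-a")

try:
    index.check_query([0.3, 0.2, 0.1], model_name="model-b")
except ValueError as error:
    print(f"Rejected query: {error}")
```

Checking the model name matters because a dimension check alone is not enough: two different models can produce vectors of the same length that still live in unrelated spaces.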

At a glance — vector databases

| Concept | One-line summary |
| --- | --- |
| Embedding | A numerical vector representing the meaning of text; similar meanings produce mathematically close vectors |
| Vector space | The multi-dimensional space where embeddings live; similar texts are geometrically close to each other |
| Semantic search | Finding documents by meaning rather than exact keywords, using vector similarity instead of text matching |
| Cosine similarity | The standard metric for text similarity; measures the angle between vectors, ignoring magnitude |
| ANN (Approximate Nearest Neighbour) | The search approach, backed by indexes such as HNSW or IVF, that makes vector search fast at scale; trades perfect accuracy for speed |
| Vector database | A database optimised for storing and searching embeddings at scale |
| SAP HANA Cloud | Supports native vector storage since 2024; no separate vector DB needed for SAP BTP RAG pipelines |
| Same model rule | Always use the same embedding model for indexing documents and embedding queries; mixing models breaks search |

What to take away

Vector databases are the infrastructure layer that makes semantic AI search possible. They are not a replacement for traditional databases — they are a complementary layer for one specific problem: finding content by meaning at scale.

If you are building or evaluating RAG systems, AI search features or AI memory, understanding vector databases is not optional — it is foundational. The embedding model and vector store are two of the most impactful decisions in any RAG pipeline, and they are the ones most often made without enough thought.

Start with SAP HANA Cloud if you are building on BTP — it removes one infrastructure decision entirely. Understand the same-model rule before you index a single document. And remember: the database is only as good as the embeddings feeding it.

🔗 Related posts on this site

  • RAG — Retrieval Augmented Generation: the full RAG architecture; this post covers its vector database layer in depth.
  • How Generative AI Works: embeddings are central to how the transformer processes meaning, and that post introduces them in context.
  • Fine-Tuning vs Prompt Engineering vs RAG: where vector databases fit in the broader AI customisation picture.
  • SAP BTP — The Platform Explained: SAP AI Core and HANA Cloud on BTP are where enterprise vector search and RAG pipelines run.

Published on rakeshnarayan.com — Articles

URL: https://rakeshnarayan.com/articles/vector-databases-explained-how-semantic-search-works/