---
id: "prereq-vector-databases"
type: "prereq"
source_timestamps: ["00:06:45"]
tags: ["machine-learning", "data-infrastructure"]
related: ["concept-semantic-retrieval"]
reason: "Understanding why Semantic Retrieval fails requires knowing that vector databases retrieve information based on mathematical proximity of language, not structural business logic."
sources: ["s15-block-layoffs"]
sourceVaultSlug: "s15-block-layoffs"
originDay: 15
---
# Vector Databases and Semantic Search

## Why This Is a Prerequisite

Understanding why [[concept-semantic-retrieval]] fails requires knowing that vector databases retrieve information based on mathematical proximity of language, not structural business logic.

## What You Need to Know

The speaker assumes the audience understands the basic mechanics of how modern AI systems ingest and retrieve company data:

- Vector databases embed text (and other data) as vectors in a high-dimensional space.
- Retrieval is performed via similarity search (e.g., cosine similarity) between the query embedding and the stored embeddings.
- The database returns documents whose embeddings are *near* the query embedding.
- This nearness reflects topical or contextual similarity in language usage.
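The mechanics above can be sketched in a few lines. This is a minimal illustration, not a production vector database: the document names and 4-dimensional embeddings are invented for the example (real systems use embedding models that produce hundreds or thousands of dimensions, plus approximate-nearest-neighbor indexes).

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two embedding vectors (1.0 = identical direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical document store: name -> embedding vector.
store = {
    "q3-revenue-report": np.array([0.9, 0.1, 0.0, 0.2]),
    "office-party-memo": np.array([0.1, 0.8, 0.3, 0.0]),
    "pricing-strategy":  np.array([0.8, 0.2, 0.1, 0.3]),
}

# A query like "revenue outlook", already embedded into the same space.
query = np.array([0.85, 0.15, 0.05, 0.25])

# Retrieval = rank stored documents by embedding similarity to the query.
ranked = sorted(store.items(),
                key=lambda kv: cosine_similarity(query, kv[1]),
                reverse=True)
```

The ranking is driven entirely by geometric nearness: `q3-revenue-report` comes first because its vector points in nearly the same direction as the query, while `office-party-memo` lands last.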

## The Critical Implication

Because the database captures only that words occur in similar contexts — not the actual hierarchical or causal relationships of the business — it cannot reliably judge what information is *strategically important* versus what is merely *topically related*.

This is the mechanical root of [[claim-semantic-retrieval-flaw]]: the architecture conflates surfacing with interpreting because the underlying retrieval operation has no concept of business priority.
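The failure mode can be made concrete with toy numbers. In this hedged sketch (the documents, embeddings, and scores are all invented for illustration), a strategically irrelevant blog post outscores a critical internal memo simply because its wording sits closer to the query:

```python
import numpy as np

def cos(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# A vendor blog post and a board memo both discuss "cost reduction",
# but the score encodes shared vocabulary, not business priority.
query      = np.array([0.9, 0.1, 0.1])    # "cost reduction plans"
blog_post  = np.array([0.88, 0.12, 0.1])  # topical, strategically irrelevant
board_memo = np.array([0.7, 0.4, 0.3])    # strategically critical, phrased differently

blog_score = cos(query, blog_post)   # near-perfect similarity
memo_score = cos(query, board_memo)  # lower, despite higher importance
```

Nothing in the similarity computation can see that the memo matters more: priority is a property of the business, not of the geometry.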

## Related

- [[concept-semantic-retrieval]]
- [[claim-semantic-retrieval-flaw]]
- [[framework-world-model-architectures]]
