<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Vector Search Archives - Zorost Intelligence | AI, Cloud &amp; Data Experts</title>
	<atom:link href="https://zorost.com/tag/vector-search/feed/" rel="self" type="application/rss+xml" />
	<link>https://zorost.com/tag/vector-search/</link>
	<description>Production AI systems for aviation, manufacturing, pharma, government, finance, freight, and geopolitical intelligence.</description>
	<lastBuildDate>Wed, 20 May 2026 18:52:39 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=7.0</generator>

<image>
	<url>https://zorost.com/wp-content/uploads/2025/08/ZOROST-Intel-Logo3_512-150x150.png</url>
	<title>Vector Search Archives - Zorost Intelligence | AI, Cloud &amp; Data Experts</title>
	<link>https://zorost.com/tag/vector-search/</link>
	<width>32</width>
	<height>32</height>
</image> 
<site xmlns="com-wordpress:feed-additions:1">81719879</site>	<item>
		<title>Hybrid Retrieval: Why Vector Alone Isn&#8217;t Enough</title>
		<link>https://zorost.com/hybrid-retrieval-vector-alone-not-enough/</link>
		
		<dc:creator><![CDATA[Zorost Intelligence]]></dc:creator>
		<pubDate>Tue, 17 Feb 2026 09:00:00 +0000</pubDate>
				<category><![CDATA[Agentic AI Engineering]]></category>
		<category><![CDATA[BM25]]></category>
		<category><![CDATA[Evaluation]]></category>
		<category><![CDATA[Hybrid Retrieval]]></category>
		<category><![CDATA[RAG]]></category>
		<category><![CDATA[Vector Search]]></category>
		<guid isPermaLink="false">https://zorost.com/hybrid-retrieval-vector-alone-not-enough/</guid>

					<description><![CDATA[<p>Vector search is excellent at semantic similarity and bad at named entities. BM25 is the opposite. Production-grade retrieval is hybrid — and the architecture decisions matter.</p>
<p>The post <a href="https://zorost.com/hybrid-retrieval-vector-alone-not-enough/">Hybrid Retrieval: Why Vector Alone Isn&#8217;t Enough</a> appeared first on <a href="https://zorost.com">Zorost Intelligence | AI, Cloud &amp; Data Experts</a>.</p>
]]></description>
										<content:encoded><![CDATA[<blockquote>
<p><strong>Pull-quote:</strong> &#8220;Pure vector retrieval is the most common production-grade RAG mistake. Pure BM25 is the second most common.&#8221;</p>
</blockquote>
<h4>Why this matters</h4>
<p>A pattern repeats in every RAG project that goes wrong: someone embeds the corpus, runs vector search, and ships. The system works in demos and disappoints in production. The fix is an architectural change: <strong>hybrid retrieval</strong>.</p>
<h4>The components</h4>
<pre><code>Query
  │
  ├──► Dense (vector)   — pgvector / Weaviate / Qdrant + an embedding model
  │
  ├──► Sparse (BM25)    — Postgres FTS / Elasticsearch / OpenSearch
  │
  ├──► Optional filters — date range, source, entity tags
  │
  └──► Merge (RRF or weighted) ──► Cross-encoder re-rank ──► Top-K
                                                                │
                                                                ▼
                                                Citation-grounded generation</code></pre>
<h4>Why each piece matters</h4>
<ul>
<li><strong>Vector</strong> search is excellent at <em>semantic similarity</em>: finding documents that are about the same topic in different words. It is weak at <em>named entities</em> and other exact matches such as specific terms, IDs, and dates.</li>
<li><strong>BM25</strong> is the opposite: excellent at named entities and exact terms, weaker on semantic similarity.</li>
<li><strong>Filters</strong>: when the question is bounded (&#8220;just look at 2024 reports about Boeing 737&#8221;), filters shrink the candidate set before ranking.</li>
<li><strong>Merge</strong>: Reciprocal Rank Fusion (RRF) is a clean default because it uses only ranks. Weighted merges need dense and sparse scores that are calibrated to a comparable scale.</li>
<li><strong>Cross-encoder re-rank</strong>: the re-ranker sees the query and the candidate document together and scores them jointly. It is more expensive than bi-encoder vector search, but the precision gain on the top-K is usually large enough to justify the cost.</li>
</ul>
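<p>The merge step can be sketched in a few lines. This is a minimal illustration of Reciprocal Rank Fusion, not the production implementation: the function name <code>rrf_merge</code>, the document IDs, and the two result lists are hypothetical, and <code>k=60</code> is simply a commonly used default for the RRF constant.</p>

```python
from collections import defaultdict


def rrf_merge(rankings, k=60, top_n=10):
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Only ranks are used, so dense and sparse scores
    never have to be put on a common scale.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, breaking ties by document ID for determinism.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ordered[:top_n]]


# Hypothetical result lists from the dense and sparse retrievers.
dense_hits = ["doc_a", "doc_b", "doc_c", "doc_d"]
sparse_hits = ["doc_c", "doc_a", "doc_e", "doc_f"]

fused = rrf_merge([dense_hits, sparse_hits], top_n=3)
print(fused)  # ['doc_a', 'doc_c', 'doc_b']
```

<p>Documents that both retrievers rank highly rise to the top, while a document found by only one retriever still survives into the fused list for the re-ranker to judge.</p>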
<h4>What changes when you do this right</h4>
<ul>
<li>Hallucination rate drops. The model has better evidence to ground in.</li>
<li>Citation precision goes up. The cited documents actually support the claim.</li>
<li>Edge cases (rare entity queries, exact-quote queries) work properly.</li>
<li>Generation latency stays low because the model only sees the top-K (typically 6–10), not the top-100.</li>
</ul>
<h4>Common mistakes</h4>
<ul>
<li><strong>No re-ranker.</strong> Top-50 from vector + top-50 from BM25 with RRF is a starting point, but without a re-ranker the top-K still contains noise.</li>
<li><strong>No filtering.</strong> Filtering before retrieval is essentially free if the filter fields are indexed, and it keeps out-of-scope documents from competing for the top-K.</li>
<li><strong>No evaluation.</strong> Without a golden Q&amp;A dataset and grounding scoring, you have no way to compare retrieval architectures.</li>
</ul>
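<p>To make the evaluation point concrete, here is a rough sketch of comparing two retrievers on a golden dataset using recall@k. Everything in it is an assumption for illustration: the metric choice, the function names, the questions, the document IDs, and the two stand-in retrievers, which return canned results only so that the metric has something to score.</p>

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant document IDs that appear in the top-k results."""
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k.intersection(relevant)) / len(relevant)


def evaluate(retriever, golden_set, k=10):
    """Average recall@k of a retriever over a golden dataset."""
    scores = [recall_at_k(retriever(q), rel, k) for q, rel in golden_set]
    return sum(scores) / len(scores)


# Hypothetical golden set: (question, IDs of documents that should ground the answer).
golden_set = [
    ("question one", {"r1", "r2"}),
    ("question two", {"r7"}),
]


# Stand-in retrievers with made-up canned results; real ones would query an index.
def retriever_a(question):
    canned = {"question one": ["r1", "r9", "r4"], "question two": ["r3", "r5", "r8"]}
    return canned[question]


def retriever_b(question):
    canned = {"question one": ["r1", "r2", "r9"], "question two": ["r7", "r3", "r5"]}
    return canned[question]


print(evaluate(retriever_a, golden_set, k=3))  # 0.25
print(evaluate(retriever_b, golden_set, k=3))  # 1.0
```

<p>The numbers here come from the canned lists and say nothing about real systems. The point is the shape of the comparison: the same golden set and the same metric applied to every candidate architecture.</p>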
<h4>Closing</h4>
<p>Pure vector retrieval is the most common production RAG mistake. Hybrid retrieval (vector + sparse + filters + re-rank) is the boring, reliable production answer. Every Zorost RAG system runs this architecture.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">24296</post-id>	</item>
		<item>
		<title>Production-Grade RAG on the Lakehouse with Mosaic AI Vector Search</title>
		<link>https://zorost.com/production-rag-mosaic-ai-vector-search/</link>
		
		<dc:creator><![CDATA[Zorost Intelligence]]></dc:creator>
		<pubDate>Tue, 03 Feb 2026 09:00:00 +0000</pubDate>
				<category><![CDATA[Databricks Modernization]]></category>
		<category><![CDATA[Evaluation]]></category>
		<category><![CDATA[Hybrid Retrieval]]></category>
		<category><![CDATA[Mosaic AI]]></category>
		<category><![CDATA[RAG]]></category>
		<category><![CDATA[Vector Search]]></category>
		<guid isPermaLink="false">https://zorost.com/production-rag-mosaic-ai-vector-search/</guid>

					<description><![CDATA[<p>How to design, build, and evaluate a production RAG system on Databricks using Mosaic AI Vector Search, hybrid retrieval, and a real evaluation harness.</p>
<p>The post <a href="https://zorost.com/production-rag-mosaic-ai-vector-search/">Production-Grade RAG on the Lakehouse with Mosaic AI Vector Search</a> appeared first on <a href="https://zorost.com">Zorost Intelligence | AI, Cloud &amp; Data Experts</a>.</p>
]]></description>
										<content:encoded><![CDATA[<blockquote>
<p><strong>Pull-quote:</strong> &#8220;RAG works in demos. RAG that works in production requires hybrid retrieval, a re-ranker, citation grounding, and an evaluation harness.&#8221;</p>
</blockquote>
<h4>Why this matters</h4>
<p>Most RAG projects pilot well and disappoint in production. The pattern is the same: embed the corpus, run vector search, ship. Production-grade RAG also needs hybrid retrieval, a re-ranker, citation grounding, and an evaluation harness.</p>
<h4>The production RAG architecture</h4>
<pre><code>                     ┌────────────────────┐
        Question ───►│  AI Gateway        │  ← key mgmt, routing, observability
                     └──────────┬─────────┘
                                ▼
        ┌────────────────────────────────────────────┐
        │                 Retrieval                  │
        │  ┌────────────────┐  ┌────────────────┐    │
        │  │ Mosaic AI      │  │ BM25 (lexical) │    │
        │  │ Vector Search  │  │ on Delta SQL   │    │
        │  │ (Delta-synced) │  │                │    │
        │  └───────┬────────┘  └────────┬───────┘    │
        │          └──── merge (RRF) ───┘            │
        │                  │                         │
        │              cross-encoder                 │
        │              re-rank                       │
        └────────────────┬───────────────────────────┘
                         ▼
              top-K (typically 6–10)
                         │
                         ▼
              Citation-grounded generation
              (Mosaic AI Model Serving)
                         │
                         ▼
              Validated answer with source links</code></pre>
<h4>Why Mosaic AI Vector Search specifically</h4>
<p>Mosaic AI Vector Search <strong>synchronizes with Delta tables</strong>. Update the source table and a Delta Sync index picks up the change, without custom orchestration glue. Tagging, ACLs, and lineage flow through Unity Catalog. For RAG over enterprise data that changes, this matters more than people initially appreciate.</p>
<h4>Hybrid retrieval is the pattern</h4>
<p>Pure vector search is the most common production RAG mistake. Pure BM25 is the second most common. Hybrid — vector + BM25 + filters + re-rank — is the answer that actually works.</p>
<h4>Citation grounding as a structural fix</h4>
<p>Constrain the model to write with bracketed citation tokens. Validate every citation against the retrieval set. Reject answers that fail validation. This is a small structural change with a large operational impact.</p>
<h4>Evaluation harness — non-negotiable</h4>
<p>A production RAG system without an evaluation harness is a guess. The harness has three components:</p>
<ol>
<li><strong>Golden Q&amp;A dataset</strong> — questions paired with the documents that should ground the answers</li>
<li><strong>Grounding rate</strong> — what fraction of generated claims are supported by retrieved documents</li>
<li><strong>Hallucination detection</strong> — flagging unsupported claims</li>
</ol>
<p>The harness runs as a Databricks Job on every model or retrieval change. Regressions are caught before deployment.</p>
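<p>The grounding-rate and hallucination-detection components can be sketched as one small function. This is an illustrative outline under stated assumptions: the names <code>grounding_report</code> and <code>token_overlap_judge</code> are hypothetical, the claims and documents are made up, and the token-overlap check is only a toy placeholder for whatever support judge a real harness would use (for example an entailment model or an LLM grader).</p>

```python
def grounding_report(claims, retrieved_docs, is_supported):
    """Compute the grounding rate and list the claims no retrieved document supports."""
    unsupported = [
        claim
        for claim in claims
        if not any(is_supported(claim, doc) for doc in retrieved_docs)
    ]
    supported = len(claims) - len(unsupported)
    rate = supported / len(claims) if claims else 1.0
    return {"grounding_rate": rate, "unsupported_claims": unsupported}


def token_overlap_judge(claim, doc, threshold=0.6):
    """Toy support check: share of claim tokens that also occur in the document.

    A placeholder only; it does not understand negation or paraphrase.
    """
    claim_tokens = set(claim.lower().split())
    doc_tokens = set(doc.lower().split())
    if not claim_tokens:
        return False
    return len(claim_tokens.intersection(doc_tokens)) / len(claim_tokens) >= threshold


# Made-up retrieved documents and generated claims.
docs = [
    "the crew reported a hydraulic failure during flap retraction",
    "the aircraft returned to the departure airport",
]
claims = [
    "the crew reported a hydraulic failure",
    "the aircraft returned to the departure airport",
    "the captain was fatigued",
]

report = grounding_report(claims, docs, token_overlap_judge)
print(report["unsupported_claims"])  # ['the captain was fatigued']
```

<p>Keeping the judge pluggable means the same report can be produced on every run of the harness while the support check itself is improved independently.</p>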
<h4>Closing</h4>
<p>Production RAG on the Lakehouse with Mosaic AI is straightforward when you adopt the architecture: hybrid retrieval, re-ranker, citation grounding, evaluation harness. The result is a RAG system analysts trust enough to use.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">24305</post-id>	</item>
		<item>
		<title>A Retrieval Engine over the World&#8217;s Aviation Safety Corpus</title>
		<link>https://zorost.com/retrieval-engine-aviation-safety-corpus/</link>
		
		<dc:creator><![CDATA[Zorost Intelligence]]></dc:creator>
		<pubDate>Tue, 13 Jan 2026 09:00:00 +0000</pubDate>
				<category><![CDATA[Aviation Intelligence]]></category>
		<category><![CDATA[AeroFarr]]></category>
		<category><![CDATA[BM25]]></category>
		<category><![CDATA[Hybrid Retrieval]]></category>
		<category><![CDATA[RAG]]></category>
		<category><![CDATA[Safety]]></category>
		<category><![CDATA[Vector Search]]></category>
		<guid isPermaLink="false">https://zorost.com/retrieval-engine-aviation-safety-corpus/</guid>

					<description><![CDATA[<p>247,000 public-domain aviation safety reports — indexed with hybrid retrieval, re-ranking, and citation-grounded generation. Here is what we learned designing it for production.</p>
<p>The post <a href="https://zorost.com/retrieval-engine-aviation-safety-corpus/">A Retrieval Engine over the World&#8217;s Aviation Safety Corpus</a> appeared first on <a href="https://zorost.com">Zorost Intelligence | AI, Cloud &amp; Data Experts</a>.</p>
]]></description>
										<content:encoded><![CDATA[<blockquote>
<p><strong>Pull-quote:</strong> &#8220;Vector search alone is not retrieval. It is one signal among several.&#8221;</p>
</blockquote>
<h4>Why this matters</h4>
<p>Aviation safety knowledge sits in two enormous public-domain corpora: the U.S. NTSB accident reports and the NASA ASRS voluntary safety reports. Together, that&#8217;s <strong>247,000+ documents</strong> of structured incident narratives. Pilots, controllers, and operations engineers have written them under the assumption that they would be searched, cross-referenced, and learned from.</p>
<p>Most platforms reduce this to keyword search. Better platforms add full-text search. The frontier is <strong>citation-grounded retrieval-augmented generation</strong> — the assistant retrieves, the model writes, every claim links back to the source documents.</p>
<h4>Why hybrid retrieval</h4>
<p>The naive approach to a RAG system is &#8220;embed everything and run a vector search.&#8221; It does not work in production. Vector search is excellent at finding <em>semantically similar</em> documents and bad at finding <em>specifically named</em> entities. BM25 is the opposite. Production retrieval needs both.</p>
<p>Our retrieval pipeline:</p>
<pre><code>Question
   │
   ├──► dense (pgvector + BGE-large) ──► top 50
   ├──► sparse (BM25)                  ──► top 50
   │
   └──► merge + cross-encoder re-rank   ──► top 8
                            │
                            ▼
                Citation-grounded generation
                (Gemini 2.5 Flash for fast answers,
                 Claude / GPT for detailed analysis)</code></pre>
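<p>To show why the sparse branch in the pipeline above catches exact identifiers, here is a small self-contained BM25 scorer. It is a sketch, not the engine used in the system described: the post does not say which BM25 implementation runs in production, the function name <code>bm25_rank</code> is hypothetical, the documents and the identifier <code>n512ab</code> are made up, tokenization is plain whitespace splitting, and <code>k1=1.5</code> and <code>b=0.75</code> are just conventional defaults.</p>

```python
import math
from collections import Counter


def bm25_rank(query, docs, k1=1.5, b=0.75, top_n=50):
    """Rank documents against a query with Okapi BM25 (whitespace tokenization).

    Returns the indices of matching documents, best first.
    """
    tokenized = [doc.lower().split() for doc in docs]
    avg_len = sum(len(tokens) for tokens in tokenized) / len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(docs)
    scored = []
    for idx, tokens in enumerate(tokenized):
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            # Non-negative IDF variant: log(1 + (N - n + 0.5) / (n + 0.5)).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization penalizes documents longer than average.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scored.append((score, idx))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for score, idx in scored[:top_n] if score > 0]


# Made-up report snippets; "n512ab" stands in for a rare exact identifier.
reports = [
    "hydraulic pressure loss reported after takeoff",
    "crew noted n512ab hydraulic failure during flap retraction",
    "runway incursion in low visibility at regional airport",
]

print(bm25_rank("n512ab hydraulic", reports))  # [1, 0]
```

<p>The rare identifier carries a high inverse document frequency, so the one report that contains it outranks a report that merely shares the common term. That is the exact-match behavior a dense retriever tends to miss.</p>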
<h4>Why a re-ranker</h4>
<p>The re-ranker (a cross-encoder, not a bi-encoder) sees the query and the candidate document together and scores them jointly. This is more expensive per call than vector search, but the precision improvement on the top-8 is large enough that it pays for itself — fewer retrievals, fewer hallucinations, better answers.</p>
<h4>Why citation grounding</h4>
<p>The default mode of an LLM is to <strong>fabricate plausible-sounding answers</strong>. The fix is structural: the model is constrained to write its answer with bracketed citation tokens, and the citation tokens must reference documents that actually exist in the retrieval set. Generation is post-processed to validate the citations and reject any answer that fails validation.</p>
<p>This is a small structural change with a large operational impact. It moves the system from &#8220;talking to a model that has ingested aviation knowledge&#8221; to &#8220;asking a model to summarize specific source documents.&#8221;</p>
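<p>The validation step described above can be sketched with a regular expression and a set difference. This is a minimal illustration, assuming citations appear as bracketed IDs such as <code>[ASRS-1042]</code>; the token format, the function name <code>validate_citations</code>, and the report IDs are hypothetical, not taken from the system itself.</p>

```python
import re

# Assumed token format: an ID of letters, digits, underscores, or hyphens in brackets.
CITATION_PATTERN = re.compile(r"\[([A-Za-z0-9_-]+)\]")


def validate_citations(answer, retrieved_ids):
    """Return (is_valid, unknown_ids) for an answer with bracketed citation tokens.

    An answer is rejected if it cites nothing, or if it cites any ID that is
    not in the retrieval set.
    """
    cited = CITATION_PATTERN.findall(answer)
    unknown = sorted(set(cited) - set(retrieved_ids))
    is_valid = bool(cited) and not unknown
    return is_valid, unknown


# Hypothetical retrieval set and generated answers.
retrieved = {"ASRS-1042", "NTSB-0077"}
good = "Flap retraction preceded the pressure loss [ASRS-1042]. The crew diverted [NTSB-0077]."
bad = "The same failure occurred in a third event [ASRS-9999]."

print(validate_citations(good, retrieved))  # (True, [])
print(validate_citations(bad, retrieved))   # (False, ['ASRS-9999'])
```

<p>Note that this check only confirms that every cited document exists in the retrieval set. Whether the cited document actually supports the claim is a separate question that needs its own scoring.</p>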
<h4>What this is good at</h4>
<ul>
<li>&#8220;What are the leading causes of runway incursions for regional jets in low-visibility conditions?&#8221;</li>
<li>&#8220;Show me ASRS reports that match the pattern of sudden hydraulic failure during flap retraction.&#8221;</li>
<li>&#8220;What are the recurring training gaps that show up in cargo operations CRM reports?&#8221;</li>
</ul>
<p>What it is <em>not</em> good at: real-time operational queries that need current schedule data — those go to the predictive and causal layers.</p>
<h4>Closing</h4>
<p>Vector search alone is not retrieval. It is one signal. Production-grade RAG over a regulated safety corpus requires hybrid retrieval, a real re-ranker, and structural citation grounding. The result is an assistant analysts trust enough to use — which is the only metric that matters.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">24284</post-id>	</item>
	</channel>
</rss>
