<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Multi-Agent Archives - Zorost Intelligence | AI, Cloud &amp; Data Experts</title>
	<atom:link href="https://zorost.com/tag/multi-agent/feed/" rel="self" type="application/rss+xml" />
	<link>https://zorost.com/tag/multi-agent/</link>
	<description>Production AI systems for aviation, manufacturing, pharma, government, finance, freight, and geopolitical intelligence.</description>
	<lastBuildDate>Wed, 20 May 2026 18:52:41 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=7.0</generator>

<image>
	<url>https://zorost.com/wp-content/uploads/2025/08/ZOROST-Intel-Logo3_512-150x150.png</url>
	<title>Multi-Agent Archives - Zorost Intelligence | AI, Cloud &amp; Data Experts</title>
	<link>https://zorost.com/tag/multi-agent/</link>
	<width>32</width>
	<height>32</height>
</image> 
<site xmlns="com-wordpress:feed-additions:1">81719879</site>	<item>
		<title>When Agents Call Agents: Why the MCP Server Matters in Freight</title>
		<link>https://zorost.com/mcp-server-freight-agents-call-agents/</link>
		
		<dc:creator><![CDATA[Zorost Intelligence]]></dc:creator>
		<pubDate>Tue, 24 Feb 2026 09:00:00 +0000</pubDate>
				<category><![CDATA[Freight & Logistics]]></category>
		<category><![CDATA[Agentic AI]]></category>
		<category><![CDATA[FreightCortex]]></category>
		<category><![CDATA[MCP]]></category>
		<category><![CDATA[Model Context Protocol]]></category>
		<category><![CDATA[Multi-Agent]]></category>
		<guid isPermaLink="false">https://zorost.com/mcp-server-freight-agents-call-agents/</guid>

					<description><![CDATA[<p>Model Context Protocol lets external AI agents call FreightCortex tools natively. Here is why that matters — and what it unlocks for the freight intelligence stack.</p>
<p>The post <a href="https://zorost.com/mcp-server-freight-agents-call-agents/">When Agents Call Agents: Why the MCP Server Matters in Freight</a> appeared first on <a href="https://zorost.com">Zorost Intelligence | AI, Cloud &amp; Data Experts</a>.</p>
]]></description>
										<content:encoded><![CDATA[<blockquote>
<p><strong>Pull-quote:</strong> &#8220;If your platform isn&#8217;t callable by other agents, your platform isn&#8217;t future-proof.&#8221;</p>
</blockquote>
<h4>Why this matters</h4>
<p>The next generation of enterprise software is being shaped by a simple fact: <strong>users have agents now</strong>. Claude Desktop, custom internal agents, vendor-provided agents — they&#8217;re all going to call your platform. Either they call it through your REST API (and the agent has to know your URL structure, your authentication, your error semantics) or they call it through a standard protocol.</p>
<p>That standard is <strong>Model Context Protocol (MCP)</strong>.</p>
<h4>What MCP is</h4>
<p>MCP is an open protocol developed by Anthropic and adopted across the agent ecosystem. It defines how a tool server describes its tools, how a host (the agent&#8217;s runtime) discovers and calls those tools, and how results are returned. The result is a clean separation: tools are <em>advertised</em>, agents <em>discover and call them</em>, and you can swap tool servers without touching the agent.</p>
<p>For FreightCortex, the MCP server is a thin layer that exposes our 16 tools using the protocol. An external agent — a customer&#8217;s internal Claude Desktop, an OEM&#8217;s analytics chatbot, or a third-party tool — can connect to our MCP endpoint and <em>use FreightCortex like a native tool</em>.</p>
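<p>As a rough illustration of the advertise / discover / call split, the sketch below builds the JSON-RPC messages a client would send. The method names <code>tools/list</code> and <code>tools/call</code> and the <code>name</code> / <code>description</code> / <code>inputSchema</code> fields follow the MCP tool shape; the parameters <code>origin</code>, <code>destination</code>, and <code>window_days</code> are assumptions made for this example, not the actual FreightCortex contract.</p>

```python
import json

# Tool description as a server might advertise it. The parameter names
# below are hypothetical and only illustrate the shape.
CORRIDOR_TOOL = {
    "name": "query_corridor_metrics",
    "description": "Lane-level KPIs for an origin-destination corridor.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "origin": {"type": "string"},
            "destination": {"type": "string"},
            "window_days": {"type": "integer"},
        },
        "required": ["origin", "destination"],
    },
}


def list_tools_request(request_id: int) -> dict:
    """Build the JSON-RPC request a client sends to discover tools."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}


def call_tool_request(request_id: int, name: str, arguments: dict) -> dict:
    """Build the JSON-RPC request a client sends to invoke one tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


discover = list_tools_request(1)
call = call_tool_request(
    2,
    CORRIDOR_TOOL["name"],
    {"origin": "Atlanta", "destination": "Dallas", "window_days": 90},
)
print(json.dumps(call, indent=2))
```

<p>The agent never needs to know URL structure or error semantics; it only needs the advertised tool description and the two generic requests.</p>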
<h4>What this unlocks</h4>
<p>Three things:</p>
<ol>
<li><strong>Native callability from any MCP-compatible agent.</strong> Customers do not need to write custom integrations. Their agent just connects to our MCP server.</li>
<li><strong>Composability with other tools.</strong> A customer agent can use FreightCortex tools alongside their own internal tools. The agent decides when to call which.</li>
<li><strong>Future-proofing.</strong> As the agent ecosystem grows, MCP-compatible platforms are accessible by default. REST-only platforms have to be manually integrated, one customer at a time.</li>
</ol>
<h4>What it requires</h4>
<p>Three engineering investments:</p>
<ol>
<li><strong>Tool contracts</strong> — every tool we want to expose has a typed schema. (We already had this.)</li>
<li><strong>The MCP server itself</strong> — a thin transport layer over those tools.</li>
<li><strong>Authentication and rate limiting</strong> — MCP doesn&#8217;t replace your existing auth; it sits on top of it.</li>
</ol>
<h4>A concrete example</h4>
<p>An analyst is using Claude Desktop on her workstation. She asks &#8220;what&#8217;s driving the cost increase on the Atlanta–Dallas corridor?&#8221; Claude knows about the FreightCortex MCP server (configured once per workstation) and decides to use it. It calls <code>query_corridor_metrics</code>, <code>compute_anomaly_score</code>, <code>query_carrier_metrics</code>, and <code>run_capacity_simulation</code> — and produces an answer with the same structure as the answer it would have given inside the FreightCortex web app, except this time it is in her existing analyst environment.</p>
<p>The analyst never had to open the FreightCortex web app.</p>
<h4>Closing</h4>
<p>If your platform isn&#8217;t callable by other agents, your platform isn&#8217;t future-proof. MCP is how you make that callable. It is a small engineering investment with very high leverage.</p>
<hr>
<p>The post <a href="https://zorost.com/mcp-server-freight-agents-call-agents/">When Agents Call Agents: Why the MCP Server Matters in Freight</a> appeared first on <a href="https://zorost.com">Zorost Intelligence | AI, Cloud &amp; Data Experts</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">24291</post-id>	</item>
		<item>
		<title>Building Multi-Agent Workflows on Databricks (Mosaic AI Agent Framework)</title>
		<link>https://zorost.com/multi-agent-databricks-mosaic-ai-agent-framework/</link>
		
		<dc:creator><![CDATA[Zorost Intelligence]]></dc:creator>
		<pubDate>Tue, 24 Feb 2026 09:00:00 +0000</pubDate>
				<category><![CDATA[Databricks Modernization]]></category>
		<category><![CDATA[Agent Framework]]></category>
		<category><![CDATA[Agentic AI]]></category>
		<category><![CDATA[MLflow]]></category>
		<category><![CDATA[Mosaic AI]]></category>
		<category><![CDATA[Multi-Agent]]></category>
		<guid isPermaLink="false">https://zorost.com/multi-agent-databricks-mosaic-ai-agent-framework/</guid>

					<description><![CDATA[<p>Multi-agent workflows native to the Lakehouse — designed, built, evaluated, and deployed on the Mosaic AI Agent Framework with typed tools and an evaluation harness.</p>
<p>The post <a href="https://zorost.com/multi-agent-databricks-mosaic-ai-agent-framework/">Building Multi-Agent Workflows on Databricks (Mosaic AI Agent Framework)</a> appeared first on <a href="https://zorost.com">Zorost Intelligence | AI, Cloud &amp; Data Experts</a>.</p>
]]></description>
										<content:encoded><![CDATA[<blockquote>
<p><strong>Pull-quote:</strong> &#8220;Agents on the Lakehouse mean tools that read and write Delta tables, models that serve under MLflow, and evaluations that ship as Delta tables themselves.&#8221;</p>
</blockquote>
<h4>Why this matters</h4>
<p>Agentic workflows are the next layer on the Lakehouse — agents that reason, plan, call tools, and produce verifiable artifacts. The Mosaic AI Agent Framework provides the runtime. The architectural decisions still belong to you.</p>
<h4>Reference architecture</h4>
<pre><code>┌──────────────────────────────────────────────────────────────────┐
│                    AGENT (LangGraph / LlamaIndex / Custom)        │
│                                                                    │
│   Planner ──► Executor ──► Critic ──► Referee                    │
└─────────────────────┬────────────────────────────────────────────┘
                      │
                      ▼
       ┌──────────────────────────────┐
       │   Typed Tools                 │ ◄── Tool catalog
       │   - read Delta tables         │     (Unity Catalog)
       │   - write Delta tables        │
       │   - call MLflow models        │
       │   - call REST APIs            │
       └──────────────┬───────────────┘
                      │
                      ▼
       ┌──────────────────────────────┐
       │   Mosaic AI Model Serving     │
       │   - foundation models         │
       │   - fine-tuned models         │
       │   - per-agent traffic split   │
       └──────────────┬───────────────┘
                      │
                      ▼
       ┌──────────────────────────────┐
       │   Evaluations as Delta tables │ ◄── Versioned
       │   - golden datasets           │
       │   - regression suite          │
       │   - hallucination detection   │
       └──────────────────────────────┘</code></pre>
<h4>What &#8220;typed tools&#8221; means</h4>
<p>Every tool has a JSON schema for inputs and outputs. The agent cannot call a tool with invalid inputs — the schema rejects the call. This eliminates an entire class of failure that plagues unconstrained agents.</p>
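<p>A minimal sketch of the idea, assuming a simplified field-to-type check rather than full JSON Schema validation. The <code>TypedTool</code> class and the <code>read_delta_table</code> tool are hypothetical names introduced for illustration; they are not part of the Mosaic AI Agent Framework.</p>

```python
from dataclasses import dataclass
from typing import Any, Callable


class ToolInputError(ValueError):
    """Raised when a tool call does not match the declared inputs."""


@dataclass(frozen=True)
class TypedTool:
    name: str
    input_types: dict[str, type]  # required field name -> expected type
    run: Callable[..., Any]

    def call(self, **kwargs: Any) -> Any:
        missing = set(self.input_types) - set(kwargs)
        extra = set(kwargs) - set(self.input_types)
        if missing or extra:
            raise ToolInputError(
                f"{self.name}: missing={sorted(missing)} extra={sorted(extra)}"
            )
        for field_name, expected in self.input_types.items():
            if not isinstance(kwargs[field_name], expected):
                raise ToolInputError(
                    f"{self.name}: {field_name} must be {expected.__name__}"
                )
        # Only validated input reaches the tool body.
        return self.run(**kwargs)


# Hypothetical tool; a real one would read a Delta table.
read_table = TypedTool(
    name="read_delta_table",
    input_types={"table": str, "limit": int},
    run=lambda table, limit: f"SELECT * FROM {table} LIMIT {limit}",
)

ok = read_table.call(table="evals.results", limit=10)

try:
    read_table.call(table="evals.results", limit="ten")
    rejected = False
except ToolInputError:
    rejected = True
```

<p>The invalid call fails at the contract boundary, so the tool body never runs with a malformed argument.</p>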
<h4>What &#8220;evaluations as Delta tables&#8221; means</h4>
<p>Evaluation results are stored as rows in versioned Delta tables. Each row is <code>(agent_version, input, expected_output, actual_output, score, metadata)</code>. Regression analysis is a <code>JOIN</code> between two <code>agent_version</code> slices. New versions don&#8217;t promote unless they pass.</p>
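<p>The regression <code>JOIN</code> can be sketched as follows. This example assumes an in-memory SQLite table standing in for a Delta table, drops the <code>metadata</code> column for brevity, and uses made-up rows; the promotion rule (no input may score lower than the baseline) is one possible gate, not necessarily the one used in production.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE eval_results (
        agent_version   TEXT,
        input           TEXT,
        expected_output TEXT,
        actual_output   TEXT,
        score           REAL
    )
    """
)
rows = [
    ("v1", "q1", "a", "a", 1.0),
    ("v1", "q2", "b", "b", 1.0),
    ("v2", "q1", "a", "a", 1.0),
    ("v2", "q2", "b", "c", 0.0),  # v2 regresses on q2
]
conn.executemany("INSERT INTO eval_results VALUES (?, ?, ?, ?, ?)", rows)

# Self-join: pair each input's baseline row with its candidate row and
# keep only the inputs where the candidate scores lower.
regressions = conn.execute(
    """
    SELECT base.input, base.score, cand.score
    FROM eval_results AS base
    JOIN eval_results AS cand ON cand.input = base.input
    WHERE base.agent_version = ?
      AND cand.agent_version = ?
      AND base.score > cand.score
    ORDER BY base.input
    """,
    ("v1", "v2"),
).fetchall()

# Hypothetical promotion gate: block the candidate on any regression.
promote = not regressions
print(regressions, promote)
```

<p>Because the evaluation rows are ordinary table data, the same query works against any two versions without a separate comparison tool.</p>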
<h4>The agent / human contract</h4>
<p>Where humans fit:</p>
<ul>
<li><strong>High-risk operations</strong> require human-in-the-loop checkpoints. Agents can propose; humans approve.</li>
<li><strong>Critic disagreements with the executor</strong> route to humans when the referee cannot adjudicate.</li>
<li><strong>Periodic spot-checks</strong> on agent decisions are scheduled into the evaluation harness.</li>
</ul>
<p>This is not &#8220;manual override.&#8221; This is a designed-in contract about which decisions are agent-final and which are human-final.</p>
<h4>Common architectural decisions</h4>
<table>
<thead>
<tr>
<th>Decision</th>
<th>Default</th>
</tr>
</thead>
<tbody>
<tr>
<td>Number of executors</td>
<td>One unless sub-goals are independent</td>
</tr>
<tr>
<td>Critic per executor or shared</td>
<td>Shared unless executors are heterogeneous</td>
</tr>
<tr>
<td>Memory model</td>
<td>Working memory in agent state; long-term memory in Delta table</td>
</tr>
<tr>
<td>Tool call timeout</td>
<td>30 s default, with retries on idempotent tools</td>
</tr>
<tr>
<td>Cost ceiling per session</td>
<td>Configurable; defaults to a hard cap</td>
</tr>
</tbody>
</table>
<h4>Closing</h4>
<p>Multi-agent workflows on Databricks are productive when the framework is paired with discipline: typed tools, deterministic logging, evaluations as Delta tables, and a designed-in agent / human contract. The Mosaic AI Agent Framework is the runtime; the architecture is yours.</p>
<hr>
<p>The post <a href="https://zorost.com/multi-agent-databricks-mosaic-ai-agent-framework/">Building Multi-Agent Workflows on Databricks (Mosaic AI Agent Framework)</a> appeared first on <a href="https://zorost.com">Zorost Intelligence | AI, Cloud &amp; Data Experts</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">24306</post-id>	</item>
		<item>
		<title>Multi-Agent OSINT with a Critic and a Referee</title>
		<link>https://zorost.com/multi-agent-osint-critic-referee/</link>
		
		<dc:creator><![CDATA[Zorost Intelligence]]></dc:creator>
		<pubDate>Tue, 20 Jan 2026 09:00:00 +0000</pubDate>
				<category><![CDATA[Geopolitical Intelligence]]></category>
		<category><![CDATA[Aquil]]></category>
		<category><![CDATA[Causal Inference]]></category>
		<category><![CDATA[Evaluation]]></category>
		<category><![CDATA[Multi-Agent]]></category>
		<category><![CDATA[OSINT]]></category>
		<guid isPermaLink="false">https://zorost.com/multi-agent-osint-critic-referee/</guid>

					<description><![CDATA[<p>A swarm of agents producing summaries is not analysis. Adding a critic and a referee changes what the system is. Here is how Aquil's OSINT architecture is structured.</p>
<p>The post <a href="https://zorost.com/multi-agent-osint-critic-referee/">Multi-Agent OSINT with a Critic and a Referee</a> appeared first on <a href="https://zorost.com">Zorost Intelligence | AI, Cloud &amp; Data Experts</a>.</p>
]]></description>
										<content:encoded><![CDATA[<blockquote>
<p><strong>Pull-quote:</strong> &#8220;Speed of agents matters less than honesty of agents. Critic and referee are how you build honesty into the swarm.&#8221;</p>
</blockquote>
<h4>Why this matters</h4>
<p>The first wave of multi-agent OSINT systems was a swarm: ten agents reading the same inputs and producing summaries, which were then averaged. The result was confident-sounding mediocrity. The agents reinforced each other&#8217;s biases. The aggregator could not tell whether the consensus was real or echo.</p>
<p>The second wave adds <strong>structure</strong> to the swarm. Specifically, two roles that are missing in the naive design:</p>
<ul>
<li><strong>Critic</strong> — adversarial review. The critic&#8217;s job is to find the weakest link in the analysts&#8217; reasoning and challenge it.</li>
<li><strong>Referee</strong> — adjudicates when analysts disagree. The referee&#8217;s job is to apply explicit decision criteria and produce a final answer with explicit reasoning.</li>
</ul>
<p>This is not a UI improvement. It is a structural change in what the system is.</p>
<h4>Aquil&#8217;s swarm</h4>
<p>Aquil runs a structured OSINT swarm with four roles:</p>
<ol>
<li><strong>Sourcers</strong> — discover and ingest open-source signals (news, public data, leaks, public records, satellite imagery sources where licensed)</li>
<li><strong>Analysts</strong> — produce hypotheses, summarize evidence, and propose causal explanations</li>
<li><strong>Critic</strong> — reviews analyst output for unsupported claims, missing evidence, plausible alternative explanations, and reasoning gaps</li>
<li><strong>Referee</strong> — adjudicates when the analysts and the critic disagree, with explicit criteria</li>
</ol>
<p>The critic is structurally different from the analysts: it does not propose new claims. Its only function is to challenge existing ones. The referee is structurally different again: it does not propose or challenge. It decides, with explicit reasoning that goes into the audit trail.</p>
<h4>Causal-graph synthesis</h4>
<p>On top of the swarm, Aquil produces a <strong>causal graph</strong> of the assessed situation — events as nodes, hypothesized causal relationships as edges, with confidence weights. The graph is the team&#8217;s shared mental model. It is updateable, queryable, and exportable.</p>
<p>A causal graph is not just a visualization. It is a structured commitment to <em>what we think is going on</em>. New evidence updates the graph; missing evidence flags weak edges; alternative hypotheses are visible as competing edges.</p>
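<p>One way to picture such a graph is the sketch below. The <code>CausalGraph</code> class, the event names, and the 0.5 threshold are all assumptions made for illustration; Aquil's internal representation is not described here.</p>

```python
from dataclasses import dataclass, field


@dataclass
class CausalGraph:
    # Maps (cause, effect) to a confidence weight between 0 and 1.
    edges: dict[tuple[str, str], float] = field(default_factory=dict)

    def set_edge(self, cause: str, effect: str, confidence: float) -> None:
        """Add a hypothesized causal link, or update it on new evidence."""
        if confidence > 1.0 or 0.0 > confidence:
            raise ValueError("confidence must be between 0 and 1")
        self.edges[(cause, effect)] = confidence

    def weak_edges(self, threshold: float) -> list[tuple[str, str]]:
        """Edges whose confidence falls below the threshold."""
        return sorted(e for e, w in self.edges.items() if threshold > w)

    def competing_causes(self, effect: str) -> list[str]:
        """All hypothesized causes of one event (competing edges)."""
        return sorted(c for (c, e) in self.edges if e == effect)


graph = CausalGraph()
graph.set_edge("port_closure", "shipping_delay", 0.8)
graph.set_edge("labor_strike", "shipping_delay", 0.3)

weak_before = graph.weak_edges(0.5)
graph.set_edge("labor_strike", "shipping_delay", 0.6)  # new evidence arrives
weak_after = graph.weak_edges(0.5)
rivals = graph.competing_causes("shipping_delay")
```

<p>The weakly supported edge is flagged until new evidence raises its weight, and both candidate causes of the same event stay visible side by side.</p>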
<h4>Why this works</h4>
<p>The naive swarm fails because mediocre answers can hide behind a chorus. The structured swarm makes the chorus disagree on purpose, and then makes a referee adjudicate. The agents&#8217; weaknesses are surfaced rather than averaged. The team gets a more honest answer.</p>
<h4>Closing</h4>
<p>Speed of agents matters less than honesty of agents. The critic and the referee are how you build honesty into the swarm. Aquil is structured around that thesis.</p>
<hr>
<p>The post <a href="https://zorost.com/multi-agent-osint-critic-referee/">Multi-Agent OSINT with a Critic and a Referee</a> appeared first on <a href="https://zorost.com">Zorost Intelligence | AI, Cloud &amp; Data Experts</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">24293</post-id>	</item>
		<item>
		<title>The Agent Factory: Planner, Executor, Critic, Referee</title>
		<link>https://zorost.com/agent-factory-planner-executor-critic-referee/</link>
		
		<dc:creator><![CDATA[Zorost Intelligence]]></dc:creator>
		<pubDate>Tue, 23 Dec 2025 09:00:00 +0000</pubDate>
				<category><![CDATA[Agentic AI Engineering]]></category>
		<category><![CDATA[Agentic AI]]></category>
		<category><![CDATA[Evaluation]]></category>
		<category><![CDATA[Governance]]></category>
		<category><![CDATA[LangGraph]]></category>
		<category><![CDATA[Multi-Agent]]></category>
		<guid isPermaLink="false">https://zorost.com/agent-factory-planner-executor-critic-referee/</guid>

					<description><![CDATA[<p>Most production agentic systems converge on the same architecture: a planner, an executor, a critic, and a referee. Here is the pattern, why it works, and how we apply it across industries.</p>
<p>The post <a href="https://zorost.com/agent-factory-planner-executor-critic-referee/">The Agent Factory: Planner, Executor, Critic, Referee</a> appeared first on <a href="https://zorost.com">Zorost Intelligence | AI, Cloud &amp; Data Experts</a>.</p>
]]></description>
										<content:encoded><![CDATA[<blockquote>
<p><strong>Pull-quote:</strong> &#8220;The four-role pattern is not an opinion. It&#8217;s the architecture every production multi-agent system converges on once it survives the first round of real users.&#8221;</p>
</blockquote>
<h4>Why this matters</h4>
<p>Multi-agent AI starts as a clever idea (let agents talk to each other!) and dies in production as an unreliable mess (agents hallucinate to each other, disagreements never resolve, the audit trail is unreadable). The fix is structural: four roles, typed contracts, deterministic logs.</p>
<h4>The four roles</h4>
<ol>
<li><strong>Planner</strong> — decomposes the high-level goal into sub-goals and decides the sequence. Reads the task, the available tools, and the agent&#8217;s memory; emits a structured plan.</li>
<li><strong>Executor(s)</strong> — carries out sub-goals. Calls tools. Returns structured outputs. Knows nothing about the high-level plan; just executes its assigned sub-goal honestly.</li>
<li><strong>Critic</strong> — reviews each executor output adversarially. Looks for unsupported claims, broken citations, missed evidence, alternative interpretations. Does not propose new actions; only critiques.</li>
<li><strong>Referee</strong> — adjudicates when the critic disagrees with the executor. Has explicit criteria. Produces the final decision with explicit reasoning.</li>
</ol>
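<p>The control flow between the four roles can be sketched with plain functions. Everything here is a stand-in: a real system would back each role with a model and tools, and the referee criterion shown (uncited claims are never accepted) is a hypothetical example of an explicit rule.</p>

```python
from dataclasses import dataclass


@dataclass
class Output:
    sub_goal: str
    claim: str
    citations: list[str]


def planner(goal: str) -> list[str]:
    # Stand-in planner: split the goal into ordered sub-goals.
    return [part.strip() for part in goal.split(";") if part.strip()]


def executor(sub_goal: str, evidence: dict[str, list[str]]) -> Output:
    # Executes one sub-goal without seeing the overall plan.
    return Output(sub_goal, f"answer to {sub_goal}", evidence.get(sub_goal, []))


def critic(output: Output) -> list[str]:
    # Only critiques; never proposes new actions.
    return [] if output.citations else ["unsupported claim: no citations"]


def referee(output: Output, objections: list[str]) -> tuple[str, str]:
    # Hypothetical explicit criterion: uncited claims are never accepted.
    if not output.citations:
        return "reject", "no citations; objections: " + "; ".join(objections)
    return "accept", "claim is cited; objections overruled"


def run(goal: str, evidence: dict[str, list[str]]) -> dict[str, tuple[str, str]]:
    decisions = {}
    for sub_goal in planner(goal):
        output = executor(sub_goal, evidence)
        objections = critic(output)
        # The referee is only consulted when the critic disagrees.
        if objections:
            decisions[sub_goal] = referee(output, objections)
        else:
            decisions[sub_goal] = ("accept", "critic raised no objections")
    return decisions


decisions = run(
    "check cost trend; check carrier rates",
    {"check cost trend": ["corridor_metrics_q3"]},
)
```

<p>Each decision carries its reasoning string, which is what would be written to the audit trail.</p>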
<h4>Why this works</h4>
<ul>
<li><strong>Planner / executor separation</strong> prevents the planner from drifting into execution and getting confused by tool errors.</li>
<li><strong>Critic separation</strong> prevents the executors from grading their own work, which is a category error.</li>
<li><strong>Referee separation</strong> prevents endless executor-vs-critic loops.</li>
</ul>
<h4>Common variations</h4>
<ul>
<li><strong>Single executor vs. multi-executor (parallelism).</strong> Parallel executors for independent sub-goals; serial for dependent ones.</li>
<li><strong>Critic per executor or shared critic.</strong> Per-executor for specialized critique; shared for consistency across the run.</li>
<li><strong>Hierarchical planning.</strong> A meta-planner produces a plan that includes &#8220;now plan this sub-task in detail&#8221; steps.</li>
</ul>
<h4>What we standardize</h4>
<p>We standardize three things across every production agentic system:</p>
<ol>
<li><strong>Typed tool contracts</strong> — every tool has explicit input/output schemas. No improvisation.</li>
<li><strong>Deterministic logs</strong> — every call (planner → executor, executor → tool, critic → executor) is logged with timestamps and parameters.</li>
<li><strong>Evaluation harnesses</strong> — every system ships with a golden dataset, a regression suite, hallucination detection, and grounding scoring. New versions are evaluated before promotion.</li>
</ol>
<h4>Where we run this pattern</h4>
<ul>
<li><strong>AeroFarr</strong> — multi-tool aviation analyst (planner / executor / critic over the prediction core, the cascade GNN, the causal engine, and the RAG corpus)</li>
<li><strong>EvidAI</strong> — 4-model consensus screening with explicit critic and referee</li>
<li><strong>FreightCortex</strong> — 16-tool AI freight analyst with planner / executor and a critic on report quality</li>
<li><strong>Aquil</strong> — sourcers / analysts / critic / referee for OSINT</li>
<li><strong>SPCio</strong> (with a manufacturing intelligence partner) — 8 specialized agents with a meta-coordinator</li>
</ul>
<h4>Closing</h4>
<p>The four-role pattern is not an opinion. It is the architecture every production multi-agent system converges on once it survives the first round of real users. Skipping it is a tax you pay later.</p>
<hr>
<p>The post <a href="https://zorost.com/agent-factory-planner-executor-critic-referee/">The Agent Factory: Planner, Executor, Critic, Referee</a> appeared first on <a href="https://zorost.com">Zorost Intelligence | AI, Cloud &amp; Data Experts</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">24295</post-id>	</item>
		<item>
		<title>Multi-Agent Quality: a New Architecture for the QMS</title>
		<link>https://zorost.com/multi-agent-quality-management-system/</link>
		
		<dc:creator><![CDATA[Zorost Intelligence]]></dc:creator>
		<pubDate>Tue, 25 Nov 2025 09:00:00 +0000</pubDate>
				<category><![CDATA[Manufacturing & Quality]]></category>
		<category><![CDATA[IATF 16949]]></category>
		<category><![CDATA[ISO 9001]]></category>
		<category><![CDATA[Multi-Agent]]></category>
		<category><![CDATA[RAG]]></category>
		<category><![CDATA[SPCio]]></category>
		<guid isPermaLink="false">https://zorost.com/multi-agent-quality-management-system/</guid>

					<description><![CDATA[<p>Traditional QMS systems are forms-and-rules engines. A multi-agent QMS is something different — and the difference matters operationally.</p>
<p>The post <a href="https://zorost.com/multi-agent-quality-management-system/">Multi-Agent Quality: a New Architecture for the QMS</a> appeared first on <a href="https://zorost.com">Zorost Intelligence | AI, Cloud &amp; Data Experts</a>.</p>
]]></description>
										<content:encoded><![CDATA[<blockquote>
<p><strong>Pull-quote:</strong> &#8220;An SPC chart is not a decision. The decision is what to do about it. That&#8217;s where agents earn their keep.&#8221;</p>
</blockquote>
<h4>Why this matters</h4>
<p>Quality management in regulated manufacturing has been essentially the same shape for thirty years: a set of forms (FMEA, Control Plans, MSA, NCR/CAPA, 8D), a statistical engine (control charts, capability indices, Gage R&amp;R), and an audit trail. The forms get filled out, the charts get run, the audits pass. Operations engineers spend more time documenting than analyzing.</p>
<p>A <strong>multi-agent QMS</strong> is structurally different. It is not a forms engine with AI bolted on. It is an engine of cooperating agents that observe data, run analysis, recommend actions, and document what they did.</p>
<h4>The agent architecture (eight specialized agents)</h4>
<p>SPCio (co-developed with a manufacturing intelligence partner) ships eight specialized quality agents:</p>
<ol>
<li><strong>Process Monitor</strong> — watches SPC charts and triggers analysis on out-of-control patterns</li>
<li><strong>Capability Analyst</strong> — runs Cp/Cpk/Pp/Ppk and interprets results in context</li>
<li><strong>MSA Engineer</strong> — runs Gage R&amp;R, ANOVA, and bias studies</li>
<li><strong>FMEA Author</strong> — drafts and updates Failure Mode and Effects Analysis with severity / occurrence / detection scoring</li>
<li><strong>Control Plan Author</strong> — drafts and updates Control Plans tied to FMEA and PPAP</li>
<li><strong>8D Investigator</strong> — runs Eight Disciplines problem-solving with root-cause analysis</li>
<li><strong>NCR / CAPA Coordinator</strong> — manages non-conformance reports and corrective/preventive actions through closure</li>
<li><strong>APQP Coordinator</strong> — orchestrates Advanced Product Quality Planning across phase gates</li>
</ol>
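<p>For reference, the capability indices the Capability Analyst reports share one formula shape: the two-sided index is (USL - LSL) / 6σ and the one-sided index is min(USL - μ, μ - LSL) / 3σ. The sketch below is illustrative only, with made-up measurements and spec limits. With the overall sample standard deviation it yields Pp/Ppk; Cp/Cpk use the same formulas with a within-subgroup estimate of σ.</p>

```python
from statistics import mean, stdev


def capability_indices(
    mu: float, sigma: float, lsl: float, usl: float
) -> tuple[float, float]:
    """Return (two-sided index, one-sided index) for the given sigma."""
    if sigma == 0:
        raise ValueError("sigma must be positive")
    two_sided = (usl - lsl) / (6 * sigma)
    one_sided = min(usl - mu, mu - lsl) / (3 * sigma)
    return two_sided, one_sided


# Made-up measurements and spec limits for illustration.
samples = [9.8, 10.0, 10.2, 10.1, 9.9]
lsl, usl = 9.4, 10.6

# Overall sample standard deviation, so these are Pp and Ppk.
pp, ppk = capability_indices(mean(samples), stdev(samples), lsl, usl)
print(round(pp, 3), round(ppk, 3))
```

<p>When the process mean sits at the center of the spec window the two indices coincide; a gap between them signals an off-center process.</p>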
<p>Each agent has a typed tool contract (inputs, outputs, side effects) and a deterministic call log. Agent-to-agent communication is mediated and recorded.</p>
<h4>Why the architecture works</h4>
<p>The classical QMS treats every form as an isolated artifact. The multi-agent QMS treats them as nodes in a graph: an FMEA refers to a Control Plan, which refers to an MSA, which refers to historical SPC data, which refers to the current production run. When an out-of-control pattern emerges on a chart, the Process Monitor doesn&#8217;t just raise an alert — it asks the Capability Analyst whether the process is still capable, asks the FMEA Author whether the relevant failure mode is documented, and asks the 8D Investigator to start a structured investigation if the pattern persists.</p>
<p>The result is a system that <strong>continuously maintains the QMS</strong> rather than waiting for the team to maintain it during audit prep.</p>
<h4>Tool counts and the RAG corpus</h4>
<p>SPCio&#8217;s eight agents share a tool catalog of <strong>fifty-seven callable tools</strong> ranging from statistical computations to chart generation to FMEA cross-referencing to PPAP documentation. The RAG layer is built over a <strong>765,000-chunk</strong> quality knowledge corpus covering IATF 16949, ISO 9001, AIAG core tools, and customer-specific quality manuals.</p>
<h4>Closing</h4>
<p>A multi-agent QMS is not a UI improvement on the old model. It is a different model. The implication for quality engineers is significant: less time documenting, more time analyzing, and a continuously maintained system that stays audit-ready because it never falls behind.</p>
<hr>
<p>The post <a href="https://zorost.com/multi-agent-quality-management-system/">Multi-Agent Quality: a New Architecture for the QMS</a> appeared first on <a href="https://zorost.com">Zorost Intelligence | AI, Cloud &amp; Data Experts</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">24288</post-id>	</item>
		<item>
		<title>An AI Freight Analyst with 16 Tools</title>
		<link>https://zorost.com/ai-freight-analyst-16-tools/</link>
		
		<dc:creator><![CDATA[Zorost Intelligence]]></dc:creator>
		<pubDate>Tue, 11 Nov 2025 09:00:00 +0000</pubDate>
				<category><![CDATA[Freight & Logistics]]></category>
		<category><![CDATA[Anomaly Detection]]></category>
		<category><![CDATA[FreightCortex]]></category>
		<category><![CDATA[Multi-Agent]]></category>
		<category><![CDATA[Simulation]]></category>
		<category><![CDATA[Tool Use]]></category>
		<guid isPermaLink="false">https://zorost.com/ai-freight-analyst-16-tools/</guid>

					<description><![CDATA[<p>Most freight intelligence platforms add a chatbot. FreightCortex makes the analyst the center of the platform. Here is what an AI analyst with 16 callable tools actually does — and how it compares to a senior human analyst.</p>
<p>The post <a href="https://zorost.com/ai-freight-analyst-16-tools/">An AI Freight Analyst with 16 Tools</a> appeared first on <a href="https://zorost.com">Zorost Intelligence | AI, Cloud &amp; Data Experts</a>.</p>
]]></description>
										<content:encoded><![CDATA[<blockquote>
<p><strong>Pull-quote:</strong> &#8220;The AI analyst is not a chatbot bolted on the side. It is the center of the platform.&#8221;</p>
</blockquote>
<h4>Why this matters</h4>
<p>Most freight intelligence platforms have followed the same pattern with generative AI: keep the existing dashboards, add a chatbot in the corner, ship a press release. The chatbot answers FAQ-class questions and sometimes summarizes a dashboard. Senior freight analysts ignore it.</p>
<p>FreightCortex is built around the AI analyst, not the other way around. The analyst is <strong>a multi-tool agent with sixteen callable tools</strong> that can pull data, run statistical tests, run simulations, and produce structured outputs. It is more like a junior analyst with access to the full platform than like a chatbot.</p>
<h4>The 16 tools</h4>
<table>
<thead>
<tr>
<th>#</th>
<th>Tool</th>
<th>What it does</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td><code>query_corridor_metrics</code></td>
<td>Lane-level KPIs (cost, transit time, capacity, on-time %)</td>
</tr>
<tr>
<td>2</td>
<td><code>query_carrier_metrics</code></td>
<td>Carrier-level KPIs and ranking</td>
</tr>
<tr>
<td>3</td>
<td><code>query_origin_destination_flows</code></td>
<td>OD-pair flows with filters</td>
</tr>
<tr>
<td>4</td>
<td><code>compute_anomaly_score</code></td>
<td>Z-score / isolation forest / CUSUM on a metric series</td>
</tr>
<tr>
<td>5</td>
<td><code>run_capacity_simulation</code></td>
<td>What-if capacity reduction or expansion</td>
</tr>
<tr>
<td>6</td>
<td><code>run_demand_simulation</code></td>
<td>What-if demand shock scenarios</td>
</tr>
<tr>
<td>7</td>
<td><code>run_disruption_simulation</code></td>
<td>What-if disruption (port closure, weather, strike)</td>
</tr>
<tr>
<td>8</td>
<td><code>run_routing_simulation</code></td>
<td>Reroute optimization under constraints</td>
</tr>
<tr>
<td>9</td>
<td><code>run_modal_shift_simulation</code></td>
<td>Mode-shift impact (truck ↔ rail ↔ intermodal)</td>
</tr>
<tr>
<td>10</td>
<td><code>run_emissions_simulation</code></td>
<td>CO₂ impact under scenarios</td>
</tr>
<tr>
<td>11</td>
<td><code>run_network_stress_test</code></td>
<td>Network-wide stress scenarios</td>
</tr>
<tr>
<td>12</td>
<td><code>compute_shortest_path</code></td>
<td>Multi-modal shortest path</td>
</tr>
<tr>
<td>13</td>
<td><code>compute_betweenness</code></td>
<td>Node centrality</td>
</tr>
<tr>
<td>14</td>
<td><code>compute_communities</code></td>
<td>Network communities</td>
</tr>
<tr>
<td>15</td>
<td><code>generate_report</code></td>
<td>Compose structured report from analytical session</td>
</tr>
<tr>
<td>16</td>
<td><code>generate_chart</code></td>
<td>Render a specific chart type with provided data</td>
</tr>
</tbody>
</table>
<p>Each tool is a typed contract: inputs, outputs, and side effects are documented. Every call is logged with the requesting question, the parameters, the result, and timestamps.</p>
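<p>To make one of these tools concrete, here is a minimal z-score sketch of the kind of check <code>compute_anomaly_score</code> is described as running. It is not the FreightCortex implementation; the function names, the threshold of 2.0, and the cost series are assumptions for illustration.</p>

```python
from statistics import mean, stdev


def z_scores(series: list[float]) -> list[float]:
    """Standardize each point against the series mean and sample stdev."""
    mu = mean(series)
    sigma = stdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def anomalies(series: list[float], threshold: float = 2.0) -> list[int]:
    """Indices whose absolute z-score exceeds the threshold."""
    return [i for i, z in enumerate(z_scores(series)) if abs(z) > threshold]


# Made-up weekly cost-per-mile values; the last week jumps.
weekly_cost = [2.10, 2.08, 2.12, 2.09, 2.11, 2.10, 2.07, 2.95]
flagged = anomalies(weekly_cost)
print(flagged)
```

<p>A plain z-score is sensitive to the outlier itself inflating the standard deviation, which is one reason a tool like this would also offer isolation forest or CUSUM.</p>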
<h4>Why typed tools matter</h4>
<p>The single most important architectural decision in agent design is <strong>whether your tools have contracts</strong>. Untyped tools — give the model a vague description and let it improvise — are unreliable. Typed tools — with explicit input schemas, output schemas, and validation — are reliable.</p>
<p>FreightCortex&#8217;s analyst will not call a tool with an invalid input. The schema rejects the call before it reaches the data layer. That eliminates an entire class of failure that plagues unconstrained agents.</p>
<h4>What this lets analysts do</h4>
<p>A typical session: an analyst asks &#8220;what&#8217;s driving the cost increase on the Atlanta-Dallas corridor over the last quarter?&#8221; The AI analyst then:</p>
<ol>
<li>Calls <code>query_corridor_metrics</code> for Atlanta-Dallas with a 90-day window</li>
<li>Calls <code>compute_anomaly_score</code> on the cost series</li>
<li>Calls <code>query_carrier_metrics</code> to see which carriers&#8217; rates moved</li>
<li>Calls <code>run_capacity_simulation</code> to test whether the increase tracks capacity changes</li>
<li>Generates a structured report with charts</li>
</ol>
<p>This is fifteen minutes of senior-analyst work. With FreightCortex, it is one question and a structured answer with citations.</p>
<h4>Closing</h4>
<p>A chatbot bolted onto a dashboard is a feature. An AI analyst at the center of the platform is a product. The difference shows up the moment senior analysts compare them in real engagements.</p>
<hr>
<p>The post <a href="https://zorost.com/ai-freight-analyst-16-tools/">An AI Freight Analyst with 16 Tools</a> appeared first on <a href="https://zorost.com">Zorost Intelligence | AI, Cloud &amp; Data Experts</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">24290</post-id>	</item>
		<item>
		<title>Multi-Agent Consensus for Systematic Literature Review</title>
		<link>https://zorost.com/multi-agent-consensus-systematic-review/</link>
		
		<dc:creator><![CDATA[Zorost Intelligence]]></dc:creator>
		<pubDate>Tue, 04 Nov 2025 09:00:00 +0000</pubDate>
				<category><![CDATA[Pharmaceutical Research]]></category>
		<category><![CDATA[Evaluation]]></category>
		<category><![CDATA[EvidAI]]></category>
		<category><![CDATA[Multi-Agent]]></category>
		<category><![CDATA[PRISMA 2020]]></category>
		<category><![CDATA[Risk of Bias]]></category>
		<category><![CDATA[ROBINS-I]]></category>
		<guid isPermaLink="false">https://zorost.com/multi-agent-consensus-systematic-review/</guid>

					<description><![CDATA[<p>Single-LLM screening makes the SLR process faster but no more accurate. Multi-agent consensus screening — with four models, explanations, and disagreement detection — preserves PRISMA 2020 rigor.</p>
<p>The post <a href="https://zorost.com/multi-agent-consensus-systematic-review/">Multi-Agent Consensus for Systematic Literature Review</a> appeared first on <a href="https://zorost.com">Zorost Intelligence | AI, Cloud &amp; Data Experts</a>.</p>
]]></description>
										<content:encoded><![CDATA[<blockquote>
<p><strong>Pull-quote:</strong> &#8220;If four independent reasoners agree, the inclusion decision is high-confidence. If they disagree, the question goes to a human. That&#8217;s the design contract.&#8221;</p>
</blockquote>
<h4>Why this matters</h4>
<p>Systematic literature reviews underpin regulatory submissions, clinical practice guidelines, and HTA decisions. Doing them well is expensive and slow — typically 4–6 months and a six-figure investment for a single review. Doing them badly is dangerous.</p>
<p>The first wave of LLM-assisted screening was a single model judging each title/abstract against the inclusion criteria. It was faster than manual review. It was no more accurate. In some cases, it was less accurate, because a single model has systematic biases that a human reviewer doesn&#8217;t share.</p>
<h4>What multi-agent consensus does</h4>
<p>EvidAI runs every screening decision through <strong>four independent LLMs</strong>, each with a structured prompt that includes the protocol&#8217;s inclusion and exclusion criteria, a brief excerpt from the abstract, and a request for explicit reasoning.</p>
<p>The four models vote. Five patterns emerge:</p>
<table>
<thead>
<tr>
<th>Pattern</th>
<th>Frequency</th>
<th>Action</th>
</tr>
</thead>
<tbody>
<tr>
<td>4–0 unanimous include</td>
<td>~78%</td>
<td>Auto-include</td>
</tr>
<tr>
<td>4–0 unanimous exclude</td>
<td>~13%</td>
<td>Auto-exclude</td>
</tr>
<tr>
<td>3–1 majority</td>
<td>~6%</td>
<td>Flag for human reviewer with explanations</td>
</tr>
<tr>
<td>2–2 split</td>
<td>~2%</td>
<td>Mandatory human reviewer with adjudication</td>
</tr>
<tr>
<td>Disagreement on reasoning</td>
<td>varies</td>
<td>Flag for human reviewer regardless of outcome</td>
</tr>
</tbody>
</table>
<p>(Frequencies are typical for a well-designed protocol; they vary with topic.)</p>
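<p>The routing in the table can be sketched as a small function. This is an illustration under assumptions, not EvidAI&#8217;s code: the <code>Action</code> enum and <code>route</code> function are hypothetical names, and reasoning disagreement is passed in as a flag because the post does not say how it is detected.</p>

```python
from collections import Counter
from enum import Enum


class Action(Enum):
    AUTO_INCLUDE = "auto-include"
    AUTO_EXCLUDE = "auto-exclude"
    FLAG_FOR_REVIEW = "flag for human reviewer with explanations"
    MANDATORY_ADJUDICATION = "mandatory human reviewer with adjudication"


def route(votes: list[str], reasoning_conflict: bool = False) -> Action:
    """Map four include/exclude votes to the action from the table."""
    if len(votes) != 4 or any(v not in ("include", "exclude") for v in votes):
        raise ValueError("expected exactly four 'include'/'exclude' votes")
    includes = Counter(votes)["include"]
    if includes == 2:                           # 2-2 split
        return Action.MANDATORY_ADJUDICATION
    if includes in (1, 3) or reasoning_conflict:
        return Action.FLAG_FOR_REVIEW           # 3-1 majority, or reasoning disagrees
    return Action.AUTO_INCLUDE if includes == 4 else Action.AUTO_EXCLUDE
```

<p>A unanimous vote with <code>reasoning_conflict=True</code> is still flagged, matching the last row of the table.</p>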
<h4>Why the design works</h4>
<p>The key insight is that <strong>errors from independent models are largely uncorrelated</strong>. Different LLMs have different systematic biases: different training data, different RLHF preferences, different prompt sensitivities. When four independent reasoners agree, the probability that all of them are wrong in the same way drops sharply. When they disagree, they are typically reproducing the disagreement that human reviewers would have had, which is exactly what should be escalated.</p>
<p>Single-model screening hides disagreement. Multi-agent consensus surfaces it.</p>
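<p>A rough back-of-the-envelope illustration of why agreement is informative. The 10% per-model error rate is a made-up figure for the arithmetic, not a measured EvidAI result, and full independence is an idealization: real models share some training data, so the true joint error rate will be higher than this product.</p>

```python
import math


def prob_all_wrong(error_rates: list[float]) -> float:
    """Probability that every model errs on the same paper, assuming independence."""
    if any(p > 1.0 or 0.0 > p for p in error_rates):
        raise ValueError("error rates must lie in [0, 1]")
    return math.prod(error_rates)


single = 0.10                                   # hypothetical per-model error rate
unanimous_wrong = prob_all_wrong([single] * 4)  # 0.1 ** 4, about 1 in 10,000

print(f"one model wrong: {single:.4f}")
print(f"all four wrong:  {unanimous_wrong:.6f}")
```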
<h4>Auditability</h4>
<p>Every screening decision is stored as a row with: paper ID, protocol version, model identifiers, raw model outputs, parsed decisions, the reason for inclusion/exclusion in each model&#8217;s words, the consensus result, and (if applicable) the human reviewer&#8217;s adjudication. The complete chain is replayable by an auditor and reproducible by a successor team.</p>
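<p>One way to picture that row is as a record type. The sketch below is an assumption about shape only: the class name, field names, and JSON serialization are hypothetical and simply mirror the fields listed above.</p>

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class ScreeningRecord:
    paper_id: str
    protocol_version: str
    model_ids: list[str]          # one identifier per model
    raw_outputs: list[str]        # unparsed model responses, kept for replay
    parsed_decisions: list[str]   # "include" or "exclude" per model
    reasons: list[str]            # each model's stated reason, in its own words
    consensus: str
    human_adjudication: Optional[str] = None  # set only when a reviewer steps in

    def to_json(self) -> str:
        # Sorted keys keep the serialized row stable across runs.
        return json.dumps(asdict(self), sort_keys=True)
```

<p>Storing the raw outputs next to the parsed decisions is what would let an auditor re-run the parsing step and confirm that the recorded consensus follows from what the models actually said.</p>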
<p>This is the difference between an AI tool that <em>speeds up</em> the SLR process and one that <em>preserves the audit standard</em> it requires.</p>
<h4>Closing</h4>
<p>The multi-agent consensus pattern is the right answer for any high-stakes screening problem where accountability and auditability matter. EvidAI applies it to systematic reviews. The same pattern transfers cleanly to compliance screening, regulatory document review, due diligence, and grant assessment.</p>
<hr>
<p>The post <a href="https://zorost.com/multi-agent-consensus-systematic-review/">Multi-Agent Consensus for Systematic Literature Review</a> appeared first on <a href="https://zorost.com">Zorost Intelligence | AI, Cloud &amp; Data Experts</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">24286</post-id>	</item>
	</channel>
</rss>
