<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>MLflow Archives - Zorost Intelligence | AI, Cloud &amp; Data Experts</title>
	<atom:link href="https://zorost.com/tag/mlflow/feed/" rel="self" type="application/rss+xml" />
	<link>https://zorost.com/tag/mlflow/</link>
	<description>Production AI systems for aviation, manufacturing, pharma, government, finance, freight, and geopolitical intelligence.</description>
	<lastBuildDate>Wed, 20 May 2026 18:52:37 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=7.0</generator>

<image>
	<url>https://zorost.com/wp-content/uploads/2025/08/ZOROST-Intel-Logo3_512-150x150.png</url>
	<title>MLflow Archives - Zorost Intelligence | AI, Cloud &amp; Data Experts</title>
	<link>https://zorost.com/tag/mlflow/</link>
	<width>32</width>
	<height>32</height>
</image> 
<site xmlns="com-wordpress:feed-additions:1">81719879</site>	<item>
		<title>Production ML on Databricks: MLflow, Feature Store, Calibration</title>
		<link>https://zorost.com/production-ml-databricks-mlflow-feature-store-calibration/</link>
		
		<dc:creator><![CDATA[Zorost Intelligence]]></dc:creator>
		<pubDate>Tue, 03 Mar 2026 09:00:00 +0000</pubDate>
				<category><![CDATA[Databricks Modernization]]></category>
		<category><![CDATA[Calibration]]></category>
		<category><![CDATA[Feature Store]]></category>
		<category><![CDATA[MLflow]]></category>
		<category><![CDATA[MLOps]]></category>
		<category><![CDATA[Mosaic AI]]></category>
		<guid isPermaLink="false">https://zorost.com/production-ml-databricks-mlflow-feature-store-calibration/</guid>

					<description><![CDATA[<p>A reference MLOps stack on Databricks — MLflow Model Registry, Feature Store with online serving, calibration-first model evaluation, and Mosaic AI Model Serving.</p>
<p>The post <a href="https://zorost.com/production-ml-databricks-mlflow-feature-store-calibration/">Production ML on Databricks: MLflow, Feature Store, Calibration</a> appeared first on <a href="https://zorost.com">Zorost Intelligence | AI, Cloud &amp; Data Experts</a>.</p>
]]></description>
										<content:encoded><![CDATA[<blockquote>
<p><strong>Pull-quote:</strong> &#8220;Production ML is not training a model. It&#8217;s the disciplines around training, registering, serving, monitoring, retraining, and retiring.&#8221;</p>
</blockquote>
<h4>Why this matters</h4>
<p>Most teams shipping their first ML model on Databricks underestimate the discipline required. Training is the small part. The system around training is the large part.</p>
<h4>The reference stack</h4>
<pre><code>   Data ──►  Feature Store  ◄────  online + offline serving
                  │
                  ▼
   Training pipeline (Databricks Job)
                  │
                  ▼
   MLflow Model Registry  ◄────  versions, stages, approvals
                  │
                  ▼
   Mosaic AI Model Serving  ◄────  A/B + canary
                  │
                  ▼
   Monitoring (drift, calibration, performance)
                  │
                  ▼
   Retraining trigger (event, schedule, drift threshold)</code></pre>
<h4>Feature Store — point-in-time correctness</h4>
<p>The Feature Store enforces <strong>point-in-time correctness</strong>: each training row is joined to the feature values that existed at the moment its label was generated, never to values written later. This prevents future information from leaking into training, which would otherwise inflate offline evaluation metrics. Online serving uses the same feature definitions, which keeps training and serving consistent.</p>
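<p>To make the point-in-time rule concrete, here is a minimal sketch in plain Python. It does not use the Feature Store API; the <code>feature_history</code> data and the <code>feature_as_of</code> helper are hypothetical names introduced only for illustration. It shows an "as of" lookup: each label timestamp receives the latest feature value written at or before it.</p>

```python
from bisect import bisect_right

# Hypothetical feature history for one entity: (timestamp, value), sorted by timestamp.
feature_history = [(1, 10.0), (5, 12.0), (9, 20.0)]


def feature_as_of(history, label_ts):
    """Return the latest feature value written at or before label_ts."""
    timestamps = [ts for ts, _ in history]
    idx = bisect_right(timestamps, label_ts)
    if idx == 0:
        return None  # no feature value existed yet at label time
    return history[idx - 1][1]


# None of these label times may see the value written at time 9.
label_times = [0, 4, 5, 8]
training_rows = [(ts, feature_as_of(feature_history, ts)) for ts in label_times]
print(training_rows)
```

<p>A naive join on the latest feature value would hand every row the value 20.0, which is exactly the leakage described above.</p>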
<h4>MLflow Model Registry — lifecycle stages</h4>
<p>Models progress through stages with explicit gates:</p>
<table>
<thead>
<tr>
<th>Stage</th>
<th>Gate</th>
</tr>
</thead>
<tbody>
<tr>
<td>Staging</td>
<td>Passes regression suite + calibration checks</td>
</tr>
<tr>
<td>Production</td>
<td>Passes A/B + canary criteria</td>
</tr>
<tr>
<td>Archived</td>
<td>Replaced by a newer Production model</td>
</tr>
</tbody>
</table>
<p>Every stage transition is logged with the user, the reason, and the metrics that justified it.</p>
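<p>One way to picture the audit trail is the sketch below. It is plain Python rather than the MLflow client API, and every name in it (<code>TransitionRecord</code>, <code>transition</code>, <code>ALLOWED_TRANSITIONS</code>, the model name and the user) is an assumption made for illustration. The idea is that a transition is recorded only when the move is allowed and its gate has passed.</p>

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Allowed moves, following the stage table ("None" = freshly registered; an assumption).
ALLOWED_TRANSITIONS = {
    ("None", "Staging"),
    ("Staging", "Production"),
    ("Production", "Archived"),
}


@dataclass
class TransitionRecord:
    model: str
    version: int
    from_stage: str
    to_stage: str
    user: str
    reason: str
    metrics: dict
    at: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())


def transition(audit_log, model, version, from_stage, to_stage,
               user, reason, metrics, gate_passed):
    """Append an audit record only if the move is allowed and its gate passed."""
    if (from_stage, to_stage) not in ALLOWED_TRANSITIONS:
        raise ValueError(f"transition {from_stage} to {to_stage} is not allowed")
    if not gate_passed:
        raise PermissionError(f"gate for {to_stage} did not pass")
    record = TransitionRecord(model, version, from_stage, to_stage, user, reason, metrics)
    audit_log.append(record)
    return record


audit_log = []
transition(audit_log, "churn_model", 7, "None", "Staging", "alice",
           "passed regression suite and calibration checks", {"ece": 0.015},
           gate_passed=True)
```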
<h4>Calibration-first evaluation</h4>
<p>We require every model to ship with <strong>Expected Calibration Error (ECE)</strong> and <strong>conformal prediction</strong> intervals (LACP). Headline accuracy is reported, but it is not the promotion gate.</p>
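<p>For readers who have not computed ECE before, the following is a minimal sketch of one common binary variant: predictions are grouped into equal-width probability bins, and the gap between mean predicted probability and observed positive rate is averaged, weighted by bin size. The function name, the choice of 10 bins, and the toy data are assumptions; the exact variant used in practice may differ.</p>

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Binary ECE: size-weighted mean gap between mean predicted
    probability and observed positive rate across equal-width bins."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_conf = sum(p for p, _ in members) / len(members)
        pos_rate = sum(y for _, y in members) / len(members)
        ece += (len(members) / total) * abs(mean_conf - pos_rate)
    return ece


# Toy holdout: predictions of 0.1 and 0.9 that are always right,
# so each bin is off by about 0.1 from perfect calibration.
ece = expected_calibration_error([0.1, 0.1, 0.9, 0.9], [0, 0, 1, 1])
print(round(ece, 4))
```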
<table>
<thead>
<tr>
<th>Gate</th>
<th>Default threshold</th>
</tr>
</thead>
<tbody>
<tr>
<td>ECE</td>
<td>&lt; 0.02 on holdout</td>
</tr>
<tr>
<td>Reliability diagram</td>
<td>No bin &gt; 0.05 deviation</td>
</tr>
<tr>
<td>Conformal coverage</td>
<td>Within 2pp of stated coverage</td>
</tr>
<tr>
<td>Performance regression</td>
<td>No metric below the prior production model</td>
</tr>
</tbody>
</table>
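<p>The default thresholds in the table can be expressed as a single gate function. This is a sketch under stated assumptions: the function and argument names are hypothetical, "2pp" is read as an absolute difference of 0.02, and every performance metric is assumed to be higher-is-better.</p>

```python
def evaluate_gates(ece, max_bin_deviation, observed_coverage, stated_coverage,
                   candidate_metrics, production_metrics):
    """Return per-gate results and an overall pass/fail decision."""
    missing = float("-inf")  # a metric the candidate does not report counts as a regression
    results = {
        "ece": ece < 0.02,
        "reliability_diagram": max_bin_deviation <= 0.05,
        "conformal_coverage": abs(observed_coverage - stated_coverage) <= 0.02,
        "performance_regression": all(
            candidate_metrics.get(name, missing) >= value
            for name, value in production_metrics.items()
        ),
    }
    return results, all(results.values())


results, promote = evaluate_gates(
    ece=0.015,
    max_bin_deviation=0.04,
    observed_coverage=0.89,
    stated_coverage=0.90,
    candidate_metrics={"auc": 0.91},
    production_metrics={"auc": 0.90},
)
print(results, promote)
```

<p>Returning the per-gate dictionary alongside the overall decision keeps the reason for a failed promotion visible in logs.</p>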
<h4>Mosaic AI Model Serving — A/B and canary</h4>
<p>Traffic splits and canary rollouts are first-class. A new version starts at 5% of traffic, is observed against SLAs and metrics, and then ramps up. Rollback is one click.</p>
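<p>The mechanics of a canary split can be sketched in a few lines. This is not the Model Serving configuration API; <code>route</code>, <code>next_step</code>, and the ramp steps after 5% are hypothetical and chosen only to illustrate the idea. Hashing the request id keeps the assignment stable for a given request, and an unhealthy observation drops the canary share to zero.</p>

```python
import hashlib

# Assumed ramp schedule: only the initial 5% comes from the text above.
RAMP = [5, 25, 50, 100]


def route(request_id, canary_percent):
    """Deterministically assign a request to 'canary' or 'stable' by hashing its id."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return "canary" if bucket < canary_percent else "stable"


def next_step(current_percent, healthy):
    """Advance one ramp step when healthy; return 0 (rollback) otherwise."""
    if not healthy:
        return 0
    later = [p for p in RAMP if p > current_percent]
    return later[0] if later else current_percent


canary_hits = sum(route(f"req-{i}", 5) == "canary" for i in range(10000))
print(canary_hits, next_step(5, healthy=True), next_step(25, healthy=False))
```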
<h4>Monitoring — drift, calibration, performance</h4>
<p>Three things to monitor:</p>
<ul>
<li><strong>Feature drift</strong> — input distribution shift</li>
<li><strong>Calibration drift</strong> — ECE shifting over time</li>
<li><strong>Performance drift</strong> — labeled outcomes degrading</li>
</ul>
<p>Monitoring runs as a Databricks Job. Alerts go to Slack / Teams / PagerDuty.</p>
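<p>As an illustration of the feature-drift check, the sketch below computes the Population Stability Index (PSI) between a baseline sample and a current sample. PSI is one common drift statistic and is an assumption here, not necessarily the metric used in this stack; the bin edges, the 0.2 alert threshold (a widely used rule of thumb), and all names are likewise illustrative.</p>

```python
import math


def psi(expected, actual, edges):
    """Population Stability Index between a baseline and a current sample."""
    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            idx = sum(1 for e in edges if v >= e)  # index of the bin containing v
            counts[idx] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def drift_alerts(feature_name, expected, actual, edges, threshold=0.2):
    """Return alert messages; delivery (Slack, Teams, PagerDuty) is out of scope here."""
    value = psi(expected, actual, edges)
    if value > threshold:
        return [f"feature drift on {feature_name}: PSI={value:.2f}"]
    return []


baseline = [i / 100 for i in range(100)]       # roughly uniform on [0, 1)
shifted = [0.5 + i / 200 for i in range(100)]  # mass moved to [0.5, 1)
edges = [0.25, 0.5, 0.75]
print(drift_alerts("session_length", baseline, shifted, edges))
```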
<h4>Closing</h4>
<p>Production ML on Databricks is straightforward when the stack is right: Feature Store for consistency, MLflow Registry for lifecycle, Mosaic AI Model Serving for delivery, calibration-first evaluation, and disciplined monitoring. Training is the easy part.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">24307</post-id>	</item>
		<item>
		<title>Building Multi-Agent Workflows on Databricks (Mosaic AI Agent Framework)</title>
		<link>https://zorost.com/multi-agent-databricks-mosaic-ai-agent-framework/</link>
		
		<dc:creator><![CDATA[Zorost Intelligence]]></dc:creator>
		<pubDate>Tue, 24 Feb 2026 09:00:00 +0000</pubDate>
				<category><![CDATA[Databricks Modernization]]></category>
		<category><![CDATA[Agent Framework]]></category>
		<category><![CDATA[Agentic AI]]></category>
		<category><![CDATA[MLflow]]></category>
		<category><![CDATA[Mosaic AI]]></category>
		<category><![CDATA[Multi-Agent]]></category>
		<guid isPermaLink="false">https://zorost.com/multi-agent-databricks-mosaic-ai-agent-framework/</guid>

					<description><![CDATA[<p>Multi-agent workflows native to the Lakehouse — designed, built, evaluated, and deployed on the Mosaic AI Agent Framework with typed tools and an evaluation harness.</p>
<p>The post <a href="https://zorost.com/multi-agent-databricks-mosaic-ai-agent-framework/">Building Multi-Agent Workflows on Databricks (Mosaic AI Agent Framework)</a> appeared first on <a href="https://zorost.com">Zorost Intelligence | AI, Cloud &amp; Data Experts</a>.</p>
]]></description>
										<content:encoded><![CDATA[<blockquote>
<p><strong>Pull-quote:</strong> &#8220;Agents on the Lakehouse mean tools that read and write Delta tables, models that serve under MLflow, and evaluations that ship as Delta tables themselves.&#8221;</p>
</blockquote>
<h4>Why this matters</h4>
<p>Agentic workflows are the next layer on the Lakehouse — agents that reason, plan, call tools, and produce verifiable artifacts. The Mosaic AI Agent Framework provides the runtime. The architectural decisions still belong to you.</p>
<h4>Reference architecture</h4>
<pre><code>┌──────────────────────────────────────────────────────────────────┐
│                    AGENT (LangGraph / LlamaIndex / Custom)        │
│                                                                    │
│   Planner ──► Executor ──► Critic ──► Referee                    │
└─────────────────────┬────────────────────────────────────────────┘
                      │
                      ▼
       ┌──────────────────────────────┐
       │   Typed Tools                 │ ◄── Tool catalog
       │   - read Delta tables         │     (Unity Catalog)
       │   - write Delta tables        │
       │   - call MLflow models        │
       │   - call REST APIs            │
       └──────────────┬───────────────┘
                      │
                      ▼
       ┌──────────────────────────────┐
       │   Mosaic AI Model Serving     │
       │   - foundation models         │
       │   - fine-tuned models         │
       │   - per-agent traffic split   │
       └──────────────┬───────────────┘
                      │
                      ▼
       ┌──────────────────────────────┐
       │   Evaluations as Delta tables │ ◄── Versioned
       │   - golden datasets           │
       │   - regression suite          │
       │   - hallucination detection   │
       └──────────────────────────────┘</code></pre>
<h4>What &#8220;typed tools&#8221; means</h4>
<p>Every tool has a JSON schema for inputs and outputs. The agent cannot call a tool with invalid inputs — the schema rejects the call. This eliminates an entire class of failure that plagues unconstrained agents.</p>
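<p>A minimal sketch of the idea follows. It implements only a small subset of JSON Schema (required fields, primitive types, no extra fields) in plain Python; a real deployment would more likely use a full validator. The tool name <code>read_delta_table</code>, its fields, and the helper names are hypothetical.</p>

```python
TYPE_MAP = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def validate(schema, payload):
    """Check required fields, reject unknown fields, and check primitive types."""
    errors = []
    for name in schema.get("required", []):
        if name not in payload:
            errors.append(f"missing required field: {name}")
    for name, value in payload.items():
        spec = schema["properties"].get(name)
        if spec is None:
            errors.append(f"unexpected field: {name}")
            continue
        expected = TYPE_MAP[spec["type"]]
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_bool = isinstance(value, bool)
        if (is_bool and spec["type"] != "boolean") or not isinstance(value, expected):
            errors.append(f"{name}: expected {spec['type']}")
    return errors


def call_tool(tool, payload):
    """Validate inputs before the call and outputs after it."""
    errors = validate(tool["input_schema"], payload)
    if errors:
        raise ValueError("invalid tool input: " + "; ".join(errors))
    result = tool["fn"](**payload)
    errors = validate(tool["output_schema"], result)
    if errors:
        raise ValueError("invalid tool output: " + "; ".join(errors))
    return result


# Hypothetical tool definition; the function body is a stand-in for a real table read.
read_delta_table = {
    "input_schema": {
        "required": ["table", "limit"],
        "properties": {"table": {"type": "string"}, "limit": {"type": "integer"}},
    },
    "output_schema": {
        "required": ["row_count"],
        "properties": {"row_count": {"type": "integer"}},
    },
    "fn": lambda table, limit: {"row_count": limit},
}

print(call_tool(read_delta_table, {"table": "sales.orders", "limit": 10}))
```

<p>Calling the tool with <code>{"table": "sales.orders", "limit": "ten"}</code> raises before the underlying function ever runs.</p>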
<h4>What &#8220;evaluations as Delta tables&#8221; means</h4>
<p>Evaluation results are stored as rows in versioned Delta tables. Each row is <code>(agent_version, input, expected_output, actual_output, score, metadata)</code>. Regression analysis is a <code>JOIN</code> between two <code>agent_version</code> slices. New versions don&#8217;t promote unless they pass.</p>
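<p>The regression <code>JOIN</code> can be sketched as follows. To keep the example self-contained it uses an in-memory SQLite database from the Python standard library as a stand-in for a Delta table; the query shape would be similar in Spark SQL, but that is an assumption. The table name <code>evals</code>, the sample rows, and the promotion rule (no input may score lower than before) are illustrative.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE evals (agent_version TEXT, input TEXT, expected_output TEXT, "
    "actual_output TEXT, score REAL, metadata TEXT)"
)
rows = [
    ("v1", "q1", "a", "a", 1.0, "{}"),
    ("v1", "q2", "b", "b", 1.0, "{}"),
    ("v2", "q1", "a", "a", 1.0, "{}"),
    ("v2", "q2", "b", "c", 0.0, "{}"),  # v2 regresses on q2
]
conn.executemany("INSERT INTO evals VALUES (?, ?, ?, ?, ?, ?)", rows)

# Self-join on input: one slice per agent_version, keep rows where the candidate scores lower.
regressions = conn.execute(
    """
    SELECT base.input, base.score, cand.score
    FROM evals AS base
    JOIN evals AS cand ON base.input = cand.input
    WHERE base.agent_version = ? AND cand.agent_version = ?
      AND cand.score < base.score
    ORDER BY base.input
    """,
    ("v1", "v2"),
).fetchall()

can_promote = not regressions
print(regressions, can_promote)
```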
<h4>The agent / human contract</h4>
<p>Where humans fit:</p>
<ul>
<li><strong>High-risk operations</strong> require human-in-the-loop checkpoints. Agents can propose; humans approve.</li>
<li><strong>Critic disagreements with the executor</strong> route to humans when the referee cannot adjudicate.</li>
<li><strong>Periodic spot-checks</strong> on agent decisions are scheduled into the evaluation harness.</li>
</ul>
<p>This is not &#8220;manual override.&#8221; This is a designed-in contract about which decisions are agent-final and which are human-final.</p>
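<p>One way to make such a contract explicit is to encode it as a small routing function, so that the decision about who has the final say is itself testable. The sketch below is an assumption about how the three rules above might be expressed; the function names and the spot-check interval of 20 are hypothetical.</p>

```python
def decide_authority(high_risk, critic_agrees, referee_resolved):
    """Return 'human' or 'agent' as the final authority for one decision."""
    if high_risk:
        return "human"  # agents propose, humans approve
    if not critic_agrees and not referee_resolved:
        return "human"  # unresolved critic/executor disagreement
    return "agent"


def is_spot_check(decision_index, every_n=20):
    """Flag every n-th decision for periodic human review."""
    return decision_index % every_n == 0


print(decide_authority(high_risk=False, critic_agrees=False, referee_resolved=True))
print(decide_authority(high_risk=True, critic_agrees=True, referee_resolved=True))
```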
<h4>Common architectural decisions</h4>
<table>
<thead>
<tr>
<th>Decision</th>
<th>Default</th>
</tr>
</thead>
<tbody>
<tr>
<td>Number of executors</td>
<td>One unless sub-goals are independent</td>
</tr>
<tr>
<td>Critic per executor or shared</td>
<td>Shared unless executors are heterogeneous</td>
</tr>
<tr>
<td>Memory model</td>
<td>Working memory in agent state; long-term memory in Delta table</td>
</tr>
<tr>
<td>Tool call timeout</td>
<td>30 s default, with retries on idempotent tools</td>
</tr>
<tr>
<td>Cost ceiling per session</td>
<td>Configurable; defaults to a hard cap</td>
</tr>
</tbody>
</table>
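<p>Two of these defaults, the tool call timeout with retries and the hard cost cap, can be sketched with the Python standard library. This is an illustration under assumptions, not framework code: <code>call_with_timeout</code>, <code>CostCeiling</code>, the retry count, and the dollar units are hypothetical. Note that a timed-out call keeps running in its worker thread, which is one reason retries are limited to idempotent tools.</p>

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def call_with_timeout(fn, timeout_s=30.0, idempotent=False, max_retries=2):
    """Run fn with a timeout; retry on timeout only when the tool is idempotent."""
    attempts = 1 + (max_retries if idempotent else 0)
    last_error = None
    for _ in range(attempts):
        pool = ThreadPoolExecutor(max_workers=1)
        future = pool.submit(fn)
        try:
            return future.result(timeout=timeout_s)
        except FutureTimeout as exc:
            last_error = exc
        finally:
            pool.shutdown(wait=False)  # do not block on a call that is still running
    raise TimeoutError(f"tool call timed out after {attempts} attempt(s)") from last_error


class CostCeiling:
    """Hard per-session cap: a charge that would exceed the cap is refused."""

    def __init__(self, cap_usd):
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def charge(self, amount_usd):
        if self.spent_usd + amount_usd > self.cap_usd:
            raise RuntimeError("session cost ceiling reached")
        self.spent_usd += amount_usd


budget = CostCeiling(cap_usd=1.00)
budget.charge(0.40)
print(call_with_timeout(lambda: "ok", timeout_s=1.0), budget.spent_usd)
```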
<h4>Closing</h4>
<p>Multi-agent workflows on Databricks are productive when the framework is paired with discipline: typed tools, deterministic logging, evaluations as Delta tables, and a designed-in agent / human contract. The Mosaic AI Agent Framework is the runtime; the architecture is yours.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">24306</post-id>	</item>
	</channel>
</rss>
