MLOps · 2025 · Solo, MLOps

Shadow Sentinel

A shadow-mode deployment system that runs a challenger model silently alongside production, measures how often the two agree on real traffic, and automatically promotes the challenger or rolls it back, all gated by a full CI/CD pipeline.

GitHub Actions · Docker · FastAPI · Shadow Deployment · Streamlit · CI/CD

Outcome

Zero

User impact while testing new models

In the product

Figure: Shadow Sentinel live monitoring dashboard showing agreement rate and model status.

01

Context

Shipping a new model is risky because you cannot tell how it behaves on real traffic until it is already serving users. Shadow Sentinel removes that risk: a challenger model runs on live requests in parallel with production, affecting no one, while the system measures agreement and decides automatically whether the new model is safe to promote.

02

Approach

  • 01 On every push, GitHub Actions runs unit tests and code quality checks, validates the model on accuracy and latency, then builds and pushes a Docker image. A failing gate blocks the deploy and notifies the developer.
  • 02 A FastAPI gateway routes every request to Model A, the production model, which serves the user response. Model B, the challenger, runs silently in parallel on the same traffic, and its output is never returned to the user.
  • 03 A comparison layer logs the agreement rate between the two models across requests, along with the confidence gap and any high-confidence conflicts.
  • 04 After 200+ requests, the system decides automatically. Agreement of 80% or higher promotes Model B to production; anything lower triggers an automatic rollback, fires an alert, and keeps Model A. A Streamlit dashboard shows the agreement rate and model status live.
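
The comparison and decision steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's actual code: the 200-request minimum and the 80% gate come from the description above, while the class name ShadowStats, the 0.90 cutoff for a "high-confidence" conflict, and the definition of the confidence gap as the absolute difference between the two models' confidences are assumptions made for the example.

```python
from dataclasses import dataclass

MIN_REQUESTS = 200        # from the write-up: no decision before 200 requests
PROMOTE_THRESHOLD = 0.80  # from the write-up: >= 80% agreement promotes
HIGH_CONFIDENCE = 0.90    # assumption: cutoff for a "high-confidence" conflict


@dataclass
class ShadowStats:
    """Running comparison between production (A) and challenger (B)."""

    total: int = 0
    agreements: int = 0
    confidence_gap_sum: float = 0.0
    high_confidence_conflicts: int = 0

    def record(self, label_a, conf_a, label_b, conf_b):
        """Log one request that both models scored."""
        self.total += 1
        # Assumption: the gap is the absolute difference in confidence.
        self.confidence_gap_sum += abs(conf_a - conf_b)
        if label_a == label_b:
            self.agreements += 1
        elif min(conf_a, conf_b) >= HIGH_CONFIDENCE:
            # Both models are confident, yet they disagree.
            self.high_confidence_conflicts += 1

    @property
    def agreement_rate(self):
        return self.agreements / self.total if self.total else 0.0

    @property
    def mean_confidence_gap(self):
        return self.confidence_gap_sum / self.total if self.total else 0.0

    def decide(self):
        """Return 'pending', 'promote', or 'rollback'."""
        if self.total < MIN_REQUESTS:
            return "pending"
        if self.agreement_rate >= PROMOTE_THRESHOLD:
            return "promote"
        return "rollback"
```

Keeping the decision a pure function of the logged counts means the same rule can be unit tested in CI and re-evaluated by a dashboard without touching the serving path.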

03

How it works

  1. Developer

    Push to GitHub

    A new model or code change

  2. CI

    GitHub Actions

    Tests, quality, model validation, Docker build

  3. Gate

    Tests pass?

    Fail blocks the deploy and alerts the developer

  4. Shadow CD

    FastAPI gateway

    Model A serves users, Model B runs silently alongside

  5. Compare

    Comparison layer

    Logs agreement rate over 200+ live requests

  6. Decision

    Promote or roll back

    ≥80% agreement promotes, otherwise auto rollback

04

Results

≥80%

Agreement gate to auto promote

200+

Live requests before any decision

Auto

Promotion and rollback, no manual step

05

Reflection

This is the project that thinks most like production. Shadow mode plus automated promotion and rollback is exactly the reliability work that separates a model that demos well from one a team can actually trust on live traffic.