Zouhair Loucif
Client: Digitalent
Role: Full-Stack Developer
Period: 2021-10 → 2022-04
Location: Casablanca · Morocco
Read: 3 min

Productionising the MIA AI engine

Six months on a no-code AI platform: Nuxt server-side rendering, containerised ML on Kubernetes, GitHub Actions CI/CD, and a partnership with the AI engineers to productionise their LLM-adjacent models.

  • −50%
    Load-time reduction
  • Months → weeks
    Release cadence
  • Nuxt · Flask · K8s · AWS
    Stack

MIA was a no-code AI platform built around a model registry and a workflow surface that let non-engineers compose pipelines. The product had a research-grade engine and a frontend that hadn’t kept up. Six months at Digitalent was about closing that gap.

The frontend had to feel like a product

The MIA UI was rendered entirely client-side, which made first paint slow and the experience feel unfinished: heavy on a workstation, painful on anything else. We rebuilt it on Nuxt with server-side rendering.

The win was straightforward: load times halved on the routes that mattered. The deeper win was what SSR forced us to think about — hydration boundaries, where the server’s output ended and the client’s interactivity began, what state was authoritative on the wire and what state was local. Every component had to declare its boundaries. The codebase got more honest as a side effect.

User retention rose in the months that followed. It is hard to attribute that cleanly to the SSR work alone, but the leading indicators (bounce on first paint, time-to-first-interactive) moved in the right direction.

Containerising the ML so it could scale

The ML side ran as a Flask service that the frontend talked to. It worked at low traffic and broke at the inflection point where inference traffic outgrew a single instance.

We containerised the workload — Docker images per model variant, Kubernetes orchestrating them on AWS, autoscaling tied to inference queue depth. The frontend stayed mostly the same; the contract with the ML tier didn’t change. What changed was that we could now scale horizontally without manual intervention, and the platform stopped buckling on traffic spikes.
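As a rough illustration of what "autoscaling tied to inference queue depth" means in practice, here is a minimal Python sketch of the proportional rule an autoscaler can apply: pick enough replicas that each one handles about a target number of queued requests, clamped to a floor and a ceiling. The function name, the per-replica target, and the replica bounds are hypothetical values chosen for the example, not the configuration MIA ran with.

```python
import math


def desired_replicas(
    queue_depth: int,
    target_per_replica: int,
    min_replicas: int = 1,
    max_replicas: int = 10,
) -> int:
    """Return how many replicas are needed for the current queue depth.

    All parameters are illustrative assumptions: queue_depth is the number
    of pending inference requests, target_per_replica is how many of them
    one replica should be responsible for.
    """
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    # Proportional rule: total load divided by the per-replica target,
    # rounded up so that no replica is pushed above the target.
    wanted = math.ceil(queue_depth / target_per_replica)
    # Clamp so the service never scales to zero or grows without bound.
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    for depth in (0, 45, 1000):
        print(depth, desired_replicas(depth, target_per_replica=20))
```

In a real cluster this decision would normally be delegated to the orchestrator's autoscaler, fed by a queue-depth metric, rather than computed in application code; the sketch only shows the arithmetic behind it.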

The autoscaling work survived contact with growth. Traffic kept rising over the months I was there; the system kept up.

A pipeline that actually shipped

The release cadence was the other lever. The team had been shipping monthly, sometimes less often, with manual verification between merge and deploy. I set up GitHub Actions CI/CD on top of the existing test suite: every PR ran the full suite, every merge triggered a deploy candidate, and every deploy candidate had a one-button promotion path.

Release cadence compressed from months to weeks. Engineers stopped batching changes to ride a single deploy. The cost of a small fix went from “hold for the next release” to “merge and watch.”

Partnering with the AI engineers

The most interesting work was at the seam between application and model. The AI engineers were doing serious work on machine-learning and LLM-adjacent models: predictive signals, classification heads, language tasks. My job at that seam was translation, turning their research output into something the frontend could call, monitor, and recover from when the model misbehaved.

We landed model versioning, prediction logging, and a degraded-mode behaviour for when an inference call timed out. None of that is novel research. All of it is the work that has to happen for an AI product to be a product.
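To make those three pieces concrete, here is a minimal Python sketch of how they can fit together in one call path: the inference function runs with a deadline, the result carries the model version it came from, every outcome is logged, and a timeout yields a fallback value flagged as degraded instead of an error. The names (Prediction, predict_with_fallback), the two-second default, and the log format are all assumptions made for the example; this is not the MIA code.

```python
import logging
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from dataclasses import dataclass
from typing import Any, Callable, Optional

logger = logging.getLogger("predictions")  # hypothetical logger name
_executor = ThreadPoolExecutor(max_workers=4)


@dataclass
class Prediction:
    model_version: str     # which model version was asked for the answer
    value: Optional[Any]   # model output, or the fallback when degraded
    degraded: bool         # True when the call timed out


def predict_with_fallback(
    infer: Callable[[Any], Any],
    payload: Any,
    model_version: str,
    timeout_s: float = 2.0,
    fallback: Optional[Any] = None,
) -> Prediction:
    """Run infer(payload) with a deadline and log the outcome either way."""
    future = _executor.submit(infer, payload)
    try:
        value = future.result(timeout=timeout_s)
        result = Prediction(model_version, value, degraded=False)
    except FutureTimeout:
        # cancel() only helps if the task has not started yet; a call that
        # is already running keeps going in its worker thread.
        future.cancel()
        result = Prediction(model_version, fallback, degraded=True)
    logger.info(
        "model=%s degraded=%s value=%r",
        result.model_version, result.degraded, result.value,
    )
    return result
```

Only timeouts are turned into degraded results here; an exception raised by the inference function itself still propagates to the caller, which keeps genuine bugs visible. A caller can branch on the degraded flag to show a placeholder or a cached answer.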

Six months wasn’t long enough to do everything we’d scoped. It was long enough to ship the foundation that the AI side could keep building on. That’s the version of AI engineering I find most useful — not the demo, the deploy.

Stack
  • Nuxt.js
  • Vue.js
  • Flask
  • Python
  • Docker
  • Kubernetes
  • AWS
  • GitHub Actions