01 · Context

A signed SLA. Nothing to measure it with.

Fever ran a Virtual Power Plant + energy-trading platform. A contractual availability SLA was already baked into customer contracts — and it was blocking enterprise sales, because nobody could answer 'how do you actually measure this?'

Contract

Contract SLA existed

Before I joined, Fever had already written an availability SLA into customer contracts: signed, with real money attached.

Measurement

Nothing measurable beneath it

No SLOs. No SLIs. No way to actually confirm whether we were meeting the number we'd promised.

Wording

Worded around the uncontrollable

The SLA was written around the physical performance of the batteries, which depends on grid conditions, weather, and hardware we didn't control.

Why now
  • Sales due diligence was blocking enterprise deals on the question 'how do you actually measure this?' Once we had a defensible number, sales could answer it and the deals unblocked.
  • We were scaling to larger customers with sharper reliability expectations.
  • Reliability was handled by on-call heroics — no shared definition of 'working.'

Guiding principle

Measure what the customer experiences, not what the service reports about itself.

Black-box, not white-box. This choice is what makes the whole system defensible — and what caught the third-party issue you'll see in Impact.
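
As a rough illustration of the black-box idea, an availability SLI can be computed purely from the outcomes of synthetic requests made from outside the system, the way a customer would make them. The sketch below is not Fever's pipeline: `ProbeResult`, `is_good`, `availability_sli`, and the 2-second latency threshold are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeResult:
    """Outcome of one synthetic request made from outside the system (hypothetical shape)."""
    status_code: int
    latency_s: float


def is_good(probe: ProbeResult, max_latency_s: float = 2.0) -> bool:
    # A probe counts as "good" only if the customer would have seen a timely success.
    # The 2-second threshold is an assumed value, not one taken from the source.
    return 200 <= probe.status_code < 400 and probe.latency_s <= max_latency_s


def availability_sli(probes: list[ProbeResult]) -> float:
    """Fraction of probes that were good; raises if there is nothing to measure."""
    if not probes:
        raise ValueError("no probe results in the window")
    good = sum(1 for p in probes if is_good(p))
    return good / len(probes)


probes = [
    ProbeResult(200, 0.12),
    ProbeResult(200, 0.31),
    ProbeResult(503, 0.05),  # upstream failure: bad from the customer's side
    ProbeResult(200, 4.80),  # succeeded, but too slow to count as working
]
print(availability_sli(probes))  # 0.5
```

Because the verdict depends only on what the caller observed, a failure in a dependency the service does not monitor still lowers the number, which a self-reported health check could miss.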

What the numbers showed

Measured reliability came in closer to four nines (99.99%) than to the 99% written into the SLA. That left room to tighten the internal SLO later while keeping negotiating slack on the SLA if customers pushed for more.
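
To make the gap between the two figures concrete, the downtime each target permits can be worked out directly. This is a small sketch that assumes a 30-day measurement window; the source does not say which window the SLA uses, and `allowed_downtime_minutes` is a hypothetical helper.

```python
MINUTES_PER_30_DAYS = 30 * 24 * 60  # 43,200 minutes in the assumed 30-day window


def allowed_downtime_minutes(target: float, window_minutes: int = MINUTES_PER_30_DAYS) -> float:
    """Downtime budget implied by an availability target over the window."""
    return (1.0 - target) * window_minutes


for label, target in [("99% SLA", 0.99), ("four nines", 0.9999)]:
    print(f"{label}: {allowed_downtime_minutes(target):.2f} min per 30 days")
```

Under that assumption, 99% allows 432 minutes (7.2 hours) of downtime per window, while four nines allows about 4.3 minutes. The roughly hundredfold difference is the slack described above.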

My role: solo · ~1 week to first working pipeline · a second week of monitoring, refinement, demos
Team: 8 backend + 5 data engineers
Socialised via ADRs + demos at each stage