← Blog Posts
June 30, 2026 4 min read Building with AI

Adding a coaching framework to my health database

The data warehouse has been running for about six months, but it holds almost two years of data. Apple Watch metrics, APAP session data, nutrition logs, workouts, labs. Thousands of data points landing in SQLite every night. Reports for HRV, sleep staging, VO2 max, resting heart rate, training load.

Lots of numbers. No opinion.

The missing piece was judgment. A framework that looks at all of it and says: here’s what matters, here’s where you’re off, here’s what to do about it. I wanted a coach, not another dashboard.

I’d been reading Peter Attia’s Outlive, which lays out what he calls Medicine 3.0: instead of waiting for disease and treating the population average, you measure your own data against known risk factors and act early on the highest-leverage interventions. He identifies four conditions that kill most people (cardiovascular disease, cancer, neurodegeneration, and metabolic disease), and five levers to pull against them: Exercise, Nutrition, Sleep, Emotional health, and what he calls exogenous molecules. Exercise has four sub-pillars of its own: Zone 2 cardio, VO2 max work, strength, and stability.

The actionable framework is well-documented publicly. You don’t need an LLM to summarize the book. The real work is mapping it onto your actual data and wiring it so it shows up automatically.

What I Built

Six pieces, all running off the existing health.db:

A coaching framework document (healthspan_framework.md) that maps each lever to my actual numbers: current value, target, status. Every quantitative target (the 80/20 Zone 2 split, the 4x4 interval protocol, 1 g/lb protein, the ApoB threshold) was verified against primary sources. For anything health-adjacent, that matters.

A reorganized goals file built around the five levers. Existing goals got tagged by lever. New ones added: Zone 2 cardio with weekly interval sessions, strength work, protein targets.

A living action plan (health_plan.md): the “what to actually do” layer. Weekly exercise template, dietary priorities ranked by leverage, and a single highest-leverage “one thing” field. Every prescription is written in plain language, including what the protocol is, what it feels like, and what heart rate to target. An unexplained prescription is just noise.

An auto-generated scorecard (scorecard.py producing scorecard.md). Each lever gets graded. Each metric shows current value vs target, a status, and a trend arrow vs the prior window. It pulls directly from the database and is never hand-edited. You can’t fudge a script.

An illness-watch tripwire, calibrated to my own history rather than a textbook rule. Wrist temperature barely moved during two past illnesses I could identify. The real tell both times was autonomic: HRV dropping and resting HR spiking together. The flag is built autonomic-first, with temperature as corroboration, and explicitly checks for confounders (alcohol, travel, hard training) before alerting.

A weekly automated routine that runs every Monday morning. It drains the data inbox, reimports everything, regenerates the reports and scorecard, reruns the illness watch, updates goal progress, and surfaces a tight weekly summary. The plan stays current with no manual work.

What the Numbers Actually Showed

This is where it got useful.

Running my data through the framework surfaced something the raw dashboards never made visible. My cardio was almost entirely Zone 2: walking, steady-state effort. A legitimate aerobic base. But no interval or high-intensity work.

That is exactly why my VO2 max had plateaued despite consistent activity. Zone 2 builds the aerobic base. VO2 max responds to peak-aerobic stress. Without the peak work, the ceiling doesn’t move.

VO2 max is one of the strongest single predictors of all-cause mortality. Moving from below-average to above-average for your age is associated with roughly a 70% reduction in all-cause mortality risk. That’s not a marginal gain.

The prescription: keep the walking, add one weekly 4x4 session. Four minutes hard, four minutes easy, four rounds. One 35-minute session a week. The highest-leverage change available.

The second gap was protein, well below the target Attia recommends. Less exciting, equally real.

The scorecard’s untracked cells were a finding too. Stability work, some diet categories: blank. Blank cells aren’t missing data. They’re the to-do list.

A Few Notes on the Design

The illness-watch calibration is the design principle I’d highlight most. The textbook signal was wrong for me specifically. My own history pointed at a better one. Calibrate to the individual, not the average.

The whole system is plain markdown files and Python scripts. No new infrastructure. An LLM and a human can both read and edit every piece of it.

Lab-dependent markers Attia emphasizes (ApoB, Lp(a), fasting insulin, DEXA, continuous glucose monitoring) are explicitly parked in the framework, flagged to raise with an actual doctor at a real appointment. The system is decision support. Not medical advice.

The data was already there. It just needed something to make it opinionated.