Recovery Score Explained: HRV, Sleep, and Soreness
Your recovery score should tell you when to push and when to deload — but most apps mash three different metrics into one number. Here's what each really measures.
Why a single “recovery score” is misleading
A single HRV recovery score looks clean: one number that tells you whether to train hard today. The problem is that most “recovery” algorithms blend variables that measure different things on different time scales, then present the output as a single biological truth. It isn’t.
Three common inputs—HRV, sleep, and soreness—do not track the same bottleneck:
- HRV primarily reflects autonomic balance (parasympathetic vs sympathetic), which is sensitive to psychological stress, illness, alcohol, and energy availability—not just training load.
- Sleep is a behavioral and physiological driver of recovery capacity, but sleep staging from wearables is noisy and easy to overinterpret.
- Soreness is a local tissue/peripheral signal (muscle damage, novelty, eccentric stress) that can be high even when systemic readiness is fine—or low when you’re systemically cooked.
When apps “mash” these into one number, two errors happen:
- False reassurance: decent sleep + low soreness can hide a sustained HRV suppression trend from life stress, illness, or under-fueling.
- False alarm: DOMS from a new movement can crush a score even though performance is intact and adaptation is underway.
A useful recovery score isn’t a magic predictor. It’s a decision aid that should tell you (1) what domain is off, (2) whether it’s a one-day blip or a trend, and (3) what to change in training or lifestyle.
What HRV actually measures (vagal tone, not “fatigue”)
HRV (typically rMSSD or a similar time-domain metric) is a proxy for parasympathetic modulation—often simplified as “vagal tone.” Higher HRV (relative to your baseline) usually indicates more parasympathetic influence and better ability to shift between stress and recovery states. Lower HRV suggests higher sympathetic drive, reduced parasympathetic modulation, or both.
HRV is not a direct fatigue meter
Calling HRV “fatigue” is sloppy. Training can lower HRV acutely, but so can:
- Sleep restriction
- Alcohol (often with elevated resting HR)
- Psychological stress
- Illness/inflammation
- Low carbohydrate availability or aggressive dieting (common in physique phases)
- Dehydration and heat stress
Even within training, HRV’s relationship with performance is context-dependent. In some athletes, HRV rebounds quickly while muscles are still sore; in others, HRV tanks more from life stress than from lifting volume. Reviews on HRV-guided training generally support using HRV trends to adjust endurance training load, but findings are more mixed for strength training, where readiness is more local/peripheral and skill-dependent (Plews 2013). The broader HRV-guided training literature also suggests that athletes split into “responders” and “non-responders.”
How to make HRV actually usable
- Measure at the same time (ideally upon waking) and same posture.
- Use a baseline (rolling 14–28 days is common) rather than comparing to population norms.
- Look at HRV + resting HR together. Low HRV with elevated resting HR is more concerning than low HRV alone.
For lifters, the best use of HRV is identifying systemic strain that can make heavy work feel unusually hard, reduce bar speed, or degrade coordination—especially during high-volume blocks, calorie deficits, or high-stress life periods.
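The baseline comparison described in this section can be sketched in a few lines of Python. This is a minimal illustration, not any app's actual method: the helper name `hrv_context`, the z-score cutoff of −1.0, and the 5 bpm resting-HR margin are assumptions chosen for the example.

```python
from statistics import mean, stdev

def hrv_context(rmssd_history, rhr_history, rmssd_today, rhr_today,
                z_cutoff=-1.0, rhr_margin_bpm=5.0):
    """Compare today's morning readings with a personal baseline.

    rmssd_history / rhr_history: prior morning readings (e.g. last 14-28 days).
    z_cutoff and rhr_margin_bpm are illustrative thresholds, not validated ones.
    """
    if len(rmssd_history) < 2 or not rhr_history:
        raise ValueError("need at least two HRV readings and one resting HR reading")
    sd = stdev(rmssd_history)
    if sd == 0:
        raise ValueError("baseline has no variability; z-score is undefined")

    # Personal z-score: how far today sits from your own baseline.
    z = (rmssd_today - mean(rmssd_history)) / sd
    hrv_low = z <= z_cutoff
    rhr_high = (rhr_today - mean(rhr_history)) >= rhr_margin_bpm

    if hrv_low and rhr_high:
        label = "low HRV with elevated resting HR"
    elif hrv_low:
        label = "low HRV alone"
    else:
        label = "no low-HRV flag"
    return round(z, 2), label
```

Returning the label alongside the z-score keeps the two signals visible together, so a low HRV reading is never read without its resting-HR context.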
Sleep quality: REM vs deep sleep vs total time (which matters)
Sleep is both a recovery input and a recovery output: better sleep improves training tolerance, and hard training can improve sleep—until load, stress, or timing pushes it the other way.
Total sleep time is the anchor variable
If a wearable says “poor recovery” because “deep sleep was low,” ignore that until total sleep time is handled. For strength athletes, the most consistent association is that short sleep reduces performance and increases perceived effort, and chronic restriction worsens mood, soreness perception, and adherence. A practical target is 7–9 hours, with many hard-training intermediates doing better at the upper end during accumulation phases.
Sleep stages: useful concept, unreliable measurement
Wearables estimate stages (REM, “deep,” light) indirectly from movement and heart signals. Accuracy is improving but still limited compared to polysomnography; treat stage numbers as rough estimates rather than precise measurements.
- REM: more linked to cognitive recovery, mood, learning, and emotional regulation. Poor REM often tracks stress, alcohol, and late sleep timing.
- Deep (slow-wave sleep): associated with physical restoration processes and growth hormone pulsatility, but your device’s “deep minutes” are not a lab measure of tissue repair.
- Continuity: frequent awakenings and low sleep efficiency matter because they fragment recovery even if total time looks okay.
What to prioritize for a recovery score
In an algorithm, give the most weight to variables with the best signal-to-noise ratio and the most consistent relationship to training tolerance:
- Total sleep time (highest priority)
- Sleep regularity (bed/wake timing consistency)
- Sleep efficiency (time asleep / time in bed)
- Stage estimates (lowest priority; use as tie-breakers)
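Two of the higher-priority variables above are simple to compute from raw logs. The sketch below is one possible way to do it; the function names are hypothetical, and measuring clock times from noon is an assumption made so that bedtimes on either side of midnight stay adjacent (it would not suit someone who sleeps across midday).

```python
from statistics import mean

def sleep_efficiency(minutes_asleep, minutes_in_bed):
    """Sleep efficiency in percent: time asleep / time in bed."""
    if minutes_in_bed <= 0:
        raise ValueError("minutes_in_bed must be positive")
    return 100.0 * minutes_asleep / minutes_in_bed

def minutes_after_noon(hhmm):
    """Convert 'HH:MM' to minutes since 12:00 so 23:30 and 00:30 stay adjacent."""
    hours, minutes = (int(part) for part in hhmm.split(":"))
    return ((hours - 12) % 24) * 60 + minutes

def timing_deviation_minutes(recent_times, todays_time):
    """Absolute gap between today's bed or wake time and the recent average."""
    recent = [minutes_after_noon(t) for t in recent_times]
    return abs(minutes_after_noon(todays_time) - mean(recent))

# Example: 7 h asleep out of 8 h in bed, and a bedtime later than usual.
print(sleep_efficiency(420, 480))
print(timing_deviation_minutes(["23:00", "23:30", "00:00"], "01:00"))
```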
Subjective soreness: the underrated signal everyone skips
Most lifters hate subjective metrics because they feel “unscientific.” That’s a mistake. A simple soreness rating (plus perceived fatigue) can outperform fancy biomarkers in day-to-day decision-making because it integrates the stuff sensors miss: local tissue stress, joint irritation, and “how the warm-ups feel.”
What soreness is (and isn’t)
- DOMS is not muscle growth. It correlates with novelty and eccentric load more than hypertrophy stimulus.
- Low soreness doesn’t guarantee readiness. CNS/systemic stress can be high while muscles feel fine.
- High soreness can still be trainable. You can often train around it by managing exercise selection, ROM, and loading.
Subjective ratings have decent validity when collected consistently. In sports science, session RPE and wellness questionnaires are widely used because they track accumulated strain and predict performance drops. For lifters, a 1–10 soreness scale per muscle group plus a general “overall fatigue” rating is more actionable than a generic “recovery” number.
Make soreness useful: local, not global
Global soreness (“I’m sore”) is vague. Better:
- Rate soreness for the muscle groups you’ll train today (e.g., quads, hams/glutes, chest, back, shoulders).
- Note whether soreness is muscular or joint/tendon (the latter is a bigger red flag).
- Track whether soreness is improving across warm-ups. If it drops significantly by your second ramp set, it’s usually manageable.
How to weight these 3 inputs (recommended ratios + why)
Weighting depends on the goal of the score. For serious lifters, the score should predict training tolerance and performance quality for the day, not just “health.” That argues for combining one systemic signal (HRV), one behavioral capacity signal (sleep), and one peripheral readiness signal (soreness) without letting any single input dominate.
| Input | What it mostly captures | Typical time scale | Recommended weight |
|---|---|---|---|
| HRV (plus resting HR context) | Autonomic/systemic stress, illness, under-recovery | 1–3 days (trend matters most) | 40% |
| Sleep (time + regularity + efficiency) | Recovery capacity and resilience to load | Same day + cumulative week | 35% |
| Subjective soreness (local) + fatigue | Peripheral tissue stress; movement tolerance | Same day (often) | 25% |
Why this split?
- HRV gets the biggest slice because it flags systemic stressors that can quietly accumulate and because it’s less “gameable” than self-report.
- Sleep is close behind because short/fragmented sleep reliably worsens performance and pain perception, and it often explains HRV drops.
- Soreness stays meaningful but capped because DOMS can be high for benign reasons (novel exercise, long eccentrics), and you don’t want the score to forbid training during productive blocks.
Adjustment rule for advanced lifters: during a calorie deficit or contest prep, shift weight away from soreness and toward the systemic inputs (e.g., 45/35/20 for HRV/sleep/soreness), because systemic recovery margins shrink and soreness perception becomes noisier with low energy availability. The dieting literature consistently shows performance and recovery costs when energy availability is low (see Helms 2014 for physique dieting context).
Trends > daily scores (rolling 7-day average rule)
Single-day scores are fragile. HRV is noisy. Sleep varies. Soreness fluctuates with exercise selection. The fix is to treat a daily number as a data point and make decisions off trends.
The rolling 7-day rule
- Use a 7-day rolling average for HRV and sleep duration/efficiency.
- Compare it to your 28-day baseline (or at least 21 days) rather than last week’s best day.
- Act on sustained deviations: three or more consecutive days below baseline matter more than one bad day.
Practical interpretation for lifters:
- One-day dip (score down 10–20 points): usually proceed with the plan but cap top sets or volume if warm-ups feel heavy.
- Three-day dip (score down 15–30 points with HRV trend down): reduce volume 20–40% or drop intensity 5–10% for 1–3 sessions.
- Week-long suppression (7-day average down meaningfully): plan a deload or pivot to lower-stress variations and more sleep/food.
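The rolling-average rule from this section can be sketched as a small helper. It is an illustration under stated assumptions: `trend_summary` is a hypothetical name, the series is assumed to be one value per day with the oldest first, and for simplicity the 28-day baseline includes the most recent 7 days.

```python
from statistics import mean

def trend_summary(daily_values, short_window=7, baseline_window=28):
    """Summarize a daily series (oldest first), e.g. morning rMSSD."""
    if len(daily_values) < baseline_window:
        raise ValueError("not enough history for the baseline window")

    baseline = mean(daily_values[-baseline_window:])
    rolling = mean(daily_values[-short_window:])

    # Count how many of the most recent days in a row sit below baseline.
    days_below = 0
    for value in reversed(daily_values):
        if value < baseline:
            days_below += 1
        else:
            break

    return {
        "baseline": baseline,
        "rolling": rolling,
        "days_below": days_below,
        "sustained": days_below >= 3,  # the "3+ consecutive days" rule
    }
```

A single bad morning leaves `sustained` false, while three low mornings in a row flip it, which matches the idea of acting on trends rather than single readings.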
Red flags: 3 patterns that signal genuine overtraining
Actual overtraining syndrome is rare in recreational lifters; functional overreaching is common. The point of a recovery score is catching the slide before performance and joints fall apart.
Pattern 1: HRV down, resting HR up, sleep getting worse
This combo often indicates high allostatic load: training plus life stress, illness, or under-fueling. If it persists for 4–7 days and performance drops (bar speed, reps at a given load), treat it seriously.
Pattern 2: Soreness becomes “sticky” and local aches appear
DOMS that normally resolves in 24–72 hours starts lasting 4–5 days, and it shifts from muscle belly soreness to tendon/joint irritation (elbow, patellar, Achilles, anterior shoulder). That’s often a programming error (too much eccentric stress, not enough variation, or load jumps) more than a “recovery” problem.
Pattern 3: Performance regression across multiple lifts despite normal effort
If loads that were routine at RPE 7–8 suddenly feel like RPE 9 across multiple movement patterns for more than a week, and the recovery trend data is negative, that’s the time to deload. A single lift regressing can just be technique, setup, or a localized issue.
How Apex Fitness computes the score (transparent math, no black box)
Apex Fitness’s HRV recovery score should be interpretable. The score below is a transparent model that matches how a serious lifter should think: normalize each input to your baseline, apply conservative caps, then weight and combine.
Step 1: Normalize each input to a 0–100 sub-score
HRV sub-score (0–100) uses morning rMSSD against a 28-day baseline:
- Compute z-score: (today rMSSD − 28-day mean) / 28-day SD
- Map z to points with a cap to prevent overreacting to noise:
  - z ≤ −1.5 → 0–20 points
  - z = −1.0 → ~35 points
  - z = 0 → 70 points
  - z ≥ +1.0 → 85–95 points
- Apply a penalty if resting HR is elevated vs baseline (e.g., +5 bpm above 28-day mean reduces HRV sub-score by 10 points; +10 bpm reduces by 20).
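One way to turn the HRV mapping above into code is piecewise-linear interpolation between anchor points. The sketch below is an interpretation, not a confirmed implementation: the interior anchors come from the list above, but the end anchors (z = −2.5 → 0 and z = +2.0 → 95) and the step-wise resting-HR penalty are assumptions made to pin down the stated ranges.

```python
from bisect import bisect_right

# (z, points). The two outermost anchors are assumptions; the rest follow the text.
ANCHORS = [(-2.5, 0.0), (-1.5, 20.0), (-1.0, 35.0), (0.0, 70.0), (1.0, 85.0), (2.0, 95.0)]
Z_VALUES = [z for z, _ in ANCHORS]

def hrv_points(z):
    """Interpolate points linearly between anchors; clamp outside them."""
    if z <= Z_VALUES[0]:
        return ANCHORS[0][1]
    if z >= Z_VALUES[-1]:
        return ANCHORS[-1][1]
    i = bisect_right(Z_VALUES, z)
    (z0, p0), (z1, p1) = ANCHORS[i - 1], ANCHORS[i]
    return p0 + (p1 - p0) * (z - z0) / (z1 - z0)

def rhr_penalty(rhr_delta_bpm):
    """Step penalty (assumed): +5 bpm costs 10 points, +10 bpm costs 20."""
    if rhr_delta_bpm >= 10:
        return 20.0
    if rhr_delta_bpm >= 5:
        return 10.0
    return 0.0

def hrv_sub_score(z, rhr_delta_bpm=0.0):
    """HRV sub-score on a 0-100 scale after the resting-HR penalty."""
    return max(0.0, hrv_points(z) - rhr_penalty(rhr_delta_bpm))
```

Clamping at the end anchors is what provides the cap: an unusually high or low morning reading cannot move the sub-score past 95 or below 0.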
Sleep sub-score (0–100) prioritizes total sleep time and regularity:
- Total sleep time:
  - < 6:00 → 0–30 points
  - 6:00–7:00 → 40–60 points
  - 7:00–9:00 → 70–100 points (scaled)
  - > 9:30 → cap at 95 (long sleep can reflect illness or debt)
- Regularity penalty: a bedtime or wake time more than 90 minutes off your 7-day average subtracts 5–10 points.
- Efficiency penalty: sleep efficiency below 85% subtracts 5–10 points.
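The sleep bands and penalties above could be coded roughly as follows. Several details are assumptions because the text leaves them open: points scale linearly inside each band, sleep between 9:00 and 9:30 scores 100, and each penalty defaults to the low end (5 points) of its 5–10 range.

```python
def sleep_time_points(hours):
    """Map total sleep time (hours) to points using the bands in the text."""
    if hours < 6.0:
        return 30.0 * max(hours, 0.0) / 6.0      # 0-30, linear (assumed)
    if hours < 7.0:
        return 40.0 + 20.0 * (hours - 6.0)       # 40-60
    if hours <= 9.0:
        return 70.0 + 30.0 * (hours - 7.0) / 2.0  # 70-100, scaled
    if hours <= 9.5:
        return 100.0                              # gap in the text (assumed)
    return 95.0                                   # long-sleep cap

def sleep_sub_score(hours, timing_deviation_min, efficiency_pct,
                    regularity_penalty=5.0, efficiency_penalty=5.0):
    """Sleep sub-score on a 0-100 scale with regularity and efficiency penalties."""
    points = sleep_time_points(hours)
    if timing_deviation_min > 90:
        points -= regularity_penalty
    if efficiency_pct < 85.0:
        points -= efficiency_penalty
    return max(0.0, points)
```

The jumps at 6:00 and 7:00 (30 to 40, 60 to 70) are deliberate here because they follow the bands as written; a smoother curve would be an equally reasonable design.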
Soreness sub-score (0–100) is based on local ratings and a simple rule: soreness is informative but not allowed to “veto” training by itself.
- Rate target muscle group soreness 0–10.
- Map: 0–2 → 90–100 points; 3–5 → 70–85; 6–7 → 50–65; 8–10 → 25–45.
- If soreness is reported as tendon/joint pain rather than muscle soreness, cap at 60 until addressed.
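A minimal sketch of the soreness mapping, assuming whole-number ratings and linear scaling within each band (so the least sore rating in a band gets the band's highest points). The function name and the `joint_or_tendon` flag are hypothetical.

```python
# (lowest rating, highest rating, points at lowest, points at highest)
BANDS = [
    (0, 2, 100.0, 90.0),
    (3, 5, 85.0, 70.0),
    (6, 7, 65.0, 50.0),
    (8, 10, 45.0, 25.0),
]

def soreness_sub_score(rating, joint_or_tendon=False):
    """Map a 0-10 soreness rating for the target muscle group to 0-100 points."""
    if not isinstance(rating, int) or not 0 <= rating <= 10:
        raise ValueError("rating must be a whole number from 0 to 10")
    for low, high, top, bottom in BANDS:
        if low <= rating <= high:
            points = top - (top - bottom) * (rating - low) / (high - low)
            break
    # Tendon/joint pain is capped at 60 regardless of the rating.
    if joint_or_tendon:
        points = min(points, 60.0)
    return points
```

Because the lowest band bottoms out at 25 rather than 0, even a maximal soreness rating cannot zero out the sub-score, which is how this sketch reflects the rule that soreness should not veto training by itself.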
Step 2: Combine with the recommended weights
| Component | Weight | Calculation |
|---|---|---|
| HRV | 0.40 | 0.40 × HRV sub-score |
| Sleep | 0.35 | 0.35 × Sleep sub-score |
| Soreness | 0.25 | 0.25 × Soreness sub-score |
Recovery Score = (0.40×HRV) + (0.35×Sleep) + (0.25×Soreness), rounded to the nearest whole number.
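As a worked example of the formula, the sketch below combines three sub-scores with the default weights, and also with the 45/35/20 split suggested earlier for calorie deficits. The dictionary keys and function name are illustrative. Rounding uses floor(x + 0.5) because Python's built-in `round` rounds exact halves to the nearest even number, which may not be what "nearest whole number" is meant to imply.

```python
import math

DEFAULT_WEIGHTS = {"hrv": 0.40, "sleep": 0.35, "soreness": 0.25}
DEFICIT_WEIGHTS = {"hrv": 0.45, "sleep": 0.35, "soreness": 0.20}

def recovery_score(sub_scores, weights=DEFAULT_WEIGHTS):
    """Weighted sum of 0-100 sub-scores, rounded to the nearest whole number."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    total = sum(weights[name] * sub_scores[name] for name in weights)
    return math.floor(total + 0.5)

today = {"hrv": 50.0, "sleep": 75.0, "soreness": 90.0}
print(recovery_score(today))                   # 20 + 26.25 + 22.5 = 68.75 -> 69
print(recovery_score(today, DEFICIT_WEIGHTS))  # 22.5 + 26.25 + 18 = 66.75 -> 67
```

With a suppressed HRV sub-score and low soreness, the deficit weighting lowers the result by two points, showing how the shift makes the score lean harder on the systemic signal.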
Step 3: Add guardrails so one metric doesn’t lie to you
- Trend modifier: if 7-day HRV average is >1 SD below baseline, subtract 5–10 points regardless of today’s HRV.
- Illness flag: if resting HR is high and sleep is long but low quality, show a warning rather than telling you to “push.”
- Training context: heavy eccentric blocks (RDLs, Bulgarian split squats, long-length work) should raise expected soreness; the score should be interpreted with that context.
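The first two guardrails can be sketched as a post-processing step on the combined score. The thresholds here are assumptions, since the text does not define them: the trend penalty scales from 5 points at 1 SD below baseline to 10 points at 2 SD, and the illness flag treats "high" resting HR as 5+ bpm over baseline, "long" sleep as more than 9 hours, and "low quality" as efficiency under 85%.

```python
def apply_guardrails(score, hrv_7day_avg, hrv_baseline_mean, hrv_baseline_sd,
                     rhr_delta_bpm, sleep_hours, sleep_efficiency_pct):
    """Return (adjusted score, list of warnings). Thresholds are illustrative."""
    if hrv_baseline_sd <= 0:
        raise ValueError("hrv_baseline_sd must be positive")
    warnings = []

    # Trend modifier: 7-day HRV average more than 1 SD below baseline.
    sd_below = (hrv_baseline_mean - hrv_7day_avg) / hrv_baseline_sd
    if sd_below > 1.0:
        penalty = min(10.0, 5.0 + 5.0 * (sd_below - 1.0))
        score -= penalty
        warnings.append("7-day HRV trend is suppressed")

    # Illness flag: warn instead of encouraging a hard session.
    if rhr_delta_bpm >= 5 and sleep_hours > 9.0 and sleep_efficiency_pct < 85.0:
        warnings.append("possible illness: elevated resting HR with long, low-quality sleep")

    return max(0.0, score), warnings
```

The training-context guardrail is left out on purpose: it is described as interpretive context rather than a numeric rule, so encoding it would require inventing details the text does not give.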
Action items: what to actually change when your score drops
A recovery score is only useful if it changes decisions. The decision should target the input that’s driving the drop and the type of session planned.
If HRV is down (trend) but sleep is fine
- Reduce intensity exposure: replace 1–3 top sets at RPE 8–9 with back-off volume at RPE 6–7, or cap at ~75–85% 1RM for the day.
- Keep frequency, lower stress: keep the habit but swap to lower-skill variations (e.g., high-bar squat instead of comp low-bar; dumbbell press instead of heavy barbell bench).
- Check non-training stressors: alcohol, under-eating (especially carbs), dehydration, and late caffeine are common HRV killers.
If sleep is short/fragmented
- Lower volume 20–40% for that session (fewer hard sets), especially on big lower-body days.
- Avoid grinders: keep reps in reserve (2–4 RIR) and avoid true failure work.
- Move training earlier if late sessions are pushing bedtime; late high-intensity work can worsen sleep onset for some lifters.
If soreness is high in the target muscles
- Adjust exercise selection: pick movements with less stretch-mediated damage (e.g., leg press over deep SSB squats; machine fly over deep DB fly).
- Reduce ROM strategically for one session (partials or controlled ROM) if soreness is limiting bracing or technique.
- Keep some loading if it’s muscle soreness, not pain: light-to-moderate work (RPE 5–7) often reduces perceived soreness via increased blood flow and movement.
Simple decision table (what to do today)
| Score / pattern | Recommended session change |
|---|---|
| 80–100 and stable trends | Run the plan; keep top sets as programmed |
| 65–79 or one-day dip | Keep intensity but cut 1–2 hard sets per lift, or cap at RPE 8 |
| 50–64 for 2–3 days | Reduce volume 20–40% and avoid failure; consider swapping to lower-fatigue variations |
| < 50 or 7-day trend clearly down | Deload (volume down 40–60%, intensity down 5–10%) for 3–7 days; prioritize sleep and calories |
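The table can be read as an ordered set of checks, most severe first. The sketch below is one interpretation: the function and parameter names are hypothetical, a 50–64 score that has not yet lasted two days is treated like a one-day dip, and a clearly declining 7-day trend triggers the deload row even when today's score is high.

```python
def session_change(score, consecutive_days_50_64=0, trend_clearly_down=False):
    """Pick a row from the decision table for today's session."""
    if score < 50 or trend_clearly_down:
        return "deload 3-7 days: volume down 40-60%, intensity down 5-10%"
    if 50 <= score < 65 and consecutive_days_50_64 >= 2:
        return "reduce volume 20-40% and avoid failure"
    if score < 80:
        return "keep intensity; cut 1-2 hard sets per lift or cap at RPE 8"
    return "run the plan as programmed"

# consecutive_days_50_64 counts today, so 2 means today plus yesterday.
print(session_change(85))
print(session_change(60, consecutive_days_50_64=2))
print(session_change(90, trend_clearly_down=True))
```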
Important nuance: a low score doesn’t automatically mean “rest day.” For intermediates chasing skill and hypertrophy, consistency matters. The better default is train, but reduce the cost—fewer hard sets, fewer grinders, more machine work, and better warm-up/autoregulation.
Apex Fitness is most useful when the recovery score is treated as a dashboard: HRV trend for systemic stress, sleep for capacity, and soreness for local readiness, alongside the actual session plan and performance logs, so training adjustments are made with context instead of chasing a single number.