Cossatot River — Opus Supplement: 2026-05-20

Author: Opus 4.7 (retrospective, post-fix)

Purpose: Cross-check the nightly qwen3.6:27b analysis for 2026-05-19, which classified that day's 4.74 ft prediction as a false_positive and blamed QPE data quality. After porting Richland's build_baseline fix to cossatot_predict.py (cossatot_predict.py.bak.20260520_baseline_fix), we replayed the five worst archived errors through the fixed code to separate the ones caused by the Richland pathology from genuinely different failure modes.

Replay table

Per-prediction errors, settled against actuals, after replaying through the fixed build_baseline. The RISING branch now holds at current_height until band_end = max over bands of (arrival_center + dispersion/2), then transitions to exponential recession, with no trend extrapolation and no +2.0 cap.

generated_at (UTC)   OLD peak   NEW peak   ACTUAL    OLD err (ft)   NEW err (ft)   Verdict
2026-05-20T06:09     5.28 ft    3.33 ft    3.29 ft   +1.99          +0.04          Pathology, fixed
2026-05-20T05:09     5.23 ft    3.40 ft    3.30 ft   +1.93          +0.10          Pathology, fixed
2026-05-20T04:09     4.74 ft    3.38 ft    3.30 ft   +1.44          +0.08          Pathology, fixed
2026-05-19T21:09     3.49 ft    3.51 ft    2.34 ft   +1.15          +1.17          Different problem
2026-05-19T20:09     3.45 ft    3.43 ft    2.32 ft   +1.13          +1.11          Different problem

The three dawn errors collapsed to within 0.10 ft of the actuals. The two evening peaks moved by only 0.02 ft, so the fix is irrelevant to them.
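The error columns can be rechecked from the peaks with a few lines of Python. This is only a sketch: the rows are copied from the replay table, while the function name signed_errors and the extra "shift" column (NEW peak minus OLD peak) are illustrative and not part of the predictor code.

```python
# Rows copied from the replay table: (generated_at, old_peak, new_peak, actual), all in ft.
ROWS = [
    ("2026-05-20T06:09", 5.28, 3.33, 3.29),
    ("2026-05-20T05:09", 5.23, 3.40, 3.30),
    ("2026-05-20T04:09", 4.74, 3.38, 3.30),
    ("2026-05-19T21:09", 3.49, 3.51, 2.34),
    ("2026-05-19T20:09", 3.45, 3.43, 2.32),
]


def signed_errors(rows):
    """Return (generated_at, old_err, new_err, shift) rounded to 0.01 ft.

    old_err / new_err are peak minus actual; shift is how far the fix
    moved the predicted peak (new_peak - old_peak).
    """
    out = []
    for stamp, old_peak, new_peak, actual in rows:
        old_err = round(old_peak - actual, 2)
        new_err = round(new_peak - actual, 2)
        shift = round(new_peak - old_peak, 2)
        out.append((stamp, old_err, new_err, shift))
    return out


for stamp, old_err, new_err, shift in signed_errors(ROWS):
    print(f"{stamp}  old {old_err:+.2f}  new {new_err:+.2f}  shift {shift:+.2f}")
```

A large negative shift marks a row the baseline fix actually changed; a shift near zero marks a row whose error comes from somewhere else.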

Two distinct failure modes

Mode 1 — Baseline RISING branch double-counted band UH (2026-05-20 dawn)

The archived 04:09 UTC record shows the diagnosis directly:

t (hr)   predicted_ft   source
+0.5     3.14           band_5
+1.5     3.21           band_5
+2.5     3.34           band_5
+3.5     3.58           baseline
+4.5     3.82           baseline
+5.5     4.06           baseline
+6.5     4.30           baseline
+7.5     4.54           baseline

Bands 4 and 5 had arrival_center_hours_from_now = -1.72 and -0.28 respectively, so their contributions were almost fully in the past and the band UH faded out after about hour 3. From that point on, the OLD code's baseline(t) = current + trend_rate * t ramped linearly at +0.24 ft/hr until clamping at current + 2.0 = 4.74 ft. The predicted rise was purely an artifact of the baseline branch.
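The ramp can be reproduced from the formula quoted above. The sketch below assumes current = 2.74 ft, a value inferred from the 4.74 ft clamp minus the 2.0 ft cap rather than read from the archive, and the function name old_rising_baseline is illustrative, not the name used in cossatot_predict.py.

```python
def old_rising_baseline(current_ft, trend_ft_per_hr, t_hr, cap_ft=2.0):
    """Linear trend extrapolation clamped at current + cap.

    Mirrors the OLD behaviour as described in the text:
    baseline(t) = current + trend_rate * t, limited to current + 2.0.
    """
    return min(current_ft + trend_ft_per_hr * t_hr, current_ft + cap_ft)


CURRENT_FT = 2.74        # assumption: inferred from 4.74 ft clamp - 2.0 ft cap
TREND_FT_PER_HR = 0.24   # from the archived 04:09 UTC record

# The baseline rows of the archived record, plus one step past the clamp.
for t in (3.5, 4.5, 5.5, 6.5, 7.5, 8.5):
    height = old_rising_baseline(CURRENT_FT, TREND_FT_PER_HR, t)
    print(f"+{t:.1f} hr  {height:.2f} ft")
```

With these inputs the sketch gives 3.58, 3.82, 4.06, 4.30 and 4.54 ft for hours +3.5 through +7.5, matching the baseline rows, and sits at the 4.74 ft clamp by hour +8.5.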

Under the fix, band_end = max(-0.28 + 5/2, -1.72 + 4/2) = 2.22 hr. After that, the baseline transitions to recession using the basin's recently fitted k = 0.044, h_base = 2.40. Peak = current_height + dominant band UH overlap ≈ 3.38 ft, which is within 0.08 ft of the actual.
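A minimal sketch of the hold-then-recede shape follows. It is not the code in cossatot_predict.py: the function names are hypothetical, the recession form h_base + (h0 - h_base) * exp(-k * dt) with k per hour is an assumed parameterization, and current = 2.74 ft is inferred from the 4.74 ft clamp. Only the baseline is modeled; the band UH contribution that lifts the peak to about 3.38 ft is not.

```python
import math


def band_end_hr(bands):
    """Latest hour (from now) at which any band is still arriving.

    bands: iterable of (arrival_center_hr, dispersion_hr) pairs.
    """
    return max(center + dispersion / 2 for center, dispersion in bands)


def fixed_rising_baseline(t_hr, current_ft, bands, k, h_base):
    """Hold at current_ft until band_end, then recede toward h_base."""
    # Assumption: if every band ended in the past, recession starts now.
    end = max(band_end_hr(bands), 0.0)
    if t_hr <= end:
        return current_ft
    # Assumed exponential recession form, k in 1/hr.
    return h_base + (current_ft - h_base) * math.exp(-k * (t_hr - end))


BANDS = [(-1.72, 4.0), (-0.28, 5.0)]  # bands 4 and 5 from the 04:09 UTC record
CURRENT_FT = 2.74                     # assumption: 4.74 ft clamp - 2.0 ft cap

print(f"band_end = {band_end_hr(BANDS):.2f} hr")
for t in (0.5, 2.5, 4.5, 7.5):
    h = fixed_rising_baseline(t, CURRENT_FT, BANDS, k=0.044, h_base=2.40)
    print(f"+{t:.1f} hr  {h:.2f} ft")
```

The point of the shape is that the baseline can never exceed current_height, so any predicted rise has to come from the band contributions.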

Mode 2 — Band response over-predicted from the rainfall input (2026-05-19 evening)

The 21:09 record has all five bands firing with substantial contributions: band 2 = 2.09 ft, band 3 = 1.07 ft, band 4 = 0.63 ft, band 5 = 0.50 ft. Summed across the UH integration, these give a predicted peak of about 3.51 ft. The actual peak was 2.34 ft: a rise of only 0.09 ft from the 2.25 ft baseline, against a predicted rise of roughly 1.2 ft.

This is NOT the baseline pathology. The fix's "hold then recede" baseline is the right shape; the problem is that the band contributions themselves were too large. Possible causes (not distinguishable from this one record):

Yesterday evening's analyzer run hit a worse version of this on 2026-05-19 and correctly classified the day as false_positive, but it declined to adjust coefficients because it could not tell which input was at fault. That restraint was the right call: adjusting coefficients to fit one ambiguous event could degrade the model on cleaner events.

What changes now

  1. The Mode-1 pathology is gone. cossatot_predict.py is now structurally identical to richland_predict.py in the RISING / recession-transition logic. Next time bands contribute mostly-past content with no fresh rain arriving, the predictor will hold at current_height and roll into recession instead of fabricating a 2 ft rise.
  2. Mode 2 is left to the nightly analyzer. With 3 of the 5 worst archived errors no longer in the dataset, the residual bias should move closer to zero, and the analyzer can focus on the events whose coefficients are genuinely off. Re-run python3 /home/dave/evaluate_predictors.py after a few more event days to see whether Mode 2 is rare or systematic.
  3. If Mode 2 turns out to be systematic, two concrete probes:
     - Compare gauge-precip ground truth to Band 1 QPE on dry-gauge days to quantify the QPE-vs-gauge bias.
     - Tighten the moisture multiplier. WET × 1.5 may be too aggressive when the 7-day total is driven by a single big day rather than steady accumulation; the rolling total could be replaced with a decay-weighted index.
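One way the decay-weighted index could look is sketched below. Everything here is hypothetical: the function names, the daily decay factor of 0.85, and the sample rainfall series are illustrative choices, not values from the predictor or its archive.

```python
def rolling_total(daily_precip):
    """Plain 7-day sum, the kind of total the moisture multiplier is described as using."""
    return sum(daily_precip)


def decay_weighted_index(daily_precip, decay=0.85):
    """Weighted sum where older days count less.

    daily_precip is ordered oldest to newest; the newest day has weight 1,
    the day before it weight `decay`, and so on.
    """
    n = len(daily_precip)
    return sum(p * decay ** (n - 1 - i) for i, p in enumerate(daily_precip))


# Two made-up weeks with the same total rainfall depth.
single_big_day = [3.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   # one storm, six days ago
steady = [0.5] * 7                                     # the same depth spread evenly

for name, series in (("single big day", single_big_day), ("steady", steady)):
    print(f"{name}: total {rolling_total(series):.2f}, "
          f"index {decay_weighted_index(series):.2f}")
```

Both series have the same rolling total, but the index for the old single storm comes out lower than for steady accumulation. A big day that fell yesterday would still score high, so the decay factor would need to be fitted against observed responses before replacing the current rule.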

Files referenced
