Stop Using Brier Score Wrong

Brier score hides the most important distinction in probability evaluation. Here's the decomposition every ML practitioner should know.

Apr 04, 2026


My previous post showed something uncomfortable. Platt scaling and isotonic regression, the two most common calibrators, degraded strong models across 30 datasets. Platt improved log-loss in only 49.8% of cases. A coin flip.

Continue reading this post for free, courtesy of Valeriy Manokhin.


Valeriy’s Substack
