Stop Using Brier Score Wrong
Brier score hides the most important distinction in probability evaluation. Here's the decomposition every ML practitioner should know.
My previous post showed something uncomfortable. Platt scaling and isotonic regression, the two most common calibrators, degraded strong models across 30 datasets. Platt improved log-loss in only 49.8% of cases. A coin flip.


