Calibration Curve — Claimed vs Observed Confidence
Claimed confidence (what the model said)
Observed hit-rate (what actually happened)
A well-calibrated engine has the blue bar reaching the amber line:
when it claims 70% confidence, outcomes prove true ~70% of the time.