Reliability degradation
Practitioners of deep learning often assume that test data and training data share the same distribution. Unfortunately, this assumption does not always hold in practice. The world evolves, and data generated in the future is often out-of-distribution (OOD). Consequently, as the context changes, the in-distribution assumption becomes less realistic, and the reliability of our predictions and uncertainty estimates degrades with it (Fort, Hu, and Lakshminarayanan 2019; Nalisnick et al. 2019; Ovadia et al. 2019). In fact, predictive performance can decrease while measures of confidence increase, resulting in silent failures.
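This silent-failure mode can be made concrete with a minimal sketch. The toy setup below is hypothetical (the distributions, learning rate, and shift are all illustrative choices, not from the source): a 1-D logistic classifier is fit on in-distribution data, and then evaluated on a shifted test set whose inputs land far from the training data. Accuracy collapses while the model's reported confidence stays near 1.

```python
import numpy as np

rng = np.random.default_rng(0)

# In-distribution data: class 0 ~ N(-2, 1), class 1 ~ N(+2, 1).
def sample(n, mu0, mu1, rng):
    x = np.concatenate([rng.normal(mu0, 1.0, n), rng.normal(mu1, 1.0, n)])
    y = np.concatenate([np.zeros(n), np.ones(n)])
    return x, y

x_tr, y_tr = sample(500, -2.0, 2.0, rng)

# Fit logistic regression p(y=1|x) = sigmoid(w*x + b) by gradient descent.
w, b = 0.0, 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(w * x_tr + b)))
    g = p - y_tr                        # gradient of log-loss wrt logits
    w -= 0.1 * np.mean(g * x_tr)
    b -= 0.1 * np.mean(g)

def evaluate(x, y):
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))
    pred = (p > 0.5).astype(float)
    acc = np.mean(pred == y)
    conf = np.mean(np.maximum(p, 1.0 - p))  # mean predicted confidence
    return acc, conf

# In-distribution test set: high accuracy, high confidence.
acc_id, conf_id = evaluate(*sample(500, -2.0, 2.0, rng))

# OOD test set: the world has changed, and class 0 now appears at
# x ~ N(+4, 1), deep inside what the model learned as the class-1 region.
x_ood = rng.normal(4.0, 1.0, 500)
y_ood = np.zeros(500)
acc_ood, conf_ood = evaluate(x_ood, y_ood)

print(f"in-dist: acc={acc_id:.2f}  mean confidence={conf_id:.2f}")
print(f"OOD:     acc={acc_ood:.2f}  mean confidence={conf_ood:.2f}")
```

On the shifted test set the classifier is almost always wrong, yet its mean confidence is higher than on in-distribution data, since the shifted inputs lie far from the decision boundary: exactly the combination of falling performance and rising confidence described above.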