An information theoretic approach to uncertainty - AWS Prescriptive Guidance

The explanation of uncertainty in the previous section relies only on the variance notion of uncertainty, but information-theoretic notions of uncertainty also exist. Incorporating an information-theoretic measure of aleatoric uncertainty improves the robustness of the total uncertainty estimate (Gal 2016; Hein, Andriushchenko, and Bitterwolf 2019; van Amersfoort et al. 2020). Total uncertainty is measured by Shannon's entropy:

$$H(p) = -\langle p, \log p \rangle = -\sum_{k=1}^{K} p_k \log p_k$$

where $\langle \cdot, \cdot \rangle$ is the dot product operator and $K$ is the number of classes.
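As a minimal sketch of the entropy calculation, the following Python function computes the entropy of a single vector of class probabilities. The function name `shannon_entropy`, the use of the natural logarithm (so the result is in nats), and the example probability vectors are assumptions made for illustration; they are not taken from the guidance itself.

```python
import math


def shannon_entropy(p):
    """Return H(p) = -sum_k p_k * log(p_k) for a class-probability vector p.

    Assumption: natural logarithm, so the entropy is expressed in nats.
    """
    if abs(sum(p) - 1.0) > 1e-6:
        raise ValueError("probabilities must sum to 1")
    # Terms with p_k == 0 are skipped, following the convention 0 * log(0) = 0.
    return -sum(p_k * math.log(p_k) for p_k in p if p_k > 0.0)


# Hypothetical softmax outputs for K = 4 classes.
uniform = [0.25, 0.25, 0.25, 0.25]     # maximally uncertain prediction
confident = [0.97, 0.01, 0.01, 0.01]   # nearly certain prediction

print(shannon_entropy(uniform))    # equals log(4), the maximum for K = 4
print(shannon_entropy(confident))  # much closer to 0
```

The entropy is largest when the probability mass is spread evenly across the classes and approaches zero as the prediction concentrates on a single class.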

The predictive entropy $H(p)$ is available to both Bayesian and non-Bayesian neural networks. In order to decompose this total uncertainty into the epistemic and aleatoric components, you must estimate the mutual information $\mathrm{MI}(p, \theta)$, and this requires a Bayesian approach.

$$\mathrm{MI}(p, \theta) = H\left(\mathbb{E}_{\theta}[p]\right) - \mathbb{E}_{\theta}\left[H(p)\right]$$
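The decomposition can be sketched in Python under one common assumption: the Bayesian model is approximated by a finite set of sampled predictions (for example, from repeated stochastic forward passes), and mutual information is estimated as the entropy of the mean prediction minus the mean entropy of the individual predictions. The function names, the sampling setup, and the example inputs below are hypothetical and only illustrate the arithmetic.

```python
import math


def entropy(p):
    """Shannon entropy in nats of one class-probability vector."""
    return -sum(p_k * math.log(p_k) for p_k in p if p_k > 0.0)


def decompose_uncertainty(samples):
    """Split total uncertainty into aleatoric and epistemic parts.

    samples: list of T probability vectors, one per sampled parameter
    set theta (assumed to come from an approximate posterior).
    Returns (total, aleatoric, epistemic), where epistemic is the
    Monte Carlo estimate of the mutual information.
    """
    num_samples = len(samples)
    num_classes = len(samples[0])
    # Mean prediction across the sampled parameter sets.
    mean_p = [sum(s[k] for s in samples) / num_samples for k in range(num_classes)]
    total = entropy(mean_p)                                  # predictive entropy
    aleatoric = sum(entropy(s) for s in samples) / num_samples  # expected entropy
    epistemic = total - aleatoric                            # mutual information estimate
    return total, aleatoric, epistemic


# Every sample agrees that the input is ambiguous: uncertainty is aleatoric.
agree = [[0.5, 0.5], [0.5, 0.5]]
# Samples are individually confident but contradict each other: epistemic.
disagree = [[1.0, 0.0], [0.0, 1.0]]

print(decompose_uncertainty(agree))
print(decompose_uncertainty(disagree))
```

Both example inputs have the same total uncertainty, because their mean predictions are identical. Only the per-sample entropies reveal whether the uncertainty comes from the data or from disagreement among the sampled models, which is why a single deterministic prediction is not enough for this decomposition.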