FUNDAMENTALS
I. Maximum softmax probability (MSP): $g(x) = \max_{y' \in \mathcal{Y}} f_{y'}(x)$
II. Maximum logit: $g(x) = \max_{y' \in \mathcal{Y}} z_{y'}(x)$, with logits $z \in \mathbb{R}^K$
III. Negative entropy: $g(x) = \sum_{y' \in \mathcal{Y}} f_{y'}(x) \log f_{y'}(x)$
IV. Margin: $g(x) = \max_{y' \in \mathcal{Y}} f_{y'}(x) - \max_{y'' \in \mathcal{Y} \setminus y'} f_{y''}(x)$
V. Distance-based measures
• kNN distance: A 1D outlier score derived from the average distance
of the feature representation of $x$ to its $k$ nearest neighbors in the
training distribution.
• Mahalanobis distance [390]: The minimum distance of the feature
map (e.g., penultimate layer activations) of a test input to class-conditional
Gaussian distributions of the training data.
VI. Bayesian uncertainty estimation
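The logit- and feature-based CSFs listed above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in this thesis: the function names, the Euclidean metric and default `k` in the kNN score, and the shared (tied) covariance in the Mahalanobis score are assumptions made for the example. Negative entropy is computed here as the sum of f log f, so that, like the other logit-based scores, larger values indicate higher confidence; the two distance-based scores run the other way (larger means more outlying).

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def msp(logits):
    """I. Maximum softmax probability."""
    return softmax(logits).max(axis=-1)


def max_logit(logits):
    """II. Maximum logit."""
    return logits.max(axis=-1)


def negative_entropy(logits, eps=1e-12):
    """III. Sum of f log f, i.e. minus the Shannon entropy.

    eps is an assumed guard against log(0).
    """
    probs = softmax(logits)
    return (probs * np.log(probs + eps)).sum(axis=-1)


def margin(logits):
    """IV. Gap between the largest and second-largest softmax probability."""
    top2 = np.sort(softmax(logits), axis=-1)[..., -2:]
    return top2[..., 1] - top2[..., 0]


def knn_distance(features, train_features, k=5):
    """V. Average Euclidean distance to the k nearest training features."""
    dists = np.linalg.norm(
        features[:, None, :] - train_features[None, :, :], axis=-1
    )
    return np.sort(dists, axis=-1)[:, :k].mean(axis=-1)


def mahalanobis_distance(features, train_features, train_labels):
    """V. Minimum Mahalanobis distance to class-conditional Gaussians.

    ASSUMPTION: all classes share one covariance matrix, estimated from
    the training features after subtracting their class means.
    """
    classes = np.unique(train_labels)
    means = np.stack(
        [train_features[train_labels == c].mean(axis=0) for c in classes]
    )
    centered = train_features - means[np.searchsorted(classes, train_labels)]
    cov = centered.T @ centered / len(train_features)
    precision = np.linalg.pinv(cov)
    diffs = features[:, None, :] - means[None, :, :]  # (n, n_classes, d)
    squared = np.einsum("ncd,de,nce->nc", diffs, precision, diffs)
    return np.sqrt(np.maximum(squared, 0.0)).min(axis=-1)


# Toy logits: one confident row and one uniform (maximally uncertain) row.
example_logits = np.array([[4.0, 1.0, 0.0], [1.0, 1.0, 1.0]])
for csf in (msp, max_logit, negative_entropy, margin):
    print(csf.__name__, csf(example_logits))
```

On the toy input, every logit-based score ranks the first row above the uniform second row, which is the behavior expected of a CSF.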
Chapter 3 used MSP and negative entropy as CSFs, alongside various PUQ
methods for Bayesian uncertainty estimation. The other chapters used MSP, as it
is the most common CSF in practice and requires only logits as input. Using
CSFs also creates the need to evaluate their statistical quality alongside
task-specific predictive performance metrics, which is discussed next.
2.2.3 Evaluation Metrics
In an ideal world, the evaluation metric of interest would be the same as the loss
function used for training. This is rarely the case in practice: gradient-based
optimization requires a continuously differentiable function, while the metric of
interest is often non-differentiable, e.g., accuracy (the metric) vs. cross-entropy
(the training loss) in classification.
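The contrast between accuracy and cross-entropy can be made concrete with a small numerical sketch. The toy batch, the perturbation, and the function names below are hypothetical and chosen only for illustration: nudging a logit toward the correct class lowers the cross-entropy smoothly, while accuracy stays flat until a prediction actually flips, so it offers no gradient signal.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def cross_entropy(logits, labels):
    """Mean negative log-probability assigned to the true class."""
    probs = softmax(logits)
    return -np.log(probs[np.arange(len(labels)), labels]).mean()


def accuracy(logits, labels):
    """Fraction of rows whose arg-max matches the label."""
    return (logits.argmax(axis=-1) == labels).mean()


# Hypothetical toy batch: two examples, two classes, both already correct.
labels = np.array([0, 1])
base = np.array([[2.0, 0.0], [0.0, 1.0]])
# Small nudge of example 0 toward its correct class.
step = np.array([[0.1, 0.0], [0.0, 0.0]])

for scale in (0.0, 1.0, 2.0):
    logits = base + scale * step
    print(scale, accuracy(logits, labels), cross_entropy(logits, labels))
```

Across the three perturbation scales, the printed accuracy does not change, whereas the cross-entropy decreases with each step.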
Throughout our works, we have used (or extended) multiple predictive
performance, calibration, and robustness metrics, the most interesting of
which are outlined in turn.
Average Normalized Levenshtein Similarity (ANLS) is a metric introduced in [39]
for the evaluation of VQA, which was later extended [449] to support lists of
answers and to be invariant to the order of the provided answers. We adapted the
underlying Levenshtein Distance (LD) metric [251] to support not-answerable
questions, $\mathrm{NA}(G) = \mathbb{I}[\mathrm{type}(G) = \text{not-answerable}]$ (see Equation (2.7)).
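To make the metric concrete, the following is a minimal sketch of ANLS in its commonly used form, with a normalized-distance threshold of 0.5. Several parts are assumptions for illustration rather than the definition used in this thesis: the function names, the lowercasing and whitespace stripping, the encoding of a not-answerable question as a ground truth of `None`, and the rule that such a question scores 1 only when the prediction is empty. The list and order-invariance extension of [449] is not covered.

```python
def levenshtein(a, b):
    """Edit distance with unit-cost insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost)
            )
        previous = current
    return previous[-1]


def normalized_similarity(truth, prediction, threshold=0.5):
    """1 - NL, set to 0 once the normalized distance NL reaches the threshold."""
    # ASSUMPTION: case-insensitive comparison on stripped strings.
    truth, prediction = truth.strip().lower(), prediction.strip().lower()
    longest = max(len(truth), len(prediction))
    distance = levenshtein(truth, prediction) / longest if longest else 0.0
    return 1.0 - distance if distance < threshold else 0.0


def anls(ground_truths, predictions, threshold=0.5):
    """Average over questions of the best similarity to any accepted answer.

    ASSUMPTION: a not-answerable question has ground truth None and scores
    1.0 only for an empty prediction.
    """
    scores = []
    for answers, prediction in zip(ground_truths, predictions):
        if answers is None:  # NA(G) = 1
            scores.append(1.0 if prediction.strip() == "" else 0.0)
        else:
            scores.append(
                max(normalized_similarity(a, prediction, threshold) for a in answers)
            )
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical example: one near-miss answer and one not-answerable question.
print(anls([["hello", "hi"], None], ["hallo", ""]))
```

The threshold means that a prediction sharing little with every accepted answer receives no partial credit, while small OCR-style errors are penalized only in proportion to the edit distance.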