---
title: Accuracy
emoji: 🤗
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 3.19.1
app_file: app.py
pinned: false
tags:
- evaluate
- metric
description: >-
  Accuracy is the proportion of correct predictions among the total number of
  cases processed. It can be computed with:
  Accuracy = (TP + TN) / (TP + TN + FP + FN)
  Where:
  TP: True positive
  TN: True negative
  FP: False positive
  FN: False negative
---
# Metric Card for Accuracy

## Metric Description

Accuracy is the proportion of correct predictions among the total number of cases processed. It can be computed with:

    Accuracy = (TP + TN) / (TP + TN + FP + FN)

Where:

- TP: True positive
- TN: True negative
- FP: False positive
- FN: False negative
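As a quick illustration of the formula above, here is a minimal sketch in plain Python with made-up confusion counts:

```python
# Made-up confusion-matrix counts, for illustration only.
tp, tn, fp, fn = 40, 45, 5, 10

# Accuracy = correct predictions / all predictions
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(accuracy)  # 0.85
```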
## How to Use

At minimum, this metric requires predictions and references as inputs.

```python
>>> accuracy_metric = evaluate.load("accuracy")
>>> results = accuracy_metric.compute(references=[0, 1], predictions=[0, 1])
>>> print(results)
{'accuracy': 1.0}
```
### Inputs

- **predictions** (`list` of `int`): Predicted labels.
- **references** (`list` of `int`): Ground truth labels.
- **normalize** (`boolean`): If set to False, returns the number of correctly classified samples. Otherwise, returns the fraction of correctly classified samples. Defaults to True.
- **sample_weight** (`list` of `float`): Sample weights. Defaults to None. (A sketch combining all four inputs follows this list.)
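As a hedged sketch of how these inputs combine in a single call, with made-up labels and weights, and assuming `normalize` and `sample_weight` interact as in scikit-learn's `accuracy_score`:

```python
import evaluate

accuracy_metric = evaluate.load("accuracy")

# Toy labels and weights, made up for illustration.
results = accuracy_metric.compute(
    references=[0, 1, 1, 0],
    predictions=[0, 1, 0, 0],
    normalize=True,                      # return a fraction rather than a raw count
    sample_weight=[1.0, 1.0, 0.5, 0.5],  # per-sample importance
)
print(results)  # expected roughly {'accuracy': 0.83}, i.e. 2.5 / 3.0
```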
### Output Values

- **accuracy** (`float` or `int`): Accuracy score. The minimum possible value is 0. The maximum possible value is 1.0 if `normalize` is set to True, or the number of input examples if `normalize` is set to False. A higher score means higher accuracy.
Output Example(s):

```python
{'accuracy': 1.0}
```

This metric outputs a dictionary, containing the accuracy score.
#### Values from Popular Papers
Top-1 or top-5 accuracy is often used to report performance on supervised classification tasks such as image classification (e.g. on ImageNet) or sentiment analysis (e.g. on IMDB).
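Top-k accuracy is not computed by this metric directly, since it requires per-class scores rather than hard labels. As an illustrative sketch only, with made-up scores, it could be derived with NumPy like this:

```python
import numpy as np

def top_k_accuracy(scores, labels, k=5):
    """Fraction of examples whose true label is among the k highest-scoring classes."""
    top_k = np.argsort(scores, axis=1)[:, -k:]                   # indices of the k largest scores per row
    hits = np.any(top_k == np.asarray(labels)[:, None], axis=1)  # does any of them match the true label?
    return hits.mean()

# Made-up scores for 3 examples over 4 classes.
scores = np.array([[0.1, 0.2, 0.6, 0.1],
                   [0.5, 0.2, 0.2, 0.1],
                   [0.1, 0.1, 0.1, 0.7]])
labels = [2, 1, 3]
print(top_k_accuracy(scores, labels, k=1))  # top-1 accuracy: 2 of 3 correct -> ~0.667
```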
### Examples

Example 1 - A simple example:

```python
>>> accuracy_metric = evaluate.load("accuracy")
>>> results = accuracy_metric.compute(references=[0, 1, 2, 0, 1, 2], predictions=[0, 1, 1, 2, 1, 0])
>>> print(results)
{'accuracy': 0.5}
```

Example 2 - The same as Example 1, except with `normalize` set to `False`:

```python
>>> accuracy_metric = evaluate.load("accuracy")
>>> results = accuracy_metric.compute(references=[0, 1, 2, 0, 1, 2], predictions=[0, 1, 1, 2, 1, 0], normalize=False)
>>> print(results)
{'accuracy': 3.0}
```

Example 3 - The same as Example 1, except with `sample_weight` set:

```python
>>> accuracy_metric = evaluate.load("accuracy")
>>> results = accuracy_metric.compute(references=[0, 1, 2, 0, 1, 2], predictions=[0, 1, 1, 2, 1, 0], sample_weight=[0.5, 2, 0.7, 0.5, 9, 0.4])
>>> print(results)
{'accuracy': 0.8778625954198473}
```
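In Example 3, the correctly classified samples are at positions 0, 1, and 4, carrying weights 0.5, 2, and 9, so the weighted accuracy works out to (0.5 + 2 + 9) / (0.5 + 2 + 0.7 + 0.5 + 9 + 0.4) = 11.5 / 13.1 ≈ 0.8779.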
## Limitations and Bias
This metric can be easily misleading, especially in the case of unbalanced classes. For example, a high accuracy might be because a model is doing well, but if the data is unbalanced, it might also be because the model is only accurately labeling the high-frequency class. In such cases, a more detailed analysis of the model's behavior, or the use of a different metric entirely, is necessary to determine how well the model is actually performing.
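As a small illustration of this pitfall, assuming a made-up 95/5 class split and a degenerate baseline that always predicts the majority class:

```python
import evaluate

accuracy_metric = evaluate.load("accuracy")

# Made-up imbalanced data: 95 examples of class 0 and only 5 of class 1.
references = [0] * 95 + [1] * 5
# A degenerate "model" that always predicts the majority class.
predictions = [0] * 100

print(accuracy_metric.compute(references=references, predictions=predictions))
# {'accuracy': 0.95} -- high accuracy even though class 1 is never identified
```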
## Citation(s)

```bibtex
@article{scikit-learn,
  title={Scikit-learn: Machine Learning in {P}ython},
  author={Pedregosa, F. and Varoquaux, G. and Gramfort, A. and Michel, V.
          and Thirion, B. and Grisel, O. and Blondel, M. and Prettenhofer, P.
          and Weiss, R. and Dubourg, V. and Vanderplas, J. and Passos, A. and
          Cournapeau, D. and Brucher, M. and Perrot, M. and Duchesnay, E.},
  journal={Journal of Machine Learning Research},
  volume={12},
  pages={2825--2830},
  year={2011}
}
```