---
title: Multi Label Precision Recall Accuracy Fscore
tags:
- evaluate
- metric
description: >-
  Implementation of example-based evaluation metrics for multi-label
  classification presented in Zhang and Zhou (2014).
sdk: gradio
sdk_version: 5.4.0
app_file: app.py
pinned: false
---

# Metric Card for Multi Label Precision Recall Accuracy Fscore
Implementation of example-based evaluation metrics for multi-label classification presented in Zhang and Zhou (2014).

## How to Use

    >>> import evaluate
    >>> multi_label_precision_recall_accuracy_fscore = evaluate.load("mdocekal/multi_label_precision_recall_accuracy_fscore")
    >>> results = multi_label_precision_recall_accuracy_fscore.compute(
    ...     predictions=[
    ...         ["0", "1"],
    ...         ["1", "2"],
    ...         ["0", "1", "2"],
    ...     ],
    ...     references=[
    ...         ["0", "1"],
    ...         ["1", "2"],
    ...         ["0", "1", "2"],
    ...     ],
    ... )
    >>> print(results)
    {
        "precision": 1.0,
        "recall": 1.0,
        "accuracy": 1.0,
        "fscore": 1.0
    }

There is also a multiset configuration available, which allows calculating the metrics for multi-label classification with repeated labels. It uses the same definitions as in the previous case, but operates on multisets of labels; multiset intersection, union, and cardinality are therefore used instead.
    
    >>> multi_label_precision_recall_accuracy_fscore = evaluate.load("mdocekal/multi_label_precision_recall_accuracy_fscore", config_name="multiset")
    >>> results = multi_label_precision_recall_accuracy_fscore.compute(
    ...     predictions=[
    ...         [0, 1, 1],
    ...     ],
    ...     references=[
    ...         [1, 0, 1, 1, 0, 0],
    ...     ],
    ... )
    >>> print(results)
    {
        "precision": 1.0,
        "recall": 0.5,
        "accuracy": 0.5,
        "fscore": 0.6666666666666666
    }
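
Concretely, for each example with reference label set Y and predicted label set Ŷ, the example-based definitions in Zhang and Zhou (2014) are precision = |Y ∩ Ŷ| / |Ŷ|, recall = |Y ∩ Ŷ| / |Y|, accuracy = |Y ∩ Ŷ| / |Y ∪ Ŷ|, and F-score = 2|Y ∩ Ŷ| / (|Y| + |Ŷ|), averaged over all examples. The sketch below illustrates the multiset variant with `collections.Counter`; `example_based_scores` is a hypothetical helper shown for illustration, not part of this metric's API, and its handling of empty predictions or references is an assumption:

```python
from collections import Counter


def example_based_scores(predictions, references):
    """Example-based precision/recall/accuracy/F-score (Zhang and Zhou, 2014),
    averaged over examples; multisets of labels are modeled with Counter."""
    totals = {"precision": 0.0, "recall": 0.0, "accuracy": 0.0, "fscore": 0.0}
    for pred, ref in zip(predictions, references):
        pred_c, ref_c = Counter(pred), Counter(ref)
        inter = sum((pred_c & ref_c).values())  # multiset intersection cardinality
        union = sum((pred_c | ref_c).values())  # multiset union cardinality
        if union == 0:  # both empty: counted as perfect agreement (see Output Values)
            for key in totals:
                totals[key] += 1.0
            continue
        # Assumption: an empty prediction (or reference) contributes 0 to the
        # corresponding score instead of raising a division-by-zero error.
        totals["precision"] += inter / len(pred) if pred else 0.0
        totals["recall"] += inter / len(ref) if ref else 0.0
        totals["accuracy"] += inter / union
        totals["fscore"] += 2 * inter / (len(pred) + len(ref))
    return {key: value / len(predictions) for key, value in totals.items()}
```

Running `example_based_scores([[0, 1, 1]], [[1, 0, 1, 1, 0, 0]])` reproduces the multiset example above; the default configuration corresponds to replacing `Counter` with plain `set`, so repeated labels are counted only once.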

### Inputs
- **predictions** *(list[list[Union[int, str]]])*: list of predictions to score. Each prediction should be a list of predicted labels.
- **references** *(list[list[Union[int, str]]])*: list of references, one per prediction. Each reference should be a list of reference labels.


### Output Values

This metric outputs a dictionary containing:
- precision 
- recall
- accuracy
- fscore


If both the prediction and the reference are empty lists, the output will be:
```python
{
    "precision": 1.0,
    "recall": 1.0,
    "accuracy": 1.0,
    "fscore": 1.0
}
```
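
For example, reusing the metric object loaded in How to Use (a sketch; this assumes the convention is applied per example):

```python
results = multi_label_precision_recall_accuracy_fscore.compute(
    predictions=[[]],
    references=[[]],
)
print(results)  # expected: {"precision": 1.0, "recall": 1.0, "accuracy": 1.0, "fscore": 1.0}
```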

## Citation

```bibtex
@article{Zhang2014ARO,
  title={A Review on Multi-Label Learning Algorithms},
  author={Min-Ling Zhang and Zhi-Hua Zhou},
  journal={IEEE Transactions on Knowledge and Data Engineering},
  year={2014},
  volume={26},
  pages={1819-1837},
  url={https://api.semanticscholar.org/CorpusID:1008003}
}
```