Spaces: codeparrot/code-generation-models
loubnabnl (HF Staff) committed on Jun 4, 2022

Commit 835e9c6 (1 parent: 337d672)

Update evaluation/demo_humaneval.md

Files changed (1)

evaluation/demo_humaneval.md

@@ -52,18 +52,4 @@ Results: {'pass@1': 0.1, 'pass@10': 0.7631, 'pass@20': 1.0}
 ````
 If we take a closer look at the unit test results for each candidate solution, we find that 2 passed the unit test. This means that we have 2 correct solutions among 20, which corresponds to our pass@1 value `2/20 = 0.1`. The scores pass@10 and pass@20 are higher, because the more samples we select from the candidate completions, the more likely we are to include the correct implementation. As
-for pass@20, it is `1`, since if we select all 20 candidates the problem gets solved which gives 100% success rate. If you are curious about the candidate solutions that passed the tests, they both implemented this function:
-```python
-def truncate_number(number: float) -> float:
-    """ Given a positive floating point number, it can be decomposed into
-    and integer part (largest integer smaller than given number) and decimals
-    (leftover part always smaller than 1).
-    Return the decimal part of the number.
-    >>> truncate_number(3.5)
-    0.5
-    """
-    return number % 1
-```

+for pass@20, it is `1`, since if we select all 20 candidates the problem gets solved which gives 100% success rate.
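
The scores quoted in the hunk header can be checked by hand. The sketch below assumes the metric uses the standard unbiased pass@k estimator, `1 - C(n - c, k) / C(n, k)`, with `n = 20` candidates and `c = 2` correct ones, as described in the changed paragraph. The helper name `pass_at_k` is hypothetical and is not taken from the demo itself.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: 1 - C(n - c, k) / C(n, k).

    n: total candidate solutions sampled for the problem
    c: number of candidates that pass the unit tests
    k: number of candidates drawn (must satisfy 1 <= k <= n)
    """
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    # comb(n - c, k) is 0 when k > n - c, so the estimate becomes exactly 1.0
    # once every possible draw of k candidates must contain a correct one.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Values from the paragraph above: 20 candidates, 2 of which pass (assumed inputs).
scores = {f"pass@{k}": pass_at_k(n=20, c=2, k=k) for k in (1, 10, 20)}
print(scores)
```

Under these assumptions, pass@1 reduces to `2/20`, pass@10 comes out at roughly 0.763, and pass@20 is exactly 1, which agrees with the results line in the hunk header.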