Update README.md
README.md
CHANGED
@@ -112,7 +112,7 @@ model = AutoModelForSequenceClassification.from_pretrained("Fsoft-AIC/Codebert-d
 ## Limitations
 This model is trained on 5M subset of The Vault in a self-supervised manner. Since the negative samples are generated artificially, the model's ability to identify instances that require a strong semantic understanding between the code and the docstring might be restricted.
 
-It is hard to evaluate the model due to the unavailable labeled datasets.
+It is hard to evaluate the model due to the unavailable labeled datasets. GPT-3.5-turbo is adopted as a reference to measure the correlation between the model and GPT-3.5-turbo's scores. However, the result could be influenced by GPT-3.5-turbo's potential biases and ambiguous conditions. Therefore, we recommend having human labeling dataset and fine-tune this model to achieve the best result.
 
 ## Additional information
 ### Licensing Information
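The added sentence describes measuring the correlation between the model's scores and GPT-3.5-turbo's scores, but the README does not say which coefficient is used or how scores are scaled. As a minimal sketch, assuming a Spearman rank correlation and entirely hypothetical score lists (`model_scores` and `reference_scores` are illustrative, not data from the model card), such a comparison might look like this:

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical scores for five code/docstring pairs (assumed values).
model_scores = [0.91, 0.15, 0.78, 0.40, 0.66]  # classifier probabilities
reference_scores = [5, 1, 4, 3, 2]             # reference ratings, e.g. from an LLM

print(round(spearman(model_scores, reference_scores), 3))
```

A rank-based coefficient is chosen here only because it does not require the two score scales to match. The same helper could be pointed at human labels instead of LLM ratings, which is the setup the README recommends.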