---
language:
- en
license: mit
datasets:
- code_x_glue_ct_code_to_text
metrics:
- bleu
- sacrebleu
---
# CodeT5+ 220m Py Sum
This model is based on CodeT5+ (220M) from Salesforce and was fine-tuned for the code summarization task on the CodeXGLUE dataset. The code is accessible on GitHub.
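The fine-tuning data corresponds to the `code_x_glue_ct_code_to_text` dataset listed in the metadata. A minimal sketch for loading its Python configuration with the `datasets` library (the `code` and `docstring` field names are taken from the dataset card and assumed here):

```python
from datasets import load_dataset

# Python configuration of the CodeXGLUE code-to-text dataset
dataset = load_dataset("code_x_glue_ct_code_to_text", "python")

sample = dataset["train"][0]
print(sample["code"])       # source function
print(sample["docstring"])  # reference summary
```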
Results
Model | BLEU |
---|---|
CodeT5-base-sum-python | 23.564 |
CodeT5-base-multi-sum | 23.985 |
Code-Trans-S-ST | 5.495 |
Code-Trans-S-TF | 21.093 |
Code-Trans-S-MT | 5.450 |
Code-Trans-S-MT-TF | 16.378 |
Code-Trans-B-ST | 4.638 |
Code-Trans-B-TF | 21.671 |
Code-Trans-B-MT | 2.957 |
Code-Trans-B-MT-TF | 13.766 |
Code-Trans-L-TF | 23.306 |
Code-Trans-L-MT | 13.487 |
Code-Trans-L-MT-TF | 16.362 |
CodeT5+ 220m Py Sum* | 25.245 |
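
The exact evaluation script is not reproduced here, but a BLEU score of this kind can be computed with the `evaluate` library and the `sacrebleu` metric listed in the metadata. A minimal sketch (the prediction and reference strings are placeholders, not dataset content):

```python
import evaluate

# sacreBLEU, one of the metrics listed in the card metadata
sacrebleu = evaluate.load("sacrebleu")

predictions = ["Prints a greeting for the given name."]          # model output (placeholder)
references = [["Print a greeting message for the given name."]]  # reference docstring (placeholder)

result = sacrebleu.compute(predictions=predictions, references=references)
print(result["score"])
```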
## Example of how to use
The model can easily be downloaded from Hugging Face and used in a summarization pipeline.
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, SummarizationPipeline

pipeline = SummarizationPipeline(
    model=AutoModelForSeq2SeqLM.from_pretrained("Paul-B98/codet5p_220m_py_sum"),
    tokenizer=AutoTokenizer.from_pretrained("Salesforce/codet5p-220m"),
    device=0,  # first GPU; set to -1 (or omit) to run on CPU
)

example_method = """
def greet(name):
    print(f"Hello, {name}!")
"""

print(pipeline([example_method])[0]["summary_text"])
```
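
Alternatively, the same summary can be produced without the pipeline wrapper by calling `generate` directly. A minimal sketch reusing `example_method` from above; the generation parameters are illustrative, not the settings used for the reported results:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codet5p-220m")
model = AutoModelForSeq2SeqLM.from_pretrained("Paul-B98/codet5p_220m_py_sum")

inputs = tokenizer(example_method, return_tensors="pt")
summary_ids = model.generate(**inputs, max_length=64)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```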