ammarnasr
/

codegen-350M-mono-rust


	---
	license: mit
	datasets:
	- ammarnasr/the-stack-rust-clean
	library_name: adapter-transformers
	tags:
	- code
	pipeline_tag: text-generation
	language:
	- code
	---


	# CodeGen (CodeGen-Mono 350M LoRA Rust)

	## Model description
	CodeGen LoRA Rust is a family of autoregressive language models fine-tuned using LoRA on different programming languages.
	## Training data
	<!-- https://huggingface.co/datasets/ammarnasr/the-stack-rust-clean -->
	This model was fine-tuned on the cleaned Rust subset of The Stack, available [here](https://huggingface.co/datasets/ammarnasr/the-stack-rust-clean). The data consists of 1 million Rust code files.

	## Training procedure

	This model was fine-tuned using LoRA on a single T4 GPU. It was trained for 10,000 steps with a batch size of 4, using a causal language modeling loss.
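
To make the scale of this setup concrete, the sketch below counts the trainable parameters LoRA adds to a single weight matrix and the number of sequences seen during training. The layer shape (1024 x 1024) and the rank (8) are illustrative assumptions, since the card does not state the LoRA rank or which layers were adapted. Gradient accumulation is likewise assumed to be 1.

```python
def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA freezes the original weight W and learns a low-rank update B @ A,
    where B has shape (d_out, rank) and A has shape (rank, d_in).
    """
    return rank * (d_out + d_in)


def sequences_seen(steps: int, batch_size: int, grad_accum: int = 1) -> int:
    """Number of training sequences processed over a run."""
    return steps * batch_size * grad_accum


# Hypothetical layer shape and rank, chosen only for illustration.
d_out, d_in, rank = 1024, 1024, 8
full = d_out * d_in
lora = lora_param_count(d_out, d_in, rank)
print(f"full matrix: {full:,} params, LoRA update: {lora:,} params "
      f"({100 * lora / full:.2f}% of full)")

# Values from the card: 10,000 steps, batch size 4.
# Gradient accumulation of 1 is an assumption; the card does not mention it.
print(f"sequences seen: {sequences_seen(10_000, 4):,}")
```

Under these assumptions the low-rank update is a small fraction of the full matrix, which is what makes fine-tuning feasible on a single T4.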

	## Evaluation results

	We evaluate our models on the MultiPL-E benchmark. The model achieves a pass@10 rate of 8.9.
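
The card does not say how pass@10 was computed. A common choice for benchmarks of this kind is the unbiased pass@k estimator, and the sketch below assumes that choice. For each problem, `n` completions are sampled and `c` of them pass the tests. The per-problem counts in the example are made up for illustration and are not results from this model.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimate of pass@k for one problem.

    n: number of sampled completions
    c: number of completions that pass all tests
    k: sample budget being estimated (k <= n)
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Every size-k subset contains at least one passing completion.
        return 1.0
    # 1 - P(all k drawn completions fail)
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, given (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical (n, c) counts for three problems.
example = [(20, 0), (20, 1), (20, 15)]
print(f"pass@10 = {mean_pass_at_k(example, 10):.3f}")
```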


	## Intended Use and Limitations

	The model is intended for, and performs best at, program synthesis: generating executable code from English prompts written as comment strings. It can also complete partially written code in Rust and Python.
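
As a minimal sketch of the comment-string prompt format, the helper below turns an English description into Rust `//` comment lines followed by an opening function signature. The helper name `build_rust_prompt` and the exact layout are assumptions for illustration; the card does not prescribe a specific template.

```python
def build_rust_prompt(description: str, signature: str) -> str:
    """Format an English task description as a Rust comment-string prompt.

    Each non-empty line of the description becomes a `//` comment, and the
    function signature is left open so the model can complete the body.
    """
    comment_lines = [
        f"// {line.strip()}"
        for line in description.strip().splitlines()
        if line.strip()
    ]
    return "\n".join(comment_lines + [signature.rstrip() + " {"]) + "\n"


prompt = build_rust_prompt(
    "Return true if n is even.",
    "fn is_even(n: i32) -> bool",
)
print(prompt)
```

The resulting string can be passed to the tokenizer in place of a hand-written prompt.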

	## How to use

	This model can be easily loaded using the `AutoModelForCausalLM` functionality:

	```python
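	from transformers import AutoTokenizer, AutoModelForCausalLM

	# The tokenizer is loaded from this repository. The weights loaded below are
	# the base CodeGen checkpoint, so this snippet does not apply the LoRA adapter.
	tokenizer = AutoTokenizer.from_pretrained("ammarnasr/codegen-350M-mono-rust")
	model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-mono")

	# Tokenize the prompt into input IDs (PyTorch tensors).
	text = "def hello_world():"
	input_ids = tokenizer(text, return_tensors="pt").input_ids

	# Generate up to 128 tokens in total (prompt included) and decode the result.
	generated_ids = model.generate(input_ids, max_length=128)
	print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))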
	```

	## BibTeX entry and citation info

	```bibtex
	@article{Nijkamp2022ACP,
	title={A Conversational Paradigm for Program Synthesis},
	author={Nijkamp, Erik and Pang, Bo and Hayashi, Hiroaki and Tu, Lifu and Wang, Huan and Zhou, Yingbo and Savarese, Silvio and Xiong, Caiming},
	journal={arXiv preprint},
	year={2022}
	}
	```