Libraries

How to use maharnab/gpt2_pycode with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="maharnab/gpt2_pycode")

# Load model directly
# GPT-2 is a causal (text-only) language model, so use the causal LM auto class.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("maharnab/gpt2_pycode")
model = AutoModelForCausalLM.from_pretrained("maharnab/gpt2_pycode")

vLLM

How to use maharnab/gpt2_pycode with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "maharnab/gpt2_pycode"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "maharnab/gpt2_pycode",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
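
The same request can also be sent from Python. The following is a minimal sketch using only the standard library, assuming the server started above is listening on localhost:8000. The helper names `build_completion_request` and `complete` are hypothetical, and the response parsing assumes the usual OpenAI-compatible layout in which the generated text sits at `choices[0].text`.

```python
import json
import urllib.request

MODEL_ID = "maharnab/gpt2_pycode"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the generated text (needs a running server)."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumption: OpenAI-compatible response with the text at choices[0].text.
    return body["choices"][0]["text"]
```

With a server running, `complete("Once upon a time,")` returns the generated text. For a server on a different port, pass it explicitly, for example `base_url="http://localhost:30000"`.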

Use Docker

docker model run hf.co/maharnab/gpt2_pycode

SGLang

How to use maharnab/gpt2_pycode with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "maharnab/gpt2_pycode" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "maharnab/gpt2_pycode",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "maharnab/gpt2_pycode" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "maharnab/gpt2_pycode",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use maharnab/gpt2_pycode with Docker Model Runner:
```
docker model run hf.co/maharnab/gpt2_pycode
```

GPT2 PyCode

This model is a fine-tuned version of the GPT-2 124M model, adapted for testing and experimentation with Python code generation. It was trained on a small corpus of 25,000 Python code samples.

Model Description

This project features a GPT-2 (Generative Pre-trained Transformer 2) language model with 124 million parameters that has been fine-tuned for Python code generation. Unlike larger models such as the bigger GPT-2 variants or GPT-3, it is a small-scale model designed primarily for testing and experimental purposes.

Developed by: Maharnab Saikia
Model type: Language model
Language(s) (NLP): English
License: MIT
Finetuned from model: GPT-2 124M

Uses

Research: Studying the behavior of small-scale language models in code generation tasks
Benchmarking: Providing a baseline for comparing different model architectures or training strategies
Rapid Prototyping: Quick tests of code generation ideas without the overhead of larger models
Education: Demonstrating the principles of fine-tuning language models for specific tasks

Bias, Risks, and Limitations

It's crucial to understand the limitations of this model:

Limited knowledge base due to the small training corpus
May struggle with complex or specialized Python code
Not suitable for production-level code generation tasks
Performance will likely be significantly lower than larger, more comprehensively trained models
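
Given these limitations, it can help to check that generated output at least parses as Python before doing anything else with it. The following is a minimal sketch of such a check using the standard `ast` module; the function name `is_valid_python` is hypothetical, and a successful parse only means the text is syntactically valid, not that the code is correct.

```python
import ast


def is_valid_python(source: str) -> bool:
    """Return True if `source` parses as Python, False otherwise."""
    try:
        ast.parse(source)
    except (SyntaxError, ValueError):
        # SyntaxError covers malformed code; ValueError covers inputs such as
        # embedded null bytes on some Python versions.
        return False
    return True


good = "def reverse(s):\n    return s[::-1]\n"
bad = "def reverse(s:\n    return"
print(is_valid_python(good), is_valid_python(bad))
```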

How to Get Started with the Model

Use the code below to get started with the model.

from transformers import GPT2LMHeadModel, GPT2Tokenizer
import torch
import re


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tokenizer = GPT2Tokenizer.from_pretrained('maharnab/gpt2_pycode')
model = GPT2LMHeadModel.from_pretrained('maharnab/gpt2_pycode')
model.to(device)

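prompt = "How to reverse a string in Python."
# Wrap the prompt in the chat-style tags used by this example.
# Note: max_length=20 truncates the tokenized input, so a long prompt can lose
# its closing <assistant> tag; raise the limit if your prompts are longer.
encoded_input = tokenizer(f"<sos><user>{prompt}</user><assistant>", max_length=20, truncation=True, return_tensors="pt").to(device)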

input_ids = encoded_input['input_ids']
attention_mask = encoded_input['attention_mask']

output = model.generate(
    input_ids, 
    max_length=512, 
    num_return_sequences=1, 
    no_repeat_ngram_size=2,
    temperature=0.7,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    attention_mask=attention_mask,
    pad_token_id=tokenizer.pad_token_id
)

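generated_code = tokenizer.decode(output[0])
# Keep only the text between the assistant tags. If the model did not emit a
# closing </assistant> tag within max_length tokens, keep the full decoded text
# instead of failing on a missing match.
match = re.search(r'<assistant>(.*?)</assistant>', generated_code, re.DOTALL)
if match:
    generated_code = match.group(1)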

print(f"Prompt: {prompt}\nGenerated Code:\n{generated_code}")
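
The prompt wrapping and output parsing in the example above can be factored into two small helpers. This is a sketch only: the tag strings are copied from the example, the names `build_prompt` and `extract_reply` are hypothetical, and the fallback behaviour for a missing closing tag is a design choice made here, not something the model card specifies.

```python
import re


def build_prompt(user_prompt: str) -> str:
    """Wrap a user prompt in the tags used by the example above."""
    return f"<sos><user>{user_prompt}</user><assistant>"


def extract_reply(decoded: str) -> str:
    """Return the text after <assistant>, up to </assistant> or end of string."""
    match = re.search(r"<assistant>(.*?)(?:</assistant>|\Z)", decoded, re.DOTALL)
    # If no <assistant> tag is present at all, return the input unchanged
    # (apart from surrounding whitespace).
    return match.group(1).strip() if match else decoded.strip()


decoded = build_prompt("How to reverse a string in Python.") + "s[::-1]</assistant>"
print(extract_reply(decoded))
```

Accepting a missing `</assistant>` tag avoids an exception when generation stops at the length limit before the model closes its reply.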

Training Details

Training Data

Model: GPT-2 with 124 million parameters
Training Data: 25,000 Python code samples
Fine-tuning: Adapted specifically for Python code generation tasks

Training Hyperparameters

Epochs: 5
Batch Size: 8
Learning Rate: 5e-5
Context Window: 512
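
These values imply a rough number of optimizer steps. The following back-of-the-envelope sketch assumes that all 25,000 samples are used for training in every epoch, with no gradient accumulation and a single device; none of these assumptions is stated in the card, so treat the results as estimates.

```python
import math

NUM_SAMPLES = 25_000     # from the training data description
BATCH_SIZE = 8
EPOCHS = 5
CONTEXT_WINDOW = 512     # tokens per sample, at most

# Assumption: one optimizer step per batch (no gradient accumulation).
steps_per_epoch = math.ceil(NUM_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

# Upper bound only: real samples may be shorter than the context window.
max_tokens_seen = NUM_SAMPLES * CONTEXT_WINDOW * EPOCHS

print(f"steps per epoch: {steps_per_epoch}")
print(f"total steps:     {total_steps}")
print(f"max tokens seen: {max_tokens_seen:,}")
```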

Environmental Impact

Carbon emissions were estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

Hardware Type: P100 GPU
Hours used: 5
Cloud Provider: Kaggle
Compute Region: South Asia
Carbon Emitted: 1.15 (unit not stated; the calculator reports kg CO2eq)
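
As a sanity check on the figure above, calculators of this kind multiply power draw, runtime, and the carbon intensity of the local grid. The sketch below illustrates that arithmetic. Two inputs are assumptions not stated in the card: a 250 W power draw for the P100 (the TDP of the PCIe variant) and a grid carbon intensity of 0.92 kg CO2eq/kWh, chosen here because it reproduces the reported number. It also assumes the reported value is in kg CO2eq.

```python
def estimate_emissions_kg(power_kw: float, hours: float, intensity_kg_per_kwh: float) -> float:
    """Estimate emissions as power (kW) x time (h) x carbon intensity (kg CO2eq/kWh)."""
    return power_kw * hours * intensity_kg_per_kwh


GPU_POWER_KW = 0.250      # assumption: P100 PCIe TDP of 250 W
HOURS_USED = 5            # from the card
CARBON_INTENSITY = 0.92   # assumption: kg CO2eq per kWh for the compute region

energy_kwh = GPU_POWER_KW * HOURS_USED
emissions_kg = estimate_emissions_kg(GPU_POWER_KW, HOURS_USED, CARBON_INTENSITY)

print(f"energy used: {energy_kwh:.2f} kWh")
print(f"emissions:   {emissions_kg:.2f} kg CO2eq")
```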

Acknowledgements

This project builds upon the GPT-2 model developed by OpenAI. We acknowledge their groundbreaking work in the field of natural language processing.

Model size: 0.1B params (Safetensors)
Tensor type: F32

Paper for maharnab/gpt2_pycode

Lacoste, A., Luccioni, A., Schmidt, V., and Dandres, T. Quantifying the Carbon Emissions of Machine Learning. arXiv:1910.09700, published Oct 21, 2019.