RAGPT: Fine-tuned GPT-2 for Context-Based Question Answering

Model Description

RAGPT is a fine-tuned version of GPT-2 small, adapted for context-based question answering. Given a context passage and a question, it generates an answer grounded in that context. This corresponds to the generation step of a Retrieval-Augmented Generation (RAG) system: the model does not retrieve anything itself, so the caller must supply the context.

Key Features

Based on the GPT-2 small architecture (124M parameters)
Fine-tuned on the "neural-bridge/rag-dataset-12000" dataset from Hugging Face
Capable of generating answers based on provided context and questions
Suitable for various question-answering applications

Training Data

The model was fine-tuned using the "neural-bridge/rag-dataset-12000" dataset, which contains:

Context passages
Questions related to the context
Corresponding answers

Fine-tuning Process

The fine-tuning process involved:

Loading the pre-trained GPT-2 small model
Preprocessing the dataset to combine context, question, and answer into a single text
Training the model to predict the next token given the context and question
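
The preprocessing step above can be sketched as follows. This is a minimal illustration, not the authors' actual script: the function name `build_training_text` is hypothetical, the field names `context`, `question`, and `answer` are assumed from the dataset description, and the Spanish labels are copied from the prompt in the Usage section on the assumption that training used the same layout. Appending GPT-2's end-of-text token is also an assumption.

```python
# Hypothetical template: the labels are copied from the usage prompt in this card;
# that training used the same layout is an assumption.
TEMPLATE = "Contexto: {context}\nPregunta: {question}\nRespuesta: {answer}"
EOS_TOKEN = "<|endoftext|>"  # GPT-2's end-of-text token


def build_training_text(example: dict) -> str:
    """Join one record's context, question and answer into a single training string."""
    text = TEMPLATE.format(
        context=example["context"].strip(),
        question=example["question"].strip(),
        answer=example["answer"].strip(),
    )
    # Appending EOS marks where an answer ends (an assumption, not confirmed by the card).
    return text + EOS_TOKEN


example = {
    "context": "The Eiffel Tower is in Paris.",
    "question": "Where is the Eiffel Tower?",
    "answer": "In Paris.",
}
print(build_training_text(example))
```

With text in this form, ordinary next-token training teaches the model to continue a context and question with an answer.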

Hyperparameters

Base model: GPT-2 small
Number of training epochs: 3
Batch size: 4
Optimizer and learning rate: AdamW with default settings
Max sequence length: 512 tokens
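
To get a feel for what these settings imply, the number of optimizer steps can be estimated with a short calculation. This is a rough sketch: `total_optimizer_steps` is a hypothetical helper, it assumes one optimizer step per batch with no gradient accumulation, and the example count of 12000 is a placeholder taken from the dataset name rather than a confirmed training-split size.

```python
import math


def total_optimizer_steps(num_examples: int, batch_size: int = 4, epochs: int = 3) -> int:
    """Optimizer steps for a run, assuming one step per batch and no gradient accumulation."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epochs


# 12000 is a placeholder taken from the dataset name, not a confirmed split size.
print(total_optimizer_steps(12000))  # 9000
```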

Usage

To use the model:

from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "BueormLLC/RAGPT"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

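# Prepare input. The prompt labels are Spanish (Contexto = context, Pregunta = question,
# Respuesta = answer); keep them exactly as written.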
context = "Your context here"
question = "Your question here"
input_text = f"Contexto: {context}\nPregunta: {question}\nRespuesta:"

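# Generate answer. max_new_tokens limits only the generated part, so a long prompt
# does not use up the budget; GPT-2 has no pad token, so EOS is used for padding.
input_ids = tokenizer.encode(input_text, return_tensors="pt")
output = model.generate(
    input_ids,
    max_new_tokens=150,
    num_return_sequences=1,
    pad_token_id=tokenizer.eos_token_id,
)
# Decode only the newly generated tokens so the prompt is not repeated in the answer
answer = tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True)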
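
Depending on how the output is decoded, the resulting text may still contain the prompt, and a small model such as this one may continue past the answer and begin a new example. A minimal post-processing sketch is shown below; the helper name `extract_answer` is hypothetical, and the markers are taken from the prompt format above.

```python
ANSWER_MARKER = "Respuesta:"   # answer label used in the prompt
CONTEXT_MARKER = "Contexto:"   # start of a new, unwanted prompt block


def extract_answer(decoded: str) -> str:
    """Return only the answer part of a decoded generation."""
    # Keep the text after the first answer marker; if there is none, keep everything.
    _, found, tail = decoded.partition(ANSWER_MARKER)
    text = tail if found else decoded
    # Cut at a following context marker in case the model starts a new example.
    text = text.split(CONTEXT_MARKER, 1)[0]
    return text.strip()


print(extract_answer("Contexto: c\nPregunta: q\nRespuesta: Paris.\nContexto: x"))
```

The helper is safe to call on text that already excludes the prompt, since it leaves such text unchanged apart from trimming whitespace and any trailing new example.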

Limitations

The model's knowledge is limited to its training data and the base GPT-2 model.
It may sometimes generate irrelevant or incorrect answers, especially for topics outside its training domain.
The model does not have access to external information or real-time data.

Ethical Considerations

Users should be aware that this model, like all language models, may reflect biases present in its training data. It should not be used as a sole source of information for critical decisions.

Future Improvements

Fine-tuning on a larger and more diverse dataset
Experimenting with larger base models (e.g., GPT-2 medium or large)
Implementing techniques to improve factual accuracy and reduce hallucinations

Support us

We appreciate your support. Without you, we could not do what we do.

Citation

If you use this model in your research, please cite:

@misc{RAGPT,
  author = {Bueorm},
  title = {RAGPT: Fine-tuned GPT-2 for Context-Based Question Answering},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/BueormLLC/RAGPT}}
}

Safetensors: model size 0.1B params, tensor type F32