Model Description
The Sinhala Story Generation Model is a fine-tuned version of the XLM-RoBERTa base model, trained on a dataset of Sinhala-language stories. It is designed to generate coherent, contextually relevant Sinhala text from story beginnings.
Intended Use
The model is intended for generating creative Sinhala stories or text based on initial prompts. It can be used in applications requiring automated generation of Sinhala text, such as chatbots, content generation, or educational tools.
Example Use Cases
- Creative Writing: Generate new story ideas or expand on existing story prompts.
- Language Learning: Create exercises or content in Sinhala for language learners.
- Content Generation: Automatically generate text for social media posts, blogs, or websites.
Limitations and Ethical Considerations
- The model's output is based on patterns in the training data and may not always generate accurate or contextually appropriate text.
- Users are advised to review and refine generated text for accuracy and appropriateness before use in sensitive or critical applications.
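As a lightweight aid to the manual review recommended above, generated text can be pre-screened before a person reads it. The sketch below is only an illustration and not part of the model: it flags outputs in which few characters fall in the Sinhala Unicode block (U+0D80 to U+0DFF). The function names and the 0.8 threshold are assumptions chosen for the example.

```python
def sinhala_ratio(text: str) -> float:
    """Return the fraction of non-whitespace characters in the Sinhala Unicode block."""
    chars = [ch for ch in text if not ch.isspace()]
    if not chars:
        return 0.0
    # The Sinhala block spans U+0D80 through U+0DFF.
    sinhala = sum(1 for ch in chars if "\u0d80" <= ch <= "\u0dff")
    return sinhala / len(chars)


def needs_review(text: str, min_ratio: float = 0.8) -> bool:
    """Flag text whose share of Sinhala characters is below min_ratio (an assumed threshold)."""
    return sinhala_ratio(text) < min_ratio
```

A check like this only catches outputs that drift out of Sinhala script. It says nothing about factual accuracy or appropriateness, so it does not replace human review.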
Model Details
- Model Architecture: XLM-RoBERTa base
- Training Data: Sinhala language stories dataset, compiled from various sources such as social media and web content.
- Tokenization: AutoTokenizer from the Hugging Face Transformers library
- Fine-tuning: Fine-tuned on the Sinhala story dataset for the text generation task
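To see how the tokenizer splits Sinhala text into subword pieces, a minimal sketch along the following lines may help. It assumes the `transformers` package is installed and that the checkpoint can be downloaded from the Hugging Face Hub. The helper name `tokenize_prompt` is hypothetical, and the default `xlm-roberta-base` refers to the public base checkpoint rather than this fine-tuned model, whose repository ID is not given here.

```python
from typing import List


def tokenize_prompt(text: str, model_id: str = "xlm-roberta-base") -> List[str]:
    """Return the subword tokens that the checkpoint's tokenizer produces for text."""
    # Imported inside the function so that merely defining the helper
    # does not require transformers to be installed.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    return tokenizer.tokenize(text)
```

Calling `tokenize_prompt("අද සුන්දර දවසක්.")` would download the tokenizer files on first use and return the list of subword strings. Passing the fine-tuned model's repository ID as `model_id` should load its tokenizer instead, assuming the tokenizer files were uploaded with the model.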
Example Inference
To run inference with the model using the Hugging Face Transformers `pipeline`, consider the following example Python code:
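```python
from transformers import pipeline

# Replace with the model's repository ID on the Hugging Face Hub
model_name = "your-username/model-name"
generator = pipeline("text-generation", model=model_name, tokenizer=model_name)

# Story beginning used as the prompt
input_text = "අද සුන්දර දවසක්. හෙට ගැන සිතමින් මම පාර <mask>"
output = generator(input_text, max_length=150, num_return_sequences=1)
print(output[0]['generated_text'])
```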