metadata

library_name: transformers
license: llama3
datasets:
  - smallstepai/marathi-instruction-tuning-alpaca
  - ai4bharat/indic-align
language:
  - mr
  - en

Model Card for Model ID

Model Details

Shivneri Marathi LLM is being built with the wish to bring the benefits of Generative AI to non-English (especially Marathi) speaking population of India. Marathi has the third largest number of native speakers in India, after Hindi and Bengali. Almost 83 million people speak the language. This is a preliminary version of our Marathi LLM (Large Language Model)! Built on the mighty Llama3 8B instruct model, Shivneri LLM can generate creative and informative text in both Marathi and English. This is just the beginning – we're constantly improving Shivneri, and even more exciting features are on the horizon!

Model Description

This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.

Developed by: Amit Ghadge
Funded by [optional]: [More Information Needed]
Shared by [optional]: [Amit Ghadge]
Model type: [ Decoder-only large language model (LLM) with a transformer architecture]
Language(s) (NLP): [Marathi, English]
License: [More Information Needed]
Finetuned from model [optional]: [Meta-Llama-3-8B-Instruct]

Model Sources [optional]

Repository: [https://github.com/amitagh/shivneri-llm]
Paper [optional]: [https://www.linkedin.com/pulse/releasing-shivneri-llm-instruct-model-version-amit-ghadge-j051f/]
Demo [optional]: [Coming soon]

Uses

This is a very preliminary version. Please use with caution. Would suggest to more updates and final models to try out.

Training Details

Training Data

[SFT with Lora on mentioned datasets above]

Training Procedure

SFT with Lora

Model Architecture and Objective

[ Decoder-only large language model (LLM) with a transformer architecture]

Compute Infrastructure

[A100 80 GB]

Meet the Developers

Get to know the creators behind this innovative model and follow their contributions to the field:

Amit Ghadge

Model Release Date May 1st, 2024.

Status This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback.

License

The model inherits the license from meta-llama3.

How to use

Use pretty much remains the same as original Meta-Llama-3-8B-Instruct model. Visit its page for more details. With this model you can now use Marathi prompts and build conversational apps using it.

Citation [optional]

If you use this model in your research, please cite:

@misc{amitghadge2024ShivneriLLMv01,
      title={Shivneri-LLM: Your Bilingual Marathi and English Text Generation LLM}, 
      author={Amit Ghadge},
      year={2024},
      eprint={https://www.linkedin.com/pulse/releasing-shivneri-llm-instruct-model-version-amit-ghadge-j051f/},

}

We hope this model serves as a valuable tool in your NLP toolkit and look forward to seeing the advancements it will enable in the understanding and generation of the Marathi language.