Initial
README.md
CHANGED
@@ -16,32 +16,33 @@ pipeline_tag: text-generation
## Model Details

Shivneri Marathi LLM is being built with the aim of bringing the benefits of Generative AI to the non-English-speaking (especially Marathi-speaking) population of India. Marathi has the third-largest number of native speakers in India, after Hindi and Bengali; almost 83 million people speak the language. This is a preliminary version of our Marathi LLM (Large Language Model)!

Built on the mighty Gemma 7B base model, Shivneri LLM can generate creative and informative text in both Marathi and English. This is just the beginning: we're constantly improving Shivneri, and even more exciting features are on the horizon!

### Model Description
This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.

- **Developed by:** Amit Ghadge
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** Amit Ghadge
- **Model type:** Decoder-only large language model (LLM) with a transformer architecture
- **Language(s) (NLP):** Marathi, English
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** Gemma-7B

### Model Sources [optional]
<!-- Provide the basic links for the model. -->

- **Repository:** https://github.com/amitagh/shivneri-llm
- **Paper [optional]:** https://medium.com/@amitagh/shivneri-marathi-llm-e823f0a045d8
- **Demo [optional]:** Coming soon
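
Since this is a 🤗 transformers model on the Hub, a minimal generation sketch follows. The Hub repo ID is an assumption inferred from the repository link above (check the model page for the actual ID), and the prompt is illustrative only.

```python
# Minimal generation sketch for Shivneri LLM (assumptions noted below).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "amitagh/shivneri-llm"  # assumed Hub ID; substitute the actual one

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# A Marathi prompt: "What is the capital of Maharashtra?"
prompt = "महाराष्ट्राची राजधानी कोणती आहे?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```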
## Uses

@@ -89,11 +90,12 @@ Use the code below to get started with the model.

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

Continually pretrained with LoRA on the AI4Bharat/Sangraha dataset.
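
As a sketch of how this corpus might be pulled in, the Marathi portion of Sangraha can be streamed with 🤗 `datasets`. The `data_dir` layout used here is an assumption about the dataset repository's structure; verify it against the dataset card before use.

```python
# Sketch: stream Marathi text from AI4Bharat/Sangraha (layout is assumed).
from datasets import load_dataset

sangraha_mr = load_dataset(
    "ai4bharat/sangraha",
    data_dir="verified/mar",  # assumed path to the verified Marathi subset
    split="train",
    streaming=True,           # stream instead of downloading the full corpus
)

for example in sangraha_mr.take(3):
    print(example)
```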
### Training Procedure
<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

Continually pretrained with LoRA (low-rank adaptation) adapters on top of the Gemma 7B base model.
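
A minimal sketch of what LoRA-based continual pretraining on top of Gemma 7B could look like with 🤗 `peft`. The rank, alpha, dropout, and target modules below are illustrative assumptions, not the values used to train Shivneri.

```python
# Sketch: attach LoRA adapters to Gemma 7B for continual pretraining.
# All hyperparameters here are illustrative assumptions.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "google/gemma-7b",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                   # assumed adapter rank
    lora_alpha=32,          # assumed scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()

# Training then uses the standard causal-LM (next-token prediction) objective
# over the Sangraha text, e.g. with transformers.Trainer or TRL's SFTTrainer.
```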
#### Preprocessing [optional]

@@ -164,11 +166,11 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]

### Model Architecture and Objective

Decoder-only large language model (LLM) with a transformer architecture.
### Compute Infrastructure

NVIDIA A100 (80 GB).
#### Hardware