Upload README.md with huggingface_hub
README.md
CHANGED
@@ -85,7 +85,7 @@ Please note tht this is still a research project, and the purpose of releasing t
 ```python
 import transformers

-model_id = "

 pipeline = transformers.pipeline(
     "text-generation",
@@ -116,7 +116,7 @@ Parts of the following publicly available datasets were used:

 ### Data Selection

-To ensure the highest quality training data, only a small subset of the original raw data was used.

 - **Categorization Methods:**
   - Inspired by the [FineWeb](https://example.com/FineWeb) project.
@@ -136,4 +136,4 @@ The model is released under the [Llama 3.1 Community License](https://github.com
 ---

 ### Citing & Authors
-The model was trained and documentation written by Per Egil Kummervold.
@@ -85,7 +85,7 @@
 ```python
 import transformers

+model_id = "NbAiLab/nb-llama-3.1-8B"

 pipeline = transformers.pipeline(
     "text-generation",
@@ -116,7 +116,7 @@

 ### Data Selection

+To ensure the highest quality training data, only a small subset of the original raw data was used. [Corpus Quality Classifiers](https://huggingface.co/collections/NbAiLab/corpus-quality-classifier-673f15926c2774fcc88f23aa) built on [nb-bert-base](https://huggingface.co/NbAiLab/nb-bert-base) were trained to evaluate both educational value and linguistic quality of the training samples. These models are released along with the NB-Llama-3.x models, and are considered the main output from this initiative.

 - **Categorization Methods:**
   - Inspired by the [FineWeb](https://example.com/FineWeb) project.
@@ -136,4 +136,4 @@
 ---

 ### Citing & Authors
+The model was trained and documentation written by Per Egil Kummervold as part of the NoTraM-project.
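
The Data Selection change in this diff describes keeping only a small subset of the raw data, based on classifier scores for educational value and linguistic quality. A minimal sketch of that kind of two-score filter might look like the following. Everything here is an assumption for illustration: the `select_samples` helper, the thresholds, and the toy scoring functions are hypothetical stand-ins, and the real Corpus Quality Classifiers may use a different label scheme and cutoffs.

```python
from typing import Callable, Iterable, List

# Hypothetical scorer type: maps a text to a quality score in [0, 1].
Scorer = Callable[[str], float]


def select_samples(
    samples: Iterable[str],
    educational_scorer: Scorer,
    linguistic_scorer: Scorer,
    min_educational: float = 0.5,  # assumed threshold, not from the README
    min_linguistic: float = 0.5,   # assumed threshold, not from the README
) -> List[str]:
    """Keep only samples whose two scores both reach their thresholds."""
    kept = []
    for text in samples:
        if (
            educational_scorer(text) >= min_educational
            and linguistic_scorer(text) >= min_linguistic
        ):
            kept.append(text)
    return kept


def toy_educational_score(text: str) -> float:
    """Toy stand-in: longer texts score higher, capped at 1.0."""
    return min(len(text.split()) / 10.0, 1.0)


def toy_linguistic_score(text: str) -> float:
    """Toy stand-in: fraction of characters that are letters or whitespace."""
    if not text:
        return 0.0
    ok = sum(ch.isalpha() or ch.isspace() for ch in text)
    return ok / len(text)


samples = [
    "Oslo is the capital of Norway and its largest city by population.",
    "$$$ 1234 ###",
    "ok",
]
selected = select_samples(samples, toy_educational_score, toy_linguistic_score)
```

In practice the two toy functions would be replaced by calls to trained classifiers; the filter itself only needs a callable that returns a score per text.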