Commit 1d809ca (verified), committed by pere
1 Parent(s): c1c83d1

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -85,7 +85,7 @@ Please note tht this is still a research project, and the purpose of releasing t
  ```python
  import transformers
 
- model_id = "north/nb-llama-3.2-1B"
+ model_id = "NbAiLab/nb-llama-3.2-1B"
 
  pipeline = transformers.pipeline(
  "text-generation",
@@ -116,7 +116,7 @@ Parts of the following publicly available datasets were used:
 
  ### Data Selection
 
- To ensure the highest quality training data, only a small subset of the original raw data was used. An encoder-only classifier built on [nb-bert-base](https://huggingface.co/NbAiLab/nb-bert-base) was trained to evaluate both educational value and linguistic quality of the training samples.
+ To ensure the highest quality training data, only a small subset of the original raw data was used. [Corpus Quality Classifiers](https://huggingface.co/collections/NbAiLab/corpus-quality-classifier-673f15926c2774fcc88f23aa) built on [nb-bert-base](https://huggingface.co/NbAiLab/nb-bert-base) were trained to evaluate both educational value and linguistic quality of the training samples. These models are released along with the NB-Llama-3.x models, and are considered the main output from this initiative.
 
  - **Categorization Methods:**
  - Inspired by the [FineWeb](https://example.com/FineWeb) project.
@@ -136,4 +136,4 @@ The model is released under the [Llama 3.2 Community License](https://github.com
  ---
 
  ### Citing & Authors
- The model was trained and documentation written by Per Egil Kummervold.
+ The model was trained and documentation written by Per Egil Kummervold as part of the NoTraM-project.
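
Note on the Data Selection hunk in this diff: the README states that classifiers scored training samples on educational value and linguistic quality, and that only a small subset of the raw data was kept. The sketch below illustrates that general kind of score-based filtering. It is not the authors' pipeline: the `ScoredSample` type, the `select_samples` helper, the score range, and the 0.5 thresholds are all hypothetical names and values chosen for illustration, and the README does not say how the two scores were combined.

```python
from dataclasses import dataclass


@dataclass
class ScoredSample:
    """A training sample with two classifier scores (hypothetical layout)."""
    text: str
    educational_value: float   # assumed to be a score in [0, 1]
    linguistic_quality: float  # assumed to be a score in [0, 1]


def select_samples(samples, min_educational=0.5, min_linguistic=0.5):
    """Keep samples whose two scores both meet the (assumed) thresholds."""
    return [
        s for s in samples
        if s.educational_value >= min_educational
        and s.linguistic_quality >= min_linguistic
    ]


if __name__ == "__main__":
    # Toy corpus with made-up scores, for illustration only.
    corpus = [
        ScoredSample("well-written explainer", 0.9, 0.8),
        ScoredSample("informative but garbled", 0.9, 0.2),
        ScoredSample("fluent but empty", 0.1, 0.9),
    ]
    kept = select_samples(corpus)
    print([s.text for s in kept])
```

Requiring both scores to pass is only one possible design; a weighted combination or a single joint threshold would be equally plausible readings of the README text.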