Improve model card: Update license & pipeline tag, add project page
This PR improves the model card by:
* Updating the `license` to `mit` for consistency with the associated GitHub repository.
* Changing the `pipeline_tag` to `text-generation` to better reflect the model's primary use case as a large language model and align with the provided usage examples.
* Adding a link to the project page (`https://itay1itzhak.github.io/planted-in-pretraining`) in the model card content for easier access to more project details.
Please review and merge this PR if everything looks good.
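For reference, the kind of usage the updated `pipeline_tag` is meant to reflect looks roughly like the sketch below. This is a minimal example, not copied from the card: the repository id and prompt are placeholders, and the checkpoint is assumed to load through the standard `transformers` seq2seq API since the base model is T5.

```python
# Minimal usage sketch (assumptions): the checkpoint is a merged seq2seq model,
# and the repo id below is a placeholder for the actual model repository.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "itay1itzhak/T5-Tulu"  # placeholder repo id, not confirmed by the card

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)  # T5 is an encoder-decoder model

prompt = "Explain in one sentence why the sky appears blue."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```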
README.md
CHANGED
@@ -1,18 +1,18 @@
 ---
-
-tags:
-- language-modeling
-- causal-lm
-- bias-analysis
-- cognitive-bias
+base_model:
+- google/t5-v1_1-xxl
 datasets:
 - allenai/tulu-v2-sft-mixture
 language:
 - en
-base_model:
-- google/t5-v1_1-xxl
-pipeline_tag: text2text-generation
 library_name: transformers
+license: mit
+pipeline_tag: text-generation
+tags:
+- language-modeling
+- causal-lm
+- bias-analysis
+- cognitive-bias
 ---
 
 # Model Card for T5-Tulu
@@ -25,12 +25,13 @@ This 🤗 Transformers model was finetuned using LoRA adapters for the arXiv pap
 We study whether cognitive biases in LLMs emerge from pretraining, instruction tuning, or training randomness.
 This is one of 3 identical versions trained with different random seeds.
 
-
-
-
-
-
-
+- **Model type**: encoder-decoder based transformer
+- **Language(s)**: English
+- **License**: MIT
+- **Finetuned from**: `google/t5-v1_1-xxl`
+- **Paper**: https://arxiv.org/abs/2507.07186
+- **Repository**: https://github.com/itay1itzhak/planted-in-pretraining
+- **Project Page**: https://itay1itzhak.github.io/planted-in-pretraining
 
 ## Uses
 
@@ -55,26 +56,26 @@ print(tokenizer.decode(outputs[0]))
 
 ## Training Details
 
-
-
-
-
-
-
-
+- Finetuning method: LoRA (high-rank, rank ∈ [64, 512])
+- Instruction data: Tulu-2
+- Seeds: 3 per setting to evaluate randomness effects
+- Batch size: 128 (OLMo) / 64 (T5)
+- Learning rate: 1e-6 to 1e-3
+- Steps: ~5.5k (OLMo) / ~16k (T5)
+- Mixed precision: fp16 (OLMo) / bf16 (T5)
 
 ## Evaluation
 
-
-
-
+- Evaluated on 32 cognitive biases from Itzhak et al. (2024) and Malberg et al. (2024)
+- Metrics: mean bias score, PCA clustering, MMLU accuracy
+- Findings: Biases primarily originate in pretraining; randomness introduces moderate variation
 
 ## Environmental Impact
 
-
-
+- Hardware: 4× NVIDIA A40
+- Estimated time: ~120 GPU hours/model
 
 ## Technical Specifications
 
-
-
+- Architecture: T5-11B
+- Instruction dataset: Tulu-2
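The Training Details added above describe high-rank LoRA finetuning of `google/t5-v1_1-xxl` on Tulu-2 data. As a rough illustration only, a configuration in that spirit could be sketched with `peft` and `transformers` as follows; the rank, target modules, learning rate, and step count are assumptions picked from within the stated ranges, not the authors' actual settings.

```python
# Illustrative sketch only -- hyperparameters are assumptions chosen from the
# ranges listed under Training Details, not the exact configuration used.
from transformers import AutoModelForSeq2SeqLM, TrainingArguments
from peft import LoraConfig, get_peft_model

base = AutoModelForSeq2SeqLM.from_pretrained("google/t5-v1_1-xxl")

lora_config = LoraConfig(
    r=256,                      # assumed; the card states rank in [64, 512]
    lora_alpha=512,             # assumed scaling factor
    target_modules=["q", "v"],  # assumed T5 attention projections
    lora_dropout=0.05,          # assumed
    task_type="SEQ_2_SEQ_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()

training_args = TrainingArguments(
    output_dir="t5-tulu-lora",
    per_device_train_batch_size=8,
    gradient_accumulation_steps=8,  # 8 x 8 = effective batch size 64 on one device (T5 setting)
    learning_rate=1e-4,             # assumed; the card states 1e-6 to 1e-3
    max_steps=16_000,               # ~16k steps for the T5 variant
    bf16=True,                      # bf16 mixed precision for T5
)
# A seq2seq trainer over the Tulu-2 SFT mixture would consume these objects.
```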