Update README.md
README.md
CHANGED
@@ -33,14 +33,14 @@ Parts:
 ## Training
 
 Trained using [`qlora.py`](https://github.com/scottlogic-alex/qlora/blob/stepwise/qlora.py) from our [`stepwise`](https://github.com/scottlogic-alex/qlora/tree/stepwise) branch of [qlora](https://github.com/artidoro/qlora).
-Known-good as of commit [`
+Known-good as of commit [`3a86919`](https://github.com/scottlogic-alex/qlora/blob/3a8691986b6718562bcd8e3522447b52842c1d9a/qlora.py).
 
 `python -m qlora --model_name_or_path huggyllama/llama-7b --lora_name_or_path tloen/alpaca-lora-7b --dataset prm800k-solutions --dataset_format prm800k-solutions --bf16 --max_memory_MB 24000 --use_bos_token_in_prompt --truncate_toward_center --source_max_len 184 --target_max_len 998 --gradient_accumulation_steps 4 --per_device_train_batch_size 4 --per_device_eval_batch_size 4 --learning_rate 0.0002 --run_name 13b_alpaca_special_tokens_long --report_to wandb --save_steps 64 --save_total_limit 3 --max_steps 1664 --evaluation_strategy steps --eval_steps 64 --generate_steps 16 --register_process_supervision_tokens`
 
 ## Usage
 
 You can load using [`evaluate.py`](https://github.com/scottlogic-alex/qlora/blob/stepwise/evaluate.py#L209-L278) from our [`stepwise`](https://github.com/scottlogic-alex/qlora/tree/stepwise) branch of [qlora](https://github.com/artidoro/qlora).
-Known-good as of commit [`
+Known-good as of commit [`3a86919`](https://github.com/scottlogic-alex/qlora/blob/3a8691986b6718562bcd8e3522447b52842c1d9a/evaluate.py).
 
 You'll need to download `embed_tokens.pt` and `lm_head.pt` from this repository, and ensure they are saved to the root of the `qlora` repository, then run `evaluate.py` like so:
 