formll/resolving-scaling-law-discrepancies

TomerPorian committed on Jul 7

Commit b2fa501 • 1 Parent(s): 5eb16a6

Update README.md

Files changed (1)

README.md +23 -1

@@ -1,3 +1,14 @@
 # Resolving Discrepancies in Compute-Optimal Scaling of Language Models: Checkpoints
 This repository contains the model checkpoints in the paper ["Resolving Discrepancies in Compute-Optimal Scaling of Language Models"](https://arxiv.org/abs/2406.19146), by Tomer Porian, Mitchell Wortsman, Jenia Jitsev, Ludwig Schmidt, and Yair Carmon.
@@ -8,7 +19,18 @@ Each checkpoint directory is in the path
 `dataset={dataset}/hparams={hparams}_warmup={warmup}_decay={decay}/params={int(params / 1e6)}M_maxstep={maxstep}`
-where `dataset, hparams, warmup, decay, params, maxstep` are as defined in the github repository.
 ## Citation

+---
+license: mit
+datasets:
+- RefinedWeb
+- EleutherAI/OpenWebText2
+library_name: open_lm
+tokenizer: GPT-NeoX-20B
+---
 # Resolving Discrepancies in Compute-Optimal Scaling of Language Models: Checkpoints
 This repository contains the model checkpoints in the paper ["Resolving Discrepancies in Compute-Optimal Scaling of Language Models"](https://arxiv.org/abs/2406.19146), by Tomer Porian, Mitchell Wortsman, Jenia Jitsev, Ludwig Schmidt, and Yair Carmon.
 `dataset={dataset}/hparams={hparams}_warmup={warmup}_decay={decay}/params={int(params / 1e6)}M_maxstep={maxstep}`
+where `dataset, hparams, warmup, decay, params, maxstep` are as defined in the [GitHub repository](https://github.com/formll/resolving-scaling-law-discrepancies), which contains the code and data for reproducing the figures in the paper.
+## Code snippet
+```
+# Create an args.yaml file for the chosen model size...
+# Point args.resume at the checkpoint file inside the directory described above.
+args.resume = f'dataset={dataset}/hparams={hparams}_warmup={warmup}_decay={decay}/params={int(params / 1e6)}M_maxstep={maxstep}/{model_name}.pt'
+# Create the model with the open_lm create_model function...
+load_model(args, model, None)
+# Create the data with the open_lm get_data function...
+metrics = evaluate(model, data, 0, args, None)
+```
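The directory template above can be sketched as a small helper. This is only an illustration: the function name `checkpoint_dir` and the argument values in the usage line are hypothetical placeholders, not values taken from the repository; the valid values are those defined in the GitHub repository.

```python
def checkpoint_dir(dataset, hparams, warmup, decay, params, maxstep):
    """Build a checkpoint directory path following the README's template.

    `params` is the parameter count as a number; it is converted to
    millions and truncated to an integer, as in the template.
    """
    return (
        f"dataset={dataset}/hparams={hparams}_warmup={warmup}_decay={decay}/"
        f"params={int(params / 1e6)}M_maxstep={maxstep}"
    )


# Hypothetical placeholder values, for illustration only.
print(checkpoint_dir("d", "h", "w", "c", 108_000_000, 5000))
# prints: dataset=d/hparams=h_warmup=w_decay=c/params=108M_maxstep=5000
```

Note that `int(params / 1e6)` truncates rather than rounds, so a model with 7.5 million parameters would map to `params=7M` under this sketch.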
 ## Citation