open_lm
TomerPorian committed on
Commit
b2fa501
1 Parent(s): 5eb16a6

Update README.md

Files changed (1)
  1. README.md +23 -1
README.md CHANGED
@@ -1,3 +1,14 @@
  # Resolving Discrepancies in Compute-Optimal Scaling of Language Models: Checkpoints
 
  This repository contains the model checkpoints in the paper ["Resolving Discrepancies in Compute-Optimal Scaling of Language Models"](https://arxiv.org/abs/2406.19146), by Tomer Porian, Mitchell Wortsman, Jenia Jitsev, Ludwig Schmidt, and Yair Carmon.
@@ -8,7 +19,18 @@ Each checkpoint directory is in the path
 
  `dataset={dataset}/hparams={hparams}_warmup={warmup}_decay={decay}/params={int(params / 1e6)}M_maxstep={maxstep}`
 
- where `dataset, hparams, warmup, decay, params, maxstep` are as defined in the github repository.
 
  ## Citation
 
 
+ ---
+ license: mit
+
+ datasets:
+ - RefinedWeb
+ - EleutherAI/OpenWebText2
+
+ library_name: open_lm
+
+ tokenizer: GPT-NeoX-20B
+ ---
  # Resolving Discrepancies in Compute-Optimal Scaling of Language Models: Checkpoints
 
  This repository contains the model checkpoints in the paper ["Resolving Discrepancies in Compute-Optimal Scaling of Language Models"](https://arxiv.org/abs/2406.19146), by Tomer Porian, Mitchell Wortsman, Jenia Jitsev, Ludwig Schmidt, and Yair Carmon.
 
 
  `dataset={dataset}/hparams={hparams}_warmup={warmup}_decay={decay}/params={int(params / 1e6)}M_maxstep={maxstep}`
 
+ where `dataset, hparams, warmup, decay, params, maxstep` are as defined in the ["github repository"](https://github.com/formll/resolving-scaling-law-discrepancies), which contains the code and data for reproducing the figures in the paper.
+
+ ## Code snippet
+
+ ```
+ # create args.yaml file for the model size...
+ args.resume = f'dataset={dataset}/hparams={hparams}_warmup={warmup}_decay={decay}/params={int(params / 1e6)}M_maxstep={maxstep}/{model_name}.pt'
+ # create model with open_lm create_model function...
+ load_model(args, model, None)
+ # create data with open_lm get_data function...
+ metrics = evaluate(model, data, 0, args, None)
+ ```
 
  ## Citation
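
Note on the checkpoint path template in this diff: the sketch below shows one way the template could be expanded into a concrete directory string. The helper name `checkpoint_dir` and the placeholder argument values are hypothetical and are not taken from the repository. The valid values for `dataset`, `hparams`, `warmup`, and `decay` are defined in the GitHub repository, and treating them as strings here is an assumption.

```python
def checkpoint_dir(dataset: str, hparams: str, warmup: str, decay: str,
                   params: float, maxstep: int) -> str:
    """Expand the README's path template into a checkpoint directory string.

    Hypothetical helper for illustration; it only mirrors the f-string
    shown in the README and does not check that the directory exists.
    """
    # int(params / 1e6) truncates toward zero, so 5_900_000 parameters
    # is rendered as "5M", exactly as the README's template would do.
    size_in_millions = int(params / 1e6)
    return (
        f"dataset={dataset}/"
        f"hparams={hparams}_warmup={warmup}_decay={decay}/"
        f"params={size_in_millions}M_maxstep={maxstep}"
    )


# Placeholder values only; substitute the names defined in the GitHub repository.
example = checkpoint_dir("my_dataset", "my_hparams", "my_warmup", "my_decay",
                         params=5e6, maxstep=1000)
print(example)
```

Appending `/{model_name}.pt` to the returned string gives the value that the README's snippet assigns to `args.resume`.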