sedrickkeh ivas-tri committed on
Commit 8037865
1 Parent(s): accd6ba

Update README.md (#3)

- Update README.md (99149753a9419bf5e57be294e9f537e85e05566b)


Co-authored-by: Igor Vasiljevic <ivas-tri@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -93,7 +93,7 @@ We follow their training recipe and release our version of Mamba-7B.
 
  ## Training Details
  - Mamba-7B was trained using AWS SageMaker on 128 H100 80GB GPUs.
- - Training began in March 2024 and lasted around 3 weeks (some down time due to crashes and loss spikes)
+ - Training began in March 2024 and lasted three weeks.
  | **Hyperparameter** | **Value** |
  |--------------------|------------|
  | Precision | `bfloat16` |