Commit History
fix bettertransformers save, force it to skip after saving correctly in callback
1a82082
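For context, a model converted with BetterTransformer has to be reverted to the stock transformers implementation before it can be saved, which is what the callback fix above addresses. A minimal sketch of that pattern, assuming the Hugging Face optimum API (model name and output path are placeholders, not the repo's actual callback code):

```python
from optimum.bettertransformer import BetterTransformer
from transformers import AutoModelForCausalLM

# convert to BetterTransformer for faster attention kernels during training
model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-1.4b")  # placeholder
model = BetterTransformer.transform(model)

# ... training loop ...

# a converted model cannot be serialized directly; revert to the canonical
# transformers implementation before saving the checkpoint
model = BetterTransformer.reverse(model)
model.save_pretrained("./checkpoint")
```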
more tweaks to do pre-training with bettertransformers
1210dc8
experimental expansion of ctx len
488a67d
add validation/warning for bettertransformers and torch version
71a43f8
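The validation presumably guards against PyTorch builds that lack torch.nn.functional.scaled_dot_product_attention, which BetterTransformer relies on and which ships with PyTorch 2.0. A hedged sketch of such a check (the exact version bound and message may differ from the commit):

```python
import torch
from packaging import version

if version.parse(torch.__version__) < version.parse("2.0.0"):
    raise ValueError(
        "BetterTransformer support requires torch>=2.0 for scaled_dot_product_attention"
    )
```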
use pythia-12b, neox-20b is flaky
3961902
add flash attn context for efficient training and attempt setting model to train mode
8792199
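The flash-attention context most likely refers to PyTorch 2.x kernel selection for scaled dot-product attention; a sketch under that assumption, with a hypothetical step wrapper rather than the repo's trainer code:

```python
import torch

def training_step(model, batch):
    # put the model in train mode and restrict SDPA to the flash kernel
    # while computing the forward and backward pass
    model.train()
    with torch.backends.cuda.sdp_kernel(
        enable_flash=True, enable_math=False, enable_mem_efficient=False
    ):
        loss = model(**batch).loss
    loss.backward()
    return loss
```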
add support for optimum bettertransformers
1edc30c
Merge pull request #181 from OpenAccess-AI-Collective/xpos-rope
41e4f6c
Merge pull request #180 from Glavin001/feat/stream-inference
215d775
formatting for linter
f36e227
add option to readme
5878bb1
add support to extend context with xpos rope
a03a7d7
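xPos extends rotary position embeddings with a per-dimension decay that is applied to queries and inverted on keys, which helps attention extrapolate past the training context length. The sketch below follows the common open-source xPos formulation and is not the code merged in this PR:

```python
import torch

def rotate_half(x):
    # standard RoPE helper: (x1, x2) -> (-x2, x1)
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)

def apply_xpos_rotary(q, k, base=10000, scale_base=512):
    # q, k: (batch, seq_len, num_heads, head_dim)
    seq_len, dim = q.shape[1], q.shape[-1]
    device, dtype = q.device, q.dtype

    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2, device=device).float() / dim))
    t = torch.arange(seq_len, device=device).float()
    freqs = torch.einsum("i,j->ij", t, inv_freq)
    emb = torch.cat((freqs, freqs), dim=-1)
    cos, sin = emb.cos(), emb.sin()

    # xPos decay: per-dimension scale raised to a position-dependent power
    scale = (torch.arange(0, dim, 2, device=device).float() + 0.4 * dim) / (1.4 * dim)
    power = (t - seq_len // 2) / scale_base
    scale = torch.cat((scale, scale), dim=-1) ** power.unsqueeze(-1)

    # broadcast over batch and heads
    cos, sin, scale = (x.to(dtype)[None, :, None, :] for x in (cos, sin, scale))

    # scale queries up and keys down by the same factor so their product decays
    q_out = (q * cos + rotate_half(q) * sin) * scale
    k_out = (k * cos + rotate_half(k) * sin) * (1.0 / scale)
    return q_out, k_out
```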
Add streaming inference & fix stopping at EOS
fec6bcc
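Streaming inference with transformers is typically built on TextIteratorStreamer, running generate in a background thread and stopping on the EOS token; the sketch below illustrates that pattern and is not necessarily the implementation merged here (the model name is a placeholder):

```python
from threading import Thread
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

model_name = "EleutherAI/pythia-1.4b"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Hello, my name is", return_tensors="pt")
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

# run generation in a background thread; generation stops at the EOS token
thread = Thread(
    target=model.generate,
    kwargs=dict(
        **inputs,
        streamer=streamer,
        max_new_tokens=128,
        eos_token_id=tokenizer.eos_token_id,
    ),
)
thread.start()

# tokens are printed as they are produced
for text in streamer:
    print(text, end="", flush=True)
thread.join()
```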
Merge pull request #179 from OpenAccess-AI-Collective/fix-max_seq_len
931e606
fix for max sequence len across different model types
7f09106
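Different architectures expose their context window under different config attribute names, so a single lookup breaks across model types. A hedged sketch of a fallback lookup (the attribute list is illustrative, not the repo's exact logic):

```python
def get_max_seq_len(config, default=2048):
    # try the common names used by different architectures, e.g.
    # max_position_embeddings (llama, gpt-neox) or n_positions (gpt2)
    for attr in ("max_position_embeddings", "n_positions", "max_sequence_length", "seq_length"):
        value = getattr(config, attr, None)
        if value is not None:
            return value
    return default
```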
Merge pull request #178 from PocketDocLabs/main
6b50200
Update README.md to reflect current gradient checkpointing support
16f9e28
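For reference, gradient checkpointing on a transformers model is usually toggled with the call below; whether this matches exactly what the README documents is an assumption:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-1.4b")  # placeholder

# trade compute for memory: recompute activations during the backward pass
model.gradient_checkpointing_enable()
# the generation cache is incompatible with checkpointing during training
model.config.use_cache = False
```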
Merge pull request #176 from NanoCode012/fix/peft-import
b9083a7
Fix backward compat for peft
aefb2fc
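A common backward-compatibility shim for renamed peft imports looks like the sketch below; the exact symbols handled by this fix may differ:

```python
try:
    # newer peft releases
    from peft import prepare_model_for_kbit_training
except ImportError:
    # older releases only shipped the int8-specific helper
    from peft import prepare_model_for_int8_training as prepare_model_for_kbit_training
```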
Merge pull request #169 from NanoCode012/feat/landmark
b5aa8d8
Merge pull request #171 from OpenAccess-AI-Collective/NanoCode012-falcon-lora-matrix
4d6490b
Fix falcon lora support
b242b69
Merge pull request #170 from OpenAccess-AI-Collective/NanoCode012-lambdalabs-fix
320beb2
Improve lambda labs instruction
2e13cef
Fix grad checkpoint and outputs param
2a801b0
Fix patching via import instead of hijacking
e44c9e0
Feat: Add landmark attention
55b8542
Merge pull request #168 from bratao/main
febe902
Disable Wandb
f4df266
Bruno Cabral
committed on