Commit History

Phi2 multipack (#1173)
814aee6 · unverified · winglian committed

Falcon embeddings (#1149) [skip docker]
e799e08 · unverified · winglian committed

Multipack simplify for Mixtral (#1142)
6910e6a · unverified · winglian committed

Add shifted sparse attention (#973) [skip-ci]
1d70f24 · unverified · jrc, joecummings, winglian committed

keep gate in fp32 for 16 bit loras (#1105)
da97285 · unverified · winglian committed

attempt to also run e2e tests that needs gpus (#1070)
788649f · unverified · winglian committed

Phi2 rewrite (#1058)
732851f · unverified · winglian committed

bump transformers and update attention class map name (#1023)
bcc78d8 · unverified · winglian committed

support for mamba (#915)
40a6362 · unverified · winglian committed

Phi update 202311 (#876)
9bf854e · unverified · winglian committed

add e2e tests for checking functionality of resume from checkpoint (#865)
b3a61e8 · unverified · winglian committed

use temp_dir kwarg instead
6dc68a6 · winglian committed

missing dunder-init
7de6a56 · winglian committed

chore: lint
c74f045 · winglian committed

make sure to cleanup tmp output_dir for e2e tests
0402d19 · winglian committed

simplify by removing duplicate base_model_config (#772)
2d8def6 · unverified · winglian committed

remove lora fused packing test (#758)
21cf09b · unverified · winglian committed

Feat: Allow usage of native Mistral FA when no sample_packing (#669)
697c50d · unverified · Nanobit committed

add mistral e2e tests (#649)
5b0bc48 · unverified · winglian committed

misc fixes to add gptq tests (#621)
03e5907 · unverified · winglian committed

Support Sample packing for phi arch (#586)
12a2dbb · unverified · winglian committed

E2e device cuda (#575)
2414673 · unverified · winglian committed

e2e testing (#574)
9218ebe · unverified · winglian committed