qwerrwe / src / axolotl / monkeypatch

Commit History

Respect sliding_window=None (#1214)
62ca4a2
unverified

DreamGenX committed on

Mixtral fixes 20240124 (#1192) [skip ci]
54d2ac1
unverified

winglian committed on

Phi2 multipack (#1173)
814aee6
unverified

winglian committed on

Falcon embeddings (#1149) [skip docker]
e799e08
unverified

winglian committed on

Qwen2 (#1166)
f5a828a
unverified

winglian committed on

Multipack simplify for Mixtral (#1142)
6910e6a
unverified

winglian committed on

Add shifted sparse attention (#973) [skip-ci]
1d70f24
unverified

jrc joecummings winglian committed on

optimize calculation of cu_seqlens from position_ids (#1084) [skip ci]
90036eb
unverified

winglian committed on

Added chatglm3 conversation type for training models like TinyLlama (#1036)
59b2d30
unverified

xaviviro committed on

bump transformers and update attention class map name (#1023)
bcc78d8
unverified

winglian committed on

remove landmark attn and xpos rope implementations (#1010)
70b46ca
unverified

winglian committed on

fix mistral prompt assembly (#982)
7bbaac9
unverified

hamel committed on

Fix prompt assembly for llama (#952)
5ada140
unverified

hamel tokestermw committed on

fix: switch to using the HuggingFace Transformers NEFT implementation (#941)
ef24342
unverified

dg-kalle committed on

Mixtral official (#942)
7fabc4d
unverified

winglian committed on

adds llama and mistral dropout support (#858)
db8a8af
unverified

winglian committed on

various bugfixes (#856)
1470650
unverified

winglian committed on

refactor neft patch to be more re-usable similar to trl's impl (#796)
827ec3d
unverified

winglian committed on

Hotfix for not saving correctly (#762)
32eeeb5
unverified

casperhansen committed on

Mistral: Sliding Window Attention with Flash Attention and Sample Packing (#732)
a045db0
unverified

casperhansen winglian committed on

add noisy embedding (#721)
3bd9528
unverified

Maxime committed on

flash_attention + sample packing for stablelm 3b (#671)
2d60ba3
unverified

winglian committed on

fix for flash attn w/ mistral w/o sample packing (#648)
b2edaae
unverified

winglian committed on

Mistral flash attn packing (#646)
b6ab8aa
unverified

winglian committed on

skip some flash attn patches unless explicitly enabled (#643)
895f0a0
unverified

winglian committed on

use fastchat conversations template (#578)
e7d3e2d
unverified

winglian committed on

update for recent transformers updates (#636)
60c7c48
unverified

winglian committed on

Feat: Add support for upstream FA2 (#626)
19a600a
unverified

Nanobit committed on

btlm and falcon monkey patches for flash attn (#566)
6b9b229
unverified

winglian committed on

Add training callback to send predictions to WandB table (#521)
5b67ea9
unverified

Glavin001 committed on

reorg a bit
fc8766e

tmm1 committed on

use flash_attn rmsnorm when available (#526)
72a6fe1
unverified

tmm1 committed on

use flash_attn xentropy when available (#525)
5fe30b1
unverified

tmm1 committed on

fix checkpoints on multigpu (#481)
31f3e71
unverified

winglian committed on

ReLoRA implementation (with quantization) (#322)
bde3c5a
unverified

chargoddard winglian committed on

is_causal fix for evals?
fbf49a4

winglian committed on

fix evals (#447)
ee26281
unverified

winglian committed on

standardize attn hijack patches (#381)
06edf17
unverified

tmm1 winglian committed on

fix check for flash attn branching (#377)
343ac84
unverified

winglian committed on

Attention mask and position id fixes for packing (#285)
2bb0b78
unverified

winglian committed on

Update XFormers Attention Monkeypatch to handle Llama-2 70B (GQA) (#339)
10405b9
unverified

ssmi153 committed on

move flash-attn monkey patch alongside the others
312a9fa

tmm1 committed on

fix sdp attention to use the flash/mem-efficient context manager
a032c9f

winglian committed on

Fixed pre-commit problems, fixed small bug in logging_config to handle LOG_LEVEL env var
b1f4f7a

theobjectivedad committed on

Fix set mem_id for inference and refactor
974dc00

Nanobit committed on

Clean up landmark patching
a6190c8

Nanobit committed on

Refactor landmark attention patch
919727b

Nanobit committed on