Ragged attention supported in vLLM
1
#18 opened about 18 hours ago
by
patrickvonplaten
Passkey evaluation on FlashInfer backend
#16 opened 8 days ago
by
joejose2728
Base model?
#15 opened 10 days ago
by
deltanym
Weird behavior of chat template
#14 opened 10 days ago
by
kpriyanshu256
Request access to Ministral 3B
1
#12 opened 17 days ago
by
tinatywang
Cannot use HF Transformers for inference?
#11 opened 27 days ago
by
haili-tian
Error when setting max_model_len to 65536 for Ministral-8B-Instruct-2410 on A100 | vLLM
#10 opened about 1 month ago
by
Byerose
Where is Ministral 3B?
1
#9 opened about 1 month ago
by
ZeroWw
An error when trying to run inference in Chinese
1
#8 opened about 1 month ago
by
mario479
Does not look as good as Qwen2.5 7B
9
#5 opened about 1 month ago
by
MonolithFoundation
3B Version Weights
6
#4 opened about 1 month ago
by
TKDKid1000
This LLM is hallucinating like crazy. Can someone verify these prompts?
28
#3 opened about 1 month ago
by
phil111
Not MRL again :(
3
#1 opened about 1 month ago
by
ProdeusUnity