non-reasoning data
#132 opened about 8 hours ago
by
cmgzy
Could you release some 4-bit weights? None of the cards I have on hand support FP8
#131 opened about 17 hours ago
by
zhnagchenchne
For the universe! DeepPhaser.py DeepCoralX.py and DeepSynapse.py
#129 opened about 19 hours ago
by
karmikovic
Request: Create distill of Mistral Small 24B
2
#128 opened 1 day ago
by
Kenshiro-28
Which vision model is R1 using for text extraction from images or PDFs?
1
#127 opened 1 day ago
by
ashutoshroy02
Request: DOI
#125 opened 2 days ago
by
Yungchizzy
Little brother(s) of big DeepSeek-R1 ?
#124 opened 2 days ago
by
MrDevolver
Upload gugagagaggagagagga.pdf
1
#123 opened 2 days ago
by
HahahhahH
Tool / Function Calling
2
#122 opened 3 days ago
by
smcleod
Change quant_method to bitsandbytes_4bit
#121 opened 3 days ago
by
ngoc24794
Unknown quantization type
4
#120 opened 3 days ago
by
Reewaz321
Update config.json
#119 opened 3 days ago
by
keerthanaOfficial2001
So how much VRAM is needed to deploy a 671B model, and is there a baseline hardware configuration?
5
#118 opened 4 days ago
by
cena163
Distill Compatibility for PC w/ Ryzen 7 Pro 8840HS w/ 780M Graphics 2x32GB RAM 1TB DDR5 SSD
1
#115 opened 5 days ago
by
arzx
Upload gitattributes.txt
#114 opened 5 days ago
by
SafeerChalil
Introducing Deepseek's TinyZero
1
#113 opened 5 days ago
by
DeepSeekModerator
Create Kuch v
1
#112 opened 5 days ago
by
gamerdowntown
Request: DOI
#111 opened 5 days ago
by
Hassanabbas2975
FP8 quantization error occurring when using the pipeline approach or the transformers-based approach
1
#110 opened 5 days ago
by
neethuvm
Deepseek-R1
#109 opened 5 days ago
by
KudanTao
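The implementation of the MLA KV cache compression strategy in the deepseek-r1 source code seems inconsistent with what the paper describes. Why is that? The code does not appear to implement this major optimization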
2
#108 opened 6 days ago
by
Darkdust
Sitting in bed room
#107 opened 6 days ago
by
makecash
Eating food in a car
#106 opened 6 days ago
by
Ayinbaby1313
Update README.md
#103 opened 7 days ago
by
jungvaclav
error while downloading model
4
#102 opened 7 days ago
by
heikhama1982
Upload IMG_20250112_172711.jpg
#101 opened 7 days ago
by
aamir1
help from italy
4
#100 opened 7 days ago
by
MMPPIIAA
R1 distill to Mistral Small?
3
#99 opened 7 days ago
by
nfunctor
Running this model on Google Colab?
3
#98 opened 7 days ago
by
Zakia
A question for the DeepSeek team: can you train a stable MoE model?
#97 opened 8 days ago
by
tflchina
How to download DeepSeek-R1 7B parameters
1
#96 opened 8 days ago
by
barqawiz
HuggingFace version does NOT use efficient MLA caching
#95 opened 8 days ago
by
Avelina
Found a bug
#93 opened 8 days ago
by
amalgunatilake
Let's Give Credit Where It’s Due: Adding Source Links to AI Responses
3
#88 opened 9 days ago
by
Munis01
When will this be available in Transformers library?
#87 opened 10 days ago
by
solwol
Cannot regenerate (blank response)
#86 opened 10 days ago
by
pluhong
A bug using the Hugging Face API
3
#85 opened 10 days ago
by
Kevin355
Do we need authorization to use this?
#84 opened 10 days ago
by
Natwar
Where is the source code for this model? What do they proudly mean by open-source models?
1
#83 opened 10 days ago
by
tstarksys
Zhiwang (智王) releases a deepseek-r1 lazy package: just unzip and use
1
#81 opened 10 days ago
by
zwpython
model-00078-of-000163.safetensors not marked safe?
2
#80 opened 10 days ago
by
aborst
Create Dare
#79 opened 10 days ago
by
Dara996
problem with using serverless inference
1
#78 opened 10 days ago
by
manju2345
Some weird censorship on a non-sensitive topic
8
#77 opened 11 days ago
by
junnanwu
Upload dkfoEtm3H4bMcaI0KEJbq.1023.jpeg
#76 opened 11 days ago
by
luckysalami089
Update README.md
#75 opened 11 days ago
by
NuoNb
🚩 Report: Ethical issue(s)
#74 opened 11 days ago
by
Typeofprototype
Deepseek-R1 falls: zw's heavily modified version "Nine Birds"
#73 opened 11 days ago
by
zwpython