DevQuasar


AI & ML interests

Open-Source LLMs, Local AI Projects: https://pypi.org/project/llm-predictive-router/


DevQuasar's activity

csabakecskemeti published a Space about 4 hours ago: DevQuasar/Mi50
csabakecskemeti updated a Space about 5 hours ago: DevQuasar/Mi50
csabakecskemeti posted an update about 18 hours ago
csabakecskemeti posted an update 4 days ago
I've run the Open LLM Leaderboard evaluations plus HellaSwag on deepseek-ai/DeepSeek-R1-Distill-Llama-8B and compared them to meta-llama/Llama-3.1-8B-Instruct, and at first glance R1 does not beat Llama overall.

If anyone wants to double-check, the results are posted here:
https://github.com/csabakecskemeti/lm_eval_results

Did I make a mistake, or is (at least this distilled version) not as good as / better than the competition?

I'll run the same on the Qwen 7B distilled version too.
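
For anyone who wants to reproduce one of the runs, something along these lines with the lm-evaluation-harness Python API should work (a minimal sketch; the task list, dtype, and batch size here are illustrative, not my exact setup):

```python
# Minimal sketch (illustrative settings): scoring HellaSwag on the
# distilled R1 model with lm-evaluation-harness.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=deepseek-ai/DeepSeek-R1-Distill-Llama-8B,dtype=bfloat16",
    tasks=["hellaswag"],
    batch_size=8,
)
print(results["results"]["hellaswag"])  # accuracy / normalized accuracy
```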
csabakecskemeti posted an update 10 days ago
csabakecskemeti posted an update 26 days ago
csabakecskemeti posted an update 27 days ago
csabakecskemeti posted an update 29 days ago
csabakecskemeti posted an update about 1 month ago
csabakecskemeti posted an update about 1 month ago
I've built a small utility to split safetensors files, file by file.
The need came up when I tried to convert the new DeepSeek V3 model from FP8 to BF16.
The only Ada-architecture GPU I have is an RTX 4080, and its 16GB of VRAM just wasn't enough for the conversion.

BTW: I'll upload the BF16 version here:
DevQuasar/deepseek-ai.DeepSeek-V3-Base-bf16
(it will take a while, days with my upload speed)
If anyone has access to the resources to test it, I'd appreciate feedback on whether it works or not.

The tool is available here:
https://github.com/csabakecskemeti/ai_utils/blob/main/safetensor_splitter.py
It splits every file into n pieces along layer boundaries where possible, and creates a new "model.safetensors.index.json" file.
I've tested it with Llama 3.1 8B at multiple split sizes, and validated the output with an inference pipeline.
Use --help for usage.
Please note the current version expects the model to already be sharded across multiple files, with a "model.safetensors.index.json" layer-to-safetensors mapping file.
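
The core idea, as a minimal sketch (not the actual tool: here tensors are chunked by count, while the real splitter groups by layer):

```python
# Minimal sketch of the idea: split one safetensors shard into n pieces
# and rebuild the weight_map for the index file. Chunking is by tensor
# count here; the actual tool splits along layer boundaries.
import json
from safetensors import safe_open
from safetensors.torch import save_file

def split_shard(path: str, n_pieces: int, prefix: str = "model") -> None:
    with safe_open(path, framework="pt") as f:
        tensors = {k: f.get_tensor(k) for k in f.keys()}
    keys = list(tensors)
    chunk = max(1, len(keys) // n_pieces)
    weight_map = {}
    for i in range(0, len(keys), chunk):
        piece_keys = keys[i:i + chunk]
        out_name = f"{prefix}-{i // chunk:05d}.safetensors"
        save_file({k: tensors[k] for k in piece_keys}, out_name)
        weight_map.update({k: out_name for k in piece_keys})
    # a new index so loaders can find each tensor in its new shard
    with open("model.safetensors.index.json", "w") as idx:
        json.dump({"weight_map": weight_map}, idx, indent=2)
```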
csabakecskemeti posted an update about 1 month ago
csabakecskemeti posted an update about 2 months ago
csabakecskemeti posted an update 2 months ago
csabakecskemeti posted an update 2 months ago
I have this small utility: no_more_typo.
It runs in the background and can call an LLM to update the text on the clipboard; I think it's ideal for fixing typos and grammar.
I have just added the option to use custom prompt templates to perform different tasks.

Details, code and executable:
https://github.com/csabakecskemeti/no_more_typo

https://devquasar.com/no-more-typo/
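
The gist of it, as a minimal Python sketch (the real tool's implementation, backend, and trigger handling may differ; pyperclip and the OpenAI client are stand-ins here):

```python
# Minimal Python sketch of the idea only: read the clipboard, ask an LLM
# to fix typos, and write the result back. no_more_typo itself may use a
# different language, model backend, and hotkey handling.
import pyperclip
from openai import OpenAI

PROMPT = ("Fix any typos and grammar in the following text. "
          "Return only the corrected text:\n\n")

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def fix_clipboard() -> None:
    text = pyperclip.paste()
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": PROMPT + text}],
    )
    pyperclip.copy(resp.choices[0].message.content)

if __name__ == "__main__":
    fix_clipboard()
```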
csabakecskemeti posted an update 2 months ago
Repurposed my older AI workstation into a homelab server; it has received 2x V100 + 1x P40.
I can reach a huge 210k-token context size with MegaBeam-Mistral-7B-512k-GGUF at ~70+ tok/s, or run Llama-3.1-Nemotron-70B-Instruct-HF-GGUF with 50k context at ~10 tok/s (V100s only: 40k ctx and 15 tok/s).
It's also able to LoRA fine-tune with performance similar to an RTX 3090.
It moved to the garage so there are no complaints from the family about the noise. Will move to a rack soon :D
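
For anyone curious how such a long-context run looks in practice, a hypothetical sketch with llama-cpp-python (the GGUF filename and prompt are illustrative):

```python
# Hypothetical sketch: loading a long-context GGUF with llama-cpp-python
# and offloading all layers to the GPUs. Filename is illustrative.
from llama_cpp import Llama

llm = Llama(
    model_path="MegaBeam-Mistral-7B-512k.Q8_0.gguf",  # illustrative path
    n_ctx=210_000,    # the context size that fit on 2x V100 + 1x P40
    n_gpu_layers=-1,  # offload every layer to GPU
)
out = llm("Summarize the following log:", max_tokens=64)
print(out["choices"][0]["text"])
```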