Kacper Głombiewski

Clausss

AI & ML interests

None yet

Recent Activity

Organizations

None yet

Clausss's activity

Reacted to mikelabs's post with 🔥 3 days ago
New activity in Qwen/Qwen2.5-Coder-Artifacts 3 days ago

Upload 81 files

#8 opened 5 days ago by Jorge-Ali
Reacted to fdaudens's post with ❤️ 4 days ago
🦋 Hug the butterfly! You can now add your Bluesky handle to your Hugging Face profile! ✨
New activity in Qwen/Qwen2.5-Coder-Artifacts 7 days ago

Update app.py

#7 opened 8 days ago by yesirecarolinasoto
Reacted to etemiz's post with 🔥 7 days ago
If I host on HF Spaces, can I interact with the app using an API?
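Yes - Spaces built with Gradio expose an HTTP API. A minimal sketch, assuming a hypothetical Gradio Space reachable at some `https://<owner>-<space>.hf.space` URL; the exact route and input types are listed on each Space's "Use via API" page, so treat the endpoint below as an assumption:

```python
import json
from urllib import request

API_PATH = "/api/predict"  # classic Gradio REST route - check the Space's
                           # "Use via API" page for the exact endpoint.

def build_payload(*inputs):
    # Gradio's REST endpoints accept a JSON body of the form {"data": [...]}
    return {"data": list(inputs)}

def query_space(space_url, *inputs):
    """POST inputs to a running Space (needs network, so not called here)."""
    body = json.dumps(build_payload(*inputs)).encode()
    req = request.Request(space_url + API_PATH, data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["data"]

print(build_payload("Hello, Space!"))  # {'data': ['Hello, Space!']}
```

The `gradio_client` package wraps the same API at a higher level if you prefer not to build requests by hand.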
Reacted to m-ric's post with 👀 15 days ago
A non-Instruct LLM assistant is mostly useless. 🧐

Since it's mostly a model trained to complete text, when you ask a question like "What to do during a stopover in Paris?", it can just go on and on adding more details to your question instead of answering - a valid way to complete text from its training corpus, but not a valid way to answer a question.

➡️ So the post-training stage includes an important instruction-tuning step where you teach your model how to be useful: answer questions, be concise, be polite... RLHF is a well-known technique for this.

For people interested in understanding how this step works, the folks at Adaptive ML have made a great guide!

Read it here 👉 https://www.adaptive-ml.com/post/from-zero-to-ppo
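The data side of the step described above can be sketched in a few lines: before instruction tuning, each (question, answer) pair is wrapped in a chat template so the model learns to answer rather than keep completing. The template below is a made-up illustration - real models each ship their own format (exposed in `transformers` via `tokenizer.apply_chat_template`):

```python
# Hypothetical chat template, for illustration only.
TEMPLATE = "<|user|>\n{question}\n<|assistant|>\n{answer}<|end|>"

def format_example(question: str, answer: str) -> str:
    """Wrap one supervised (question, answer) pair for instruction tuning."""
    return TEMPLATE.format(question=question, answer=answer)

print(format_example(
    "What to do during a stopover in Paris?",
    "With a few hours, head to the Eiffel Tower or the Louvre.",
))
```

During training, the loss is usually masked on the user turn so the model is only optimized to produce the assistant's answer.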
Reacted to nroggendorff's post with 👀 21 days ago
I still think whitespace in tokenizers is so dumb.
Congrats, you just doubled your vocab size for no reason.
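The complaint is about byte-pair vocabularies that keep the leading space inside the token, so "cat" and " cat" become distinct entries. A toy illustration of the effect (not any real tokenizer): every word that can appear both at the start of a text and mid-sentence needs two vocabulary entries.

```python
corpus = "the cat sat on the mat because the cat was near the mat"

words = corpus.split()
plain_vocab = set(words)  # vocab if whitespace were handled separately

# Space-aware tokens: the first word has no leading space, every later
# word carries one - so each word potentially needs both forms.
spaced_tokens = [words[0]] + [" " + w for w in words[1:]]
spaced_vocab = plain_vocab | set(spaced_tokens)

print(len(plain_vocab), len(spaced_vocab))  # 8 16
```

In this toy corpus the space-marked vocabulary is exactly twice the size, which is the "doubled your vocab" complaint in miniature; real tokenizers don't double perfectly, but frequent words do get both forms.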
New activity in cfahlgren1/qwen-2.5-code-interpreter 25 days ago

Is this 4-bit quantized?

#3 opened about 1 month ago by Clausss
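Whatever the answer for this particular repo, quantization questions usually come down to memory. A back-of-the-envelope sketch of weight memory for a hypothetical 7B-parameter model at different precisions (weights only, ignoring activations and the KV cache):

```python
def weight_gib(n_params: float, bits_per_weight: int) -> float:
    """Memory needed for the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30

n = 7e9  # hypothetical 7B-parameter model
for bits in (32, 16, 4):
    print(f"{bits:>2}-bit: {weight_gib(n, bits):5.1f} GiB")
# 4-bit needs exactly a quarter of the fp16 footprint (~3.3 vs ~13.0 GiB)
```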
Reacted to automatedstockminingorg's post with 👀 25 days ago
Hi everyone, I have just uploaded my first fine-tuned model, but the serverless inference client isn't available. It's built with the transformer architecture and is just a fine-tuned Llama 8B Instruct. Does anyone know how to make serverless inference available on a model?
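For reference, once serverless inference is available for a model (availability depends on the architecture being supported by the hosted backends, not just on the repo), it is reachable over plain HTTP. A sketch with a hypothetical model id and token - only the request is built here, since the call itself needs network access:

```python
import json
from urllib import request

API_ROOT = "https://api-inference.huggingface.co/models"

def build_request(model_id: str, prompt: str, token: str) -> request.Request:
    """Prepare a POST to the serverless Inference API for `model_id`."""
    return request.Request(
        f"{API_ROOT}/{model_id}",
        data=json.dumps({"inputs": prompt}).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )

# Hypothetical repo name and token, for illustration only.
req = build_request("someuser/llama-8b-finetune", "Hello!", "hf_xxx")
print(req.full_url)  # send it with urllib.request.urlopen(req)
```

The `huggingface_hub` library's `InferenceClient` wraps this same endpoint if you prefer a higher-level interface.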
Reacted to m-ric's post with 🔥 25 days ago
> Oasis: First Real-Time Video Game Without a Game Engine! 🎮

DecartAI & Etched just released Oasis - a fully AI-generated video game running at 20 FPS (frames per second). The model takes keyboard inputs and generates everything - physics, rules, graphics - on the fly, without any game engine.

⚡️ What makes this special? Current text-to-video models (Mochi-1, Sora, Kling) generate about 1 frame every 10-20 seconds (that's the kind of device I had to play LoL on back in the day, hence my low rankings). Oasis is 200 times faster, making it the first playable AI-generated game.

⚙️ Under the hood, it uses a vision transformer to encode space and a diffusion model to generate frames. The secret sauce is "dynamic noising" - a technique that keeps the video stable between frames.

Key insights:
⚡️ Generates 20 FPS, vs 0.2 FPS for other DiT-based video models
‣ Etched's specialized Sohu hardware can handle 10x more players than an H100

🎮 Features real game mechanics
‣ Movement, jumping, item management
‣ Physics and lighting
‣ Procedurally generated worlds

⚠️ Current limitations
‣ Blurry graphics at a distance
‣ Objects sometimes change appearance
‣ Memory issues in long sessions

Try it yourself, the playable demo is impressive! 👉 https://oasis.decart.ai/welcome
Code 👉 https://github.com/etched-ai/open-oasis
Read it in full 👉 https://oasis-model.github.io/
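The "dynamic noising" idea can be sketched conceptually (this is an illustration of the principle, not DecartAI's implementation): instead of denoising every frame from pure noise, the previous frame is only partially re-noised, so the denoiser starts close to the last frame and the video stays stable.

```python
import random

def renoise(prev_frame, noise_level):
    """Blend the previous frame with fresh noise; 0 keeps it, 1 discards it."""
    return [(1 - noise_level) * px + noise_level * random.gauss(0.0, 1.0)
            for px in prev_frame]

random.seed(0)
frame = [0.5] * 8                        # stand-in for a decoded frame
noisy = renoise(frame, noise_level=0.3)  # the diffusion model would now
                                         # refine this, conditioned on the
                                         # player's keyboard input
print(len(noisy))
```

Because the starting point already carries most of the previous frame's content, the model only has to resolve what changed, which is part of how the reported 20 FPS becomes reachable.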
Reacted to huggingface0's post with 🤯 about 2 months ago
1+2=3
Reacted to takeraparterer's post with 👀🚀 about 2 months ago
Check this out: I trained an AI on Hugging Face posts! All of these are AI-generated:
----------
Hello!

I'm excited to share that my colleague @felipeebert and I have released the largest Spanish LLM benchmark to date.

We've developed the Spanish LLM Evaluation Benchmark (SLAB), a set of benchmarks designed to evaluate the ability of language models to understand, generate and translate in Spanish.

SLAB includes five different benchmarks:
- Sentiment Analysis: evaluate models' ability to detect and describe sentiment in natural language
- Fact Checking: evaluate models' ability to detect and refute factual errors in text
- Question Answering: evaluate models' ability to answer questions in Spanish
- Open-ended Questions: evaluate models' ability to generate coherent responses in Spanish
- Translation: evaluate models' ability to translate in Spanish

SLAB is aligned with the latest Spanish LLM industry developments and includes the most recent models available on the market. We aim to keep our benchmarks up-to-date and relevant to the Spanish language ecosystem.

SLAB is available at: https://huggingface.co/datasets/argilla/SLAB.

If you would like to collaborate on building additional Spanish LLM benchmarks, let's discuss in the comments.

🔗 SLAB Blog Post: https://argilla.com/blog/slab
----------
Hello everyone,

I'm thrilled to announce the release of

https://huggingface.co/01-AI/01AI-GPT-4o -

A new family of models that brings the power of transformer AI to the masses.

This model is designed to be accessible and easy to use, while still offering high-quality results.

Key features:
- Small model size: only 23M parameters
- Supports text generation, image generation, and text-to-image tasks
- Data-efficient training with a lightweight tokenizer
- Optimized for efficient on-device usage
- Uses the powerful transformer architecture to deliver high-quality results

Excited to see what you all think!

https://huggingface.co/01-AI/01AI-GPT-4o
New activity in Qwen/Qwen2.5-1.5B-Instruct about 2 months ago

Does this model have 50M downloads?

#2 opened about 2 months ago by Clausss
Reacted to MonsterMMORPG's post with 🤯🤝 about 2 months ago
Huge news for Kohya GUI - now you can fully fine-tune / DreamBooth FLUX Dev with as little as a 6 GB GPU, with no quality loss compared to 48 GB GPUs. Moreover, fine-tuning yields better results than any LoRA training could.

Config Files
I published all configs here : https://www.patreon.com/posts/112099700

Tutorials
Fine tuning tutorial in production

Windows FLUX LoRA training (fine tuning is same just config changes) : https://youtu.be/nySGu12Y05k

Cloud FLUX LoRA training (RunPod and Massed Compute ultra cheap) : https://youtu.be/-uhL2nW7Ddw

LoRA Extraction
The checkpoint sizes are 23.8 GB, but you can extract a LoRA with almost no quality loss - I did the research and published a public article / guide for this as well.

The LoRA extraction guide for a fine-tuned checkpoint is here: https://www.patreon.com/posts/112335162
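The extraction idea itself (independent of Kohya's exact implementation) is standard: take the weight delta between the fine-tuned and base checkpoints and keep its top-r singular directions as the LoRA factors. A toy numpy sketch where the update really is low-rank, so the extraction is near-lossless:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 8

base = rng.standard_normal((d, d))
# Pretend fine-tuning produced a genuinely low-rank update.
delta = rng.standard_normal((d, r)) @ rng.standard_normal((r, d))
tuned = base + delta

# Truncated SVD of the delta gives the LoRA "up" (B) and "down" (A) factors.
U, S, Vt = np.linalg.svd(tuned - base)
B = U[:, :r] * S[:r]   # shape (d, r)
A = Vt[:r, :]          # shape (r, d)

err = float(np.abs(delta - B @ A).max())
print(f"max reconstruction error: {err:.1e}")
```

Real fine-tuning deltas are not exactly low-rank, which is why the guide reports "almost no" rather than zero quality loss: the truncated SVD discards whatever energy sits beyond rank r.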

Info
This is just mind-blowing. The recent block-swapping improvements Kohya made are just amazing.

Speeds are also amazing, as you can see in image 2 - of course, those values are based on my researched config and were tested on an RTX A6000 - almost the same speed as an RTX 3090.

Also, all training experiments were made at 1024x1024 px. If you use a lower resolution, it will use less VRAM and run faster.

VRAM usage will vary with your own configuration - and likely speed as well.

Moreover, Fine Tuning / DreamBooth yields better results than any LoRA could

Installers
1 - Kohya GUI accurate branch, Windows Torch 2.5 installers, and test prompts are shared here: https://www.patreon.com/posts/110879657

The link to the Kohya GUI accurate branch: https://github.com/bmaltais/kohya_ss/tree/sd3-flux.1