stargates goin craaa
alkinun
AtAndDev
AI & ML interests
LLMs, Alignment, Merging, Unsloth, DPO, SFT, ORPO, SPIN..
Recent Activity
replied to their post 2 days ago
@nroggendorff is that you sama?
posted an update 2 days ago
@nroggendorff is that you sama?
reacted to nroggendorff's post 2 days ago
hello, dev mode explorers!
Organizations
AtAndDev's activity
![](https://cdn-avatars.huggingface.co/v1/production/uploads/630f3e4002ce39336c411048/Wlldb-4EU5poO6JYmZp2I.png)
replied to their post 2 days ago
reacted to nroggendorff's post 2 days ago
reacted to merve's post with ❤️🔥 2 days ago
Post
4276
Your weekly recap of open AI is here, and it's packed with models!
merve/feb-14-releases-67af876b404cc27c6d837767
Multimodal
> OpenGVLab released InternVideo 2.5 Chat models, new video LMs with long context
> AIDC released Ovis2 model family along with Ovis dataset, new vision LMs in different sizes (1B, 2B, 4B, 8B, 16B, 34B), with video and OCR support
> ColQwenStella-2b is a multilingual visual retrieval model that is state-of-the-art for its size
> Hoags-2B-Exp is a new multilingual vision LM with contextual reasoning, long context video understanding
LLMs
A lot of math models!
> The Open-R1 team released OpenR1-Math-220k, a large-scale math reasoning dataset, along with OpenR1-Qwen-7B, a Qwen2.5 math model fine-tuned on the dataset
> Nomic AI released a new Nomic Embed multilingual retrieval model, a MoE with 500M params, 305M of them active, outperforming other models
> DeepScaleR-1.5B-Preview is a new DeepSeek-R1-Distill fine-tune using distributed RL on math
> LIMO is a new fine-tune of Qwen2.5-32B-Instruct on Math
🗣️ Audio
> Zonos-v0.1 is a new family of text-to-speech models, which contains the model itself and embeddings
🖼️ Vision and Image Generation
> We have ported Apple's DepthPro to transformers for your convenience!
> illustrious-xl-v1.0 is a new illustration generation model
reacted to m-ric's post 2 days ago
Post
2159
For those who haven't come across it yet, here's a handy trick to discuss an entire GitHub repo with an LLM:
=> Just replace "github" with "gitingest" in the URL, and you get the whole repo as a single string that you can then paste into your LLM
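A minimal sketch of that URL swap in Python, in case you want to script it. The function name `to_gitingest_url` and the `owner/repo` path are made up for illustration, and it assumes the swap is simply `github.com` becoming `gitingest.com` with the rest of the URL left untouched:

```python
from urllib.parse import urlsplit, urlunsplit


def to_gitingest_url(github_url: str) -> str:
    """Swap the github.com host for gitingest.com, keeping path and query as-is.

    Hypothetical helper: it only rewrites the URL, it does not fetch anything.
    """
    parts = urlsplit(github_url)
    if parts.netloc.lower() not in ("github.com", "www.github.com"):
        raise ValueError(f"not a GitHub URL: {github_url}")
    # Only the host changes; scheme, path, query and fragment are preserved.
    return urlunsplit(parts._replace(netloc="gitingest.com"))


# "owner/repo" is a placeholder, not a real repository.
print(to_gitingest_url("https://github.com/owner/repo"))
```

Parsing the URL instead of calling `str.replace` avoids rewriting a repo or user name that happens to contain "github".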
reacted to merve's post 9 days ago
Post
2223
IBM released ibm-granite/granite-vision-3.1-2b-preview, a small vision LM with impressive performance on different tasks
it comes with transformers and vLLM support from the get-go
you can run it on a Colab T4, so I built a notebook to put it to the test, find it here: https://github.com/merveenoyan/smol-vision/blob/main/inference_gists/IBM_Granite_Vision.ipynb
reacted to ginipick's post with 🔥 9 days ago
Post
5172
3D Llama Studio - AI 3D Generation Platform
Project Overview
3D Llama Studio is an all-in-one AI platform that generates high-quality 3D models and stylized images from text or image inputs.
✨ Key Features
Text/Image to 3D Conversion
Generate 3D models from detailed text descriptions or reference images
Intuitive user interface
Text to Styled Image Generation
Customizable image generation settings
Adjustable resolution, generation steps, and guidance scale
Supports both English and Korean prompts
🛠️ Technical Features
Gradio-based web interface
Dark theme UI/UX
Real-time image generation and 3D modeling
Highlights
User-friendly interface
Real-time preview
Random seed generation
High-resolution output support (up to 2048x2048)
Applications
Product design
Game asset creation
Architectural visualization
Educational 3D content
Try It Now!
Experience 3D Llama Studio:
ginigen/3D-LLAMA
#AI #3DGeneration #MachineLearning #ComputerVision #DeepLearning
reacted to fdaudens's post with 🔥 14 days ago
Post
3323
Kokoro TTS just hit v1.0!
Small but mighty: 82M parameters, runs locally, speaks multiple languages. The best part? It's Apache 2.0 licensed!
This could unlock so many possibilities ✨
Check it out: hexgrad/Kokoro-82M
reacted to prithivMLmods's post with 🔥 20 days ago
reacted to onekq's post 24 days ago
Post
2284
So DeepSeek hits the mainstream media. But it has been a star in our little cult for at least 6 months. Its meteoric success did not happen overnight; it was two years in the making.
To learn their history, just look at their 🤗 repo https://huggingface.co/deepseek-ai
* End of 2023, they launched the first model (pretrained by themselves) following Llama 2 architecture
* June 2024, v2 (MoE architecture) surpassed Gemini 1.5, but was behind Mistral
* September, v2.5 surpassed GPT 4o mini
* December, v3 surpassed GPT 4o
* Now R1 surpassed o1
Most importantly, if you think DeepSeek's success is singular and unrivaled, that's WRONG. The following models are also near or at the o1 bar.
* Minimax-01
* Kimi k1.5
* Doubao 1.5 pro
i believe sglang would be even faster but not sure if it supports non-nvidia devices
upvoted a collection 25 days ago