Audiovisual - a melsiddieg Collection

Audiovisual

updated 17 days ago

microsoft/VibeVoice-1.5B

Text-to-Speech • 3B • Updated Jan 22 • 117k • 2.22k

ibm-granite/granite-docling-258M

Image-Text-to-Text • Updated Sep 23, 2025 • 190k • 1.13k

deepseek-ai/DeepSeek-OCR

Image-Text-to-Text • Updated Nov 4, 2025 • 3.21M • 3.16k

Qwen/Qwen3-VL-2B-Thinking

Image-Text-to-Text • 2B • Updated Oct 20, 2025 • 51.1k • 105

datalab-to/chandra

Image-Text-to-Text • 9B • Updated Oct 21, 2025 • 186k • 490

Qwen/Qwen3-VL-2B-Instruct

Image-Text-to-Text • Updated Oct 23, 2025 • 12.7M • 331

PokeeAI/pokee_research_7b

Text Generation • 8B • Updated Oct 23, 2025 • 354 • 100

openbmb/MiniCPM-o-4_5

Any-to-Any • 9B • Updated 8 days ago • 85.6k • 873

Qwen/Qwen3-ForcedAligner-0.6B

Automatic Speech Recognition • Updated 24 days ago • 72k • 89