In a Training Loop 🔄

Urro PRO

urroxyz · https://urro.xyz/

AI & ML interests

computational linguistics major 🤖🔎🔠 i am autistic. if i come off rude, i probably didn't mean to. please feel free to ask me for clarification.

Recent Activity

updated a collection about 3 hours ago

WTF GENIUS PAPERS

Papers that made me appreciate my major and my life a little more. obs=Observation, innov=Innovation. Most papers are abt improving tiny models. • 84 items • Updated about 3 hours ago • 11

upvoted a paper about 3 hours ago

IMU-1: Sample-Efficient Pre-training of Small Language Models

Paper • 2602.02522 • Published Jan 25 • 7

upvoted 4 papers 1 day ago

COSMOS: Predictable and Cost-Effective Adaptation of LLMs

Paper • 2505.01449 • Published Apr 30, 2025 • 4

SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models

Paper • 2403.07384 • Published Mar 12, 2024 • 3

Less is More: Improving LLM Alignment via Preference Data Selection

Paper • 2502.14560 • Published Feb 20, 2025 • 1

What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning

Paper • 2312.15685 • Published Dec 25, 2023 • 17

New activity in blog-explorers/README 2 days ago

The Next Evolution of AI: From Passive Models to Autonomous Systems

#15 opened 2 days ago by

commented on Welcome Gemma 4: Frontier multimodal intelligence on device 2 days ago

Somewhat disappointing release, in my opinion.

However, I adore audio understanding models, so it's nice to see more of those. My favorite right now is MERALION, but it's 10B. I guess another perk would be the safety alignment. I'm sure the Gemma 4 series will be useful to some people. Not me, though...

I just wish larger tech companies would take a page out of OpenAI's book and release actually competitive OSS instead of putting out generic models just to say they support public research.

But the models are stable. That's good. Definitely more consistency and token efficiency than Qwen.

commented a paper 3 days ago

Embarrassingly Simple Self-Distillation Improves Code Generation

Paper • 2604.01193 • Published 3 days ago • 24

updated a collection 3 days ago

WTF GENIUS PAPERS

upvoted 2 papers 3 days ago

GaussianGPT: Towards Autoregressive 3D Gaussian Scene Generation

Paper • 2603.26661 • Published 8 days ago • 18

Embarrassingly Simple Self-Distillation Improves Code Generation

Paper • 2604.01193 • Published 3 days ago • 24

updated 2 collections 4 days ago

WTF GENIUS PAPERS

TINY MODELS WITH BIG INTELLIGENCE

Tiny (<30B) models that tend to outperform their same-parameter counterparts. • 17 items • Updated 4 days ago • 3

liked a model 4 days ago

LiquidAI/LFM2.5-350M

Text Generation • 0.4B • Updated 3 days ago • 12.5k • 225

upvoted 2 collections 4 days ago

Bonsai-Auxiliary

3 items • Updated 4 days ago • 6

Bonsai

1-bit Bonsai models • 6 items • Updated 4 days ago • 138

updated a collection 4 days ago

TINY MODELS WITH BIG INTELLIGENCE

liked a model 4 days ago

prism-ml/Bonsai-8B-gguf

Text Generation • 8B • Updated 5 days ago • 32.9k • 387

updated a collection 4 days ago

HUMAN-WRITTEN & LEGALLY-SOURCED*

Datasets written by humans and/or reverse-engineered from text with deterministic algorithms. No illegal scraping or unethical synthesis *...mostly. • 162 items • Updated 4 days ago • 2