Xi

xi0v

AI & ML interests

RL, Model merging, Model Editing and Vision/Multimodal Model Fine-tuning.

Recent Activity

liked a model about 11 hours ago

CabalResearch/NoobAI-V-pred-to-Flow-LoRA

liked a model about 22 hours ago

Qwen/Qwen3-30B-A3B-Instruct-2507

liked a Space 2 days ago

IdlecloudX/NewBie-image-Exp0.1-Diffusers

upvoted a paper 19 days ago

Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance

Paper • 2511.13254 • Published 23 days ago • 134

upvoted a collection 23 days ago

timm DINOv3

Meta AI's DINOv3 weights in timm. ViTs with `qkvb` have a zero QV bias present, otherwise bias is disabled. QKV bias are all 0 in original weights. • 18 items • Updated Sep 19 • 24

upvoted a paper 24 days ago

Black-Box On-Policy Distillation of Large Language Models

Paper • 2511.10643 • Published 27 days ago • 46

upvoted an article 24 days ago

Projected Abliteration

Article • Oct 25 • 31

upvoted a paper about 1 month ago

π_RL: Online RL Fine-tuning for Flow-based Vision-Language-Action Models

Paper • 2510.25889 • Published Oct 29 • 64

upvoted a collection 2 months ago

_Originals

37 items • Updated Jan 20 • 1

upvoted a paper 3 months ago

AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

Paper • 2509.08755 • Published Sep 10 • 56

upvoted 2 papers 4 months ago

Puppeteer: Rig and Animate Your 3D Models

Paper • 2508.10898 • Published Aug 14 • 33

Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory

Paper • 2508.09736 • Published Aug 13 • 57

upvoted a collection 4 months ago

Hybrid Linear Attention Research

All 1.3B & 340M hybrid linear-attention experiments. • 62 items • Updated Sep 11 • 12

upvoted a paper 4 months ago

Geometric-Mean Policy Optimization

Paper • 2507.20673 • Published Jul 28 • 31

upvoted an article 5 months ago

Vibe coding for data science: how to label a dataset with Kimi K2

Article • Jul 22 • 21

upvoted 5 papers 5 months ago

Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities

Paper • 2507.13158 • Published Jul 17 • 23

One Token to Fool LLM-as-a-Judge

Paper • 2507.08794 • Published Jul 11 • 31

Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation

Paper • 2507.02608 • Published Jul 3 • 21

Fast and Simplex: 2-Simplicial Attention in Triton

Paper • 2507.02754 • Published Jul 3 • 26

FaSTA^*: Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing

Paper • 2506.20911 • Published Jun 26 • 41

upvoted an article 6 months ago

Gemma 3n fully available in the open-source ecosystem!

Article • Jun 26 • 120

upvoted 2 papers 6 months ago

A Rank Stabilization Scaling Factor for Fine-Tuning with LoRA

Paper • 2312.03732 • Published Nov 28, 2023 • 11

MoTE: Mixture of Ternary Experts for Memory-efficient Large Multimodal Models

Paper • 2506.14435 • Published Jun 17 • 7