r PRO

oceansweep

AI & ML interests

None yet

Recent Activity

liked a model about 14 hours ago

MiniMaxAI/MiniMax-M2

liked a model 7 days ago

IndexTeam/IndexTTS-2

liked a model 7 days ago

deepseek-ai/DeepSeek-OCR

Organizations

None yet

upvoted 4 papers 9 days ago

Large Language Models Do NOT Really Know What They Don't Know

Paper • 2510.09033 • Published 18 days ago • 16

BitNet Distillation

Paper • 2510.13998 • Published 12 days ago • 49

AI for Service: Proactive Assistance with AI Glasses

Paper • 2510.14359 • Published 12 days ago • 71

When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA

Paper • 2510.04849 • Published 21 days ago • 107

upvoted a paper 18 days ago

UNIDOC-BENCH: A Unified Benchmark for Document-Centric Multimodal RAG

Paper • 2510.03663 • Published 24 days ago • 15

upvoted a paper 23 days ago

Fine-Grained Detection of Context-Grounded Hallucinations Using LLMs

Paper • 2509.22582 • Published Sep 26 • 10

upvoted 6 papers 24 days ago

Learning to Reason for Hallucination Span Detection

Paper • 2510.02173 • Published 25 days ago • 18

F2LLM Technical Report: Matching SOTA Embedding Performance with 6 Million Open-Source Data

Paper • 2510.02294 • Published 25 days ago • 42

Tree-based Dialogue Reinforced Policy Optimization for Red-Teaming Attacks

Paper • 2510.02286 • Published 25 days ago • 28

The Rogue Scalpel: Activation Steering Compromises LLM Safety

Paper • 2509.22067 • Published Sep 26 • 27

CLUE: Non-parametric Verification from Experience via Hidden-State Clustering

Paper • 2510.01591 • Published 26 days ago • 26

LongCodeZip: Compress Long Context for Code Language Models

Paper • 2510.00446 • Published 27 days ago • 107

upvoted 2 papers 26 days ago

jina-reranker-v3: Last but Not Late Interaction for Document Reranking

Paper • 2509.25085 • Published 28 days ago • 6

The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

Paper • 2509.26507 • Published 27 days ago • 510

upvoted 2 papers about 2 months ago

FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games

Paper • 2509.01052 • Published Sep 1 • 20

ELV-Halluc: Benchmarking Semantic Aggregation Hallucinations in Long Video Understanding

Paper • 2508.21496 • Published Aug 29 • 54

upvoted a collection 2 months ago

VibeVoice

Frontier Text-to-Speech Models https://microsoft.github.io/VibeVoice/ • 5 items • Updated Sep 1 • 129

upvoted a paper 2 months ago

Story2Board: A Training-Free Approach for Expressive Storyboard Generation

Paper • 2508.09983 • Published Aug 13 • 68

upvoted a paper 3 months ago

SPAR: Scholar Paper Retrieval with LLM-based Agents for Enhanced Academic Search

Paper • 2507.15245 • Published Jul 21 • 11

upvoted a paper 4 months ago

Self-Correction Bench: Revealing and Addressing the Self-Correction Blind Spot in LLMs

Paper • 2507.02778 • Published Jul 3 • 9