Article Releasing the largest multilingual open pretraining dataset By Pclanglais • 10 days ago • 94
Large Scale Transfer Learning for Tabular Data via Language Modeling Paper • 2406.12031 • Published Jun 17 • 9
TabuLa-8B Collection Training, eval suite, and model from the paper "Large Scale Transfer Learning for Tabular Data via Language Modeling" https://arxiv.org/abs/2406.12031 • 4 items • Updated Jun 19 • 10
MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens Paper • 2406.11271 • Published Jun 17 • 20
🍃 MINT-1T Collection Data for "MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens" • 13 items • Updated Jul 24 • 54
📚 FineWeb-Edu Collection FineWeb-Edu datasets, classifier and ablation model • 5 items • Updated Jun 12 • 11
Switch-Transformers release Collection This release includes various MoE (Mixture of Experts) models based on the T5 architecture. The base models have from 8 to 256 experts. • 9 items • Updated Jul 31 • 15
Adaptive Caching for Faster Video Generation with Diffusion Transformers Paper • 2411.02397 • Published 19 days ago • 20
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning Paper • 2410.02884 • Published Oct 3 • 50
Model Depot Collection Leading generative models packaged in OpenVINO format, optimized for use on AI PCs • 50 items • Updated 27 days ago • 5
Functionary V3.2 Collection Fine-tuning Llama-3.1 using our own prompt template for function calling • 3 items • Updated Oct 16 • 1
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 10 items • Updated 3 days ago • 176