AI Safety Research

AISafety · https://humanaligned.ai

AI & ML interests

LLMs, planning, EA

Recent Activity

liked a dataset about 1 month ago

LightningRodLabs/future-as-label-paper-training-dataset

liked a model about 1 month ago

cyankiwi/GLM-4.7-Flash-AWQ-4bit

liked a model about 1 month ago

zai-org/GLM-4.7-Flash

upvoted 2 articles 2 months ago

Article

Exploring Environments Hub: Your Language Model needs better (open) environments to learn

Sep 4, 2025 • 30

Article

HUMAINE: A Rigorous Framework for Understanding AI Through Human Experience

Sep 16, 2025 • 7

upvoted an article 3 months ago

Article

From GRPO to DAPO and GSPO: What, Why, and How

Aug 9, 2025 • 91

upvoted a collection 3 months ago

Olmo 3

Artifacts for the Olmo 3 release. • 7 items • Updated 1 day ago • 164

upvoted an article 3 months ago

Article

We Got Claude to Fine-Tune an Open Source LLM

Dec 4, 2025 • 606

upvoted a collection 3 months ago

Transformers.js demos

A collection of my favorite WebML demos, built with Transformers.js! • 30 items • Updated Jul 11, 2024 • 140

upvoted a paper 3 months ago

DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning

Paper • 2511.22570 • Published Nov 27, 2025 • 91

upvoted a paper 4 months ago

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published Nov 9, 2025 • 132

upvoted a collection 4 months ago

The Bestiary

Decensored language models made using Heretic (https://github.com/p-e-w/heretic) • 6 items • Updated Nov 16, 2025 • 100

upvoted an article 4 months ago

Article

EuroLLM-9B

Dec 2, 2024 • 139

upvoted a collection 5 months ago

🎯 Liquid Nanos

Library of task-specific models: https://www.liquid.ai/blog/introducing-liquid-nanos-frontier-grade-performance-on-everyday-devices • 26 items • Updated 7 days ago • 109

upvoted an article 5 months ago

Article

SOTA OCR with Core ML and dots.ocr

Oct 2, 2025 • 63

upvoted a collection 5 months ago

DeepSeek-V3.2

4 items • Updated Dec 1, 2025 • 531

upvoted a paper 5 months ago

Reinforcement Learning on Pre-Training Data

Paper • 2509.19249 • Published Sep 23, 2025 • 67

upvoted 2 collections 6 months ago

InternVL3.5

This collection includes all released checkpoints of InternVL3.5, covering different training stages (e.g., Pretraining, SFT, MPO, Cascade RL). • 45 items • Updated 1 day ago • 106

DeepSeek-V3.1

3 items • Updated 1 day ago • 261

upvoted an article 7 months ago

Article

Introducing AI Sheets: a tool to work with datasets using open AI models!

Aug 8, 2025 • 108

upvoted a paper 7 months ago

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Paper • 2508.06471 • Published Aug 8, 2025 • 206

upvoted 2 collections 7 months ago

GLM-4.5

GLM-4.5: An open-source large language model designed for intelligent agents by Z.ai • 8 items • Updated 1 day ago • 252

Sparse Autoencoders

SAEs are tools for understanding the internal representations of neural networks. These can be loaded using https://github.com/EleutherAI/sae • 9 items • Updated Feb 26, 2025 • 7