Flax Community

non-profit

https://github.com/huggingface/transformers/tree/master/examples/research_projects/jax-projects

AI & ML interests

JAX, Flax, TPU, 🤗

Recent Activity

gigant authored a paper 6 days ago

Decoupling the Benefits of Subword Tokenization for Language Model Training via Byte-level Simulation

gigant submitted a paper 7 days ago

Decoupling the Benefits of Subword Tokenization for Language Model Training via Byte-level Simulation

gigant authored a paper 14 days ago

Efficient Pre-Training with Token Superposition

submitted 2 papers to Daily Papers about 8 hours ago

ResearchMath-14K: Scaling Research-Level Mathematics via Agents

Paper • 2605.28003 • Published 1 day ago • 33

Chartographer: Counterfactual Chart Generation for Evaluating Vision-Language Models

Paper • 2605.27311 • Published 2 days ago • 3

submitted a paper to Daily Papers 16 days ago

Soohak: A Mathematician-Curated Benchmark for Evaluating Research-level Math Capabilities of LLMs

Paper • 2605.09063 • Published 19 days ago • 79

authored 2 papers 3 months ago

BabyBabelLM: A Multilingual Benchmark of Developmentally Plausible Training Data

Paper • 2510.10159 • Published Oct 11, 2025 • 3

Measuring what Matters: Construct Validity in Large Language Model Benchmarks

Paper • 2511.04703 • Published Nov 3, 2025 • 8

submitted a paper to Daily Papers 4 months ago

Judging What We Cannot Solve: A Consequence-Based Approach for Oracle-Free Evaluation of Research-Level Math

Paper • 2602.06291 • Published Feb 6 • 24

submitted a paper to Daily Papers 4 months ago

Learning to Discover at Test Time

Paper • 2601.16175 • Published Jan 22 • 45

authored a paper 6 months ago

Economies of Open Intelligence: Tracing Power & Participation in the Model Ecosystem

Paper • 2512.03073 • Published Nov 27, 2025 • 7

authored 2 papers 6 months ago

On Space Folds of ReLU Neural Networks

Paper • 2502.09954 • Published Feb 14, 2025

The Space Between: On Folding, Symmetries and Sampling

Paper • 2503.08502 • Published Mar 11, 2025

authored a paper 7 months ago

The German Commons - 154 Billion Tokens of Openly Licensed Text for German Language Models

Paper • 2510.13996 • Published Oct 15, 2025 • 9

authored a paper 7 months ago

Robot Learning: A Tutorial

Paper • 2510.12403 • Published Oct 14, 2025 • 132

posted an update 8 months ago

Post

762

Something very cool is cooking at

1 reply

authored a paper 8 months ago

EmbeddingGemma: Powerful and Lightweight Text Representations

Paper • 2509.20354 • Published Sep 24, 2025 • 50

authored 4 papers 8 months ago

Adding LLMs to the psycholinguistic norming toolbox: A practical guide to getting the most out of human ratings

Paper • 2509.14405 • Published Sep 17, 2025 • 2

Psycholinguistic Word Features: a New Approach for the Evaluation of LLMs Alignment with Humans

Paper • 2506.22439 • Published May 29, 2025 • 3

Apertus: Democratizing Open and Compliant LLMs for Global Language Environments

Paper • 2509.14233 • Published Sep 17, 2025 • 20

La Leaderboard: A Large Language Model Leaderboard for Spanish Varieties and Languages of Spain and Latin America

Paper • 2507.00999 • Published Jul 1, 2025 • 1

posted an update 8 months ago

Post

8945

We're kick-starting the process of Transformers v5, with @ArthurZ and @cyrilvallez !

v5 should be significant: we're using it as a milestone for performance optimizations, saner defaults, and a much cleaner code base worthy of 2025.

Fun fact: v4.0.0-rc-1 came out on Nov 19, 2020, nearly five years ago!

6 replies

in flax-community/roberta-base-mr 10 months ago

Adding `safetensors` variant of this model

#1 opened over 1 year ago by