Article From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels Aug 18 • 78
100 Days After DeepSeek-R1: A Survey on Replication Studies and More Directions for Reasoning Language Models Paper • 2505.00551 • Published May 1 • 36
Reasoning-SQL: Reinforcement Learning with SQL Tailored Partial Rewards for Reasoning-Enhanced Text-to-SQL Paper • 2503.23157 • Published Mar 29 • 10
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM Mar 12 • 465
Article Hugging Face Welcomes the Qwen2.5-Coder Series By ariG23498 • Nov 12, 2024 • 7
CHESS: Contextual Harnessing for Efficient SQL Synthesis Paper • 2405.16755 • Published May 27, 2024 • 1
Article Fine-tuning LLMs to 1.58bit: extreme quantization made easy Sep 18, 2024 • 267
Article Improving Hugging Face Training Efficiency Through Packing with Flash Attention Aug 21, 2024 • 41
Article Training and Finetuning Embedding Models with Sentence Transformers v3 May 28, 2024 • 253
Article StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation Apr 29, 2024 • 79
Article Text2SQL using Hugging Face Dataset Viewer API and Motherduck DuckDB-NSQL-7B Apr 4, 2024 • 29