Collections

Discover the best community collections!

Collections including paper arxiv:2501.09223

about 15 hours ago

STaR: Bootstrapping Reasoning With Reasoning

Paper • 2203.14465 • Published Mar 28, 2022 • 8
Let's Verify Step by Step

Paper • 2305.20050 • Published May 31, 2023 • 10
Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published Dec 9, 2024 • 74
Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions

Paper • 2411.14405 • Published Nov 21, 2024 • 58

Papers - Intro, Review, Survey

about 20 hours ago

Language Models: A Guide for the Perplexed

Paper • 2311.17301 • Published Nov 29, 2023
The Prompt Report: A Systematic Survey of Prompting Techniques

Paper • 2406.06608 • Published Jun 6, 2024 • 58
Reinforcement Learning: An Overview

Paper • 2412.05265 • Published Dec 6, 2024 • 4
A Primer on Large Language Models and their Limitations

Paper • 2412.04503 • Published Dec 3, 2024

Papers-Fundamentals

about 18 hours ago

RoFormer: Enhanced Transformer with Rotary Position Embedding

Paper • 2104.09864 • Published Apr 20, 2021 • 11
Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 50
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences

Paper • 2404.03715 • Published Apr 4, 2024 • 61
Zero-Shot Tokenizer Transfer

Paper • 2405.07883 • Published May 13, 2024 • 5

#MustRead Papers

Signature papers in AI/ML with focus on generative AI or large language models that bring unique perspectives and/or are highly cited by peers

about 19 hours ago

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 50
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 12
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Paper • 2201.11903 • Published Jan 28, 2022 • 9
Orca 2: Teaching Small Language Models How to Reason

Paper • 2311.11045 • Published Nov 18, 2023 • 71

Collection of LLMs

about 10 hours ago

microsoft/phi-1_5

Text Generation • Updated Apr 29, 2024 • 101k • 1.32k
mistralai/Mistral-7B-v0.1

Text Generation • Updated Jul 24, 2024 • 2.53M • 3.53k
meta-llama/Llama-2-7b

Text Generation • Updated Apr 17, 2024 • 4.22k
openai/whisper-large-v2

Automatic Speech Recognition • Updated Feb 29, 2024 • 1.34M • 1.68k
