fangtongen's Collections
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only
Paper • 2306.01116 • Published • 31
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Paper • 2205.14135 • Published • 11
RoFormer: Enhanced Transformer with Rotary Position Embedding
Paper • 2104.09864 • Published • 10
Language Models are Few-Shot Learners
Paper • 2005.14165 • Published • 11
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Paper • 2101.00027 • Published • 6
Fast Transformer Decoding: One Write-Head is All You Need
Paper • 1911.02150 • Published • 6
Llama 2: Open Foundation and Fine-Tuned Chat Models
Paper • 2307.09288 • Published • 242
LLaMA: Open and Efficient Foundation Language Models
Paper • 2302.13971 • Published • 13
Orca: Progressive Learning from Complex Explanation Traces of GPT-4
Paper • 2306.02707 • Published • 46
BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation
Paper • 2402.03216 • Published • 4
Mistral 7B
Paper • 2310.06825 • Published • 47