RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
Paper • 2409.10516 • Published • 41
Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse
Paper • 2409.11242 • Published • 6
Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models
Paper • 2409.11136 • Published • 22
On the Diagram of Thought
Paper • 2409.10038 • Published • 13
Tongyao PRO
tyzhu
AI & ML interests
Natural Language Processing
Recent Activity
updated a model about 1 hour ago: tyzhu/tiny_LLaMA_120M_2k_cc_repeat_2k_iter-060000-ckpt-step-15000_hf
published a model about 1 hour ago: tyzhu/tiny_LLaMA_120M_2k_cc_repeat_2k_iter-060000-ckpt-step-15000_hf
updated a model about 4 hours ago: tyzhu/tiny_LLaMA_120M_8k_cc_8k_iter-400000-ckpt-step-100000_hf
Organizations
None yet
Collections 1
models 223
tyzhu/tiny_LLaMA_120M_2k_cc_repeat_2k_iter-060000-ckpt-step-15000_hf • Updated
tyzhu/tiny_LLaMA_120M_8k_cc_8k_iter-400000-ckpt-step-100000_hf • Updated • 27
tyzhu/tiny_LLaMA_1b_8k_intramask_cc_8k_iter-480000-ckpt-step-60000_hf • Text Generation • Updated • 202
tyzhu/tiny_LLaMA_1b_8k_intramask_cc_8k_iter-320000-ckpt-step-40000_hf • Text Generation • Updated • 284
tyzhu/tiny_LLaMA_1b_8k_cc_8k_iter-400000-ckpt-step-50000_hf • Text Generation • Updated • 255
tyzhu/tiny_LLaMA_1b_2k_cc_2k_iter-400000-ckpt-step-50000_hf • Updated • 11
tyzhu/llama3.2_3b_8k_intramask_cc_8k_iter-400000-ckpt-step-100000_hf • Updated • 1.09k
tyzhu/tiny_LLaMA_1b_32k_cc_32k_iter-100000-ckpt-step-100000_hf • Updated • 226
tyzhu/tiny_LLaMA_1b_32k_dm2_cc_32k_iter-100000-ckpt-step-100000_hf • Updated • 199
tyzhu/temp_models • Updated
datasets 815
tyzhu/tpo • Viewer • Updated • 269 • 44
tyzhu/quality • Viewer • Updated • 173 • 65
tyzhu/the-stack-py • Viewer • Updated • 16.3M • 64 • 1
tyzhu/pystack_clean • Viewer • Updated • 9.44M • 46
tyzhu/id_cc_pool • Viewer • Updated • 72.5M • 189
tyzhu/proweb • Viewer • Updated • 46.3M • 179
tyzhu/anchorcontext_5M_v3_models • Updated • 1
tyzhu/cmmlu_filtered • Updated • 42
tyzhu/lmind_nq_train6000_eval6489_v1_docidx_v3 • Viewer • Updated • 76.7k • 46
tyzhu/flan_max_300_added • Viewer • Updated • 1.46M • 40