Zeyu Qin

qqqzzzyyy

https://alan-qin.github.io/

Alan-Qin

AI & ML interests

Trustworthy ML, AI safety

Organizations

None yet

qqqzzzyyy's activity

liked a model 3 days ago

sentence-transformers/all-mpnet-base-v2

Sentence Similarity • Updated 23 days ago • 361M • 908

liked a model 4 days ago

OFA-Sys/InsTagger

Text Generation • Updated Aug 16, 2023 • 1.91k • 19

upvoted a paper 16 days ago

AgentInstruct: Toward Generative Teaching with Agentic Flows

Paper • 2407.03502 • Published Jul 3 • 48

liked a dataset 17 days ago

PawanKrd/math-gpt-4o-200k

Viewer • Updated Jul 30 • 200k • 66 • 51

upvoted a paper 23 days ago

Adapting Large Language Models via Reading Comprehension

Paper • 2309.09530 • Published Sep 18, 2023 • 77

upvoted a collection 2 months ago

Qwen2.5-Math

Math-specific model series based on Qwen2.5 • 9 items • Updated about 15 hours ago • 52

liked a dataset 3 months ago

Magpie-Align/Magpie-Llama-3.1-Pro-MT-300K-Filtered

Viewer • Updated Aug 28 • 300k • 380 • 11

liked a model 3 months ago

rubra-ai/Mistral-7B-Instruct-v0.2

Text Generation • Updated Jul 4 • 13 • 2

liked a dataset 3 months ago

ScaleAI/mhj

Viewer • Updated Sep 19 • 1 • 199 • 19

Reacted to euclaise's post with ❤️ 3 months ago

Post

Memphis: Advancing language model reasoning without relying on proprietary model outputs

Memphis is a series of models which advance human-data models, offering good performance without relying on proprietary model outputs (e.g. GPT-generated datasets). I've developed a new iterative finetuning procedure to improve the reasoning ability of these models beyond what is possible using only SFT on the same data.

Currently, I've released two models: Memphis-CoT-3B, and Memphis-scribe-3B.

To create these models, I've created new datasets:
- euclaise/reddit-instruct: A dataset of instruction/QA-like data scraped from Reddit. A curated version, filtered using Lilac and neural embedding models, is available at euclaise/reddit-instruct-curated.
- euclaise/TinyCoT: TinyCoT is a meta-dataset that aggregates a variety of human-sourced reasoning data. It is a curated version of my previous MegaCoT dataset euclaise/MegaCoT, whose 629k responses are cut down to 28k for TinyCoT. There's also an intermediate version, euclaise/MiniCoT, which has 129k responses.

Memphis-CoT is trained on reddit-instruct, a filtered version of oasst2 sablo/oasst2_curated, and TinyCoT. Multiple iterations were performed on TinyCoT, while reddit-instruct and oasst2 were only used for the initial model.

Memphis-scribe further finetunes Memphis-CoT on more creative tasks. It was finetuned from Memphis-CoT on 18 different datasets, including euclaise/WritingPrompts_curated and lemonilia/LimaRP.

To prevent catastrophic forgetting, I used weight averaging between iterations.
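A minimal sketch of what weight averaging between two finetuning iterations can look like. The post does not say which averaging scheme was used, so the linear interpolation, the `alpha` parameter, and the `average_weights` helper below are assumptions for illustration only. Checkpoints are modeled as plain dictionaries of float lists so the example runs without any ML framework.

```python
from typing import Dict, List

# Hypothetical stand-in for a model checkpoint: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def average_weights(previous: Weights, current: Weights, alpha: float = 0.5) -> Weights:
    """Linearly interpolate two checkpoints that share the same parameter names.

    alpha is the weight given to the newer checkpoint (assumed value, not from the post):
    alpha=0.0 returns the previous iteration, alpha=1.0 returns the current one.
    """
    if previous.keys() != current.keys():
        raise ValueError("checkpoints must share the same parameter names")

    merged: Weights = {}
    for name, prev_values in previous.items():
        cur_values = current[name]
        if len(prev_values) != len(cur_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Blend each parameter so the new model stays close to the old one.
        merged[name] = [
            (1.0 - alpha) * p + alpha * c for p, c in zip(prev_values, cur_values)
        ]
    return merged


# Toy checkpoints from two consecutive iterations.
previous_iteration = {"layer.weight": [0.0, 2.0], "layer.bias": [1.0]}
current_iteration = {"layer.weight": [1.0, 4.0], "layer.bias": [3.0]}

merged = average_weights(previous_iteration, current_iteration)
print(merged)
```

Pulling the newly finetuned weights back toward the previous iteration limits how far each round can drift, which is the intuition behind using averaging against forgetting. With a real framework the same interpolation would be applied tensor by tensor over the model's parameters.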

- euclaise/Memphis-CoT-3B
- euclaise/Memphis-scribe-3B

upvoted a paper 3 months ago

Iterative Reasoning Preference Optimization

Paper • 2404.19733 • Published Apr 30 • 47

upvoted an article 3 months ago

Article

Let's talk about LLM evaluation

May 23 • 134

liked a model 4 months ago

HuggingFaceTB/SmolLM-360M-Instruct

Text Generation • Updated Aug 18 • 14.6k • 76

upvoted an article 4 months ago

Article

SmolLM - blazingly fast and remarkably powerful

Jul 16 • 273

liked a model 4 months ago

apple/DCLM-7B

Updated Jul 26 • 2.99k • 824

liked a model 5 months ago

ScalableMath/llemma-7b-sft-prm800k-level-1to3-hf

Text Generation • Updated Mar 1 • 547 • 2

upvoted 2 collections 5 months ago

Qwen1.5

Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated about 15 hours ago • 206

Qwen2

Qwen2 language models, including pretrained and instruction-tuned models of 5 sizes, including 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated about 15 hours ago • 346

liked 2 datasets 5 months ago

deepmind/aqua_rat

Viewer • Updated Jan 9 • 196k • 1.2k • 44

5CD-AI/Vietnamese-nvidia-OpenMathInstruct-1-50k-gg-translated

Viewer • Updated Mar 13 • 50k • 52 • 6