Zeyu Qin

qqqzzzyyy

AI & ML interests

Trustworthy ML, AI safety

Recent Activity

Organizations

None yet

qqqzzzyyy's activity

Reacted to euclaise's post with ❤️ 3 months ago
Memphis: Advancing language model reasoning without relying on proprietary model outputs

Memphis is a series of models which advance human-data models, offering good performance without relying on proprietary model outputs (e.g. GPT-generated datasets). I've developed a new iterative finetuning procedure to improve the reasoning ability of these models beyond what is possible using only SFT on the same data.

Currently, I've released two models: Memphis-CoT-3B, and Memphis-scribe-3B.

To create these models, I've created new datasets:
- euclaise/reddit-instruct : A dataset of instruction/QA-like data scraped from Reddit. A curated version, filtered using Lilac and neural embedding models, is available at euclaise/reddit-instruct-curated
- euclaise/TinyCoT : TinyCoT is a meta-dataset that aggregates a variety of human-sourced reasoning data. It is a curated version of my previous MegaCoT dataset euclaise/MegaCoT, which contains 629k responses that are cut down to 28k for TinyCoT. There's also an intermediate version, euclaise/MiniCoT, which has 129k responses.

Memphis-CoT is trained on reddit-instruct, a filtered version of oasst2 sablo/oasst2_curated, and TinyCoT. Multiple iterations were performed on TinyCoT, while reddit-instruct and oasst2 were only used for the initial model.

Memphis-scribe further finetunes Memphis-CoT on more creative tasks, using 18 different datasets, including euclaise/WritingPrompts_curated and lemonilia/LimaRP.

To prevent catastrophic forgetting, I used weight averaging between iterations.
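
The post does not say which averaging scheme was used, so the following is only a minimal sketch of the general idea, assuming a plain linear interpolation between the checkpoint from the previous iteration and the newly finetuned one. The function name `average_weights`, the `alpha` parameter, and the dict-of-lists checkpoint format are all hypothetical and chosen for illustration; a real run would operate on framework tensors instead.

```python
from typing import Dict, List

# Hypothetical checkpoint format for illustration: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def average_weights(prev: Weights, new: Weights, alpha: float = 0.5) -> Weights:
    """Linearly interpolate two checkpoints, parameter by parameter.

    alpha = 0.0 returns the previous checkpoint, alpha = 1.0 returns the new one.
    The 0.5 default is an assumption, not a value taken from the post.
    """
    if prev.keys() != new.keys():
        raise ValueError("checkpoints must share the same parameter names")

    merged: Weights = {}
    for name in prev:
        if len(prev[name]) != len(new[name]):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Blend each weight so the merged model stays close to the earlier iteration.
        merged[name] = [
            (1.0 - alpha) * p + alpha * n
            for p, n in zip(prev[name], new[name])
        ]
    return merged


# Usage: blend the checkpoint from the previous iteration with the new one.
previous_iteration = {"layer.weight": [0.0, 2.0]}
current_iteration = {"layer.weight": [2.0, 4.0]}
merged = average_weights(previous_iteration, current_iteration)
```

Keeping `alpha` below 1.0 pulls the merged model back toward the earlier checkpoint, which is the intuition behind using averaging to limit forgetting.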

- euclaise/Memphis-CoT-3B
- euclaise/Memphis-scribe-3B
upvoted an article 4 months ago

SmolLM - blazingly fast and remarkably powerful
