Maziyar Panahi PRO

MaziyarPanahi

AI & ML interests

Fine-Tuning, RLHF, Merging, Quantizations, Leaderboards

Recent Activity

liked a model about 12 hours ago

MaziyarPanahi/Kokoro-82M

upvoted a paper 3 days ago

Search-o1: Agentic Search-Enhanced Large Reasoning Models

new activity 3 days ago

MaziyarPanahi/calme-2.3-llama3.1-70b:Feedback after several months of use.

MaziyarPanahi's activity

upvoted a paper 3 days ago

Search-o1: Agentic Search-Enhanced Large Reasoning Models

Paper • 2501.05366 • Published 10 days ago • 75

upvoted an article 4 days ago

Train 400x faster Static Embedding Models with Sentence Transformers

Article • 4 days ago • 103

upvoted a collection 4 days ago

InternLM3

6 items • Updated 2 days ago • 20

upvoted an article 6 days ago

Mastering Tensor Dimensions in Transformers

Article • 7 days ago • 33

upvoted a collection 11 days ago

Phi-4

Phi-4 small language model. • 2 items • Updated 11 days ago • 42

upvoted a collection 18 days ago

GIANTS

Frankenstein and giant models merged! • 11 items • Updated 18 days ago • 4

upvoted a collection 27 days ago

InternVL2.5-MPO

Enhancing the Reasoning Ability of MLLMs via Mixed Preference Optimization • 16 items • Updated 9 days ago • 25

upvoted an article 28 days ago

Introduction to Quantization cooked in 🤗 with 💗🧑‍🍳

Article • Aug 25, 2023 • 25

upvoted a paper about 1 month ago

NuNER: Entity Recognition Encoder Pre-training via LLM-Annotated Data

Paper • 2402.15343 • Published Feb 23, 2024 • 13

upvoted 4 collections about 1 month ago

ModernBERT

Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated about 1 month ago • 124

Qwen2.5-Math

Math-specific model series based on Qwen2.5 • 11 items • Updated 5 days ago • 63

OLMo 2

Artifacts for the second set of OLMo models. • 22 items • Updated 13 days ago • 74

Common Models

The first generation of models pretrained on Common Corpus. • 5 items • Updated Dec 5, 2024 • 28

upvoted an article about 1 month ago

They Said It Couldn’t Be Done

Article • Dec 5, 2024 • 77

upvoted a paper about 2 months ago

OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs

Paper • 2411.14199 • Published Nov 21, 2024 • 30

upvoted 2 collections about 2 months ago

INTELLECT-1 Dataset

INTELLECT-1 Training dataset • 5 items • Updated Oct 8, 2024 • 21

Tulu 3 Datasets

All datasets released with Tulu 3 -- state of the art open post-training recipes. • 32 items • Updated 13 days ago • 64

upvoted a paper about 2 months ago

OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset

Paper • 2402.10176 • Published Feb 15, 2024 • 37

upvoted 2 collections about 2 months ago

Awesome SFT datasets

A curated list of interesting datasets to fine-tune language models with. • 43 items • Updated Apr 12, 2024 • 127

OpenScholar_V1

The set of models, index, data associated with the paper "OpenScholar: Synthesizing Scientific Literature with Retrieval-Augmented LMs". • 8 items • Updated Nov 22, 2024 • 31