Derek Thomas

derek-thomas · https://datavistics.github.io

Recent Activity

updated a dataset about 3 hours ago

reddit-tools-HF/dataset-creator-reddit-bestofredditorupdates

upvoted an article 2 days ago

🥃 Distilling Tiny Embeddings

published a Space 13 days ago

AI71ai/orchestrator-evals

upvoted an article 2 days ago

🥃 Distilling Tiny Embeddings

Article • Published Jan 10 • 20

upvoted a collection 3 months ago

AgriLLM

A collection of the artifacts for the AgriLLM initiative. • 5 items • Updated Dec 15, 2025 • 5

upvoted an article 3 months ago

We Got Claude to Fine-Tune an Open Source LLM

Article • Published Dec 4, 2025 • 605

upvoted a collection 10 months ago

SmolDocling

3 items • Updated Apr 7, 2025 • 9

upvoted 2 articles about 1 year ago

1 Billion Classifications

Article • Published Feb 13, 2025 • 45

Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference

Article • Published Jan 16, 2025 • 76

upvoted a collection about 1 year ago

Dataset Exploration

4 items • Updated Nov 30, 2024 • 6

upvoted a paper about 1 year ago

AgentInstruct: Toward Generative Teaching with Agentic Flows

Paper • 2407.03502 • Published Jul 3, 2024 • 51

upvoted a collection over 1 year ago

Prompt Order Experiment

Prompt Order Experiment shows how to run a simple experiment on the hub and leverage tools like AutoTrain, and Inference Endpoints. • 16 items • Updated Jan 14, 2025 • 2

upvoted an article over 1 year ago

Low Code Large Language Model Alignment

Article • Published Nov 19, 2024 • 13

upvoted a paper over 1 year ago

WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines

Paper • 2410.12705 • Published Oct 16, 2024 • 32

upvoted an article over 1 year ago

Deploying Speech-to-Speech on Hugging Face

Article • Published Oct 22, 2024 • 45

upvoted a paper over 1 year ago

AutoTrain: No-code training for state-of-the-art models

Paper • 2410.15735 • Published Oct 21, 2024 • 59

upvoted an article over 1 year ago

AI Watermarking 101: Tools and Techniques

Article • Published Feb 26, 2024 • 27

upvoted 6 papers over 1 year ago

ComfyGen: Prompt-Adaptive Workflows for Text-to-Image Generation

Paper • 2410.01731 • Published Oct 2, 2024 • 16

BordIRlines: A Dataset for Evaluating Cross-lingual Retrieval-Augmented Generation

Paper • 2410.01171 • Published Oct 2, 2024 • 5

To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning

Paper • 2409.12183 • Published Sep 18, 2024 • 39

Challenges and Responses in the Practice of Large Language Models

Paper • 2408.09416 • Published Aug 18, 2024 • 1

Characterizing Prompt Compression Methods for Long Context Inference

Paper • 2407.08892 • Published Jul 11, 2024 • 11

GroUSE: A Benchmark to Evaluate Evaluators in Grounded Question Answering

Paper • 2409.06595 • Published Sep 10, 2024 • 38