Perhaps the largest single training dataset of high-quality text to date: 7.8 trillion tokens spanning 35 European languages plus code.
The best part: the data was properly licensed, so it's actually future-proof!
The completions model is really creative, and the instruct fine-tuned version is very good as well.
Now you can use such models for multilingual enterprise applications with further fine-tunes; long response generation and structured outputs (e.g. for coding) also work.
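As a minimal sketch of what consuming a structured (JSON) output might look like in an enterprise pipeline: the model call below is mocked with a fixed string (no specific model or API is implied by the post), and the function names are hypothetical, but the parse-and-validate step is the part that matters when wiring a model into downstream systems.

```python
import json

def parse_structured_output(raw: str) -> dict:
    """Parse and minimally validate a JSON completion from an instruct model."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data

# Mocked completion standing in for a real model response to a
# structured-output prompt (hypothetical payload for illustration).
mock_completion = '{"language": "Python", "summary": "Sorts a list in place."}'
result = parse_structured_output(mock_completion)
print(result["language"])  # Python
```

Validating the JSON before handing it to downstream code is the cheap insurance that makes structured outputs usable in production.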