Clem 🤗 PRO

clem

AI & ML interests

multi-modal, time-series, biology and chemistry

Recent Activity

upvoted an article about 22 hours ago
liked a Space about 22 hours ago
akhaliq/anychat
upvoted an article 1 day ago

Organizations

clem's activity

reacted to fracapuano's post with ❤️ 2 days ago
Sharing what we have built over the course of the weekend at the @llamameta hackathon, by Cerebral Valley in London 🇬🇧 👇

@gabrycina @calebgcc and I competed with 200+ participants and 50+ teams in a 24-hour sprint centered around hacking for impact! We focused on applications of robotics for people in need of assisted living, aiming to enable greater autonomy and make robotics more accessible in everyday life.

complete list of assets 👇
🤗 trained robotics policies
v1:
- fracapuano/moss-pills
- fracapuano/moss-cup
v2:
- fracapuano/meta-grasp

🤗 datasets
v1:
- fracapuano/pills
- fracapuano/cup
v2:
- fracapuano/cupim


You can find a live demo of our submission at: https://x.com/_fracapuano/status/1858102728691458554

If you want to know more about how we collected 100GB+ of data, trained multiple RL-policies using @lerobot and used Llama-3.2 models to handle user interactions and switch between tasks, go ahead and have a look! Also, don't be a stranger, and reach out 🦾

Our project is fully open-source, for the community (and ourselves, 👨‍🍳) to build! A huge thank you to @cadene for the help (and the robot 🤭) - truly feeling these hugs-vibes 🤗 , and to @thomwolf and @clem for sharing our work across

Little extra:
➡️ Our 🧠EEG waves🧠-based control of the 🦾robotic arm🦾
reacted to prithivMLmods's post with 🚀🔥❤️🤗 3 days ago
reacted to fracapuano's post with ❤️👀 3 days ago
✍️ The last few weeks have been very intense!
🔴 I have been out all weekends
🔴 Participated in 4 hackathons in a row (2 more to come!)
🔴 Even threw a big hackathon myself!

Nonetheless, I am in school again 🏫, which meant... ✨homework✨

➡️ Head over to https://x.com/_fracapuano/status/1856415612202799243 to read more about how I used @mistralai models to help me with my assignments (not how you think I did hihi 😏)

➡️ Check out https://huggingface.co/spaces/fracapuano/texstral if you want to use the tool yourself!
reacted to Ameeeee's post with ❤️👀 3 days ago
Build a fine-tuning dataset with No Code.

Do you want to build a small dataset for creative writing to fine-tune an Open LLM?
- Find a dataset full of conversations with ChatGPT on the Hugging Face Hub.
- Import it into your Argilla Space.
- Preview the dataset and create a question to label the relevant conversations.
- Label 1000 valid examples of creative writing.
- Use this dataset with Autotrain to fine-tune your model.
reacted to cfahlgren1's post with ❤️ 3 days ago
You can clean and format datasets entirely in the browser with a few lines of SQL.

In this post, I replicate the process @mlabonne used to clean the new microsoft/orca-agentinstruct-1M-v1 dataset.

The cleaning process consists of:
- Joining the separate splits together and adding a split column
- Converting string messages into a list of structs
- Removing empty system prompts

https://huggingface.co/blog/cfahlgren1/the-beginners-guide-to-cleaning-a-dataset

Here's his new cleaned dataset: mlabonne/orca-agentinstruct-1M-v1-cleaned
reacted to erikkaum's post with 🔥 3 days ago
A while ago I started experimenting with compiling the Python interpreter to WASM.

The goal: build a secure, fast, and lightweight sandbox for code execution, ideal for running LLM-generated Python code.

- Send code simply as a POST request
- 1-2ms startup times

Hack away:
https://github.com/ErikKaum/runner
reacted to sayakpaul's post with 🚀❤️ 3 days ago
It's been a while since we shipped native quantization support in diffusers 🧨

We currently support bitsandbytes as the official backend, but using others like torchao is already very simple.

This post is just a reminder of what's possible:

1. Loading a model with a quantization config
2. Saving a model with quantization config
3. Loading a pre-quantized model
4. enable_model_cpu_offload()
5. Training and loading LoRAs into quantized checkpoints

Docs:
https://huggingface.co/docs/diffusers/main/en/quantization/bitsandbytes
reacted to davidberenstein1957's post with 😎🔥👀 3 days ago
For anyone who struggles with NER or information extraction with LLMs.

We showed an efficient workflow for token classification, including zero-shot suggestions and model fine-tuning with Argilla, GLiNER, the NuMind NuExtract LLM and SpanMarker. @argilla

Video: https://youtu.be/JvLpaYgNd84?feature=shared
Notebooks and slides included to try it yourself 🙂
reacted to cfahlgren1's post with 🤗👀🔥 6 days ago
Why use Google Drive when you can have:

• Free storage with generous limits🆓
• Dataset Viewer (Sorting, Filtering, FTS) 🔍
• Third Party Library Support
• SQL Console 🟧
• Security 🔒
• Community, Reach, and Visibility 📈

It's a no-brainer!

Check out our post on what you get instantly out of the box when you create a dataset.
https://huggingface.co/blog/researcher-dataset-sharing
replied to m-ric's post 7 days ago