Join the conversation

Join the community of Machine Learners and AI enthusiasts.

Sign Up

All HF Hub posts

cfahlgren1ย 
posted an update 3 days ago
view post
Post
1721
You can clean and format datasets entirely in the browser with a few lines of SQL.

In this post, I replicate the process @mlabonne used to clean the new microsoft/orca-agentinstruct-1M-v1 dataset.

The cleaning process consists of:
- Joining the separate splits together / add split column
- Converting string messages into list of structs
- Removing empty system prompts

https://huggingface.co/blog/cfahlgren1/the-beginners-guide-to-cleaning-a-dataset

Here's his new cleaned dataset: mlabonne/orca-agentinstruct-1M-v1-cleaned
  • 1 reply
ยท
ArthurZย 
posted an update 3 days ago
view post
Post
1727
Native tensor parallel has landed in transformers!!! https://github.com/huggingface/transformers/pull/34184 thanks a lot to the torch team for their support!

Contributions are welcome to support more models! ๐Ÿ”ฅ
jsulzย 
posted an update about 12 hours ago
view post
Post
421
When the XetHub crew joined Hugging Face this fall, @erinys and I started brainstorming how to share our work to replace Git LFS on the Hub. Uploading and downloading large models and datasets takes precious time. Thatโ€™s where our chunk-based approach comes in.

Instead of versioning files (like Git and Git LFS), we version variable-sized chunks of data. For the Hugging Face community, this means:

โฉ Only upload the chunks that changed.
๐Ÿš€ Download just the updates, not the whole file.
๐Ÿง  We store your file as deduplicated chunks

In our benchmarks, we found that using CDC to store iterative model and dataset version led to transfer speedups of ~2x, but this isnโ€™t just a performance boost. Itโ€™s a rethinking of how we manage models and datasets on the Hub.

We're planning on our new storage backend to the Hub in early 2025 - check out our blog to dive deeper, and let us know: how could this improve your workflows?

https://huggingface.co/blog/from-files-to-chunks
prithivMLmodsย 
posted an update about 17 hours ago
view post
Post
712
๐Ÿ… Glif App's Remixes feature allows you to slap a logo onto anything, seamlessly integrating the input image (logo) into various contexts. The result is stunning remixes that blend the input logo with generated images (img2img logo mapping) for incredible outcomes.

Check out Any Logo Anywhere remixes on Glif: [Glif Remixes](https://glif.app/glifs/cm3o7dfsd002610z48sz89yih/remixes)

๐ŸŒThe browser extension enables thousands of Glif-based img2img workflows on any image you find online. Experience Glif Remix with WebAI: [Chrome Extension](https://chromewebstore.google.com/detail/glif-remix-the-web-with-a/abfbooehhdjcgmbmcpkcebcmpfnlingo)

.
.
.
๐Ÿค—Have fun with the cool stuff !!
@prithivMLmods
m-ricย 
posted an update 2 days ago
view post
Post
1144
Great feature alert: ๐—ฌ๐—ผ๐˜‚ ๐—ฐ๐—ฎ๐—ป ๐—ป๐—ผ๐˜„ ๐˜‚๐˜€๐—ฒ ๐—ฎ๐—ป๐˜† ๐—ฆ๐—ฝ๐—ฎ๐—ฐ๐—ฒ ๐—ฎ๐˜€ ๐—ฎ ๐˜๐—ผ๐—ผ๐—น ๐—ณ๐—ผ๐—ฟ ๐˜†๐—ผ๐˜‚๐—ฟ ๐˜๐—ฟ๐—ฎ๐—ป๐˜€๐—ณ๐—ผ๐—ฟ๐—บ๐—ฒ๐—ฟ๐˜€.๐—ฎ๐—ด๐—ฒ๐—ป๐˜! ๐Ÿ› ๏ธ๐Ÿ”ฅ๐Ÿ”ฅ

This lets you take the coolest spaces, like FLUX.1-dev, and use them in agentic workflows with a few lines of code! ๐Ÿง‘โ€๐Ÿ’ป

On the video below, I set up my fake vacation pictures where I'm awesome at surfing (I'm really not) ๐Ÿ„

Head to the doc to learn this magic ๐Ÿ‘‰ https://huggingface.co/docs/transformers/main/en/agents_advanced#import-a-space-as-a-tool-
prithivMLmodsย 
posted an update 2 days ago
view post
Post
2776
The (768 x 1024) mix of MidJourney and Flux's LoRA is nearly identical to the actual visual design. It hasnโ€™t undergone much concept art development for now. In the meantime, try out the impressive visual designs on:

๐ŸฅšMidjourney Flux Mix : prithivMLmods/Midjourney-Flux

๐ŸฅšFlux LoRA Collection: prithivMLmods/flux-lora-collections-66dd5908be2206cfaa8519be
.
.
.
@prithivMLmods ๐Ÿค—
thomwolfย 
posted an update 2 days ago
openfreeย 
posted an update about 5 hours ago
view post
Post
198
MOUSE-I: Transform a Prompt into a Live Web Service
"From Prompt to Global Service in 60 Seconds"
The Future of Web Development
MOUSE-I revolutionizes web development by converting a single prompt into a fully functional, globally deployed web service through AI automation and enterprise-grade infrastructure.
โšก Lightning-Fast Pipeline (60 Seconds)
1. AI Prompt Enhancement (5s)

Instant requirement analysis
Tech stack optimization
Development spec generation

2. Code Creation (49s)

Production-ready code
Responsive design
Performance-optimized

3. Live Rendering (1s)

Instant visualization
Real-time testing

4. Global Deployment (5s)

Vercel infrastructure
Global CDN
Automatic HTTPS

๐ŸŽฏ Key Differentiators

Instant Results: From idea to live URL in 60 seconds
Enterprise Quality: Production-grade code and infrastructure
Zero Configuration: No setup or technical knowledge required
40+ Templates: Ready-to-use solutions for games, dashboards, and apps

๐Ÿ’ซ Perfect For

Startups needing quick MVPs
Developers prototyping ideas
Non-technical founders building web services
Educators creating interactive tools

๐Ÿš€ Get Started

Visit MOUSE-I Gallery
Enter your prompt
Get your live service in 60 seconds

๐Ÿ’ก Connect

๐ŸŒ MOUSE-I Gallery
https://huggingface.co/spaces/VIDraft/mouse1


๐Ÿ’ฌ discord.gg/openfreeai
๐Ÿ“ง arxivgpt@gmail.com
  • 1 reply
ยท
monsoon-nlpย 
posted an update 1 day ago
view post
Post
853
Great to see Tatta Bio release an embeddings version of their DNA/protein language model ๐Ÿงฌ: tattabio/gLM2_650M_embed
LukeNeumannย 
posted an update 1 day ago
view post
Post
897
Nine years ago, I uploaded the first 8K resolution video to YouTube and I've been stockpiling 8K footage ever since: https://www.youtube.com/watch?v=sLprVF6d7Ug&t

Should @Overlaiapp release the first open-source 8K video dataset?

Could anyone even fine tune a model with this?๐Ÿ˜…
ยท