All HF Hub posts

prithivMLmods 
posted an update 2 days ago
FastVLMs by Apple are the talk of the week for edge device VLMs and also for consumer-grade VLMs on the Hub. They have some impressive demos available on the Hub for live captioning and inference tasks. Meanwhile, I’m still exploring one of the coolest edge-device multimodal releases—Liquid AI’s LFM2-VL (450M and 1.6B). I’ve also made a live camera video inference demo, which is capable of running on Colab’s free-tier T4 GPU.

🤗Live Captioning Notebooks:
➠ LiquidAI LFM2 VL 1.6B Live Cam: https://github.com/PRITHIVSAKTHIUR/Multimodal-Outpost-Notebooks/blob/main/LiquidAI-LFM2-VL-Live-Cam/LiquidAI_LFM2_VL_1_6B_Live_Cam.ipynb

➠ LiquidAI LFM2 VL 450M Live Cam: https://github.com/PRITHIVSAKTHIUR/Multimodal-Outpost-Notebooks/blob/main/LiquidAI-LFM2-VL-Live-Cam/LiquidAI_LFM2_VL_450M_Live_Cam.ipynb

✨I also made a demo for the FastVLM Live Captioning Notebook.
➠ FastVLM 0.5B Live Cam: https://github.com/PRITHIVSAKTHIUR/Multimodal-Outpost-Notebooks/blob/main/Apple-FastVLM-0.5B-Live-Cam/apple_FastVLM_0_5B_live_cam.ipynb

↗️For more notebooks, kindly visit the following repository.
➠ Multimodal Outpost Notebooks: https://github.com/PRITHIVSAKTHIUR/Multimodal-Outpost-Notebooks

Feel free to fork, modify, and explore!
MonsterMMORPG 
posted an update 3 days ago
I have concluded the first 8 trainings of Qwen Image LoRA. We are not at the level of FLUX yet, and the next 8 trainings will hopefully start soon. 2656x2656px image generated with the 8-step Fast Qwen LoRA + my own trained LoRA:

Grid test results shared here along with App installer : https://www.patreon.com/posts/137551634
cgeorgiaw 
posted an update about 24 hours ago
🚀🚀🚀 The largest ever dataset of co-folded 3D protein-ligand structures just dropped on HF!!

Meet SAIR (Structurally Augmented IC₅₀ Repository): 5M+ AI-generated complexes with experimentally measured drug potency data from SandboxAQ. 🚀🚀🚀

Check it out and explore here: SandboxAQ/SAIR

merve 
posted an update 3 days ago
large AI labs have dropped so many open models last week 🔥 don't miss out on them

→ Apple released on-device vision LMs apple/fastvlm-68ac97b9cd5cacefdd04872e & apple/mobileclip2-68ac947dcb035c54bcd20c47
→ OpenGVLab released InternVL3.5, 32 new vision LMs with one based on gpt-oss! (OS) OpenGVLab/internvl35-68ac87bd52ebe953485927fb
→ MSFT released a killer small TTS model (OS) microsoft/VibeVoice-1.5B

find more here: https://huggingface.co/collections/merve/august-29-releases-68b5a3754cfb8abf59e2b486
eliebak 
posted an update 2 days ago
Super excited to announce that our research team at Hugging Face will be doing an AMA on reddit r/LocalLLaMA.

Come ask any questions to the team behind SmolLM, FineWeb and more! And who knows, maybe there’ll be a shiny new release to talk about?

Thursday 4th September, 8AM-11AM PST 🤗

louisbrulenaudet 
posted an update 3 days ago
Supercharge Apple’s Shortcuts using Cloudflare Workers and Gemini within minutes (and for free, up to 1,500 requests per day) ☁️✨

Hello everyone. Last week, while experimenting for fun, I created an API that lets you easily access AI models (in this case, Google's) from the Shortcuts app, so that I can analyze data from my apps and make the most of it with the generative capabilities of advanced models.

It costs me nothing, and I think it might be good to share it so that others can build on it.

In README.md, you will find everything you need to get started and put your own microservice into production, which you can call from the app’s HTTP request features.

You will simply be asked to have a free Cloudflare account and an API key obtained from Google's AI Studio.

Feel free to take a look and get back to me if you encounter any problems during deployment.

Here is the GitHub repo where you can find all the source code and run it on your own: https://github.com/louisbrulenaudet/genai-api
DawnC 
posted an update 3 days ago
PawMatchAI — Now with SBERT-Powered Recommendations! 🐶✨

⭐️ NEW: Description-based recommendations are here!
Just type in your lifestyle or preferences (e.g. “I live in an apartment and want a quiet dog”), and PawMatchAI uses SBERT semantic embeddings to understand your needs and suggest compatible breeds.
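
For readers curious how description-based matching of this kind can work, here is a minimal sketch of ranking candidates by cosine similarity between a query embedding and profile embeddings. It is not PawMatchAI's actual code: the `embed` function is a toy bag-of-words stand-in for a real SBERT encoder, and the breed names and profile texts are hypothetical placeholders written only for this example.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an SBERT encoder: a bag-of-words count vector.

    Assumption: a real system would return a dense sentence embedding here.
    """
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_breeds(query: str, profiles: dict[str, str]) -> list[tuple[str, float]]:
    """Score every profile against the query and sort best match first."""
    query_vec = embed(query)
    scored = [(name, cosine(query_vec, embed(text))) for name, text in profiles.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical profiles, not taken from PawMatchAI.
PROFILES = {
    "Breed A": "quiet calm dog suited to apartment living",
    "Breed B": "energetic working dog that needs a large yard",
}

ranking = rank_breeds("I live in an apartment and want a quiet dog", PROFILES)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Swapping `embed` for a real sentence encoder would leave the ranking logic unchanged, and it would let the matcher pick up on meaning rather than exact word overlap.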

What can PawMatchAI do today?
📸 Upload a photo to identify your dog from 124 breeds with detailed info.
⚖️ Compare two breeds side-by-side, from grooming needs to health insights.
📊 Visualize breed traits with radar and comparison charts.
🎨 Try Style Transfer to turn your dog’s photo into anime, watercolor, cyberpunk, and more.

What’s next?
🎯 More fine-tuned recommendations.
📱 Mobile-friendly deployment.
🐾 Expansion to additional species.

My goal:
To make breed discovery not only accurate but also interactive and fun — combining computer vision, semantic understanding, and creativity to help people find their perfect companion.

👉 Try it here:
DawnC/PawMatchAI

If you enjoy PawMatchAI, please give the project a ❤️ — it really helps and keeps me motivated to keep improving!

#ComputerVision #SBERT #DeepLearning #MachineLearning #TechForLife
DualityAI-RebekahBogdanoff 
posted an update about 14 hours ago
hannayukhymenko 
posted an update 2 days ago
Releasing the Jupyter Agent Dataset! 🚀

Built from 7 TB of real Kaggle datasets + 20k notebooks, creating real code exec traces using Qwen3-Coder and E2B.
Training on this data dramatically improves the ability to execute code and analyze data.

We ( @baptistecolle @hannayukhymenko @lvwerra ) have created a novel synthetic data generation pipeline with efficient scaffolding, which gives a big performance boost after training your coding agent 🔥 With the help of real Kaggle notebooks and datasets, we generate synthetic notebooks that aim to analyze datasets and answer factual questions about them more efficiently. We simulate a real code execution environment either by prompting LLMs or by using E2B sandboxes. We have built a dataset of 50k+ high-quality LLM-generated notebooks that can help your agent become better at data analysis and question answering.

Link: data-agents/jupyter-agent-dataset
Bils 
posted an update 4 days ago
Introducing ShortiFoley 🎵 — an AI tool that transforms short videos into realistic Foley audio.
Built on Tencent’s HunyuanVideo-Foley with SigLIP2 + CLAP, and designed for media automation pipelines like n8n.
✅ Generate Foley from video
✅ Autosave results with metadata
✅ MCP endpoints for workflows
Bils/ShortiFoley