dan su

sudanenator

AI & ML interests

None yet

Recent Activity

reacted to MonsterMMORPG's post with 🔥 10 days ago
Wan 2.1 AI Video Model: Ultimate Step-by-Step Tutorial for Windows & Affordable Private Cloud Setup : https://youtu.be/hnAhveNy-8s https://youtu.be/hnAhveNy-8s Please check all screenshots to see latest news and updates after the tutorial video Alibaba’s new Wan 2.1 text-to-video, video-to-video and image-to-video Open Source AI is unbelievable. In this tutorial I will show how you can install Wan 2.1 all publicly published models into your Windows PC with 1-click installation and use them with the easiest possible way. With the Gradio APP I have developed, you will be able to use Wan AI with as low as 3.5GB VRAM having GPUs. Furthermore, for those who want to utilize powerful private cloud GPUs with cheapest possible prices, I will show how to 1-click and install Wan 2.1 on Massed Compute and on RunPod. Additionally, I will compare performance of RTX 3090 TI with RTX 5090 on all Wan 2.1 models. You will be shocked to see performance of RTX 5090. Also the APP I developed supports all RTX 5000 series on Windows with Python VENV natively. You don't need Linux or WSL. 🔗 Full Instructions, Configs, Installers, Information and Links Shared Post (the one used in the tutorial) ⤵️ ▶️ https://www.patreon.com/posts/click-to-open-post-used-in-tutorial-123105403 🔗 SECourses Official Discord 9500+ Members ⤵️ ▶️ https://discord.com/servers/software-engineering-courses-secourses-772774097734074388 🔗 Stable Diffusion, FLUX, Generative AI Tutorials and Resources GitHub ⤵️ ▶️ https://github.com/FurkanGozukara/Stable-Diffusion 🔗 SECourses Official Reddit - Stay Subscribed To Learn All The News and More ⤵️ ▶️ https://www.reddit.com/r/SECourses/ 🔗 MSI RTX 5090 TRIO FurMark Benchmarking + Overclocking + Noise Testing and Comparing with RTX 3090 TI ⤵️ ▶️ https://youtu.be/uV3oqdILOmA 🔗 RTX 5090 Tested Against FLUX DEV, SD 3.5 Large, SD 3.5 Medium, SDXL, SD 1.5, AMD 9950X + RTX 3090 TI ⤵️ ▶️ https://youtu.be/jHlGzaDLkto
reacted to Kseniase's post with ➕ 17 days ago
8 Free Sources about AI Agents: Agents seem to be everywhere and this collection is for a deep dive into the theory and practice: 1. "Agents" Google's whitepaper by Julia Wiesinger, Patrick Marlow and Vladimir Vuskovic -> https://www.kaggle.com/whitepaper-agents Covers agents, their functions, tool use and how they differ from models 2. "Agents in the Long Game of AI. Computational Cognitive Modeling for Trustworthy, Hybrid AI" book by Marjorie McShane, Sergei Nirenburg, and Jesse English -> https://direct.mit.edu/books/oa-monograph/5833/Agents-in-the-Long-Game-of-AIComputational Explores building AI agents, using Hybrid AI, that combines ML with knowledge-based reasoning 3. "AI Engineer Summit 2025: Agent Engineering" 8-hour video -> https://www.youtube.com/watch?v=D7BzTxVVMuw Experts' talks that share insights on the freshest Agent Engineering advancements, such as Google Deep Research, scaling tips and more 4. AI Agents Course from Hugging Face -> https://huggingface.co/learn/agents-course/en/unit0/introduction Agents' theory and practice to learn how to build them using top libraries and tools 5. "Artificial Intelligence: Foundations of Computational Agents", 3rd Edition, book by David L. Poole and Alan K. Mackworth -> https://artint.info/3e/html/ArtInt3e.html Agents' architectures, how they learn, reason, plan and act with certainty and uncertainty 6. "Intelligent Agents: Theory and Practice" book by Michael Wooldridge -> https://www.cs.ox.ac.uk/people/michael.wooldridge/pubs/ker95/ker95-html.html A fascinating option to dive into how agents were seen in 1995 and explore their theory, architectures and agent languages 7. The Turing Post articles "AI Agents and Agentic Workflows" on Hugging Face -> https://huggingface.co/Kseniase We explore agentic workflows in detail and agents' building blocks, such as memory and knowledge 8. Our collection "8 Free Sources to Master Building AI Agents" -> https://www.turingpost.com/p/building-ai-agents-sources
View all activity

Organizations

Project Fluently's profile picture

sudanenator's activity

reacted to AdinaY's post with 😎 10 days ago
view post
Post
3986
Exciting releases from the Chinese community this February🔥
👉 zh-ai-community/2025-february-67a35aaa68e97812def5b6ef

MLLM:
✨ Ovis2 by Alibaba
AIDC-AI/ovis2-67ab36c7e497429034874464
✨ Step Audio Chat by StepFun AI
stepfun-ai/step-audio-67b33accf45735bb21131b0b

Audio:
✨ Step Audio TTS by StepFunAI
stepfun-ai/Step-Audio-TTS-3B
✨ InspireMusic by Alibaba
https://huggingface.co/FunAudioLLM
✨ Baichuan Audio by BaichuanAI
baichuan-inc/Baichuan-Audio-Instruct

Video:
✨ Wan2.1 by Alibaba_Wan
Wan-AI/Wan2.1-T2V-14B
✨ Stepvideo-T2V by StepFun AI
stepfun-ai/stepvideo-t2v
✨ SkyReels-V1 by Skywork
Skywork/skyreels-v1-67b34676ff65b4ec02d16307
✨ LLaDA-8B by RenminUniversity
GSAI-ML/LLaDA-8B-Instruct

MoE:
✨ Moonlight-16B by MoonshotAI (Kimi)
moonshotai/Moonlight-16B-A3B-Instruct

Reasoning:
✨ TinyR1-32B by Qihoo360
qihoo360/TinyR1-32B-Preview

Dataset:
✨ Chinese DeepSeek R1-Distill data -110k
Congliu/Chinese-DeepSeek-R1-Distill-data-110k
reacted to MonsterMMORPG's post with 🔥 10 days ago
view post
Post
3065
Wan 2.1 AI Video Model: Ultimate Step-by-Step Tutorial for Windows & Affordable Private Cloud Setup : https://youtu.be/hnAhveNy-8s

https://youtu.be/hnAhveNy-8s

Please check all screenshots to see latest news and updates after the tutorial video

Alibaba’s new Wan 2.1 text-to-video, video-to-video and image-to-video Open Source AI is unbelievable. In this tutorial I will show how you can install Wan 2.1 all publicly published models into your Windows PC with 1-click installation and use them with the easiest possible way. With the Gradio APP I have developed, you will be able to use Wan AI with as low as 3.5GB VRAM having GPUs. Furthermore, for those who want to utilize powerful private cloud GPUs with cheapest possible prices, I will show how to 1-click and install Wan 2.1 on Massed Compute and on RunPod. Additionally, I will compare performance of RTX 3090 TI with RTX 5090 on all Wan 2.1 models. You will be shocked to see performance of RTX 5090. Also the APP I developed supports all RTX 5000 series on Windows with Python VENV natively. You don't need Linux or WSL.

🔗 Full Instructions, Configs, Installers, Information and Links Shared Post (the one used in the tutorial) ⤵️
▶️ https://www.patreon.com/posts/click-to-open-post-used-in-tutorial-123105403

🔗 SECourses Official Discord 9500+ Members ⤵️
▶️ https://discord.com/servers/software-engineering-courses-secourses-772774097734074388

🔗 Stable Diffusion, FLUX, Generative AI Tutorials and Resources GitHub ⤵️
▶️ https://github.com/FurkanGozukara/Stable-Diffusion

🔗 SECourses Official Reddit - Stay Subscribed To Learn All The News and More ⤵️
▶️ https://www.reddit.com/r/SECourses/

🔗 MSI RTX 5090 TRIO FurMark Benchmarking + Overclocking + Noise Testing and Comparing with RTX 3090 TI ⤵️
▶️ https://youtu.be/uV3oqdILOmA

🔗 RTX 5090 Tested Against FLUX DEV, SD 3.5 Large, SD 3.5 Medium, SDXL, SD 1.5, AMD 9950X + RTX 3090 TI ⤵️
▶️ https://youtu.be/jHlGzaDLkto
  • 1 reply
·
reacted to Kseniase's post with 👍 17 days ago
view post
Post
9542
8 Free Sources about AI Agents:

Agents seem to be everywhere and this collection is for a deep dive into the theory and practice:

1. "Agents" Google's whitepaper by Julia Wiesinger, Patrick Marlow and Vladimir Vuskovic -> https://www.kaggle.com/whitepaper-agents
Covers agents, their functions, tool use and how they differ from models

2. "Agents in the Long Game of AI. Computational Cognitive Modeling for Trustworthy, Hybrid AI" book by Marjorie McShane, Sergei Nirenburg, and Jesse English -> https://direct.mit.edu/books/oa-monograph/5833/Agents-in-the-Long-Game-of-AIComputational
Explores building AI agents, using Hybrid AI, that combines ML with knowledge-based reasoning

3. "AI Engineer Summit 2025: Agent Engineering" 8-hour video -> https://www.youtube.com/watch?v=D7BzTxVVMuw
Experts' talks that share insights on the freshest Agent Engineering advancements, such as Google Deep Research, scaling tips and more

4. AI Agents Course from Hugging Face -> https://huggingface.co/learn/agents-course/en/unit0/introduction
Agents' theory and practice to learn how to build them using top libraries and tools

5. "Artificial Intelligence: Foundations of Computational Agents", 3rd Edition, book by David L. Poole and Alan K. Mackworth -> https://artint.info/3e/html/ArtInt3e.html
Agents' architectures, how they learn, reason, plan and act with certainty and uncertainty

6. "Intelligent Agents: Theory and Practice" book by Michael Wooldridge -> https://www.cs.ox.ac.uk/people/michael.wooldridge/pubs/ker95/ker95-html.html
A fascinating option to dive into how agents were seen in 1995 and explore their theory, architectures and agent languages

7. The Turing Post articles "AI Agents and Agentic Workflows" on Hugging Face -> https://huggingface.co/Kseniase
We explore agentic workflows in detail and agents' building blocks, such as memory and knowledge

8. Our collection "8 Free Sources to Master Building AI Agents" -> https://www.turingpost.com/p/building-ai-agents-sources
·
reacted to mkurman's post with 👍 28 days ago
view post
Post
2041
I've been working on something cool: a GRPO with an LLM evaluator that can also perform SFT on the feedback data - if you want. Check it out 😊

Any 🌟are more than welcome 🤗

https://github.com/mkurman/grpo-llm-evaluator
reacted to CultriX's post with ❤️ 28 days ago
view post
Post
2390
Final upgrade to the Multi-Agent Task Completion Space: CultriX/MultiAgent-CodeTask .

It now includes :
- a live stream of the progress being made on the task (see included video),
- The following components:
1. Automatic prompt optimization
2. An orchestrator deciding which agent to call dynamically including feedback from a human (human-in-the-loop)
3. A coding agent to complete the task
4. A code reviewing agent to iteratively provide feedback to improve the code generated by the coding agent until the code meets the required criteria after which it is approved.
5. A testing agent that tests the approved code or provides information on how to test it.
6. A documentation agent that provides documentation and a help message for the approved and tested code.

reacted to davidberenstein1957's post with 🤗 about 1 month ago
reacted to nyuuzyou's post with 👍 about 1 month ago
view post
Post
2468
📱 UI Navigation Corpus - teleren/ui-navigation-corpus

A comprehensive collection of mobile and web UI elements created by a new member of the Hugging Face community @teleren . I'm glad that I was able to provide a little help together with @its5Q to get this dataset published.

This dataset contains:
- Screenshots and recordings of mobile (iOS/Android) and web interfaces
- UI navigation annotations and metadata
- Screen categorization tags and text extractions
- Navigation paths and screen relationships
- Version control for UI imagery

Perfect for training UI navigation agents and understanding interface patterns. The dataset provides detailed annotations linking screens, sections, and navigation flows together.
reacted to chansung's post with 👍 about 2 months ago
view post
Post
2063
Simple Summarization on DeepSeek-R1 from DeepSeek AI

The RL stage is very important.
↳ However, it is difficult to create a truly helpful AI for people solely through RL.
↳ So, we applied a learning pipeline consisting of four stages: providing a good starting point, reasoning RL, SFT, and safety RL, and achieved performance comparable to o1.
↳ Simply fine-tuning other open models with the data generated by R1-Zero (distillation) resulted in performance comparable to o1-mini.

Of course, this is just a brief overview and may not be of much help. All models are accessible on Hugging Face, and the paper can be read through the GitHub repository.


Model: https://huggingface.co/deepseek-ai
Paper: https://github.com/deepseek-ai/DeepSeek-R1
  • 1 reply
·
reacted to danielhanchen's post with 🔥 2 months ago
reacted to reddgr's post with 👀 2 months ago
view post
Post
2356
Major update on the Talking to Chatbots dataset! Expanded the 'wrapped' dataset (one row per chat) to 2.86k records, and the 'unwrapped' version (one row per conversation turn) to 11k records. The main source is my ChatGPT archive with nearly 2 years of chats. It is still a work in progress as I incorporate chats from other sources and qualitative metrics (SCBN) for responses.

reddgr/talking-to-chatbots-unwrapped-chats

reddgr/talking-to-chatbots-chats

reacted to Xenova's post with 👍 8 months ago
view post
Post
8073
Introducing Whisper Diarization: Multilingual speech recognition with word-level timestamps and speaker segmentation, running 100% locally in your browser thanks to 🤗 Transformers.js!

Tested on this iconic Letterman interview w/ Grace Hopper from 1983!
- Demo: Xenova/whisper-speaker-diarization
- Source code: Xenova/whisper-speaker-diarization
  • 1 reply
·
upvoted an article 11 months ago
reacted to chansung's post with ❤️ 11 months ago
view post
Post
4410
💻 Smoothing the Transition from Service LLM to Local LLM

Imagine your go-to LLM service is down, or you need to use it offline – yikes! This project is all about having that "Plan B" ready to go. Here's LLaMA Duo I've been building with @sayakpaul :

✨ Fine-tune a smaller LLM: We used Hugging Face's alignment-handbook to teach a smaller-sized LLM to mimic my favorite large language model. Think of it as that super-smart AI assistant getting a capable understudy.

🤖 Batch Inference: Let's get that fine-tuned LLM working! My scripts generate lots of text like a champ, and we've made sure things run smoothly even with bigger workloads.

🧐 Evaluation: How well is my small LLM doing? We integrated with the Gemini API to use it as an expert judge – it compares my model's work to the original. Talk about a tough critic!

🪄 Synthetic Data Generation: Need to boost that model's performance? Using Gemini's feedback, we can create even more training data, custom-made to make the LLM better.

🧱 Building Blocks: This isn't just a one-time thing – it's a toolkit for all kinds of LLMOps work. Want to change your evaluation metrics? Bring in models trained differently? Absolutely, let's make it happen.

Why this project is awesome:

💪 Reliability: Keep things running no matter what happens to your main LLM source.
🔒 Privacy: Process sensitive information on your own terms.
🗺️ Offline capable: No internet connection? No problem!
🕰️ Version Control: Lock in your favorite LLM's behavior, even if the service model changes.

We'm excited to share the code on GitHub. Curious to see what you all think! 👉🏻 https://github.com/deep-diver/llamaduo