
John Smith PRO

John6666

AI & ML interests

None yet

Recent Activity

Organizations

open/ acc, Solving Real World Problems, FashionStash Group meeting

John6666's activity

reacted to m-ric's post with 🚀🤗 about 11 hours ago
Introducing open Deep-Research by Hugging Face! 💥

OpenAI's latest agentic app Deep Research seems really good... But it's closed, as usual.

โฑ๏ธ So with a team of cracked colleagues, we set ourselves a 24hours deadline to replicate and open-source Deep Research! โฑ๏ธ

โžก๏ธ We built open-Deep-Research, an entirely open agent that can: navigate the web autonomously, scroll and search through pages, download and manipulate files, run calculation on data...

We aimed for the best performance: are the agent's answers really rigorous?

On the GAIA benchmark, Deep Research scored 67% accuracy on the validation set.
➡️ open-Deep-Research is at 55% (powered by o1), making it:
- the best pass@1 solution submitted
- the best open solution 💪💪

And it's only getting started! Please jump in, drop PRs, and let's bring it to the top!

Read the blog post 👉 https://huggingface.co/blog/open-deep-research
replied to Keltezaa's post about 13 hours ago
reacted to singhsidhukuldeep's post with 👍 about 13 hours ago
Exciting breakthrough in streaming recommendation systems! @BytedanceTalk researchers have developed the "Long-Term Interest Clock" (LIC), a revolutionary approach to understanding user preferences throughout the day.

>> Technical Innovation
The system introduces two groundbreaking modules:
- Clock-based General Search Unit (Clock-GSU): Intelligently retrieves relevant user behaviors by analyzing time patterns and content similarity
- Clock-based Exact Search Unit (Clock-ESU): Employs time-gap-aware attention mechanism to precisely model user interests

>> Key Advantages
LIC addresses critical limitations of existing systems by:
- Providing fine-grained time perception instead of discrete hour-based recommendations
- Analyzing long-term user behavior patterns rather than just short-term interactions
- Operating at item-level granularity versus broad category-level interests

>> Real-World Impact
Already deployed in Douyin Music App, the system has demonstrated remarkable results:
- 0.122% improvement in user active days
- Significant boost in engagement metrics including likes and play rates
- Enhanced user satisfaction with reduced dislike rates

>> Under the Hood
The system processes user behavior sequences spanning an entire year, utilizing multi-head attention mechanisms and sophisticated time-gap calculations to understand user preferences. It pre-computes embeddings stored in parameter servers for real-time performance, making it highly scalable for production environments.
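The post doesn't spell out the exact attention formulation, but as a loose illustration of the "time-gap-aware attention" idea, the sketch below adds a bucketed time-gap bias to ordinary scaled dot-product scores over a user's behavior history; the bucketing scheme, bias table, and dimensions are placeholders, not the authors' implementation.

```python
import numpy as np

def time_gap_aware_attention(query, keys, values, time_gaps, gap_bias):
    """Illustrative time-gap-aware attention (not the paper's exact formulation).

    query:     (d,)   embedding of the candidate item / current request
    keys:      (n, d) embeddings of historical behaviors
    values:    (n, d) value vectors of historical behaviors
    time_gaps: (n,)   seconds between each behavior and "now"
    gap_bias:  (B,)   per-bucket bias (learned in practice; placeholder here)
    """
    d = query.shape[0]
    # Standard scaled dot-product relevance between candidate and history.
    scores = keys @ query / np.sqrt(d)                       # (n,)
    # Map each time gap to a coarse bucket (hour-scale here, purely illustrative).
    buckets = np.minimum(time_gaps // 3600, len(gap_bias) - 1).astype(int)
    scores = scores + gap_bias[buckets]                      # inject the time-gap signal
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ values                                  # (d,) interest vector

# Toy usage with random data.
rng = np.random.default_rng(0)
out = time_gap_aware_attention(
    query=rng.normal(size=16),
    keys=rng.normal(size=(50, 16)),
    values=rng.normal(size=(50, 16)),
    time_gaps=rng.integers(0, 365 * 24 * 3600, size=50),
    gap_bias=np.linspace(1.0, -1.0, 24),
)
```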

This innovation marks a significant step forward in personalized content delivery, especially for streaming platforms where user preferences vary throughout the day. The research has been accepted for presentation at WWW '25, Sydney.
reacted to davidberenstein1957's post with 🤗 about 13 hours ago
reacted to ggbetz's post with 👀 about 13 hours ago
We've just released syncIALO -- a multi-purpose synthetic debate and argument mapping corpus with more than 600k arguments:

๐Ÿ“ Blog article: https://huggingface.co/blog/ggbetz/introducing-syncialo
๐Ÿ›ข๏ธ Dataset: DebateLabKIT/syncialo-raw
๐Ÿ‘ฉโ€๐Ÿ’ป Code: https://github.com/debatelab/syncIALO
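If you just want to peek at the corpus, a minimal sketch with the datasets library is below; the repo ID comes from the post, while the split name is an assumption about how the raw dump is organized.

```python
from datasets import load_dataset

# Repo ID from the post; the split name "train" is an assumption.
ds = load_dataset("DebateLabKIT/syncialo-raw", split="train")
print(ds)      # features and row count
print(ds[0])   # one raw argument record
```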

🤗 Hugging Face has sponsored the syncIALO project through inference time / compute credits. 🙏 We gratefully acknowledge the generous support. 🫶
replied to Keltezaa's post about 18 hours ago

I'm sorry. I didn't think about it enough.
In my country, $9 is enough to buy 4 to 6 meals for the average person (according to statistics released by the government; I think it's actually lower...), so it's not a small amount of money. There are also many people living in debt.
But even so, I had forgotten that we are fortunate enough to be able to buy toys by holding back a little on our meals.

reacted to hba123's post with 🔥 about 19 hours ago
We developed a method that ensures almost-sure safety (i.e., safety with probability approaching 1), and we proved this result. We then present a practical implementation, which we call InferenceGuard. InferenceGuard has impressive practical results: 91.04% on Alpaca-7B and 100% safety on Beaver-7B-v3.

Now, it is easy to get high safety results like those with a dumb model, e.g., one that just doesn't answer, or answers with EOS, and so on. However, our goal is not only to have safe results but also to make sure the rewards are high - we want a good trade-off between safety and rewards! That's exactly what we show: InferenceGuard achieves it!

Check it out: Almost Surely Safe Alignment of Large Language Models at Inference-Time (2502.01208)
reacted to rubenroy's post with 🔥 about 19 hours ago
🔥🚀 Hey everyone! I'm excited to share my latest LLM release: Gilgamesh 72B, a model built on Qwen 2.5-72B Instruct. Gilgamesh was trained on a couple of my GammaCorpus datasets, specifically:

- rubenroy/GammaCorpus-CoT-Math-170k
- rubenroy/GammaCorpus-v2-5m
- rubenroy/GammaCorpus-Fact-QA-450k

I've submitted GGM 72B to the Open LLM Leaderboard for benchmarking; I'll send an update post once the results are in!

You can try it out and share your feedback; check out the model page and see what it can do:
👉 rubenroy/Gilgamesh-72B

Would love to hear your thoughts!
reacted to frimelle's post with 👀 about 19 hours ago
I was quoted in an article about the French Lucie AI in La Presse. While I love the name for obvious reasons 👀 there were still a lot of problems with the model and with how and when it was deployed. Nevertheless, seeing new, smaller models being developed is an exciting direction for the coming years of AI development!

https://www.lapresse.ca/affaires/techno/2025-02-02/radioscopie/lucie-l-ia-francaise-qui-ne-passe-pas-le-test.php

Also fun to see my comments in French.
reacted to Jaward's post with 🚀 about 19 hours ago
ByteDance drops OmniHuman 🔥
This is peak SOTA performance - flawless natural gestures with perfect lip sync and facial expressions. This is the second time they've released SOTA-level talking heads, only this time with hands and body motion.
Project: https://omnihuman-lab.github.io/
reacted to nicolay-r's post with 👀 about 21 hours ago
📢 Qwen has released Qwen2.5-Max, which claims to outperform DeepSeek-V3 [Edited: not R1].
Here is how you can start applying it to handle CSV / JSONL data.
The model is compatible with the OpenAI API, so here is my wrapper for it:
🌌 https://github.com/nicolay-r/nlp-thirdgate/blob/master/llm/openai_156.py

🚀 All you have to do is set the
base-url: https://dashscope-intl.aliyuncs.com/compatible-mode/v1
and the platform's API key.
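If you prefer to skip the wrapper, a minimal sketch with the official openai Python client is below; the model identifier "qwen-max" and the environment-variable name are assumptions, so check the Alibaba Cloud docs linked underneath for the exact values.

```python
import os
from openai import OpenAI

# Point the standard OpenAI client at the OpenAI-compatible endpoint from the post.
client = OpenAI(
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
    api_key=os.environ["DASHSCOPE_API_KEY"],  # assumed variable name; use your platform key
)

resp = client.chat.completions.create(
    model="qwen-max",  # assumed model identifier
    messages=[{"role": "user", "content": "Convert this CSV row to JSON: id=1, text=hello"}],
)
print(resp.choices[0].message.content)
```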

โ†—๏ธ Below is the link to the complete example (see screenshot):
https://github.com/nicolay-r/nlp-thirdgate/blob/master/tutorials/llm_qwen_25_max_chat.sh

📰 Source: https://www.alibabacloud.com/help/en/model-studio/developer-reference/what-is-qwen-llm
📺 Official Sandbox Demo: Qwen/Qwen2.5-Max-Demo
📜 Paper: https://arxiv.org/abs/2412.15115
reacted to Tonic's post with 🔥 about 21 hours ago
🙋🏻‍♂️ Hey there folks,

Goedel's Theorem Prover is now being demoed on Hugging Face: Tonic/Math

Give it a try!
reacted to oleggolev's post with 🚀 about 21 hours ago
🚀 Dobby-mini is out!

Last week, @SentientAGI released two demo models for the upcoming Dobby model family which we are building with your feedback: SentientAGI/dobby-mini-679af3ed45dfdd8c25e8112c

🔥 The two models (available as transformers and GGUF) are here:
- SentientAGI/Dobby-Mini-Unhinged-Llama-3.1-8B 😈
- SentientAGI/Dobby-Mini-Leashed-Llama-3.1-8B 😇

Fine-tuned from Llama-3.1-8B-Instruct while retaining benchmark performance, these personality-enhanced models are prime for building anything from AI companions and social agents to opinionated chatbots and content generators.

- 🦅 Pro-freedom
- 💸 Pro-crypto
- 💪 Opinionated and stand their ground
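If you go the transformers route, a minimal sketch is below; the model ID is taken from the list above, while the chat formatting and generation settings are assumptions rather than the authors' recommended setup (the written instructions linked below are the authoritative reference).

```python
from transformers import pipeline

# Model ID from the post; device/dtype and generation settings are assumptions.
chat = pipeline(
    "text-generation",
    model="SentientAGI/Dobby-Mini-Unhinged-Llama-3.1-8B",
    device_map="auto",
    torch_dtype="auto",
)

messages = [{"role": "user", "content": "What do you think about open-source AI?"}]
out = chat(messages, max_new_tokens=128)
print(out[0]["generated_text"][-1]["content"])
```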

💻 Local Setup with Ollama:
- Written instructions: https://huggingface.co/blog/chrisaubin/hosting-dobby-mini
- Companion video: https://www.youtube.com/watch?v=b1rbtCgK2YA

🎆 Use via API on Fireworks for free!
- Unhinged: https://tinyurl.com/4h2c7tmv
- Leashed: https://tinyurl.com/2xjwsdxb

โœŒ๏ธ Try Dobby-mini via a Gradio demo:
- https://demo-dobby.sentient.xyz/
- No Internet search, ask it some personal questions!

Dobby-70B en route 😎
reacted to CultriX's post with 🔥 about 21 hours ago
# Multi-Agent Collaboration for Coding Tasks - Updated Space!

This version does not rely on AutoGen.
The user simply enters their OPENAI_API_KEY and a task, and the Space goes to work, employing five agents in sequence (a rough sketch of the chain follows at the end of this post):
1. a prompt-enhancer agent,
2. an orchestrator agent,
3. a coder agent,
4. a code-reviewing agent, and
5. a code-documentation-generator agent.

See below image for an example workflow:

CultriX/MultiAgent-CodeTask
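As a very rough sketch (not the Space's actual code), the chain above can be thought of as a sequence of role-prompted chat calls, each agent consuming the previous agent's output; the prompts and model name below are placeholders.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Placeholder system prompts for each agent in the chain.
AGENTS = [
    ("prompt enhancer", "Rewrite the user's task as a precise, detailed specification."),
    ("orchestrator", "Break the specification into ordered implementation steps."),
    ("coder", "Write Python code that implements the steps."),
    ("code reviewer", "Review the code, fix bugs, and return the corrected version."),
    ("doc generator", "Write documentation (docstrings + README section) for the code."),
]

def run_pipeline(task: str, model: str = "gpt-4o-mini") -> str:
    """Pass the task through each agent in order, feeding output to the next one."""
    current = task
    for name, system_prompt in AGENTS:
        resp = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": current},
            ],
        )
        current = resp.choices[0].message.content
        print(f"--- {name} done ---")
    return current

# print(run_pipeline("Build a CLI that converts CSV files to JSONL."))
```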
reacted to sequelbox's post with ➕ about 21 hours ago
New sneak preview of my next release! Raiden is a deepseek-ai/DeepSeek-R1 synthetic dataset that uses creative-reasoning and analytic-reasoning prompts!

This preview release has the first 5.8k rows, all responses generated using DeepSeek's 685b parameter R1 model: sequelbox/Raiden-DSR1-PREVIEW

Enjoy this look at R1's reasoning skills! Full dataset coming soon.
replied to Keltezaa's post about 21 hours ago

Sometimes I wish there were more quota/slots. Also, because the decorated function runs in a separate process, it's painful that objects that can't be pickled don't work...
Well, it's only $9, so I won't be greedy.

I think the latter might be difficult to achieve because the system is a shared resource. If failed attempts didn't count, an attack that occupies the shared resource by intentionally continuing to fail would succeed...
Of course, I don't mean that you would do that.

reacted to Keltezaa's post with 👀 about 21 hours ago
Other "Pro" members & staff.
I am seriously considering of canceling my Pro subscription, unless this matter can be addressed.

I subscribed to the Pro package for the use of ZeroGPU, but the current recovery rate is not worth the money. I am supposed to get 5x more usage and/or recovery.

By my rough calculation, the current recovery rate for GPU time spent is 18 minutes for every 60 seconds of GPU usage. One of my "T-T-image" Spaces uses about 45-50 seconds for each render, which gives 25-28 images.
At the current recovery rate that is 2 images per hour, limited to 24 images per day.

I may be wrong here, or there may be something in my app.py code that is messing with the usage time. Currently it is set to "@spaces.ZeroGPU(Duration=60)". If I reduce this, I get an error when my render takes more time (GPU task aborted, or error).
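For reference, here is a minimal sketch of how a ZeroGPU duration budget is usually declared with the spaces package in a Gradio app; the decorator spelling below is the commonly documented one and may differ from the snippet quoted above, and the model and function are placeholders, not the poster's actual app.py.

```python
import gradio as gr
import spaces
import torch
from diffusers import DiffusionPipeline

# Placeholder pipeline; the poster's actual model and settings are unknown.
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

@spaces.GPU(duration=60)  # seconds of GPU time reserved per call; raise it if renders run longer
def generate(prompt: str):
    return pipe(prompt).images[0]

gr.Interface(fn=generate, inputs="text", outputs="image").launch()
```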

The second thing that bothers me a bit is that some errors, or failed image generations, do not refund the usage. So if an image fails due to whatever error, it still gets added to the usage and, as mentioned before, recovers very slowly. So at the end of the day, if I get 15 usable images I consider myself lucky. As we all know, you can prompt like a "boss" and the images don't always come out as you imagined or hoped.

Thank you for taking the time to read my *RANT*, but I sincerely hope that we, as paying users, can get more value for money.

Regards.
reacted to rubenroy's post with 🚀🔥 1 day ago