
John Smith PRO

John6666

AI & ML interests

None yet

Recent Activity

Organizations

open/ acc, Solving Real World Problems, FashionStash Group meeting

John6666's activity

reacted to m-ric's post with 🚀🤗 about 11 hours ago
Introducing open Deep-Research by Hugging Face! 💥

OpenAI's latest agentic app Deep Research seems really good... But it's closed, as usual.

โฑ๏ธ So with a team of cracked colleagues, we set ourselves a 24hours deadline to replicate and open-source Deep Research! โฑ๏ธ

โžก๏ธ We built open-Deep-Research, an entirely open agent that can: navigate the web autonomously, scroll and search through pages, download and manipulate files, run calculation on data...

We aimed for the best performance: are the agent's answers really rigorous?

On the GAIA benchmark, Deep Research scored 67% accuracy on the validation set.
➡️ open-Deep-Research is at 55% (powered by o1), making it:
- the best pass@1 solution submitted
- the best open solution 💪💪

And it's only getting started! Please jump in, drop PRs, and let's bring it to the top!

Read the blog post 👉 https://huggingface.co/blog/open-deep-research
replied to Keltezaa's post about 13 hours ago
reacted to singhsidhukuldeep's post with 👍 about 13 hours ago
Exciting breakthrough in streaming recommendation systems! @BytedanceTalk researchers have developed the "Long-Term Interest Clock" (LIC), a revolutionary approach to understanding user preferences throughout the day.

>> Technical Innovation
The system introduces two groundbreaking modules:
- Clock-based General Search Unit (Clock-GSU): Intelligently retrieves relevant user behaviors by analyzing time patterns and content similarity
- Clock-based Exact Search Unit (Clock-ESU): Employs time-gap-aware attention mechanism to precisely model user interests

>> Key Advantages
LIC addresses critical limitations of existing systems by:
- Providing fine-grained time perception instead of discrete hour-based recommendations
- Analyzing long-term user behavior patterns rather than just short-term interactions
- Operating at item-level granularity versus broad category-level interests

>> Real-World Impact
Already deployed in Douyin Music App, the system has demonstrated remarkable results:
- 0.122% improvement in user active days
- Significant boost in engagement metrics including likes and play rates
- Enhanced user satisfaction with reduced dislike rates

>> Under the Hood
The system processes user behavior sequences spanning an entire year, utilizing multi-head attention mechanisms and sophisticated time-gap calculations to understand user preferences. It pre-computes embeddings stored in parameter servers for real-time performance, making it highly scalable for production environments.
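The post doesn't spell out the exact attention formulation, but as a loose illustration of the "time-gap-aware attention" idea, the sketch below adds a bucketed time-gap bias to ordinary scaled dot-product scores over a user's behavior history; the bucketing scheme, bias table, and dimensions are placeholders, not the authors' implementation.

```python
import numpy as np

def time_gap_aware_attention(query, keys, values, time_gaps, gap_bias):
    """Illustrative time-gap-aware attention (not the paper's exact formulation).

    query:     (d,)   embedding of the candidate item / current request
    keys:      (n, d) embeddings of historical behaviors
    values:    (n, d) value vectors of historical behaviors
    time_gaps: (n,)   seconds between each behavior and "now"
    gap_bias:  (B,)   per-bucket bias (learned in practice; placeholder here)
    """
    d = query.shape[0]
    # Standard scaled dot-product relevance between candidate and history.
    scores = keys @ query / np.sqrt(d)                       # (n,)
    # Map each time gap to a coarse bucket (hour-scale here, purely illustrative).
    buckets = np.minimum(time_gaps // 3600, len(gap_bias) - 1).astype(int)
    scores = scores + gap_bias[buckets]                      # inject the time-gap signal
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ values                                  # (d,) interest vector

# Toy usage with random data.
rng = np.random.default_rng(0)
out = time_gap_aware_attention(
    query=rng.normal(size=16),
    keys=rng.normal(size=(50, 16)),
    values=rng.normal(size=(50, 16)),
    time_gaps=rng.integers(0, 365 * 24 * 3600, size=50),
    gap_bias=np.linspace(1.0, -1.0, 24),
)
```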

This innovation marks a significant step forward in personalized content delivery, especially for streaming platforms where user preferences vary throughout the day. The research has been accepted for presentation at WWW '25, Sydney.
reacted to davidberenstein1957's post with 🤗 about 13 hours ago
reacted to ggbetz's post with 👀 about 13 hours ago
We've just released syncIALO -- a multi-purpose synthetic debate and argument mapping corpus with more than 600k arguments:

๐Ÿ“ Blog article: https://huggingface.co/blog/ggbetz/introducing-syncialo
๐Ÿ›ข๏ธ Dataset: DebateLabKIT/syncialo-raw
๐Ÿ‘ฉโ€๐Ÿ’ป Code: https://github.com/debatelab/syncIALO
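If you just want to peek at the corpus, a minimal sketch with the datasets library is below; the repo ID comes from the post, while the split name is an assumption about how the raw dump is organized.

```python
from datasets import load_dataset

# Repo ID from the post; the split name "train" is an assumption.
ds = load_dataset("DebateLabKIT/syncialo-raw", split="train")
print(ds)      # features and row count
print(ds[0])   # one raw argument record
```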

🤗 Hugging Face has sponsored the syncIALO project through inference time / compute credits. 🙏 We gratefully acknowledge the generous support. 🫶
replied to Keltezaa's post about 18 hours ago

I'm sorry. I didn't think about it enough.
In my country, $9 is enough to buy 4 to 6 meals for the average person (according to statistics released by the government; I think it's actually lower...), so it's not a small amount of money. There are also many people living in debt.
But even so, I had forgotten that we are fortunate enough to be able to buy toys by holding back a little on our meals.

reacted to hba123's post with 🔥 about 19 hours ago
We developed a method that ensures almost-sure safety (i.e., safety with probability approaching 1), and we proved this result. We then present a practical implementation, which we call InferenceGuard. InferenceGuard has impressive practical results: 91.04% on Alpaca-7B and 100% safety on Beaver-7B-v3.

Now, it is easy to get high safety results like those with a dumb model, e.g., one that just doesn't answer, or answers with EOS, and so on. However, our goal is not only to have safe results but also to make sure the rewards are high - we want a good trade-off between safety and rewards! That's exactly what we show: InferenceGuard achieves it!

Check it out: Almost Surely Safe Alignment of Large Language Models at Inference-Time (2502.01208)
reacted to rubenroy's post with 🔥 about 19 hours ago
🔥🚀 Hey everyone! I'm excited to share my latest LLM release: Gilgamesh 72B, a model built on Qwen 2.5-72B Instruct. Gilgamesh was trained on a couple of my GammaCorpus datasets, specifically:

- rubenroy/GammaCorpus-CoT-Math-170k
- rubenroy/GammaCorpus-v2-5m
- rubenroy/GammaCorpus-Fact-QA-450k

I've submitted GGM 72B to the Open LLM Leaderboard for benchmarking; I'll send an update post once the results are in!

You can try it out and share your feedback; check out the model page and see what it can do:
👉 rubenroy/Gilgamesh-72B

Would love to hear your thoughts!
reacted to frimelle's post with 👀 about 19 hours ago
I was quoted in an article about the French Lucie AI in La Presse. While I love the name for obvious reasons 👀 there were still a lot of problems with the model and with how and when it was deployed. Nevertheless, seeing new, smaller models being developed is an exciting direction for the coming years of AI development!

https://www.lapresse.ca/affaires/techno/2025-02-02/radioscopie/lucie-l-ia-francaise-qui-ne-passe-pas-le-test.php

Also fun to see my comments in French.
reacted to Jaward's post with 🚀 about 19 hours ago
ByteDance drops OmniHuman 🔥
This is peak SOTA performance - flawless natural gestures with perfect lip sync and facial expressions. This is the second time they've released SOTA-level talking heads, only this time with hands and body motion.
Project: https://omnihuman-lab.github.io/
reacted to nicolay-r's post with 👀 about 21 hours ago
📢 Qwen has released Qwen2.5-Max, which claims to outperform DeepSeek-V3 [Edited: not R1].
Here is how you can start applying it to handle CSV / JSONL data.
The model is compatible with the OpenAI API, so here is my wrapper for it:
🌌 https://github.com/nicolay-r/nlp-thirdgate/blob/master/llm/openai_156.py

🚀 All you have to do is set the
base-url: https://dashscope-intl.aliyuncs.com/compatible-mode/v1
and the platform's API key.
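If you prefer to skip the wrapper, a minimal sketch with the official openai Python client is below; the model identifier "qwen-max" and the environment-variable name are assumptions, so check the Alibaba Cloud docs linked underneath for the exact values.

```python
import os
from openai import OpenAI

# Point the standard OpenAI client at the OpenAI-compatible endpoint from the post.
client = OpenAI(
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
    api_key=os.environ["DASHSCOPE_API_KEY"],  # assumed variable name; use your platform key
)

resp = client.chat.completions.create(
    model="qwen-max",  # assumed model identifier
    messages=[{"role": "user", "content": "Convert this CSV row to JSON: id=1, text=hello"}],
)
print(resp.choices[0].message.content)
```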

โ†—๏ธ Below is the link to the complete example (see screenshot):
https://github.com/nicolay-r/nlp-thirdgate/blob/master/tutorials/llm_qwen_25_max_chat.sh

📰 Source: https://www.alibabacloud.com/help/en/model-studio/developer-reference/what-is-qwen-llm
📺 Official Sandbox Demo: Qwen/Qwen2.5-Max-Demo
📜 Paper: https://arxiv.org/abs/2412.15115
reacted to Tonic's post with 🔥 about 21 hours ago
🙋🏻‍♂️ Hey there folks,

Goedel's Theorem Prover is now being demoed on Hugging Face: Tonic/Math

Give it a try!
reacted to oleggolev's post with 🚀 about 21 hours ago
🚀 Dobby-mini is out!

Last week, @SentientAGI released two demo models for the upcoming Dobby model family which we are building with your feedback: SentientAGI/dobby-mini-679af3ed45dfdd8c25e8112c

🔥 The two models (available as transformers and GGUF) are here:
- SentientAGI/Dobby-Mini-Unhinged-Llama-3.1-8B 😈
- SentientAGI/Dobby-Mini-Leashed-Llama-3.1-8B 😇

Fine-tuned from Llama-3.1-8B-Instruct while retaining benchmark performance, these personality-enhanced models are prime for building anything from AI companions and social agents to opinionated chatbots and content generators.

- 🦅 Pro-freedom
- 💸 Pro-crypto
- 💪 Opinionated and stand their ground
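If you go the transformers route, a minimal sketch is below; the model ID is taken from the list above, while the chat formatting and generation settings are assumptions rather than the authors' recommended setup (the written instructions linked below are the authoritative reference).

```python
from transformers import pipeline

# Model ID from the post; device/dtype and generation settings are assumptions.
chat = pipeline(
    "text-generation",
    model="SentientAGI/Dobby-Mini-Unhinged-Llama-3.1-8B",
    device_map="auto",
    torch_dtype="auto",
)

messages = [{"role": "user", "content": "What do you think about open-source AI?"}]
out = chat(messages, max_new_tokens=128)
print(out[0]["generated_text"][-1]["content"])
```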

💻 Local Setup with Ollama:
- Written instructions: https://huggingface.co/blog/chrisaubin/hosting-dobby-mini
- Companion video: https://www.youtube.com/watch?v=b1rbtCgK2YA

🎆 Use via API on Fireworks for free!
- Unhinged: https://tinyurl.com/4h2c7tmv
- Leashed: https://tinyurl.com/2xjwsdxb

โœŒ๏ธ Try Dobby-mini via a Gradio demo:
- https://demo-dobby.sentient.xyz/
- No Internet search, ask it some personal questions!

Dobby-70B en route 😎
reacted to CultriX's post with 🔥 about 21 hours ago
# Multi-Agent Collaboration for Coding Tasks - Updated Space!

This version does not rely on AutoGen.
The user simply enters their OPENAI_API_KEY and a task, and the Space goes to work, employing five agents in sequence (a rough sketch of the chain follows at the end of this post):
1. a prompt-enhancer agent,
2. an orchestrator agent,
3. a coder agent,
4. a code-reviewing agent, and
5. a code-documentation-generator agent.

See below image for an example workflow:

CultriX/MultiAgent-CodeTask
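As a very rough sketch (not the Space's actual code), the chain above can be thought of as a sequence of role-prompted chat calls, each agent consuming the previous agent's output; the prompts and model name below are placeholders.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Placeholder system prompts for each agent in the chain.
AGENTS = [
    ("prompt enhancer", "Rewrite the user's task as a precise, detailed specification."),
    ("orchestrator", "Break the specification into ordered implementation steps."),
    ("coder", "Write Python code that implements the steps."),
    ("code reviewer", "Review the code, fix bugs, and return the corrected version."),
    ("doc generator", "Write documentation (docstrings + README section) for the code."),
]

def run_pipeline(task: str, model: str = "gpt-4o-mini") -> str:
    """Pass the task through each agent in order, feeding output to the next one."""
    current = task
    for name, system_prompt in AGENTS:
        resp = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": current},
            ],
        )
        current = resp.choices[0].message.content
        print(f"--- {name} done ---")
    return current

# print(run_pipeline("Build a CLI that converts CSV files to JSONL."))
```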
reacted to sequelbox's post with ➕ about 21 hours ago
New sneak preview of my next release! Raiden is a deepseek-ai/DeepSeek-R1 synthetic dataset that uses creative-reasoning and analytic-reasoning prompts!

This preview release has the first 5.8k rows, all responses generated using DeepSeek's 685b parameter R1 model: sequelbox/Raiden-DSR1-PREVIEW

Enjoy this look at R1's reasoning skills! Full dataset coming soon.
replied to Keltezaa's post about 21 hours ago

Sometimes I wish there were more quota/slots. Also, because the decorated function runs in a separate process, it's painful that objects that can't be pickled don't work...
Well, it's only $9, so I won't be greedy.

I think the latter might be difficult to achieve because the system is a shared resource. If failed attempts didn't count, an attack that occupies the shared resource by intentionally continuing to fail would succeed...
Of course, I don't mean that you would do that.

reacted to Keltezaa's post with 👀 about 21 hours ago
Other "Pro" members & staff.
I am seriously considering of canceling my Pro subscription, unless this matter can be addressed.

I subscribed to the Pro package for the use of ZeroGPU, but the current recovery rate is not worth the money. I am supposed to get 5x more usage and/or recovery.

By my rough calculation, the current recovery rate for GPU time spent is 18 minutes for every 60 seconds of GPU usage. One of my "T-T-image" Spaces uses about 45-50 seconds for each render, which gives 25-28 images.
At the current recovery rate that is 2 images per hour, limited to 24 images per day.

I may be wrong here, or there may be something in my app.py code that is messing with the usage time. Currently it is set to "@spaces.ZeroGPU(Duration=60)". If I reduce this, I get an error when my render takes more time (GPU task aborted, or error).
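For reference, here is a minimal sketch of how a ZeroGPU duration budget is usually declared with the spaces package in a Gradio app; the decorator spelling below is the commonly documented one and may differ from the snippet quoted above, and the model and function are placeholders, not the poster's actual app.py.

```python
import gradio as gr
import spaces
import torch
from diffusers import DiffusionPipeline

# Placeholder pipeline; the poster's actual model and settings are unknown.
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

@spaces.GPU(duration=60)  # seconds of GPU time reserved per call; raise it if renders run longer
def generate(prompt: str):
    return pipe(prompt).images[0]

gr.Interface(fn=generate, inputs="text", outputs="image").launch()
```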

The second thing that bothers me a bit is that some errors, or failed image generations, do not refund the usage. So if an image fails due to whatever error, it still gets added to the usage and, as mentioned before, recovers very slowly. So at the end of the day, if I get 15 usable images I consider myself lucky. As we all know, you can prompt like a "boss" and the images don't always come out as you imagined or hoped.

Thank you for taking the time to read my *RANT*, but I sincerely hope that we, as paying users, can get more value for money.

Regards.
reacted to rubenroy's post with 🚀🔥 1 day ago