Brigitte Tousignant

BrigitteTousi

AI & ML interests

None yet

Recent Activity

Articles

Organizations

Hugging Face · Society & Ethics · HuggingFaceM4 · Open-Source AI Meetup · BigCode · Hugging Face OSS Metrics · IBM-NASA Prithvi Models Family · Wikimedia Movement · LeRobot · Journalists on Hugging Face · Women on Hugging Face · Social Post Explorers · Dev Mode Explorers · Hugging Face Science · open/ acc · Bluesky Community

BrigitteTousi's activity

reacted to m-ric's post with ❤️ 1 day ago
Single most important thing to do today: go try QwQ on Hugging Chat!

👉 https://huggingface.co/chat/models/Qwen/QwQ-32B-Preview
reacted to as-cle-bert's post with 🚀 1 day ago
Hi HuggingFacers! 🤗
December is here, and the time has come for most of us to wrap up our code projects and take stock of our 2024 contributions 🗓️
In order to do this, I made a small Gradio application, what-a-git-year:

as-cle-bert/what-a-git-year

that scrapes information from your GitHub profile, summarizes it, and produces nice plots 📊
You can also find the GitHub repo here: https://github.com/AstraBert/what-a-git-year ⭐

Hope that everyone had a Git year! 🎉
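At its core, a wrap-up tool like this just aggregates per-repo stats into a yearly summary. A minimal offline sketch (the dicts below only mimic the `created_at`, `stargazers_count`, and `language` fields the GitHub REST API returns; the app's actual scraping and plotting are not shown):

```python
from collections import Counter

def summarize_year(repos, year=2024):
    """Aggregate GitHub-style repo dicts into a yearly summary.

    Each dict mimics fields from the GitHub REST API:
    `created_at` (ISO timestamp), `stargazers_count`, `language`.
    """
    created = [r for r in repos if r["created_at"].startswith(str(year))]
    return {
        "repos_created": len(created),
        "total_stars": sum(r["stargazers_count"] for r in repos),
        "top_language": Counter(
            r["language"] for r in repos if r["language"]
        ).most_common(1)[0][0],
    }

# Toy data shaped like the API response
repos = [
    {"created_at": "2024-03-01T12:00:00Z", "stargazers_count": 42, "language": "Python"},
    {"created_at": "2024-07-15T09:30:00Z", "stargazers_count": 7, "language": "Python"},
    {"created_at": "2023-11-02T08:00:00Z", "stargazers_count": 100, "language": "Rust"},
]
print(summarize_year(repos))  # {'repos_created': 2, 'total_stars': 149, 'top_language': 'Python'}
```

A real version would fetch the repo list from `https://api.github.com/users/<user>/repos` and feed the same summary into plots.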
reacted to julien-c's post with 🔥 1 day ago
wow 😮

INTELLECT-1 is the first collaboratively trained 10 billion parameter language model trained from scratch on 1 trillion tokens of English text and code.

PrimeIntellect/INTELLECT-1-Instruct
reacted to merve's post with 🔥 1 day ago
Last week we were blessed with open-source models! A recap 💝
merve/nov-29-releases-674ccc255a57baf97b1e2d31

๐Ÿ–ผ๏ธ Multimodal
> At Hugging Face we released SmolVLM, a performant and efficient smol vision language model ๐Ÿ’—
> Show Lab released ShowUI-2B: new vision-language-action model to build GUI/web automation agents ๐Ÿค–
> Rhymes AI has released the base model of Aria: Aria-Base-64K and Aria-Base-8K with their respective context length
> ViDoRe team released ColSmolVLM: A new ColPali-like retrieval model based on SmolVLM
> Dataset: Llava-CoT-o1-Instruct: new dataset labelled using Llava-CoT multimodal reasoning model๐Ÿ“–
> Dataset: LLaVA-CoT-100k dataset used to train Llava-CoT released by creators of Llava-CoT ๐Ÿ“•

💬 LLMs
> The Qwen team released QwQ-32B-Preview, a state-of-the-art open-source reasoning model that broke the internet 🔥
> Alibaba has released Marco-o1, a new open-source reasoning model 💥
> NVIDIA released Hymba 1.5B Base and Instruct, new state-of-the-art SLMs with a hybrid architecture (Mamba + transformer)

โฏ๏ธ Image/Video Generation
> Qwen2VL-Flux: new image generation model based on Qwen2VL image encoder, T5 and Flux for generation
> Lightricks released LTX-Video, a new DiT-based video generation model that can generate 24 FPS videos at 768x512 res โฏ๏ธ
> Dataset: Image Preferences is a new image generation preference dataset made with DIBT community effort of Argilla ๐Ÿท๏ธ

Audio
> OuteAI released OuteTTS-0.2-500M, a new multilingual text-to-speech model based on Qwen-2.5-0.5B, trained on 5B audio prompt tokens
reacted to singhsidhukuldeep's post with 🔥👀 1 day ago
Excited to share @LinkedIn's innovative approach to evaluating semantic search quality! As part of the Search AI team, we've developed a groundbreaking evaluation pipeline that revolutionizes how we measure search relevance.

>> Key Innovation: On-Topic Rate (OTR)
This novel metric measures the semantic match between queries and search results, going beyond simple keyword matching. The system evaluates whether content is truly relevant to the query's intent, not just matching surface-level terms.

>> Technical Implementation Details
Query Set Construction
• Golden Set: Contains curated top queries and complex topical queries
• Open Set: Includes trending queries and random production queries for diversity

Evaluation Pipeline Architecture
1. Query Processing:
- Retrieves top 10 documents per query
- Extracts post text and article information
- Processes both primary content and reshared materials

2. GAI Integration:
- Leverages GPT-3.5 with specialized prompts
- Produces three key outputs:
- Binary relevance decision
- Relevance score (0-1 range)
- Decision reasoning

Quality Assurance
• Validation achieved 94.5% accuracy on a test set of 600 query-post pairs
• Human evaluation showed 81.72% consistency with expert annotators

>> Business Impact
This system now serves as LinkedIn's benchmark for content search experiments, enabling:
• Weekly performance monitoring
• Rapid offline testing of new ML models
• Systematic identification of improvement opportunities
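The post doesn't give the exact formula, but an On-Topic Rate in this style reduces to the fraction of retrieved documents the judge marks relevant. A hypothetical sketch, where each judgment dict carries the three GPT outputs described above (`relevant` binary decision, `score` in 0-1, `reason` text):

```python
def on_topic_rate(judgments):
    """Fraction of (query, document) judgments marked on-topic.

    Each judgment mimics the three judge outputs described above:
    `relevant` (binary decision), `score` (0-1), `reason` (free text).
    """
    if not judgments:
        return 0.0
    return sum(j["relevant"] for j in judgments) / len(judgments)

# Toy judgments for one query's top results (invented for illustration)
judgments = [
    {"relevant": True,  "score": 0.91, "reason": "directly answers the query"},
    {"relevant": True,  "score": 0.74, "reason": "same topic, partial answer"},
    {"relevant": False, "score": 0.12, "reason": "keyword match only"},
    {"relevant": False, "score": 0.08, "reason": "off-topic reshare"},
]
print(on_topic_rate(judgments))  # 0.5
```

In a pipeline like the one described, this would be averaged over the Golden and Open query sets to produce the weekly benchmark number.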

What are your thoughts on semantic search evaluation?
reacted to cfahlgren1's post with 🔥🚀 1 day ago
We just dropped an LLM inside the SQL Console 🤯

The amazing, new Qwen/Qwen2.5-Coder-32B-Instruct model can now write SQL for any Hugging Face dataset ✨

It's 2025, you shouldn't be hand-writing SQL! This is a big step toward letting anyone do in-depth analysis on a dataset. Let us know what you think 🤗
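Under the hood this is natural language in, SQL over the dataset's columns out. The SQL Console runs DuckDB; as an offline stand-in, here is the kind of query the model might emit, run with stdlib `sqlite3` against a toy table (table and column names are invented for illustration):

```python
import sqlite3

# Toy stand-in for a Hub dataset; the real SQL Console queries DuckDB,
# and this schema is invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (author TEXT, likes INTEGER)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?)",
    [("merve", 2242), ("clem", 2691), ("julien-c", 1578), ("merve", 1802)],
)

# The kind of SQL an LLM assistant might write for
# "show me the top 2 authors by total likes"
query = """
SELECT author, SUM(likes) AS total_likes
FROM posts
GROUP BY author
ORDER BY total_likes DESC
LIMIT 2
"""
print(conn.execute(query).fetchall())  # [('merve', 4044), ('clem', 2691)]
```

The point of putting the LLM in the console is that the user types the quoted question and the assistant produces the `query` string.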
reacted to clem's post with 🚀🔥 1 day ago
Six predictions for AI in 2025 (and a review of how my 2024 predictions turned out):

- There will be the first major public protest related to AI
- A big company will see its market cap divided by two or more because of AI
- At least 100,000 personal AI robots will be pre-ordered
- China will start to lead the AI race (as a consequence of leading the open-source AI race).
- There will be big breakthroughs in AI for biology and chemistry.
- We will begin to see the economic and employment growth potential of AI, with 15M AI builders on Hugging Face.

How my predictions for 2024 turned out:

- A hyped AI company will go bankrupt or get acquired for a ridiculously low price
✅ (Inflection, AdeptAI, ...)

- Open-source LLMs will reach the level of the best closed-source LLMs
✅ with QwQ and dozens of others

- Big breakthroughs in AI for video, time-series, biology and chemistry
✅ for video 🔴 for time-series, biology and chemistry

- We will talk much more about the cost (monetary and environmental) of AI
✅ Monetary 🔴 Environmental (😢)

- A popular media will be mostly AI-generated
✅ with NotebookLM by Google

- 10 million AI builders on Hugging Face, leading to no increase in unemployment
🔜 currently 7M AI builders on Hugging Face
reacted to merve's post with 🔥🚀😎 1 day ago
small but mighty 🔥
you can fine-tune SmolVLM on an L4 with a batch size of 4, and it will only take 16.4 GB of VRAM 🫰🏻 also, with gradient accumulation, the simulated batch size is 16 ✨
I made a notebook that includes all the goodies: QLoRA, gradient accumulation, and gradient checkpointing, with explanations of how they work 💝 https://github.com/huggingface/smollm/blob/main/finetuning/Smol_VLM_FT.ipynb
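Gradient accumulation is why a micro-batch of 4 can stand in for a batch of 16: gradients from 4 forward/backward passes are averaged before each optimizer step. A framework-free toy sketch with a 1-parameter model (loss = (w·x − y)², not the notebook's actual training loop) showing the accumulated gradient matches the full-batch one:

```python
def grad(w, batch):
    """Mean gradient of (w*x - y)^2 with respect to w over a batch of (x, y) pairs."""
    return sum(2 * x * (w * x - y) for x, y in batch) / len(batch)

w = 0.5
data = [(float(i), 2.0 * i) for i in range(16)]  # y = 2x, a "full batch" of 16

# One pass over the full batch of 16
full = grad(w, data)

# Four micro-batches of 4, gradients averaged (accumulated) before the step
micro = sum(grad(w, data[i:i + 4]) for i in range(0, 16, 4)) / 4

print(abs(full - micro) < 1e-9)  # True: same update, a quarter of the memory
```

The memory win is that only 4 samples' activations ever live on the GPU at once, which is how the 16.4 GB figure stays within an L4's 24 GB.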
reacted to brunatrevelin's post with 🤗🔥❤️ 1 day ago
upvoted an article 1 day ago
Use Models from the Hugging Face Hub in LM Studio

By yagilb
reacted to victor's post with 🔥 7 days ago
Perfect example of why Qwen/Qwen2.5-Coder-32B-Instruct is insane?

Introducing: AI Video Composer 🔥
huggingface-projects/ai-video-composer

Drag and drop your assets (images/videos/audios) to create any video you want using natural language!

It works by asking the model to output a valid FFmpeg command, which can be quite complex, but most of the time Qwen2.5-Coder-32B gets it right (that thing is a beast). It's an update of an old project made with GPT-4; it was almost impossible to make it work with open models back then (~1.5 years ago), but not anymore. Let's go open weights 🚀.
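The pattern described here is: prompt the model with the uploaded asset filenames, get back one FFmpeg command, then sanity-check it before executing. A hypothetical sketch of the checking step (the model call itself is omitted; `cmd` stands in for raw model output, and these checks are mine, not the app's):

```python
import shlex

def check_ffmpeg_command(cmd, assets):
    """Basic sanity checks on a model-generated FFmpeg command:
    it must invoke ffmpeg, and every -i input must be a provided asset."""
    tokens = shlex.split(cmd)
    if not tokens or tokens[0] != "ffmpeg":
        return False
    # Collect the argument following each -i flag
    inputs = [tokens[i + 1] for i, t in enumerate(tokens[:-1]) if t == "-i"]
    return all(f in assets for f in inputs)

# `cmd` stands in for the model's output; in a real app you would only
# run it (e.g. via subprocess) after checks like these pass.
assets = {"clip.mp4", "music.mp3"}
cmd = "ffmpeg -i clip.mp4 -i music.mp3 -c:v copy -shortest out.mp4"
print(check_ffmpeg_command(cmd, assets))       # True
print(check_ffmpeg_command("rm -rf /", assets))  # False
```

Gating model output on simple structural checks like this is what makes "LLM writes the shell command" tolerable in a hosted Space.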
reacted to andito's post with 🔥 7 days ago
Let's go! We are releasing SmolVLM, a smol 2B VLM built for on-device inference that outperforms all models at similar GPU RAM usage and token throughput.

- SmolVLM generates tokens 7.5 to 16 times faster than Qwen2-VL! 🤯
- Other models at this size crash a laptop, but SmolVLM comfortably generates 17 tokens/sec on a MacBook! 🚀
- SmolVLM can be fine-tuned on a Google Colab! Or process millions of documents with a consumer GPU!
- SmolVLM even outperforms larger models on video benchmarks, despite not being trained on videos!

Check out more!
Demo: HuggingFaceTB/SmolVLM
Blog: https://huggingface.co/blog/smolvlm
Model: HuggingFaceTB/SmolVLM-Instruct
Fine-tuning script: https://github.com/huggingface/smollm/blob/main/finetuning/Smol_VLM_FT.ipynb
reacted to nataliaElv's post with 👀 7 days ago
Would you like to get a high-quality dataset to pre-train LLMs in your language? 🌍

At Hugging Face we're preparing a collaborative annotation effort to build an open-source multilingual dataset as part of the Data is Better Together initiative.

Follow the link below, check if your language is listed and sign up to be a Language Lead!

https://forms.gle/s9nGajBh6Pb9G72J6