kim

skcandx

AI & ML interests

None yet

Recent Activity

replied to tomaarsen's post 6 days ago

Organizations

None yet

skcandx's activity

replied to tomaarsen's post 6 days ago
Reacted to alvdansen's post with 😔 6 days ago
New LoRA Model!

I trained this model on a new spot I'm really excited to share (soon!)

This Monday I will be posting my first beginning to end blog showing the tool I've used, dataset, captioning techniques, and parameters to finetune this LoRA.

For now, check out the model in the link below.

alvdansen/m3lt
Reacted to fdaudens's post with 😔 6 days ago
Exciting news for audio AI enthusiasts! 🎙️🌍

The Emilia dataset dropped last week, and it's a cool one:
- 101k+ hours of high-quality audio
- 6 languages: 🇨🇳 🇺🇸 🇯🇵 🇰🇷 🇩🇪 🇫🇷
- Diverse content: talk shows, interviews, debates, sports commentary, audiobooks

This dataset could improve multilingual speech generation and recognition. Opens up many possibilities for global media, language learning, and accessibility!

Explore it: amphion/Emilia

#AIAudio
Reacted to tomaarsen's post with 😔 6 days ago
🚀 Sentence Transformers v3.1 is out! Featuring a hard negatives mining utility to get better models out of your data, a new strong loss function, training with streaming datasets, custom modules, bug fixes, small additions and docs changes. Here are the details:

⛏ Hard Negatives Mining Utility: Hard negatives are texts that are rather similar to some anchor text (e.g. a question), but are not the correct match. They're difficult for a model to distinguish from the correct answer, often resulting in a stronger model after training.
📉 New loss function: This loss function works very well for symmetric tasks (e.g. clustering, classification, finding similar texts/paraphrases) and a bit less so for asymmetric tasks (e.g. question-answer retrieval).
💾 Streaming datasets: You can now train with datasets.IterableDataset, which doesn't require downloading the full dataset to disk before training. It's as simple as passing streaming=True to datasets.load_dataset.
🧩 Custom Modules: Model authors can now customize a lot more of the components that make up Sentence Transformer models, allowing for a lot more flexibility (e.g. multi-modal, model-specific quirks, etc.)
✨ New arguments to several methods: encode_multi_process gets a progress bar, push_to_hub can now be done to different branches, and CrossEncoders can be downloaded to specific cache directories.
πŸ› Bug fixes: Too many to name here, check out the release notes!
πŸ“ Documentation: A particular focus on clarifying the batch samplers in the Package Reference this release.

Check out the full release notes here ⭐: https://github.com/UKPLab/sentence-transformers/releases/tag/v3.1.0

I'm very excited to hear your feedback, and I'm looking forward to the future changes that I have planned, such as ONNX inference! I'm also open to suggestions for new features: feel free to send me your ideas.