Nick Brooks

nickandbro

AI & ML interests

None yet

nickandbro's activity

New activity in nvidia/Llama-3.1-Nemotron-70B-Instruct-HF about 1 month ago

Congrats to the Nvidia team!

#10 opened about 1 month ago by nickandbro
New activity in nvidia/Llama-3_1-Nemotron-51B-Instruct about 1 month ago

vLLM compatible?

#10 opened about 2 months ago by nickandbro
Reacted to davidberenstein1957's post with 👍 about 1 month ago
Don't use an LLM when you can use a much cheaper model.

The problem is that no one tells you how to actually do it.

Just picking a pre-trained model (e.g., BERT) and throwing it at your problem won't work!

If you want a small model to perform well on your problem, you need to fine-tune it.

And to fine-tune it, you need data.

The good news is that you don't need a lot of data, just high-quality data for your specific problem.
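
As a concrete starting point, here is a minimal sketch of that fine-tuning loop with transformers (the model and dataset names below are placeholders, not from the post; swap in your own problem-specific data):

```python
# A minimal sketch: fine-tune a small pre-trained encoder on a small,
# high-quality labeled dataset (binary classification assumed).
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "distilbert-base-uncased"  # any small pre-trained encoder
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Placeholder data: swap in the high-quality examples for YOUR problem.
train = load_dataset("imdb", split="train[:1000]")
train = train.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=256),
    batched=True,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-model", num_train_epochs=3),
    train_dataset=train,
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
)
trainer.train()
```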

In the latest livestream, I showed you guys how to get started with Argilla on the Hub! Hope to see you at the next one.

https://www.youtube.com/watch?v=BEe7shiG3rY
Reacted to zamal's post with 🤗 about 2 months ago
🚀 New Model Release: zamal/Molmo-7B-GPTQ-4bit 🚀

Hello lovely community,

The zamal/Molmo-7B-GPTQ-4bit model is now available for all! It has been heavily quantized, reducing its size by almost six times. It now occupies significantly less space and VRAM, making it perfect for deployment on resource-constrained devices without compromising performance.

Now we get:
- Efficient Performance: Maintains high accuracy while being heavily quantized.
- Reduced Size: Nearly six times smaller, optimizing storage and memory usage.
- Versatile Application: Ideal for integrating a powerful visual language model into various projects, particularly multimodal RAG chains.
Check it out!
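
A minimal loading sketch, assuming the checkpoint works with the standard transformers auto classes plus a GPTQ backend (e.g. optimum + auto-gptq); check the model card for the exact supported path:

```python
# A minimal sketch: load the 4-bit GPTQ checkpoint with transformers.
# Assumes a GPTQ backend is installed and the repo exposes the usual
# Molmo processor/model classes via custom code.
import torch
from transformers import AutoModelForCausalLM, AutoProcessor

repo = "zamal/Molmo-7B-GPTQ-4bit"
processor = AutoProcessor.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    trust_remote_code=True,  # Molmo ships custom modeling code
    device_map="auto",
    torch_dtype=torch.float16,
)
```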

New activity in nvidia/NV-Embed-v2 2 months ago

Does this work with vLLM?

#9 opened 2 months ago by nickandbro
Reacted to MonsterMMORPG's post with ❤️ 2 months ago
Trained Myself With 256 Images on FLUX - Results Mind Blowing

Detailed Full Workflow

Medium article : https://medium.com/@furkangozukara/ultimate-flux-lora-training-tutorial-windows-and-cloud-deployment-abb72f21cbf8

Windows main tutorial : https://youtu.be/nySGu12Y05k

Cloud tutorial for GPU poor or scaling : https://youtu.be/-uhL2nW7Ddw

Full detailed results and conclusions : https://www.patreon.com/posts/111891669

Full config files and details to train : https://www.patreon.com/posts/110879657

SUPIR Upscaling (default settings are now perfect) : https://youtu.be/OYxVEvDf284

I used my Poco X6 camera phone and solo-taken images

My dataset is far from ready, so I used many repeated and nearly identical images, but this was rather experimental

Hopefully I will keep taking more shots, improve the dataset, and reduce its size in the future

I trained the CLIP-L and T5-XXL text encoders as well

Since there was a lot of pushback from the community claiming my workflow wouldn't work with expressions, I took a break from research and used whatever I had

I used my own researched workflow for training with Kohya GUI, plus my self-developed SUPIR app for batch upscaling with face upscaling and automatic LLaVA caption improvement

Download the images to see them at full size; the last provided grid is 50% downscaled

Workflow

Gather a dataset with the expressions and perspectives you want; this is crucial, since whatever you add, the model can generate it well after training

Follow one of the LoRA training tutorials / guides

After training your LoRA, use your favorite UI to generate images

I prefer SwarmUI; here are the prompts I used (you can add specific expressions to prompts), including face inpainting:

https://gist.github.com/FurkanGozukara/ce72861e52806c5ea4e8b9c7f4409672
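
If you prefer a scripted path over a UI, here is a minimal diffusers sketch for generating with a trained FLUX LoRA (the LoRA repo name and trigger token are placeholders, not from this workflow):

```python
# A minimal scripted alternative to SwarmUI using diffusers.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # fits on smaller GPUs, at some speed cost
pipe.load_lora_weights("your-username/your-flux-lora")  # placeholder repo

image = pipe(
    "photo of ohwx man smiling, looking at camera",  # use your trigger token
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("out.png")
```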

After generating images, use SUPIR to upscale 2x with maximum resemblance

Short Conclusions

Using 256 images certainly caused more overfitting than necessary

...
New activity in jbilcke-hf/ai-comic-factory 4 months ago

Where can I find the code?

#832 opened 5 months ago by nickandbro
Reacted to tomaarsen's post with 🔥 6 months ago
NuMind has just released 3 new state-of-the-art GLiNER models for Named Entity Recognition/Information Extraction. These GLiNER models allow you to specify any label that you want, and it'll find spans in the text corresponding to your label. It's been shown to work quite well on unusual domains, e.g. celestial entities in my picture.

There are 3 models released:
- numind/NuNER_Zero:
The primary model, SOTA & can detect really long entities.
- numind/NuNER_Zero-span:
Slightly better performance than NuNER Zero, but can't detect entities longer than 12 tokens.
- numind/NuNER_Zero-4k:
Slightly worse than NuNER Zero, but has a context length of 4k tokens.

Some more details about these models in general:
- They are *really* small, orders of magnitude smaller than LLMs, which don't reach this level of performance.
- Because they're small - they're fast: <1s per sentence on free GPUs.
- They have an MIT license: free commercial usage.
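
To make the label-specification workflow concrete, here is a minimal sketch using the gliner package (the text and labels below are illustrative, not from the post):

```python
# A minimal sketch: zero-shot NER with free-form labels via gliner
# (pip install gliner). Any labels you want; spans are found in the text.
from gliner import GLiNER

model = GLiNER.from_pretrained("numind/NuNER_Zero")

text = "The James Webb Space Telescope imaged the exoplanet HIP 65426 b."
labels = ["space telescope", "celestial entity"]

for entity in model.predict_entities(text, labels):
    print(entity["text"], "=>", entity["label"])
```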

Try out the demo here: https://huggingface.co/spaces/numind/NuZero
Or check out all of the models here: numind/nunerzero-zero-shot-ner-662b59803b9b438ff56e49e2

If there's ever a need for me to extract some information from any text: I'll be using these. Great work @Serega6678!
liked a Space 8 months ago
Reacted to mvaloatto's post with ❤️ 9 months ago
Want more "good machine learning" in your X feed? Here is a new Space for you:
🔔 Top HF Users To Follow On X - https://huggingface.co/spaces/mvaloatto/HF2X

Ever since I fell down the AI rabbit hole, it hasn't been super easy to spot and follow the most impactful Hugging Face contributors on X. So, inspired by @Weyaxi's leaderboards, I decided to create a list just for this purpose.

Why, you ask?

First, it's quite surprising how so many talented AI pioneers and independent contributors on X don't get the visibility/reach you might expect. Sad but true: follower count doesn't always match up with the value or innovation an individual brings to the table (just stating the obvious here).

Open source AI, in particular, thrives not just on innovation but also on the collective spirit of its believers and builders. With Hugging Face standing out as a prime hub for top AI engineers and contributors, compiling a directory of X profiles from influential figures on this platform felt like a natural step.

This Space aims to not only connect these top contributors but also guide open AI enthusiasts and newcomers towards the field's leading lights.

I put this modest page together using some web scraping and what I remember from my web dev class ages ago! Suggestions/likes are welcome - I'm hoping to keep tweaking/upgrading it, especially if you all find it useful.

Now, let's follow each other! It's time to accelerate the dissemination of our ideas, encourage collaboration within our community, and ensure that open AI developments receive the attention and recognition they deserve. 🔥
Reacted to fblgit's post with 👍 9 months ago
Senku-70B is still undefeated on EQ-Bench; the latest updates from the author show a further increase in performance, reaching a new score of 85.09

This new mark outperforms some GPT-4 models, further closing the very thin gap between open-community LLMs and closed-source models.

ShinojiResearch/Senku-70B-Full
Reacted to gsarti's post with 👍 9 months ago
πŸ” Today's pick in Interpretability & Analysis of LMs: In-Context Learning Demonstration Selection via Influence Analysis
by Vinay M.S. @minhhaovan X. Wu

Recent work showed how the performance of LMs using in-context learning (ICL) is heavily dependent on selected demonstrations.

This work introduces InfICL, a demonstration selection method using influence functions to identify salient training examples to use as demonstrations at inference time. InfICL is tested alongside other example selection baselines for prompting medium-sized LLMs on CoLA and RTE, showing improvements over other methods, especially when a smaller number of in-context examples is used.

📄 Paper: In-Context Learning Demonstration Selection via Influence Analysis (2402.11750)

πŸ” All daily picks in LM interpretability: gsarti/daily-picks-in-interpretability-and-analysis-of-lms-65ae3339949c5675d25de2f9
Reacted to akhaliq's post with ❤️ 9 months ago
In Search of Needles in a 10M Haystack

Recurrent Memory Finds What LLMs Miss

The paper addresses the challenge of processing long documents using generative transformer models. To evaluate different approaches, we introduce BABILong, a new benchmark designed to assess model capabilities in extracting and processing distributed facts within extensive texts. Our evaluation, which includes benchmarks for GPT-4 and RAG, reveals that common methods are effective only for sequences up to 10^4 elements. In contrast, fine-tuning GPT-2 with recurrent memory augmentations enables it to handle tasks involving up to 10^7 elements. This achievement marks a substantial leap, as it is by far the longest input processed by any open neural network model to date, demonstrating a significant improvement in the processing capabilities for long sequences.
Reacted to Xenova's post with ❤️ 10 months ago
Introducing Remove Background Web: In-browser background removal, powered by @briaai 's new RMBG-v1.4 model and 🤗 Transformers.js!

Everything runs 100% locally, meaning none of your images are uploaded to a server! 🤯 At only ~45MB, the 8-bit quantized version of the model is perfect for in-browser usage (it even works on mobile).

Check it out! 👇
Demo: Xenova/remove-background-web
Model: briaai/RMBG-1.4
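
For a scripted, server-side equivalent of what the demo does in-browser, here is a minimal Python sketch, assuming the model card's custom image-segmentation pipeline (this is the transformers path, not the Transformers.js path the demo uses):

```python
# A minimal sketch: background removal with RMBG-1.4 via transformers.
# The model ships custom pipeline code, hence trust_remote_code=True.
from transformers import pipeline

pipe = pipeline("image-segmentation", model="briaai/RMBG-1.4", trust_remote_code=True)
result = pipe("input.jpg")  # PIL image with the background removed
result.save("no_background.png")
```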
Reacted to merve's post with ❤️ 10 months ago
Google released a paper on chess that doesn't rely on MCTS (the AlphaZero approach) ♟️
Their secret sauce is... synthetic data pseudo-labeled by the Stockfish engine 😀
2024 really is the year of synthetic data across all domains!
There's a nice discussion here, join us: Grandmaster-Level Chess Without Search (2402.04494)
Reacted to santiviquez's post with 👍 10 months ago
What if the retrieval goes wrong? 🍕

Retrieval Augmented Generation (RAG) is a strategy to alleviate LLM hallucinations and improve the quality of generated responses.

A standard RAG architecture has two main blocks: a Retriever and a Generator.

1️⃣ When the system receives an input sequence, it uses the Retriever to retrieve the top-K most relevant documents associated with the input sequence. These documents typically come from an external source (e.g., Wikipedia) and are then concatenated to the original input's context.

2️⃣ It then uses the Generator to generate a response given the gathered information in the first step.

But what happens if the retrieval goes wrong and the retrieved documents are of very low quality?

Well, in such cases, the generated response will probably be of low quality, too. 🫠

But here is where CRAG (Corrective RAG) *might* help. I say it might help because the paper is very new (only one week old), and I don't know if anyone has actually tried this in practice 😅

However, the idea is to add a Knowledge Correction block between the Retrieval and Generation steps to evaluate the retrieved documents and correct them if necessary.

This step goes as follows (a short code sketch follows the three cases):

🟢 If the documents are correct, they will be refined into more precise knowledge strips and concatenated to the original context to generate a response.

🔴 If the documents are incorrect, they will be discarded, and instead, the system searches the web for complementary knowledge. This external knowledge is then concatenated to the original context to generate a response.

🟡 If the documents are ambiguous, a combination of the previous two resolutions is triggered.
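
Putting the three cases together, here is a minimal sketch of the Knowledge Correction block; the evaluator, refine, and web_search helpers are hypothetical placeholders, not from the paper's code:

```python
# A minimal sketch of CRAG's knowledge-correction routing.
def correct_knowledge(query, docs, evaluator, refine, web_search):
    verdict = evaluator(query, docs)  # "correct" | "incorrect" | "ambiguous"
    if verdict == "correct":
        return refine(docs)                  # keep precise knowledge strips
    if verdict == "incorrect":
        return web_search(query)             # discard docs, search the web
    return refine(docs) + web_search(query)  # ambiguous: combine both
```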

The experimental results from the paper show how the CRAG strategy outperforms traditional RAG approaches in both short and long-form text generation tasks.

Paper: Corrective Retrieval Augmented Generation (2401.15884)