dame rajee

damerajee

Posts 2

Post
On the 2nd of October a really cool paper was released, called "Were RNNs All We Needed?": https://arxiv.org/abs/2410.01201

The paper introduces minGRU, a simplified version of the traditional Gated Recurrent Unit (GRU) that removes the hidden-state dependency from its gates. Because each gate is computed from the current input alone, the recurrence can be trained in parallel with a scan instead of step by step, which makes training significantly faster than with a conventional GRU. minGRU also drops the tanh on the candidate hidden state, which streamlines the computation.
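To make the "gates without hidden-state dependencies" idea concrete, here is a minimal sketch in plain Python. It is not the paper's implementation and not the code behind my model: it uses a single scalar hidden unit, and the weights `w_z` and `w_h` are hypothetical stand-ins for the linear layers. It assumes the update h_t = (1 - z_t) * h_{t-1} + z_t * h~_t with z_t = sigmoid(w_z * x_t) and h~_t = w_h * x_t, and shows that the step-by-step loop and a scan over (a_t, b_t) pairs give the same hidden states.

```python
import math
from itertools import accumulate


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def min_gru_sequential(xs, w_z, w_h, h0=0.0):
    """Step-by-step recurrence: h_t = (1 - z_t) * h_{t-1} + z_t * h_tilde_t."""
    h, out = h0, []
    for x in xs:
        z = sigmoid(w_z * x)   # gate depends on x_t only, not on h_{t-1}
        h_tilde = w_h * x      # candidate state, no tanh applied
        h = (1.0 - z) * h + z * h_tilde
        out.append(h)
    return out


def min_gru_scan(xs, w_z, w_h, h0=0.0):
    """Same result, written as a scan over affine maps h -> a_t * h + b_t."""
    # Everything below is computed from the inputs alone, so it could be
    # evaluated for all time steps at once.
    zs = [sigmoid(w_z * x) for x in xs]
    a = [1.0 - z for z in zs]
    b = [z * w_h * x for z, x in zip(zs, xs)]

    def combine(left, right):
        # Compose two affine maps: apply `left` first, then `right`.
        a1, b1 = left
        a2, b2 = right
        return (a1 * a2, a2 * b1 + b2)

    # `combine` is associative, which is what lets a parallel scan replace
    # the loop; accumulate() here simply runs it left to right.
    prefix = accumulate(zip(a, b), combine)
    return [A * h0 + B for A, B in prefix]


xs = [0.5, -1.0, 2.0, 0.3]
seq = min_gru_sequential(xs, w_z=0.7, w_h=1.3, h0=0.4)
par = min_gru_scan(xs, w_z=0.7, w_h=1.3, h0=0.4)
print(all(abs(s - p) < 1e-9 for s, p in zip(seq, par)))
```

In a conventional GRU the gate would also read h_{t-1}, so `a` and `b` could not be computed up front and the scan form would not apply. A real implementation would use vectors, learned linear layers, and a numerically stable parallel scan on the GPU.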

So I read the paper and tried training the model myself, and it seems to be doing quite well. You can check out the pre-trained model on Hugging Face Spaces:

- damerajee/mingru-stories
Post
Just released ViLaH, a compact 3B-parameter vision-language model that generates responses in Hindi. Only Hindi for now 😔

BhashaAI/ViLaH