Shaij PRO

appoose

AI & ML interests

None yet

appoose's activity

liked a model 11 days ago

jinaai/jina-embeddings-v3

Feature Extraction • Updated 13 days ago • 1.03M • 508

liked a model about 1 month ago

mobiuslabsgmbh/faster-whisper-large-v3-turbo

Updated Oct 8 • 3.95k • 12

liked a model about 2 months ago

openai/whisper-large-v3-turbo

Automatic Speech Recognition • Updated Oct 4 • 1.83M • 1.4k

liked a model 2 months ago

Qwen/Qwen2.5-72B-Instruct

Text Generation • Updated Sep 25 • 470k • 495

upvoted an article 3 months ago

Article

Unlocking Longer Generation with Key-Value Cache Quantization

May 16

• 32

liked a model 3 months ago

mobiuslabsgmbh/Hermes-3-Llama-3.1-70B_4bitgs64_hqq

Text Generation • Updated Aug 16 • 9 • 4

updated a model 3 months ago

mobiuslabsgmbh/Hermes-3-Llama-3.1-70B_4bitgs64_hqq

Text Generation • Updated Aug 16 • 9 • 4

posted an update 3 months ago

Post

2056

Releasing HQQ Llama-3.1-70b 4-bit quantized version! Check it out at mobiuslabsgmbh/Llama-3.1-70b-instruct_4bitgs64_hqq.

Achieves 99% of the base model performance across various benchmarks! Details in the model card.

liked a model 3 months ago

mobiuslabsgmbh/Llama-3.1-70b-instruct_4bitgs64_hqq

Text Generation • Updated Aug 16 • 223 • 31

liked a model 4 months ago

facebook/sam2-hiera-small

Mask Generation • Updated Aug 7 • 19.5k • 13

posted an update 4 months ago

Post

1788

Excited to announce the release of our high-quality Llama-3.1 8B 4-bit HQQ/calibrated quantized model! It achieves 99.3% performance relative to FP16 and also delivers the fastest inference speed for transformers.

mobiuslabsgmbh/Llama-3.1-8b-instruct_4bitgs64_hqq_calib
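
For readers unfamiliar with the naming, the `4bitgs64` suffix appears to denote 4-bit weights quantized in groups of 64. The sketch below illustrates that general idea with plain min-max, round-to-nearest quantization in pure Python. It is an assumption-laden illustration, not HQQ's actual algorithm (HQQ optimizes the quantization parameters, and the `_calib` model additionally uses calibration); the function names `quantize_groupwise` and `dequantize_groupwise` are hypothetical.

```python
import random


def quantize_groupwise(weights, nbits=4, group_size=64):
    """Min-max affine quantization with one (scale, zero) pair per group.

    Illustrative only: plain round-to-nearest, not the HQQ optimizer.
    """
    qmax = 2 ** nbits - 1  # 15 for 4-bit codes
    codes, scales, zeros = [], [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        lo, hi = min(group), max(group)
        # Fall back to 1.0 so a constant group does not divide by zero.
        scale = (hi - lo) / qmax or 1.0
        for w in group:
            code = round((w - lo) / scale)
            codes.append(min(qmax, max(0, code)))  # clamp to the 4-bit range
        scales.append(scale)
        zeros.append(lo)
    return codes, scales, zeros


def dequantize_groupwise(codes, scales, zeros, group_size=64):
    """Map integer codes back to approximate floating-point weights."""
    return [
        codes[i] * scales[i // group_size] + zeros[i // group_size]
        for i in range(len(codes))
    ]


# Toy data standing in for one row of a weight matrix (made-up values).
random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(256)]

codes, scales, zeros = quantize_groupwise(weights)
restored = dequantize_groupwise(codes, scales, zeros)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each group of 64 weights stores its own scale and offset, so the rounding error of any weight is bounded by half of its group's scale. Smaller groups track the local weight range more closely at the cost of storing more scale and offset values.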

liked 2 models 4 months ago

mobiuslabsgmbh/Llama-3.1-8b-instruct_4bitgs64_hqq_calib

Text Generation • Updated Aug 27 • 101 • 55

mobiuslabsgmbh/Llama-3-8b-instruct_2bitgs64_hqq

Text Generation • Updated Aug 16 • 13 • 10

updated a model 6 months ago

appoose/aana-soccer-no-pretraining-weight-balanced-2_2.0_720-combined-finetuned

upvoted 2 articles 7 months ago

Article

Fine-tune Llama 3 with ORPO

Apr 22

• 227

Article

Introducing Idefics2: A Powerful 8B Vision-Language Model for the community

Apr 15

• 166

upvoted a collection 8 months ago

Llama3 HQQ

4 items • Updated Aug 13 • 19

updated a Space 8 months ago

README

Multimodal AI for the world's scale

Reacted to osanseviero's post with 🔥 8 months ago

Post

2066

Diaries of Open Source. Part 11 🚀

🚀Databricks releases DBRX, potentially the best open access model! A 132B Mixture of Experts with 36B active params, trained on 12 trillion tokens
Blog: https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm
Base and instruct models: databricks/dbrx-6601c0852a0cdd3c59f71962
Demo: databricks/dbrx-instruct

🤏1-bit and 2-bit quantization exploration using HQQ+
Blog post: https://mobiusml.github.io/1bit_blog/
Models: https://hf.co/collections/mobiuslabsgmbh/llama2-7b-hqq-6604257a96fc8b9c4e13e0fe
GitHub: https://github.com/mobiusml/hqq

📚Cosmopedia: a large-scale synthetic dataset for pre-training - it includes 25 billion tokens and 30 million files
Dataset: HuggingFaceTB/cosmopedia
Blog: https://hf.co/blog/cosmopedia

⭐Mini-Gemini: multi-modal VLMs, from 2B to 34B
Models: https://hf.co/collections/YanweiLi/mini-gemini-6603c50b9b43d044171d0854
Paper: Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models (2403.18814)
GitHub: https://github.com/dvlab-research/MiniGemini

🔥VILA - On Pre-training for VLMs
Models: Efficient-Large-Model/vila-on-pre-training-for-visual-language-models-65d8022a3a52cd9bcd62698e
Paper: VILA: On Pre-training for Visual Language Models (2312.07533)

Misc
👀 FeatUp: a framework for image features at any resolution: mhamilton723/FeatUp. Paper: FeatUp: A Model-Agnostic Framework for Features at Any Resolution (2403.10516)
🍞ColBERTus Maximus, a colbertialized embedding model mixedbread-ai/mxbai-colbert-large-v1
🖌️Semantic Palette, a new drawing paradigm ironjr/SemanticPalette
🧑‍⚕️HistoGPT, a vision model that generates accurate pathology reports marr-peng-lab/histogpt https://www.medrxiv.org/content/10.1101/2024.03.15.24304211v1

liked a model 8 months ago

mobiuslabsgmbh/Llama-2-7b-chat-hf_2bitgs8_hqq

Text Generation • Updated Mar 27 • 6 • 34