Answer.AI

company

https://www.answer.ai

AnswerDotAI

Recent Activity

bwarner new activity 2 days ago

answerdotai/ModernBERT-base:Inference fails on CPU: `ValueError: Pointer argument (at 0) cannot be accessed from Triton (cpu tensor?)`

bwarner new activity 2 days ago

answerdotai/ModernBERT-base:ValueError: The checkpoint you are trying to load has model type `modernbert`

bwarner new activity 2 days ago

answerdotai/ModernBERT-base:Set tokenizer "model_max_length" property to 8192

answerdotai's activity

bwarner

in answerdotai/ModernBERT-base 2 days ago

Inference fails on CPU: `ValueError: Pointer argument (at 0) cannot be accessed from Triton (cpu tensor?)`

#10 opened 27 days ago by

umarbutler

ValueError: The checkpoint you are trying to load has model type `modernbert`

#37 opened 17 days ago by

Sengil

Set tokenizer "model_max_length" property to 8192

#39 opened 16 days ago by

NohTow

bwarner

in answerdotai/ModernBERT-large 2 days ago

Set tokenizer "model_max_length" property to 8192

#9 opened 16 days ago by

NohTow

Mention that users should use transformers v4.48.0

#12 opened 5 days ago by

tomaarsen

bwarner

in answerdotai/ModernBERT-base 2 days ago

Mention that users should use transformers v4.48.0

#50 opened 5 days ago by

tomaarsen

posted an update 3 days ago

Post

4082

🏎️ Today I'm introducing a method to train static embedding models that run 100x to 400x faster on CPU than common embedding models, while retaining 85%+ of the quality! This includes 2 fully open models, with training scripts, datasets, and metrics.

We applied our recipe to train 2 Static Embedding models. Today we release:
2️⃣ an English Retrieval model and a general-purpose Multilingual similarity model (e.g. classification, clustering, etc.), both Apache 2.0
🧠 my modern training strategy: ideation -> dataset choice -> implementation -> evaluation
📜 my training scripts, using the Sentence Transformers library
📊 my Weights & Biases reports with losses & metrics
📕 my list of 30 training and 13 evaluation datasets

The 2 Static Embedding models have the following properties:
🏎️ Extremely fast, e.g. 107500 sentences per second on a consumer CPU, compared to 270 for 'all-mpnet-base-v2' and 56 for 'gte-large-en-v1.5'
0️⃣ Zero active parameters: No Transformer blocks, no attention, not even a matrix multiplication. Super speed!
📏 No maximum sequence length! Embed texts at any length (note: longer texts may embed worse)
📐 Linear instead of quadratic complexity: 2x longer text takes 2x longer, instead of 2.5x or more.
🪆 Matryoshka support: allows you to truncate embeddings with minimal performance loss (e.g. 4x smaller with a 0.56% perf. decrease for English Similarity tasks)
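
To make the properties above concrete, here is a minimal sketch of how a static embedding model can work: look up one fixed vector per token, average them, and optionally truncate the result (Matryoshka style) before normalizing. This is a toy illustration, not the released models' implementation: the vocabulary, the 4-dimensional vectors, the whitespace tokenizer, and the names `EMBEDDING_TABLE`, `embed`, and `cosine` are all assumptions made up for this example.

```python
import math

# Hypothetical toy table: a real static model stores one learned vector per
# tokenizer token. These 4-dimensional values are invented for illustration.
EMBEDDING_TABLE = {
    "fast": [0.9, 0.1, 0.0, 0.2],
    "cpu": [0.7, 0.3, 0.1, 0.0],
    "slow": [-0.8, 0.2, 0.1, 0.1],
    "model": [0.1, 0.9, 0.2, 0.0],
}
DIM = 4


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def embed(text, truncate_dim=None):
    """Embed text by averaging per-token vectors; no attention, no matmul.

    Cost grows linearly with the number of tokens, and there is no maximum
    sequence length because each token is handled independently.
    """
    # Assumption: naive whitespace tokenization; unknown tokens are skipped.
    tokens = [t for t in text.lower().split() if t in EMBEDDING_TABLE]
    if not tokens:
        return [0.0] * (truncate_dim or DIM)
    # Mean pooling over the looked-up token vectors.
    pooled = [
        sum(EMBEDDING_TABLE[t][i] for t in tokens) / len(tokens)
        for i in range(DIM)
    ]
    # Matryoshka-style truncation: keep only the leading dimensions.
    if truncate_dim is not None:
        pooled = pooled[:truncate_dim]
    return normalize(pooled)


def cosine(a, b):
    """Cosine similarity of two already-normalized vectors."""
    return sum(x * y for x, y in zip(a, b))
```

For example, `cosine(embed("fast cpu"), embed("fast model"))` should come out higher than `cosine(embed("fast cpu"), embed("slow model"))` with this toy table, and `embed("fast cpu", truncate_dim=2)` returns a 2-dimensional unit vector. In a trained Matryoshka model the leading dimensions are optimized to carry most of the signal, which this toy table does not attempt to reproduce.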

Check out the full blogpost if you'd like to 1) use these lightning-fast models or 2) learn how to train them with consumer-level hardware: https://huggingface.co/blog/static-embeddings

The blogpost contains a lengthy list of possible advancements; I'm very confident that our 2 models are only the tip of the iceberg, and we may be able to get even better performance.

Alternatively, check out the models:
* sentence-transformers/static-retrieval-mrl-en-v1
* sentence-transformers/static-similarity-mrl-multilingual-v1

1 reply

tomaarsen

posted an update 18 days ago

Post

2814

That didn't take long! Nomic AI has finetuned the new ModernBERT-base encoder model into a strong embedding model for search, classification, clustering and more!

Details:
🤖 Based on ModernBERT-base with 149M parameters.
📊 Outperforms both nomic-embed-text-v1 and nomic-embed-text-v1.5 on MTEB!
🏎️ Immediate FA2 and unpacking support for super efficient inference.
🪆 Trained with Matryoshka support, i.e. 2 valid output dimensionalities: 768 and 256.
➡️ Maximum sequence length of 8192 tokens!
2️⃣ Trained in 2 stages: unsupervised contrastive data -> high quality labeled datasets.
➕ Integrated in Sentence Transformers, Transformers, LangChain, LlamaIndex, Haystack, etc.
🏛️ Apache 2.0 licensed: fully commercially permissible

Try it out here: nomic-ai/modernbert-embed-base

Very nice work by Zach Nussbaum and colleagues at Nomic AI.

ncoop57

authored 2 papers 28 days ago

Stable Code Technical Report

Paper • 2404.01226 • Published Apr 1, 2024 • 1

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published about 1 month ago • 123

rbiswasfc

authored a paper 29 days ago

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published about 1 month ago • 123

fladhak

authored a paper 29 days ago

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published about 1 month ago • 123

griffin

authored a paper 29 days ago

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published about 1 month ago • 123

jph00

authored 2 papers 29 days ago

The Matrix Calculus You Need For Deep Learning

Paper • 1802.01528 • Published Feb 5, 2018

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published about 1 month ago • 123

bwarner

authored a paper 29 days ago

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published about 1 month ago • 123

tomaarsen

authored a paper 30 days ago

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published about 1 month ago • 123

bclavie

authored a paper 30 days ago

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published about 1 month ago • 123

freddyaboulton

posted an update about 1 month ago

Post

1405

Just created a Gradio space for playing with the new OAI realtime voice API!

freddyaboulton/openai-realtime-voice

freddyaboulton

posted an update about 1 month ago

Post

743

Gemini can talk 🗣️

Check out the new multimodal API from Google on @akhaliq's anychat or my space. It's very fast and smart 🍓

https://huggingface.co/spaces/freddyaboulton/gemini-voice
https://huggingface.co/spaces/akhaliq/anychat

1 reply

Team members 19