21 17 74

Kiran Kamble

kiranr

ki6an

AI & ML interests

nlp,llm

Recent Activity

liked a dataset 15 days ago

Gryphe/Sonnet3.5-SlimOrcaDedupCleaned

liked a dataset 15 days ago

Magpie-Align/Magpie-Llama-3.1-Pro-MT-300K-Filtered

liked a dataset 15 days ago

virattt/financial-qa-10K

View all activity

Organizations

kiranr's activity

liked 4 datasets 15 days ago

updated a model 26 days ago

Writer/Palmyra-X-4.3-73B

Updated 26 days ago • 1.16k

reacted to melisa's post with 🔥 3 months ago

Post

2981

🔥 Introducing "Writing in the Margins (WiM)" - better inference pattern for long context LLMs that solves the Lost-in-the-Middle problem 🔥

Paper page: Writing in the Margins: Better Inference Pattern for Long Context Retrieval (2408.14906)

TL;DR
Make your model write "margin notes" as you chunk prefill the KV cache. Then ask it reread all notes before it speaks up.
Works with humans, works with AI 🤖

WiM leverages the chunked prefill of the key-value cache, which concurrently generates query-based extractive summaries at each step of the prefill that are subsequently reintegrated at the end of the computation. We term these intermediate outputs “margins”, drawing inspiration from the practice of making margin notes for improved comprehension of long contexts in human reading. We show that this technique, which adds only minimal additional computation, significantly improves LLMs long context reasoning capabilities.

Think: Every chunk has a chance to be attended to/ be at the end of the context at least once. 🎉

📊 Results:
- An average accuracy boost of 7.5% in multi-hop reasoning tasks like HotpotQA and MultiHop-RAG.
- Even a 30% increase in F1-score for summarisation-like tasks (CWE).

Plus, WiM fits seamlessly into interactive applications (think: progress bar!). It can provide real-time progress updates during data retrieval and integration, making it user-friendly and transparent - a stark contrast to feeding 1mln tokens to an LLMs and waiting 6 min for the first token. 🤯

👩‍💻🧑‍💻 Check it out and contribute to our open-source project here: https://github.com/writer/writing-in-the-margins

🧠 More about chunked prefill: https://docs.vllm.ai/en/latest/models/performance.html#chunked-prefill

2 replies

updated a collection 3 months ago

papers

Collection

79 items • Updated Aug 28

upvoted a paper 3 months ago

Writing in the Margins: Better Inference Pattern for Long Context Retrieval

Paper • 2408.14906 • Published Aug 27 • 138

authored a paper 3 months ago

Writing in the Margins: Better Inference Pattern for Long Context Retrieval

Paper • 2408.14906 • Published Aug 27 • 138

commented a paper 3 months ago

Writing in the Margins: Better Inference Pattern for Long Context Retrieval

Paper • 2408.14906 • Published Aug 27 • 138 •

upvoted an article 3 months ago

Article

Using Writer Framework with Hugging Face Spaces

•

Aug 20

• 30

updated a collection 3 months ago

papers

Collection

79 items • Updated Aug 28

upvoted 2 papers 3 months ago

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20 • 58

To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20 • 41

updated a collection 4 months ago

papers

Collection

79 items • Updated Aug 28

upvoted a paper 4 months ago

DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search

Paper • 2408.08152 • Published Aug 15 • 52

New activity in Writer/Palmyra-Med-70B-32K 4 months ago

Access the model via Ollama

#3 opened 4 months ago by

gileneo

reacted to samjulien's post with 🔥 4 months ago

Post

1897

🔥 Today, Writer dropped Palmyra-Med-70b and Palmyra-Fin-70b, two new domain-specific models that are setting a new standard for medical and financial model performance.

TL;DR
Palmyra-Med-70b
🔢 8k and 32k versions available
🚀 MMLU performance of ~86%, outperforming other top models
👨‍⚕️ Great for diagnosing, planning treatments, medical research, insurance coding and billing
📃 Open-model license for non-commercial use cases
🤗 Available on Hugging Face: Writer/Palmyra-Med-70B
💾 Live on NVIDIA NIM: https://build.nvidia.com/writer/palmyra-med-70b

Palmyra-Fin-70b
🚀 Passed the CFA Level III exam with a 73% score — the first model to do so
💸 Skilled at complex tasks like investment research, financial analysis, and sentiment analysis
📈 Outperformed other top models on a long-fin-eval test of real-world use cases
📃 Open-model license for non-commercial use cases
🤗 Available on Hugging Face: Writer/Palmyra-Fin-70B-32K
💾 Live on NVIDIA NIM: https://build.nvidia.com/writer/palmyra-fin-70b-32k

Try them out and let us know what you think!

2 replies

New activity in Writer/Palmyra-Fin-70B-32K 4 months ago

Dataset sources

#1 opened 4 months ago by

MatVet