Nadav Timor

Nadav-Timor

Nadav-Timor's activity

New activity in deepseek-ai/DeepSeek-R1-Distill-Qwen-7B 12 days ago

Vocab size in config.json mismatches the actual tokenizer size

#4 opened 24 days ago by Fizzarolli

upvoted a paper about 2 months ago

MambaByte: Token-free Selective State Space Model

Paper • 2401.13660 • Published Jan 24, 2024 • 54

published an article 4 months ago

Universal Assisted Generation: Faster Decoding with Any Assistant Model

Article • Oct 29, 2024 • 52

upvoted an article 4 months ago

Faster Assisted Generation with Dynamic Speculation

Article • Oct 8, 2024 • 45

published an article 4 months ago

Faster Assisted Generation with Dynamic Speculation

Article • Oct 8, 2024 • 45

upvoted a paper 9 months ago

StarCoder: may the source be with you!

Paper • 2305.06161 • Published May 9, 2023 • 30

upvoted a collection 9 months ago

speed

Collection • 23 items • Updated Jun 27, 2024 • 6

upvoted a paper 9 months ago

Distributed Speculative Inference of Large Language Models

Paper • 2405.14105 • Published May 23, 2024 • 17

authored a paper 9 months ago

Distributed Speculative Inference of Large Language Models

Paper • 2405.14105 • Published May 23, 2024 • 17

New activity in facebook/CyberSecEval 10 months ago

`llama3p-70b-rc3_vr_mid_3` & `llama3p-7b-rc3_vr_mid_2`?

#2 opened 10 months ago by Nadav-Timor

reacted to julien-c's post with 👍 12 months ago

Post

What if you could casually access your remote GPU in HF Spaces from the comfort of your local VSCode 🤯

8 replies

updated a dataset over 1 year ago

Nadav-Timor/CUAD

Viewer • Updated Nov 8, 2023 • 13.8k • 63

New activity in amazon/MistralLite over 1 year ago

`max_position_embeddings=32768` and `precompute_freqs_cis` with `end=128_000`

#6 opened over 1 year ago by Nadav-Timor

New activity in mistralai/Mistral-7B-Instruct-v0.1 over 1 year ago

`max_position_embeddings=32768` with "attention span of 131K tokens"

#57 opened over 1 year ago by Nadav-Timor

liked a dataset over 1 year ago

chargoddard/QuALITY-instruct

Viewer • Updated Jul 14, 2023 • 4.61k • 68 • 2

liked a model over 1 year ago

amazon/MistralLite

Text Generation • Updated May 16, 2024 • 3.81k • 428

liked a Space over 1 year ago

Model Memory Utility 🚀

Calculate memory needed to train AI models • 892

New activity in joaogante/assisted_generation_demo over 1 year ago

Space isn't working because there is a runtime error

#1 opened over 1 year ago by Nadav-Timor

liked a model over 1 year ago

mistralai/Mistral-7B-Instruct-v0.1

Text Generation • Updated Aug 22, 2024 • 164k • 1.56k

New activity in Salesforce/codet5p-220m over 1 year ago

Challenges in reproducing the HumanEval scores reported in the paper with BigCode's Eval Harness

#2 opened over 1 year ago by Nadav-Timor