Jonathan von Rad
jonny-vr
0 followers · 1 following
AI & ML interests
LLM Compression & Mechanistic Interpretability
Recent Activity
updated a model 2 days ago: jonny-vr/mv-final-assignment-gru
updated a model 3 days ago: jonny-vr/mv-final-assignment-gru-notebook
published a model 3 days ago: jonny-vr/mv-final-assignment-gru-notebook
jonny-vr's activity
updated
a model
2 days ago
jonny-vr/mv-final-assignment-gru
Updated
2 days ago
updated
a model
3 days ago
jonny-vr/mv-final-assignment-gru-notebook
Updated
3 days ago
published
2 models
3 days ago
jonny-vr/mv-final-assignment-gru-notebook
Updated
3 days ago
jonny-vr/mv-final-assignment-gru
Updated
2 days ago
New activity in
hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4
14 days ago
Tip: For Hardware Acceleration this Model will not leverage vllm marlin kernels!
#19 opened 14 days ago by
jonny-vr
updated
a model
18 days ago
jonny-vr/Llama-3.1-Minitron-4B-Depth-Chat
Text Generation • 5B • Updated 18 days ago • 15
published
a model
18 days ago
jonny-vr/Llama-3.1-Minitron-4B-Depth-Chat
Text Generation • 5B • Updated 18 days ago • 15
New activity in
Qwen/Qwen3-32B
24 days ago
Where is the Base Model?
10
3
#34 opened 7 months ago by
jonny-vr
New activity in
Harvard-DCML/boomerang-qwen3-4.9B
about 1 month ago
Substantially lower accuracy on reasoning benchmarks such as GSM8K (1.5%) and MATH-500 (4.2%)
1
#1 opened about 1 month ago by
jonny-vr
updated
a model
about 1 month ago
jonny-vr/mv-final-assignment
Updated
Dec 10, 2025
published
a model
about 1 month ago
jonny-vr/mv-final-assignment
Updated
Dec 10, 2025
New activity in
monology/pile-uncopyrighted
6 months ago
Could you please implement train:1% feature? This way we don't have to download the entire dataset.
1
#12 opened 6 months ago by
jonny-vr
New activity in
Qwen/Qwen3-32B
6 months ago
Low Score on GSM8K on lm-eval-harness? (just 74.91)
2
#36 opened 6 months ago by
jonny-vr
New activity in
nvidia/NV-Embed-v2
7 months ago
TypeError: cannot unpack non-iterable NoneType object
8
5
#37 opened 12 months ago by
Pietroferr
New activity in
google/gemma-3-27b-pt
7 months ago
Model is a Memory Hog - 2xH100 80GB OOM??
1
#5 opened 7 months ago by
jonny-vr
New activity in
google/gemma-3-1b-pt
7 months ago
When evaluating Wiki2, I just get Loss: Nan, while with gemma-3-1b-it it works..
2
#8 opened 7 months ago by
jonny-vr