Collection of State-of-the-art FP8 Block Quantized Models
NM Testing
company
models 492
nm-testing/TinyLlama-1.1B-Chat-v1.0-actorder-weight-e2e • 0.3B • Updated • 162
nm-testing/TinyLlama-1.1B-Chat-v1.0-actorder-group-e2e • 0.3B • Updated • 314
nm-testing/TinyLlama-1.1B-Chat-v1.0-W4A16_2of4-e2e • 0.3B • Updated • 171
nm-testing/TinyLlama-1.1B-Chat-v1.0-W4A16_2of4_channel-e2e • 0.3B • Updated • 180
nm-testing/TinyLlama-1.1B-Chat-v1.0-sparse2of4_only-e2e • 0.7B • Updated • 175
nm-testing/TinyLlama-1.1B-Chat-v1.0-sparse2of4_fp8_dynamic-e2e • 0.7B • Updated • 168
nm-testing/TinyLlama-1.1B-Chat-v1.0-kv_cache_default_tinyllama-e2e • 1B • Updated • 134
nm-testing/Phi-3-mini-4k-instruct-kv_cache_default_phi3-e2e • 4B • Updated • 179
nm-testing/TinyLlama-1.1B-Chat-v1.0-kv_cache_default_gptq_tinyllama-e2e • 0.3B • Updated • 160
nm-testing/TinyLlama-1.1B-Chat-v1.0-W8A8_tensor_weight_static_per_tensor_act-e2e • 1B • Updated • 188