| Task | Version | Metric | Value | Stderr |
|---|---|---|---|---|
| hendrycksTest-logical_fallacies | 1 | acc | 0.3067 | ± 0.0362 |
| | | acc_norm | 0.3067 | ± 0.0362 |
| hendrycksTest-global_facts | 1 | acc | 0.3000 | ± 0.0461 |
| | | acc_norm | 0.3000 | ± 0.0461 |
| hendrycksTest-abstract_algebra | 1 | acc | 0.2700 | ± 0.0446 |
| | | acc_norm | 0.2700 | ± 0.0446 |
| hendrycksTest-college_chemistry | 1 | acc | 0.3100 | ± 0.0465 |
| | | acc_norm | 0.3100 | ± 0.0465 |
| hendrycksTest-college_physics | 1 | acc | 0.2157 | ± 0.0409 |
| | | acc_norm | 0.2157 | ± 0.0409 |
| hendrycksTest-formal_logic | 1 | acc | 0.2857 | ± 0.0404 |
| | | acc_norm | 0.2857 | ± 0.0404 |
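The `hendrycksTest-*` task names match older releases of EleutherAI's lm-evaluation-harness. A minimal reproduction sketch under that assumption is shown below; `YOUR_MODEL_ID` is a placeholder for the actual repository name, and the few-shot setting is not stated above, so `num_fewshot=0` is only an assumption.

```python
# Sketch: re-running the MMLU subtasks above with lm-evaluation-harness
# (v0.3.x-style task names). YOUR_MODEL_ID is a placeholder, not the real repo id.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf-causal",                      # Hugging Face causal-LM backend
    model_args="pretrained=YOUR_MODEL_ID",  # placeholder model id
    tasks=[
        "hendrycksTest-logical_fallacies",
        "hendrycksTest-global_facts",
        "hendrycksTest-abstract_algebra",
        "hendrycksTest-college_chemistry",
        "hendrycksTest-college_physics",
        "hendrycksTest-formal_logic",
    ],
    num_fewshot=0,  # assumption: the few-shot count used for the table is not reported
)
print(results["results"])
```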
Compared to TinyLlama-1.1B-Chat-v1.0:
- Abstract Algebra: up 17.4%
- Formal Logic: up 24.2%
- Logical Fallacies: up 35.4%
Template Format: Alpaca
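For reference, a minimal sketch of the standard Alpaca prompt layout (no-input variant). The wording follows the original Alpaca release; if this model was fine-tuned on a variant of the template, adjust accordingly.

```python
# Standard Alpaca instruction template (no-input variant).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

# Example: build a prompt for a single instruction.
prompt = ALPACA_TEMPLATE.format(instruction="Name one common logical fallacy.")
print(prompt)
```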
Training took 4 hours for 1 epoch on an RTX 3090.