Commit 9de9507 by DeepMount00 (1 parent: 7fe7e2e)
Create README.md
README.md ADDED
@@ -0,0 +1,41 @@
# Alireo-400M Model Card 📚

## Model Description
Alireo-400M is a lightweight Italian language model with 400M parameters, designed for efficient natural language processing with a smaller footprint than larger models.

## Key Features
- **Architecture**: Transformer-based language model
- **Parameters**: 400M
- **Context Window**: 8K tokens
- **Training Data**: Curated Italian text corpus (books, articles, web content)
- **Model Size**: ~800MB
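
The ~800MB figure is consistent with 400M parameters stored at 2 bytes each. The card does not state the weight precision, so the 16-bit assumption below is only a guess, and the helper name `weight_size_mb` and the `BYTES_PER_PARAM` table are illustrative, not part of the model's tooling. A minimal sketch of the arithmetic:

```python
# Rough weight-storage estimate: parameter count x bytes per parameter.
# ASSUMPTION: the precision of the released weights is not stated in the card;
# the 16-bit row is simply the one that matches the quoted ~800MB.
BYTES_PER_PARAM = {"float32": 4, "float16/bfloat16": 2, "int8": 1}

NUM_PARAMS = 400_000_000  # "400M" from the card, taken as a round figure


def weight_size_mb(num_params: int, bytes_per_param: int) -> float:
    """Return raw weight size in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bytes_per_param / 1e6


for dtype, nbytes in BYTES_PER_PARAM.items():
    print(f"{dtype}: ~{weight_size_mb(NUM_PARAMS, nbytes):.0f} MB")
```

This counts weights only. Runtime memory also includes activations, the attention cache, and framework overhead, which is one plausible reason the RAM figures in the card are larger than the file size.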

## Performance
Despite its compact size, Alireo-400M demonstrates impressive performance:
- Outperforms Qwen 0.5B across multiple benchmarks
- Maintains high accuracy in Italian language understanding tasks
- Efficient inference speed due to optimized architecture

## Limitations
- Limited context window compared to larger models
- May struggle with highly specialized technical content
- Performance may vary on dialectal variants of Italian
- Not suitable for multilingual tasks

## Hardware Requirements
- Minimum RAM: 2GB
- Recommended RAM: 4GB
- GPU: Optional, but recommended for faster inference
- Disk Space: ~1GB (including model and dependencies)

## License
Apache 2.0

## Citation
```bibtex
@software{alireo2024,
  author = {Michele Montebovi},
  title = {Alireo-400M: A Lightweight Italian Language Model},
  year = {2024},
}
```