DeepMount00 committed on
Commit
9de9507
1 Parent(s): 7fe7e2e

Create README.md

Files changed (1)
  1. README.md +41 -0
README.md ADDED
@@ -0,0 +1,41 @@
+ # Alireo-400M Model Card 📚
+
+ ## Model Description
+ Alireo-400M is a lightweight Italian language model with 400M parameters, designed for efficient natural language processing with a smaller footprint than larger models.
+
+ ## Key Features
+ - **Architecture**: Transformer-based language model
+ - **Parameters**: 400M
+ - **Context Window**: 8K tokens
+ - **Training Data**: Curated Italian text corpus (books, articles, web content)
+ - **Model Size**: ~800MB
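
The parameter count and model size listed above are consistent with 16-bit weights. A minimal sketch of the arithmetic, assuming 2 bytes per parameter (fp16 or bf16; the card does not state the stored precision) and decimal megabytes. `weight_size_mb` is a hypothetical helper written for illustration, not part of any library:

```python
# Bytes per parameter for common weight precisions (general facts,
# not taken from the model card).
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1}


def weight_size_mb(num_params: int, dtype: str = "fp16") -> float:
    """Approximate size of the raw weights in decimal megabytes (1 MB = 10**6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


if __name__ == "__main__":
    params = 400_000_000  # "400M" from the card
    for dtype in ("fp32", "fp16", "int8"):
        print(f"{dtype}: ~{weight_size_mb(params, dtype):.0f} MB")
```

Under this assumption, 400M parameters come to about 800 MB, which matches the listed model size. The actual file size also depends on the checkpoint format and any stored metadata.
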
+
+ ## Performance
+ Despite its compact size, Alireo-400M performs well:
+ - Outperforms Qwen 0.5B across multiple benchmarks
+ - Maintains high accuracy in Italian language understanding tasks
+ - Fast inference thanks to its optimized architecture
+
+ ## Limitations
+ - Limited context window compared to larger models
+ - May struggle with highly specialized technical content
+ - Performance may vary on Italian dialects
+ - Not suitable for multilingual tasks
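
Because of the limited context window, long inputs have to be shortened before generation. One possible approach is sketched here under stated assumptions: "8K" is read as 8192 tokens, the input is already a list of token IDs, and the oldest tokens are dropped so that room is left for the tokens to be generated. `fit_to_context` and its parameters are hypothetical names, not part of the model's tooling:

```python
from typing import List

CONTEXT_WINDOW = 8192  # assumption: "8K tokens" means 8 * 1024


def fit_to_context(token_ids: List[int], max_new_tokens: int,
                   context_window: int = CONTEXT_WINDOW) -> List[int]:
    """Keep the most recent tokens so prompt plus generation fits the window."""
    if not 0 <= max_new_tokens < context_window:
        raise ValueError("max_new_tokens must be in [0, context_window)")
    budget = context_window - max_new_tokens  # tokens available for the prompt
    # budget >= 1 here, so the slice keeps at most the last `budget` IDs;
    # shorter prompts are returned unchanged.
    return token_ids[-budget:]


if __name__ == "__main__":
    prompt = list(range(10_000))  # stand-in for real token IDs
    trimmed = fit_to_context(prompt, max_new_tokens=256)
    print(len(trimmed), trimmed[0])
```

Dropping the oldest tokens suits chat-style use where recent turns matter most. For documents where the beginning carries the key information, truncating from the end may be the better choice.
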
+
+ ## Hardware Requirements
+ - Minimum RAM: 2GB
+ - Recommended RAM: 4GB
+ - GPU: Optional, but recommended for faster inference
+ - Disk Space: ~1GB (including model and dependencies)
+
+ ## License
+ Apache 2.0
+
+ ## Citation
+ ```bibtex
+ @software{alireo2024,
+ author = {Michele Montebovi},
+ title = {Alireo-400M: A Lightweight Italian Language Model},
+ year = {2024},
+ }
+ ```