Triangle104 committed on
Commit
e19688c
·
verified ·
1 Parent(s): 25c5eee

Update README.md

Files changed (1)
  1. README.md +151 -0
README.md CHANGED
@@ -17,6 +17,157 @@ language:
  This model was converted to GGUF format from [`Spestly/Athena-1-0.5B`](https://huggingface.co/Spestly/Athena-1-0.5B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
  Refer to the [original model card](https://huggingface.co/Spestly/Athena-1-0.5B) for more details on the model.
 
+ ---
+ Model details:
+ -
+ Athena-1 0.5B is a fine-tuned, instruction-following large language model derived from Qwen/Qwen2.5-0.5B-Instruct.
+ Designed for ultra-lightweight applications, Athena-1 0.5B balances compactness with robust performance, making it suitable for settings with limited computational resources.
+
+ ### Key Features
+
+ #### ⚡ Ultra-Lightweight and Efficient
+
+ - Compact Size: With just 500 million parameters, Athena-1 0.5B is ideal for edge devices and low-resource environments.
+ - Instruction Following: Fine-tuned for reliable adherence to user instructions.
+ - Coding and Mathematics: Capable of handling basic coding and mathematical tasks.
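As a rough illustration of why a 500-million-parameter model fits on low-resource hardware, the sketch below estimates the memory needed for the weights alone at a few nominal bit widths. The 500M figure is the rounded count quoted above; the bit widths are illustrative assumptions, and actual GGUF files will differ somewhat because of per-block scales, metadata, and mixed-precision layers.

```python
# Rough weight-memory estimate; ignores KV cache, activations, and file metadata.
PARAMS = 500_000_000  # rounded parameter count quoted in the model card


def weight_bytes(n_params: int, bits_per_weight: float) -> float:
    """Bytes needed to store n_params weights at the given nominal bit width."""
    return n_params * bits_per_weight / 8


# Nominal bit widths chosen for illustration only.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    gib = weight_bytes(PARAMS, bits) / 2**30
    print(f"{label}: ~{gib:.2f} GiB")
```

These numbers are lower bounds on runtime memory, since the context cache grows with the number of tokens processed.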
+
+ #### 📖 Contextual Understanding
+
+ - Context Length: Supports up to 16,384 tokens, enabling processing of moderately sized conversations or documents.
+ - Token Generation: Can generate up to 4K tokens of coherent output.
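One practical consequence of these two limits is that the prompt and the reply share the same window. The following is a minimal sketch, assuming that "4K" means 4,096 tokens and that generated tokens count against the 16,384-token context; neither assumption is stated in the model card, and `prompt_budget` is a hypothetical helper name.

```python
CONTEXT_LENGTH = 16_384  # context window quoted in the model card
MAX_NEW_TOKENS = 4_096   # "4K" read as 4,096 tokens (assumption)


def prompt_budget(max_new_tokens: int = MAX_NEW_TOKENS,
                  context_length: int = CONTEXT_LENGTH) -> int:
    """Tokens left for the prompt once room for the reply is reserved."""
    if not 0 <= max_new_tokens <= context_length:
        raise ValueError("max_new_tokens must fit inside the context window")
    return context_length - max_new_tokens


print(prompt_budget())     # room left for the prompt with a full-length reply
print(prompt_budget(512))  # shorter replies leave more room for the prompt
```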
+
+ #### 🌍 Multilingual Support
+
+ Supports 20+ languages, including English, Chinese, French, Spanish, German, Italian, Russian, Japanese, Korean, Vietnamese, and Thai.
+
+ #### 📊 Structured Data & Outputs
+
+ - Structured Data Interpretation: Handles formats like tables and JSON effectively.
+ - Structured Output Generation: Produces well-formatted outputs for data-specific tasks.
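When a small model is asked for JSON, the reply may still include surrounding prose, so it helps to validate what comes back. The sketch below pulls the first parseable JSON object out of a reply using only the Python standard library. The `reply` string is a made-up example, not real model output, and `extract_first_json` is a hypothetical helper name.

```python
import json


def extract_first_json(text: str):
    """Return the first JSON object found in text, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at the given index.
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            start = text.find("{", start + 1)
    return None


# Hypothetical reply used only to exercise the helper.
reply = 'Sure! Here is the record: {"name": "Ada", "age": 36} Let me know if you need more.'
print(extract_first_json(reply))
```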
+
+ ### Model Details
+
+ - Base Model: Qwen/Qwen2.5-0.5B-Instruct
+ - Architecture: Transformers with RoPE, SwiGLU, RMSNorm, Attention QKV bias, and tied word embeddings.
+ - Parameters: 500M total.
+ - Layers: (Adjust if different from the base model)
+ - Attention Heads: (Adjust if different from the base model)
+ - Context Length: Up to 16,384 tokens.
+
+ ### Applications
+
+ Athena-1 0.5B is optimized for:
+
+ - Conversational AI: Powers lightweight and responsive chatbots.
+ - Code Assistance: Handles basic code generation, debugging, and explanations.
+ - Mathematical Assistance: Solves fundamental math problems.
+ - Document Processing: Summarizes and analyzes smaller documents effectively.
+ - Multilingual Tasks: Supports global use cases with a compact model.
+ - Structured Data: Reads and generates structured formats like JSON and tables.
+
+ ### Quickstart
+
+ Here’s how you can use Athena-1 0.5B for quick text generation:
+
+ ```python
+ # Use a pipeline as a high-level helper
+ from transformers import pipeline
+
+ messages = [
+     {"role": "user", "content": "What can you do?"},
+ ]
+ pipe = pipeline("text-generation", model="Spestly/Athena-1-0.5B")
+ print(pipe(messages))
+
+ # Load model directly
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ tokenizer = AutoTokenizer.from_pretrained("Spestly/Athena-1-0.5B")
+ model = AutoModelForCausalLM.from_pretrained("Spestly/Athena-1-0.5B")
+ ```
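The "Load model directly" snippet stops after loading the tokenizer and model. The following is a minimal sketch of the remaining generation steps, assuming the fine-tune ships a chat template with its tokenizer (as the base Qwen2.5 Instruct models do) and that PyTorch is installed. `build_messages` and `generate_reply` are hypothetical helper names, and the `max_new_tokens` default is an arbitrary choice.

```python
def build_messages(user_prompt, system_prompt=None):
    """Build a chat-style message list in the role/content format."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return messages


def generate_reply(user_prompt, model_id="Spestly/Athena-1-0.5B", max_new_tokens=256):
    """Load the model, apply its chat template, and return the reply text."""
    # Imported lazily so build_messages can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Render the conversation with the tokenizer's chat template (assumed to exist).
    prompt = tokenizer.apply_chat_template(
        build_messages(user_prompt), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the newly generated text is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_reply("What can you do?")` downloads the weights on first use, so expect a delay the first time it runs.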
+
+ ---
  ## Use with llama.cpp
  Install llama.cpp through brew (works on Mac and Linux)