aashish1904 committed on
Commit 7808ba8
1 Parent(s): 8afb9d5

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +112 -0

README.md ADDED

---
base_model:
- qnguyen3/VyLinh-3B
- Qwen/Qwen2.5-3B-Instruct
library_name: transformers
tags:
- mergekit
- merge
language:
- vi
---

[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)

# QuantFactory/Arcee-VyLinh-GGUF
This is a quantized version of [arcee-ai/Arcee-VyLinh](https://huggingface.co/arcee-ai/Arcee-VyLinh), created using llama.cpp.
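
Since the repository ships GGUF files, one lightweight way to run them locally is through llama-cpp-python. The sketch below is illustrative only: the `filename` glob and the `n_ctx` value are assumptions, so match them to the quantization file you actually download.

```python
# Minimal sketch with llama-cpp-python (pip install llama-cpp-python).
from llama_cpp import Llama

# Download one quantized file from the Hub and load it.
# ASSUMPTION: the filename glob below; pick the quantization level
# you want from the repository's file list.
llm = Llama.from_pretrained(
    repo_id="QuantFactory/Arcee-VyLinh-GGUF",
    filename="*Q4_K_M.gguf",
    n_ctx=4096,  # context to allocate; the base model supports up to 32K
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "Bạn là trợ lí hữu ích."},  # "You are a helpful assistant."
        {"role": "user", "content": "Một cộng một bằng mấy?"},  # "What is one plus one?"
    ],
    temperature=0.25,
)
print(response["choices"][0]["message"]["content"])
```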

# Original Model Card

**Quantized Version**: [arcee-ai/Arcee-VyLinh-GGUF](https://huggingface.co/arcee-ai/Arcee-VyLinh-GGUF)

# Arcee-VyLinh

Arcee-VyLinh is a 3B-parameter instruction-following model optimized for Vietnamese language understanding and generation. Built through a training process that combines evolved hard questions with iterative Direct Preference Optimization (DPO), it achieves strong performance despite its compact size.
29
+
30
+ ## Model Details
31
+
32
+ - **Architecture:** Based on Qwen2.5-3B
33
+ - **Parameters:** 3 billion
34
+ - **Context Length:** 32K tokens
35
+ - **Training Data:** Custom evolved dataset + ORPO-Mix-40K (Vietnamese)
36
+ - **Training Method:** Multi-stage process including EvolKit, proprietary merging, and iterative DPO
37
+ - **Input Format:** Supports both English and Vietnamese, optimized for Vietnamese
38
+
39
+ ## Intended Use
40
+
41
+ - Vietnamese language chat and instruction following
42
+ - Text generation and completion
43
+ - Question answering
44
+ - General language understanding tasks
45
+ - Content creation and summarization
46
+

## Performance and Limitations

### Strengths

- Exceptional performance on complex Vietnamese language tasks
- Efficient 3B-parameter architecture
- Strong instruction-following capabilities
- Competitive with larger models (4B-8B parameters)

### Benchmarks

Tested on the Vietnamese subset of m-ArenaHard (CohereForAI), with Claude 3.5 Sonnet as judge:

![image/png](https://cdn-uploads.huggingface.co/production/uploads/630430583926de1f7ec62c6b/m1bTn0vkiPKZ3uECC4b0L.png)

### Limitations

- May still hallucinate on culture-specific content
- Primarily focused on Vietnamese language understanding
- May not perform optimally in specialized technical domains

## Training Process

Our training pipeline consisted of several stages:

1. **Base Model Selection:** Started with Qwen2.5-3B
2. **Hard Question Evolution:** Generated 20K challenging questions using EvolKit
3. **Initial Training:** Created VyLinh-SFT through supervised fine-tuning
4. **Model Merging:** Applied a proprietary merging technique with Qwen2.5-3B-Instruct
5. **DPO Training:** Ran 6 epochs of iterative DPO using ORPO-Mix-40K (sketched below)
6. **Final Merge:** Combined with Qwen2.5-3B-Instruct for optimal performance
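
For illustration only, here is a minimal sketch of what a single DPO round could look like with TRL's `DPOTrainer`. It is not the authors' actual recipe: the dataset id (a public stand-in for the Vietnamese ORPO-Mix-40K), the hyperparameters, and the argument name `processing_class` (recent TRL; older releases use `tokenizer`) are all assumptions.

```python
# Hedged sketch of one DPO round with TRL; not the actual VyLinh pipeline.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_id = "Qwen/Qwen2.5-3B-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Preference data with "prompt"/"chosen"/"rejected" columns.
# ASSUMPTION: a public mix standing in for the Vietnamese ORPO-Mix-40K.
dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train")

training_args = DPOConfig(
    output_dir="vylinh-dpo-round",
    num_train_epochs=1,  # the card reports 6 iterative epochs in total
    beta=0.1,            # assumed strength of the implicit KL penalty
)

trainer = DPOTrainer(
    model=model,  # TRL clones a frozen reference model when none is given
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,
)
trainer.train()
```

Iterative DPO then repeats such rounds, feeding each round's output model back in as the starting point for the next.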

## Usage Examples

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "arcee-ai/Arcee-VyLinh",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("arcee-ai/Arcee-VyLinh")

prompt = "Một cộng một bằng mấy?"  # "What is one plus one?"
messages = [
    {"role": "system", "content": "Bạn là trợ lí hữu ích."},  # "You are a helpful assistant."
    {"role": "user", "content": prompt},
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=1024,
    eos_token_id=tokenizer.eos_token_id,
    do_sample=True,  # sampling must be enabled for `temperature` to apply
    temperature=0.25,
)
# Keep only the newly generated tokens, dropping the echoed prompt
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```