---
license: other
base_model:
- beomi/Llama-3-Open-Ko-8B-Instruct-preview
- cognitivecomputations/dolphin-2.9-llama3-8b
- NousResearch/Meta-Llama-3-8B-Instruct
- NousResearch/Meta-Llama-3-8B
- abacusai/Llama-3-Smaug-8B
- Locutusque/Llama-3-Orca-1.0-8B
library_name: transformers
tags:
- mergekit
- merge
- llama

---
# πŸ‡°πŸ‡· SmartLlama-3-Ko-8B
<a href="https://ibb.co/C8Tcw1F"><img src="https://i.ibb.co/QQ1gJbG/smartllama3.png" alt="smartllama3" border="0"></a><br />

SmartLlama-3-Ko-8B is a sophisticated AI model that integrates the capabilities of several advanced language models. This merged model is designed to excel in a variety of tasks ranging from technical problem-solving to multilingual communication.

## πŸ“• Merge Details

### Component Models and Contributions

### 1. NousResearch/Meta-Llama-3-8B and Meta-Llama-3-8B-Instruct
- **General Language Understanding and Instruction-Following**: These base models provide a robust foundation in general language understanding. The instruct version is optimized to follow detailed user instructions, enhancing the model's utility in task-oriented dialogues.

### 2. cognitivecomputations/dolphin-2.9-llama3-8b
- **Complex Problem-Solving and Depth of Understanding**: Enhances the model's capabilities in technical and scientific domains, improving its performance in complex problem-solving and areas requiring intricate understanding.

### 3. abacusai/Llama-3-Smaug-8B
- **Multi-Turn Conversational Abilities**: Improves performance in real-world multi-turn conversations, which are crucial for applications such as customer service and interactive learning. A multi-turn conversation is a dialogue built from several back-and-forth exchanges; unlike a single-turn interaction, which may end after one question and one response, it requires ongoing engagement from both sides, with the context of earlier messages shaping each subsequent reply. For AI systems such as chatbots and virtual assistants, handling this context well enables more natural, human-like interactions, and it matters most in settings like customer service, therapy, or tutoring, where the history and depth of the conversation directly affect the quality of the response.

### 4. Locutusque/Llama-3-Orca-1.0-8B
- **Specialization in Math, Coding, and Writing**: Enhances the model's ability to handle mathematical equations, generate computer code, and produce high-quality written content.

### 5. beomi/Llama-3-Open-Ko-8B-Instruct-preview
- **Enhanced Korean Language Capabilities**: Specifically trained to understand and generate Korean, valuable for bilingual or multilingual applications targeting Korean-speaking audiences.

### Merging Technique: DARE TIES
- **Balanced Integration**: The DARE TIES method combines two ideas: DARE randomly drops a fraction of each fine-tuned model's delta parameters (controlled by `density`) and rescales the survivors, while TIES resolves sign conflicts between models before summing their contributions. This lets each component model contribute its strengths without the interference that naive parameter averaging can cause; a conceptual sketch follows.
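
The sketch below illustrates the DARE-TIES idea on a single toy tensor. It is an illustration only, not mergekit's actual implementation; the function name and toy data are made up for the example.

```python
import torch

def dare_ties_merge(base, finetuned, densities, weights, seed=0):
    """Toy illustration of DARE-TIES on a single parameter tensor.

    base:       parameter tensor of the base model
    finetuned:  list of same-shape tensors from the fine-tuned models
    densities:  fraction of delta parameters to KEEP per model (DARE)
    weights:    mixing weight per model
    """
    torch.manual_seed(seed)
    deltas = []
    for ft, density, w in zip(finetuned, densities, weights):
        delta = ft - base                         # task vector
        mask = torch.rand_like(delta) < density   # DARE: random keep-mask
        delta = delta * mask / density            # rescale surviving entries
        deltas.append(w * delta)
    stacked = torch.stack(deltas)
    # TIES: elect a sign per parameter from the summed weighted deltas,
    # then keep only the contributions that agree with the elected sign.
    elected = torch.sign(stacked.sum(dim=0))
    agree = torch.sign(stacked) == elected
    merged_delta = (stacked * agree).sum(dim=0)
    return base + merged_delta

# Tiny demo with random tensors standing in for real weights.
base = torch.zeros(4)
models = [torch.randn(4) for _ in range(3)]
print(dare_ties_merge(base, models,
                      densities=[0.5, 0.5, 0.5],
                      weights=[0.4, 0.3, 0.3]))
```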

### Overall Capabilities
SmartLlama-3-Ko-8B is highly capable and versatile, suitable for:

- **Technical and Academic Applications**: Enhanced capabilities in math, coding, and technical writing.
- **Customer Service and Interactive Applications**: Advanced conversational skills and sustained interaction handling.
- **Multilingual Communication**: Specialized training in Korean enhances its utility in global or region-specific settings.

This comprehensive capability makes SmartLlama-3-Ko-8B not only a powerful tool for general-purpose AI tasks but also a specialized resource for industries and applications demanding high levels of technical and linguistic precision.
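
Since the card lists `library_name: transformers`, the merged weights can presumably be loaded directly with 🤗 Transformers. Below is a minimal sketch, assuming the repository id shown (a placeholder; replace it with this model's actual Hub path) and that the saved tokenizer ships a chat template:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-namespace/SmartLlama-3-Ko-8B"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the merge config below uses bfloat16
    device_map="auto",
)

messages = [
    {"role": "system", "content": "모든 대답은 한국어로 해줘."},  # "Answer everything in Korean."
    {"role": "user", "content": "SmartLlama에 대해 소개해줘."},   # "Introduce SmartLlama."
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```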

## πŸ’» Ollama

```
ollama create smartllama-3-ko-8b -f ./Modelfile_Q5_K_M
```

[Modelfile_Q5_K_M]
```
FROM smartllama-3-ko-8b-Q5_K_M.gguf
TEMPLATE """
{{- if .System }}
system
<s>{{ .System }}</s>
{{- end }}
user
<s>Human:
{{ .Prompt }}</s>
assistant
<s>Assistant:
"""

SYSTEM """
μΉœμ ˆν•œ μ±—λ΄‡μœΌλ‘œμ„œ μƒλŒ€λ°©μ˜ μš”μ²­μ— μ΅œλŒ€ν•œ μžμ„Έν•˜κ³  μΉœμ ˆν•˜κ²Œ λ‹΅ν•˜μž. λͺ¨λ“  λŒ€λ‹΅μ€ ν•œκ΅­μ–΄(Korean)으둜 λŒ€λ‹΅ν•΄μ€˜.
"""

PARAMETER temperature 0.7
PARAMETER num_predict 256
PARAMETER num_ctx 4096
PARAMETER stop "<s>"
PARAMETER stop "</s>"
```
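
Once created, the model can be run interactively with the same tag used above:

```
ollama run smartllama-3-ko-8b
```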
## πŸ–‹οΈ Merge Method

This model was merged with the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, using [NousResearch/Meta-Llama-3-8B](https://huggingface.co/NousResearch/Meta-Llama-3-8B) as the base.

## 🎭 Models Merged

The following models were included in the merge:
* [beomi/Llama-3-Open-Ko-8B-Instruct-preview](https://huggingface.co/beomi/Llama-3-Open-Ko-8B-Instruct-preview)
* [cognitivecomputations/dolphin-2.9-llama3-8b](https://huggingface.co/cognitivecomputations/dolphin-2.9-llama3-8b)
* [NousResearch/Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct)
* [abacusai/Llama-3-Smaug-8B](https://huggingface.co/abacusai/Llama-3-Smaug-8B)
* [Locutusque/Llama-3-Orca-1.0-8B](https://huggingface.co/Locutusque/Llama-3-Orca-1.0-8B)

## πŸ—žοΈ Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: NousResearch/Meta-Llama-3-8B
    # Base model providing a general foundation without specific parameters
  - model: NousResearch/Meta-Llama-3-8B-Instruct
    parameters:
      density: 0.58
      weight: 0.25  
  - model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 0.52
      weight: 0.15  
  - model: Locutusque/Llama-3-Orca-1.0-8B
    parameters:
      density: 0.52
      weight: 0.15  
  - model: abacusai/Llama-3-Smaug-8B
    parameters:
      density: 0.52
      weight: 0.15  
  - model: beomi/Llama-3-Open-Ko-8B-Instruct-preview
    parameters:
      density: 0.53
      weight: 0.2   
merge_method: dare_ties
base_model: NousResearch/Meta-Llama-3-8B
parameters:
  int8_mask: true
dtype: bfloat16
```
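
For reference, a configuration like this is typically run with mergekit's `mergekit-yaml` CLI; the exact flags (e.g. `--cuda`) depend on your environment, so treat this invocation as a sketch rather than the exact command used:

```
mergekit-yaml config.yaml ./SmartLlama-3-Ko-8B --cuda
```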

### 🎊 Test Results

**Korean Multi-Turn Conversation**
<a href="https://ibb.co/TKPGx9G"><img src="https://i.ibb.co/0BYLRHL/Screenshot-2024-04-30-at-2-42-18-PM.png" alt="Screenshot-2024-04-30-at-2-42-18-PM" border="0"></a>
<a href="https://ibb.co/v40tkNj"><img src="https://i.ibb.co/hF3qVGm/Screenshot-2024-04-30-at-8-26-57-AM.png" alt="Screenshot-2024-04-30-at-8-26-57-AM" border="0"></a>

**Programming**
<a href="https://ibb.co/6tZLqwx"><img src="https://i.ibb.co/n10tKmv/Screenshot-2024-04-30-at-8-30-35-AM.png" alt="Screenshot-2024-04-30-at-8-30-35-AM" border="0"></a>

**Physics & Math**
<a href="https://ibb.co/jDhVNk0"><img src="https://i.ibb.co/jDhVNk0/Screenshot-2024-04-30-at-1-06-16-PM.png" alt="Screenshot-2024-04-30-at-1-06-16-PM" border="0"></a>
<a href="https://ibb.co/KKgN4j5"><img src="https://i.ibb.co/KKgN4j5/Screenshot-2024-04-30-at-1-06-31-PM.png" alt="Screenshot-2024-04-30-at-1-06-31-PM" border="0"></a>
<a href="https://ibb.co/ZzKHP5j"><img src="https://i.ibb.co/ZzKHP5j/Screenshot-2024-04-30-at-1-06-47-PM.png" alt="Screenshot-2024-04-30-at-1-06-47-PM" border="0"></a>