BootsofLagrangian
commited on
Commit
โข
a1a4d4b
1
Parent(s):
9db0880
Update README.md
Browse files
README.md
CHANGED
@@ -1,3 +1,261 @@
|
|
1 |
-
---
|
2 |
-
|
3 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
---
|
2 |
+
libray_name: transformers
|
3 |
+
pipeline_tag: text-generation
|
4 |
+
license: other
|
5 |
+
license_name: llama3
|
6 |
+
license_link: LICENSE
|
7 |
+
language:
|
8 |
+
- ko
|
9 |
+
- en
|
10 |
+
tags:
|
11 |
+
- meta
|
12 |
+
- llama
|
13 |
+
- llama-3
|
14 |
+
- akallama
|
15 |
+
library_name: transformers
|
16 |
+
|
17 |
+
inference: false
|
18 |
+
---
|
19 |
+
<a href="https://huggingface.co/collections/mirlab/akallama-66338859b09221f3607fdfcd">
|
20 |
+
<img src="https://github.com/0110tpwls/project/blob/master/image_720.png?raw=true" width="40%"/>
|
21 |
+
</a>
|
22 |
+
|
23 |
+
|
24 |
+
# AKALLAMA
|
25 |
+
|
26 |
+
AkaLlama is a series of Korean language models designed for practical usability across a wide range of tasks.
|
27 |
+
The initial model, AkaLlama-v0.1, is a fine-tuned version of Meta-Llama-3-70b-Instruct. It has been trained on a custom mix of publicly available datasets curated by the MIR Lab.
|
28 |
+
Our goal is to explore cost-effective ways to adapt high-performing LLMs for specific use cases, such as different languages (e.g., Korean) or domains (e.g., organization-specific chatbots).
|
29 |
+
|
30 |
+
For details, check out [our project page](https://yonsei-mir.github.io/AkaLLaMA-page).
|
31 |
+
|
32 |
+
### Model Description
|
33 |
+
|
34 |
+
This repo provides ExLlamav2 weight files for AkaLlama-70B-v0.1.
|
35 |
+
|
36 |
+
- **Developed by:** [Yonsei MIRLab](https://mirlab.yonsei.ac.kr/)
|
37 |
+
- **Language(s) (NLP):** Korean, English
|
38 |
+
- **License:** llama3
|
39 |
+
- **Finetuned from model:** [meta-llama/Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct)
|
40 |
+
- **Quantized from model:** [mirlab/AkaLlama-llama3-70b-v0.1](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1)
|
41 |
+
- **Dataset for Quantization:** [maywell/korean_textbooks/ko_wikidata](https://huggingface.co/datasets/maywell/korean_textbooks/tree/main/ko_wikidata)
|
42 |
+
|
43 |
+
Main branch of this repo only contains README.md. Please uses specific branch.
|
44 |
+
|
45 |
+
### Available Branch
|
46 |
+
| Name | Head Bits | avg Bits | Size | Max RAM required |
|
47 |
+
| ---- | ---- | ---- | ---- | ---- |
|
48 |
+
| [AkaLlama-llama3-70b-v0.1.2bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/2bpwh8) | 8 | 2 | 21 GB| 23.5 GB |
|
49 |
+
| [AkaLlama-llama3-70b-v0.1.2.25bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/2.25bpwh8) | 8 | 2.25 | 21 GB| 23.5 GB |
|
50 |
+
| [AkaLlama-llama3-70b-v0.1.2.4bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/2.4bpwh8) | 8 | 2.4 | 23 GB| 25.5 GB |
|
51 |
+
| [AkaLlama-llama3-70b-v0.1.3bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/3bpwh8) | 8 | 3 | 27 GB| 29.5 GB |
|
52 |
+
| [AkaLlama-llama3-70b-v0.1.4bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/4bpwh8) | 8 | 4 | 35 GB| 37.5 GB |
|
53 |
+
| [AkaLlama-llama3-70b-v0.1.4.25bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/4.25bpwh8) | 8 | 4.25 | 37 GB| 39.35 GB |
|
54 |
+
| [AkaLlama-llama3-70b-v0.1.5bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/5bpwh8) | 8 | 5 | 43 GB| 45.5 GB |
|
55 |
+
| [AkaLlama-llama3-70b-v0.1.6bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/6bpwh8) | 8 | 6 | 51 GB| 53.5 GB |
|
56 |
+
| [AkaLlama-llama3-70b-v0.1.8bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/8bpwh8) | 8 | 8 | 59 GB| 61.5 GB |
|
57 |
+
|
58 |
+
**Note**: the above RAM figures assume no GPU offloading. If layers are offloaded to the GPU, this will reduce RAM usage and use VRAM instead.
|
59 |
+
|
60 |
+
|
61 |
+
### About ExLlamaV2
|
62 |
+
|
63 |
+
ExLlamaV2 is an inference library for running local LLMs on modern consumer GPUs.
|
64 |
+
|
65 |
+
Requires ExllamaV2, which is being developed by turboderp https://github.com/turboderp/exllamav2 under an MIT license.
|
66 |
+
|
67 |
+
### Use with ExLlamaV2 Third Party
|
68 |
+
|
69 |
+
They are also compatible with many third party UIs and libraries.
|
70 |
+
|
71 |
+
- [TabbyAPI](https://github.com/theroyallab/tabbyAPI/) is a FastAPI-based server that provides an OpenAI-style web API
|
72 |
+
compatible with [SillyTavern](https://sillytavernai.com/) and other frontends.
|
73 |
+
|
74 |
+
- [ExUI](https://github.com/turboderp/exui) is a simple, standalone single-user web UI that serves an ExLlamaV2 instance
|
75 |
+
directly with chat and notebook modes.
|
76 |
+
|
77 |
+
- [text-generation-webui](https://github.com/oobabooga/text-generation-webui) supports ExLlamaV2 through the **exllamav2**
|
78 |
+
and **exllamav2_HF** loaders.
|
79 |
+
|
80 |
+
- [lollms-webui](https://github.com/ParisNeo/lollms-webui) supports ExLlamaV2 through the exllamav2 binding.
|
81 |
+
|
82 |
+
|
83 |
+
## Training Details
|
84 |
+
### Training Procedure
|
85 |
+
|
86 |
+
We trained AkaLlama using a preference learning alignment algorithm called [Odds Ratio Preference Optimization (ORPO)](https://huggingface.co/papers/2403.07691).
|
87 |
+
Our training pipeline is almost identical to that of [HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1](https://huggingface.co/HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1), aside from minor hyperparameter changes.
|
88 |
+
Please check out Huggingface's [alignment handbook](https://github.com/huggingface/alignment-handbook?tab=readme-ov-file) for further details, including the chat template.
|
89 |
+
|
90 |
+
### Training Data
|
91 |
+
|
92 |
+
Detailed descriptions regarding training data will be announced later.
|
93 |
+
|
94 |
+
### Examples
|
95 |
+
|
96 |
+
<!--<a href="https://huggingface.co/collections/mirlab/akallama-66338859b09221f3607fdfcd">
|
97 |
+
<img src="https://github.com/0110tpwls/project/blob/master/image (8).png?raw=true" width="80%"/>
|
98 |
+
</a> -->
|
99 |
+
<pre style="white-space: pre-wrap;">
|
100 |
+
User:
|
101 |
+
๋๋ ๋๊ตฌ์ด๊ณ ์ด๋ค ์ผ์ ํ ์ ์์ด?
|
102 |
+
Model:
|
103 |
+
์ ๋ ์ธ๊ณต์ง๋ฅ ์ธ์ด ๋ชจ๋ธ์ธ AkaLlama(์์นด๋ผ๋ง)์
๋๋ค. ์ ๋ ๋ค์ํ ์ฃผ์ ์ ํ์์ ํ
์คํธ๋ฅผ ์์ฑํ ์ ์์ผ๋ฉฐ, ์ง๋ฌธ์ ๋ตํ๊ณ , ๋ํ๋ฅผ ์งํํ๋ฉฐ, ์ด์ผ๊ธฐ๋ฅผ ๋ง๋ค ์ ์์ต๋๋ค. ๋ํ ์ฌ์ฉ์๊ฐ ์ง์ ํ ์ฃผ์ ๋ ์ฅ๋ฅด์ ๋ฐ๋ผ ์ฐฝ์์ ์ธ ๊ธ์ฐ๊ธฐ๋ ํธ์ง ์์
์ ์ํํ ์๋ ์์ต๋๋ค.
|
104 |
+
</pre>
|
105 |
+
<details>
|
106 |
+
|
107 |
+
<summary><b>Math Solving[CLICK TO EXPAND]</b></summary>
|
108 |
+
<!-- <a href="https://huggingface.co/collections/mirlab/akallama-66338859b09221f3607fdfcd">
|
109 |
+
<img src="https://github.com/0110tpwls/project/blob/master/image (9).png?raw=true" width="80%"/>
|
110 |
+
</a> -->
|
111 |
+
<pre style="white-space: pre-wrap;">
|
112 |
+
User:
|
113 |
+
์๋ ๋ฌธ์ ๋ฅผ ๊น๊ฒ ์๊ฐํ๋ฉด์ ํ์ด๋ด
|
114 |
+
์ฐํ 5 ํ์ ์ง์ฐ๊ฐ 45 ๊ฐ๊ฐ ์์ต๋๋ค. ์ด๊ฒ์ ๋ ์ ์๋ ๋๋ก ๋ง์ ํ์๋ค์๊ฒ ๋จ๊น์์ด ๋๊ฐ์ด ๋๋์ด ์ฃผ๋ ค๊ณ ํฉ๋๋ค. ๋ช ๋ช
๊น์ง ๋๋์ด ์ค ์ ์์ต๋๊น?
|
115 |
+
Model:
|
116 |
+
์๋ ๋ฌธ์ ๋ฅผ ๊น๊ฒ ์๊ฐํ๋ฉด์ ํ์ด๋ด
์๋ค:
|
117 |
+
์ฐํ 5 ํ์ ์ง์ฐ๊ฐ 45 ๊ฐ๊ฐ ์์ต๋๋ค. ์ด๊ฒ์ ๋ ์ ์๋ ๋๋ก ๋ง์ ํ์๋ค์๊ฒ ๋จ๊น์์ด ๋๊ฐ์ด ๋๋์ด ์ฃผ๋ ค๊ณ ํฉ๋๋ค. ๋ช ๋ช
๊น์ง ๋๋์ด ์ค ์ ์์ต๋๊น?
|
118 |
+
ํ์ด:
|
119 |
+
1. ์ฐํ 5 ํ๋ฅผ 1 ๊ฐ์ฉ ๋๋ ์ ๋ช ๋ช
์๊ฒ ๋๋์ด ์ค ์ ์๋์ง ๋จผ์ ๊ตฌํฉ๋๋ค.
|
120 |
+
- ๊ฐ ์ฐํ์๋ 12๊ฐ์ ์ฐํ์ด ํฌํจ๋์ด ์์ต๋๋ค. ๋ฐ๋ผ์ ์ด 12 * 5 = 60๊ฐ์
๋๋ค.
|
121 |
+
- 60๊ฐ๋ฅผ 1๋ช
๋น 1๊ฐ์ฉ ๋๋๋ฉด 60๋ช
์ ํ์๋ค์๊ฒ ๋๋ ์ค ์ ์์ต๋๋ค.
|
122 |
+
2. ์ง์ฐ๊ฐ 45 ๊ฐ๋ฅผ 1 ๊ฐ์ฉ ๋๋์ด ๋ช ๋ช
์๊ฒ ๋๋์ด ์ค ์ ์๋์ง ๊ตฌํฉ๋๋ค.
|
123 |
+
- 45 ๊ฐ๋ฅผ 1 ๋ช
๋น 1 ๊ฐ์ฉ ๋๋๋ฉด 45 ๋ช
์ ํ์๋ค์๊ฒ ๋๋ ์ค ์ ์์ต๋๋ค.
|
124 |
+
3. ์ฐํ๊ณผ ์ง์ฐ๊ฐ์ ๋ ๊ฐ์ง ๋ฌผํ์ ํจ๊ป ๊ณ ๋ คํ์ฌ ๊ฐ์ฅ ๋ง์ ํ์๋ค์๊ฒ ๋๊ฐ์ด ๋๋์ด ์ค ์ ์๋ ๋ฐฉ๋ฒ์ ์ฐพ์ต๋๋ค.
|
125 |
+
- ์ฐํ๋ก๋ง 60๋ช
, ์ง์ฐ๊ฐ๋ก๋ง 45๋ช
์ ํ์์๊ฒ ๋๋ ์ค ์ ์์ผ๋ฏ๋ก, ๋ ์ข
๋ฅ์ ๋ฌธ๊ตฌ๋ฅผ ํจ๊ป ๊ณ ๋ คํ ๋๋ ์ด๋ณด๋ค ์ ์ ์์ ํ์์๊ฒ ๋๋ ์ฃผ์ด์ผ ํฉ๋๋ค.
|
126 |
+
- ์ด ์ค์์ ๊ณตํต์ธ์(common divisor)๋ฅผ ์ฐพ์์ ์ด๋ฅผ ์ต์ ๋จ์๋ก ์ค์ ํ๋ฉด ๋ฉ๋๋ค.
|
127 |
+
- 60๊ณผ 45์ ์ต๋๊ณต์ฝ์(GCD)๋ 15์
๋๋ค.
|
128 |
+
4. ์ฐํ๊ณผ ์ง์ฐ๊ฐ ๊ฐ๊ฐ์ GCD(15)๋ก ๋๋์ด ๋ช ๊ฐ์ฉ ๋๋ ์ค ์ ์๋์ง ๊ตฌํฉ๋๋ค.
|
129 |
+
- ์ฐํ์ 60 / 15 = 4๊ฐ์ฉ, ์ด 4 * 12 = 48๊ฐ์ ์ฐํ์ด ํ์ํฉ๋๋ค.
|
130 |
+
- ์ง์ฐ๊ฐ๋ 45 / 15 = 3๊ฐ์ฉ, ์ด 3 * 1 = 3๊ฐ์ ์ง์ฐ๊ฐ๊ฐ ํ์ํฉ๋๋ค.
|
131 |
+
5. ์ต์ข
์ ์ผ๋ก, ๊ฐ์ฅ ๋ง์ ํ์๋ค์๊ฒ ๋๊ฐ์ด ๋๋์ด ์ค ์ ์๋ ๋ฐฉ๋ฒ์ ์ฐํ๊ณผ ์ง์ฐ๊ฐ๋ฅผ GCD(15)๋ก ๋๋์ด ๊ฐ ํ์์๊ฒ 4๊ฐ์ ์ฐํ๊ณผ 3๊ฐ์ ์ง์ฐ๊ฐ๋ฅผ ์ ๊ณตํ๋ ๊ฒ์
๋๋ค.
|
132 |
+
- ์ด ๊ฒฝ์ฐ ์ด 48 + 3 = 51๊ฐ์ ๋ฌผํ์ ์ฌ์ฉํ์ฌ 15๋ช
์ ํ์๋ค์๊ฒ ๋๊ฐ์ด ๋๋ ์ค ์ ์์ต๋๋ค.
|
133 |
+
๊ฒฐ๋ก : ์ฐํ๊ณผ ์ง์ฐ๊ฐ ๊ฐ๊ฐ์ GCD(15)๋ก ๋๋์ด ๊ฐ ํ์์๊ฒ 4๊ฐ์ ์ฐํ๊ณผ 3๊ฐ์ ์ง์ฐ๊ฐ๋ฅผ ์ ๊ณตํ๋ ๋ฐฉ๋ฒ์ผ๋ก ์ต๋ 15๋ช
์ ํ์๋ค์๊ฒ ๋๊ฐ์ด ๋๋์ด ์ค ์ ์์ต๋๋ค.
|
134 |
+
</pre>
|
135 |
+
</details>
|
136 |
+
|
137 |
+
<details>
|
138 |
+
<summary><b>Writting[CLICK TO EXPAND]</b></summary>
|
139 |
+
<!-- <a href="https://huggingface.co/collections/mirlab/akallama-66338859b09221f3607fdfcd">
|
140 |
+
<img src="https://github.com/0110tpwls/project/blob/master/image (13).png?raw=true" width="80%"/>
|
141 |
+
</a> -->
|
142 |
+
|
143 |
+
<pre style="white-space: pre-wrap;">
|
144 |
+
User:
|
145 |
+
์๋ ๋ฌธ์ฅ์ ๋ณด๊ณ ๋ฌธ๋ฒ์ ์ผ๋ก ํ๋ฆฐ๋ถ๋ถ์ ์ง์ ํด์ค
|
146 |
+
๊ทธ๋ ์์ด์น ์๊ณ ๋นํํ๋ค.
|
147 |
+
Model:
|
148 |
+
์์ด์น"๋ผ๋ ๋จ์ด๋ "์์ด"์ด๋ผ๋ ๋์ฌ์ "์น"๋ผ๋ ์ ๋ฏธ์ฌ๋ฅผ ์ฌ์ฉํ์ฌ ๋ง๋ ๋จ์ด์
๋๋ค. ์ด ๋จ์ด๋ "์์ด"์ ์๋ฏธ๋ฅผ ๋ฐ๋ณต์ ์ผ๋ก ํํํ๋ ๋ฐ ์ฌ์ฉ๋๋ ์กฐ๋์ฌ์ ์ฉ๋ฒ์ผ๋ก, "์์ด"์ด๋ ๋์ฌ์ ๋ฐ๋ณต์ ์ธ ํ๋์ ๊ฐ์กฐํฉ๋๋ค. ๋ฐ๋ผ์ ๋ฌธ์ฅ์์ "์์ด์น"๋ผ๋ ๋จ์ด๋ ์ค๋ณต์ ์ผ๋ก ์ฌ์ฉ๋์ด ๋ฌธ๋ฒ์ ์ผ๋ก ํ๋ฆฐ ๋ถ๋ถ์
๋๋ค.
|
149 |
+
|
150 |
+
์ ๋ต์ ๋ค์๊ณผ ๊ฐ์ต๋๋ค:
|
151 |
+
๊ทธ๋ ์์ด์์ด ๋นํํ๋ค.
|
152 |
+
</pre>
|
153 |
+
|
154 |
+
<!-- <a href="https://huggingface.co/collections/mirlab/akallama-66338859b09221f3607fdfcd">
|
155 |
+
<img src="https://github.com/0110tpwls/project/blob/master/image (7).png?raw=true" width="80%"/>
|
156 |
+
</a> -->
|
157 |
+
|
158 |
+
<pre style="white-space: pre-wrap;">
|
159 |
+
User:
|
160 |
+
์ค๋์ฃผ ์์ธ์ "๋ณ ํค๋ ๋ฐค"์ ๋ณด๊ณ ๋น์ ์๋์์ ๊ณ ๋ คํด์ ํด์ํด๋ด. ๊ทธ๋ฆฌ๊ณ ์ด๋ฅผ ์๊ฐ์ ๋
๋ฐฑํ์์ผ๋ก ๋ค์ ์์ฑํด๋ด.
|
161 |
+
Model:
|
162 |
+
์ค๋์ฃผ๋ 1941๋
์ผ๋ณธ ์ ๊ตญ์ฃผ์ ์นํ์์ ์ด ์ํ์ ์ฐ๋ฉด์, ๊ฐ์์ ์ธ ์ํฉ ์์์๋ ์์ ์ ์ ์ฒด์ฑ๊ณผ ๋ฌธํ์ ๋ํ ์ ์ฐฉ์ ํํํ๊ณ ์ ํ์ต๋๋ค. ๊ทธ๋ ํ๋์ ๊ฐ๋ ์ฐฌ ๋ณ๋ค์ ๋ณด๋ฉฐ ๊ทธ ์์ ํ๋ ๋ ์๊ฒจ์ง๋ ๋ณ๋ค์ด ์๋ค๊ณ ํํํ์ต๋๋ค. ์ด๋ฌํ ๋ณ๋ค์ ๊ฐ๊ฐ ์ถ์ต, ์ฌ๋, ์ธ์ธํจ, ๋๊ฒฝ, ์ ๋ฑ ๋ค์ํ ๊ฐ์ ์ ๋ด๊ณ ์์ต๋๋ค.
|
163 |
+
์ค๋์ฃผ๋ ์ด ๏ฟฝ๏ฟฝ๋ค ์ค ํ๋์ ์ด๋จธ๋๋ฅผ ๋ถ๋ฅด๋ฉฐ ๋ฉ๋ฆฌ ๋จ์ด์ ธ ์๋ ๊ฒ์ ๊ทธ๋ฆฌ์ํ๋ฉฐ, ์์ ์ ์ด๋ฆ์๋ฅผ ์ธ๋ ์์ ์ฐ๊ณ ํ์ผ๋ก ๋ฎ์ด ๋ฒ๋ฆฐ๋ค๊ณ ๋งํฉ๋๋ค. ์ด๋ฌํ ํ๋์ ์์ ์ ์ ์ฒด์ฑ์ ํ์ธํ๊ณ , ์์ด๋ฒ๋ฆฐ ๊ฒ์ ๋์ฐพ์ผ๋ ค๋ ๋
ธ๋ ฅ์ผ๋ก ๋ณผ ์ ์์ต๋๋ค.
|
164 |
+
์ค๋์ฃผ๋ ๋ํ ์์ ์ด ์ฃฝ๊ณ ๋๋ฉด ๋ฌด๋ค ์์ ํ๋ ์๋๊ฐ ํผ์ด๋๋ฏ์ด ์์ ์ ์ด๋ฆ์ ๋ฌปํ ์ธ๋ ์์๋ ํ์ด ๋ฌด์ฑํ ๊ฒ์ด๋ผ๊ณ ๋งํฉ๋๋ค. ์ด๋ ๋น์์ ์ฌํ์๊ณผ ์๊ฐ์ ๋ด๋ฉด์ ๋ฐ์ํ๊ณ ์์ผ๋ฉฐ, ํ๊ตญ์ธ์ ์ ์ฒด์ฑ์ ํ์ธํ๊ณ ์์ด๋ฒ๋ฆฐ ๊ฒ์ ๋์ฐพ์ผ๋ ค๋ ๋
ธ๋ ฅ, ๊ทธ๋ฆฌ๊ณ ์ฃฝ์ ์ดํ์ ์์ ์ ์กด์ฌ๋ฅผ ๋จ๊ฒจ๋๊ณ ์ถ๋ค๋ ๋ฐ๋์ ๋ด๊ณ ์์ต๋๋ค.
|
165 |
+
์ด ์๋ฅผ ์๊ฐ์ ๋
๋ฐฑํ์์ผ๋ก ๋ค์ ์์ฑํ๋ฉด ๋ค์๊ณผ ๊ฐ์ต๋๋ค:
|
166 |
+
"์ ๋ ์ด ๋ณ๋ค์ด ๋๋ฌด๋ ๋ง์ ๊ฒ ๊ฐ์์. ํ๋ ๋ ์๊ฒจ์ง๋ ๋ณ๋ค ์ค์๋ ์ถ์ต๋ ์๊ณ , ์ฌ๋๋ ์๊ณ , ์ธ์ธํจ๊ณผ ๋๊ฒฝ๋ ์์ด์. ๊ทธ๋ฆฌ๊ณ ์ด๋จธ๋, ๋น์ ์ ๋ฉ๋ฆฌ ๋ถ๊ฐ๋์ ๊ณ์ ๊ฑด๊ฐ์? ์ ๋ ๋น์ ์ ๋ถ๋ฅด๊ณ ์ถ์ด์.
|
167 |
+
์ ์ ์ด๋ฆ์๋ ์ธ๋ ์์ ์ฐ๊ณ ํ์ผ๋ก ๋ฎ์ด ๋ฒ๋ ธ์ด์. ๋ด์ ๋ฐค์ ์์ ์ฐ๋ ๋ฒ๋ ์ฒ๋ผ ๋ถ๋๋ฌ์ด ์ด๋ฆ์ ์ฌํผํ๋ ์ด์ ๊ฐ ์์ด์. ํ์ง๋ง ๊ฒจ์ธ์ด ์ง๋๊ณ ๋์ ๋ณ์๋ ๋ด์ด ์ค๋ฉด, ๋ฌด๋ค ์์ ํ๋ ์๋๊ฐ ํผ์ด๋๋ฏ์ด ์ ์ ์ด๋ฆ์ ๋ฌปํ ์ธ๋ ์์์๋ ํ์ด ๋ฌด์ฑํ ๊ฑฐ์์.
|
168 |
+
์ ๋ ์ด ๋ณ๋ค์ ๋ณด๊ณ , ์์ ์ ์ ์ฒด์ฑ์ ํ์ธํ๊ณ ์ถ์ด์. ์์ด๋ฒ๋ฆฐ ๊ฒ์ ๋์ฐพ์ผ๋ ค๋ ๋
ธ๋ ฅ์ ํ๊ณ ์ถ์ด์. ์ฃฝ์ ์ดํ์๋ ์์ ์ ์กด์ฌ๋ฅผ ๋จ๊ฒจ๋๊ณ ์ถ์ด์."
|
169 |
+
</pre>
|
170 |
+
</details>
|
171 |
+
|
172 |
+
<details>
|
173 |
+
<summary><b>Logical Reasoning[CLICK TO EXPAND]</b></summary>
|
174 |
+
<!-- <a href="https://huggingface.co/collections/mirlab/akallama-66338859b09221f3607fdfcd">
|
175 |
+
<img src="https://github.com/0110tpwls/project/blob/master/image (15).png?raw=true" width="80%"/>
|
176 |
+
</a> -->
|
177 |
+
<pre style="white-space: pre-wrap;">
|
178 |
+
User:
|
179 |
+
์ธ์ข
๋์์ ๋งฅ๋ถํ๋ก ๋์ง ์ฌ๊ฑด์ ๋ํด ์ค๋ช
ํด์ค
|
180 |
+
Model:
|
181 |
+
์ธ์ข
๋์์ด ๋งฅ๋ถํ๋ก๋ฅผ ๋์ง ์ฌ๊ฑด์ ์ฌ์ค๋ฌด๊ทผํ ์ด์ผ๊ธฐ์
๋๋ค. ์ธ์ข
๋์์ ์กฐ์ ์ 4๋ ๊ตญ์์ผ๋ก 1418๋
๋ถํฐ 1450๋
๊น์ง ์ฌ์ํ์ต๋๋ค. ๊ทธ๋ ๊ณผํ, ๋ฌธํ, ์์ ๋ฑ ๋ค์ํ ๋ถ์ผ์ ๊ด์ฌ์ ๊ฐ๊ณ ์ฐ๊ตฌํ์์ผ๋ฉฐ, ํนํ ํ๊ธ ์ฐฝ์ ์ ์ฒ๋ฌธํ์ ํฐ ๊ณตํ์ ํ์ต๋๋ค.
|
182 |
+
ํํธ ๋งฅ๋ถํ๋ก๋ ์ ํ์ด ์์ฐํ๋ ๋
ธํธ๋ถ ์ปดํจํฐ๋ก, ์ธ์ข
๋์ ์๋์ ๊ธฐ์ ๋ก๋ ์กด์ฌํ ์ ์๋ ์ ํ์
๋๋ค. ๋ฐ๋ผ์ ์ธ์ข
๋์์ด ๋งฅ๋ถํ๋ก๋ฅผ ๋์ง ์ฌ๊ฑด์ ์ฌ์ค๊ณผ ๋ง์ง ์๋ ์ด์ผ๊ธฐ๊ฐ ๋ ๊ฒ์
๋๋ค.
|
183 |
+
</pre>
|
184 |
+
</details>
|
185 |
+
|
186 |
+
<details>
|
187 |
+
<summary><b>Coding [CLICK TO EXPAND]</b></summary>
|
188 |
+
<!-- <a href="https://huggingface.co/collections/mirlab/akallama-66338859b09221f3607fdfcd">
|
189 |
+
<img src="https://github.com/0110tpwls/project/blob/master/image (11).png?raw=true" width="80%"/>
|
190 |
+
</a> -->
|
191 |
+
|
192 |
+
<pre style="white-space: pre-wrap;">
|
193 |
+
User:
|
194 |
+
PyTorch๋ฅผ ์ฌ์ฉํด์ ๊ฐ๋จํ CNN์ ๊ตฌํํด๋ด.
|
195 |
+
Model:
|
196 |
+
๋ค, ๋ฌผ๋ก ์ด์์! PyTorch๋ฅผ ์ฌ์ฉํ์ฌ ๊ฐ๋จํ CNN์ ๊ตฌํํ๋ ์์ ์ฝ๋๋ฅผ ๋ณด์ฌ๋๋ฆด๊ฒ์.
|
197 |
+
์ฐ์ , ํ์ํ ๋ผ์ด๋ธ๋ฌ๋ฆฌ๋ฅผ ๊ฐ์ ธ์์ผ ํด์. PyTorch์ `torch`์ `nn` ๋ชจ๋์ด ํ์ํด์.
|
198 |
+
```python
|
199 |
+
import torch\nimport torch.nn as nn
|
200 |
+
```
|
201 |
+
๊ทธ ๋ค์, ๊ฐ๋จํ CNN ๋ชจ๋ธ์ ์ ์ํ ๊ฑฐ์์. ์ด ๊ฒฝ์ฐ์๋ 2๊ฐ์ ์ปจ๋ณผ๋ฃจ์
๋ ์ด์ด์ 3๊ฐ์ ํ๋ง ๋ ์ด์ด๋ก ๊ตฌ์ฑ๋ ๋ชจ๋ธ์ด์์.
|
202 |
+
```python
|
203 |
+
class SimpleCNN(nn.Module):
|
204 |
+
def __init__(self):
|
205 |
+
super(SimpleCNN, self).__init__()
|
206 |
+
self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
|
207 |
+
self.pool = nn.MaxPool2d(2, 2)
|
208 |
+
self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
|
209 |
+
|
210 |
+
def forward(self, x):
|
211 |
+
x = self.pool(torch.relu(self.conv1(x)))
|
212 |
+
x = self.pool(torch.relu(self.conv2(x)))
|
213 |
+
return x
|
214 |
+
```
|
215 |
+
์ด์ ๋ชจ๋ธ์ ์ด๊ธฐํํ๊ณ , ์
๋ ฅ ๋ฐ์ดํฐ๋ฅผ ์ค๋นํด์ผ ํด์. ์ฌ๊ธฐ์๋ ์์์ 28x28 ํฌ๊ธฐ์ ์ด๋ฏธ์ง ๋ฐ์ดํฐ๋ฅผ ์์ฑํ์ด์.
|
216 |
+
```python
|
217 |
+
model = SimpleCNN()
|
218 |
+
input_data = torch.randn(1, 1, 28, 28)
|
219 |
+
```
|
220 |
+
๋ง์ง๋ง์ผ๋ก, ๋ชจ๋ธ์ ์
๋ ฅ ๋ฐ์ดํฐ๋ฅผ ๋ฃ์ด ์ถ๋ ฅ์ ํ์ธํ ์ ์์ด์.
|
221 |
+
```python
|
222 |
+
output = model(input_data)
|
223 |
+
print(output.shape)
|
224 |
+
```
|
225 |
+
์ด๋ ๊ฒ ํ๋ฉด ๊ฐ๋จํ CNN ๋ชจ๋ธ์ PyTorch๋ฅผ ์ฌ์ฉํด ๊ตฌํํ๊ณ , ์
๋ ฅ ๋ฐ์ดํฐ๋ฅผ ํตํด ์ถ๋ ฅ์ ํ์ธํ ์ ์์ด์. ๋ฌผ๋ก ์ค์ ๋ก๋ ๋ ๋ง์ ๋ ์ด์ด๋ ํ๋ผ๋ฏธํฐ ํ๋ ๋ฑ์ด ํ์ํ์ง๋ง, ์ด ์ฝ๋๋ ๊ธฐ๋ณธ์ ์ธ CNN ๊ตฌ์กฐ๋ฅผ ๋ณด์ฌ์ฃผ๋ ์์ ์ผ ๋ฟ์ด์์. ๋ ๊ถ๊ธํ ์ ์ด ์์ผ๋ฉด ์ธ์ ๋ ์ง ๋ฌผ์ด๋ด ์ฃผ์ธ์!
|
226 |
+
</pre>
|
227 |
+
</details>
|
228 |
+
|
229 |
+
You can find more examples at [our project page](https://yonsei-mir.github.io/AkaLLaMA-page)
|
230 |
+
|
231 |
+
|
232 |
+
|
233 |
+
## Warning
|
234 |
+
|
235 |
+
Although AKALlama-70B has significant potential, its responses can sometimes be inaccurate, biased, or misaligned, presenting risks if used without additional testing and refinement. Furthermore, the quality of the model's output is greatly influenced by the system prompt and decoding strategy. Changes in these areas could result in less precise outputs. Therefore, we strongly recommend handling our model with considerable caution.
|
236 |
+
|
237 |
+
## Citation
|
238 |
+
If you find our model useful in your work, please cite the model as follows:
|
239 |
+
|
240 |
+
@misc{akallama,
|
241 |
+
author = {Chung, Jiwan and Jeon, Jaehyun and Kim, Saejin and Lim, Seungwon and Oh, Giyeong and Son, Yejin and Yu, Youngjae},
|
242 |
+
title = {AkaLlama: Yonsei University Large Language Model Project},
|
243 |
+
year = {2024},
|
244 |
+
publisher = {Hugging Face},
|
245 |
+
journal = {Hugging Face repository},
|
246 |
+
howpublished = {\url{https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1}},
|
247 |
+
}
|
248 |
+
|
249 |
+
## Contact
|
250 |
+
We look forward for your feedback and welcome collaboration on this exciting project!
|
251 |
+
|
252 |
+
### Contributors
|
253 |
+
- [YoungJaeYu](https://yj-yu.github.io/home/)
|
254 |
+
- [Yonsei MIRLab](https://mirlab.yonsei.ac.kr/)
|
255 |
+
|
256 |
+
## Special Thanks
|
257 |
+
|
258 |
+
- Data Center of the Department of Artificial Intelligence at Yonsei University for the computation resources
|
259 |
+
## Acknowledgement
|
260 |
+
|
261 |
+
- Title image generated by DALLยทE 3
|