---
language:
- en
- ko
license: llama3
library_name: transformers
base_model:
- meta-llama/Meta-Llama-3-8B
---

![](https://cdn.discordapp.com/attachments/791342238541152306/1264099835221381251/image.png?ex=669ca436&is=669b52b6&hm=129f56187c31e1ed22cbd1bcdbc677a2baeea5090761d2f1a458c8b1ec7cca4b&)

# QuantFactory/llama-3-Korean-Bllossom-8B-GGUF
This is a quantized version of [MLP-KTLim/llama-3-Korean-Bllossom-8B](https://huggingface.co/MLP-KTLim/llama-3-Korean-Bllossom-8B), created with llama.cpp.
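
Because the files in this repo are GGUF quants, they can be run directly with llama.cpp or its Python bindings. Below is a minimal sketch using `llama-cpp-python`; the `Q4_K_M` filename pattern is an assumption, so substitute whichever quant file you actually downloaded from this repo:

```python
# pip install llama-cpp-python huggingface-hub
from llama_cpp import Llama

# Download and load a quant from this repo; the Q4_K_M glob below is an
# assumption -- replace it with the filename of the quant you want.
llm = Llama.from_pretrained(
    repo_id="QuantFactory/llama-3-Korean-Bllossom-8B-GGUF",
    filename="*Q4_K_M.gguf",
    n_ctx=4096,
)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "μ„œμšΈμ˜ 유λͺ…ν•œ κ΄€κ΄‘ μ½”μŠ€λ₯Ό λ§Œλ“€μ–΄μ€„λž˜?"},
    ],
    max_tokens=512,
)
print(out["choices"][0]["message"]["content"])
```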

# Original Model Card

<a href="https://github.com/MLP-Lab/Bllossom">
  <img src="https://github.com/teddysum/bllossom/blob/main//bllossom_icon.png?raw=true" width="40%" height="50%">
</a>

# Update!
* [2024.06.18] Updated to the Bllossom ELO model, with the pre-training data expanded to **250GB**. Vocabulary expansion was not applied this time; if you want to use the earlier vocabulary-expanded long-context model, please contact us directly!
* [2024.06.18] The Bllossom ELO model is newly trained with our in-house ELO pre-training method. On the [LogicKor](https://github.com/StableFluffy/LogicKor) benchmark, it achieved the SOTA score among existing Korean models under 10B parameters.

LogicKor benchmark results:
| Model | Math | Reasoning | Writing | Coding | Understanding | Grammar | Single ALL | Multi ALL | Overall |
|:---------:|:-----:|:------:|:-----:|:-----:|:----:|:-----:|:-----:|:-----:|:----:|
| gpt-3.5-turbo-0125 | 7.14 | 7.71 | 8.28 | 5.85 | 9.71 | 6.28 | 7.50 | 7.95 | 7.72 |
| gemini-1.5-pro-preview-0215 | 8.00 | 7.85 | 8.14 | 7.71 | 8.42 | 7.28 | 7.90 | 6.26 | 7.08 |
| llama-3-Korean-Bllossom-8B | 5.43 | 8.29 | 9.00 | 4.43 | 7.57 | 6.86 | 6.93 | 6.93 | 6.93 |

# Bllossom | [Demo]() | [Homepage](https://www.bllossom.ai/) | [Github](https://github.com/MLP-Lab/Bllossom)

<!-- [Colab code example for GPU](https://colab.research.google.com/drive/1fBOzUVZ6NRKk_ugeoTbAOokWKqSN47IG?usp=sharing) | -->
<!-- [Colab code example for the quantized model on CPU](https://colab.research.google.com/drive/129ZNVg5R2NPghUEFHKF0BRdxsZxinQcJ?usp=drive_link) -->

> Our Bllossom team has released Bllossom, a Korean-English bilingual language model!
> With the support of the Seoultech supercomputing center, the entire model was fully fine-tuned on more than 100GB of Korean data, making it a Korean-enhanced bilingual model!
> Looking for a model that is good at Korean?
> - A first for Korean: vocabulary expanded with more than 30,000 Korean tokens
> - Handles Korean contexts roughly 25% longer than Llama 3
> - Korean-English knowledge linking via a Korean-English parallel corpus (pre-training)
> - Fine-tuning on data crafted by linguists with Korean culture and language in mind
> - Reinforcement learning
>
> All of this comes together in one model, and commercial use is permitted; use Bllossom to build your own model! It can even be trained on a free Colab GPU, or run on CPU with the [quantized model](https://huggingface.co/MLP-KTLim/llama-3-Korean-Bllossom-8B-4bit).
>
> 1. Bllossom-8B is a practically oriented language model built in collaboration with linguists from Seoultech, Teddysum, and the Yonsei University language-resources lab! We will keep maintaining it through continuous updates, so please make good use of it πŸ™‚
> 2. We also have the far stronger Advanced-Bllossom 8B and 70B models, as well as vision-language models! (Contact us individually if you are curious!!)
> 3. Bllossom was accepted at NAACL 2024 and LREC-COLING 2024 (oral).
> 4. We will keep releasing better language models!! Anyone who wants to collaborate (especially on papers) to strengthen Korean support is always welcome!! Teams that can lend even a few GPUs are especially welcome to reach out; we will help you build what you want.
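
The announcement above points to a pre-quantized 4-bit checkpoint for CPU use. If you would rather quantize the full-precision checkpoint on the fly, a minimal sketch using transformers' `BitsAndBytesConfig` (assuming `bitsandbytes` is installed and a CUDA GPU is available) could look like this:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "MLP-KTLim/llama-3-Korean-Bllossom-8B"

# Quantize the weights to 4-bit NF4 at load time; compute runs in bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```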

The Bllossom language model is a Korean-English bilingual language model based on the open-source Llama 3. It strengthens the connection between Korean and English knowledge, and it has the following features:

* **Knowledge Linking**: Linking Korean and English knowledge through additional training
* **Vocabulary Expansion**: Expanding the Korean vocabulary to enhance Korean expressiveness (see the tokenizer sketch after this list)
* **Instruction Tuning**: Tuning with custom-made instruction-following data specialized for the Korean language and Korean culture
* **Human Feedback**: DPO has been applied
* **Vision-Language Alignment**: Aligning a vision transformer with this language model

**This model was developed by [MLPLab at Seoultech](http://mlp.seoultech.ac.kr), [Teddysum](http://teddysum.ai/) and [Yonsei Univ](https://sites.google.com/view/hansaemkim/hansaem-kim).**
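
As a rough way to see what vocabulary expansion means in practice, the sketch below counts how many tokens each tokenizer needs for the same Korean sentence. This is only an illustration: the gated meta-llama repo requires an accepted license and access token, and since the 2024.06.18 release reverted the vocabulary expansion, the counts may match on current checkpoints.

```python
from transformers import AutoTokenizer

text = "μ„œμšΈμ˜ 유λͺ…ν•œ κ΄€κ΄‘ μ½”μŠ€λ₯Ό λ§Œλ“€μ–΄μ€„λž˜?"

# Fewer tokens for the same text means the vocabulary encodes Korean more densely.
for model_id in ("meta-llama/Meta-Llama-3-8B", "MLP-KTLim/llama-3-Korean-Bllossom-8B"):
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    n_tokens = len(tokenizer(text)["input_ids"])
    print(f"{model_id}: {n_tokens} tokens")
```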

## Demo Video

<div style="display: flex; justify-content: space-between;">
  <!-- first column -->
  <div style="width: 49%;">
    <a>
      <img src="https://github.com/lhsstn/lhsstn/blob/main/x-llava_dem.gif?raw=true" style="width: 100%; height: auto;">
    </a>
    <p style="text-align: center;">Bllossom-V Demo</p>
  </div>

  <!-- second column (if needed) -->
  <div style="width: 49%;">
    <a>
      <img src="https://github.com/lhsstn/lhsstn/blob/main/bllossom_demo_kakao.gif?raw=true" style="width: 70%; height: auto;">
    </a>
    <p style="text-align: center;">Bllossom Demo (Kakao)</p>
  </div>
</div>

# NEWS
* [2024.06.18] We have reverted to the non-vocab-expansion model. However, we have significantly increased the amount of pre-training data, to 250GB.
* [2024.05.08] Vocab expansion model update
* [2024.04.25] We released Bllossom v2.0, based on Llama 3.

## Example code

### Colab Tutorial
- [Inference-Code-Link](https://colab.research.google.com/drive/1fBOzUVZ6NRKk_ugeoTbAOokWKqSN47IG?usp=sharing)

### Install Dependencies
```bash
pip install torch transformers==4.40.0 accelerate
```

### Python code with Pipeline
```python
import transformers
import torch

model_id = "MLP-KTLim/llama-3-Korean-Bllossom-8B"

# Load the model in bfloat16 and place it automatically on available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

pipeline.model.eval()

PROMPT = '''You are a helpful AI assistant. Please answer the user's questions kindly. 당신은 유λŠ₯ν•œ AI μ–΄μ‹œμŠ€ν„΄νŠΈ μž…λ‹ˆλ‹€. μ‚¬μš©μžμ˜ μ§ˆλ¬Έμ— λŒ€ν•΄ μΉœμ ˆν•˜κ²Œ λ‹΅λ³€ν•΄μ£Όμ„Έμš”.'''
instruction = "μ„œμšΈμ˜ 유λͺ…ν•œ κ΄€κ΄‘ μ½”μŠ€λ₯Ό λ§Œλ“€μ–΄μ€„λž˜?"

messages = [
    {"role": "system", "content": f"{PROMPT}"},
    {"role": "user", "content": f"{instruction}"}
]

# Render the chat messages into a single prompt string with the Llama 3 template.
prompt = pipeline.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Stop on either the default EOS token or Llama 3's <|eot_id|> turn delimiter.
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = pipeline(
    prompt,
    max_new_tokens=2048,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9
)

# Print only the newly generated text, dropping the echoed prompt.
print(outputs[0]["generated_text"][len(prompt):])
```
```
# Of course! Seoul is a city that combines diverse culture, history, and nature, and it boasts many tourist attractions. Let me introduce some famous sightseeing courses in Seoul.

### Course 1: History and Culture

1. **Gyeongbokgung Palace**
   - Seoul's most iconic royal palace, where you can experience the history and culture of the Joseon dynasty.

2. **Bukchon Hanok Village**
   - A village of well-preserved traditional hanok houses, where you can get a feel for everyday life in the Joseon era.

3. **Insadong**
   - A street where traditional culture and contemporary art coexist, lined with galleries and traditional restaurants.

4. **Cheonggyecheon**
   - A stream in the heart of Seoul, a good place for jogging and walking.

### Course 2: Nature and Shopping

1. **Namsan Seoul Tower**
   - Offers a panoramic view of Seoul; watching the sunset in the evening is especially rewarding.

2. **Myeongdong**
   - An area packed with shops and restaurants, where you can browse many brands and sample traditional food.

3. **Hangang Park**
   - One of Seoul's major parks, where you can enjoy jogging, cycling, and outings.

4. **Hongdae**
   - A favorite area among young people, full of cafes, restaurants, and clubs.

### Course 3: Harmony of the Modern and the Traditional

1. **Dongdaemun Design Plaza (DDP)**
   - A modern architectural landmark that hosts a variety of exhibitions and events.

2. **Itaewon**
   - An area with international food and cafes, where you can experience many different cultures.

3. **Gwanghwamun**
   - A square in the center of Seoul where various performances and events take place.

4. **Seoul Land**
   - A theme park on the outskirts of Seoul, popular with families.

These courses are put together so you can experience Seoul's many sides. Adjust the time spent on each course and choose stops that match your interests. Have a great trip!
```

### Python code with AutoModel
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = 'MLP-KTLim/llama-3-Korean-Bllossom-8B'

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Load the model in bfloat16 and place it automatically on available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

model.eval()

PROMPT = '''You are a helpful AI assistant. Please answer the user's questions kindly. 당신은 유λŠ₯ν•œ AI μ–΄μ‹œμŠ€ν„΄νŠΈ μž…λ‹ˆλ‹€. μ‚¬μš©μžμ˜ μ§ˆλ¬Έμ— λŒ€ν•΄ μΉœμ ˆν•˜κ²Œ λ‹΅λ³€ν•΄μ£Όμ„Έμš”.'''
instruction = "μ„œμšΈμ˜ 유λͺ…ν•œ κ΄€κ΄‘ μ½”μŠ€λ₯Ό λ§Œλ“€μ–΄μ€„λž˜?"

messages = [
    {"role": "system", "content": f"{PROMPT}"},
    {"role": "user", "content": f"{instruction}"}
]

# Tokenize the chat directly into input IDs on the model's device.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Stop on either the default EOS token or Llama 3's <|eot_id|> turn delimiter.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=2048,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9
)

# Decode only the newly generated tokens, dropping the echoed prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

This produces the same sample output as the pipeline example above.
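
If you want the answer printed token by token instead of all at once, transformers' `TextStreamer` can be plugged into `model.generate`. A minimal sketch, assuming the `tokenizer`, `model`, `input_ids`, and `terminators` from the AutoModel example above are in scope:

```python
from transformers import TextStreamer

# Print decoded text to stdout as it is generated; skip the echoed prompt
# and special tokens such as <|eot_id|>.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

_ = model.generate(
    input_ids,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
    streamer=streamer,
)
```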

## Citation
**Language Model**
```text
@misc{bllossom,
  author = {ChangSu Choi and Yongbin Jeong and Seoyoon Park and InHo Won and HyeonSeok Lim and SangMin Kim and Yejee Kang and Chanhyuk Yoon and Jaewan Park and Yiseul Lee and HyeJin Lee and Younggyun Hahm and Hansaem Kim and KyungTae Lim},
  title = {Optimizing Language Augmentation for Multilingual Large Language Models: A Case Study on Korean},
  year = {2024},
  journal = {LREC-COLING 2024},
  paperLink = {\url{https://arxiv.org/pdf/2403.10882}}
}
```

**Vision-Language Model**
```text
@misc{bllossom-V,
  author = {Dongjae Shin and Hyunseok Lim and Inho Won and Changsu Choi and Minjun Kim and Seungwoo Song and Hangyeol Yoo and Sangmin Kim and Kyungtae Lim},
  title = {X-LLaVA: Optimizing Bilingual Large Vision-Language Alignment},
  year = {2024},
  publisher = {GitHub},
  journal = {NAACL 2024 findings},
  paperLink = {\url{https://arxiv.org/pdf/2403.11399}}
}
```

## Contact
- μž„κ²½νƒœ (KyungTae Lim), Professor at Seoultech. `ktlim@seoultech.ac.kr`
- ν•¨μ˜κ· (Younggyun Hahm), CEO of Teddysum. `hahmyg@teddysum.ai`
- κΉ€ν•œμƒ˜ (Hansaem Kim), Professor at Yonsei. `khss@yonsei.ac.kr`

## Contributor
- 졜창수 (Chansu Choi), `choics2623@seoultech.ac.kr`
- 김상민 (Sangmin Kim), `sangmin9708@naver.com`
- μ›μΈν˜Έ (Inho Won), `wih1226@seoultech.ac.kr`
- κΉ€λ―Όμ€€ (Minjun Kim), `mjkmain@seoultech.ac.kr`
- μ†‘μŠΉμš° (Seungwoo Song), `sswoo@seoultech.ac.kr`
- μ‹ λ™μž¬ (Dongjae Shin), `dylan1998@seoultech.ac.kr`
- μž„ν˜„μ„ (Hyeonseok Lim), `gustjrantk@seoultech.ac.kr`
- μœ‘μ •ν›ˆ (Jeonghun Yuk), `usually670@gmail.com`
- μœ ν•œκ²° (Hangyeol Yoo), `21102372@seoultech.ac.kr`
- μ†‘μ„œν˜„ (Seohyun Song), `alexalex225225@gmail.com`