---
language:
- de
pipeline_tag: text-generation
library_name: transformers
tags:
- bloom
- LLM
inference: false
widget:
- text: "TODO"
---

# IGEL: Instruction-tuned German large Language Model for Text 



IGEL is an LLM family developed for German. The first version of IGEL is built on top of **[BigScience BLOOM](https://bigscience.huggingface.co/blog/bloom)**, adapted to **[German by Malte Ostendorff](https://huggingface.co/malteos/bloom-6b4-clp-german)**. IGEL is designed to provide accurate and reliable language understanding capabilities for a wide range of natural language understanding tasks, including sentiment analysis, language translation, and question answering.

### **You can try out the model at [igel-playground](https://huggingface.co/spaces/philschmid/igel-playground).**

The IGEL family currently includes `instruct-igel-001` and `chat-igel-001` _(coming soon)_.


## Model Description

`instruct-igel-001` is a LoRA-tuned [BLOOM-CLP German (6.4B parameters)](https://huggingface.co/malteos/bloom-6b4-clp-german) with merged weights. The `001` version was designed as a naive test to determine whether it is possible to create a German instruction-tuned model using a small, undertrained LLM and a naively translated dataset. The goal of this test was to explore the potential of the BLOOM architecture for language modeling tasks that require instruction-based responses.

To achieve this goal, we took a pre-trained LLM with limited training and fine-tuned it on a dataset of naive translations of instruction-based content. The dataset was created by taking instructions in English and translating them into German with an automated translation tool. While this approach may introduce errors in the translated content, we wanted to test whether the model could still learn to generate instruction-based responses in a variety of languages.
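For readers unfamiliar with the term, "merged weights" in a LoRA setting usually means the low-rank update has been folded back into the base weight matrices, so the model loads like an ordinary checkpoint. A sketch of the standard LoRA formulation follows; the rank $r$ and scaling factor $\alpha$ used for `instruct-igel-001` are not stated in this card, so the symbols below are generic.

$$
W_{\text{merged}} = W_0 + \frac{\alpha}{r}\, B A,
\qquad W_0 \in \mathbb{R}^{d \times k},\quad B \in \mathbb{R}^{d \times r},\quad A \in \mathbb{R}^{r \times k},\quad r \ll \min(d, k)
$$

Here $W_0$ is a frozen base weight matrix and only $A$ and $B$ are trained. After merging, inference needs no adapter library and adds no extra latency.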

## Training data

`instruct-igel-001` is trained on naively translated instruction datasets with little post-processing.

### Known limitations

`instruct-igel-001` exhibits several deficiencies common to language models, including hallucination, toxicity, and stereotypes.

For example, in the following figure, `instruct-igel-001` wrongly says that the chancellor of Germany is Angela Merkel.

![chancellor](./assets/cancelor.png)


### Training procedure

_coming soon_

## How to use

You can test the model in the [igel-playground](https://huggingface.co/spaces/philschmid/igel-playground).

_coming soon_
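Until official usage instructions are published, the following is a minimal sketch of loading the merged model with `transformers`, the library named in this card's metadata. Several things here are assumptions rather than documented facts: the repository id `philschmid/instruct-igel-001`, the sampling defaults, and the use of a plain German prompt (this card does not document a prompt template). It also assumes PyTorch is installed and that enough memory is available for a 6.4B-parameter model.

```python
# Assumed repository id; this card does not state where the weights are hosted.
MODEL_ID = "philschmid/instruct-igel-001"


def generation_kwargs(max_new_tokens=128, temperature=0.7, top_p=0.9):
    """Return sampling settings for `model.generate` (illustrative defaults)."""
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": temperature,
        "top_p": top_p,
    }


def generate(prompt, model_id=MODEL_ID, **overrides):
    """Load the model and return the generated text for `prompt`."""
    # Imported lazily so the helper above can be used without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Tokenize to PyTorch tensors, sample a continuation, and decode it.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, **generation_kwargs(**overrides))
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `generate("Was ist die Hauptstadt von Deutschland?", max_new_tokens=64)` downloads the weights on first use. Because the weights are merged, no adapter library is needed to load them.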