---
library_name: transformers
license: apache-2.0
datasets:
- openchat/openchat_sharegpt4_dataset
---
# Model overview
This is a finetune of the newly released [mistral-community/Mixtral-8x22B-v0.1](https://huggingface.co/mistral-community/Mixtral-8x22B-v0.1) base model.
As the base model has not explicitly been trained to chat, we trained this model on a multilingual chat dataset so that the LLM community can use this model for conversations.
The model's accuracy is surprisingly high, and it has a decently fast inference speed (roughly 40 tokens/s single-batch in our tests), so we believe it will be useful to the community.
# How to use
We have tested (and thus recommend) running this model on vLLM, served through the vLLM OpenAI-compatible server using the following commands:
```bash
pip install vllm
python -m vllm.entrypoints.openai.api_server --model lightblue/Karasu-Mixtral-8x22B-v0.1 --tensor-parallel-size 4 --gpu-memory-utilization 0.95 --max-model-len 1024
```
which is how we ran it on a 4 x A100 (80GB) machine.
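As a rough sanity check on why four 80GB GPUs and a short context window are used here, the sketch below estimates the memory taken by the weights alone. It assumes roughly 141 billion total parameters for Mixtral-8x22B and 16-bit weights (2 bytes per parameter). Neither figure is stated in this card, and the function name is purely illustrative.

```python
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed just to hold the model weights, in GB (1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


# Assumption (not from this card): ~141B total parameters, stored in 16-bit precision.
ASSUMED_PARAMS = 141e9
weights_gb = weight_memory_gb(ASSUMED_PARAMS, 2)

# 4 x A100 (80GB), with vLLM allowed to use 95% of each GPU.
usable_gb = 4 * 80 * 0.95

# Whatever is left over has to hold the KV cache and activations.
headroom_gb = usable_gb - weights_gb

print(f"weights ~{weights_gb:.0f} GB, usable ~{usable_gb:.0f} GB, headroom ~{headroom_gb:.0f} GB")
```

Under these assumptions only a few tens of GB remain for the KV cache, which would be consistent with the small `--max-model-len 1024` in the command above.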
You can then call this model from Python by installing the openai package:
```bash
pip install openai
```
and calling the model like so:
```python
from openai import OpenAI

# The vLLM server does not check the API key, but the client requires one.
vllm_client = OpenAI(
    api_key="EMPTY",
    base_url="http://localhost:8000/v1",
)

prompt = "Tell me three cool facts about fish to amaze my 4 year old."

response = vllm_client.chat.completions.create(
    messages=[
        {"role": "user", "content": prompt},
    ],
    # Must match the value passed to --model when starting the server.
    model="lightblue/Karasu-Mixtral-8x22B-v0.1",
    temperature=0,
    max_tokens=900,
)

print("### Prompt")
print(prompt)
print()
print("### Response")
print(response.choices[0].message.content)
print()
# ### Prompt
# Tell me three cool facts about fish to amaze my 4 year old.
# ### Response
# 1. Some fish can change their color! Just like a chameleon, some fish can change their color to blend in with their surroundings or to communicate with other fish. For example, the flounder can change its color to match the ocean floor, making it harder for predators to spot them.
# 2. Fish can sleep with their eyes open! Unlike humans, fish don't have eyelids, so they can't close their eyes to sleep. Instead, they rest in a special way, like floating near the surface or hiding in a safe spot, while their body slows down and they take a nap.
# 3. Fish can talk to each other! Well, not exactly like humans, but they do make sounds and communicate with each other using special organs called swim bladders. They can make clicking, grunting, or even singing sounds to find mates, warn others about danger, or mark their territory.
```
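To check the throughput figure quoted in the overview on your own hardware, a small helper along these lines may be useful. This is a minimal sketch: the function names are hypothetical, and the example numbers are made up to match the roughly 40 tokens/s quoted above.

```python
import time


def tokens_per_second(completion_tokens: int, elapsed_seconds: float) -> float:
    """Generation throughput for a single request."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return completion_tokens / elapsed_seconds


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Example with made-up numbers: 900 tokens generated in 22.5 seconds.
print(tokens_per_second(900, 22.5))
```

With the client shown above, one could wrap `vllm_client.chat.completions.create` in `timed(...)` and pass `response.usage.completion_tokens` as the token count. Note that the elapsed time measured this way also includes prompt processing and network overhead.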
We will be uploading a 4-bit AWQ model soon to make it easier to run this model on other machines (watch this space!).
# Inference examples
From qualitative testing, the model seems pretty smart, especially in English, and has very good recall of facts. It can still get confused with some logical questions, but has also passed a lot of the logical questions I have thrown at it that other open source LLMs often fail.
### Creative prompts
```text
### Prompt
Write a hilarious 200 word story about Chimpanzees at the zoo
### Response
Once upon a time in the bustling metropolis of Chimp City, there lived a group of mischievous chimpanzees who called the local zoo their home. These primates were known for their witty antics and their uncanny ability to make even the most serious of humans chuckle.
One sunny afternoon, the zookeeper, a stern fellow named Mr. Grumpkins, was making his rounds when he noticed something peculiar. The chimpanzees were huddled together, whispering and giggling amongst themselves. Curious, Mr. Grumpkins approached the enclosure, only to be met with a barrage of banana peels and monkey business.
It turned out that the chimps had been secretly practicing their stand-up comedy routines, and they were ready to debut their act to the unsuspecting zoo visitors. The first chimp, named Chuckles, took the stage and began his routine. "Why did the banana go to the doctor? Because it wasn't peeling well!" The crowd erupted in laughter, and Mr. Grumpkins couldn't help but crack a smile.
Next up was Giggles, who delivered a knock-knock joke that had the audience in stitches. "Knock, knock!" "Who's there?" "Banana." "Banana who?" "Banana split! Get it? Because I'm a chimp and I love bananas!" The crowd roared with laughter, and even Mr. Grumpkins found himself chuckling.
As the show continued, the chimps' comedic talents only grew, with each joke more hilarious than the last. The zoo visitors were thoroughly entertained, and even Mr. Grumpkins had to admit that the chimps had a knack for making people laugh.
In the end, the chimpanzees' stand-up comedy show became a hit, with visitors flocking to the zoo just to catch a glimpse of the hilarious primates. Mr. Grumpkins, now a reformed man, even joined in on the fun, sharing his own jokes with the chimps and the crowd. And so, the legend of the Chimp City Comedy Club was born, bringing laughter and joy to all who visited the zoo.
```
```text
### Prompt
Write 10 jokes that I can use at my boss' retirement party. Nothing too risque as that will get me fired.
### Response
1. As we bid farewell to our dear boss, let's remember that retirement is like a never-ending vacation, except you don't have to worry about your boss calling you in the middle of it!
2. Our boss is retiring, and we're all wondering what he'll do with all that free time. I hear he's planning to take up a new hobby: managing his own retirement!
3. In honor of our boss's retirement, we've decided to give him a special gift: a lifetime supply of coffee, so he can finally get some rest!
4. Our boss has been a great leader, always guiding us through the ups and downs of work life. Now, he's retiring to focus on the ups and downs of his golf game!
5. As our boss embarks on his retirement journey, we'd like to remind him that he's not just leaving a job, he's leaving a legacy of hard work, dedication, and a whole lot of paperwork!
6. Our boss is retiring, and we're all wondering what he'll do with his time. I hear he's planning to take up a new hobby: managing his own retirement!
7. In honor of our boss's retirement, we've decided to give him a special gift: a lifetime supply of coffee, so he can finally get some rest!
8. Our boss has been a great leader, always guiding us through the ups and downs of work life. Now, he's retiring to focus on the ups and downs of his golf game!
9. As our boss embarks on his retirement journey, we'd like to remind him that he's not just leaving a job, he's leaving a legacy of hard work, dedication, and a whole lot of paperwork!
10. Our boss is retiring, and we're all wondering what he'll do with his time. I hear he's planning to take up a new hobby: managing his own retirement!
```
### Factual prompts
```text
### Prompt
Tell me about Strathaven, Scotland.
### Response
Strathaven is a charming town located in South Lanarkshire, Scotland. It is situated approximately 20 miles (32 kilometers) south of Glasgow and 15 miles (24 kilometers) east of East Kilbride. The town has a rich history and is known for its picturesque setting, historic buildings, and annual events.
Strathaven's origins date back to the 12th century when it was established as a market town. The town's name comes from the Gaelic word "Strath" meaning "valley" and "Aven" meaning "river," referring to the River Avon that runs through the area.
One of the most notable landmarks in Strathaven is the Strathaven Castle, also known as Avondale Castle. The castle dates back to the 14th century and was once an important stronghold for the Barons of Avondale. Although it is now in ruins, it remains a popular attraction for visitors who can explore its remains and learn about its history.
The town center features several historic buildings, including the Town Mill, which dates back to the 17th century and has been restored as a museum showcasing local history and crafts. The John Hastie Museum Trust also operates a small museum in Strathaven dedicated to preserving local heritage.
Strathaven hosts several annual events that draw visitors from near and far. One of the most famous events is the Strathaven Balloon Festival, which takes place every August and features hot air balloons from around the world taking off from John Hastie Park. Other events include the Strathaven Gala Day in June, which includes a parade, live music, and various stalls selling local produce and crafts.
The surrounding countryside offers opportunities for outdoor activities such as walking, cycling, fishing, and golfing at nearby courses like Strathaven Golf Club or Dalserf Golf Club. The nearby Clyde Valley provides scenic routes for exploring by car or bike with numerous fruit farms offering seasonal produce like strawberries and raspberries during summer months.
```
```text
### Prompt
How many people live in Gweru, Zimbabwe?
### Response
As of my last update in 2021, the population of Gweru, Zimbabwe was estimated to be around 148,000. However, please note that population figures can change over time due to various factors such as migration, birth rates, and death rates. For the most accurate and up-to-date information on the population of Gweru, it is recommended to consult official sources or recent census data from Zimbabwe's government or statistical agencies.
```
```text
### Prompt
Give me 5 great bars/breweries to buy craft beer in Auckland
### Response
As an AI language model, I cannot provide real-time information or personal recommendations. However, as of my knowledge cutoff date in September 2021, here are five popular craft beer bars and breweries in Auckland that you might want to check out:
1. Brothers Beer - City Works Depot: Located in the trendy City Works Depot area, Brothers Beer offers a wide selection of craft beers on tap and in bottles. They also have a bottle shop where you can purchase beers to take home.
2. Galbraith's Alehouse: A classic Auckland institution, Galbraith's is known for its traditional English-style ales and ciders. They also have a great food menu to accompany your beer experience.
3. Urbanaut Brewing Co.: This brewery has a taproom where you can sample their range of craft beers, including IPAs, lagers, and stouts. They also offer growler fills for takeaway purchases.
4. The Lumsden Freehouse: With over 100 different beers on tap and in bottles, The Lumsden Freehouse is a must-visit for any craft beer enthusiast. They also have a great selection of local and international wines and spirits.
5. Hallertau Brewery & Restaurant: Located in Riverhead, just outside of Auckland city center, Hallertau offers a range of their own craft beers alongside delicious food options in their restaurant setting.
Please note that this information may not be up-to-date or accurate as it was last updated in September 2021
```
```text
### Prompt
What is the boiling point of Cadmium?
### Response
The boiling point of cadmium is approximately 765°C (1,409°F) at standard atmospheric pressure. Cadmium is a soft, bluish-white metal that is highly toxic and has been used in various industrial applications, such as in batteries, coatings, and alloys. However, due to its toxicity and environmental concerns, its use has been significantly reduced or banned in many countries.
```
### Logical questions
```text
### Prompt
If I put a plate on a banana in the kitchen then carry the plate into the living room, where is the banana?
### Response
The banana is in the living room, as it is on the plate that you carried from the kitchen to the living room.
```
FAIL! The banana would still be in the kitchen, as I put the plate on the banana.
```text
### Prompt
What location is equidistant to Australia, the USA, China, Uganda, and Jamaica?
### Response
To find a location that is equidistant to Australia, the USA, China, Uganda, and Jamaica, we can use a tool called the "Great Circle Mapper" (GCM) or similar online tools. The GCM calculates the great circle distance between two points on Earth's surface.
After using the GCM to calculate the distances between each pair of countries and finding their midpoints, I found that there is no single location that is equidistant to all five countries. However, there are some areas in the Pacific Ocean that come close to being equidistant from these countries.
One such area is around 10°N latitude and 160°W longitude. This location is approximately equidistant from Australia (roughly 5,500 miles), China (roughly 5,800 miles), Uganda (roughly 6,200 miles), and Jamaica (roughly 6,300 miles). However, it's not exactly equidistant from the USA due to its vast size; it's closer to Hawaii than it is to the continental US.
Keep in mind that these distances are approximate and may vary slightly depending on which specific cities or regions within each country you choose as reference points.
```
Sort of a failure. I would have preferred the answer "the center of the Earth".
### Multi-turn
```text
### Prompt
Come up with a brand new idea for a sci-fi film. Don't just use old ideas, I want something really original! PLease don't make it about AI.
### Response
Title: "The Quantum Leap"
Synopsis:
In the year 2150, humanity has colonized Mars and established a thriving society. The Martian colonists have developed a unique form of transportation called the "Quantum Leap," which allows them to travel instantaneously between planets by harnessing the power of quantum entanglement. However, this technology comes with a catch: every time someone uses the Quantum Leap, they experience a brief but intense moment of déjà vu, as if they've lived through that exact moment before.
The story follows a young Martian scientist named Astra (played by an up-and-coming actress), who discovers that these moments of déjà vu are not just random occurrences but glimpses into alternate realities where different choices were made. As she delves deeper into her research, Astra uncovers a sinister plot by a powerful corporation to exploit the Quantum Leap technology for their own gain, threatening the very fabric of reality itself.
Astra must team up with an unlikely group of allies, including a rogue terraforming engineer named Kai (played by an established actor known for his charisma) and an enigmatic AI entity named Echo (voiced by a renowned voice actor), to stop the corporation and save not only her own reality but countless others as well. Along the way, they encounter strange and wondrous alternate realities, each with its own unique challenges and dangers.
As they race against time to prevent the collapse of multiple realities, Astra and her team must confront their own past choices and grapple with the consequences of their actions in this mind-bending sci-fi adventure that explores themes of fate, free will, and the power of human connection across infinite possibilities.
### Followup prompt
Any ideas for who we cast for the role of Astra?
### Followup response
For the role of Astra, I would suggest casting a talented and versatile actress who can bring depth, intelligence, and emotional range to the character. Some potential choices could include:
1. Zendaya: Known for her roles in "Euphoria" and "Spider-Man: Homecoming," Zendaya has proven her ability to portray complex characters with nuance and authenticity. Her star power would also help draw audiences to the film.
2. Lupita Nyong'o: An Academy Award-winning actress known for her performances in "12 Years a Slave" and "Black Panther," Lupita Nyong'o has demonstrated her ability to tackle challenging roles with grace and emotional depth. Her talent would be a great fit for the character of Astra.
3. Florence Pugh: With standout performances in films like "Midsommar" and "Little Women," Florence Pugh has quickly
```
The idea is somewhat original, but two of the three casting suggestions are Dune actresses.
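For reference, a follow-up turn like the one above is typically sent by resubmitting the whole conversation so far. The sketch below only builds the message list; the `add_turn` helper is hypothetical and the assistant text is a placeholder, not real model output.

```python
def add_turn(messages, role, content):
    """Return a new message list with one more chat turn appended."""
    if role not in ("user", "assistant"):
        raise ValueError(f"unexpected role: {role}")
    return messages + [{"role": role, "content": content}]


# First user turn.
history = add_turn([], "user", "Come up with a brand new idea for a sci-fi film.")
# The model's first reply would be appended verbatim (placeholder text here).
history = add_turn(history, "assistant", "<first response from the model>")
# The follow-up question is then sent together with the full history.
history = add_turn(history, "user", "Any ideas for who we cast for the role of Astra?")

print([m["role"] for m in history])
```

The resulting list could then be passed as `messages=` to `chat.completions.create` on an OpenAI-compatible client.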
### Preference questions
```text
### Prompt
Who is the coolest Beatle?
### Response
The Beatles are widely considered one of the most influential and popular bands in the history of music. Each member of the band, John Lennon, Paul McCartney, George Harrison, and Ringo Starr, has their own unique personality and style that has contributed to their enduring popularity.
Determining who is the "coolest" Beatle is subjective and depends on personal preferences. However, each member has their own distinct qualities that have made them stand out:
1. John Lennon: Known for his wit, intelligence, and rebellious nature, Lennon was a key songwriter for the band and later became an iconic figure in the anti-war movement during the 1960s and 1970s. His solo career produced hits like "Imagine" and "Give Peace a Chance."
2. Paul McCartney: Often regarded as the most commercially successful Beatle after the band's breakup, McCartney is known for his melodic songwriting skills and versatility as a musician. He has had a prolific solo career with hits like "Band on the Run" and "Live and Let Die."
3. George Harrison: The lead guitarist of the band, Harrison was known for his spirituality and interest in Eastern philosophy. He introduced Indian music elements into The Beatles' sound with songs like "Within You Without You" and later had a successful solo career with hits like "My Sweet Lord" and "What Is Life."
4. Ringo Starr: As the drummer for The Beatles, Starr brought a sense of humor to the group with his charming personality. He also sang lead vocals on several songs such as "Yellow Submarine" and "With a Little Help from My Friends." After The Beatles disbanded, he pursued a solo career with hits like "It Don't Come Easy" and continued to work as an actor in films such as Caveman (1981).
Ultimately, it's up to individual fans to decide who they think is the coolest Beatle based on their personal preferences in terms of music style, personality traits
```
### Multilingual questions
Japanese
```text
### Prompt
就活の面接がそろそろ始まるので、緊張している。どうすればいい?
### Response
面接に緊張しているのは当然ですが、それを適切にコントロールすることが大切です。以下の手順で緊張を抑え、面接に自信を持って出場できるようになります。
1. **準備をしよう**: 面接に向けて、会社や職務についての情報を収集し、自分のスキルや経験を整理しましょう。これにより、面接で質問に答える際に自信を持って話せるようになります。
2. **練習面接**: 友人や家族と練習面接を行いましょう。これにより、実際の面接での緊張を抑えることができます。
3. **深呼吸**: 面接前に、深呼吸をして緊張を解消しましょう。これにより、身体の緊張を解消し、心拍数を落とすことができます。
4. **ポジティブな思考**: 面接前に、自分の強みや成功体験を思い出し、自信を持って面接に向かいましょう。
5. **適切な服装**: 面接に向けて、適切な服装を選びましょう。これにより、自分自身の自信が高まり、緊張を抑えることができます。
6. **早めに到着**: 面接の場所に早めに到着し、緊張を解消しましょう。これにより、安心して面接に向かうことができます。
7. **面接中の緊張を抑える**: 面接中に緊張が高まる場合は、深呼吸をして緊張を解消しましょう。また、面接官に対して質問をすることで、自分の緊張を抑えることができます。
8. **自分を褒める**: 面接が終わった後は、自分自身を褒めましょう。面接に向けての準備や面接中のパフォーマンスについて、自分自身を褒めることで、次の面接に向けて自信を持って出場できるようになります。
面接に向けての準備と練習、緊張を抑えるための技術を身に着けることで、面接に自信を持って出場できるようになります。緊張は自然なものですが、適切にコントロールすることが大切です。
```
```text
### Prompt
東京駅から池袋駅まで行きたいです。一番早い路線は?更に、何分かかる?
### Response
東京駅から池袋駅までの一番早い路線は、JR山手線です。この路線を利用すると、通常の時間帯では約15分かかります。ただし、時間帯や列車の種類によっては異なる場合がありますので、実際に行く前に電車時刻表を確認してください。
```
False, the fastest line is the Marunouchi Line.
```text
### Prompt
仕方が。。。
### Response
「仕方がない」という表現は、ある状況や事態に対して、何もできない、あるいは何も変えられないという気持ちを表します。この表現は、不満や悲しみ、あるいは無力感を感じる時に使われます。例えば、自分の好きな人が別れを告げた時、あるいは予定通りに進まない不幸な出来事が起こった時など、自分の意志に反して起こることに対して使われます。この表現は、ある状況に対して、何もできないという無力感を表すものであり、その状況を受け入れることを意味します。
```
```text
### Prompt
犬が歩けば。。。
### Response
犬が歩けば、その動きは人間の心を満たします。
```
French
```text
### Prompt
Qui est le meilleur footballeur du monde ?
### Response
Déterminer le "meilleur footballeur du monde" est une question subjective et dépend de nombreux facteurs, tels que les performances récentes, les réalisations individuelles et collectives, ainsi que les préférences personnelles. Cependant, il y a quelques noms qui reviennent souvent dans les discussions sur les meilleurs joueurs de football au monde.
Au moment de ma dernière mise à jour en avril 2023, les joueurs suivants sont souvent considérés comme faisant partie des meilleurs du monde :
1. **Lionel Messi** - L'Argentin est considéré par beaucoup comme le meilleur joueur de tous les temps. Il a remporté de nombreux trophées avec le FC Barcelone et l'Argentine, y compris la Coupe du Monde 2022.
2. **Cristiano Ronaldo** - Le Portugais est connu pour sa compétitivité et sa capacité à marquer des buts. Il a remporté des trophées avec Manchester United, le Real Madrid, la Juventus et le Portugal.
3. **Kylian Mbappé** - Le jeune Français est considéré comme l'un des joueurs les plus prometteurs de sa génération. Il a déjà remporté la Coupe du Monde avec la France et a été un élément clé du succès du Paris Saint-Germain.
4. **Kevin De Bruyne** - Le milieu de terrain belge est connu pour sa vision du jeu et sa capacité à créer des occasions pour ses coéquipiers. Il a été un élément clé du succès de Manchester City.
5. **Virgil van Dijk** - Le défenseur néerlandais est considéré comme l'un des meilleurs défenseurs du monde. Il a joué un rôle crucial dans le succès de Liverpool ces dernières années.
Ces joueurs sont souvent cités comme étant parmi les meilleurs du monde, mais il est important de noter que le football est un sport en constante évolution, et de nouveaux talents émergent constamment.
```
# Training dataset
We trained this model on conversations between human users and GPT-4.
This consists of two datasets:
* 6,206 conversations from the [openchat/openchat_sharegpt4_dataset](https://huggingface.co/datasets/openchat/openchat_sharegpt4_dataset) dataset ([link](https://huggingface.co/datasets/openchat/openchat_sharegpt4_dataset/resolve/main/sharegpt_gpt4.json?download=true))
* 3,011 conversations that we created. We wanted to increase the representation of non-English prompts in our training dataset, so we sampled initial prompts from [lmsys/lmsys-chat-1m](https://huggingface.co/datasets/lmsys/lmsys-chat-1m), stratifying by language. We then prompted gpt-4-0125 with these and used the responses as training data.
We plan to release more information on this second dataset soon, as we are also using it in another dataset.
The complete data used to train this model can be found at [lightblue/gpt4_conversations_multilingual](https://huggingface.co/datasets/lightblue/gpt4_conversations_multilingual).
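The language-stratified sampling of initial prompts described above can be sketched roughly as follows. This is not the exact procedure we ran: the per-language quota, the `language` field name, and the toy records are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(records, key, per_group, seed=0):
    """Sample up to `per_group` records from each group defined by `key`."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    sample = []
    for name in sorted(groups):
        members = groups[name]
        # Small groups contribute everything they have.
        k = min(per_group, len(members))
        sample.extend(rng.sample(members, k))
    return sample


# Toy data in which English is heavily over-represented.
prompts = (
    [{"language": "English", "prompt": f"en-{i}"} for i in range(100)]
    + [{"language": "Japanese", "prompt": f"ja-{i}"} for i in range(10)]
    + [{"language": "French", "prompt": f"fr-{i}"} for i in range(3)]
)

picked = stratified_sample(prompts, key="language", per_group=5)
print(Counter(p["language"] for p in picked))
```

Capping each language at the same quota is only one possible design. It flattens the language distribution relative to plain random sampling, which would mostly return English prompts.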
# Training details
We trained this model using Axolotl's 4-bit QLoRA configuration for roughly 100 minutes in a 4 x A100 (80GB) environment on the Azure cloud (Standard_NC96ads_A100_v4).
We used DeepSpeed ZeRO-2 to train efficiently across the 4 GPUs.
We used the following config to train the model:
```yaml
base_model: mistral-community/Mixtral-8x22B-v0.1
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
trust_remote_code: true
load_in_8bit: false
load_in_4bit: true
strict: false
datasets:
  - path: lightblue/gpt4_conversations_multilingual
    type: sharegpt
    conversation: mistral
dataset_prepared_path: ./prepared_dataset_2048-multiling
val_set_size: 0
output_dir: ./qlora-out-2048-multiling
## You can optionally freeze the entire model and unfreeze a subset of parameters
unfrozen_parameters:
# - ^lm_head.weight$
# - ^model.embed_tokens.weight$[:32000]
# - model.layers.2[0-9]+.block_sparse_moe.gate
# - model.layers.2[0-9]+.block_sparse_moe.experts
# - model.layers.3[0-9]+.block_sparse_moe.gate
# - model.layers.3[0-9]+.block_sparse_moe.experts
model_config:
  output_router_logits: true
adapter: qlora
lora_model_dir:
sequence_len: 2048
sample_packing: true
pad_to_sequence_len: true
lora_r: 16
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:
#lora_target_modules:
# - gate
# - q_proj
# - k_proj
# - v_proj
# - o_proj
# - w1
# - w2
# - w3
gradient_accumulation_steps: 2
micro_batch_size: 1
num_epochs: 1
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.0002
use_wandb: true
wandb_project: wandb_project
wandb_entity: wandb_entity
wandb_name: wandb_name
train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false
gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true
warmup_steps: 10
evals_per_epoch: 0
eval_table_size:
eval_max_new_tokens: 128
saves_per_epoch: 5
debug:
deepspeed: /workspace/axolotl/deepspeed_configs/zero2.json
weight_decay: 0.0
fsdp:
fsdp_config:
special_tokens:
```
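A few derived quantities follow from the config above. The short calculation below is a sketch that assumes plain data parallelism across the 4 GPUs (as with ZeRO-2) and the conventional LoRA scaling factor of `lora_alpha / lora_r`; the variable names are illustrative.

```python
# Values copied from the training config above.
micro_batch_size = 1
gradient_accumulation_steps = 2
sequence_len = 2048
lora_r, lora_alpha = 16, 16

# From the 4 x A100 training environment (assumed to be pure data parallel).
num_gpus = 4

# Packed sequences consumed per optimizer step across all GPUs.
sequences_per_step = micro_batch_size * gradient_accumulation_steps * num_gpus

# With sample packing and padding to sequence_len, each sequence holds 2048 token slots.
tokens_per_step = sequences_per_step * sequence_len

# Conventional LoRA scaling applied to the low-rank update.
lora_scaling = lora_alpha / lora_r

print(sequences_per_step, tokens_per_step, lora_scaling)
```

So each optimizer step sees at most 8 packed sequences, or 16,384 token slots, and the LoRA update is applied with a scaling of 1.0 under these assumptions.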
# Developers
### Lead developer
Peter Devine - [ptrdvn](https://huggingface.co/ptrdvn)
### Advisor
Shunichi Taniguchi - [shun1taniguchi](https://huggingface.co/shun1taniguchi)