---
base_model: mistralai/Mistral-Nemo-Instruct-2407
tags:
- text-generation-inference
- transformers
- unsloth
- trl
- gammacorpus
- geneva
- chat
- mistral
- conversational
license: apache-2.0
language:
- en
- fr
- de
- es
- it
- pt
- ru
- zh
- ja
datasets:
- rubenroy/GammaCorpus-v2-50k
pipeline_tag: text-generation
library_name: transformers
---

![Geneva Banner](https://cdn.ruben-roy.com/AI/Geneva/img/banner-12B-50k.png)

# Geneva 12B GammaCorpus v2-50k
*A Mistral NeMo model fine-tuned on the GammaCorpus dataset*

## Overview
Geneva 12B GammaCorpus v2-50k is a fine-tune of Mistral AI's **Mistral NeMo Instruct 2407** model. Geneva is designed to outperform other models of a similar size while also showcasing the [GammaCorpus v2-50k](https://huggingface.co/datasets/rubenroy/GammaCorpus-v2-50k) dataset.

## Model Details
- **Base Model:** [mistralai/Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407)
- **Parameters:** 12B
- **Layers:** 40
- **Dim:** 5,120
- **Head dim:** 128
- **Hidden dim:** 14,336
- **Activation Function:** SwiGLU
- **Number of heads:** 32
- **Number of kv-heads:** 8 (GQA)
- **Vocabulary size:** 2**17 ~= 128k
- **Rotary embeddings (theta = 1M)**
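
As a rough sanity check on the 12B figure, the parameter count can be estimated from the dimensions listed above. This is a minimal sketch, not an official breakdown: it assumes a standard Mistral-style decoder with no bias terms, separate (untied) input and output embeddings, and two RMSNorm weight vectors per layer plus a final norm. None of these assumptions are stated in this card.

```python
# Hyperparameters copied from the Model Details list.
n_layers = 40
dim = 5120
head_dim = 128
hidden_dim = 14336
n_heads = 32
n_kv_heads = 8
vocab_size = 2**17

# Attention: query and output projections use all 32 heads,
# key and value projections use only the 8 kv-heads (GQA).
q_and_o = 2 * dim * (n_heads * head_dim)
k_and_v = 2 * dim * (n_kv_heads * head_dim)
attention = q_and_o + k_and_v

# SwiGLU feed-forward: gate, up and down projections.
feed_forward = 3 * dim * hidden_dim

# Assumption: two RMSNorm weight vectors per layer, no biases anywhere.
norms = 2 * dim

per_layer = attention + feed_forward + norms

# Assumption: input embeddings and output head are separate matrices,
# and there is one final norm before the output head.
embeddings = 2 * vocab_size * dim
total = n_layers * per_layer + embeddings + dim

print(f"Estimated parameters: {total / 1e9:.2f}B")
```

Under these assumptions the estimate comes out at roughly 12.2 billion parameters, which is consistent with the 12B label.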

## Training Details

Geneva-12B-GCv2-50k was fine-tuned on a single A100 GPU for ~15 minutes using the [Unsloth](https://unsloth.ai/) framework. It was trained for **60 epochs**.

## Usage

### Requirements

Please install Transformers from source:

```
pip install git+https://github.com/huggingface/transformers.git
```

### Quickstart

If you want to use Hugging Face `transformers` to generate text, you can do something like this:

```python
from transformers import pipeline

prompt = "How tall is the Eiffel tower?"

messages = [
    {"role": "system", "content": "You are a helpful assistant named Geneva, built on the Mistral NeMo model developed by Mistral AI, and fine-tuned by Ruben Roy."},
    {"role": "user", "content": prompt},
]

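# Load the model into a chat-capable text-generation pipeline
infer = pipeline("text-generation", model="rubenroy/Geneva-12B-GCv2-50k", max_new_tokens=128)

# The pipeline returns the full conversation; the last message is the model's reply
outputs = infer(messages)
print(outputs[0]["generated_text"][-1]["content"])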
```
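
Since GammaCorpus consists of multi-turn conversations, it may be useful to keep a running history across turns. The following is a minimal sketch of one way to do that with the same `pipeline` call. The helper names (`extract_reply`, `chat_turn`, `load_geneva`) are hypothetical and not part of `transformers` or this repository, and the sketch assumes the pipeline returns the full conversation under `generated_text` when given a list of chat messages.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def extract_reply(outputs: List[dict]) -> Message:
    """Return the assistant message from a chat-style pipeline result."""
    # Assumption: for chat input, "generated_text" holds the whole
    # conversation and the final entry is the newly generated reply.
    return outputs[0]["generated_text"][-1]


def chat_turn(infer: Callable, history: List[Message], user_text: str) -> List[Message]:
    """Add a user message, run the model, and return the extended history."""
    messages = history + [{"role": "user", "content": user_text}]
    reply = extract_reply(infer(messages))
    return messages + [reply]


def load_geneva():
    """Build the pipeline (hypothetical helper, same arguments as the Quickstart)."""
    # Imported lazily so the helpers above work without loading the model.
    from transformers import pipeline

    return pipeline(
        "text-generation",
        model="rubenroy/Geneva-12B-GCv2-50k",
        max_new_tokens=128,
    )


# Example usage (downloads the model, so it is left commented out):
# infer = load_geneva()
# history = [{"role": "system", "content": "You are a helpful assistant named Geneva."}]
# history = chat_turn(infer, history, "How tall is the Eiffel tower?")
# history = chat_turn(infer, history, "And how tall is that in feet?")
# print(history[-1]["content"])
```

Each call passes the whole history back to the model, so follow-up questions can refer to earlier turns.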

## About GammaCorpus

This model, like all Geneva models, is trained on GammaCorpus. GammaCorpus is a dataset on Hugging Face that consists of structured and filtered multi-turn conversations.
GammaCorpus has 4 versions, each available in different sizes. The versions and sizes are as follows:

### GammaCorpus v1
- 10k UNFILTERED
- 50k UNFILTERED
- 70k UNFILTERED

Here is a link to the GCv1 dataset collection:<br>
https://huggingface.co/collections/rubenroy/gammacorpus-v1-67935e4e52a04215f15a7a60

### GammaCorpus v2
- 10k
- **50k <-- This is the version of GammaCorpus v2 that the Geneva model you are using was trained on.**
- 100k
- 500k
- 1m
- 5m

Here is a link to the GCv2 dataset collection:<br>
https://huggingface.co/collections/rubenroy/gammacorpus-v2-67935e895e1259c404a579df

### GammaCorpus CoT
- Math 170k

Here is a link to the GC-CoT dataset collection:<br>
https://huggingface.co/collections/rubenroy/gammacorpus-cot-6795bbc950b62b1ced41d14f

### GammaCorpus QA
- Fact 450k

Here is a link to the GC-QA dataset collection:<br>
https://huggingface.co/collections/rubenroy/gammacorpus-qa-679857017bb3855234c1d8c7

The full GammaCorpus dataset collection can be found [here](https://huggingface.co/collections/rubenroy/gammacorpus-67765abf607615a0eb6d61ac).

## Known Limitations

- **Bias:** We have tried our best to mitigate as much bias as we can, but please be aware that the model might still generate some biased answers.

## License
The model is released under the **[Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0)**. Please refer to the license for usage rights and restrictions.