Gromenauer-7B
Overview
Gromenauer-7B is a Spanish language model designed to understand and generate high-quality Spanish text. Built on the Mistral architecture, it was trained on an extensive corpus of Spanish literature, with the aim of capturing the range of linguistic nuances, styles, and contexts found there.
Model Details
- Model Type: Mistral
- Sequence Length: 8192
- Hidden Dimension: 4096
- Intermediate Dimension: 14336
- Number of Layers: 32
- Number of Attention Heads: 32
- Number of Key-Value Heads: 8
- Activation Function: SiLU
- Initializer Range: 0.02
- Layer Norm Epsilon: 1.0e-05
- Use Flash Attention: Yes
- Gradient Checkpointing: Enabled (Block Size: 5)
- Sliding Window Attention: 4096
- Use Bias: No
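As a rough sanity check on the "7B" in the name, the parameter count can be estimated from the dimensions listed above. This is a minimal sketch, not the project's own accounting: the vocabulary size of 32000 and the untied input/output embeddings are assumptions carried over from the mistralai/Mistral-7B-v0.1 tokenizer and the usual Mistral layout, since neither is stated in this card.

```python
# Architecture values taken from the Model Details list.
HIDDEN = 4096
INTERMEDIATE = 14336
LAYERS = 32
HEADS = 32
KV_HEADS = 8

# Assumptions (not stated in this card): vocabulary size and untied embeddings.
VOCAB = 32000
TIED_EMBEDDINGS = False


def count_parameters():
    """Estimate the total parameter count of a Mistral-style decoder."""
    head_dim = HIDDEN // HEADS            # 128
    kv_dim = KV_HEADS * head_dim          # 1024, grouped-query attention
    # Query and output projections are HIDDEN x HIDDEN; key and value
    # projections are HIDDEN x kv_dim. No bias terms (Use Bias: No).
    attention = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * kv_dim
    # Gated MLP: gate, up, and down projections.
    mlp = 3 * HIDDEN * INTERMEDIATE
    # Two normalization weight vectors per layer.
    norms = 2 * HIDDEN
    per_layer = attention + mlp + norms

    embeddings = VOCAB * HIDDEN
    lm_head = 0 if TIED_EMBEDDINGS else VOCAB * HIDDEN
    final_norm = HIDDEN
    return embeddings + LAYERS * per_layer + final_norm + lm_head


total = count_parameters()
print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
```

Under these assumptions the estimate comes to about 7.24 billion parameters. The 8 key-value heads shared across 32 attention heads are what shrink the key and value projections to a quarter of the query projection's size.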
Training Details
- Tokenizer: mistralai/Mistral-7B-v0.1
- Batch Size: 512
- Learning Rate: 1e-5
- Optimizer: Adam with beta1=0.9, beta2=0.95, epsilon=1e-8
- Weight Decay: 0.1
- Warmup Steps: 200
- Learning Rate Schedule: Cosine
- Number of Training Steps: 7000
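The warmup, schedule, and step count above can be illustrated with a small sketch of the learning rate over training. The card does not specify the warmup shape or the final learning rate, so linear warmup and decay to zero are assumptions here, and `learning_rate` is a hypothetical helper rather than the training code. The token estimate likewise assumes that the batch size counts sequences and that every sequence is filled to the full 8192 tokens.

```python
import math

# Values from the Training Details and Model Details lists.
PEAK_LR = 1e-5
WARMUP_STEPS = 200
TOTAL_STEPS = 7000
BATCH_SIZE = 512
SEQUENCE_LENGTH = 8192


def learning_rate(step, min_lr=0.0):
    """Linear warmup followed by cosine decay (assumed shape, see lead-in)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (PEAK_LR - min_lr) * (1.0 + math.cos(math.pi * progress))


# Upper bound on tokens seen, assuming fully packed sequences.
tokens_per_step = BATCH_SIZE * SEQUENCE_LENGTH
total_tokens = tokens_per_step * TOTAL_STEPS

for step in (0, 100, 200, 3600, 7000):
    print(f"step {step:>4}: lr = {learning_rate(step):.2e}")
print(f"tokens per step: {tokens_per_step:,}")
print(f"total tokens (upper bound): {total_tokens:,}")
```

With these assumptions the rate climbs to 1e-5 at step 200, passes through half of that midway through the decay, and each optimizer step covers roughly 4.2 million tokens.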
Usage
To load the model in your project, you can use the following code:
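from transformers import AutoModelForCausalLM, AutoTokenizer
# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("bertin-project/Gromenauer-7B")
# Load the model with its language-modeling head so it can generate text
# (AutoModel would load only the base network and return hidden states)
model = AutoModelForCausalLM.from_pretrained("bertin-project/Gromenauer-7B")
# Example usage: replace the placeholder prompt ("Enter your Spanish text here.")
text = "Introduce aquí tu texto en español."
inputs = tokenizer(text, return_tensors="pt")
# Generate a continuation of the prompt and decode it back to text
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))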