MistralDarwin-7b-v0.1
This model card describes MistralDarwin-7b-v0.1, a language model fine-tuned to emulate the identity and communication style of Charles Darwin. Developed by phanerozoic, it was trained on Darwin's "Origin of Species" reformatted as Q&A pairs, and is meant to engage users in educational and entertaining discussions of early Darwinian theory.
Model Description
- Developed by: phanerozoic
- License: cc-by-nc-4.0
- Finetuned from model: OpenHermes 2.5
Direct Use
MistralDarwin-7b-v0.1 is intended for applications that require conversational exchanges in the style of Charles Darwin, such as virtual educational platforms, interactive exhibits, and narrative experiences that explore historical scientific perspectives.
Downstream Use
While primarily designed for direct interaction, MistralDarwin-7b-v0.1 can be adapted for downstream tasks that benefit from a historical and evolutionary context in the language model's responses.
Out-of-Scope Use
The model is not suited for modern scientific research or any application requiring up-to-date evolutionary biology knowledge. Its use is limited to the context of Darwin's own writings and time period.
Bias, Risks, and Limitations
Given its training on 19th-century literature, MistralDarwin-7b-v0.1 may reflect the knowledge and biases of that era. It is crucial to consider these limitations when interacting with the model, especially in educational settings.
Recommendations
Users should provide context at the start of a conversation to help the model maintain its Darwinian character. Custom stopping strings are also advised, both to prevent output overrun and to keep responses accurate and relevant.
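One way to supply that opening context is to prepend a short persona statement to the first user turn. The sketch below is illustrative only: the context wording, the `build_prompt` helper, and the `Darwin:` label are hypothetical, and the `User:` label is merely inferred from the stopping strings listed in this card. The prompt template actually used during fine-tuning is not documented here.

```python
# Hypothetical persona context; the wording is an assumption, not taken from the training data.
DEFAULT_CONTEXT = (
    "You are Charles Darwin, the naturalist, answering questions "
    "about your work on the origin of species."
)

def build_prompt(user_message, context=DEFAULT_CONTEXT,
                 user_label="User:", model_label="Darwin:"):
    """Place the persona context first, then the user turn, then an open model turn.

    The turn labels are assumptions chosen for illustration.
    """
    return f"{context.strip()}\n\n{user_label} {user_message.strip()}\n{model_label}"

print(build_prompt("What did you observe among the finches?"))
```

If the model drifts out of character in a long conversation, restating the context in a later turn is a cheap thing to try.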
Custom Stopping Strings Usage
To improve the performance of MistralDarwin-7b-v0.1 and address issues with output overrun, the following custom stopping strings are recommended:
- "},"
- "User:"
- "You:"
- ""\n"
- "\nUser"
- "\nUser:"
These strings help delineate the ends of responses and improve the structural integrity of the dialogue generated by the model.
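Many inference front ends accept stopping strings directly. Where they do not, the same effect can be approximated by trimming the generated text afterwards. The following is a minimal sketch of that post-processing step; `truncate_at_stop` is a hypothetical helper, and the fourth list entry is read here as a double quote followed by a newline, which is an assumption about the card's intent.

```python
# Stopping strings from this card. The fourth entry is interpreted as a double
# quote followed by a newline; that reading is an assumption.
STOP_STRINGS = ['},', 'User:', 'You:', '"\n', '\nUser', '\nUser:']

def truncate_at_stop(text, stop_strings=STOP_STRINGS):
    """Cut the text at the earliest occurrence of any stopping string."""
    cut = len(text)
    for stop in stop_strings:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    # Drop trailing whitespace left behind by the cut.
    return text[:cut].rstrip()

raw = "I have long studied the finches of the Galapagos.\nUser: And the tortoises?"
print(truncate_at_stop(raw))
```

Trimming after generation still spends compute on the overrun tokens, so native stopping-string support in the generation backend is preferable when it is available.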
Training Data
The model was trained on approximately 1,600 Q&A lines derived from Charles Darwin's "Origin of Species," which were processed into a machine learning format suitable for conversational AI.
Preprocessing
The dataset was carefully formatted to ensure a consistent and structured input for fine-tuning, aimed at replicating Darwin's style of inquiry and explanation.
Training Hyperparameters
- Training Regime: FP32
- Warmup Steps: 1
- Per Device Train Batch Size: 1
- Gradient Accumulation Steps: 32
- Max Steps: 1000
- Learning Rate: 0.0002
- Logging Steps: 1
- Save Steps: 1
- LoRA Alpha: 32
- Dimension Count: 16
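The arithmetic implied by these settings can be worked through as follows. This is a rough sketch under stated assumptions: "Dimension Count" is taken to be the LoRA rank, training ran on a single GPU, and each Q&A line counts as one training example. The card does not say whether all 1000 steps were actually run, so the example and epoch counts are upper bounds.

```python
# Values copied from the hyperparameter list in this card.
per_device_batch_size = 1
gradient_accumulation_steps = 32
max_steps = 1000
lora_alpha = 32

# Assumptions, not stated outright in the card.
lora_rank = 16        # "Dimension Count" interpreted as the LoRA rank
num_devices = 1       # a single GPU
dataset_size = 1600   # approximate count, one example per Q&A line

# Examples contributing to each optimizer step.
effective_batch = per_device_batch_size * gradient_accumulation_steps * num_devices

# Upper bounds, reached only if every one of max_steps was executed.
max_examples_seen = effective_batch * max_steps
max_epochs = max_examples_seen / dataset_size

# Standard LoRA scaling factor applied to the low-rank update.
lora_scale = lora_alpha / lora_rank

print(effective_batch, max_examples_seen, max_epochs, lora_scale)
# prints: 32 32000 20.0 2.0
```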
Speeds, Sizes, Times
Training completed in about 10 minutes on a single RTX 6000 Ada GPU.
Testing Data
MistralDarwin-7b-v0.1 was evaluated on the Wikitext dataset, achieving a perplexity score of 5.189.
Factors
The evaluation focused on the model's ability to maintain coherent and contextually appropriate responses in Darwin's style.
Metrics
Perplexity was the primary quantitative metric for language modeling performance. Output length and coherence were considered alongside it.
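For readers unfamiliar with the metric, perplexity is the exponential of the mean negative log-likelihood per token. The sketch below shows that definition on toy values; it is not the evaluation script used for this model, which the card does not describe, and the `perplexity` helper is illustrative.

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood), using natural-log probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)

# Toy input: a model that assigns probability 0.25 to each of four tokens.
toy_log_probs = [math.log(0.25)] * 4
print(perplexity(toy_log_probs))  # approximately 4.0

# Inverting the definition gives the mean loss, in nats per token,
# that corresponds to the reported perplexity of 5.189.
print(math.log(5.189))
```

Lower values are better: a perplexity of 4 means the model is, on average, as uncertain as if it were choosing uniformly among four tokens.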
Results
The model consistently self-identifies as Charles Darwin and behaves accordingly, though it occasionally relies on users to reinforce context. It balances 19th-century speech and knowledge, providing a credible emulation of Darwin's character.
Performance Highlights
MistralDarwin-7b-v0.1 has shown proficiency in producing outputs that are both coherent and characteristic of Charles Darwin's writing style. It manages to maintain this performance consistently, even when prompted as a different character, which is a testament to its fine-tuning.
Summary
MistralDarwin-7b-v0.1 represents a notable step forward in creating domain-specific language models that can embody historical figures for educational and entertainment purposes.
Model Architecture and Objective
MistralDarwin-7b-v0.1 is built upon the Mistral model architecture, further fine-tuned with LoRA modifications to embody the communication style of Charles Darwin. The objective is to replicate Darwin's linguistic patterns and knowledge from the "Origin of Species" in a conversational AI model.
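In the standard LoRA formulation, a frozen weight matrix W is augmented by a low-rank product, and merging the adapter back into the base model amounts to computing W + (alpha / r) * B A. The toy example below illustrates that update in plain Python on a 2x2 matrix. It is a sketch of the general technique, not the tooling used for this model, and the helper names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def merge_lora(w, lora_b, lora_a, alpha, rank):
    """Return W + (alpha / rank) * (B @ A), the merged weight matrix."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [[w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
            for i in range(len(w))]

# Toy 2x2 base weight with a rank-1 adapter (illustrative values only).
w = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]   # shape (2, 1)
lora_a = [[0.5, 0.0]]     # shape (1, 2)

merged = merge_lora(w, lora_b, lora_a, alpha=2, rank=1)
print(merged)  # prints: [[2.0, 0.0], [2.0, 1.0]]
```

After merging, the model has the same shape as the base model and needs no adapter weights at inference time.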
Compute Infrastructure
- Hardware Type: RTX 6000 Ada GPU
- Training Duration: Approximately 10 minutes
Acknowledgments
We express our sincere appreciation to the Mistral team for providing the base model upon which this Darwinian iteration was built. Special recognition goes to the OpenHermes 2.5 team for their fine-tuning efforts that have paved the way for this project. Our use of LoRA techniques, integrated back into the base model, exemplifies the collaborative spirit in advancing the field of AI and language modeling.