---
license: mit
---

### Model Card: **TinyLlama-1.1B-Chat-v1.0-Unfiltered**

---

**Model Name**: TinyLlama-1.1B-Chat-v1.0-Unfiltered

**Model Type**: Conversational AI Model

**Architecture**: Based on the 1.1B-parameter TinyLlama architecture

**Training Data**:
- Fine-tuned on the "dan_remixed" dataset (2.7MB).
- The dataset improves spelling, grammar, and consistency, replaces references to violent crimes with non-violent activities, and removes self-censorship from expletives.

**Training Time**: Approximately 30-45 minutes. Each validation epoch takes ~322 seconds.

**Hardware**: Trained on GPU (specific GPU details not provided).

---

**Training Performance**:
- **Epoch Losses**:
  - Epoch 1: 0.7209
  - Epoch 2: 0.4441
  - Epoch 3: 0.3683
  - Epoch 4: 0.3358
  - Epoch 5: 0.3145
- **Final Training Loss (Epoch 5)**: 0.3145

---

**Validation Performance** (5 Epochs):
- **Epoch 1**:
  - Training Loss: 0.2921
  - Validation Loss: 0.7962
  - Perplexity: 2.22
  - Epoch completed in 321.64 seconds
- **Epoch 2**:
  - Training Loss: 0.2872
  - Validation Loss: 0.7672
  - Perplexity: 2.15
  - Epoch completed in 321.91 seconds
- **Epoch 3**:
  - Training Loss: 0.2874
  - Validation Loss: 0.7821
  - Perplexity: 2.19
  - Epoch completed in 321.94 seconds
- **Epoch 4**:
  - Training Loss: 0.2864
  - Validation Loss: 0.7796
  - Perplexity: 2.18
  - Epoch completed in 322.01 seconds
- **Epoch 5**:
  - Training Loss: 0.2831
  - Validation Loss: 0.8017
  - Perplexity: 2.23
  - Epoch completed in 322.01 seconds

---

**Optimizer**: AdamW, learning rate: 1e-5

**Loss Function**: Cross-Entropy Loss, ignoring padding tokens (ignore_index=-100)

**Use Case**: Conversational AI designed for general, unrestricted conversation, with no filtering on the nature of responses, provided the content is non-violent.

---

**Limitations**:
- Due to the small fine-tuning dataset size (2.7MB), the model may be prone to **overfitting** and **bias**.
- The dataset has been modified to avoid violent language, but the model might still exhibit strong or explicit responses.

**Metrics**:
- Loss and perplexity have been tracked; more conversational metrics (such as BLEU, ROUGE, or human evaluation) could be explored.
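
The perplexity figures in this card appear consistent with the exponential of the validation loss. The following is a minimal sketch of that relationship, assuming the reported validation loss is a mean per-token cross-entropy measured in nats; the names `val_losses`, `perplexity`, and `best_epoch` are illustrative and not taken from the original training script.

```python
import math

# Validation losses copied from the "Validation Performance" section of this card.
val_losses = {1: 0.7962, 2: 0.7672, 3: 0.7821, 4: 0.7796, 5: 0.8017}


def perplexity(loss: float) -> float:
    """Perplexity for a mean per-token cross-entropy loss (assumed to be in nats)."""
    return math.exp(loss)


# Round to two decimals to match the precision reported in the card.
perplexities = {epoch: round(perplexity(loss), 2) for epoch, loss in val_losses.items()}

# Epoch with the lowest validation loss (hypothetical checkpoint-selection rule).
best_epoch = min(val_losses, key=val_losses.get)

for epoch, ppl in perplexities.items():
    print(f"Epoch {epoch}: validation loss {val_losses[epoch]:.4f}, perplexity {ppl:.2f}")
print(f"Lowest validation loss at epoch {best_epoch}")
```

Under these assumptions the lowest validation loss falls at epoch 2, while the training loss keeps decreasing through epoch 5, which fits the overfitting caveat listed under Limitations.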