
metadata

license: mit

Model Card: TinyLlama-1.1B-Chat-v1.0-Unfiltered

Model Name: TinyLlama-1.1B-Chat-v1.0-Unfiltered
Model Type: Conversational AI Model
Architecture: Based on the 1.1B-parameter TinyLlama architecture

Training Data:

Fine-tuned on the "dan_remixed" dataset (2.7MB).
The dataset was edited to improve spelling, grammar, and consistency, to replace references to violent crimes with non-violent activities, and to remove self-censorship of expletives.

Training Time: Approximately 30-45 minutes. Each validation epoch takes ~322 seconds.
Hardware: Trained on GPU (specific GPU details not provided).

Training Performance:

Epoch Losses:
- Epoch 1: 0.7209
- Epoch 2: 0.4441
- Epoch 3: 0.3683
- Epoch 4: 0.3358
- Epoch 5: 0.3145
Final Training Loss (Epoch 5): 0.3145

Validation Performance (5 Epochs):

Epoch 1:
- Training Loss: 0.2921
- Validation Loss: 0.7962
- Perplexity: 2.22
- Epoch completed in 321.64 seconds
Epoch 2:
- Training Loss: 0.2872
- Validation Loss: 0.7672
- Perplexity: 2.15
- Epoch completed in 321.91 seconds
Epoch 3:
- Training Loss: 0.2874
- Validation Loss: 0.7821
- Perplexity: 2.19
- Epoch completed in 321.94 seconds
Epoch 4:
- Training Loss: 0.2864
- Validation Loss: 0.7796
- Perplexity: 2.18
- Epoch completed in 322.01 seconds
Epoch 5:
- Training Loss: 0.2831
- Validation Loss: 0.8017
- Perplexity: 2.23
- Epoch completed in 322.01 seconds
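
The perplexity figures above are consistent with perplexity = exp(validation loss). The card does not state how perplexity was computed, so the following is a minimal Python sketch under that assumption. It recomputes the values from the reported losses and picks the epoch with the lowest validation loss. The names `val_loss`, `perplexity`, and `best_epoch` are illustrative and do not come from the training code.

```python
import math

# Validation losses reported above, keyed by epoch.
val_loss = {1: 0.7962, 2: 0.7672, 3: 0.7821, 4: 0.7796, 5: 0.8017}


def perplexity(loss: float) -> float:
    """Perplexity as exp of the mean per-token cross-entropy (assumed definition)."""
    return math.exp(loss)


# Round to two decimals to match the precision used in the card.
perplexities = {epoch: round(perplexity(loss), 2) for epoch, loss in val_loss.items()}

# Epoch whose validation loss is smallest.
best_epoch = min(val_loss, key=val_loss.get)

for epoch, ppl in perplexities.items():
    print(f"epoch {epoch}: val_loss={val_loss[epoch]:.4f} perplexity={ppl:.2f}")
print(f"lowest validation loss at epoch {best_epoch}")
```

Under this definition the lowest validation loss, and therefore the lowest perplexity, falls at epoch 2 rather than at the final epoch.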

Optimizer: AdamW, learning rate: 1e-5
Loss Function: Cross-Entropy Loss, ignoring padding tokens (ignore_index=-100)
Use Case: General, unrestricted conversation. Responses are not filtered by nature, provided the content is non-violent.
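
The loss function entry above says padding tokens are ignored via `ignore_index=-100`. That value matches the default of PyTorch's `torch.nn.CrossEntropyLoss`, although the card does not show the training code. The pure-Python sketch below only illustrates what that masking means: positions labelled -100 are skipped, and the loss is averaged over the remaining positions. The function name `masked_cross_entropy` and the toy logits are hypothetical.

```python
import math

IGNORE_INDEX = -100  # label value marking positions excluded from the loss


def masked_cross_entropy(logits, labels, ignore_index=IGNORE_INDEX):
    """Mean cross-entropy over positions whose label is not ignore_index.

    logits: list of per-position score lists (one score per vocabulary entry)
    labels: list of target token ids, with ignore_index at padded positions
    """
    total, count = 0.0, 0
    for scores, label in zip(logits, labels):
        if label == ignore_index:
            continue  # padded position: contributes nothing to the loss
        # Numerically stable log-sum-exp: subtract the max before exponentiating.
        peak = max(scores)
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
        total += log_norm - scores[label]  # negative log-probability of the target
        count += 1
    # Returning 0.0 for an all-padding input is a choice made for this sketch.
    return total / count if count else 0.0


# Toy batch: three positions over a 3-token vocabulary; the last one is padding.
logits = [[2.0, 0.5, 0.1], [0.2, 1.5, 0.3], [9.0, 9.0, 9.0]]
labels = [0, 1, IGNORE_INDEX]
loss = masked_cross_entropy(logits, labels)
print(f"masked loss: {loss:.4f}")
```

Because the padded position is skipped, the result is the same as computing the loss over the first two positions only, whatever scores the model assigns at the padded position.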

Limitations:

Due to the small fine-tuning dataset size (2.7MB), the model may be prone to overfitting and bias.
The dataset has been modified to avoid violent language, but the model may still produce strong or explicit responses.

Metrics:

Loss and perplexity were tracked during training. Conversation-oriented evaluation (such as BLEU, ROUGE, or human evaluation) could be explored.
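
As a rough illustration of what an overlap metric such as ROUGE measures, here is a simplified ROUGE-1-style F1 score in plain Python. It is a sketch under stated assumptions (lowercased whitespace tokenization, no stemming) and not a replacement for an established evaluation package. The function name `rouge1_f1` and the example sentences are hypothetical and were not used to evaluate this model.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate response."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    # Counter intersection keeps the minimum count per shared token.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())  # share of candidate tokens matched
    recall = overlap / sum(ref.values())      # share of reference tokens matched
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference/response pair, for illustration only.
score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")
print(f"ROUGE-1 F1: {score:.3f}")
```

Scores like this only capture word overlap with a reference answer, which is one reason human evaluation is listed alongside them.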