This is just an experiment!
Model Description
The Digital Navigator model is designed to assist users by generating natural language responses to input queries. It is a fine-tuned GPT-2 model, customized to assist visitors of the Communication and Digital Media Department's website: https://cdm.uowm.gr/en/index/.
- Developed by: Papagiannakis Panagiotis
- Model type: GPT-2 (Generative Pre-trained Transformer 2)
- Language(s) (NLP): English
- License: CC BY-NC 4.0
- Finetuned from model: GPT-2
Direct Use
The Digital Navigator model can be used directly to generate conversational responses in English. It is intended for chatbots, virtual assistants, and other applications requiring natural language understanding and generation; a minimal inference sketch follows the list below.
- Conversational AI
- Customer Support
- Virtual Assistance
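A minimal inference sketch using the Hugging Face `transformers` pipeline. The model ID and generation settings below are illustrative assumptions, not a published configuration:

```python
# A minimal inference sketch with the transformers pipeline API. The base
# GPT-2 is used as a stand-in model ID; replace it with the actual
# Digital Navigator checkpoint.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # stand-in model ID

prompt = "What study programs does the Communication and Digital Media Department offer?"
response = generator(
    prompt,
    max_new_tokens=80,   # illustrative generation settings
    do_sample=True,
    top_p=0.9,
    temperature=0.7,
)
print(response[0]["generated_text"])
```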
Out-of-Scope Use
- Generating harmful, biased, or misleading content
- Use in high-stakes decision-making without human oversight
- Applications requiring high factual accuracy or deep contextual understanding
Bias, Risks, and Limitations
The model may generate biased or inappropriate content based on the training data. Users should be cautious of the following:
- Inherent biases in the training data
- Generation of factually incorrect information
- Limited understanding of context and nuance
Recommendations
Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. It is recommended to monitor and evaluate the model's output in real-world applications to ensure it meets the desired criteria and ethical standards.
Training Details
Training Data
The model was fine-tuned on a dataset of conversational data collected from the university's website: https://cdm.uowm.gr/en/index/.
Training Procedure
The model was fine-tuned with the standard causal language modeling procedure for GPT-2; a minimal sketch of one such setup appears after the hyperparameter list below.
Training Hyperparameters
- Training regime: fp32
- Learning Rate: 5e-5
- Batch Size: 16
- Epochs: 30
- Block Size: 128
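The exact training script is not published; the following is a minimal sketch, assuming a Hugging Face `Trainer` setup with the hyperparameters listed above. The data file path is a hypothetical placeholder:

```python
# A sketch of the fine-tuning setup, assuming the Hugging Face Trainer API
# and the hyperparameters listed above. File paths are hypothetical.
from transformers import (
    DataCollatorForLanguageModeling,
    GPT2LMHeadModel,
    GPT2Tokenizer,
    TextDataset,
    Trainer,
    TrainingArguments,
)

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token; reuse EOS
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Chunk the raw text into fixed blocks of 128 tokens (the block size above).
train_dataset = TextDataset(
    tokenizer=tokenizer,
    file_path="train.txt",  # hypothetical path to the collected website text
    block_size=128,
)

# Causal language modeling: the collator copies inputs as labels (mlm=False).
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

training_args = TrainingArguments(
    output_dir="digital-navigator",
    num_train_epochs=30,
    per_device_train_batch_size=16,
    learning_rate=5e-5,
    fp16=False,  # fp32 training regime, as listed above
)

trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=train_dataset,
)
trainer.train()
```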
Metrics
The model was evaluated using metrics like BLEU score for language quality, accuracy for factual correctness, precision for the accuracy of positive predictions, recall for the ability to find all relevant instances, and F1 score to balance precision and recall.
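The exact evaluation protocol is not published; the sketch below only illustrates, on toy data, how the named metrics can be computed with `nltk` and `scikit-learn`:

```python
# An illustrative computation of the metrics named above on toy data;
# the actual evaluation protocol and test set are not published here.
from nltk.translate.bleu_score import SmoothingFunction, sentence_bleu
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# BLEU compares a generated answer against a reference answer token by token.
reference = ["the department offers a four-year undergraduate program".split()]
candidate = "the department offers an undergraduate program".split()
bleu = sentence_bleu(reference, candidate,
                     smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {bleu:.3f}")

# Precision/recall/accuracy/F1 assume some binary labeling of outputs,
# e.g. whether each generated answer was judged correct (toy labels below).
y_true = [1, 1, 1, 0, 1, 0, 1, 1]
y_pred = [1, 0, 1, 0, 1, 1, 1, 0]
print(f"Precision: {precision_score(y_true, y_pred):.3f}")
print(f"Recall:    {recall_score(y_true, y_pred):.3f}")
print(f"Accuracy:  {accuracy_score(y_true, y_pred):.3f}")
print(f"F1:        {f1_score(y_true, y_pred):.3f}")
```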
Results
Summary
Due to the high complexity and limited variety of the training data, the evaluation scores were low:
- Training loss: 0.3999
- Validation loss: 1.5456
- Precision: 0.011
- Recall: 0.012
- Accuracy: 0.012
- F1: 0.011
Environmental Impact
- Hardware Type: NVIDIA T4
- Hours used: ~0.31 (18 minutes and 47 seconds)
- Cloud Provider: Google Colab
Model Architecture and Objective
The model is a decoder-only transformer (GPT-2) trained with a causal language modeling objective. Its purpose is to assist visitors of the university's website.
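As a minimal illustration of the causal language modeling objective, the snippet below computes the next-token prediction loss on an arbitrary sample sentence:

```python
# A minimal illustration of the causal language modeling objective: with
# labels equal to the inputs, the model scores its own next-token predictions
# (transformers shifts the labels internally). The sample text is arbitrary.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("Welcome to the department's website.", return_tensors="pt")
outputs = model(**inputs, labels=inputs["input_ids"])
print(f"Causal LM loss: {outputs.loss.item():.3f}")
```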
Citation
Coming soon...
BibTeX:
Coming soon...
APA:
Coming soon...