Locutusque/gpt2-conversational-or-qa


Locutusque committed on Apr 28, 2023

Commit 20d4ee4 • 1 Parent(s): 54181ec

Update README.md

Files changed (1)

README.md +5 -1


@@ -29,7 +29,11 @@ The model is trained on a large dataset of conversational data, consisting of in
 The model architecture used in this model is GPT-2, a transformer-based language model that is capable of generating high-quality text with a wide range of styles and tones. The GPT-2 architecture consists of a multi-layer, decoder-only transformer, with self-attention mechanisms that allow the model to capture long-term dependencies and generate coherent text.
 ## Evaluation Metrics
-The model is evaluated based on several metrics, including loss, reward, penalty, BLEU score, and perplexity. The loss metric is calculated during training and reflects the difference between the predicted output and the actual output. The reward metric is based on the number of correct words generated by the model, while the penalty metric penalizes the model for repeating words consecutively. The BLEU score measures the similarity between the generated text and the ground truth text, while the perplexity metric measures how well the model is able to predict the next word in a sequence.
+The model is evaluated based on several metrics, including loss, reward, penalty, BLEU score, and perplexity. The loss metric is calculated during training and reflects the difference between the predicted output and the actual output. The reward metric is based on the number of correct words generated by the model, while the penalty metric penalizes the model for repeating words consecutively. The BLEU score measures the similarity between the generated text and the ground truth text, while the perplexity metric measures how well the model is able to predict the next word in a sequence. During validation, the model achieved the following metrics:
+- Average BLEU score: 20
+- Average perplexity: 32
+- Average loss: 1.7
 ## Limitations and Bias
 Because I have a rather weak computer for machine learning, I was not able to train this model for very long. The model may output irrelevant or sometimes nonsensical responses. The Inference API is not a recommended place to test the model because the model requires a specific input format.
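
The Evaluation Metrics paragraph in this change describes perplexity and a penalty for consecutively repeated words, but the README does not say how either was computed. The following is a minimal sketch under stated assumptions, not the author's evaluation code: it assumes perplexity is the exponential of a mean per-token cross-entropy measured in nats, and that the penalty counts words that immediately repeat the previous word. The function names `perplexity_from_loss`, `loss_from_perplexity`, and `consecutive_repeats` are hypothetical.

```python
import math


def perplexity_from_loss(mean_cross_entropy: float) -> float:
    """Perplexity implied by a mean per-token cross-entropy (assumed to be in nats)."""
    return math.exp(mean_cross_entropy)


def loss_from_perplexity(perplexity: float) -> float:
    """Inverse of the above: the mean cross-entropy (nats) implied by a perplexity."""
    return math.log(perplexity)


def consecutive_repeats(text: str) -> int:
    """Count words that immediately repeat the previous word (case-insensitive).

    This is an assumed reading of the README's "penalty" metric, not its
    actual definition.
    """
    words = text.lower().split()
    return sum(1 for prev, cur in zip(words, words[1:]) if prev == cur)


if __name__ == "__main__":
    print(round(perplexity_from_loss(1.7), 2))   # 5.47
    print(round(loss_from_perplexity(32), 2))    # 3.47
    print(consecutive_repeats("the the cat sat sat sat down"))  # 3
```

Under these assumptions, a loss of 1.7 would correspond to a perplexity of about 5.5, and a perplexity of 32 to a loss of about 3.47. The reported averages therefore appear to have been computed differently, for example with a loss that also includes the reward and penalty terms, or with a different averaging scheme; the README does not specify which.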