Update README.md
README.md
CHANGED
@@ -14,18 +14,28 @@ inference: False
 license: apache-2.0
 ---

-# ethzanalytics/gpt-j-8bit-

-

-_NOTE: this needs to be loaded via the special patching technique outlined in the hivemind model card (as with all 8bit models)_


-TODO: rest of README


-


-

 license: apache-2.0
 ---

+# ethzanalytics/gpt-j-8bit-daily_dialogues

+<a href="https://colab.research.google.com/gist/pszemraj/e49c60aafe04acc52fcfdd1baefe12e4/-ai-msgbot-gpt-j-6b-8bit-with-hub.ipynb">
+<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
+</a>


+This version of `hivemind/gpt-j-6B-8bit` is fine-tuned for one epoch on a parsed version of the [daily dialogues](https://huggingface.co/datasets/daily_dialog) dataset. It can be used as a chatbot.


+It is designed to be used with [ai-msgbot](https://github.com/pszemraj/ai-msgbot) to take advantage of the prompt engineering applied during fine-tuning.

+## Usage
+
+_**NOTE: this model needs to be loaded via the special patching technique** outlined in the `hivemind/gpt-j-6B-8bit` model card (as with all 8-bit models)._
+
+The notebook linked above contains examples of how to load the model correctly. A `.py` export of that notebook is in the repo for reference: [link here](https://huggingface.co/ethzanalytics/gpt-j-8bit-daily_dialogues/blob/main/_ai_msgbot_gpt_j_6b_8bit_with_hub.py)


+## Training

+For details, see [this wandb report](https://wandb.ai/pszemraj/conversational-6B-train-vanilla/reports/Training-6B-GPT-J-8bit-for-Dialogue--VmlldzoyNTg3MzE0), which covers both the daily-dialogues version and the WoW version.
+
+
+---
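The Usage note in the new README names a "patching technique" without showing what patching means. As a rough illustration only, and assuming the technique amounts to swapping a layer class inside the library before the model is built, the general pattern might be sketched as below. Everything here is a hypothetical stand-in (`fake_lib`, `Block`, `Int8Block`, `build_model`); it is not the code from the hivemind model card or the linked notebook, which remain the authoritative references for loading this model.

```python
import types

# Hypothetical stand-in for a library module (NOT the real transformers package).
fake_lib = types.ModuleType("fake_lib")


class Block:
    """Original layer class shipped by the pretend library."""
    kind = "fp32"


class Int8Block(Block):
    """Replacement layer that would know how to handle 8-bit weights."""
    kind = "int8"


fake_lib.Block = Block


def build_model(num_layers):
    # The class is looked up on the module at call time,
    # so whatever is currently assigned to fake_lib.Block gets built.
    return [fake_lib.Block() for _ in range(num_layers)]


before = build_model(2)

# The "patch": swap the class on the module before constructing the model.
fake_lib.Block = Int8Block
after = build_model(2)

print([layer.kind for layer in before])
print([layer.kind for layer in after])
```

The point of the toy is ordering: the swap has to happen before the model is constructed, which is presumably why the README insists on following the documented loading procedure rather than loading the checkpoint directly.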