Update README.md
README.md
CHANGED
@@ -59,8 +59,7 @@ pipeline_tag: text-generation
 # BEE-spoke-data/mega-ar-126m-4k
 
 
-This may not be the _best_ language model, but it is a language model! It's interesting for
-
+This may not be the _best_ language model, but it is a language model! It's interesting for several reasons, not the least of which is that it's not technically a transformer.
 
 Details:
 
@@ -75,7 +74,9 @@ For more info on MEGA (_& what some of the params above mean_), check out the [m
 
 ## Usage
 
-
+Usage is the same as any other small textgen model.
+
+Given the model's small size and architecture, it's probably best to leverage its longer context by adding input context to "see more" rather than "generate more".
 
 ## evals
 
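The Usage text added in this change says the model works like any other small text-generation model and is best fed more input context instead of being asked for long outputs. A minimal sketch of that pattern follows. It assumes the generic `transformers` `text-generation` pipeline can load this checkpoint in the installed `transformers` version (MEGA support may vary by release). The helper names `build_prompt` and `generate` are hypothetical and are not part of the README.

```python
# Sketch only: helper names are illustrative, not taken from the README.
MODEL_ID = "BEE-spoke-data/mega-ar-126m-4k"


def build_prompt(context_docs, text_to_continue, sep="\n\n"):
    """Place reference text first so the model "sees more", then the text to continue."""
    # Drop empty entries and surrounding whitespace before joining.
    parts = [doc.strip() for doc in context_docs if doc.strip()]
    parts.append(text_to_continue.strip())
    return sep.join(parts)


def generate(prompt, max_new_tokens=32):
    """Run the prompt through the generic text-generation pipeline.

    Assumption: the installed transformers version can load this MEGA checkpoint.
    """
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID)
    # Keep the generated continuation short; the long context goes into the prompt.
    outputs = generator(prompt, max_new_tokens=max_new_tokens, do_sample=False)
    return outputs[0]["generated_text"]
```

To try it, build a prompt with `build_prompt(["Background paragraph."], "The key takeaway is")` and pass the result to `generate`. Keeping `max_new_tokens` small is a design choice that mirrors the "see more" rather than "generate more" advice.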