Commit 594b7e2
1 Parent(s): a0687c6
Update README.md (#1)
- Update README.md (2a776549b399a9b40f8663b794e12d75187a8083)
Co-authored-by: Kate K <katek@users.noreply.huggingface.co>
README.md CHANGED
@@ -566,8 +566,7 @@ language:
 
 # Refact-1.6B
 
-Finally, the model we started training with our blog post
-[Applying Recent Innovations](https://refact.ai/blog/2023/applying-recent-innovations-to-train-model/) is ready 🎉
+Finally, the model we started training with our [blog post](https://refact.ai/blog/2023/applying-recent-innovations-to-train-model/) is ready 🎉
 
 After fine-tuning on generated data, it beats Replit 3b, Stability Code 3b and many other models. It almost beats
 StarCoder ten times the size!
@@ -614,7 +613,7 @@ Filtering is the key to success of this model:
 The text to code proportion was 50:50, model trained for 1.2T tokens.
 
 We don't release the base model, because its Fill-in-the-Middle (FIM) capability likes to repeat itself too much, so
-its practical use is limited. But if you still want it, write us a message on
+its practical use is limited. But if you still want it, write us a message on Discord.
 
 
 # Finetuning
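For context on the FIM capability mentioned in this hunk: a Fill-in-the-Middle model completes the gap between a known prefix and suffix instead of only continuing text left to right. Below is a minimal prompting sketch with Hugging Face Transformers; the checkpoint id and the StarCoder-style `<fim_prefix>`/`<fim_suffix>`/`<fim_middle>` tokens are assumptions for illustration, not taken from this commit.

```python
# Minimal FIM prompting sketch. The checkpoint id and special-token names
# below are assumptions (StarCoder-style), not specified in the diff above.
from transformers import AutoTokenizer, AutoModelForCausalLM

checkpoint = "smallcloudai/Refact-1_6B-fim"  # assumed HF repo id
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, trust_remote_code=True)

prefix = "def fibonacci(n):\n    "
suffix = "\n    return fibonacci(n - 1) + fibonacci(n - 2)"
# The model is asked to fill the gap between prefix and suffix.
prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
# Decode only the newly generated middle part.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

A base model whose FIM output loops is exactly what this decode would expose: the generated middle repeating the prefix or suffix instead of closing the gap.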
@@ -633,7 +632,7 @@ The former is likely finished, so the model tries to come up with a suggestion t
 You are likely to have half-written code as you work on it, there is no single addition that can repair it
 fully.
 
-In practice, model needs to have a tendency to stop after a couple of lines added, and sometimes don't write
+In practice, model needs to have a tendency to stop after a couple of lines are added, and sometimes don't write
 anything at all. We found that just giving it empty completions, single line completions, multiline
 completions that end with a smaller text indent or at least a newline -- makes it much more usable. This data
 was used as the rest 85% of the finetune dataset.
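The completion-shape heuristic in the hunk above is concrete enough to sketch. The hypothetical filter below keeps exactly the shapes the text describes: empty completions, single-line completions, and multiline completions that end at a smaller indent or at least on a newline. The function names, signature, and exact rules are illustrative assumptions, not the actual Refact training code.

```python
def indent_of(line: str) -> int:
    """Width of the leading-space indent of a line."""
    return len(line) - len(line.lstrip(" "))

def keep_for_finetune(completion: str, cursor_indent: int) -> bool:
    """Hypothetical filter for the completion shapes described in the diff:
    empty, single-line, or multiline completions that end at a smaller
    indent than the cursor, or at least end on a newline."""
    if completion == "":
        return True  # empty completion: teaches the model it may stop immediately
    lines = completion.splitlines()
    if len(lines) == 1:
        return True  # single-line completion
    if completion.endswith("\n"):
        return True  # multiline completion that at least ends at a newline
    # Otherwise require the last line to dedent below the cursor's indent,
    # since a drop in indentation usually closes a block.
    return indent_of(lines[-1]) < cursor_indent
```

The dedent check encodes the intuition that a decrease in indentation typically ends a code block, which is a natural place for a suggestion to stop rather than run on.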