Transfer learning?
Hi there, I just wanted to know if you pretrained this model without using GPT or any other model as a boost, i.e. from literal scratch, where you did not load any pretrained checkpoint. I need help.
Thanks
Hi, @Zemulax!
Yes, it was trained from scratch, without using any other model.
Specifically, I used the command line listed under "Creating a model on the fly" in the Transformers examples:
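Something along these lines (the config override values and dataset here are illustrative placeholders, not the exact Minueza-32M settings; the real details are in the article linked below):

```sh
python run_clm.py \
    --model_type gpt2 \
    --tokenizer_name gpt2 \
    --config_overrides "n_embd=384,n_head=6,n_layer=6" \
    --dataset_name wikitext \
    --dataset_config_name wikitext-103-raw-v1 \
    --do_train \
    --do_eval \
    --output_dir ./my-model-from-scratch
```

The key point is passing `--model_type` (instead of `--model_name_or_path`), which makes the script build a fresh config and randomly initialized weights rather than loading a pretrained checkpoint.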
You can also read more about the making of this model here:
The making of Minueza-32M: Transformer model trained from scratch
I read your incredible story. It's similar to what I want to achieve.
However, I have 5 billion tokens at my fingertips that I want to utilise. I am struggling with the learning rate. How do I set it, and which LR is suitable for my situation? I have done research but still cannot reach a conclusion. Please help.
Ah, the learning rate...
I believe each dataset has its own unique LR sweet spot.
Before actually starting to train the model, I suggest doing a warmup training run (using only 10K samples from your dataset) with 4 different LRs and then checking which one produced the best responses. Then you'll at least have a better starting point.
The first four LRs that I try are: 5e-5, 5e-6, 8e-7, 2e-4.
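As a rough sketch, that sweep can be scripted around the same run_clm.py example (the sample file, step count, and config values here are assumptions just to illustrate the idea):

```sh
# Hypothetical warmup sweep: one short run per candidate learning rate,
# each trained on a small slice of the dataset and saved to its own directory.
for LR in 5e-5 5e-6 8e-7 2e-4; do
    python run_clm.py \
        --model_type gpt2 \
        --tokenizer_name gpt2 \
        --config_overrides "n_embd=384,n_head=6,n_layer=6" \
        --train_file data/warmup_10k_samples.txt \
        --do_train \
        --learning_rate "$LR" \
        --max_steps 500 \
        --output_dir "runs/lr-sweep-$LR"
done
```

Afterwards, compare the training losses and sample generations from the four `runs/lr-sweep-*` checkpoints and carry the best-looking LR into the full training run.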
Thank you, Victor. And oh, how much did it cost you to pretrain? What GPUs did you use, and which cloud provider?
I trained Minueza-32M entirely locally, on a MacBook M1. It took some weeks, and I thought I'd see an increase in the electricity bill, but in the end I didn't notice any difference, so I'd say there were no costs.
Wow, awesome. Thank you, bro, this has been helpful. I am taking it a step further by pretraining something similar to GPT-1 or GPT-2 small. It's quite a journey, I must say.