
A finetune of microsoft/Phi-3-mini-128k-instruct on m-a-p/CodeFeedback-Filtered-Instruction, trained for roughly 9-10 hours on a single RTX 3090 (24 GB).
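
A minimal inference sketch with transformers is shown below. The prompt text is only an illustration; the fp16 dtype and the `trust_remote_code` flag follow common usage for Phi-3-family models and may not be strictly required on recent transformers versions.

```python
# Minimal inference sketch (assumes a recent transformers release with Phi-3 support).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RDson/Phi-3-mini-code-finetune-128k-instruct-v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # 3.8B params fits comfortably on a 24 GB GPU in fp16
    device_map="auto",
    trust_remote_code=True,     # Phi-3 repos ship custom modeling code
)

messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```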

Due to limited resources and time, training covered only about half of one epoch (0.5136), reaching a final training loss of 0.43311.

Training hyperparameters:

    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    max_length=1024,
    max_prompt_length=512,
    overwrite_output_dir=True,
    beta=0.1,
    gradient_accumulation_steps=8,
    optim="adamw_torch",
    num_train_epochs=1,
    evaluation_strategy="steps",
    eval_steps=0.2,
    logging_steps=1,
    warmup_steps=50,
    fp16=True,
    save_steps=50
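
The argument names above (`beta`, `max_prompt_length`) match TRL's preference-tuning configs, so one plausible reconstruction uses TRL's `DPOConfig`. The card does not name the trainer, and the `output_dir` below is hypothetical, so treat this as an illustrative sketch rather than the actual training script.

```python
# Hypothetical reconstruction of the training arguments with TRL's DPOConfig
# (trainer choice is an assumption; the card only lists the argument values).
# Requires trl >= 0.9, where beta/max_length/max_prompt_length live on the config.
from trl import DPOConfig

training_args = DPOConfig(
    output_dir="phi-3-mini-code-finetune",  # hypothetical output path
    overwrite_output_dir=True,
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    max_length=1024,        # full prompt + completion length in tokens
    max_prompt_length=512,  # prompt portion is truncated to this length
    beta=0.1,               # strength of the preference-loss regularization
    gradient_accumulation_steps=8,
    optim="adamw_torch",
    num_train_epochs=1,
    evaluation_strategy="steps",
    eval_steps=0.2,         # a float < 1 is interpreted as a fraction of total steps
    logging_steps=1,
    warmup_steps=50,
    fp16=True,
    save_steps=50,
)
```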