---
base_model: deepseek-ai/deepseek-coder-1.3b-base
datasets:
- generator
library_name: peft
license: other
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: sanity_check
  results: []
---


[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/stojchets/huggingface/runs/sanity_check)

# sanity_check

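This model is a fine-tuned version of [deepseek-ai/deepseek-coder-1.3b-base](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-base) on the `generator` dataset. That name is what the Trainer typically records when the training data was built from a Python generator, so the underlying data source is not identified in this card.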
It achieves the following results on the evaluation set:
- Loss: 1.1990
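
The card metadata lists `library_name: peft`, which suggests the checkpoint is an adapter that is loaded on top of the base model rather than a full set of weights. The following is a minimal sketch of how that might look, not a confirmed recipe for this checkpoint. `ADAPTER_ID` is a hypothetical placeholder (the card does not state the repository id), and the helper names `load_adapter` and `complete` are illustrative only.

```python
BASE_MODEL_ID = "deepseek-ai/deepseek-coder-1.3b-base"  # taken from this card's metadata
ADAPTER_ID = "your-namespace/sanity_check"  # hypothetical placeholder, replace with the real repo id or a local path


def load_adapter(adapter_id: str = ADAPTER_ID, base_id: str = BASE_MODEL_ID):
    """Load the base model and tokenizer, then attach the adapter weights."""
    # Imported inside the function so the heavy dependencies are only
    # required when the model is actually loaded.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    # Assumption: the checkpoint is a standard PEFT adapter for this base model.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return model, tokenizer


def complete(model, tokenizer, prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a continuation of `prompt` and return the decoded text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `model, tokenizer = load_adapter("<repo id or local path>")` followed by `complete(model, tokenizer, "def fib(n):")` would then produce a code completion, assuming the adapter files are available.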

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1.41e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 1
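
To make the relationship between these values concrete, the sketch below recomputes the total batch size and the shape of a linear learning-rate schedule. It rests on assumptions the card does not state: a single training device, zero warmup steps, and 78 optimizer steps in total (the last step shown in the results table). The function names are illustrative.

```python
def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Number of examples that contribute to one optimizer step."""
    return per_device * grad_accum * num_devices


def linear_lr(step: int, total_steps: int, base_lr: float, warmup_steps: int = 0) -> float:
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * remaining


# Values from the hyperparameter list; num_devices=1 is an assumption.
total_batch = effective_batch_size(per_device=8, grad_accum=16)

# Learning rate at the start, midpoint, and end of 78 steps (no warmup assumed).
for step in (0, 39, 78):
    print(step, linear_lr(step, total_steps=78, base_lr=1.41e-05))
```

With one device, 8 examples per step times 16 accumulation steps gives the listed total of 128. A different device count or a nonzero warmup would change these numbers.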

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.2662        | 0.0128 | 1    | 1.2382          |
| 1.2248        | 0.0256 | 2    | 1.2365          |
| 1.2565        | 0.0384 | 3    | 1.2350          |
| 1.1548        | 0.0512 | 4    | 1.2335          |
| 1.2197        | 0.064  | 5    | 1.2321          |
| 1.2074        | 0.0768 | 6    | 1.2307          |
| 1.2091        | 0.0896 | 7    | 1.2294          |
| 1.2575        | 0.1024 | 8    | 1.2281          |
| 1.2185        | 0.1152 | 9    | 1.2268          |
| 1.2379        | 0.128  | 10   | 1.2256          |
| 1.2604        | 0.1408 | 11   | 1.2247          |
| 1.1883        | 0.1536 | 12   | 1.2238          |
| 1.1613        | 0.1664 | 13   | 1.2230          |
| 1.277         | 0.1792 | 14   | 1.2221          |
| 1.2099        | 0.192  | 15   | 1.2212          |
| 1.2417        | 0.2048 | 16   | 1.2203          |
| 1.2294        | 0.2176 | 17   | 1.2195          |
| 1.2384        | 0.2304 | 18   | 1.2187          |
| 1.1781        | 0.2432 | 19   | 1.2180          |
| 1.1793        | 0.256  | 20   | 1.2172          |
| 1.207         | 0.2688 | 21   | 1.2165          |
| 1.1863        | 0.2816 | 22   | 1.2158          |
| 1.226         | 0.2944 | 23   | 1.2150          |
| 1.2658        | 0.3072 | 24   | 1.2144          |
| 1.1897        | 0.32   | 25   | 1.2137          |
| 1.229         | 0.3328 | 26   | 1.2131          |
| 1.1983        | 0.3456 | 27   | 1.2125          |
| 1.1793        | 0.3584 | 28   | 1.2119          |
| 1.1742        | 0.3712 | 29   | 1.2114          |
| 1.1599        | 0.384  | 30   | 1.2108          |
| 1.1661        | 0.3968 | 31   | 1.2103          |
| 1.19          | 0.4096 | 32   | 1.2098          |
| 1.2341        | 0.4224 | 33   | 1.2094          |
| 1.2048        | 0.4352 | 34   | 1.2089          |
| 1.1698        | 0.448  | 35   | 1.2084          |
| 1.1636        | 0.4608 | 36   | 1.2079          |
| 1.1916        | 0.4736 | 37   | 1.2075          |
| 1.2079        | 0.4864 | 38   | 1.2070          |
| 1.2424        | 0.4992 | 39   | 1.2066          |
| 1.1794        | 0.512  | 40   | 1.2062          |
| 1.204         | 0.5248 | 41   | 1.2058          |
| 1.2037        | 0.5376 | 42   | 1.2055          |
| 1.1547        | 0.5504 | 43   | 1.2051          |
| 1.1731        | 0.5632 | 44   | 1.2048          |
| 1.1697        | 0.576  | 45   | 1.2044          |
| 1.2069        | 0.5888 | 46   | 1.2041          |
| 1.1558        | 0.6016 | 47   | 1.2038          |
| 1.1896        | 0.6144 | 48   | 1.2035          |
| 1.1886        | 0.6272 | 49   | 1.2032          |
| 1.169         | 0.64   | 50   | 1.2029          |
| 1.205         | 0.6528 | 51   | 1.2026          |
| 1.1534        | 0.6656 | 52   | 1.2023          |
| 1.1985        | 0.6784 | 53   | 1.2021          |
| 1.1716        | 0.6912 | 54   | 1.2019          |
| 1.1512        | 0.704  | 55   | 1.2016          |
| 1.1721        | 0.7168 | 56   | 1.2014          |
| 1.1946        | 0.7296 | 57   | 1.2012          |
| 1.1883        | 0.7424 | 58   | 1.2010          |
| 1.1634        | 0.7552 | 59   | 1.2008          |
| 1.198         | 0.768  | 60   | 1.2006          |
| 1.2374        | 0.7808 | 61   | 1.2005          |
| 1.2249        | 0.7936 | 62   | 1.2003          |
| 1.1671        | 0.8064 | 63   | 1.2002          |
| 1.2117        | 0.8192 | 64   | 1.2000          |
| 1.1646        | 0.832  | 65   | 1.1999          |
| 1.1244        | 0.8448 | 66   | 1.1998          |
| 1.25          | 0.8576 | 67   | 1.1997          |
| 1.1358        | 0.8704 | 68   | 1.1995          |
| 1.1664        | 0.8832 | 69   | 1.1995          |
| 1.1417        | 0.896  | 70   | 1.1994          |
| 1.1963        | 0.9088 | 71   | 1.1993          |
| 1.2278        | 0.9216 | 72   | 1.1992          |
| 1.2004        | 0.9344 | 73   | 1.1992          |
| 1.1583        | 0.9472 | 74   | 1.1991          |
| 1.211         | 0.96   | 75   | 1.1991          |
| 1.2288        | 0.9728 | 76   | 1.1991          |
| 1.1682        | 0.9856 | 77   | 1.1990          |
| 1.1916        | 0.9984 | 78   | 1.1990          |
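
The table also allows a rough back-of-the-envelope reading of the run. The sketch below estimates the number of training sequences from the epoch fraction and summarizes the change in validation loss. It assumes the logged epoch values are rounded to four decimals and that each optimizer step consumes 128 sequences (the listed `total_train_batch_size`). If sequence packing was used, the estimate counts packed sequences rather than raw examples.

```python
# Values copied from the first and last rows of the results table.
first_val_loss = 1.2382
last_val_loss = 1.1990
last_step = 78
last_epoch = 0.9984
total_batch = 128  # total_train_batch_size from the hyperparameter list

# Steps needed for one full epoch, inferred from the last logged row.
steps_per_epoch = last_step / last_epoch

# Approximate number of training sequences seen in one epoch.
approx_train_sequences = steps_per_epoch * total_batch

# Absolute and relative improvement in validation loss over the run.
abs_drop = first_val_loss - last_val_loss
rel_drop = abs_drop / first_val_loss

print(f"steps per epoch: {steps_per_epoch:.3f}")
print(f"approx. training sequences: {approx_train_sequences:.0f}")
print(f"validation loss drop: {abs_drop:.4f} ({rel_drop:.1%})")
```

Under these assumptions the training split holds roughly 10,000 sequences, and the validation loss falls by about 0.04 over the epoch, with the curve flattening in the last few steps.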


### Framework versions

- PEFT 0.10.0
- Transformers 4.43.0.dev0
- PyTorch 2.2.2+cu121
- Datasets 2.19.2
- Tokenizers 0.19.1