cae853d7940781dc7e9d9554f584df5f

This model is a fine-tuned version of deepseek-ai/deepseek-coder-1.3b-base on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 1.1733
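
Because this checkpoint is a PEFT adapter rather than a full model, it must be loaded on top of the deepseek-ai/deepseek-coder-1.3b-base weights. A minimal loading sketch, assuming the adapter repo id stojchet/cae853d7940781dc7e9d9554f584df5f and standard causal-LM generation (the prompt and dtype are illustrative choices, not from this card):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "deepseek-ai/deepseek-coder-1.3b-base"
adapter_id = "stojchet/cae853d7940781dc7e9d9554f584df5f"  # this repo

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

# Attach the PEFT adapter weights to the frozen base model.
model = PeftModel.from_pretrained(base, adapter_id)

prompt = "def quicksort(arr):"  # illustrative prompt
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```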

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 1.41e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3
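
The values above map onto transformers.TrainingArguments roughly as sketched below. The original training script is not part of this card, so this is an approximation under stated assumptions; the output directory is a hypothetical placeholder.

```python
from transformers import TrainingArguments

# Sketch of the listed hyperparameters; not the original training script.
args = TrainingArguments(
    output_dir="finetune-out",       # hypothetical placeholder
    learning_rate=1.41e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=16,  # 8 x 16 = 128 total train batch size
    lr_scheduler_type="linear",
    num_train_epochs=3,
    adam_beta1=0.9,                  # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```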

Training results

Validation loss decreased from 1.2383 at step 1 to 1.1733 at step 234 (three epochs), with most of the improvement in the first epoch and an essentially flat curve over the final epoch.

Training Loss Epoch Step Validation Loss
1.2662 0.0128 1 1.2383
1.2249 0.0256 2 1.2367
1.2567 0.0384 3 1.2352
1.155 0.0512 4 1.2338
1.2201 0.064 5 1.2324
1.2077 0.0768 6 1.2310
1.2095 0.0896 7 1.2296
1.2579 0.1024 8 1.2283
1.2189 0.1152 9 1.2271
1.2382 0.128 10 1.2258
1.2605 0.1408 11 1.2248
1.1883 0.1536 12 1.2239
1.1614 0.1664 13 1.2230
1.2769 0.1792 14 1.2220
1.2099 0.192 15 1.2211
1.2414 0.2048 16 1.2201
1.2291 0.2176 17 1.2192
1.2381 0.2304 18 1.2184
1.1776 0.2432 19 1.2175
1.1788 0.256 20 1.2167
1.2061 0.2688 21 1.2159
1.1856 0.2816 22 1.2150
1.2252 0.2944 23 1.2142
1.2646 0.3072 24 1.2134
1.1888 0.32 25 1.2126
1.228 0.3328 26 1.2118
1.1969 0.3456 27 1.2111
1.1779 0.3584 28 1.2103
1.1726 0.3712 29 1.2096
1.1582 0.384 30 1.2089
1.1643 0.3968 31 1.2083
1.1878 0.4096 32 1.2076
1.2315 0.4224 33 1.2070
1.2022 0.4352 34 1.2063
1.1669 0.448 35 1.2057
1.1609 0.4608 36 1.2051
1.1888 0.4736 37 1.2045
1.2044 0.4864 38 1.2039
1.2389 0.4992 39 1.2033
1.1755 0.512 40 1.2027
1.1997 0.5248 41 1.2021
1.1997 0.5376 42 1.2015
1.1511 0.5504 43 1.2009
1.1689 0.5632 44 1.2004
1.1654 0.576 45 1.1998
1.2018 0.5888 46 1.1993
1.1503 0.6016 47 1.1988
1.1835 0.6144 48 1.1983
1.1831 0.6272 49 1.1977
1.1629 0.64 50 1.1972
1.2002 0.6528 51 1.1967
1.1467 0.6656 52 1.1963
1.193 0.6784 53 1.1959
1.1652 0.6912 54 1.1955
1.1446 0.704 55 1.1950
1.1657 0.7168 56 1.1946
1.1865 0.7296 57 1.1941
1.1803 0.7424 58 1.1936
1.1562 0.7552 59 1.1931
1.1881 0.768 60 1.1926
1.2279 0.7808 61 1.1921
1.2158 0.7936 62 1.1915
1.1586 0.8064 63 1.1910
1.2019 0.8192 64 1.1906
1.155 0.832 65 1.1901
1.1142 0.8448 66 1.1897
1.2389 0.8576 67 1.1894
1.1259 0.8704 68 1.1889
1.1568 0.8832 69 1.1886
1.1306 0.896 70 1.1882
1.1814 0.9088 71 1.1877
1.2137 0.9216 72 1.1873
1.1884 0.9344 73 1.1868
1.1446 0.9472 74 1.1863
1.1979 0.96 75 1.1858
1.2137 0.9728 76 1.1854
1.1541 0.9856 77 1.1851
1.1775 0.9984 78 1.1847
1.1489 1.0112 79 1.1844
1.131 1.024 80 1.1841
1.1427 1.0368 81 1.1837
1.2006 1.0496 82 1.1833
1.1473 1.0624 83 1.1830
1.1315 1.0752 84 1.1826
1.1497 1.088 85 1.1823
1.1845 1.1008 86 1.1820
1.1845 1.1136 87 1.1817
1.1167 1.1264 88 1.1814
1.1639 1.1392 89 1.1811
1.1952 1.152 90 1.1808
1.1327 1.1648 91 1.1805
1.0937 1.1776 92 1.1802
1.1549 1.1904 93 1.1799
1.1704 1.2032 94 1.1797
1.1479 1.216 95 1.1794
1.2221 1.2288 96 1.1792
1.1193 1.2416 97 1.1789
1.1259 1.2544 98 1.1786
1.1816 1.2672 99 1.1784
1.1566 1.28 100 1.1782
1.1093 1.2928 101 1.1780
1.1985 1.3056 102 1.1779
1.1553 1.3184 103 1.1778
1.1772 1.3312 104 1.1776
1.1154 1.3440 105 1.1775
1.1666 1.3568 106 1.1774
1.1494 1.3696 107 1.1772
1.1508 1.3824 108 1.1771
1.201 1.3952 109 1.1770
1.1919 1.408 110 1.1769
1.1885 1.4208 111 1.1768
1.2055 1.4336 112 1.1767
1.1522 1.4464 113 1.1766
1.1565 1.4592 114 1.1765
1.1551 1.472 115 1.1764
1.17 1.4848 116 1.1763
1.1631 1.4976 117 1.1762
1.1396 1.5104 118 1.1761
1.1355 1.5232 119 1.1760
1.1606 1.536 120 1.1760
1.1594 1.5488 121 1.1759
1.1783 1.5616 122 1.1758
1.1592 1.5744 123 1.1758
1.1159 1.5872 124 1.1757
1.1807 1.6 125 1.1756
1.2294 1.6128 126 1.1756
1.1922 1.6256 127 1.1755
1.1532 1.6384 128 1.1755
1.1956 1.6512 129 1.1754
1.1954 1.6640 130 1.1754
1.1479 1.6768 131 1.1753
1.1398 1.6896 132 1.1753
1.1724 1.7024 133 1.1752
1.1397 1.7152 134 1.1752
1.2162 1.728 135 1.1751
1.1854 1.7408 136 1.1751
1.1411 1.7536 137 1.1751
1.0747 1.7664 138 1.1750
1.1727 1.7792 139 1.1750
1.1701 1.792 140 1.1750
1.1688 1.8048 141 1.1750
1.1545 1.8176 142 1.1750
1.1512 1.8304 143 1.1749
1.203 1.8432 144 1.1749
1.1665 1.8560 145 1.1749
1.186 1.8688 146 1.1748
1.1283 1.8816 147 1.1748
1.1555 1.8944 148 1.1748
1.1243 1.9072 149 1.1748
1.1767 1.92 150 1.1747
1.1505 1.9328 151 1.1747
1.1012 1.9456 152 1.1747
1.2098 1.9584 153 1.1747
1.1476 1.9712 154 1.1746
1.2055 1.984 155 1.1746
1.1539 1.9968 156 1.1746
1.176 2.0096 157 1.1745
1.1357 2.0224 158 1.1745
1.1943 2.0352 159 1.1745
1.1447 2.048 160 1.1744
1.123 2.0608 161 1.1744
1.1638 2.0736 162 1.1744
1.1551 2.0864 163 1.1744
1.1409 2.0992 164 1.1743
1.1071 2.112 165 1.1743
1.1705 2.1248 166 1.1743
1.2038 2.1376 167 1.1742
1.1734 2.1504 168 1.1742
1.1538 2.1632 169 1.1742
1.179 2.176 170 1.1742
1.1614 2.1888 171 1.1741
1.1397 2.2016 172 1.1741
1.1569 2.2144 173 1.1741
1.1379 2.2272 174 1.1740
1.1304 2.24 175 1.1740
1.1855 2.2528 176 1.1740
1.1763 2.2656 177 1.1740
1.1194 2.2784 178 1.1739
1.0971 2.2912 179 1.1739
1.1566 2.304 180 1.1739
1.1421 2.3168 181 1.1739
1.1645 2.3296 182 1.1738
1.1782 2.3424 183 1.1738
1.1514 2.3552 184 1.1738
1.175 2.368 185 1.1738
1.1279 2.3808 186 1.1738
1.1158 2.3936 187 1.1738
1.202 2.4064 188 1.1737
1.164 2.4192 189 1.1737
1.1431 2.432 190 1.1737
1.1271 2.4448 191 1.1737
1.1746 2.4576 192 1.1736
1.1126 2.4704 193 1.1736
1.1652 2.4832 194 1.1736
1.1692 2.496 195 1.1736
1.1764 2.5088 196 1.1736
1.1905 2.5216 197 1.1736
1.1679 2.5344 198 1.1735
1.1324 2.5472 199 1.1735
1.124 2.56 200 1.1735
1.1296 2.5728 201 1.1735
1.1498 2.5856 202 1.1735
1.1845 2.5984 203 1.1735
1.0965 2.6112 204 1.1735
1.1511 2.624 205 1.1735
1.1703 2.6368 206 1.1734
1.1948 2.6496 207 1.1734
1.1688 2.6624 208 1.1734
1.1528 2.6752 209 1.1734
1.1261 2.6880 210 1.1734
1.1662 2.7008 211 1.1734
1.1596 2.7136 212 1.1734
1.1474 2.7264 213 1.1734
1.1813 2.7392 214 1.1734
1.1624 2.752 215 1.1734
1.1604 2.7648 216 1.1734
1.1596 2.7776 217 1.1734
1.2008 2.7904 218 1.1734
1.1813 2.8032 219 1.1734
1.2147 2.816 220 1.1734
1.1821 2.8288 221 1.1734
1.1476 2.8416 222 1.1734
1.1416 2.8544 223 1.1734
1.1228 2.8672 224 1.1733
1.1908 2.88 225 1.1733
1.1666 2.8928 226 1.1733
1.0962 2.9056 227 1.1733
1.1721 2.9184 228 1.1733
1.1158 2.9312 229 1.1733
1.1282 2.944 230 1.1733
1.1401 2.9568 231 1.1733
1.1897 2.9696 232 1.1733
1.1395 2.9824 233 1.1733
1.141 2.9952 234 1.1733

Framework versions

  • PEFT 0.10.0
  • Transformers 4.43.0.dev0
  • Pytorch 2.2.2+cu121
  • Datasets 2.19.2
  • Tokenizers 0.19.1