Mistral_Sparse_refined_web_50p_cut_pre_mlp_2024-03-23
This model is a fine-tuned version of mistralai/Mistral-7B-v0.1. The training dataset is not recorded (the auto-generated card lists it as None). It achieves the following results on the evaluation set:
- Loss: 2.1205
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 0
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- total_eval_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 10000
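The totals in the list above follow from the per-device settings, and the linear scheduler can be sketched in a few lines. This is a minimal illustration rather than the training script itself: the function name `linear_lr` is hypothetical, and `warmup_steps=0` is an assumption, since the card lists no warmup setting.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 1             # per device
eval_batch_size = 1              # per device
num_devices = 4
gradient_accumulation_steps = 4
learning_rate = 1e-05
training_steps = 10000

# Effective batch sizes: accumulation applies to training only.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices


def linear_lr(step, base_lr=learning_rate, total_steps=training_steps, warmup_steps=0):
    """Linear schedule: optional linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the card does not report a warmup value.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * remaining


print(total_train_batch_size, total_eval_batch_size)
print(linear_lr(0), linear_lr(5000), linear_lr(10000))
```

Under these assumptions the learning rate starts at the listed 1e-05 and reaches zero at step 10000; a nonzero warmup would change the early part of the curve.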
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
2.5127 | 0.0 | 25 | 2.5938 |
2.3459 | 0.01 | 50 | 2.5549 |
2.3273 | 0.01 | 75 | 2.5028 |
2.3381 | 0.02 | 100 | 2.5017 |
2.2772 | 0.02 | 125 | 2.4983 |
2.2464 | 0.03 | 150 | 2.4843 |
2.2732 | 0.03 | 175 | 2.4808 |
2.3294 | 0.03 | 200 | 2.4697 |
2.1752 | 0.04 | 225 | 2.4677 |
2.3093 | 0.04 | 250 | 2.4660 |
2.3592 | 0.05 | 275 | 2.4681 |
2.3321 | 0.05 | 300 | 2.4595 |
2.2232 | 0.05 | 325 | 2.4572 |
2.2089 | 0.06 | 350 | 2.4553 |
2.204 | 0.06 | 375 | 2.4508 |
2.2677 | 0.07 | 400 | 2.4514 |
2.2544 | 0.07 | 425 | 2.4482 |
2.2969 | 0.08 | 450 | 2.4442 |
2.3415 | 0.08 | 475 | 2.4489 |
2.3428 | 0.08 | 500 | 2.4489 |
2.2938 | 0.09 | 525 | 2.4393 |
2.3459 | 0.09 | 550 | 2.4389 |
2.2487 | 0.1 | 575 | 2.4457 |
2.197 | 0.1 | 600 | 2.4433 |
2.272 | 0.11 | 625 | 2.4396 |
2.2425 | 0.11 | 650 | 2.4367 |
2.2543 | 0.11 | 675 | 2.4387 |
2.2598 | 0.12 | 700 | 2.4352 |
2.2381 | 0.12 | 725 | 2.4408 |
2.3656 | 0.13 | 750 | 2.4307 |
2.352 | 0.13 | 775 | 2.4299 |
2.1816 | 0.13 | 800 | 2.4344 |
2.24 | 0.14 | 825 | 2.4305 |
2.3039 | 0.14 | 850 | 2.4245 |
2.3169 | 0.15 | 875 | 2.4318 |
2.184 | 0.15 | 900 | 2.4287 |
2.2618 | 0.16 | 925 | 2.4308 |
2.2207 | 0.16 | 950 | 2.4327 |
2.2786 | 0.16 | 975 | 2.4244 |
2.3708 | 0.17 | 1000 | 2.4275 |
2.3165 | 0.17 | 1025 | 2.4286 |
2.2927 | 0.18 | 1050 | 2.4272 |
2.2849 | 0.18 | 1075 | 2.4297 |
2.2898 | 0.19 | 1100 | 2.4294 |
2.3798 | 0.19 | 1125 | 2.4188 |
2.4131 | 0.19 | 1150 | 2.4314 |
2.1314 | 0.2 | 1175 | 2.4265 |
2.3814 | 0.2 | 1200 | 2.4254 |
2.2761 | 0.21 | 1225 | 2.4238 |
2.2327 | 0.21 | 1250 | 2.4327 |
2.2236 | 0.22 | 1275 | 2.4245 |
2.2343 | 0.22 | 1300 | 2.4280 |
2.265 | 0.22 | 1325 | 2.4186 |
2.1813 | 0.23 | 1350 | 2.4303 |
2.2276 | 0.23 | 1375 | 2.4231 |
2.2444 | 0.24 | 1400 | 2.4234 |
2.3472 | 0.24 | 1425 | 2.4225 |
2.3111 | 0.24 | 1450 | 2.4240 |
2.3111 | 0.25 | 1475 | 2.4288 |
2.3205 | 0.25 | 1500 | 2.4291 |
2.3389 | 0.26 | 1525 | 2.4234 |
2.2517 | 0.26 | 1550 | 2.4255 |
2.3416 | 0.27 | 1575 | 2.4245 |
2.1858 | 0.27 | 1600 | 2.4184 |
2.1582 | 0.27 | 1625 | 2.4182 |
2.1512 | 0.28 | 1650 | 2.4246 |
2.248 | 0.28 | 1675 | 2.4253 |
2.2535 | 0.29 | 1700 | 2.4246 |
2.3005 | 0.29 | 1725 | 2.4195 |
2.2144 | 0.3 | 1750 | 2.4236 |
2.198 | 0.3 | 1775 | 2.4237 |
2.1911 | 0.3 | 1800 | 2.4203 |
2.2513 | 0.31 | 1825 | 2.4250 |
2.2442 | 0.31 | 1850 | 2.4231 |
2.2877 | 0.32 | 1875 | 2.4239 |
2.3341 | 0.32 | 1900 | 2.4187 |
2.2493 | 0.32 | 1925 | 2.4262 |
2.2687 | 0.33 | 1950 | 2.4222 |
2.2674 | 0.33 | 1975 | 2.4200 |
2.2928 | 0.34 | 2000 | 2.4126 |
2.2556 | 0.34 | 2025 | 2.4283 |
2.1929 | 0.35 | 2050 | 2.4195 |
2.1952 | 0.35 | 2075 | 2.4249 |
2.2114 | 0.35 | 2100 | 2.4234 |
2.2207 | 0.36 | 2125 | 2.4223 |
2.3071 | 0.36 | 2150 | 2.4223 |
2.2019 | 0.37 | 2175 | 2.4152 |
2.2224 | 0.37 | 2200 | 2.4230 |
2.1832 | 0.38 | 2225 | 2.4188 |
2.291 | 0.38 | 2250 | 2.4179 |
2.228 | 0.38 | 2275 | 2.4234 |
2.1592 | 0.39 | 2300 | 2.4178 |
2.2529 | 0.39 | 2325 | 2.4169 |
2.1175 | 0.4 | 2350 | 2.4169 |
2.3012 | 0.4 | 2375 | 2.4243 |
2.2626 | 0.4 | 2400 | 2.4165 |
2.1595 | 0.41 | 2425 | 2.4215 |
2.2097 | 0.41 | 2450 | 2.4179 |
2.2954 | 0.42 | 2475 | 2.4183 |
2.2535 | 0.42 | 2500 | 2.4167 |
2.2211 | 0.43 | 2525 | 2.4181 |
2.2505 | 0.43 | 2550 | 2.4264 |
2.1676 | 0.43 | 2575 | 2.4108 |
2.1906 | 0.44 | 2600 | 2.4152 |
2.2112 | 0.44 | 2625 | 2.4152 |
2.2729 | 0.45 | 2650 | 2.4147 |
2.2493 | 0.45 | 2675 | 2.4228 |
2.2266 | 0.46 | 2700 | 2.4186 |
2.2447 | 0.46 | 2725 | 2.4186 |
2.2216 | 0.46 | 2750 | 2.4132 |
2.3827 | 0.47 | 2775 | 2.4202 |
2.3067 | 0.47 | 2800 | 2.4126 |
2.1683 | 0.48 | 2825 | 2.4149 |
2.1962 | 0.48 | 2850 | 2.4131 |
2.2222 | 0.48 | 2875 | 2.4154 |
2.3168 | 0.49 | 2900 | 2.4141 |
2.2526 | 0.49 | 2925 | 2.4142 |
2.3378 | 0.5 | 2950 | 2.4183 |
2.2296 | 0.5 | 2975 | 2.4125 |
2.2563 | 0.51 | 3000 | 2.4137 |
2.3374 | 0.51 | 3025 | 2.4189 |
2.1736 | 0.51 | 3050 | 2.4094 |
2.3238 | 0.52 | 3075 | 2.4124 |
2.2334 | 0.52 | 3100 | 2.4152 |
2.3054 | 0.53 | 3125 | 2.4113 |
2.3322 | 0.53 | 3150 | 2.4123 |
2.2122 | 0.54 | 3175 | 2.4139 |
2.3256 | 0.54 | 3200 | 2.4085 |
2.2293 | 0.54 | 3225 | 2.4141 |
2.2341 | 0.55 | 3250 | 2.4148 |
2.2464 | 0.55 | 3275 | 2.4169 |
2.2551 | 0.56 | 3300 | 2.4115 |
2.3158 | 0.56 | 3325 | 2.4185 |
2.2789 | 0.56 | 3350 | 2.4138 |
2.3503 | 0.57 | 3375 | 2.4213 |
2.3434 | 0.57 | 3400 | 2.4154 |
2.3048 | 0.58 | 3425 | 2.4161 |
2.259 | 0.58 | 3450 | 2.4166 |
2.219 | 0.59 | 3475 | 2.4117 |
2.1541 | 0.59 | 3500 | 2.4193 |
2.2086 | 0.59 | 3525 | 2.4143 |
2.1673 | 0.6 | 3550 | 2.4184 |
2.1865 | 0.6 | 3575 | 2.4197 |
2.2537 | 0.61 | 3600 | 2.4141 |
2.2065 | 0.61 | 3625 | 2.4174 |
2.159 | 0.62 | 3650 | 2.4147 |
2.3402 | 0.62 | 3675 | 2.4175 |
2.2399 | 0.62 | 3700 | 2.4181 |
2.3507 | 0.63 | 3725 | 2.4153 |
2.2658 | 0.63 | 3750 | 2.4170 |
2.3211 | 0.64 | 3775 | 2.4088 |
2.2072 | 0.64 | 3800 | 2.4126 |
2.2433 | 0.65 | 3825 | 2.4160 |
2.225 | 0.65 | 3850 | 2.4088 |
2.1458 | 0.65 | 3875 | 2.4121 |
2.3704 | 0.66 | 3900 | 2.4097 |
2.2315 | 0.66 | 3925 | 2.4092 |
2.2295 | 0.67 | 3950 | 2.4141 |
2.2763 | 0.67 | 3975 | 2.4149 |
2.217 | 0.67 | 4000 | 2.4139 |
2.2287 | 0.68 | 4025 | 2.4113 |
2.2748 | 0.68 | 4050 | 2.4077 |
2.1584 | 0.69 | 4075 | 2.4121 |
2.2214 | 0.69 | 4100 | 2.4166 |
2.3557 | 0.7 | 4125 | 2.4076 |
2.2453 | 0.7 | 4150 | 2.4151 |
2.2167 | 0.7 | 4175 | 2.4140 |
2.3674 | 0.71 | 4200 | 2.4119 |
2.2979 | 0.71 | 4225 | 2.4146 |
2.2178 | 0.72 | 4250 | 2.4152 |
2.2091 | 0.72 | 4275 | 2.4101 |
2.3138 | 0.73 | 4300 | 2.4104 |
2.2504 | 0.73 | 4325 | 2.4136 |
2.2348 | 0.73 | 4350 | 2.4150 |
2.2141 | 0.74 | 4375 | 2.4174 |
2.1284 | 0.74 | 4400 | 2.4094 |
2.2926 | 0.75 | 4425 | 2.4178 |
2.1642 | 0.75 | 4450 | 2.4102 |
2.2263 | 0.75 | 4475 | 2.4196 |
2.3722 | 0.76 | 4500 | 2.4099 |
2.1992 | 0.76 | 4525 | 2.4114 |
2.2651 | 0.77 | 4550 | 2.4149 |
2.289 | 0.77 | 4575 | 2.4078 |
2.2911 | 0.78 | 4600 | 2.4073 |
2.2206 | 0.78 | 4625 | 2.4061 |
2.1851 | 0.78 | 4650 | 2.4094 |
2.2674 | 0.79 | 4675 | 2.4064 |
2.2032 | 0.79 | 4700 | 2.4055 |
2.1522 | 0.8 | 4725 | 2.4138 |
2.3039 | 0.8 | 4750 | 2.4096 |
2.2066 | 0.81 | 4775 | 2.4122 |
2.2193 | 0.81 | 4800 | 2.4156 |
2.2599 | 0.81 | 4825 | 2.4098 |
2.2994 | 0.82 | 4850 | 2.4053 |
2.2463 | 0.82 | 4875 | 2.4052 |
2.1318 | 0.83 | 4900 | 2.4072 |
2.1696 | 0.83 | 4925 | 2.4086 |
2.2104 | 0.83 | 4950 | 2.4082 |
2.3455 | 0.84 | 4975 | 2.4070 |
2.165 | 0.84 | 5000 | 2.4092 |
2.2742 | 0.85 | 5025 | 2.4096 |
2.3341 | 0.85 | 5050 | 2.4103 |
2.2294 | 0.86 | 5075 | 2.4082 |
2.2256 | 0.86 | 5100 | 2.4136 |
2.1586 | 0.86 | 5125 | 2.4132 |
2.2623 | 0.87 | 5150 | 2.4126 |
2.2405 | 0.87 | 5175 | 2.4120 |
2.1848 | 0.88 | 5200 | 2.4158 |
2.216 | 0.88 | 5225 | 2.4126 |
2.2648 | 0.89 | 5250 | 2.4093 |
2.2928 | 0.89 | 5275 | 2.4100 |
2.2365 | 0.89 | 5300 | 2.4081 |
2.1913 | 0.9 | 5325 | 2.4041 |
2.1835 | 0.9 | 5350 | 2.4097 |
2.2158 | 0.91 | 5375 | 2.4083 |
2.2001 | 0.91 | 5400 | 2.4067 |
2.2133 | 0.91 | 5425 | 2.4122 |
2.2104 | 0.92 | 5450 | 2.4169 |
2.3368 | 0.92 | 5475 | 2.4124 |
2.2057 | 0.93 | 5500 | 2.4108 |
2.1003 | 0.93 | 5525 | 2.4058 |
2.1589 | 0.94 | 5550 | 2.4154 |
2.1885 | 0.94 | 5575 | 2.4058 |
2.2291 | 0.94 | 5600 | 2.4113 |
2.2688 | 0.95 | 5625 | 2.4097 |
2.3387 | 0.95 | 5650 | 2.4123 |
2.2701 | 0.96 | 5675 | 2.4108 |
2.2732 | 0.96 | 5700 | 2.4070 |
2.2823 | 0.97 | 5725 | 2.4057 |
2.2029 | 0.97 | 5750 | 2.4096 |
2.2392 | 0.97 | 5775 | 2.4099 |
2.1963 | 0.98 | 5800 | 2.4165 |
2.2922 | 0.98 | 5825 | 2.4105 |
2.1884 | 0.99 | 5850 | 2.4119 |
2.2883 | 0.99 | 5875 | 2.4087 |
2.3162 | 1.0 | 5900 | 2.4069 |
2.2246 | 1.0 | 5925 | 2.4028 |
2.2586 | 1.0 | 5950 | 2.4107 |
2.1367 | 1.01 | 5975 | 2.4095 |
2.2341 | 1.01 | 6000 | 2.4152 |
2.2638 | 1.02 | 6025 | 2.4048 |
2.1898 | 1.02 | 6050 | 2.4097 |
2.1071 | 1.02 | 6075 | 2.4133 |
2.2763 | 1.03 | 6100 | 2.4056 |
2.159 | 1.03 | 6125 | 2.4060 |
2.2005 | 1.04 | 6150 | 2.4111 |
2.3398 | 1.04 | 6175 | 2.4146 |
2.2017 | 1.05 | 6200 | 2.4085 |
2.202 | 1.05 | 6225 | 2.4093 |
2.1532 | 1.05 | 6250 | 2.4086 |
2.1735 | 1.06 | 6275 | 2.4106 |
2.1104 | 1.06 | 6300 | 2.4105 |
2.2282 | 1.07 | 6325 | 2.4117 |
2.2969 | 1.07 | 6350 | 2.4063 |
2.2284 | 1.08 | 6375 | 2.4044 |
2.2823 | 1.08 | 6400 | 2.4114 |
2.1878 | 1.08 | 6425 | 2.4115 |
2.3074 | 1.09 | 6450 | 2.4090 |
2.238 | 1.09 | 6475 | 2.4104 |
2.2031 | 1.1 | 6500 | 2.4075 |
2.1617 | 1.1 | 6525 | 2.4113 |
2.1508 | 1.1 | 6550 | 2.4047 |
2.1803 | 1.11 | 6575 | 2.4170 |
2.2613 | 1.11 | 6600 | 2.4116 |
2.1954 | 1.12 | 6625 | 2.4092 |
2.3341 | 1.12 | 6650 | 2.4116 |
2.2835 | 1.13 | 6675 | 2.4058 |
2.2413 | 1.13 | 6700 | 2.4150 |
2.32 | 1.13 | 6725 | 2.4130 |
2.2163 | 1.14 | 6750 | 2.4042 |
2.3013 | 1.14 | 6775 | 2.4119 |
2.2821 | 1.15 | 6800 | 2.4124 |
2.1525 | 1.15 | 6825 | 2.4123 |
2.2313 | 1.16 | 6850 | 2.4108 |
2.1835 | 1.16 | 6875 | 2.4084 |
2.2945 | 1.16 | 6900 | 2.4134 |
2.233 | 1.17 | 6925 | 2.4033 |
2.3066 | 1.17 | 6950 | 2.4069 |
2.3245 | 1.18 | 6975 | 2.4074 |
2.1988 | 1.18 | 7000 | 2.4095 |
2.1995 | 1.18 | 7025 | 2.4101 |
2.2988 | 1.19 | 7050 | 2.4085 |
2.1385 | 1.19 | 7075 | 2.4079 |
2.2207 | 1.2 | 7100 | 2.3976 |
2.1971 | 1.2 | 7125 | 2.4097 |
2.2652 | 1.21 | 7150 | 2.4052 |
2.1848 | 1.21 | 7175 | 2.4023 |
2.2584 | 1.21 | 7200 | 2.4040 |
2.2193 | 1.22 | 7225 | 2.4069 |
2.2586 | 1.22 | 7250 | 2.3954 |
2.2102 | 1.23 | 7275 | 2.4041 |
2.2741 | 1.23 | 7300 | 2.3994 |
2.2261 | 1.24 | 7325 | 2.3986 |
2.2745 | 1.24 | 7350 | 2.3970 |
2.2266 | 1.24 | 7375 | 2.4001 |
2.2462 | 1.25 | 7400 | 2.4028 |
2.2968 | 1.25 | 7425 | 2.3983 |
2.1915 | 1.26 | 7450 | 2.3978 |
2.2201 | 1.26 | 7475 | 2.3957 |
2.126 | 1.26 | 7500 | 2.3922 |
2.2625 | 1.27 | 7525 | 2.4001 |
2.24 | 1.27 | 7550 | 2.3976 |
2.2113 | 1.28 | 7575 | 2.4051 |
2.1994 | 1.28 | 7600 | 2.4024 |
2.2568 | 1.29 | 7625 | 2.3984 |
2.243 | 1.29 | 7650 | 2.4095 |
2.2187 | 1.29 | 7675 | 2.4072 |
2.1955 | 1.3 | 7700 | 2.4030 |
2.2341 | 1.3 | 7725 | 2.3987 |
2.3218 | 1.31 | 7750 | 2.3983 |
2.1958 | 1.31 | 7775 | 2.3980 |
2.222 | 1.32 | 7800 | 2.4046 |
2.2631 | 1.32 | 7825 | 2.3974 |
2.1505 | 1.32 | 7850 | 2.3952 |
2.1824 | 1.33 | 7875 | 2.3976 |
2.2468 | 1.33 | 7900 | 2.4025 |
2.1383 | 1.34 | 7925 | 2.3926 |
2.0483 | 1.34 | 7950 | 2.3984 |
2.32 | 1.34 | 7975 | 2.3971 |
2.3582 | 1.35 | 8000 | 2.3988 |
2.2773 | 1.35 | 8025 | 2.3919 |
2.2302 | 1.36 | 8050 | 2.4016 |
2.152 | 1.36 | 8075 | 2.3958 |
2.2021 | 1.37 | 8100 | 2.4047 |
2.2351 | 1.37 | 8125 | 2.4041 |
2.1452 | 1.37 | 8150 | 2.4009 |
2.2575 | 1.38 | 8175 | 2.4004 |
2.1978 | 1.38 | 8200 | 2.3994 |
2.2648 | 1.39 | 8225 | 2.3982 |
2.2322 | 1.39 | 8250 | 2.3990 |
2.2488 | 1.4 | 8275 | 2.3997 |
2.2343 | 1.4 | 8300 | 2.3982 |
2.2011 | 1.4 | 8325 | 2.4020 |
2.2347 | 1.41 | 8350 | 2.3990 |
2.2446 | 1.41 | 8375 | 2.4003 |
2.2258 | 1.42 | 8400 | 2.4069 |
2.1781 | 1.42 | 8425 | 2.4104 |
2.3193 | 1.43 | 8450 | 2.4069 |
2.2015 | 1.43 | 8475 | 2.3985 |
2.2139 | 1.43 | 8500 | 2.3998 |
2.2006 | 1.44 | 8525 | 2.3986 |
2.2181 | 1.44 | 8550 | 2.4072 |
2.3598 | 1.45 | 8575 | 2.4098 |
2.3421 | 1.45 | 8600 | 2.4073 |
2.2152 | 1.45 | 8625 | 2.4090 |
2.2308 | 1.46 | 8650 | 2.4059 |
2.1773 | 1.46 | 8675 | 2.4078 |
2.2713 | 1.47 | 8700 | 2.4028 |
2.2826 | 1.47 | 8725 | 2.4051 |
2.2942 | 1.48 | 8750 | 2.4051 |
2.1512 | 1.48 | 8775 | 2.3998 |
2.1678 | 1.48 | 8800 | 2.4036 |
2.1948 | 1.49 | 8825 | 2.4052 |
2.1395 | 1.49 | 8850 | 2.3990 |
2.1999 | 1.5 | 8875 | 2.4053 |
2.2187 | 1.5 | 8900 | 2.4014 |
2.2549 | 1.51 | 8925 | 2.4035 |
2.1782 | 1.51 | 8950 | 2.4066 |
2.2073 | 1.51 | 8975 | 2.4083 |
2.1925 | 1.52 | 9000 | 2.3987 |
2.2846 | 1.52 | 9025 | 2.4008 |
2.1969 | 1.53 | 9050 | 2.4071 |
2.2831 | 1.53 | 9075 | 2.4040 |
2.3457 | 1.53 | 9100 | 2.4057 |
2.2346 | 1.54 | 9125 | 2.4002 |
2.2253 | 1.54 | 9150 | 2.4078 |
2.3162 | 1.55 | 9175 | 2.3958 |
2.2181 | 1.55 | 9200 | 2.4020 |
2.1335 | 1.56 | 9225 | 2.4077 |
2.2222 | 1.56 | 9250 | 2.4029 |
2.118 | 1.56 | 9275 | 2.4011 |
2.1778 | 1.57 | 9300 | 2.4068 |
2.1706 | 1.57 | 9325 | 2.4020 |
2.2519 | 1.58 | 9350 | 2.3994 |
2.1389 | 1.58 | 9375 | 2.4033 |
2.3475 | 1.59 | 9400 | 2.4030 |
2.2375 | 1.59 | 9425 | 2.4060 |
2.1758 | 1.59 | 9450 | 2.4113 |
2.2083 | 1.6 | 9475 | 2.4064 |
2.2299 | 1.6 | 9500 | 2.4085 |
2.1834 | 1.61 | 9525 | 2.4042 |
2.1631 | 1.61 | 9550 | 2.4086 |
2.3827 | 1.61 | 9575 | 2.4068 |
2.181 | 1.62 | 9600 | 2.4083 |
2.2252 | 1.62 | 9625 | 2.4039 |
2.2509 | 1.63 | 9650 | 2.4104 |
2.2198 | 1.63 | 9675 | 2.4096 |
2.2605 | 1.64 | 9700 | 2.4149 |
2.2177 | 1.64 | 9725 | 2.4067 |
2.0864 | 1.64 | 9750 | 2.4106 |
2.1742 | 1.65 | 9775 | 2.4012 |
2.254 | 1.65 | 9800 | 2.4116 |
2.2758 | 1.66 | 9825 | 2.4114 |
2.1822 | 1.66 | 9850 | 2.4149 |
2.2293 | 1.67 | 9875 | 2.4034 |
2.2322 | 1.67 | 9900 | 2.4086 |
2.2173 | 1.67 | 9925 | 2.4115 |
2.1781 | 1.68 | 9950 | 2.3963 |
2.2739 | 1.68 | 9975 | 2.4091 |
2.1899 | 1.69 | 10000 | 2.4050 |
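To make the validation losses in the table easier to interpret, they can be converted to perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in this card. The three loss values are copied from the table (steps 25, 8025, and 10000); the helper name `perplexity` is illustrative.

```python
import math


def perplexity(mean_cross_entropy_nats):
    """Perplexity is the exponential of the mean per-token cross-entropy (in nats)."""
    return math.exp(mean_cross_entropy_nats)


# Validation losses copied from the table above.
first_eval_loss = 2.5938    # step 25
lowest_eval_loss = 2.3919   # step 8025, the smallest value in the table
final_eval_loss = 2.4050    # step 10000

for name, loss in [
    ("first", first_eval_loss),
    ("lowest", lowest_eval_loss),
    ("final", final_eval_loss),
]:
    print(f"{name}: loss={loss:.4f} perplexity={perplexity(loss):.2f}")
```

Under that assumption, validation perplexity falls from roughly 13.4 at step 25 to roughly 11.1 at step 10000, with most of the improvement in the first few hundred steps.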
Framework versions
- Transformers 4.36.2
- Pytorch 2.1.2+cu121
- Datasets 2.15.0
- Tokenizers 0.15.0
Model tree for thrunlab/Mistral_Sparse_refined_web_50p_cut_pre_mlp_2024-03-23
Base model
mistralai/Mistral-7B-v0.1