ProphetOfBostrom committed
Commit 7d0fc1f · 1 Parent(s): 4a3634a
Create MISLEAD.md
i've always wanted a blog. this file is some notes on what i 'learned' during the HQQ quantize process.
- MISLEAD.md +19 -0
MISLEAD.md
ADDED
@@ -0,0 +1,19 @@
It took me all day to figure this out. It turns out that while HQQ will go ahead and fill 180 GB of memory to do this, there's absolutely no reason for it! I did it from a slow\*\* 200 GB swap partition.
On the off chance someone at Mobius sees this: please don't ask transformers to load a 45B param model onto the CPU if you're not actually going to... call the model at all? It took ten minutes at SATA 2 speeds, and that was because it was upcast to FP32 (CPU mode, right?).

45 gigaweights \* 2 bytes per weight (bf16) \* 2 (fp32/bf16 size ratio) = 180 GB of system memory allocated.
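
The same arithmetic as a quick sanity check, in case you want to plug in your own model size before you commit a day to swap. This is only a back-of-the-envelope sketch: `footprint_gb` and `BYTES_PER_WEIGHT` are names made up for this note, it uses decimal gigabytes, and it assumes every weight sits in memory in a single dtype with no extra overhead.

```python
# Back-of-the-envelope RAM footprint for holding every weight in one dtype.
# Assumption: no optimizer state, no activations, no loader overhead.
BYTES_PER_WEIGHT = {"bf16": 2, "fp32": 4}


def footprint_gb(n_params: float, dtype: str) -> float:
    """Decimal gigabytes needed to hold n_params weights of the given dtype."""
    return n_params * BYTES_PER_WEIGHT[dtype] / 1e9


N_PARAMS = 45e9  # the "45 gigaweights" figure from the line above

bf16_gb = footprint_gb(N_PARAMS, "bf16")  # what the checkpoint is stored as
fp32_gb = footprint_gb(N_PARAMS, "fp32")  # what the CPU load apparently upcasts to

print(f"bf16: {bf16_gb:.0f} GB, fp32: {fp32_gb:.0f} GB")
```

The fp32 number is the one that has to fit in RAM plus swap.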

I wish I had that.

\*\*May have been zswap's fault. I'm pretty sure 200 MB/s and an idle CPU isn't the best you can hope for when you're doing sequential reads from a PCIe 4.0 x4 NVMe device? My GPU fell asleep between optimization passes. It even has a Gamer LED on it. I'll fix my sysctl next time.

+ Try `$ python -i untitled.py`

having saved that script from the Mobius HF repo, because you'll be spending a while at the interactive prompt figuring out
+ `>>> model.save_quantized("/absolute/path/noromaid")`

at the end. And trust me, quantizing something chunky and then watching Python shred it because the save directory is somehow a recursive lambda function and not a string is heartbreaking. I don't know if it was supposed to emit more than the model.pt and the config.json, but I'm taking what I can get.
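
One way to avoid that particular heartbreak is to check the save directory before the quantize pass starts, not after. A minimal sketch, assuming nothing about HQQ itself: `checked_save_dir` is a hypothetical helper written for this note, it only uses the standard library, and the demo points at a throwaway temp directory instead of a real model path.

```python
import os
import tempfile


def checked_save_dir(path) -> str:
    """Return an absolute directory path as a str, or raise before any work is done."""
    # Reject anything that is not a plain path (e.g. a function passed by mistake).
    if not isinstance(path, (str, os.PathLike)):
        raise TypeError(f"save dir must be a path string, got {type(path).__name__}")
    abs_path = os.path.abspath(os.fsdecode(path))
    os.makedirs(abs_path, exist_ok=True)  # fail now if it cannot be created
    return abs_path


# Demo against a throwaway directory rather than a real model path.
demo_dir = checked_save_dir(os.path.join(tempfile.mkdtemp(), "noromaid"))

try:
    checked_save_dir(lambda: "oops")  # a function, not a string
    rejected = False
except TypeError:
    rejected = True

print(demo_dir, rejected)
```

Call it at the top of the script and hand the returned string to `model.save_quantized(...)` at the end, so a bad path fails in the first second rather than after the whole run.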

###### If anyone's looking to donate, I could do with an Epyc Rome and perhaps another pair of H100s? I've embedded my XMR address in attention tensors with help from a really horny embedding, so when it starts generating gibberish right before the good stuff, just paste that into Feather and send me all your money. Thanks! :)

`i'm joking. that's a joke. I didn't do that.`