Add/update the quantized ONNX model files and README.md for Transformers.js v3

#3
by whitphx HF Staff - opened

Applied Quantizations

✅ Based on decoder_model.onnx with slimming

↳ ✅ fp16: decoder_model_fp16.onnx (added)
↳ ✅ int8: decoder_model_int8.onnx (added)
↳ ✅ uint8: decoder_model_uint8.onnx (added)
↳ ✅ q4: decoder_model_q4.onnx (added)
↳ ✅ q4f16: decoder_model_q4f16.onnx (added)
↳ ✅ bnb4: decoder_model_bnb4.onnx (added)

✅ Based on decoder_with_past_model.onnx with slimming

↳ ✅ fp16: decoder_with_past_model_fp16.onnx (added)
↳ ✅ int8: decoder_with_past_model_int8.onnx (added)
↳ ✅ uint8: decoder_with_past_model_uint8.onnx (added)
↳ ✅ q4: decoder_with_past_model_q4.onnx (added)
↳ ✅ q4f16: decoder_with_past_model_q4f16.onnx (added)
↳ ✅ bnb4: decoder_with_past_model_bnb4.onnx (added)
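
The file names above follow a `<base>_<dtype>.onnx` pattern. A minimal sketch of that naming rule follows. `DTYPE_SUFFIX` and `quantizedFileName` are hypothetical names used only for illustration. They are not taken from the migration script or from the Transformers.js API, and the table covers only the dtypes listed in this PR.

```javascript
// Hypothetical helper: the suffixes mirror the file names listed in this PR.
const DTYPE_SUFFIX = {
  fp16: "_fp16",
  int8: "_int8",
  uint8: "_uint8",
  q4: "_q4",
  q4f16: "_q4f16",
  bnb4: "_bnb4",
};

// Build the expected file name for a quantized variant of a base model.
// Throws for any dtype that is not in the table above.
function quantizedFileName(baseName, dtype) {
  if (!Object.hasOwn(DTYPE_SUFFIX, dtype)) {
    throw new Error(`Unknown dtype: ${dtype}`);
  }
  return `${baseName}${DTYPE_SUFFIX[dtype]}.onnx`;
}

console.log(quantizedFileName("decoder_model", "fp16"));
console.log(quantizedFileName("decoder_with_past_model", "q4f16"));
```

Applying the same rule to the renamed merged model (`model`) gives the `model_<dtype>.onnx` names that appear in the error logs below.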

❌ Based on decoder_model_merged.onnx with slimming

The base model decoder_model_merged.onnx has been renamed to model.onnx.

↳ ❌ fp16: `model_fp16.onnx` (added but JS-based E2E test failed)

2025-08-29 07:17:35.684735812 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25
            __classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
                                                                                           ^

Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_fp16.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
    at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25:92)
    at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:67:29)
    at process.processImmediate (node:internal/timers:485:21)

Node.js v22.16.0

↳ ❌ int8: `model_int8.onnx` (added but JS-based E2E test failed)

2025-08-29 07:17:46.341049068 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25
            __classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
                                                                                           ^

Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_int8.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
    at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25:92)
    at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:67:29)
    at process.processImmediate (node:internal/timers:485:21)

Node.js v22.16.0

↳ ❌ uint8: `model_uint8.onnx` (added but JS-based E2E test failed)

2025-08-29 07:17:57.546248750 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25
            __classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
                                                                                           ^

Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_uint8.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
    at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25:92)
    at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:67:29)
    at process.processImmediate (node:internal/timers:485:21)

Node.js v22.16.0

↳ ❌ q4: `model_q4.onnx` (added but JS-based E2E test failed)

2025-08-29 07:18:06.398696360 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25
            __classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
                                                                                           ^

Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_q4.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
    at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25:92)
    at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:67:29)
    at process.processImmediate (node:internal/timers:485:21)

Node.js v22.16.0

↳ ❌ q4f16: `model_q4f16.onnx` (added but JS-based E2E test failed)

2025-08-29 07:18:12.608451741 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25
            __classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
                                                                                           ^

Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_q4f16.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
    at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25:92)
    at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:67:29)
    at process.processImmediate (node:internal/timers:485:21)

Node.js v22.16.0

↳ ❌ bnb4: `model_bnb4.onnx` (added but JS-based E2E test failed)

2025-08-29 07:18:21.406956044 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25
            __classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
                                                                                           ^

Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_bnb4.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
    at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25:92)
    at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:67:29)
    at process.processImmediate (node:internal/timers:485:21)

Node.js v22.16.0
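
All six failures above share the same message shape and point at the same node (`/transformer/Squeeze`). As a rough sketch for triaging such logs, the snippet below pulls the file name, node, op, and error kind out of one `Error: Load model from ...` line. The regular expression is an assumption based only on the messages in this PR, not a documented ONNX Runtime log format, and `parseLoadError` is a hypothetical helper name.

```javascript
// Assumed message shape, inferred from the errors in this PR:
// "Load model from <path> failed:Node (<node>) Op (<op>) [<kind>] <detail>"
const LOAD_ERROR =
  /Load model from (\S+) failed:Node \(([^)]+)\) Op \(([^)]+)\) \[([^\]]+)\] (.+)/;

// Return the structured fields of a load-failure line, or null if the
// line does not match the assumed shape.
function parseLoadError(line) {
  const match = LOAD_ERROR.exec(line);
  if (!match) return null;
  const [, path, node, op, kind, detail] = match;
  return { file: path.split("/").pop(), node, op, kind, detail };
}

const sample =
  "Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_fp16.onnx " +
  "failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] " +
  "Dimension of input 0 must be 1 instead of 2";

console.log(parseLoadError(sample));
```

Grouping the parsed results by `node` would show whether every variant fails at the same point in the graph, which is the case for the six logs in this report.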