Add/update the quantized ONNX model files and README.md for Transformers.js v3
Applied Quantizations
✅ Based on decoder_model.onnx with slimming
↳ ✅ fp16: decoder_model_fp16.onnx (added)
↳ ✅ int8: decoder_model_int8.onnx (added)
↳ ✅ uint8: decoder_model_uint8.onnx (added)
↳ ✅ q4: decoder_model_q4.onnx (added)
↳ ✅ q4f16: decoder_model_q4f16.onnx (added)
↳ ✅ bnb4: decoder_model_bnb4.onnx (added)
✅ Based on decoder_with_past_model.onnx with slimming
↳ ✅ fp16: decoder_with_past_model_fp16.onnx (added)
↳ ✅ int8: decoder_with_past_model_int8.onnx (added)
↳ ✅ uint8: decoder_with_past_model_uint8.onnx (added)
↳ ✅ q4: decoder_with_past_model_q4.onnx (added)
↳ ✅ q4f16: decoder_with_past_model_q4f16.onnx (added)
↳ ✅ bnb4: decoder_with_past_model_bnb4.onnx (added)
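The file names listed above follow a simple pattern: the base file's stem, an underscore, the quantization label, then `.onnx`. A minimal sketch of that pattern is shown here, which could be used to check that a repository contains every expected variant. The helper names `quantizedFileName` and `expectedFiles` are hypothetical and are not part of Transformers.js or the migration tooling.

```javascript
// Quantization labels that appear in this PR description.
const QUANTIZATIONS = ["fp16", "int8", "uint8", "q4", "q4f16", "bnb4"];

// Hypothetical helper: builds "<stem>_<quantization>.onnx" from a base file name.
function quantizedFileName(baseFile, quantization) {
  if (!QUANTIZATIONS.includes(quantization)) {
    throw new Error(`Unknown quantization: ${quantization}`);
  }
  if (!baseFile.endsWith(".onnx")) {
    throw new Error(`Expected an .onnx file, got: ${baseFile}`);
  }
  const stem = baseFile.slice(0, -".onnx".length);
  return `${stem}_${quantization}.onnx`;
}

// Hypothetical helper: lists every quantized variant expected for one base model.
function expectedFiles(baseFile) {
  return QUANTIZATIONS.map((q) => quantizedFileName(baseFile, q));
}

console.log(expectedFiles("decoder_model.onnx"));
```

Comparing the output of `expectedFiles` against the contents of the `onnx/` folder is one way to spot a missing variant before running an end-to-end test.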
❌ Based on decoder_model_merged.onnx with slimming
The base model decoder_model_merged.onnx has been renamed to model.onnx.
↳ ❌ fp16: model_fp16.onnx (added but JS-based E2E test failed)
2025-08-29 07:17:35.684735812 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^
Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_fp16.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)
Node.js v22.16.0
↳ ❌ int8: model_int8.onnx (added but JS-based E2E test failed)
2025-08-29 07:17:46.341049068 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^
Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_int8.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)
Node.js v22.16.0
↳ ❌ uint8: model_uint8.onnx (added but JS-based E2E test failed)
2025-08-29 07:17:57.546248750 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^
Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_uint8.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)
Node.js v22.16.0
↳ ❌ q4: model_q4.onnx (added but JS-based E2E test failed)
2025-08-29 07:18:06.398696360 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^
Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_q4.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)
Node.js v22.16.0
↳ ❌ q4f16: model_q4f16.onnx (added but JS-based E2E test failed)
2025-08-29 07:18:12.608451741 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^
Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_q4f16.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)
Node.js v22.16.0
↳ ❌ bnb4: model_bnb4.onnx (added but JS-based E2E test failed)
2025-08-29 07:18:21.406956044 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^
Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_bnb4.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)
Node.js v22.16.0
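All six failures above report the same node and message, so it may help to reduce each `Error: Load model from ...` line to its key fields. The sketch below does that with a regular expression. The pattern is an assumption inferred only from the log lines in this description, not a documented onnxruntime format, and `parseLoadError` is a hypothetical helper.

```javascript
// Assumed shape, taken from the logs above:
// "Load model from <path> failed:Node (<node>) Op (<op>) [<kind>] <message>"
const LOAD_ERROR =
  /Load model from (\S+) failed:Node \(([^)]+)\) Op \(([^)]+)\) \[([^\]]+)\] (.+)/;

// Hypothetical helper: returns the key fields of a load error, or null if the line does not match.
function parseLoadError(line) {
  const match = LOAD_ERROR.exec(line);
  if (!match) return null;
  const [, path, node, op, kind, message] = match;
  // Keep only the file name so results from different temp directories are comparable.
  return { file: path.split("/").pop(), node, op, kind, message };
}

const sample =
  "Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_fp16.onnx " +
  "failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] " +
  "Dimension of input 0 must be 1 instead of 2";

console.log(parseLoadError(sample));
```

Grouping the parsed results by `node` and `message` would show that every quantized variant of model.onnx fails at `/transformer/Squeeze` with the same shape inference error.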