qualcomm/GoogLeNetQuantized

@@ -31,12 +31,13 @@ More details on model performance across various devices, can be found
   - Model checkpoint: Imagenet
   - Input resolution: 224x224
   - Number of parameters: 6.62M
-  - Model size: 16.0 MB
+  - Model size: 6.55 MB
 | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Precision | Primary Compute Unit | Target Model
 | ---|---|---|---|---|---|---|---|
-| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | TFLite | 1.026 ms | 0 - 2 MB | FP16 | NPU |  [GoogLeNetQuantized.tflite](https://huggingface.co/qualcomm/GoogLeNetQuantized/blob/main/GoogLeNetQuantized.tflite)
+| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | TFLite | 0.331 ms | 0 - 2 MB | INT8 | NPU |  [GoogLeNetQuantized.tflite](https://huggingface.co/qualcomm/GoogLeNetQuantized/blob/main/GoogLeNetQuantized.tflite)
+| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Model Library | 0.365 ms | 1 - 5 MB | INT8 | NPU |  [GoogLeNetQuantized.so](https://huggingface.co/qualcomm/GoogLeNetQuantized/blob/main/GoogLeNetQuantized.so)
 ## Installation
@@ -96,10 +97,17 @@ python -m qai_hub_models.models.googlenet_quantized.export
 ```
 Profile Job summary of GoogLeNetQuantized
 --------------------------------------------------
-Device: Samsung Galaxy S23 Ultra (13)
-Estimated Inference Time: 1.03 ms
-Estimated Peak Memory Range: 0.02-1.69 MB
-Compute Units: NPU (183) | Total (183)
+Device: Samsung Galaxy S24 (14)
+Estimated Inference Time: 0.25 ms
+Estimated Peak Memory Range: 0.02-30.86 MB
+Compute Units: NPU (87) | Total (87)
+Profile Job summary of GoogLeNetQuantized
+--------------------------------------------------
+Device: Samsung Galaxy S24 (14)
+Estimated Inference Time: 0.26 ms
+Estimated Peak Memory Range: 0.59-45.16 MB
+Compute Units: NPU (89) | Total (89)
 ```
@@ -218,7 +226,7 @@ Explore all available models on [Qualcomm® AI Hub](https://aihub.qualcomm.com/)
 ## License
 - The license for the original implementation of GoogLeNetQuantized can be found
   [here](https://github.com/pytorch/vision/blob/main/LICENSE).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf).
+- The license for the compiled assets for on-device deployment can be found [here]({deploy_license_url})
 ## References
 * [Going Deeper with Convolutions](https://arxiv.org/abs/1409.4842)
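
To put the numeric changes in this diff side by side, the following is a minimal sketch that computes the before/after ratios for the two values measured on the same device (Samsung Galaxy S23 Ultra, TFLite): the table's inference time and the reported model size. The dictionary keys and the `ratio` helper are illustrative names, not part of the model card or of `qai_hub_models`. The figures are copied from the removed and added lines. The profile-job summaries are left out because the old one was taken on a Galaxy S23 Ultra and the new ones on a Galaxy S24, so they are not directly comparable.

```python
# Values copied from the removed (-) and added (+) lines of the diff.
# Key names are hypothetical labels chosen for this sketch.
old = {"model_size_mb": 16.0, "tflite_ms": 1.026}  # FP16 row, removed
new = {"model_size_mb": 6.55, "tflite_ms": 0.331}  # INT8 row, added


def ratio(before: float, after: float) -> float:
    """Return how many times larger `before` is than `after`."""
    return before / after


speedup = ratio(old["tflite_ms"], new["tflite_ms"])
shrink = ratio(old["model_size_mb"], new["model_size_mb"])

print(f"TFLite inference time: {speedup:.2f}x lower")
print(f"Model size: {shrink:.2f}x smaller")
```

With the figures above this comes to roughly a 3.1x lower TFLite inference time and a 2.4x smaller model file.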