Upload folder using huggingface_hub
- .gitattributes +2 -0
- README.md +121 -0
- checkpoints/best_checkpoint.pth +3 -0
- checkpoints/pristine_prototype.pkl +3 -0
- configs/paper_cuda.toml +59 -0
- onnx/saga_awareness_v1.onnx +3 -0
- onnx/saga_awareness_v1.onnx.data +3 -0
- paper.pdf +3 -0
- pytorch/saga_awareness_v1.pth +3 -0
- pytorch/saga_awareness_v1.safetensors +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+onnx/saga_awareness_v1.onnx.data filter=lfs diff=lfs merge=lfs -text
+paper.pdf filter=lfs diff=lfs merge=lfs -text
README.md
ADDED
@@ -0,0 +1,121 @@
+---
+language: en
+tags:
+- object-detection
+- self-awareness
+- degradation-manifold
+- image-quality
+- perception
+- anima
+- robotflow
+license: apache-2.0
+library_name: pytorch
+pipeline_tag: image-classification
+datasets:
+- coco
+metrics:
+- auroc
+model-index:
+- name: project_saga
+  results:
+  - task:
+      type: image-classification
+      name: Degradation Detection
+    dataset:
+      type: coco
+      name: COCO val2017
+    metrics:
+    - type: auroc
+      value: 0.7991
+      name: Pristine vs Degraded AUROC
+---
+
+# ANIMA Saga: Self-Aware Object Detection via Degradation Manifolds
+
+**Paper**: [arXiv:2602.18394](https://arxiv.org/abs/2602.18394) (Becker et al., 2026)
+
+**Implementation by**: [RobotFlow Labs / AIFLOW Labs](https://github.com/RobotFlow-Labs)
+
+## Overview
+
+Saga adds **degradation-aware self-awareness** to any object detector. A lightweight embedding
+head trained via multi-layer contrastive learning detects when input quality degrades
+(blur, noise, rain, fog, compression), so safety-critical systems can flag unreliable
+perception rather than trusting silent failures.
+
+## Results
+
+| Metric | Value |
+|--------|-------|
+| **AUROC** (pristine vs degraded) | **0.7991** |
+| Detector backbone | yolov10m |
+| Training epochs | 7 |
+| Embedding dimension | 128 |
+
+### Paper Table 1 Reference (YOLOv10-m, COCO mixed degradation)
+
+| Severity | 1 | 2 | 3 | 4 | 5 |
+|----------|---|---|---|---|---|
+| **Paper** | 88.64 | 89.70 | 89.75 | 95.28 | 97.14 |
+| **Ours** | TBD | TBD | TBD | TBD | TBD |
+
+## Usage
+
+```python
+import torch
+from anima_saga.wrappers.detector_registry import PaperDetectorWrapper
+from anima_saga.core.prototype import PristinePrototype
+
+# Load model
+model = PaperDetectorWrapper("yolov10m", embedding_dim=128, freeze_backbone=True)
+model.load_state_dict(torch.load("pytorch/saga_awareness_v1.pth")["model_state_dict"])
+model.eval().cuda()
+
+# Load prototype
+prototype = PristinePrototype.load("checkpoints/pristine_prototype.pkl")
+
+# Inference
+image = torch.randn(1, 3, 640, 640).cuda() # Your image here
+with torch.no_grad():
+    embedding = model(image)
+    score = prototype.score_cosine(embedding)
+# score ~ 0: pristine, score > 0.5: degraded
+print(f"Degradation score: {score.item():.4f}")
+```
+
+## Files
+
+| File | Description |
+|------|-------------|
+| `pytorch/saga_awareness_v1.pth` | PyTorch checkpoint (resume training) |
+| `pytorch/saga_awareness_v1.safetensors` | SafeTensors (fast loading) |
+| `onnx/saga_awareness_v1.onnx` | ONNX (cross-platform) |
+| `tensorrt/saga_awareness_v1_fp16.trt` | TensorRT FP16 (Jetson/L4) |
+| `tensorrt/saga_awareness_v1_fp32.trt` | TensorRT FP32 |
+| `checkpoints/pristine_prototype.pkl` | Pristine prototype for scoring |
+| `configs/paper_cuda.toml` | Training config (reproducibility) |
+| `logs/training_history.json` | Loss curves + metrics |
+
+## Architecture
+
+```
+Input (640x640) → YOLOv10-m backbone → Multi-layer features
+→ 1x1 conv + attention pooling per layer
+→ Concatenate → MLP projection → L2 normalize
+→ Cosine distance from pristine prototype = degradation score
+```
+
+## Citation
+
+```bibtex
+@article{becker2026selfaware,
+  title={Self-Aware Object Detection via Degradation Manifolds},
+  author={Becker, Stefan and Weiss, Simon and H\"ubner, Wolfgang and Arens, Michael},
+  journal={arXiv preprint arXiv:2602.18394},
+  year={2026}
+}
+```
+
+## License
+
+Apache 2.0
|
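The README's architecture summary describes the degradation score as the cosine distance between an L2-normalized embedding and the pristine prototype. The sketch below illustrates that computation in plain Python, assuming the score is `1 - cosine similarity`. The function names are hypothetical, and the real `PristinePrototype.score_cosine` may scale or clamp the value differently.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_distance(embedding, prototype):
    """Return 1 - cosine similarity (assumed definition of the score)."""
    e = l2_normalize(embedding)
    p = l2_normalize(prototype)
    return 1.0 - sum(a * b for a, b in zip(e, p))


# Toy 2-D vectors standing in for the 128-D embeddings.
same_direction = cosine_distance([1.0, 0.0], [2.0, 0.0])
orthogonal = cosine_distance([1.0, 0.0], [0.0, 3.0])
print(same_direction, orthogonal)
```

Under this assumed definition, an embedding aligned with the prototype scores near 0 and the score grows as the embedding rotates away, which matches the README's "score ~ 0: pristine" comment.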
checkpoints/best_checkpoint.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:18a1e81a6f9b51bfae7c5285b60f2bdaeafcf9b5a0fa5d6b62367900faeb8a40
+size 78125243
|
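The binary files in this upload are committed as Git LFS pointers: three `key value` lines giving the spec version, the SHA-256 object ID, and the size in bytes. As a rough sketch, the fields can be read like this; `parse_lfs_pointer` is a hypothetical helper name, not part of any library.

```python
def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<hash-algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer text copied from checkpoints/best_checkpoint.pth in this commit.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:18a1e81a6f9b51bfae7c5285b60f2bdaeafcf9b5a0fa5d6b62367900faeb8a40
size 78125243
"""

info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], info["size"])
```

After downloading the real file, its SHA-256 digest and byte count can be compared against `info["digest"]` and `info["size"]`.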
checkpoints/pristine_prototype.pkl
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ecc1a1d34a716ff6ec481ae1b29d0705b03825d32b3912438ce0f8869318fa51
+size 788
configs/paper_cuda.toml
ADDED
@@ -0,0 +1,59 @@
+# Paper-faithful training config for CUDA (arXiv:2602.18394)
+# Server: 8x NVIDIA L4 (23GB each)
+
+[detector]
+backbone = "yolov10m" # Paper primary: YOLOv10-m (Table 1, best AUROC)
+input_size = 640 # Paper: "input size of 640"
+freeze_backbone = false # Paper: "fine-tuned jointly" (Section 4.2)
+
+[model]
+proj_dim = 128 # Per-layer projection dimension d
+embedding_dim = 128 # Final embedding dimension D
+max_degradation_ops = 4 # Max ops per composition N_deg
+
+[training]
+batch_size = 48 # ~21GB on L4 (23GB), 90% util — verified live on GPU 5
+epochs = 50 # Paper-aligned
+learning_rate = 1e-3 # Base LR
+lr_backbone_scale = 0.1 # Backbone LR = base * scale
+lr_min = 1e-6 # Cosine annealing min
+weight_decay = 1e-4
+optimizer = "adamw"
+scheduler = "cosine" # Cosine annealing
+warmup_fraction = 0.05 # 5% warmup steps
+seed = 42
+num_workers = 12 # More workers to keep 3-4 GPUs fed
+gradient_clip_max_norm = 1.0
+mixed_precision = true # bf16 on CUDA
+
+[contrastive]
+temperature = 0.1 # NT-Xent temperature τ_c
+hard_negatives = true # Resolution perturbation
+
+[prototype]
+momentum = 0.999 # EMA α (Eq 7)
+warmup_fraction = 0.5 # Start updating after half of training
+
+[data]
+train_dir = "/mnt/forge-data/datasets/grounding_data/coco/train2017"
+val_dir = "/mnt/forge-data/datasets/grounding_data/coco/val2017"
+split_seed = 42
+train_ratio = 0.9
+val_ratio = 0.05
+test_ratio = 0.05
+
+[checkpoint]
+output_dir = "/mnt/artifacts-datai/checkpoints/project_saga"
+keep_best_n = 2
+save_every_steps = 500 # Save checkpoint every N steps (resume-safe)
+
+[logging]
+log_dir = "/mnt/artifacts-datai/logs/project_saga"
+tensorboard_dir = "/mnt/artifacts-datai/tensorboard/project_saga"
+
+[early_stopping]
+patience = 10
+min_delta = 1e-4
+
+[evaluation]
+confidence_threshold = 0.001 # Paper: "confidence threshold to 0.001"
|
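The `[training]` block combines a 5% warmup with cosine annealing from `learning_rate` down to `lr_min`, and scales the backbone rate by `lr_backbone_scale`. The sketch below shows one common way such a schedule is computed per step. The linear warmup shape, the per-step granularity, and the name `lr_at_step` are assumptions, since the config does not specify them and the actual training code may differ.

```python
import math


def lr_at_step(step, total_steps, base_lr=1e-3, lr_min=1e-6, warmup_fraction=0.05):
    """Learning rate at a 0-indexed step: linear warmup (assumed), then cosine decay."""
    warmup_steps = int(total_steps * warmup_fraction)
    if warmup_steps > 0 and step < warmup_steps:
        # Ramp linearly so the last warmup step reaches base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the post-warmup phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, max(0.0, progress))
    return lr_min + 0.5 * (base_lr - lr_min) * (1.0 + math.cos(math.pi * progress))


TOTAL_STEPS = 1000           # hypothetical run length, for illustration only
LR_BACKBONE_SCALE = 0.1      # from the config: backbone LR = base * scale

head_lr = lr_at_step(500, TOTAL_STEPS)
backbone_lr = head_lr * LR_BACKBONE_SCALE
print(f"step 500: head lr={head_lr:.6g}, backbone lr={backbone_lr:.6g}")
```

With these defaults the rate peaks at `1e-3` when warmup ends and reaches `1e-6` at the final step. The backbone follows the same curve at one tenth of the value.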
onnx/saga_awareness_v1.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a64f50b96f5fccd7a9b5742b564fafc02b12cd76e0b894cc0bf62dda9ab96bca
+size 429484
onnx/saga_awareness_v1.onnx.data
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d81c8a933ebe0cbebdcc103319009ddff39cd147a116d4d68673c76beeaea102
+size 38109184
paper.pdf
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e4f823dc5c8f5373e3b5b7999e08b05292d45e802d484d5c6a0252ae176695c8
+size 17829954
pytorch/saga_awareness_v1.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c4d94f124fc39dfad75a5e54a5ed9519392c349bb7a346d215bf16969add1e24
+size 70639187
pytorch/saga_awareness_v1.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:993403c20fd2d04968a4ea25df28cbaae3889d77608212d5884db8c65d001740
+size 70398448