Update model config and README
README.md
CHANGED
@@ -4,12 +4,10 @@ tags:
 - timm
 library_name: timm
 license: apache-2.0
-datasets:
-- imagenet-1k
 ---
 # Model card for vit_small_patch16_224.dino
 
-A Vision Transformer (ViT) image
+A Vision Transformer (ViT) image feature model. Trained with Self-Supervised DINO method.
 
 
 ## Model Details
@@ -22,7 +20,7 @@ A Vision Transformer (ViT) image classification model. Trained with Self-Supervi
 - **Papers:**
 - Emerging Properties in Self-Supervised Vision Transformers: https://arxiv.org/abs/2104.14294
 - An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale: https://arxiv.org/abs/2010.11929v2
-- **Dataset:** ImageNet-1k
+- **Pretrain Dataset:** ImageNet-1k
 - **Original:** https://github.com/facebookresearch/dino
 
 ## Model Usage