Heedless Backbones

Vim Family

| model | params (M) | pretrain | finetune | GFLOPs | IN-1k |
|---|---|---|---|---|---|
| Vim-Ti | 7.0 | IN-1k : Sup. : 300 | — : — : — | None | 76.1/— |
| Vim-Ti | 7.0 | IN-1k : Sup. : 300 | IN-1k : 30 : 224 | None | 78.3/— |
| Vim-S | 26.0 | IN-1k : Sup. : 300 | — : — : — | None | 80.3/— |
| Vim-S | 26.0 | IN-1k : Sup. : 300 | IN-1k : 30 : 224 | None | 81.4/— |
| Vim-B | 98.0 | IN-1k : Sup. : 300 | — : — : — | None | 81.9/— |
| Vim-B | 98.0 | IN-1k : Sup. : 300 | IN-1k : 30 : 224 | None | 83.2/— |

COCO (val)

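| model | pretrain | head | train | GFLOPs | mAPb | APb50 | APb75 | mAPbs | mAPbm | mAPbl |
|---|---|---|---|---|---|---|---|---|---|---|
| Vim-Ti | IN-1k : Sup. : 300 | Cascade Mask R-CNN | COCO (train) : 200 | None | 45.7 | 63.9 | 49.6 | 26.1 | 49.0 | 63.2 |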

COCO (val)

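| model | pretrain | head | train | GFLOPs | mAPm | APm50 | APm75 | mAPms | mAPmm | mAPml |
|---|---|---|---|---|---|---|---|---|---|---|
| Vim-Ti | IN-1k : Sup. : 300 | Cascade Mask R-CNN | COCO (train) : 200 | None | 39.2 | 60.9 | 41.7 | 18.2 | 41.8 | 60.2 |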

ADE20K (val)

| model | pretrain | head | train | GFLOPs | mIoU (ms/ss unspecified) |
|---|---|---|---|---|---|
| Vim-Ti | IN-1k : Sup. : 300 | UPerNet | ADE20K (train) : 128 : 512 | None | 41.0 |
| Vim-S | IN-1k : Sup. : 300 | UPerNet | ADE20K (train) : 128 : 512 | None | 44.9 |