Heedless Backbones

SSViT Family

| model | params (M) | pretrain | head | train | GFLOPs | mAP |
|---|---|---|---|---|---|---|
| SSViT-T | 15.0 | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 12 | 223.0 | 42.6 |
| SSViT-S | 27.0 | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 36 | 266.0 | 45.4 |
| SSViT-S | 27.0 | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 12 | 266.0 | 44.0 |
| SSViT-S | 27.0 | IN-1k : Sup. : 300 | Cascade Mask R-CNN | COCO (train) : 36 | 745.0 | 46.6 |
| SSViT-B | 57.0 | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 36 | 382.0 | 46.4 |
| SSViT-B | 57.0 | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 12 | 382.0 | 45.4 |
| SSViT-B | 57.0 | IN-1k : Sup. : 300 | Cascade Mask R-CNN | COCO (train) : 36 | 861.0 | 47.6 |
| SSViT-L | 100.0 | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 12 | 572.0 | 46.0 |
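
| model | params (M) | pretrain | finetune | GFLOPs | IN-1k | IN-V2 | IN-A | IN-R |
|---|---|---|---|---|---|---|---|---|
| SSViT-T | 15.0 | IN-1k : Sup. : 300 | — : — : — | 2.4 | 83.0/— | 72.3/— | 32.6/— | 45.6/— |
| SSViT-S | 27.0 | IN-1k : Sup. : 300 | — : — : — | 4.4 | 84.4/— | 74.1/— | 41.6/— | 51.0/— |
| SSViT-B | 57.0 | IN-1k : Sup. : 300 | — : — : — | 9.6 | 85.3/— | 75.7/— | 49.4/— | 55.6/— |
| SSViT-L | 100.0 | IN-1k : Sup. : 300 | — : — : — | 18.2 | 85.7/— | 76.1/— | 55.0/— | 59.2/— |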

COCO (val)

| model | pretrain | head | train | GFLOPs | mAP^b | AP^b_50 | AP^b_75 | mAP^b_s | mAP^b_m | mAP^b_l |
|---|---|---|---|---|---|---|---|---|---|---|
| SSViT-T | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 12 | 223.0 | 47.3 | 69.1 | 51.7 | | | |
| SSViT-T | IN-1k : Sup. : 300 | RetinaNet | COCO (train) : 12 | 205.0 | 45.6 | 66.5 | 49.3 | 28.6 | 50.1 | 60.5 |
| SSViT-S | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 36 | 266.0 | 51.2 | 72.0 | 56.0 | | | |
| SSViT-S | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 12 | 266.0 | 49.4 | 70.8 | 54.1 | | | |
| SSViT-S | IN-1k : Sup. : 300 | Cascade Mask R-CNN | COCO (train) : 36 | 745.0 | 53.8 | 72.4 | 58.1 | | | |
| SSViT-S | IN-1k : Sup. : 300 | RetinaNet | COCO (train) : 12 | 248.0 | 47.5 | 68.6 | 50.8 | 30.1 | 52.2 | 63.3 |
| SSViT-B | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 36 | 382.0 | 52.6 | 73.2 | 57.7 | | | |
| SSViT-B | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 12 | 382.0 | 51.0 | 72.5 | 55.8 | | | |
| SSViT-B | IN-1k : Sup. : 300 | Cascade Mask R-CNN | COCO (train) : 36 | 861.0 | 54.9 | 73.7 | 59.7 | | | |
| SSViT-B | IN-1k : Sup. : 300 | RetinaNet | COCO (train) : 12 | 363.0 | 49.0 | 70.2 | 52.9 | 32.4 | 53.4 | 64.8 |
| SSViT-L | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 12 | 572.0 | 51.6 | 72.9 | 56.6 | | | |
| SSViT-L | IN-1k : Sup. : 300 | RetinaNet | COCO (train) : 12 | 553.0 | 50.0 | 71.4 | 53.8 | 33.2 | 54.6 | 65.0 |

COCO (val)

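| model | pretrain | head | train | GFLOPs | mAP^m | AP^m_50 | AP^m_75 | mAP^m_s | mAP^m_m | mAP^m_l |
|---|---|---|---|---|---|---|---|---|---|---|
| SSViT-T | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 12 | 223.0 | 42.6 | 66.2 | 45.8 | | | |
| SSViT-S | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 36 | 266.0 | 45.4 | 69.7 | 49.0 | | | |
| SSViT-S | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 12 | 266.0 | 44.0 | 67.7 | 47.3 | | | |
| SSViT-S | IN-1k : Sup. : 300 | Cascade Mask R-CNN | COCO (train) : 36 | 745.0 | 46.6 | 70.1 | 50.4 | | | |
| SSViT-B | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 36 | 382.0 | 46.4 | 70.9 | 50.3 | | | |
| SSViT-B | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 12 | 382.0 | 45.4 | 69.7 | 48.9 | | | |
| SSViT-B | IN-1k : Sup. : 300 | Cascade Mask R-CNN | COCO (train) : 36 | 861.0 | 47.6 | 71.6 | 51.5 | | | |
| SSViT-L | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 12 | 572.0 | 46.0 | 70.1 | 49.8 | | | |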

ADE20K (val)

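The source header lists mIoU, pAcc, and mAcc in multi-scale (ms) and single-scale (ss) variants. Each row carries a single mIoU value, and the flattened rows do not show which variant it belongs to.

| model | pretrain | head | train | GFLOPs | mIoU |
|---|---|---|---|---|---|
| SSViT-T | IN-1k : Sup. : 300 | Panoptic FPN | ADE20K (train) : 640 : 512 | 35.0 | 46.8 |
| SSViT-S | IN-1k : Sup. : 300 | UPerNet | ADE20K (train) : 160 : 512 | 941.0 | 50.1 |
| SSViT-S | IN-1k : Sup. : 300 | Panoptic FPN | ADE20K (train) : 640 : 512 | 184.0 | 49.6 |
| SSViT-B | IN-1k : Sup. : 300 | UPerNet | ADE20K (train) : 160 : 512 | 1060.0 | 52.2 |
| SSViT-B | IN-1k : Sup. : 300 | Panoptic FPN | ADE20K (train) : 640 : 512 | 303.0 | 51.0 |
| SSViT-L | IN-1k : Sup. : 300 | UPerNet | ADE20K (train) : 160 : 512 | 1256.0 | 53.3 |
| SSViT-L | IN-1k : Sup. : 300 | Panoptic FPN | ADE20K (train) : 640 : 512 | 497.0 | 51.5 |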