Heedless Backbones

MA ViT Family

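| model | params (M) | pretrain | finetune | GFLOPs | IN-1k |
|---|---|---|---|---|---|
| MA ViT-T | 16.0 | IN-1k : Sup. : 300 | — : — : — | 2.5 | 82.9/— |
| MA ViT-S | 27.0 | IN-1k : Sup. : 300 | — : — : — | 4.6 | 84.7/— |
| MA ViT-B | 50.0 | IN-1k : Sup. : 300 | — : — : — | 9.9 | 85.7/— |
| MA ViT-L | 98.0 | IN-1k : Sup. : 300 | — : — : — | 16.1 | 86.0/— |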
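
The pretrain and finetune cells pack three colon-separated values into one string. A minimal sketch of splitting such a cell follows. It assumes the three fields are dataset, method, and epochs, and that a dash marks a missing field. The names `PretrainSpec` and `parse_spec` are hypothetical and not part of the site.

```python
from typing import NamedTuple, Optional

MISSING = "\u2014"  # the dash the table uses for an absent field


class PretrainSpec(NamedTuple):
    # Field meanings are an assumption based on the table's cell layout.
    dataset: Optional[str]
    method: Optional[str]
    epochs: Optional[int]


def parse_spec(cell: str) -> PretrainSpec:
    """Split a 'dataset : method : epochs' cell into typed fields."""
    parts = [p.strip() for p in cell.split(":")]
    if len(parts) != 3:
        raise ValueError(f"expected three ':'-separated fields, got {cell!r}")
    # Map the missing-value marker to None so callers can test for absence.
    dataset, method, epochs = (None if p == MISSING else p for p in parts)
    return PretrainSpec(dataset, method, int(epochs) if epochs is not None else None)


if __name__ == "__main__":
    print(parse_spec("IN-1k : Sup. : 300"))
    print(parse_spec(f"{MISSING} : {MISSING} : {MISSING}"))
```

The same split would apply to the `train` cells in the tables below, though those carry two or three fields depending on the task, so the length check would need adjusting.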

COCO (val): object detection

| model | pretrain | head | train | GFLOPs | mAP^b | AP^b_50 | AP^b_75 | mAP^b_s | mAP^b_m | mAP^b_l |
|---|---|---|---|---|---|---|---|---|---|---|
| MA ViT-T | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 12 | 219.0 | 47.6 | 69.5 | 52.5 | | | |
| MA ViT-T | IN-1k : Sup. : 300 | RetinaNet | COCO (train) : 12 | 201.0 | 45.6 | 66.7 | 48.9 | 28.9 | 49.7 | 61.1 |
| MA ViT-S | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 12 | 262.0 | 50.2 | 71.7 | 55.3 | | | |
| MA ViT-S | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 36 | 262.0 | 51.4 | 72.6 | 56.2 | | | |
| MA ViT-S | IN-1k : Sup. : 300 | Cascade Mask R-CNN | COCO (train) : 36 | 741.0 | 54.2 | 72.6 | 58.6 | | | |
| MA ViT-S | IN-1k : Sup. : 300 | RetinaNet | COCO (train) : 12 | 244.0 | 48.3 | 69.4 | 52.2 | 31.8 | 52.6 | 64.0 |
| MA ViT-B | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 12 | 372.0 | 51.7 | 73.3 | 57.0 | | | |
| MA ViT-B | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 36 | 372.0 | 53.2 | 74.1 | 58.5 | | | |
| MA ViT-B | IN-1k : Sup. : 300 | Cascade Mask R-CNN | COCO (train) : 36 | 851.0 | 55.5 | 74.0 | 60.4 | | | |
| MA ViT-B | IN-1k : Sup. : 300 | RetinaNet | COCO (train) : 12 | 353.0 | 49.9 | 71.1 | 53.8 | 33.7 | 54.5 | 65.5 |
| MA ViT-L | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 12 | 501.0 | 52.5 | 73.6 | 57.8 | | | |
| MA ViT-L | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 36 | 501.0 | 53.6 | 74.3 | 58.7 | | | |
| MA ViT-L | IN-1k : Sup. : 300 | Cascade Mask R-CNN | COCO (train) : 36 | 979.0 | 56.0 | 74.6 | 60.9 | | | |
| MA ViT-L | IN-1k : Sup. : 300 | RetinaNet | COCO (train) : 12 | 482.0 | 50.6 | 71.7 | 54.9 | 34.1 | 55.3 | 65.6 |

COCO (val): instance segmentation

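| model | pretrain | head | train | GFLOPs | mAP^m | AP^m_50 | AP^m_75 | mAP^m_s | mAP^m_m | mAP^m_l |
|---|---|---|---|---|---|---|---|---|---|---|
| MA ViT-T | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 12 | 219.0 | 42.9 | 66.5 | 46.4 | | | |
| MA ViT-S | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 12 | 262.0 | 44.7 | 68.7 | 47.9 | | | |
| MA ViT-S | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 36 | 262.0 | 45.5 | 69.8 | 49.2 | | | |
| MA ViT-S | IN-1k : Sup. : 300 | Cascade Mask R-CNN | COCO (train) : 36 | 741.0 | 47.0 | 70.5 | 51.1 | | | |
| MA ViT-B | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 12 | 372.0 | 46.1 | 70.6 | 50.1 | | | |
| MA ViT-B | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 36 | 372.0 | 47.0 | 71.5 | 51.1 | | | |
| MA ViT-B | IN-1k : Sup. : 300 | Cascade Mask R-CNN | COCO (train) : 36 | 851.0 | 48.0 | 71.7 | 52.5 | | | |
| MA ViT-L | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 12 | 501.0 | 46.5 | 71.0 | 50.6 | | | |
| MA ViT-L | IN-1k : Sup. : 300 | Mask R-CNN | COCO (train) : 36 | 501.0 | 47.2 | 71.5 | 51.4 | | | |
| MA ViT-L | IN-1k : Sup. : 300 | Cascade Mask R-CNN | COCO (train) : 36 | 979.0 | 48.4 | 72.4 | 52.9 | | | |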

ADE20K (val): semantic segmentation

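Source columns: mIoU_ms, pAcc_ms, mAcc_ms, mIoU_ss, pAcc_ss, mAcc_ss. Each row reports a single value, and the column it belongs to is not recoverable here, so it is listed under mIoU.

| model | pretrain | head | train | GFLOPs | mIoU |
|---|---|---|---|---|---|
| MA ViT-T | IN-1k : Sup. : 300 | UPerNet | ADE20K (train) : 160 : 512 | 893.0 | 48.4 |
| MA ViT-T | IN-1k : Sup. : 300 | Panoptic FPN | ADE20K (train) : 80 : 512 | 136.0 | 47.6 |
| MA ViT-S | IN-1k : Sup. : 300 | UPerNet | ADE20K (train) : 160 : 512 | 937.0 | 51.0 |
| MA ViT-S | IN-1k : Sup. : 300 | Panoptic FPN | ADE20K (train) : 80 : 512 | 180.0 | 50.7 |
| MA ViT-B | IN-1k : Sup. : 300 | UPerNet | ADE20K (train) : 160 : 512 | 1050.0 | 52.8 |
| MA ViT-B | IN-1k : Sup. : 300 | Panoptic FPN | ADE20K (train) : 80 : 512 | 292.0 | 51.5 |
| MA ViT-L | IN-1k : Sup. : 300 | UPerNet | ADE20K (train) : 160 : 512 | 1182.0 | 53.6 |
| MA ViT-L | IN-1k : Sup. : 300 | Panoptic FPN | ADE20K (train) : 80 : 512 | 424.0 | 52.8 |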