Repository: xmu-xiaoma666/External-Attention-pytorch
Branch: master
Commit: c413dae1f053
Files: 106
Total size: 1.0 MB

Directory structure:
gitextract_gcvwg1wd/

├── LICENSE
├── README.md
├── README_EN.md
├── README_pip.md
├── main.py
├── model/
│   ├── .vscode/
│   │   └── settings.json
│   ├── __init__.py
│   ├── analysis/
│   │   ├── Attention.md
│   │   ├── 注意力机制.md
│   │   └── 重参数机制.md
│   ├── attention/
│   │   ├── A2Atttention.py
│   │   ├── ACmixAttention.py
│   │   ├── AFT.py
│   │   ├── Axial_attention.py
│   │   ├── BAM.py
│   │   ├── CBAM.py
│   │   ├── CoAtNet.py
│   │   ├── CoTAttention.py
│   │   ├── CoordAttention.py
│   │   ├── CrissCrossAttention.py
│   │   ├── Crossformer.py
│   │   ├── DANet.py
│   │   ├── DAT.py
│   │   ├── ECAAttention.py
│   │   ├── EMSA.py
│   │   ├── ExternalAttention.py
│   │   ├── HaloAttention.py
│   │   ├── MOATransformer.py
│   │   ├── MUSEAttention.py
│   │   ├── MobileViTAttention.py
│   │   ├── MobileViTv2Attention.py
│   │   ├── OutlookAttention.py
│   │   ├── PSA.py
│   │   ├── ParNetAttention.py
│   │   ├── PolarizedSelfAttention.py
│   │   ├── ResidualAttention.py
│   │   ├── S2Attention.py
│   │   ├── SEAttention.py
│   │   ├── SGE.py
│   │   ├── SKAttention.py
│   │   ├── SelfAttention.py
│   │   ├── ShuffleAttention.py
│   │   ├── SimAM.py
│   │   ├── SimplifiedSelfAttention.py
│   │   ├── TripletAttention.py
│   │   ├── UFOAttention.py
│   │   ├── ViP.py
│   │   └── gfnet.py
│   ├── backbone/
│   │   ├── CMT.py
│   │   ├── CPVT.py
│   │   ├── CaiT.py
│   │   ├── CeiT.py
│   │   ├── CoaT.py
│   │   ├── ConTNet.py
│   │   ├── ConViT.py
│   │   ├── Container.py
│   │   ├── ConvMixer.py
│   │   ├── CrossViT.py
│   │   ├── DViT.py
│   │   ├── DeiT.py
│   │   ├── EfficientFormer.py
│   │   ├── HATNet.py
│   │   ├── LeViT.py
│   │   ├── MobileNetV3.py
│   │   ├── MobileViT.py
│   │   ├── PIT.py
│   │   ├── PVT.py
│   │   ├── PatchConvnet.py
│   │   ├── ShuffleTransformer.py
│   │   ├── TnT.py
│   │   ├── VOLO.py
│   │   ├── convnextv2.py
│   │   ├── resnet.py
│   │   ├── resnext.py
│   │   ├── swin_transformer.py
│   │   ├── swin_transformer_v2.py
│   │   └── swin_transformer_v2_cr.py
│   ├── conv/
│   │   ├── CondConv.py
│   │   ├── DepthwiseSeparableConvolution.py
│   │   ├── DynamicConv.py
│   │   ├── HorNet.py
│   │   ├── Involution.py
│   │   └── MBConv.py
│   ├── fighingcv.egg-info/
│   │   ├── PKG-INFO
│   │   ├── SOURCES.txt
│   │   ├── dependency_links.txt
│   │   ├── entry_points.txt
│   │   ├── requires.txt
│   │   └── top_level.txt
│   ├── huggingface_hub.egg-info/
│   │   ├── PKG-INFO
│   │   ├── SOURCES.txt
│   │   ├── dependency_links.txt
│   │   ├── entry_points.txt
│   │   ├── requires.txt
│   │   └── top_level.txt
│   ├── mlp/
│   │   ├── g_mlp.py
│   │   ├── mlp_mixer.py
│   │   ├── repmlp.py
│   │   ├── resmlp.py
│   │   ├── sMLP_block.py
│   │   └── vip-mlp.py
│   └── rep/
│       ├── acnet.py
│       ├── ddb.py
│       ├── mobileone.py
│       └── repvgg.py
└── setup.py

================================================
FILE CONTENTS
================================================

================================================
FILE: LICENSE
================================================
MIT License

Copyright (c) 2021 xmu-xiaoma666

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


================================================
FILE: README.md
================================================

<img src="./FightingCVimg/LOGO.gif" height="200" width="400"/>

Simplified Chinese | [English](./README_EN.md)

# FightingCV Codebase, including [***Attention***](#attention-series), [***Backbone***](#backbone-series), [***MLP***](#mlp-series), [***Re-parameter***](#re-parameter-series), [**Convolution**](#convolution-series)

![](https://img.shields.io/badge/fightingcv-v0.0.1-brightgreen)
![](https://img.shields.io/badge/python->=v3.0-blue)
![](https://img.shields.io/badge/pytorch->=v1.4-red)

<!--
-------
*If this project is helpful to you, welcome to give a ***star***.* 

*Don't forget to ***follow*** me to learn about project updates.*

-->


<!--


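Hello everyone, I'm Xiaoma 🚀🚀🚀

***For beginners (Like Me):***
When reading papers recently, I noticed a problem: sometimes the core idea of a paper is very simple, and the core code may be only a dozen lines. But when I open the source code released by the authors, the proposed module is embedded in a classification, detection, or segmentation framework, which makes the code rather redundant. Since I am not familiar with those task-specific frameworks, **it is hard to find the core code**, which makes the paper and the network design harder to understand.

***For intermediate users (Like You):***
If basic units such as Conv, FC, and RNN are small Lego bricks, and structures such as Transformer and ResNet are finished Lego castles, then the modules in this project are Lego components with complete semantic information. **They save researchers from reinventing the wheel**, so you only need to think about how to use these "Lego components" to build more colorful works.

***For experts (May Be Like You):***
My ability is limited, so **please go easy on me if you don't like it**!!!

***For All:***
This project aims to be a codebase that **even deep learning beginners can understand** and that also **serves the research and industrial communities**.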

-->

<!--

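As a supplement to the [**FightingCV WeChat official account**](https://mp.weixin.qq.com/s/m9RiivbbDPdjABsTd6q8FA) and **[FightingCV-Paper-Reading](https://github.com/xmu-xiaoma666/FightingCV-Paper-Reading)**, this project aims, from the code perspective, to 🚀**make every paper easy to read**🚀.


(Researchers are also very welcome to contribute the core code of their own work to this project to help the research community grow; code authors will be credited in the README.)




## Technical Exchange <img title="" src="https://user-images.githubusercontent.com/48054808/157800467-2a9946ad-30d1-49a9-b9db-ba33413d9c90.png" alt="" width="20">

Follow the WeChat official account: **FightingCV**



| FightingCV official account | Assistant's WeChat (include the note **company/school + research area + ID**) |
:-------------------------:|:-------------------------:
<img src='./FightingCVimg/FightingCV.jpg' width='200px'>  |  <img src='./FightingCVimg/xiaozhushou.jpg' width='200px'> 

- The official account shares **papers, algorithms, and code** **every day**.

- **The discussion group shares the latest papers and analyses every day**; everyone is welcome to **join and learn together**.

- Following the [**Zhihu**](https://www.zhihu.com/people/jason-14-58-38/posts) account and the [**FightingCV official account**](https://mp.weixin.qq.com/s/m9RiivbbDPdjABsTd6q8FA) is strongly recommended for quick access to the latest high-quality resources.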


-------


-->

## 🌟 Star History


[![Star History Chart](https://api.star-history.com/svg?repos=xmu-xiaoma666/External-Attention-pytorch&type=Date)](https://star-history.com/#xmu-xiaoma666/External-Attention-pytorch&Date)

## Usage

### Installation

 Install directly via pip

  ```shell
  pip install fightingcv-attention
  ```


Or clone this repository

  ```shell
  git clone https://github.com/xmu-xiaoma666/External-Attention-pytorch.git

  cd External-Attention-pytorch
  ```

### Demo

#### Using the pip package
```python
import torch
from torch import nn
from torch.nn import functional as F

# Import from the pip package

from fightingcv_attention.attention.MobileViTv2Attention import *

if __name__ == '__main__':
    input=torch.randn(50,49,512)
    sa = MobileViTv2Attention(d_model=512)
    output=sa(input)
    print(output.shape)
```

 - For the modules bundled in the pip package, see the [fightingcv-attention documentation](./README_pip.md)

#### Using the git clone
```python
import torch
from torch import nn
from torch.nn import functional as F

# The only difference from the pip version: replace `fightingcv_attention` with `model`

from model.attention.MobileViTv2Attention import *

if __name__ == '__main__':
    input=torch.randn(50,49,512)
    sa = MobileViTv2Attention(d_model=512)
    output=sa(input)
    print(output.shape)
```
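
The two demos above differ only in the top-level package name (`fightingcv_attention` for the pip install, `model` for the git clone). The following is a minimal sketch, not part of this repository, of a helper that picks whichever name is importable. `resolve_package` and `load_class` are hypothetical names introduced here for illustration; only the standard-library `importlib` is used.

```python
import importlib
import importlib.util

# Hypothetical helper (not part of the repository): the two candidate
# package names come from the pip demo and the git demo above.
DEFAULT_CANDIDATES = ("fightingcv_attention", "model")


def resolve_package(candidates=DEFAULT_CANDIDATES):
    """Return the first importable top-level package name, or None."""
    for name in candidates:
        if importlib.util.find_spec(name) is not None:
            return name
    return None


def load_class(module_path, class_name, candidates=DEFAULT_CANDIDATES):
    """Import `<package>.<module_path>` and return the attribute `class_name`."""
    package = resolve_package(candidates)
    if package is None:
        raise ImportError(f"none of {candidates} is importable")
    module = importlib.import_module(f"{package}.{module_path}")
    return getattr(module, class_name)
```

With either install in place, `load_class("attention.MobileViTv2Attention", "MobileViTv2Attention")` should return the same class the demos import. Note that `model` is a generic name, so an unrelated `model` directory on the import path could be picked up by mistake; run this from the repository root when using the git clone.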

-------



# Contents

- [Attention Series](#attention-series)
    - [1. External Attention Usage](#1-external-attention-usage)

    - [2. Self Attention Usage](#2-self-attention-usage)

    - [3. Simplified Self Attention Usage](#3-simplified-self-attention-usage)

    - [4. Squeeze-and-Excitation Attention Usage](#4-squeeze-and-excitation-attention-usage)

    - [5. SK Attention Usage](#5-sk-attention-usage)

    - [6. CBAM Attention Usage](#6-cbam-attention-usage)

    - [7. BAM Attention Usage](#7-bam-attention-usage)
    
    - [8. ECA Attention Usage](#8-eca-attention-usage)

    - [9. DANet Attention Usage](#9-danet-attention-usage)

    - [10. Pyramid Split Attention (PSA) Usage](#10-Pyramid-Split-Attention-Usage)

    - [11. Efficient Multi-Head Self-Attention(EMSA) Usage](#11-Efficient-Multi-Head-Self-Attention-Usage)

    - [12. Shuffle Attention Usage](#12-Shuffle-Attention-Usage)
    
    - [13. MUSE Attention Usage](#13-MUSE-Attention-Usage)
  
    - [14. SGE Attention Usage](#14-SGE-Attention-Usage)

    - [15. A2 Attention Usage](#15-A2-Attention-Usage)

    - [16. AFT Attention Usage](#16-AFT-Attention-Usage)

    - [17. Outlook Attention Usage](#17-Outlook-Attention-Usage)

    - [18. ViP Attention Usage](#18-ViP-Attention-Usage)

    - [19. CoAtNet Attention Usage](#19-CoAtNet-Attention-Usage)

    - [20. HaloNet Attention Usage](#20-HaloNet-Attention-Usage)

    - [21. Polarized Self-Attention Usage](#21-Polarized-Self-Attention-Usage)

    - [22. CoTAttention Usage](#22-CoTAttention-Usage)

    - [23. Residual Attention Usage](#23-Residual-Attention-Usage)
  
    - [24. S2 Attention Usage](#24-S2-Attention-Usage)

    - [25. GFNet Attention Usage](#25-GFNet-Attention-Usage)

    - [26. Triplet Attention Usage](#26-TripletAttention-Usage)

    - [27. Coordinate Attention Usage](#27-Coordinate-Attention-Usage)

    - [28. MobileViT Attention Usage](#28-MobileViT-Attention-Usage)

    - [29. ParNet Attention Usage](#29-ParNet-Attention-Usage)

    - [30. UFO Attention Usage](#30-UFO-Attention-Usage)

    - [31. ACmix Attention Usage](#31-Acmix-Attention-Usage)
  
    - [32. MobileViTv2 Attention Usage](#32-MobileViTv2-Attention-Usage)

    - [33. DAT Attention Usage](#33-DAT-Attention-Usage)

    - [34. CrossFormer Attention Usage](#34-CrossFormer-Attention-Usage)

    - [35. MOATransformer Attention Usage](#35-MOATransformer-Attention-Usage)

    - [36. CrissCrossAttention Attention Usage](#36-CrissCrossAttention-Attention-Usage)

    - [37. Axial_attention Attention Usage](#37-Axial_attention-Attention-Usage)

- [Backbone Series](#Backbone-series)

    - [1. ResNet Usage](#1-ResNet-Usage)

    - [2. ResNeXt Usage](#2-ResNeXt-Usage)

    - [3. MobileViT Usage](#3-MobileViT-Usage)

    - [4. ConvMixer Usage](#4-ConvMixer-Usage)

    - [5. ShuffleTransformer Usage](#5-ShuffleTransformer-Usage)

    - [6. ConTNet Usage](#6-ConTNet-Usage)

    - [7. HATNet Usage](#7-HATNet-Usage)

    - [8. CoaT Usage](#8-CoaT-Usage)

    - [9. PVT Usage](#9-PVT-Usage)

    - [10. CPVT Usage](#10-CPVT-Usage)

    - [11. PIT Usage](#11-PIT-Usage)

    - [12. CrossViT Usage](#12-CrossViT-Usage)

    - [13. TnT Usage](#13-TnT-Usage)

    - [14. DViT Usage](#14-DViT-Usage)

    - [15. CeiT Usage](#15-CeiT-Usage)

    - [16. ConViT Usage](#16-ConViT-Usage)

    - [17. CaiT Usage](#17-CaiT-Usage)

    - [18. PatchConvnet Usage](#18-PatchConvnet-Usage)

    - [19. DeiT Usage](#19-DeiT-Usage)

    - [20. LeViT Usage](#20-LeViT-Usage)

    - [21. VOLO Usage](#21-VOLO-Usage)
    
    - [22. Container Usage](#22-Container-Usage)

    - [23. CMT Usage](#23-CMT-Usage)

    - [24. EfficientFormer Usage](#24-EfficientFormer-Usage)

    - [25. ConvNeXtV2 Usage](#25-ConvNeXtV2-Usage)



- [MLP Series](#mlp-series)

    - [1. RepMLP Usage](#1-RepMLP-Usage)

    - [2. MLP-Mixer Usage](#2-MLP-Mixer-Usage)

    - [3. ResMLP Usage](#3-ResMLP-Usage)

    - [4. gMLP Usage](#4-gMLP-Usage)

    - [5. sMLP Usage](#5-sMLP-Usage)

    - [6. vip-mlp Usage](#6-vip-mlp-Usage)

- [Re-Parameter(ReP) Series](#Re-Parameter-series)

    - [1. RepVGG Usage](#1-RepVGG-Usage)

    - [2. ACNet Usage](#2-ACNet-Usage)

    - [3. Diverse Branch Block(DDB) Usage](#3-Diverse-Branch-Block-Usage)

- [Convolution Series](#Convolution-series)

    - [1. Depthwise Separable Convolution Usage](#1-Depthwise-Separable-Convolution-Usage)

    - [2. MBConv Usage](#2-MBConv-Usage)

    - [3. Involution Usage](#3-Involution-Usage)

    - [4. DynamicConv Usage](#4-DynamicConv-Usage)

    - [5. CondConv Usage](#5-CondConv-Usage)

***



# Attention Series

- Pytorch implementation of ["Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks---arXiv 2021.05.05"](https://arxiv.org/abs/2105.02358)

- Pytorch implementation of ["Attention Is All You Need---NIPS2017"](https://arxiv.org/pdf/1706.03762.pdf)

- Pytorch implementation of ["Squeeze-and-Excitation Networks---CVPR2018"](https://arxiv.org/abs/1709.01507)

- Pytorch implementation of ["Selective Kernel Networks---CVPR2019"](https://arxiv.org/pdf/1903.06586.pdf)

- Pytorch implementation of ["CBAM: Convolutional Block Attention Module---ECCV2018"](https://openaccess.thecvf.com/content_ECCV_2018/papers/Sanghyun_Woo_Convolutional_Block_Attention_ECCV_2018_paper.pdf)

- Pytorch implementation of ["BAM: Bottleneck Attention Module---BMVC2018"](https://arxiv.org/pdf/1807.06514.pdf)

- Pytorch implementation of ["ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks---CVPR2020"](https://arxiv.org/pdf/1910.03151.pdf)

- Pytorch implementation of ["Dual Attention Network for Scene Segmentation---CVPR2019"](https://arxiv.org/pdf/1809.02983.pdf)

- Pytorch implementation of ["EPSANet: An Efficient Pyramid Split Attention Block on Convolutional Neural Network---arXiv 2021.05.30"](https://arxiv.org/pdf/2105.14447.pdf)

- Pytorch implementation of ["ResT: An Efficient Transformer for Visual Recognition---arXiv 2021.05.28"](https://arxiv.org/abs/2105.13677)

- Pytorch implementation of ["SA-NET: SHUFFLE ATTENTION FOR DEEP CONVOLUTIONAL NEURAL NETWORKS---ICASSP 2021"](https://arxiv.org/pdf/2102.00240.pdf)

- Pytorch implementation of ["MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning---arXiv 2019.11.17"](https://arxiv.org/abs/1911.09483)

- Pytorch implementation of ["Spatial Group-wise Enhance: Improving Semantic Feature Learning in Convolutional Networks---arXiv 2019.05.23"](https://arxiv.org/pdf/1905.09646.pdf)

- Pytorch implementation of ["A2-Nets: Double Attention Networks---NIPS2018"](https://arxiv.org/pdf/1810.11579.pdf)


- Pytorch implementation of ["An Attention Free Transformer---ICLR2021 (Apple New Work)"](https://arxiv.org/pdf/2105.14103v1.pdf)


- Pytorch implementation of [VOLO: Vision Outlooker for Visual Recognition---arXiv 2021.06.24](https://arxiv.org/abs/2106.13112) 
  [Paper analysis](https://zhuanlan.zhihu.com/p/385561050)


- Pytorch implementation of [Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition---arXiv 2021.06.23](https://arxiv.org/abs/2106.12368) 
  [Paper analysis](https://mp.weixin.qq.com/s/5gonUQgBho_m2O54jyXF_Q)


- Pytorch implementation of [CoAtNet: Marrying Convolution and Attention for All Data Sizes---arXiv 2021.06.09](https://arxiv.org/abs/2106.04803) 
  [Paper analysis](https://zhuanlan.zhihu.com/p/385578588)


- Pytorch implementation of [Scaling Local Self-Attention for Parameter Efficient Visual Backbones---CVPR2021 Oral](https://arxiv.org/pdf/2103.12731.pdf)  [Paper analysis](https://zhuanlan.zhihu.com/p/388598744)



- Pytorch implementation of [Polarized Self-Attention: Towards High-quality Pixel-wise Regression---arXiv 2021.07.02](https://arxiv.org/abs/2107.00782)  [Paper analysis](https://zhuanlan.zhihu.com/p/389770482) 


- Pytorch implementation of [Contextual Transformer Networks for Visual Recognition---arXiv 2021.07.26](https://arxiv.org/abs/2107.12292)  [Paper analysis](https://zhuanlan.zhihu.com/p/394795481) 


- Pytorch implementation of [Residual Attention: A Simple but Effective Method for Multi-Label Recognition---ICCV2021](https://arxiv.org/abs/2108.02456) 


- Pytorch implementation of [S²-MLPv2: Improved Spatial-Shift MLP Architecture for Vision---arXiv 2021.08.02](https://arxiv.org/abs/2108.01072) [Paper analysis](https://zhuanlan.zhihu.com/p/397003638) 

- Pytorch implementation of [Global Filter Networks for Image Classification---arXiv 2021.07.01](https://arxiv.org/abs/2107.00645) 

- Pytorch implementation of [Rotate to Attend: Convolutional Triplet Attention Module---WACV 2021](https://arxiv.org/abs/2010.03045) 

- Pytorch implementation of [Coordinate Attention for Efficient Mobile Network Design ---CVPR 2021](https://arxiv.org/abs/2103.02907)

- Pytorch implementation of [MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer---ArXiv 2021.10.05](https://arxiv.org/abs/2110.02178)

- Pytorch implementation of [Non-deep Networks---ArXiv 2021.10.20](https://arxiv.org/abs/2110.07641)

- Pytorch implementation of [UFO-ViT: High Performance Linear Vision Transformer without Softmax---ArXiv 2021.09.29](https://arxiv.org/abs/2109.14382)

- Pytorch implementation of [Separable Self-attention for Mobile Vision Transformers---ArXiv 2022.06.06](https://arxiv.org/abs/2206.02680)

- Pytorch implementation of [On the Integration of Self-Attention and Convolution---ArXiv 2022.03.14](https://arxiv.org/pdf/2111.14556.pdf)

- Pytorch implementation of [CROSSFORMER: A VERSATILE VISION TRANSFORMER HINGING ON CROSS-SCALE ATTENTION---ICLR 2022](https://arxiv.org/pdf/2108.00154.pdf)

- Pytorch implementation of [Aggregating Global Features into Local Vision Transformer](https://arxiv.org/abs/2201.12903)

- Pytorch implementation of [CCNet: Criss-Cross Attention for Semantic Segmentation](https://arxiv.org/abs/1811.11721)

- Pytorch implementation of [Axial Attention in Multidimensional Transformers](https://arxiv.org/abs/1912.12180)
***


### 1. External Attention Usage
#### 1.1. Paper
["Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks"](https://arxiv.org/abs/2105.02358)

#### 1.2. Overview
![](./model/img/External_Attention.png)

#### 1.3. Usage Code
```python
from model.attention.ExternalAttention import ExternalAttention
import torch

input=torch.randn(50,49,512)
ea = ExternalAttention(d_model=512,S=8)
output=ea(input)
print(output.shape)
```
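
To make the `d_model` and `S` arguments above more concrete, here is a dependency-free sketch of the double normalization described in the External Attention paper: a softmax over the token axis followed by an L1 normalization over the memory axis. It works on plain Python lists for a single sample and is an illustration only, not this repository's `ExternalAttention` module. The function name `external_attention` and the tiny sizes are assumptions made for the example.

```python
import math
import random


def external_attention(x, m_k, m_v):
    """Illustrative external attention for one sample.

    x:   n x d token features
    m_k: S x d external key memory   (plays the role of a d -> S linear layer)
    m_v: d x S external value memory (plays the role of an S -> d linear layer)
    Returns (out, attn) with shapes n x d and n x S.
    """
    n, s = len(x), len(m_k)
    # Token-to-memory similarity: attn[i][j] = <x_i, m_k[j]>.
    attn = [[sum(a * b for a, b in zip(tok, key)) for key in m_k] for tok in x]
    # Step 1: softmax over the token axis (each memory column sums to 1).
    for j in range(s):
        col_max = max(attn[i][j] for i in range(n))
        exps = [math.exp(attn[i][j] - col_max) for i in range(n)]
        total = sum(exps)
        for i in range(n):
            attn[i][j] = exps[i] / total
    # Step 2: L1 normalization over the memory axis (each token row sums to 1).
    normed = []
    for row in attn:
        total = sum(row)
        normed.append([v / total for v in row])
    # Project back to d dimensions: out[i][c] = sum_j normed[i][j] * m_v[c][j].
    out = [[sum(a * w for a, w in zip(row, m_v_row)) for m_v_row in m_v] for row in normed]
    return out, normed


random.seed(0)
n, d, S = 4, 6, 3  # toy sizes; the usage example above uses n=49, d_model=512, S=8
x = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
m_k = [[random.gauss(0, 1) for _ in range(d)] for _ in range(S)]
m_v = [[random.gauss(0, 1) for _ in range(S)] for _ in range(d)]
out, attn = external_attention(x, m_k, m_v)
print(len(out), len(out[0]))  # 4 6
```

The output keeps the input shape (n x d), which is why the usage example prints the same shape it was given. The cost grows linearly with the number of tokens because the memories have a fixed size `S`.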

***


### 2. Self Attention Usage
#### 2.1. Paper
["Attention Is All You Need"](https://arxiv.org/pdf/1706.03762.pdf)

#### 2.2. Overview
![](./model/img/SA.png)

#### 2.3. Usage Code
```python
from model.attention.SelfAttention import ScaledDotProductAttention
import torch

input=torch.randn(50,49,512)
sa = ScaledDotProductAttention(d_model=512, d_k=512, d_v=512, h=8)
output=sa(input,input,input)
print(output.shape)
```

***

### 3. Simplified Self Attention Usage
#### 3.1. Paper
[None]()

#### 3.2. Overview
![](./model/img/SSA.png)

#### 3.3. Usage Code
```python
from model.attention.SimplifiedSelfAttention import SimplifiedScaledDotProductAttention
import torch

input=torch.randn(50,49,512)
ssa = SimplifiedScaledDotProductAttention(d_model=512, h=8)
output=ssa(input,input,input)
print(output.shape)

```

***

### 4. Squeeze-and-Excitation Attention Usage
#### 4.1. Paper
["Squeeze-and-Excitation Networks"](https://arxiv.org/abs/1709.01507)

#### 4.2. Overview
![](./model/img/SE.png)

#### 4.3. Usage Code
```python
from model.attention.SEAttention import SEAttention
import torch

input=torch.randn(50,512,7,7)
se = SEAttention(channel=512,reduction=8)
output=se(input)
print(output.shape)

```

***

### 5. SK Attention Usage
#### 5.1. Paper
["Selective Kernel Networks"](https://arxiv.org/pdf/1903.06586.pdf)

#### 5.2. Overview
![](./model/img/SK.png)

#### 5.3. Usage Code
```python
from model.attention.SKAttention import SKAttention
import torch

input=torch.randn(50,512,7,7)
sk = SKAttention(channel=512,reduction=8)
output=sk(input)
print(output.shape)

```
***

### 6. CBAM Attention Usage
#### 6.1. Paper
["CBAM: Convolutional Block Attention Module"](https://openaccess.thecvf.com/content_ECCV_2018/papers/Sanghyun_Woo_Convolutional_Block_Attention_ECCV_2018_paper.pdf)

#### 6.2. Overview
![](./model/img/CBAM1.png)

![](./model/img/CBAM2.png)

#### 6.3. Usage Code
```python
from model.attention.CBAM import CBAMBlock
import torch

input=torch.randn(50,512,7,7)
kernel_size=input.shape[2]
cbam = CBAMBlock(channel=512,reduction=16,kernel_size=kernel_size)
output=cbam(input)
print(output.shape)

```

***

### 7. BAM Attention Usage
#### 7.1. Paper
["BAM: Bottleneck Attention Module"](https://arxiv.org/pdf/1807.06514.pdf)

#### 7.2. Overview
![](./model/img/BAM.png)

#### 7.3. Usage Code
```python
from model.attention.BAM import BAMBlock
import torch

input=torch.randn(50,512,7,7)
bam = BAMBlock(channel=512,reduction=16,dia_val=2)
output=bam(input)
print(output.shape)

```

***

### 8. ECA Attention Usage
#### 8.1. Paper
["ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks"](https://arxiv.org/pdf/1910.03151.pdf)

#### 8.2. Overview
![](./model/img/ECA.png)

#### 8.3. Usage Code
```python
from model.attention.ECAAttention import ECAAttention
import torch

input=torch.randn(50,512,7,7)
eca = ECAAttention(kernel_size=3)
output=eca(input)
print(output.shape)

```

***

### 9. DANet Attention Usage
#### 9.1. Paper
["Dual Attention Network for Scene Segmentation"](https://arxiv.org/pdf/1809.02983.pdf)

#### 9.2. Overview
![](./model/img/danet.png)

#### 9.3. Usage Code
```python
from model.attention.DANet import DAModule
import torch

input=torch.randn(50,512,7,7)
danet=DAModule(d_model=512,kernel_size=3,H=7,W=7)
print(danet(input).shape)

```

***

### 10. Pyramid Split Attention Usage

#### 10.1. Paper
["EPSANet: An Efficient Pyramid Split Attention Block on Convolutional Neural Network"](https://arxiv.org/pdf/2105.14447.pdf)

#### 10.2. Overview
![](./model/img/psa.png)

#### 10.3. Usage Code
```python
from model.attention.PSA import PSA
import torch

input=torch.randn(50,512,7,7)
psa = PSA(channel=512,reduction=8)
output=psa(input)
print(output.shape)

```

***


### 11. Efficient Multi-Head Self-Attention Usage

#### 11.1. Paper
["ResT: An Efficient Transformer for Visual Recognition"](https://arxiv.org/abs/2105.13677)

#### 11.2. Overview
![](./model/img/EMSA.png)

#### 11.3. Usage Code
```python

from model.attention.EMSA import EMSA
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(50,64,512)
emsa = EMSA(d_model=512, d_k=512, d_v=512, h=8,H=8,W=8,ratio=2,apply_transform=True)
output=emsa(input,input,input)
print(output.shape)
    
```
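
In the usage example above, the sequence length 64 matches `H*W = 8*8`, because EMSA treats the tokens as an `H x W` feature map before shrinking the keys and values. The sketch below works through that length bookkeeping in plain Python. It assumes the spatial reduction is a strided convolution with kernel `ratio+1`, stride `ratio`, and padding `ratio//2`, as described in the ResT paper. Whether this repository's `EMSA` uses exactly those settings is an assumption, and `emsa_lengths` is a hypothetical helper name.

```python
def conv_out_size(size, kernel, stride, padding):
    """Standard output-size formula for a convolution without dilation."""
    return (size + 2 * padding - kernel) // stride + 1


def emsa_lengths(H, W, ratio):
    """Return (query_length, key_value_length) for an H x W token grid.

    Assumption: keys/values are reduced by a conv with kernel=ratio+1,
    stride=ratio, padding=ratio//2 (ResT paper); ratio <= 1 means no reduction.
    """
    n_q = H * W
    if ratio <= 1:
        return n_q, n_q
    kernel, stride, padding = ratio + 1, ratio, ratio // 2
    n_kv = conv_out_size(H, kernel, stride, padding) * conv_out_size(W, kernel, stride, padding)
    return n_q, n_kv


print(emsa_lengths(8, 8, 2))  # (64, 16)
```

Under these assumptions, the 64 queries attend to only 16 reduced keys, so the attention map is 64 x 16 per head rather than 64 x 64.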

***


### 12. Shuffle Attention Usage

#### 12.1. Paper
["SA-NET: SHUFFLE ATTENTION FOR DEEP CONVOLUTIONAL NEURAL NETWORKS"](https://arxiv.org/pdf/2102.00240.pdf)

#### 12.2. Overview
![](./model/img/ShuffleAttention.jpg)

#### 12.3. Usage Code
```python

from model.attention.ShuffleAttention import ShuffleAttention
import torch
from torch import nn
from torch.nn import functional as F


input=torch.randn(50,512,7,7)
sa = ShuffleAttention(channel=512,G=8)
output=sa(input)
print(output.shape)

    
```


***


### 13. MUSE Attention Usage

#### 13.1. Paper
["MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning"](https://arxiv.org/abs/1911.09483)

#### 13.2. Overview
![](./model/img/MUSE.png)

#### 13.3. Usage Code
```python
from model.attention.MUSEAttention import MUSEAttention
import torch
from torch import nn
from torch.nn import functional as F


input=torch.randn(50,49,512)
sa = MUSEAttention(d_model=512, d_k=512, d_v=512, h=8)
output=sa(input,input,input)
print(output.shape)

```

***


### 14. SGE Attention Usage

#### 14.1. Paper
[Spatial Group-wise Enhance: Improving Semantic Feature Learning in Convolutional Networks](https://arxiv.org/pdf/1905.09646.pdf)

#### 14.2. Overview
![](./model/img/SGE.png)

#### 14.3. Usage Code
```python
from model.attention.SGE import SpatialGroupEnhance
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(50,512,7,7)
sge = SpatialGroupEnhance(groups=8)
output=sge(input)
print(output.shape)

```

***


### 15. A2 Attention Usage

#### 15.1. Paper
[A2-Nets: Double Attention Networks](https://arxiv.org/pdf/1810.11579.pdf)

#### 15.2. Overview
![](./model/img/A2.png)

#### 15.3. Usage Code
```python
from model.attention.A2Atttention import DoubleAttention
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(50,512,7,7)
a2 = DoubleAttention(512,128,128,True)
output=a2(input)
print(output.shape)

```



### 16. AFT Attention Usage

#### 16.1. Paper
[An Attention Free Transformer](https://arxiv.org/pdf/2105.14103v1.pdf)

#### 16.2. Overview
![](./model/img/AFT.jpg)

#### 16.3. Usage Code
```python
from model.attention.AFT import AFT_FULL
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(50,49,512)
aft_full = AFT_FULL(d_model=512, n=49)
output=aft_full(input)
print(output.shape)

```






### 17. Outlook Attention Usage

#### 17.1. Paper


[VOLO: Vision Outlooker for Visual Recognition](https://arxiv.org/abs/2106.13112)


#### 17.2. Overview
![](./model/img/OutlookAttention.png)

#### 17.3. Usage Code
```python
from model.attention.OutlookAttention import OutlookAttention
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(50,28,28,512)
outlook = OutlookAttention(dim=512)
output=outlook(input)
print(output.shape)

```


***






### 18. ViP Attention Usage

#### 18.1. Paper


[Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition](https://arxiv.org/abs/2106.12368)


#### 18.2. Overview
![](./model/img/ViP.png)

#### 18.3. Usage Code
```python

from model.attention.ViP import WeightedPermuteMLP
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(64,8,8,512)
seg_dim=8
vip=WeightedPermuteMLP(512,seg_dim)
out=vip(input)
print(out.shape)

```


***





### 19. CoAtNet Attention Usage

#### 19.1. Paper


[CoAtNet: Marrying Convolution and Attention for All Data Sizes](https://arxiv.org/abs/2106.04803) 


#### 19.2. Overview
None


#### 19.3. Usage Code
```python

from model.attention.CoAtNet import CoAtNet
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,3,224,224)
mbconv=CoAtNet(in_ch=3,image_size=224)
out=mbconv(input)
print(out.shape)

```


***






### 20. HaloNet Attention Usage

#### 20.1. Paper


[Scaling Local Self-Attention for Parameter Efficient Visual Backbones](https://arxiv.org/pdf/2103.12731.pdf) 


#### 20.2. Overview

![](./model/img/HaloNet.png)

#### 20.3. Usage Code
```python

from model.attention.HaloAttention import HaloAttention
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,512,8,8)
halo = HaloAttention(dim=512,
    block_size=2,
    halo_size=1,)
output=halo(input)
print(output.shape)

```
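
The `block_size` and `halo_size` arguments above determine how many positions each query block can see. As a rough illustration of the blocked local attention described in the HaloNet paper, the sketch below counts blocks, queries per block, and keys per block for a given feature map. It is plain arithmetic and does not reproduce this repository's `HaloAttention`. `halo_window_stats` is a hypothetical helper name.

```python
def halo_window_stats(height, width, block_size, halo_size):
    """Return (num_blocks, queries_per_block, keys_per_block).

    Each non-overlapping block of block_size x block_size queries attends to
    a (block_size + 2*halo_size)-wide window centered on that block.
    """
    if height % block_size or width % block_size:
        raise ValueError("feature map size must be divisible by block_size")
    num_blocks = (height // block_size) * (width // block_size)
    queries_per_block = block_size ** 2
    keys_per_block = (block_size + 2 * halo_size) ** 2
    return num_blocks, queries_per_block, keys_per_block


# Settings from the usage example above: 8x8 map, block_size=2, halo_size=1.
print(halo_window_stats(8, 8, 2, 1))  # (16, 4, 16)
```

For the 8x8 input in the example, this gives 16 blocks, each with 4 queries attending to a 4x4 neighborhood of 16 keys.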


***

### 21. Polarized Self-Attention Usage

#### 21.1. Paper

[Polarized Self-Attention: Towards High-quality Pixel-wise Regression](https://arxiv.org/abs/2107.00782)  


#### 21.2. Overview

![](./model/img/PoSA.png)

#### 21.3. Usage Code
```python

from model.attention.PolarizedSelfAttention import ParallelPolarizedSelfAttention,SequentialPolarizedSelfAttention
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,512,7,7)
psa = SequentialPolarizedSelfAttention(channel=512)
output=psa(input)
print(output.shape)


```


***


### 22. CoTAttention Usage

#### 22.1. Paper

[Contextual Transformer Networks for Visual Recognition---arXiv 2021.07.26](https://arxiv.org/abs/2107.12292) 


#### 22.2. Overview

![](./model/img/CoT.png)

#### 22.3. Usage Code
```python

from model.attention.CoTAttention import CoTAttention
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(50,512,7,7)
cot = CoTAttention(dim=512,kernel_size=3)
output=cot(input)
print(output.shape)



```

***


### 23. Residual Attention Usage

#### 23.1. Paper

[Residual Attention: A Simple but Effective Method for Multi-Label Recognition---ICCV2021](https://arxiv.org/abs/2108.02456) 


#### 23.2. Overview

![](./model/img/ResAtt.png)

#### 23.3. Usage Code
```python

from model.attention.ResidualAttention import ResidualAttention
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(50,512,7,7)
resatt = ResidualAttention(channel=512,num_class=1000,la=0.2)
output=resatt(input)
print(output.shape)



```

***



### 24. S2 Attention Usage

#### 24.1. Paper

[S²-MLPv2: Improved Spatial-Shift MLP Architecture for Vision---arXiv 2021.08.02](https://arxiv.org/abs/2108.01072) 


#### 24.2. Overview

![](./model/img/S2Attention.png)

#### 24.3. Usage Code
```python
from model.attention.S2Attention import S2Attention
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(50,512,7,7)
s2att = S2Attention(channels=512)
output=s2att(input)
print(output.shape)

```

***



### 25. GFNet Attention Usage

#### 25.1. Paper

[Global Filter Networks for Image Classification---arXiv 2021.07.01](https://arxiv.org/abs/2107.00645) 


#### 25.2. Overview

![](./model/img/GFNet.jpg)

#### 25.3. Usage Code - Implemented by [Wenliang Zhao (Author)](https://scholar.google.com/citations?user=lyPWvuEAAAAJ&hl=en)

```python
from model.attention.gfnet import GFNet
import torch
from torch import nn
from torch.nn import functional as F

x = torch.randn(1, 3, 224, 224)
gfnet = GFNet(embed_dim=384, img_size=224, patch_size=16, num_classes=1000)
out = gfnet(x)
print(out.shape)

```

***


### 26. TripletAttention Usage

#### 26.1. Paper

[Rotate to Attend: Convolutional Triplet Attention Module---WACV 2021](https://arxiv.org/abs/2010.03045) 

#### 26.2. Overview

![](./model/img/triplet.png)

#### 26.3. Usage Code - Implemented by [digantamisra98](https://github.com/digantamisra98)

```python
from model.attention.TripletAttention import TripletAttention
import torch
from torch import nn
from torch.nn import functional as F
input=torch.randn(50,512,7,7)
triplet = TripletAttention()
output=triplet(input)
print(output.shape)
```


***


### 27. Coordinate Attention Usage

#### 27.1. Paper

[Coordinate Attention for Efficient Mobile Network Design---CVPR 2021](https://arxiv.org/abs/2103.02907)


#### 27.2. Overview

![](./model/img/CoordAttention.png)

#### 27.3. Usage Code - Implemented by [Andrew-Qibin](https://github.com/Andrew-Qibin)

```python
from model.attention.CoordAttention import CoordAtt
import torch
from torch import nn
from torch.nn import functional as F

inp=torch.rand([2, 96, 56, 56])
inp_dim, oup_dim = 96, 96
reduction=32

coord_attention = CoordAtt(inp_dim, oup_dim, reduction=reduction)
output=coord_attention(inp)
print(output.shape)
```

***


### 28. MobileViT Attention Usage

#### 28.1. Paper

[MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer---ArXiv 2021.10.05](https://arxiv.org/abs/2110.02178)


#### 28.2. Overview

![](./model/img/MobileViTAttention.png)

#### 28.3. Usage Code

```python
from model.attention.MobileViTAttention import MobileViTAttention
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    m=MobileViTAttention()
    input=torch.randn(1,3,49,49)
    output=m(input)
    print(output.shape)  #output:(1,3,49,49)
    
```

***


### 29. ParNet Attention Usage

#### 29.1. Paper

[Non-deep Networks---ArXiv 2021.10.20](https://arxiv.org/abs/2110.07641)


#### 29.2. Overview

![](./model/img/ParNet.png)

#### 29.3. Usage Code

```python
from model.attention.ParNetAttention import *
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(50,512,7,7)
    pna = ParNetAttention(channel=512)
    output=pna(input)
    print(output.shape) #50,512,7,7
    
```

***


### 30. UFO Attention Usage

#### 30.1. Paper

[UFO-ViT: High Performance Linear Vision Transformer without Softmax---ArXiv 2021.09.29](https://arxiv.org/abs/2109.14382)


#### 30.2. Overview

![](./model/img/UFO.png)

#### 30.3. Usage Code

```python
from model.attention.UFOAttention import *
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(50,49,512)
    ufo = UFOAttention(d_model=512, d_k=512, d_v=512, h=8)
    output=ufo(input,input,input)
    print(output.shape) #[50, 49, 512]
    
```

***

### 31. ACmix Attention Usage

#### 31.1. Paper

[On the Integration of Self-Attention and Convolution](https://arxiv.org/pdf/2111.14556.pdf)

#### 31.2. Usage Code

```python
from model.attention.ACmixAttention import ACmix
import torch

if __name__ == '__main__':
    input=torch.randn(50,256,7,7)
    acmix = ACmix(in_planes=256, out_planes=256)
    output=acmix(input)
    print(output.shape)
    
```

### 32. MobileViTv2 Attention Usage

#### 32.1. Paper

[Separable Self-attention for Mobile Vision Transformers---ArXiv 2022.06.06](https://arxiv.org/abs/2206.02680)


#### 32.2. Overview

![](./model/img/MobileViTv2.png)

#### 32.3. Usage Code

```python
from model.attention.MobileViTv2Attention import MobileViTv2Attention
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(50,49,512)
    sa = MobileViTv2Attention(d_model=512)
    output=sa(input)
    print(output.shape)
    
```

### 33. DAT Attention Usage

#### 33.1. Paper

[Vision Transformer with Deformable Attention---CVPR2022](https://arxiv.org/abs/2201.00520)

#### 33.2. Usage Code

```python
from model.attention.DAT import DAT
import torch

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = DAT(
        img_size=224,
        patch_size=4,
        num_classes=1000,
        expansion=4,
        dim_stem=96,
        dims=[96, 192, 384, 768],
        depths=[2, 2, 6, 2],
        stage_spec=[['L', 'S'], ['L', 'S'], ['L', 'D', 'L', 'D', 'L', 'D'], ['L', 'D']],
        heads=[3, 6, 12, 24],
        window_sizes=[7, 7, 7, 7] ,
        groups=[-1, -1, 3, 6],
        use_pes=[False, False, True, True],
        dwc_pes=[False, False, False, False],
        strides=[-1, -1, 1, 1],
        sr_ratios=[-1, -1, -1, -1],
        offset_range_factor=[-1, -1, 2, 2],
        no_offs=[False, False, False, False],
        fixed_pes=[False, False, False, False],
        use_dwc_mlps=[False, False, False, False],
        use_conv_patches=False,
        drop_rate=0.0,
        attn_drop_rate=0.0,
        drop_path_rate=0.2,
    )
    output=model(input)
    print(output[0].shape)
    
```

### 34. CrossFormer Attention Usage

#### 34.1. Paper

[CROSSFORMER: A VERSATILE VISION TRANSFORMER HINGING ON CROSS-SCALE ATTENTION---ICLR 2022](https://arxiv.org/pdf/2108.00154.pdf)

#### 34.2. Usage Code

```python
from model.attention.Crossformer import CrossFormer
import torch

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = CrossFormer(img_size=224,
        patch_size=[4, 8, 16, 32],
        in_chans= 3,
        num_classes=1000,
        embed_dim=48,
        depths=[2, 2, 6, 2],
        num_heads=[3, 6, 12, 24],
        group_size=[7, 7, 7, 7],
        mlp_ratio=4.,
        qkv_bias=True,
        qk_scale=None,
        drop_rate=0.0,
        drop_path_rate=0.1,
        ape=False,
        patch_norm=True,
        use_checkpoint=False,
        merge_size=[[2, 4], [2,4], [2, 4]]
    )
    output=model(input)
    print(output.shape)
    
```

### 35. MOATransformer Attention Usage

#### 35.1. Paper

[Aggregating Global Features into Local Vision Transformer](https://arxiv.org/abs/2201.12903)

#### 35.2. Usage Code

```python
from model.attention.MOATransformer import MOATransformer
import torch

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = MOATransformer(
        img_size=224,
        patch_size=4,
        in_chans=3,
        num_classes=1000,
        embed_dim=96,
        depths=[2, 2, 6],
        num_heads=[3, 6, 12],
        window_size=14,
        mlp_ratio=4.,
        qkv_bias=True,
        qk_scale=None,
        drop_rate=0.0,
        drop_path_rate=0.1,
        ape=False,
        patch_norm=True,
        use_checkpoint=False
    )
    output=model(input)
    print(output.shape)
    
```

### 36. CrissCross Attention Usage

#### 36.1. Paper

[CCNet: Criss-Cross Attention for Semantic Segmentation](https://arxiv.org/abs/1811.11721)

#### 36.2. Usage Code

```python
from model.attention.CrissCrossAttention import CrissCrossAttention
import torch

if __name__ == '__main__':
    input=torch.randn(3, 64, 7, 7)
    model = CrissCrossAttention(64)
    outputs = model(input)
    print(outputs.shape)
    
```

### 37. Axial Attention Usage

#### 37.1. Paper

[Axial Attention in Multidimensional Transformers](https://arxiv.org/abs/1912.12180)

#### 37.2. Usage Code

```python
from model.attention.Axial_attention import AxialImageTransformer
import torch

if __name__ == '__main__':
    input=torch.randn(3, 128, 7, 7)
    model = AxialImageTransformer(
        dim = 128,
        depth = 12,
        reversible = True
    )
    outputs = model(input)
    print(outputs.shape)
    
```

***


# Backbone Series

- Pytorch implementation of ["Deep Residual Learning for Image Recognition---CVPR2016 Best Paper"](https://arxiv.org/pdf/1512.03385.pdf)

- Pytorch implementation of ["Aggregated Residual Transformations for Deep Neural Networks---CVPR2017"](https://arxiv.org/abs/1611.05431v2)

- Pytorch implementation of [MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer---ArXiv 2021.10.05](https://arxiv.org/abs/2110.02178)

- Pytorch implementation of [Patches Are All You Need?---ICLR2022 (Under Review)](https://openreview.net/forum?id=TVHS5Y4dNvM)

- Pytorch implementation of [Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer---ArXiv 2021.06.07](https://arxiv.org/abs/2106.03650)

- Pytorch implementation of [ConTNet: Why not use convolution and transformer at the same time?---ArXiv 2021.04.27](https://arxiv.org/abs/2104.13497)

- Pytorch implementation of [Vision Transformers with Hierarchical Attention---ArXiv 2022.06.15](https://arxiv.org/abs/2106.03180)

- Pytorch implementation of [Co-Scale Conv-Attentional Image Transformers---ArXiv 2021.08.26](https://arxiv.org/abs/2104.06399)

- Pytorch implementation of [Conditional Positional Encodings for Vision Transformers](https://arxiv.org/abs/2102.10882)

- Pytorch implementation of [Rethinking Spatial Dimensions of Vision Transformers---ICCV 2021](https://arxiv.org/abs/2103.16302)

- Pytorch implementation of [CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification---ICCV 2021](https://arxiv.org/abs/2103.14899)

- Pytorch implementation of [Transformer in Transformer---NeurIPS 2021](https://arxiv.org/abs/2103.00112)

- Pytorch implementation of [DeepViT: Towards Deeper Vision Transformer](https://arxiv.org/abs/2103.11886)

- Pytorch implementation of [Incorporating Convolution Designs into Visual Transformers](https://arxiv.org/abs/2103.11816)

- Pytorch implementation of [ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases](https://arxiv.org/abs/2103.10697)

- Pytorch implementation of [Augmenting Convolutional networks with attention-based aggregation](https://arxiv.org/abs/2112.13692)

- Pytorch implementation of [Going deeper with Image Transformers---ICCV 2021 (Oral)](https://arxiv.org/abs/2103.17239)

- Pytorch implementation of [Training data-efficient image transformers & distillation through attention---ICML 2021](https://arxiv.org/abs/2012.12877)

- Pytorch implementation of [LeViT: a Vision Transformer in ConvNet’s Clothing for Faster Inference](https://arxiv.org/abs/2104.01136)

- Pytorch implementation of [VOLO: Vision Outlooker for Visual Recognition](https://arxiv.org/abs/2106.13112)

- Pytorch implementation of [Container: Context Aggregation Network---NeurIPS 2021](https://arxiv.org/abs/2106.01401)

- Pytorch implementation of [CMT: Convolutional Neural Networks Meet Vision Transformers---CVPR 2022](https://arxiv.org/abs/2107.06263)

- Pytorch implementation of [Vision Transformer with Deformable Attention---CVPR 2022](https://arxiv.org/abs/2201.00520)

- Pytorch implementation of [EfficientFormer: Vision Transformers at MobileNet Speed](https://arxiv.org/abs/2206.01191)

- Pytorch implementation of [ConvNeXtV2: Co-designing and Scaling ConvNets with Masked Autoencoders](https://arxiv.org/abs/2301.00808)


### 1. ResNet Usage
#### 1.1. Paper
["Deep Residual Learning for Image Recognition---CVPR2016 Best Paper"](https://arxiv.org/pdf/1512.03385.pdf)

#### 1.2. Overview
![](./model/img/resnet.png)
![](./model/img/resnet2.jpg)

#### 1.3. Usage Code
```python

from model.backbone.resnet import ResNet50,ResNet101,ResNet152
import torch
if __name__ == '__main__':
    input=torch.randn(50,3,224,224)
    resnet50=ResNet50(1000)
    # resnet101=ResNet101(1000)
    # resnet152=ResNet152(1000)
    out=resnet50(input)
    print(out.shape)

```


### 2. ResNeXt Usage
#### 2.1. Paper

["Aggregated Residual Transformations for Deep Neural Networks---CVPR2017"](https://arxiv.org/abs/1611.05431v2)

#### 2.2. Overview
![](./model/img/resnext.png)

#### 2.3. Usage Code
```python

from model.backbone.resnext import ResNeXt50,ResNeXt101,ResNeXt152
import torch

if __name__ == '__main__':
    input=torch.randn(50,3,224,224)
    resnext50=ResNeXt50(1000)
    # resnext101=ResNeXt101(1000)
    # resnext152=ResNeXt152(1000)
    out=resnext50(input)
    print(out.shape)


```



### 3. MobileViT Usage
#### 3.1. Paper

[MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer---ArXiv 2021.10.05](https://arxiv.org/abs/2110.02178)

#### 3.2. Overview
![](./model/img/mobileViT.jpg)

#### 3.3. Usage Code
```python

from model.backbone.MobileViT import *
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)

    ### mobilevit_xxs
    mvit_xxs=mobilevit_xxs()
    out=mvit_xxs(input)
    print(out.shape)

    ### mobilevit_xs
    mvit_xs=mobilevit_xs()
    out=mvit_xs(input)
    print(out.shape)


    ### mobilevit_s
    mvit_s=mobilevit_s()
    out=mvit_s(input)
    print(out.shape)

```





### 4. ConvMixer Usage
#### 4.1. Paper
[Patches Are All You Need?---ICLR2022 (Under Review)](https://openreview.net/forum?id=TVHS5Y4dNvM)
#### 4.2. Overview
![](./model/img/ConvMixer.png)

#### 4.3. Usage Code
```python

from model.backbone.ConvMixer import *
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    x=torch.randn(1,3,224,224)
    convmixer=ConvMixer(dim=512,depth=12)
    out=convmixer(x)
    print(out.shape)  #[1, 1000]


```

### 5. ShuffleTransformer Usage
#### 5.1. Paper
[Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer](https://arxiv.org/pdf/2106.03650.pdf)

#### 5.2. Usage Code
```python

from model.backbone.ShuffleTransformer import ShuffleTransformer
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    sft = ShuffleTransformer()
    output=sft(input)
    print(output.shape)


```

### 6. ConTNet Usage
#### 6.1. Paper
[ConTNet: Why not use convolution and transformer at the same time?](https://arxiv.org/abs/2104.13497)

#### 6.2. Usage Code
```python

from model.backbone.ConTNet import build_model
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == "__main__":
    model = build_model(use_avgdown=True, relative=True, qkv_bias=True, pre_norm=True)
    input = torch.randn(1, 3, 224, 224)
    out = model(input)
    print(out.shape)


```

### 7. HATNet Usage
#### 7.1. Paper
[Vision Transformers with Hierarchical Attention](https://arxiv.org/abs/2106.03180)

#### 7.2. Usage Code
```python

from model.backbone.HATNet import HATNet
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    hat = HATNet(dims=[48, 96, 240, 384], head_dim=48, expansions=[8, 8, 4, 4],
        grid_sizes=[8, 7, 7, 1], ds_ratios=[8, 4, 2, 1], depths=[2, 2, 6, 3])
    output=hat(input)
    print(output.shape)


```

### 8. CoaT Usage
#### 8.1. Paper
[Co-Scale Conv-Attentional Image Transformers](https://arxiv.org/abs/2104.06399)

#### 8.2. Usage Code
```python

from model.backbone.CoaT import CoaT
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = CoaT(patch_size=4, embed_dims=[152, 152, 152, 152], serial_depths=[2, 2, 2, 2], parallel_depth=6, num_heads=8, mlp_ratios=[4, 4, 4, 4])
    output=model(input)
    print(output.shape) # torch.Size([1, 1000])

```

### 9. PVT Usage
#### 9.1. Paper
[PVT v2: Improved Baselines with Pyramid Vision Transformer](https://arxiv.org/pdf/2106.13797.pdf)

#### 9.2. Usage Code
```python

from functools import partial
from model.backbone.PVT import PyramidVisionTransformer
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = PyramidVisionTransformer(
        patch_size=4, embed_dims=[64, 128, 320, 512], num_heads=[1, 2, 5, 8], mlp_ratios=[8, 8, 4, 4], qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6), depths=[2, 2, 2, 2], sr_ratios=[8, 4, 2, 1])
    output=model(input)
    print(output.shape)

```
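
Several backbone constructors in this section take `norm_layer=partial(nn.LayerNorm, eps=1e-6)`. `partial` comes from Python's `functools` module and pre-binds the `eps` keyword, so the model can later build a layer by passing only the feature size. The sketch below shows the mechanism with a hypothetical stand-in class (`FakeLayerNorm` is not part of PyTorch or this repo), so it runs without any dependencies. How each backbone actually calls `norm_layer` is an assumption here.

```python
from functools import partial


class FakeLayerNorm:
    """Hypothetical stand-in for nn.LayerNorm; it only records its arguments."""

    def __init__(self, normalized_shape, eps=1e-5):
        self.normalized_shape = normalized_shape
        self.eps = eps


# Pre-bind eps; the result is a callable that still needs normalized_shape.
norm_layer = partial(FakeLayerNorm, eps=1e-6)

# A model would typically call the factory once per stage with the stage width.
layers = [norm_layer(dim) for dim in (64, 128, 320, 512)]

print([(layer.normalized_shape, layer.eps) for layer in layers])
```

Because `partial` lives in the standard library, any snippet that passes `partial(...)` needs `from functools import partial` at the top.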


### 10. CPVT Usage
#### 10.1. Paper
[Conditional Positional Encodings for Vision Transformers](https://arxiv.org/abs/2102.10882)

#### 10.2. Usage Code
```python

from functools import partial
from model.backbone.CPVT import CPVTV2
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = CPVTV2(
        patch_size=4, embed_dims=[64, 128, 320, 512], num_heads=[1, 2, 5, 8], mlp_ratios=[8, 8, 4, 4], qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6), depths=[3, 4, 6, 3], sr_ratios=[8, 4, 2, 1])
    output=model(input)
    print(output.shape)

```

### 11. PIT Usage
#### 11.1. Paper
[Rethinking Spatial Dimensions of Vision Transformers](https://arxiv.org/abs/2103.16302)

#### 11.2. Usage Code
```python

from model.backbone.PIT import PoolingTransformer
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = PoolingTransformer(
        image_size=224,
        patch_size=14,
        stride=7,
        base_dims=[64, 64, 64],
        depth=[3, 6, 4],
        heads=[4, 8, 16],
        mlp_ratio=4
    )
    output=model(input)
    print(output.shape)

```

### 12. CrossViT Usage
#### 12.1. Paper
[CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification](https://arxiv.org/abs/2103.14899)

#### 12.2. Usage Code
```python

from functools import partial
from model.backbone.CrossViT import VisionTransformer
import torch
from torch import nn

if __name__ == "__main__":
    input=torch.randn(1,3,224,224)
    model = VisionTransformer(
        img_size=[240, 224],
        patch_size=[12, 16], 
        embed_dim=[192, 384], 
        depth=[[1, 4, 0], [1, 4, 0], [1, 4, 0]],
        num_heads=[6, 6], 
        mlp_ratio=[4, 4, 1], 
        qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6)
    )
    output=model(input)
    print(output.shape)

```

### 13. TnT Usage
#### 13.1. Paper
[Transformer in Transformer](https://arxiv.org/abs/2103.00112)

#### 13.2. Usage Code
```python

from model.backbone.TnT import TNT
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = TNT(
        img_size=224, 
        patch_size=16, 
        outer_dim=384, 
        inner_dim=24, 
        depth=12,
        outer_num_heads=6, 
        inner_num_heads=4, 
        qkv_bias=False,
        inner_stride=4)
    output=model(input)
    print(output.shape)

```

### 14. DViT Usage
#### 14.1. Paper
[DeepViT: Towards Deeper Vision Transformer](https://arxiv.org/abs/2103.11886)

#### 14.2. Usage Code
```python

from functools import partial
from model.backbone.DViT import DeepVisionTransformer
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = DeepVisionTransformer(
        patch_size=16, embed_dim=384, 
        depth=[False] * 16, 
        apply_transform=[False] * 0 + [True] * 32, 
        num_heads=12, 
        mlp_ratio=3, 
        qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6),
        )
    output=model(input)
    print(output.shape)

```

### 15. CeiT Usage
#### 15.1. Paper
[Incorporating Convolution Designs into Visual Transformers](https://arxiv.org/abs/2103.11816)

#### 15.2. Usage Code
```python

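from functools import partial
# Image2Tokens is assumed to be defined in the same module as CeIT
from model.backbone.CeiT import CeIT, Image2Tokens
import torch
from torch import nn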

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = CeIT(
        hybrid_backbone=Image2Tokens(),
        patch_size=4, 
        embed_dim=192, 
        depth=12, 
        num_heads=3, 
        mlp_ratio=4, 
        qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6)
        )
    output=model(input)
    print(output.shape)

```

### 16. ConViT Usage
#### 16.1. Paper
[ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases](https://arxiv.org/abs/2103.10697)

#### 16.2. Usage Code
```python

from functools import partial
from model.backbone.ConViT import VisionTransformer
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = VisionTransformer(
        num_heads=16,
        norm_layer=partial(nn.LayerNorm, eps=1e-6)
        )
    output=model(input)
    print(output.shape)

```

### 17. CaiT Usage
#### 17.1. Paper
[Going deeper with Image Transformers](https://arxiv.org/abs/2103.17239)

#### 17.2. Usage Code
```python

from functools import partial
from model.backbone.CaiT import CaiT
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = CaiT(
        img_size= 224,
        patch_size=16, 
        embed_dim=192, 
        depth=24, 
        num_heads=4, 
        mlp_ratio=4, 
        qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6),
        init_scale=1e-5,
        depth_token_only=2
        )
    output=model(input)
    print(output.shape)

```

### 18. PatchConvnet Usage
#### 18.1. Paper
[Augmenting Convolutional networks with attention-based aggregation](https://arxiv.org/abs/2112.13692)

#### 18.2. Usage Code
```python

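from functools import partial
# ConvStem and Conv_blocks_se are assumed to be defined in the same module as PatchConvnet
from model.backbone.PatchConvnet import PatchConvnet, ConvStem, Conv_blocks_se
import torch
from torch import nn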

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = PatchConvnet(
        patch_size=16,
        embed_dim=384,
        depth=60,
        num_heads=1,
        qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6),
        Patch_layer=ConvStem,
        Attention_block=Conv_blocks_se,
        depth_token_only=1,
        mlp_ratio_clstk=3.0,
    )
    output=model(input)
    print(output.shape)

```

### 19. DeiT Usage
#### 19.1. Paper
[Training data-efficient image transformers & distillation through attention](https://arxiv.org/abs/2012.12877)

#### 19.2. Usage Code
```python

from functools import partial
from model.backbone.DeiT import DistilledVisionTransformer
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = DistilledVisionTransformer(
        patch_size=16, 
        embed_dim=384, 
        depth=12, 
        num_heads=6, 
        mlp_ratio=4, 
        qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6)
        )
    output=model(input)
    print(output[0].shape)

```

### 20. LeViT Usage
#### 20.1. Paper
[LeViT: a Vision Transformer in ConvNet’s Clothing for Faster Inference](https://arxiv.org/abs/2104.01136)

#### 20.2. Usage Code
```python

from model.backbone.LeViT import *
import torch
from torch import nn

if __name__ == '__main__':
    for name in specification:
        input=torch.randn(1,3,224,224)
        model = globals()[name](fuse=True, pretrained=False)
        model.eval()
        output = model(input)
        print(output.shape)

```

### 21. VOLO Usage
#### 21.1. Paper
[VOLO: Vision Outlooker for Visual Recognition](https://arxiv.org/abs/2106.13112)

#### 21.2. Usage Code
```python

from model.backbone.VOLO import VOLO
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = VOLO([4, 4, 8, 2],
                 embed_dims=[192, 384, 384, 384],
                 num_heads=[6, 12, 12, 12],
                 mlp_ratios=[3, 3, 3, 3],
                 downsamples=[True, False, False, False],
                 outlook_attention=[True, False, False, False ],
                 post_layers=['ca', 'ca'],
                 )
    output=model(input)
    print(output[0].shape)

```

### 22. Container Usage
#### 22.1. Paper
[Container: Context Aggregation Network](https://arxiv.org/abs/2106.01401)

#### 22.2. Usage Code
```python

from functools import partial
from model.backbone.Container import VisionTransformer
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = VisionTransformer(
        img_size=[224, 56, 28, 14], 
        patch_size=[4, 2, 2, 2], 
        embed_dim=[64, 128, 320, 512], 
        depth=[3, 4, 8, 3], 
        num_heads=16, 
        mlp_ratio=[8, 8, 4, 4], 
        qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6))
    output=model(input)
    print(output.shape)

```

### 23. CMT Usage
#### 23.1. Paper
[CMT: Convolutional Neural Networks Meet Vision Transformers](https://arxiv.org/abs/2107.06263)

#### 23.2. Usage Code
```python

from model.backbone.CMT import CMT_Tiny
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = CMT_Tiny()
    output=model(input)
    print(output[0].shape)

```

### 24. EfficientFormer Usage
#### 24.1. Paper
[EfficientFormer: Vision Transformers at MobileNet Speed](https://arxiv.org/abs/2206.01191)

#### 24.2. Usage Code
```python

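# EfficientFormer_depth and EfficientFormer_width are assumed to be defined in the same module
from model.backbone.EfficientFormer import EfficientFormer, EfficientFormer_depth, EfficientFormer_width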
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = EfficientFormer(
        layers=EfficientFormer_depth['l1'],
        embed_dims=EfficientFormer_width['l1'],
        downsamples=[True, True, True, True],
        vit_num=1,
    )
    output=model(input)
    print(output[0].shape)

```

### 25. ConvNeXtV2 Usage
#### 25.1. Paper
[ConvNeXtV2: Co-designing and Scaling ConvNets with Masked Autoencoders](https://arxiv.org/abs/2301.00808)

#### 25.2. Usage Code
```python

from model.backbone.convnextv2 import convnextv2_atto
import torch
from torch import nn

if __name__ == "__main__":
    model = convnextv2_atto()
    input = torch.randn(1, 3, 224, 224)
    out = model(input)
    print(out.shape)

```





# MLP Series

- Pytorch implementation of ["RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition---arXiv 2021.05.05"](https://arxiv.org/pdf/2105.01883v1.pdf)

- Pytorch implementation of ["MLP-Mixer: An all-MLP Architecture for Vision---arXiv 2021.05.17"](https://arxiv.org/pdf/2105.01601.pdf)

- Pytorch implementation of ["ResMLP: Feedforward networks for image classification with data-efficient training---arXiv 2021.05.07"](https://arxiv.org/pdf/2105.03404.pdf)

- Pytorch implementation of ["Pay Attention to MLPs---arXiv 2021.05.17"](https://arxiv.org/abs/2105.08050)


- Pytorch implementation of ["Sparse MLP for Image Recognition: Is Self-Attention Really Necessary?---arXiv 2021.09.12"](https://arxiv.org/abs/2109.05422)

### 1. RepMLP Usage
#### 1.1. Paper
["RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition"](https://arxiv.org/pdf/2105.01883v1.pdf)

#### 1.2. Overview
![](./model/img/repmlp.png)

#### 1.3. Usage Code
```python
from model.mlp.repmlp import RepMLP
import torch
from torch import nn

N=4 #batch size
C=512 #input dim
O=1024 #output dim
H=14 #image height
W=14 #image width
h=7 #patch height
w=7 #patch width
fc1_fc2_reduction=1 #reduction ratio
fc3_groups=8 # groups
repconv_kernels=[1,3,5,7] #kernel list
repmlp=RepMLP(C,O,H,W,h,w,fc1_fc2_reduction,fc3_groups,repconv_kernels=repconv_kernels)
x=torch.randn(N,C,H,W)
repmlp.eval()
for module in repmlp.modules():
    if isinstance(module, nn.BatchNorm2d) or isinstance(module, nn.BatchNorm1d):
        nn.init.uniform_(module.running_mean, 0, 0.1)
        nn.init.uniform_(module.running_var, 0, 0.1)
        nn.init.uniform_(module.weight, 0, 0.1)
        nn.init.uniform_(module.bias, 0, 0.1)

#output of the training-time structure (evaluated in eval mode)
out=repmlp(x)
#output after converting to the inference-time structure
repmlp.switch_to_deploy()
deployout = repmlp(x)

print(((deployout-out)**2).sum())
```

### 2. MLP-Mixer Usage
#### 2.1. Paper
["MLP-Mixer: An all-MLP Architecture for Vision"](https://arxiv.org/pdf/2105.01601.pdf)

#### 2.2. Overview
![](./model/img/mlpmixer.png)

#### 2.3. Usage Code
```python
from model.mlp.mlp_mixer import MlpMixer
import torch
mlp_mixer=MlpMixer(num_classes=1000,num_blocks=10,patch_size=10,tokens_hidden_dim=32,channels_hidden_dim=1024,tokens_mlp_dim=16,channels_mlp_dim=1024)
input=torch.randn(50,3,40,40)
output=mlp_mixer(input)
print(output.shape)
```
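
MLP-Mixer splits the image into non-overlapping patches and mixes information across the resulting tokens, so the number of tokens is fixed by the image and patch sizes. The helper below is an illustrative sketch (`num_patches` is a hypothetical name, not a function from this repo). For the 40x40 input and `patch_size=10` used above it gives 16 tokens, which matches the `tokens_mlp_dim=16` argument. That the argument denotes the token count is an assumption, not something this README states.

```python
def num_patches(image_size: int, patch_size: int) -> int:
    """Number of non-overlapping square patches in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


print(num_patches(40, 10))   # setting used in the MLP-Mixer example
print(num_patches(224, 16))  # a common ViT-style setting
```

If the input resolution or `patch_size` changes, the token-related arguments most likely need to change with it.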

***

### 3. ResMLP Usage
#### 3.1. Paper
["ResMLP: Feedforward networks for image classification with data-efficient training"](https://arxiv.org/pdf/2105.03404.pdf)

#### 3.2. Overview
![](./model/img/resmlp.png)

#### 3.3. Usage Code
```python
from model.mlp.resmlp import ResMLP
import torch

input=torch.randn(50,3,14,14)
resmlp=ResMLP(dim=128,image_size=14,patch_size=7,class_num=1000)
out=resmlp(input)
print(out.shape) #the last dimension is class_num
```

***

### 4. gMLP Usage
#### 4.1. Paper
["Pay Attention to MLPs"](https://arxiv.org/abs/2105.08050)

#### 4.2. Overview
![](./model/img/gMLP.jpg)

#### 4.3. Usage Code
```python
from model.mlp.g_mlp import gMLP
import torch

num_tokens=10000
bs=50
len_sen=49
num_layers=6
input=torch.randint(num_tokens,(bs,len_sen)) #bs,len_sen
gmlp = gMLP(num_tokens=num_tokens,len_sen=len_sen,dim=512,d_ff=1024)
output=gmlp(input)
print(output.shape)
```

***

### 5. sMLP Usage
#### 5.1. Paper
["Sparse MLP for Image Recognition: Is Self-Attention Really Necessary?"](https://arxiv.org/abs/2109.05422)

#### 5.2. Overview
![](./model/img/sMLP.jpg)

#### 5.3. Usage Code
```python
from model.mlp.sMLP_block import sMLPBlock
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(50,3,224,224)
    smlp=sMLPBlock(h=224,w=224)
    out=smlp(input)
    print(out.shape)
```

### 6. vip-mlp Usage
#### 6.1. Paper
["Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition"](https://arxiv.org/abs/2106.12368)

#### 6.2. Usage Code
```python
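import importlib
import torch
from torch import nn
from torch.nn import functional as F

# "vip-mlp" contains a hyphen, so it cannot appear in a regular import statement.
# WeightedPermuteMLP is assumed to be defined in the same module as VisionPermutator.
vip_mlp = importlib.import_module("model.mlp.vip-mlp")
VisionPermutator = vip_mlp.VisionPermutator
WeightedPermuteMLP = vip_mlp.WeightedPermuteMLP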

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = VisionPermutator(
        layers=[4, 3, 8, 3], 
        embed_dims=[384, 384, 384, 384], 
        patch_size=14, 
        transitions=[False, False, False, False],
        segment_dim=[16, 16, 16, 16], 
        mlp_ratios=[3, 3, 3, 3], 
        mlp_fn=WeightedPermuteMLP
    )
    output=model(input)
    print(output.shape)
```


# Re-Parameter Series

- Pytorch implementation of ["RepVGG: Making VGG-style ConvNets Great Again---CVPR2021"](https://arxiv.org/abs/2101.03697)

- Pytorch implementation of ["ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks---ICCV2019"](https://arxiv.org/abs/1908.03930)

- Pytorch implementation of ["Diverse Branch Block: Building a Convolution as an Inception-like Unit---CVPR2021"](https://arxiv.org/abs/2103.13425)


***

### 1. RepVGG Usage
#### 1.1. Paper
["RepVGG: Making VGG-style ConvNets Great Again"](https://arxiv.org/abs/2101.03697)

#### 1.2. Overview
![](./model/img/repvgg.png)

#### 1.3. Usage Code
```python

from model.rep.repvgg import RepBlock
import torch


input=torch.randn(50,512,49,49)
repblock=RepBlock(512,512)
repblock.eval()
out=repblock(input)
repblock._switch_to_deploy()
out2=repblock(input)
print('difference between vgg and repvgg')
print(((out2-out)**2).sum())
```



***

### 2. ACNet Usage
#### 2.1. Paper
["ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks"](https://arxiv.org/abs/1908.03930)

#### 2.2. Overview
![](./model/img/acnet.png)

#### 2.3. Usage Code
```python
from model.rep.acnet import ACNet
import torch
from torch import nn

input=torch.randn(50,512,49,49)
acnet=ACNet(512,512)
acnet.eval()
out=acnet(input)
acnet._switch_to_deploy()
out2=acnet(input)
print('difference:')
print(((out2-out)**2).sum())

```



***

### 3. Diverse Branch Block Usage
#### 3.1. Paper
["Diverse Branch Block: Building a Convolution as an Inception-like Unit"](https://arxiv.org/abs/2103.13425)

#### 3.2. Overview
![](./model/img/ddb.png)

#### 3.3. Usage Code
##### 3.3.1 Transform I
```python
from model.rep.ddb import transI_conv_bn
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,64,7,7)
#conv+bn
conv1=nn.Conv2d(64,64,3,padding=1)
bn1=nn.BatchNorm2d(64)
bn1.eval()
out1=bn1(conv1(input))

#conv_fuse
conv_fuse=nn.Conv2d(64,64,3,padding=1)
conv_fuse.weight.data,conv_fuse.bias.data=transI_conv_bn(conv1,bn1)
out2=conv_fuse(input)

print("difference:",((out2-out1)**2).sum().item())
```
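
Transform I folds an inference-mode batch-norm layer into the preceding convolution. As a reminder of the standard identity, with symbols chosen here for illustration ($W_j, b_j$ for the convolution weight and bias of output channel $j$, $\gamma_j, \beta_j$ for the BN scale and shift, $\mu_j, \sigma_j^2$ for the running mean and variance, $\epsilon$ for the BN epsilon):

$$
W'_j = \frac{\gamma_j}{\sqrt{\sigma_j^2 + \epsilon}}\, W_j,
\qquad
b'_j = \frac{\gamma_j}{\sqrt{\sigma_j^2 + \epsilon}}\,(b_j - \mu_j) + \beta_j
$$

The identity uses the running statistics, so it only holds when BN is in eval mode, which is why the snippet calls `bn1.eval()` before comparing outputs.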

##### 3.3.2 Transform II
```python
from model.rep.ddb import transII_conv_branch
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,64,7,7)

#conv+conv
conv1=nn.Conv2d(64,64,3,padding=1)
conv2=nn.Conv2d(64,64,3,padding=1)
out1=conv1(input)+conv2(input)

#conv_fuse
conv_fuse=nn.Conv2d(64,64,3,padding=1)
conv_fuse.weight.data,conv_fuse.bias.data=transII_conv_branch(conv1,conv2)
out2=conv_fuse(input)

print("difference:",((out2-out1)**2).sum().item())
```
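
Transform II relies on the linearity of convolution: two parallel branches with the same kernel shape, stride and padding can be merged by adding their kernels and adding their biases. The dependency-free sketch below checks this on a 1-D signal. `conv1d` is a hypothetical helper written for illustration, not the repo's `transII_conv_branch`.

```python
def conv1d(signal, kernel, bias=0.0):
    """Valid (no padding) 1-D cross-correlation, the operation conv layers compute."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(signal) - k + 1)
    ]


signal = [1.0, 2.0, -1.0, 0.5, 3.0]
k1, b1 = [0.5, -1.0, 2.0], 0.25
k2, b2 = [1.0, 0.0, -0.5], -0.75

# Two parallel branches whose outputs are summed.
branch_sum = [a + b for a, b in zip(conv1d(signal, k1, b1), conv1d(signal, k2, b2))]

# One fused convolution: add the kernels and add the biases.
fused_kernel = [a + b for a, b in zip(k1, k2)]
fused = conv1d(signal, fused_kernel, b1 + b2)

max_diff = max(abs(a - b) for a, b in zip(branch_sum, fused))
print(max_diff)
```

The printed difference should be zero up to floating-point rounding, mirroring the `difference:` check in the snippet above.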

##### 3.3.3 Transform III
```python
from model.rep.ddb import transIII_conv_sequential
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,64,7,7)

#conv+conv
conv1=nn.Conv2d(64,64,1,padding=0,bias=False)
conv2=nn.Conv2d(64,64,3,padding=1,bias=False)
out1=conv2(conv1(input))


#conv_fuse
conv_fuse=nn.Conv2d(64,64,3,padding=1,bias=False)
conv_fuse.weight.data=transIII_conv_sequential(conv1,conv2)
out2=conv_fuse(input)

print("difference:",((out2-out1)**2).sum().item())
```
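
Transform III merges a 1x1 convolution followed by a KxK convolution into a single KxK convolution. For the bias-free case used above, writing $F^{(1)} \in \mathbb{R}^{D \times C \times 1 \times 1}$ for the 1x1 kernel and $F^{(2)} \in \mathbb{R}^{E \times D \times K \times K}$ for the KxK kernel (notation chosen here for illustration), the merged kernel is

$$
F'_{e,c,u,v} = \sum_{d=1}^{D} F^{(2)}_{e,d,u,v}\, F^{(1)}_{d,c,0,0}
$$

Without biases, the zero padding of the second convolution is unaffected by the first one, so the two forms agree at the borders as well. With biases the border positions need extra care, which is presumably why the example sets `bias=False`.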

##### 3.3.4 Transform IV
```python
from model.rep.ddb import transIV_conv_concat
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,64,7,7)

#conv+conv
conv1=nn.Conv2d(64,32,3,padding=1)
conv2=nn.Conv2d(64,32,3,padding=1)
out1=torch.cat([conv1(input),conv2(input)],dim=1)

#conv_fuse
conv_fuse=nn.Conv2d(64,64,3,padding=1)
conv_fuse.weight.data,conv_fuse.bias.data=transIV_conv_concat(conv1,conv2)
out2=conv_fuse(input)

print("difference:",((out2-out1)**2).sum().item())
```

##### 3.3.5 Transform V
```python
from model.rep.ddb import transV_avg
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,64,7,7)

avg=nn.AvgPool2d(kernel_size=3,stride=1)
out1=avg(input)

conv=transV_avg(64,3)
out2=conv(input)

print("difference:",((out2-out1)**2).sum().item())
```
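
Transform V rewrites average pooling as a per-channel convolution whose kernel entries are all 1/K². The plain-Python sketch below verifies this for a single channel and also shows why both outputs above are 5x5 for a 7x7 input with `kernel_size=3`, `stride=1` and no padding. The helpers `windows`, `avg_pool2d` and `conv2d` are hypothetical names written for this illustration, not functions from the repo.

```python
def windows(image, k):
    """Yield, row by row, the flattened k x k windows (stride 1, no padding)."""
    h, w = len(image), len(image[0])
    for r in range(h - k + 1):
        yield [
            [image[r + i][c + j] for i in range(k) for j in range(k)]
            for c in range(w - k + 1)
        ]


def avg_pool2d(image, k):
    """Average pooling: mean of every k x k window."""
    return [[sum(win) / len(win) for win in row] for row in windows(image, k)]


def conv2d(image, kernel_flat, k):
    """Single-channel cross-correlation with a flattened k x k kernel."""
    return [
        [sum(x * wgt for x, wgt in zip(win, kernel_flat)) for win in row]
        for row in windows(image, k)
    ]


k = 3
image = [[float((3 * r + 5 * c) % 7) for c in range(7)] for r in range(7)]

pooled = avg_pool2d(image, k)
convolved = conv2d(image, [1.0 / (k * k)] * (k * k), k)

max_diff = max(
    abs(a - b) for row_a, row_b in zip(pooled, convolved) for a, b in zip(row_a, row_b)
)
print(len(pooled), len(pooled[0]), max_diff)
```

For multi-channel inputs the same kernel is applied to each channel independently, which corresponds to a grouped (depthwise) convolution.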


##### 3.3.6 Transform VI
```python
from model.rep.ddb import transVI_conv_scale
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,64,7,7)

#conv+conv
conv1x1=nn.Conv2d(64,64,1)
conv1x3=nn.Conv2d(64,64,(1,3),padding=(0,1))
conv3x1=nn.Conv2d(64,64,(3,1),padding=(1,0))
out1=conv1x1(input)+conv1x3(input)+conv3x1(input)

#conv_fuse
conv_fuse=nn.Conv2d(64,64,3,padding=1)
conv_fuse.weight.data,conv_fuse.bias.data=transVI_conv_scale(conv1x1,conv1x3,conv3x1)
out2=conv_fuse(input)

print("difference:",((out2-out1)**2).sum().item())
```
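
Transform VI merges branches with different kernel shapes (1x1, 1x3, 3x1) by zero-padding each kernel to a centred 3x3 kernel and then summing them as in Transform II. This works because each branch's padding (`0`, `(0,1)`, `(1,0)`) keeps its output aligned with a 3x3 convolution that uses `padding=1`. The sketch below shows only the pad-and-sum step for a single input/output channel pair. `pad_to_3x3` is a hypothetical helper, not the repo's `transVI_conv_scale`.

```python
def pad_to_3x3(kernel):
    """Zero-pad a 1x1, 1x3 or 3x1 kernel (given as a list of rows) to a centred 3x3 kernel."""
    kh, kw = len(kernel), len(kernel[0])
    top, left = (3 - kh) // 2, (3 - kw) // 2
    padded = [[0.0] * 3 for _ in range(3)]
    for i in range(kh):
        for j in range(kw):
            padded[top + i][left + j] = kernel[i][j]
    return padded


k1x1 = [[2.0]]
k1x3 = [[1.0, 3.0, 5.0]]
k3x1 = [[7.0], [11.0], [13.0]]

# Sum the padded kernels element-wise to obtain the fused 3x3 kernel.
fused = [[0.0] * 3 for _ in range(3)]
for kernel in (k1x1, k1x3, k3x1):
    padded = pad_to_3x3(kernel)
    for i in range(3):
        for j in range(3):
            fused[i][j] += padded[i][j]

for row in fused:
    print(row)
```

Only the centre cross of the fused kernel is non-zero, since none of the three branches covers the corners.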





# Convolution Series

- Pytorch implementation of ["MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications---CVPR2017"](https://arxiv.org/abs/1704.04861)

- Pytorch implementation of ["Efficientnet: Rethinking model scaling for convolutional neural networks---PMLR2019"](http://proceedings.mlr.press/v97/tan19a.html)

- Pytorch implementation of ["Involution: Inverting the Inherence of Convolution for Visual Recognition---CVPR2021"](https://arxiv.org/abs/2103.06255)

- Pytorch implementation of ["Dynamic Convolution: Attention over Convolution Kernels---CVPR2020 Oral"](https://arxiv.org/abs/1912.03458)

- Pytorch implementation of ["CondConv: Conditionally Parameterized Convolutions for Efficient Inference---NeurIPS2019"](https://arxiv.org/abs/1904.04971)

***

### 1. Depthwise Separable Convolution Usage
#### 1.1. Paper
["MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications"](https://arxiv.org/abs/1704.04861)

#### 1.2. Overview
![](./model/img/DepthwiseSeparableConv.png)

#### 1.3. Usage Code
```python
from model.conv.DepthwiseSeparableConvolution import DepthwiseSeparableConvolution
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,3,224,224)
dsconv=DepthwiseSeparableConvolution(3,64)
out=dsconv(input)
print(out.shape)
```
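
For readers who want to see the idea without the repository class, here is a minimal sketch of a depthwise separable convolution: a per-channel 3x3 convolution followed by a 1x1 pointwise convolution. `DepthwiseSeparableSketch` and `count_params` are illustrative names, and the repository's `DepthwiseSeparableConvolution` may differ in details such as bias or normalization.

```python
import torch
from torch import nn


class DepthwiseSeparableSketch(nn.Module):
    """Illustrative only: depthwise 3x3 conv followed by pointwise 1x1 conv."""

    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 3, padding: int = 1):
        super().__init__()
        # groups=in_ch gives one spatial filter per input channel.
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size, padding=padding,
                                   groups=in_ch, bias=False)
        # The 1x1 conv mixes information across channels.
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pointwise(self.depthwise(x))


def count_params(module: nn.Module) -> int:
    return sum(p.numel() for p in module.parameters())


x = torch.randn(1, 32, 56, 56)
separable = DepthwiseSeparableSketch(32, 64)
standard = nn.Conv2d(32, 64, 3, padding=1, bias=False)

out = separable(x)
sep_params = count_params(separable)   # 32*3*3 + 32*64
std_params = count_params(standard)    # 64*32*3*3
print(out.shape, sep_params, std_params)
```

With these sizes the separable version uses 2,336 weights against 18,432 for the standard 3x3 convolution, which is the saving MobileNets builds on.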

***


### 2. MBConv Usage
#### 2.1. Paper
["Efficientnet: Rethinking model scaling for convolutional neural networks"](http://proceedings.mlr.press/v97/tan19a.html)

#### 2.2. Overview
![](./model/img/MBConv.jpg)

#### 2.3. Usage Code
```python
from model.conv.MBConv import MBConvBlock
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,3,224,224)
mbconv=MBConvBlock(ksize=3,input_filters=3,output_filters=512,image_size=224)
out=mbconv(input)
print(out.shape)


```

***


### 3. Involution Usage
#### 3.1. Paper
["Involution: Inverting the Inherence of Convolution for Visual Recognition"](https://arxiv.org/abs/2103.06255)

#### 3.2. Overview
![](./model/img/Involution.png)

#### 3.3. Usage Code
```python
from model.conv.Involution import Involution
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,4,64,64)
involution=Involution(kernel_size=3,in_channel=4,stride=2)
out=involution(input)
print(out.shape)
```

***


### 4. DynamicConv Usage
#### 4.1. Paper
["Dynamic Convolution: Attention over Convolution Kernels"](https://arxiv.org/abs/1912.03458)

#### 4.2. Overview
![](./model/img/DynamicConv.png)

#### 4.3. Usage Code
```python
from model.conv.DynamicConv import *
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(2,32,64,64)
    m=DynamicConv(in_planes=32,out_planes=64,kernel_size=3,stride=1,padding=1,bias=False)
    out=m(input)
    print(out.shape) # 2,64,64,64

```

***


### 5. CondConv Usage
#### 5.1. Paper
["CondConv: Conditionally Parameterized Convolutions for Efficient Inference"](https://arxiv.org/abs/1904.04971)

#### 5.2. Overview
![](./model/img/CondConv.png)

#### 5.3. Usage Code
```python
from model.conv.CondConv import *
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(2,32,64,64)
    m=CondConv(in_planes=32,out_planes=64,kernel_size=3,stride=1,padding=1,bias=False)
    out=m(input)
    print(out.shape)

```



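## Other Recommended Projects

-------

🔥🔥🔥 Big news! As a supplement to this project, the newly open-sourced **[FightingCV-Paper-Reading](https://github.com/xmu-xiaoma666/FightingCV-Paper-Reading)** project offers more paper-level analysis, collecting and organizing paper walkthroughs from major top conferences and journals.



🔥🔥🔥 Big news! A collection of AI video tutorials and must-read papers gathered from around the web is now available in **[FightingCV-Course](https://github.com/xmu-xiaoma666/FightingCV-Course)**.


🔥🔥🔥 Big news! **[YOLOAir](https://github.com/iscyy/yoloair)** is a newly open-sourced object detection codebase that integrates a variety of YOLO models, including YOLOv5, YOLOv7, YOLOR, YOLOX, YOLOv4, YOLOv3, and other YOLO models, as well as many existing attention mechanisms.


🔥🔥🔥 **ECCV2022 paper list: [ECCV2022-Paper-List](https://github.com/xmu-xiaoma666/ECCV2022-Paper-List/blob/master/README.md)**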


<!-- ![image](https://user-images.githubusercontent.com/33897496/184842902-9acff374-b3e7-401a-80fd-9d484e40c637.png) -->


================================================
FILE: README_EN.md
================================================

<img src="./FightingCVimg/LOGO.gif" height="200" width="400"/>

English | [简体中文](./README.md)

# FightingCV Codebase For [***Attention***](#attention-series),[***Backbone***](#backbone-series), [***MLP***](#mlp-series), [***Re-parameter***](#re-parameter-series), [**Convolution**](#convolution-series)

![](https://img.shields.io/badge/fightingcv-v0.0.1-brightgreen)
![](https://img.shields.io/badge/python->=v3.0-blue)
![](https://img.shields.io/badge/pytorch->=v1.4-red)

<!--
-------
*If this project is helpful to you, welcome to give a*star***.* 

*Don't forget to*follow*me to learn about project updates.*

-->

-------


🔥🔥🔥 As a supplement to this project, the object detection codebase [YOLOAir](https://github.com/iscyy/yoloair) has recently been open-sourced. It integrates a variety of attention mechanisms into object detection algorithms, and its code is simple and easy to read. Feel free to try it out and give it a star 🌟!


<!-- ![image](https://user-images.githubusercontent.com/33897496/184842902-9acff374-b3e7-401a-80fd-9d484e40c637.png) -->



-------

Hello, everyone, I'm Xiaoma 🚀🚀🚀

***For beginners (like me):***
When reading papers, I kept running into the same problem. The core idea of a paper is often very simple, and the core code may be only a dozen lines. But in the authors' released source code, the proposed module is usually embedded in a task framework for classification, detection, or segmentation, which buries it in a lot of unrelated code. For someone unfamiliar with that framework, the core code is hard to find, which makes the paper and the network design harder to understand.

***For advanced users (like you):***
If basic units such as conv, FC, and RNN are small LEGO bricks, and architectures such as Transformer and ResNet are finished LEGO castles, then the modules in this project are LEGO components with complete semantic meaning. They let researchers avoid reinventing the wheel and focus on how to combine these "LEGO components" into more creative work.

***For experts (maybe like you):***
My ability is limited, so please go easy on the criticism!

***For all:***
This project aims to provide a codebase that deep learning beginners can understand and that also serves the research and industrial communities. Like the [FightingCV WeChat official account](https://mp.weixin.qq.com/s/m9RiivbbDPdjABsTd6q8FA), its goal is 🚀 to leave no paper in the world hard to read 🚀.
(We also welcome researchers to contribute the core code of their own work to this project to help the research community grow. Code authors will be credited in the README.)


<!--


## Wechat Official account &  communication group



Welcome to pay attention to wechat official account: **fightingcv**



The official account shares papers, algorithms and codes every day Oh~




**Share some recent papers and analysis in the group every day. Welcome to study and exchange ha~~~

(if you can't add it, you can add wechat: **775629340**, remember the remarks **[company / school + direction + ID])**

![](./FightingCVimg/wechat.jpg)

We strongly recommend that you pay attention to [Zhihu]( https://www.zhihu.com/people/jason-14-58-38/posts )Account number and **[fightingcv Wechat official account**]( https://mp.weixin.qq.com/s/m9RiivbbDPdjABsTd6q8FA )** to quickly learn about the latest high-quality dry goods resources.

-->

***

# Contents

- [Attention Series](#attention-series)
    - [1. External Attention Usage](#1-external-attention-usage)

    - [2. Self Attention Usage](#2-self-attention-usage)

    - [3. Simplified Self Attention Usage](#3-simplified-self-attention-usage)

    - [4. Squeeze-and-Excitation Attention Usage](#4-squeeze-and-excitation-attention-usage)

    - [5. SK Attention Usage](#5-sk-attention-usage)

    - [6. CBAM Attention Usage](#6-cbam-attention-usage)

    - [7. BAM Attention Usage](#7-bam-attention-usage)
    
    - [8. ECA Attention Usage](#8-eca-attention-usage)

    - [9. DANet Attention Usage](#9-danet-attention-usage)

    - [10. Pyramid Split Attention (PSA) Usage](#10-Pyramid-Split-Attention-Usage)

    - [11. Efficient Multi-Head Self-Attention(EMSA) Usage](#11-Efficient-Multi-Head-Self-Attention-Usage)

    - [12. Shuffle Attention Usage](#12-Shuffle-Attention-Usage)
    
    - [13. MUSE Attention Usage](#13-MUSE-Attention-Usage)
  
    - [14. SGE Attention Usage](#14-SGE-Attention-Usage)

    - [15. A2 Attention Usage](#15-A2-Attention-Usage)

    - [16. AFT Attention Usage](#16-AFT-Attention-Usage)

    - [17. Outlook Attention Usage](#17-Outlook-Attention-Usage)

    - [18. ViP Attention Usage](#18-ViP-Attention-Usage)

    - [19. CoAtNet Attention Usage](#19-CoAtNet-Attention-Usage)

    - [20. HaloNet Attention Usage](#20-HaloNet-Attention-Usage)

    - [21. Polarized Self-Attention Usage](#21-Polarized-Self-Attention-Usage)

    - [22. CoTAttention Usage](#22-CoTAttention-Usage)

    - [23. Residual Attention Usage](#23-Residual-Attention-Usage)
  
    - [24. S2 Attention Usage](#24-S2-Attention-Usage)

    - [25. GFNet Attention Usage](#25-GFNet-Attention-Usage)

    - [26. Triplet Attention Usage](#26-TripletAttention-Usage)

    - [27. Coordinate Attention Usage](#27-Coordinate-Attention-Usage)

    - [28. MobileViT Attention Usage](#28-MobileViT-Attention-Usage)

    - [29. ParNet Attention Usage](#29-ParNet-Attention-Usage)

    - [30. UFO Attention Usage](#30-UFO-Attention-Usage)

    - [31. ACmix Attention Usage](#31-Acmix-Attention-Usage)
  
    - [32. MobileViTv2 Attention Usage](#32-MobileViTv2-Attention-Usage)

    - [33. DAT Attention Usage](#33-DAT-Attention-Usage)

    - [34. CrossFormer Attention Usage](#34-CrossFormer-Attention-Usage)

    - [35. MOATransformer Attention Usage](#35-MOATransformer-Attention-Usage)

    - [36. CrissCrossAttention Attention Usage](#36-CrissCrossAttention-Attention-Usage)

    - [37. Axial_attention Attention Usage](#37-Axial_attention-Attention-Usage)

- [Backbone Series](#Backbone-series)

    - [1. ResNet Usage](#1-ResNet-Usage)

    - [2. ResNeXt Usage](#2-ResNeXt-Usage)

    - [3. MobileViT Usage](#3-MobileViT-Usage)

    - [4. ConvMixer Usage](#4-ConvMixer-Usage)

    - [5. ShuffleTransformer Usage](#5-ShuffleTransformer-Usage)

    - [6. ConTNet Usage](#6-ConTNet-Usage)

    - [7. HATNet Usage](#7-HATNet-Usage)

    - [8. CoaT Usage](#8-CoaT-Usage)

    - [9. PVT Usage](#9-PVT-Usage)

    - [10. CPVT Usage](#10-CPVT-Usage)

    - [11. PIT Usage](#11-PIT-Usage)

    - [12. CrossViT Usage](#12-CrossViT-Usage)

    - [13. TnT Usage](#13-TnT-Usage)

    - [14. DViT Usage](#14-DViT-Usage)

    - [15. CeiT Usage](#15-CeiT-Usage)

    - [16. ConViT Usage](#16-ConViT-Usage)

    - [17. CaiT Usage](#17-CaiT-Usage)

    - [18. PatchConvnet Usage](#18-PatchConvnet-Usage)

    - [19. DeiT Usage](#19-DeiT-Usage)

    - [20. LeViT Usage](#20-LeViT-Usage)

    - [21. VOLO Usage](#21-VOLO-Usage)
    
    - [22. Container Usage](#22-Container-Usage)

    - [23. CMT Usage](#23-CMT-Usage)


- [MLP Series](#mlp-series)

    - [1. RepMLP Usage](#1-RepMLP-Usage)

    - [2. MLP-Mixer Usage](#2-MLP-Mixer-Usage)

    - [3. ResMLP Usage](#3-ResMLP-Usage)

    - [4. gMLP Usage](#4-gMLP-Usage)

    - [5. sMLP Usage](#5-sMLP-Usage)

    - [6. vip-mlp Usage](#6-vip-mlp-Usage)

- [Re-Parameter(ReP) Series](#Re-Parameter-series)

    - [1. RepVGG Usage](#1-RepVGG-Usage)

    - [2. ACNet Usage](#2-ACNet-Usage)

    - [3. Diverse Branch Block(DDB) Usage](#3-Diverse-Branch-Block-Usage)

- [Convolution Series](#Convolution-series)

    - [1. Depthwise Separable Convolution Usage](#1-Depthwise-Separable-Convolution-Usage)

    - [2. MBConv Usage](#2-MBConv-Usage)

    - [3. Involution Usage](#3-Involution-Usage)

    - [4. DynamicConv Usage](#4-DynamicConv-Usage)

    - [5. CondConv Usage](#5-CondConv-Usage)

***


# Attention Series

- Pytorch implementation of ["Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks---arXiv 2021.05.05"](https://arxiv.org/abs/2105.02358)

- Pytorch implementation of ["Attention Is All You Need---NIPS2017"](https://arxiv.org/pdf/1706.03762.pdf)

- Pytorch implementation of ["Squeeze-and-Excitation Networks---CVPR2018"](https://arxiv.org/abs/1709.01507)

- Pytorch implementation of ["Selective Kernel Networks---CVPR2019"](https://arxiv.org/pdf/1903.06586.pdf)

- Pytorch implementation of ["CBAM: Convolutional Block Attention Module---ECCV2018"](https://openaccess.thecvf.com/content_ECCV_2018/papers/Sanghyun_Woo_Convolutional_Block_Attention_ECCV_2018_paper.pdf)

- Pytorch implementation of ["BAM: Bottleneck Attention Module---BMVC2018"](https://arxiv.org/pdf/1807.06514.pdf)

- Pytorch implementation of ["ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks---CVPR2020"](https://arxiv.org/pdf/1910.03151.pdf)

- Pytorch implementation of ["Dual Attention Network for Scene Segmentation---CVPR2019"](https://arxiv.org/pdf/1809.02983.pdf)

- Pytorch implementation of ["EPSANet: An Efficient Pyramid Split Attention Block on Convolutional Neural Network---arXiv 2021.05.30"](https://arxiv.org/pdf/2105.14447.pdf)

- Pytorch implementation of ["ResT: An Efficient Transformer for Visual Recognition---arXiv 2021.05.28"](https://arxiv.org/abs/2105.13677)

- Pytorch implementation of ["SA-NET: SHUFFLE ATTENTION FOR DEEP CONVOLUTIONAL NEURAL NETWORKS---ICASSP 2021"](https://arxiv.org/pdf/2102.00240.pdf)

- Pytorch implementation of ["MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning---arXiv 2019.11.17"](https://arxiv.org/abs/1911.09483)

- Pytorch implementation of ["Spatial Group-wise Enhance: Improving Semantic Feature Learning in Convolutional Networks---arXiv 2019.05.23"](https://arxiv.org/pdf/1905.09646.pdf)

- Pytorch implementation of ["A2-Nets: Double Attention Networks---NIPS2018"](https://arxiv.org/pdf/1810.11579.pdf)


- Pytorch implementation of ["An Attention Free Transformer---ICLR2021 (Apple New Work)"](https://arxiv.org/pdf/2105.14103v1.pdf)


- Pytorch implementation of [VOLO: Vision Outlooker for Visual Recognition---arXiv 2021.06.24](https://arxiv.org/abs/2106.13112)
  [Paper analysis (in Chinese)](https://zhuanlan.zhihu.com/p/385561050)


- Pytorch implementation of [Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition---arXiv 2021.06.23](https://arxiv.org/abs/2106.12368)
  [Paper analysis (in Chinese)](https://mp.weixin.qq.com/s/5gonUQgBho_m2O54jyXF_Q)


- Pytorch implementation of [CoAtNet: Marrying Convolution and Attention for All Data Sizes---arXiv 2021.06.09](https://arxiv.org/abs/2106.04803)
  [Paper analysis (in Chinese)](https://zhuanlan.zhihu.com/p/385578588)


- Pytorch implementation of [Scaling Local Self-Attention for Parameter Efficient Visual Backbones---CVPR2021 Oral](https://arxiv.org/pdf/2103.12731.pdf)  [Paper analysis (in Chinese)](https://zhuanlan.zhihu.com/p/388598744)



- Pytorch implementation of [Polarized Self-Attention: Towards High-quality Pixel-wise Regression---arXiv 2021.07.02](https://arxiv.org/abs/2107.00782)  [Paper analysis (in Chinese)](https://zhuanlan.zhihu.com/p/389770482)


- Pytorch implementation of [Contextual Transformer Networks for Visual Recognition---arXiv 2021.07.26](https://arxiv.org/abs/2107.12292)  [Paper analysis (in Chinese)](https://zhuanlan.zhihu.com/p/394795481)


- Pytorch implementation of [Residual Attention: A Simple but Effective Method for Multi-Label Recognition---ICCV2021](https://arxiv.org/abs/2108.02456)


- Pytorch implementation of [S²-MLPv2: Improved Spatial-Shift MLP Architecture for Vision---arXiv 2021.08.02](https://arxiv.org/abs/2108.01072) [Paper analysis (in Chinese)](https://zhuanlan.zhihu.com/p/397003638)

- Pytorch implementation of [Global Filter Networks for Image Classification---arXiv 2021.07.01](https://arxiv.org/abs/2107.00645) 

- Pytorch implementation of [Rotate to Attend: Convolutional Triplet Attention Module---WACV 2021](https://arxiv.org/abs/2010.03045) 

- Pytorch implementation of [Coordinate Attention for Efficient Mobile Network Design ---CVPR 2021](https://arxiv.org/abs/2103.02907)

- Pytorch implementation of [MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer---ArXiv 2021.10.05](https://arxiv.org/abs/2110.02178)

- Pytorch implementation of [Non-deep Networks---ArXiv 2021.10.20](https://arxiv.org/abs/2110.07641)

- Pytorch implementation of [UFO-ViT: High Performance Linear Vision Transformer without Softmax---ArXiv 2021.09.29](https://arxiv.org/abs/2109.14382)

- Pytorch implementation of [Separable Self-attention for Mobile Vision Transformers---ArXiv 2022.06.06](https://arxiv.org/abs/2206.02680)

- Pytorch implementation of [On the Integration of Self-Attention and Convolution---ArXiv 2022.03.14](https://arxiv.org/pdf/2111.14556.pdf)

- Pytorch implementation of [CROSSFORMER: A VERSATILE VISION TRANSFORMER HINGING ON CROSS-SCALE ATTENTION---ICLR 2022](https://arxiv.org/pdf/2108.00154.pdf)

- Pytorch implementation of [Aggregating Global Features into Local Vision Transformer](https://arxiv.org/abs/2201.12903)

- Pytorch implementation of [CCNet: Criss-Cross Attention for Semantic Segmentation](https://arxiv.org/abs/1811.11721)

- Pytorch implementation of [Axial Attention in Multidimensional Transformers](https://arxiv.org/abs/1912.12180)
***


### 1. External Attention Usage
#### 1.1. Paper
["Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks"](https://arxiv.org/abs/2105.02358)

#### 1.2. Overview
![](./model/img/External_Attention.png)

#### 1.3. Usage Code
```python
from model.attention.ExternalAttention import ExternalAttention
import torch

input=torch.randn(50,49,512)
ea = ExternalAttention(d_model=512,S=8)
output=ea(input)
print(output.shape)
```
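
As the paper title suggests, external attention can be expressed with two linear layers acting as shared memories. The sketch below is illustrative only: it assumes the double normalization described in the paper (softmax over the token axis, then L1 normalization over the memory axis), and `ExternalAttentionSketch` is a hypothetical class, not the repository's `ExternalAttention`.

```python
import torch
from torch import nn


class ExternalAttentionSketch(nn.Module):
    """Illustrative only: two linear layers used as external key/value memories."""

    def __init__(self, d_model: int, S: int = 64):
        super().__init__()
        self.mk = nn.Linear(d_model, S, bias=False)   # external key memory
        self.mv = nn.Linear(S, d_model, bias=False)   # external value memory

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = self.mk(x)                               # (batch, tokens, S)
        attn = attn.softmax(dim=1)                      # normalize over tokens
        attn = attn / attn.sum(dim=2, keepdim=True)     # L1-normalize over the S memory slots
        return self.mv(attn)                            # (batch, tokens, d_model)


torch.manual_seed(0)
x = torch.randn(2, 49, 512)
ea_sketch = ExternalAttentionSketch(d_model=512, S=8)
out = ea_sketch(x)
print(out.shape)
```

Because the memories are learned parameters rather than projections of the input, the cost grows linearly with the number of tokens.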

***


### 2. Self Attention Usage
#### 2.1. Paper
["Attention Is All You Need"](https://arxiv.org/pdf/1706.03762.pdf)

#### 2.2. Overview
![](./model/img/SA.png)

#### 2.3. Usage Code
```python
from model.attention.SelfAttention import ScaledDotProductAttention
import torch

input=torch.randn(50,49,512)
sa = ScaledDotProductAttention(d_model=512, d_k=512, d_v=512, h=8)
output=sa(input,input,input)
print(output.shape)
```
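
The core operation behind this module is scaled dot-product attention, `softmax(QK^T / sqrt(d_k))V`. The following single-head sketch omits the learned projections, multi-head split, masking, and dropout that a full implementation such as the repository's `ScaledDotProductAttention` would normally include; `scaled_dot_product` is a hypothetical function name.

```python
import math

import torch


def scaled_dot_product(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor):
    """Illustrative single-head attention without projections or masking."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)   # (batch, n_q, n_k)
    weights = scores.softmax(dim=-1)                     # each row sums to 1
    return weights @ v, weights


torch.manual_seed(0)
x = torch.randn(2, 49, 64)
out, weights = scaled_dot_product(x, x, x)
print(out.shape, weights.shape)
```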

***

### 3. Simplified Self Attention Usage
#### 3.1. Paper
[None]()

#### 3.2. Overview
![](./model/img/SSA.png)

#### 3.3. Usage Code
```python
from model.attention.SimplifiedSelfAttention import SimplifiedScaledDotProductAttention
import torch

input=torch.randn(50,49,512)
ssa = SimplifiedScaledDotProductAttention(d_model=512, h=8)
output=ssa(input,input,input)
print(output.shape)

```

***

### 4. Squeeze-and-Excitation Attention Usage
#### 4.1. Paper
["Squeeze-and-Excitation Networks"](https://arxiv.org/abs/1709.01507)

#### 4.2. Overview
![](./model/img/SE.png)

#### 4.3. Usage Code
```python
from model.attention.SEAttention import SEAttention
import torch

input=torch.randn(50,512,7,7)
se = SEAttention(channel=512,reduction=8)
output=se(input)
print(output.shape)

```
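
To make the squeeze-and-excitation idea concrete, here is a minimal sketch: global average pooling "squeezes" each channel to a scalar, a two-layer bottleneck "excites" it into a gate in (0, 1), and the gate rescales the input. `SESketch` is an illustrative class and may differ from the repository's `SEAttention` in initialization and other details.

```python
import torch
from torch import nn


class SESketch(nn.Module):
    """Illustrative only: channel gating via global pooling and a bottleneck MLP."""

    def __init__(self, channel: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channel, channel // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channel // reduction, channel, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        gate = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * gate   # broadcast the per-channel gate over H and W


torch.manual_seed(0)
x = torch.randn(4, 64, 7, 7)
se_sketch = SESketch(channel=64, reduction=8)
out = se_sketch(x)
print(out.shape)
```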

***

### 5. SK Attention Usage
#### 5.1. Paper
["Selective Kernel Networks"](https://arxiv.org/pdf/1903.06586.pdf)

#### 5.2. Overview
![](./model/img/SK.png)

#### 5.3. Usage Code
```python
from model.attention.SKAttention import SKAttention
import torch

input=torch.randn(50,512,7,7)
se = SKAttention(channel=512,reduction=8)
output=se(input)
print(output.shape)

```
***

### 6. CBAM Attention Usage
#### 6.1. Paper
["CBAM: Convolutional Block Attention Module"](https://openaccess.thecvf.com/content_ECCV_2018/papers/Sanghyun_Woo_Convolutional_Block_Attention_ECCV_2018_paper.pdf)

#### 6.2. Overview
![](./model/img/CBAM1.png)

![](./model/img/CBAM2.png)

#### 6.3. Usage Code
```python
from model.attention.CBAM import CBAMBlock
import torch

input=torch.randn(50,512,7,7)
kernel_size=input.shape[2]
cbam = CBAMBlock(channel=512,reduction=16,kernel_size=kernel_size)
output=cbam(input)
print(output.shape)

```

***

### 7. BAM Attention Usage
#### 7.1. Paper
["BAM: Bottleneck Attention Module"](https://arxiv.org/pdf/1807.06514.pdf)

#### 7.2. Overview
![](./model/img/BAM.png)

#### 7.3. Usage Code
```python
from model.attention.BAM import BAMBlock
import torch

input=torch.randn(50,512,7,7)
bam = BAMBlock(channel=512,reduction=16,dia_val=2)
output=bam(input)
print(output.shape)

```

***

### 8. ECA Attention Usage
#### 8.1. Paper
["ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks"](https://arxiv.org/pdf/1910.03151.pdf)

#### 8.2. Overview
![](./model/img/ECA.png)

#### 8.3. Usage Code
```python
from model.attention.ECAAttention import ECAAttention
import torch

input=torch.randn(50,512,7,7)
eca = ECAAttention(kernel_size=3)
output=eca(input)
print(output.shape)

```
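
ECA replaces the SE bottleneck with a small 1D convolution that slides across the channel axis. The sketch below shows that idea under simple assumptions (fixed kernel size, no bias); `ECASketch` is a hypothetical class and is not guaranteed to match the repository's `ECAAttention` line for line.

```python
import torch
from torch import nn


class ECASketch(nn.Module):
    """Illustrative only: channel attention from a 1D conv over pooled channel descriptors."""

    def __init__(self, kernel_size: int = 3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size, padding=(kernel_size - 1) // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.pool(x)                        # (B, C, 1, 1)
        y = y.squeeze(-1).transpose(1, 2)       # (B, 1, C): channels become the sequence axis
        y = self.conv(y)                        # local interaction between neighbouring channels
        y = y.transpose(1, 2).unsqueeze(-1)     # back to (B, C, 1, 1)
        return x * torch.sigmoid(y)


torch.manual_seed(0)
x = torch.randn(4, 64, 7, 7)
eca_sketch = ECASketch(kernel_size=3)
out = eca_sketch(x)
n_params = sum(p.numel() for p in eca_sketch.parameters())
print(out.shape, n_params)
```

The whole module has only `kernel_size` learnable weights, independent of the channel count.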

***

### 9. DANet Attention Usage
#### 9.1. Paper
["Dual Attention Network for Scene Segmentation"](https://arxiv.org/pdf/1809.02983.pdf)

#### 9.2. Overview
![](./model/img/danet.png)

#### 9.3. Usage Code
```python
from model.attention.DANet import DAModule
import torch

input=torch.randn(50,512,7,7)
danet=DAModule(d_model=512,kernel_size=3,H=7,W=7)
print(danet(input).shape)

```

***

### 10. Pyramid Split Attention Usage

#### 10.1. Paper
["EPSANet: An Efficient Pyramid Split Attention Block on Convolutional Neural Network"](https://arxiv.org/pdf/2105.14447.pdf)

#### 10.2. Overview
![](./model/img/psa.png)

#### 10.3. Usage Code
```python
from model.attention.PSA import PSA
import torch

input=torch.randn(50,512,7,7)
psa = PSA(channel=512,reduction=8)
output=psa(input)
print(output.shape)

```

***


### 11. Efficient Multi-Head Self-Attention Usage

#### 11.1. Paper
["ResT: An Efficient Transformer for Visual Recognition"](https://arxiv.org/abs/2105.13677)

#### 11.2. Overview
![](./model/img/EMSA.png)

#### 11.3. Usage Code
```python

from model.attention.EMSA import EMSA
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(50,64,512)
emsa = EMSA(d_model=512, d_k=512, d_v=512, h=8,H=8,W=8,ratio=2,apply_transform=True)
output=emsa(input,input,input)
print(output.shape)
    
```

***


### 12. Shuffle Attention Usage

#### 12.1. Paper
["SA-NET: SHUFFLE ATTENTION FOR DEEP CONVOLUTIONAL NEURAL NETWORKS"](https://arxiv.org/pdf/2102.00240.pdf)

#### 12.2. Overview
![](./model/img/ShuffleAttention.jpg)

#### 12.3. Usage Code
```python

from model.attention.ShuffleAttention import ShuffleAttention
import torch
from torch import nn
from torch.nn import functional as F


input=torch.randn(50,512,7,7)
se = ShuffleAttention(channel=512,G=8)
output=se(input)
print(output.shape)

    
```


***


### 13. MUSE Attention Usage

#### 13.1. Paper
["MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning"](https://arxiv.org/abs/1911.09483)

#### 13.2. Overview
![](./model/img/MUSE.png)

#### 13.3. Usage Code
```python
from model.attention.MUSEAttention import MUSEAttention
import torch
from torch import nn
from torch.nn import functional as F


input=torch.randn(50,49,512)
sa = MUSEAttention(d_model=512, d_k=512, d_v=512, h=8)
output=sa(input,input,input)
print(output.shape)

```

***


### 14. SGE Attention Usage

#### 14.1. Paper
[Spatial Group-wise Enhance: Improving Semantic Feature Learning in Convolutional Networks](https://arxiv.org/pdf/1905.09646.pdf)

#### 14.2. Overview
![](./model/img/SGE.png)

#### 14.3. Usage Code
```python
from model.attention.SGE import SpatialGroupEnhance
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(50,512,7,7)
sge = SpatialGroupEnhance(groups=8)
output=sge(input)
print(output.shape)

```

***


### 15. A2 Attention Usage

#### 15.1. Paper
[A2-Nets: Double Attention Networks](https://arxiv.org/pdf/1810.11579.pdf)

#### 15.2. Overview
![](./model/img/A2.png)

#### 15.3. Usage Code
```python
from model.attention.A2Atttention import DoubleAttention
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(50,512,7,7)
a2 = DoubleAttention(512,128,128,True)
output=a2(input)
print(output.shape)

```



### 16. AFT Attention Usage

#### 16.1. Paper
[An Attention Free Transformer](https://arxiv.org/pdf/2105.14103v1.pdf)

#### 16.2. Overview
![](./model/img/AFT.jpg)

#### 16.3. Usage Code
```python
from model.attention.AFT import AFT_FULL
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(50,49,512)
aft_full = AFT_FULL(d_model=512, n=49)
output=aft_full(input)
print(output.shape)

```






### 17. Outlook Attention Usage

#### 17.1. Paper


[VOLO: Vision Outlooker for Visual Recognition](https://arxiv.org/abs/2106.13112)


#### 17.2. Overview
![](./model/img/OutlookAttention.png)

#### 17.3. Usage Code
```python
from model.attention.OutlookAttention import OutlookAttention
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(50,28,28,512)
outlook = OutlookAttention(dim=512)
output=outlook(input)
print(output.shape)

```


***






### 18. ViP Attention Usage

#### 18.1. Paper


[Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition](https://arxiv.org/abs/2106.12368)


#### 18.2. Overview
![](./model/img/ViP.png)

#### 18.3. Usage Code
```python

from model.attention.ViP import WeightedPermuteMLP
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(64,8,8,512)
seg_dim=8
vip=WeightedPermuteMLP(512,seg_dim)
out=vip(input)
print(out.shape)

```


***





### 19. CoAtNet Attention Usage

#### 19.1. Paper


[CoAtNet: Marrying Convolution and Attention for All Data Sizes](https://arxiv.org/abs/2106.04803)


#### 19.2. Overview
None


#### 19.3. Usage Code
```python

from model.attention.CoAtNet import CoAtNet
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,3,224,224)
coatnet=CoAtNet(in_ch=3,image_size=224)
out=coatnet(input)
print(out.shape)

```


***






### 20. HaloNet Attention Usage

#### 20.1. Paper


[Scaling Local Self-Attention for Parameter Efficient Visual Backbones](https://arxiv.org/pdf/2103.12731.pdf)


#### 20.2. Overview

![](./model/img/HaloNet.png)

#### 20.3. Usage Code
```python

from model.attention.HaloAttention import HaloAttention
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,512,8,8)
halo = HaloAttention(dim=512,
    block_size=2,
    halo_size=1,)
output=halo(input)
print(output.shape)

```


***

### 21. Polarized Self-Attention Usage

#### 21.1. Paper

[Polarized Self-Attention: Towards High-quality Pixel-wise Regression](https://arxiv.org/abs/2107.00782)


#### 21.2. Overview

![](./model/img/PoSA.png)

#### 21.3. Usage Code
```python

from model.attention.PolarizedSelfAttention import ParallelPolarizedSelfAttention,SequentialPolarizedSelfAttention
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,512,7,7)
psa = SequentialPolarizedSelfAttention(channel=512)
output=psa(input)
print(output.shape)


```


***


### 22. CoTAttention Usage

#### 22.1. Paper

[Contextual Transformer Networks for Visual Recognition---arXiv 2021.07.26](https://arxiv.org/abs/2107.12292) 


#### 22.2. Overview

![](./model/img/CoT.png)

#### 22.3. Usage Code
```python

from model.attention.CoTAttention import CoTAttention
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(50,512,7,7)
cot = CoTAttention(dim=512,kernel_size=3)
output=cot(input)
print(output.shape)



```

***


### 23. Residual Attention Usage

#### 23.1. Paper

[Residual Attention: A Simple but Effective Method for Multi-Label Recognition---ICCV2021](https://arxiv.org/abs/2108.02456) 


#### 23.2. Overview

![](./model/img/ResAtt.png)

#### 23.3. Usage Code
```python

from model.attention.ResidualAttention import ResidualAttention
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(50,512,7,7)
resatt = ResidualAttention(channel=512,num_class=1000,la=0.2)
output=resatt(input)
print(output.shape)



```

***



### 24. S2 Attention Usage

#### 24.1. Paper

[S²-MLPv2: Improved Spatial-Shift MLP Architecture for Vision---arXiv 2021.08.02](https://arxiv.org/abs/2108.01072) 


#### 24.2. Overview

![](./model/img/S2Attention.png)

#### 24.3. Usage Code
```python
from model.attention.S2Attention import S2Attention
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(50,512,7,7)
s2att = S2Attention(channels=512)
output=s2att(input)
print(output.shape)

```

***



### 25. GFNet Attention Usage

#### 25.1. Paper

[Global Filter Networks for Image Classification---arXiv 2021.07.01](https://arxiv.org/abs/2107.00645) 


#### 25.2. Overview

![](./model/img/GFNet.jpg)

#### 25.3. Usage Code - Implemented by [Wenliang Zhao (Author)](https://scholar.google.com/citations?user=lyPWvuEAAAAJ&hl=en)

```python
from model.attention.gfnet import GFNet
import torch
from torch import nn
from torch.nn import functional as F

x = torch.randn(1, 3, 224, 224)
gfnet = GFNet(embed_dim=384, img_size=224, patch_size=16, num_classes=1000)
out = gfnet(x)
print(out.shape)

```

***


### 26. TripletAttention Usage

#### 26.1. Paper

[Rotate to Attend: Convolutional Triplet Attention Module---WACV 2021](https://arxiv.org/abs/2010.03045)

#### 26.2. Overview

![](./model/img/triplet.png)

#### 26.3. Usage Code - Implemented by [digantamisra98](https://github.com/digantamisra98)

```python
from model.attention.TripletAttention import TripletAttention
import torch
from torch import nn
from torch.nn import functional as F
input=torch.randn(50,512,7,7)
triplet = TripletAttention()
output=triplet(input)
print(output.shape)
```


***


### 27. Coordinate Attention Usage

#### 27.1. Paper

[Coordinate Attention for Efficient Mobile Network Design---CVPR 2021](https://arxiv.org/abs/2103.02907)


#### 27.2. Overview

![](./model/img/CoordAttention.png)

#### 27.3. Usage Code - Implemented by [Andrew-Qibin](https://github.com/Andrew-Qibin)

```python
from model.attention.CoordAttention import CoordAtt
import torch
from torch import nn
from torch.nn import functional as F

inp=torch.rand([2, 96, 56, 56])
inp_dim, oup_dim = 96, 96
reduction=32

coord_attention = CoordAtt(inp_dim, oup_dim, reduction=reduction)
output=coord_attention(inp)
print(output.shape)
```

***


### 28. MobileViT Attention Usage

#### 28.1. Paper

[MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer---ArXiv 2021.10.05](https://arxiv.org/abs/2110.02178)


#### 28.2. Overview

![](./model/img/MobileViTAttention.png)

#### 28.3. Usage Code

```python
from model.attention.MobileViTAttention import MobileViTAttention
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    m=MobileViTAttention()
    input=torch.randn(1,3,49,49)
    output=m(input)
    print(output.shape)  #output:(1,3,49,49)
    
```

***


### 29. ParNet Attention Usage

#### 29.1. Paper

[Non-deep Networks---ArXiv 2021.10.20](https://arxiv.org/abs/2110.07641)


#### 29.2. Overview

![](./model/img/ParNet.png)

#### 29.3. Usage Code

```python
from model.attention.ParNetAttention import *
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(50,512,7,7)
    pna = ParNetAttention(channel=512)
    output=pna(input)
    print(output.shape) #50,512,7,7
    
```

***


### 30. UFO Attention Usage

#### 30.1. Paper

[UFO-ViT: High Performance Linear Vision Transformer without Softmax---ArXiv 2021.09.29](https://arxiv.org/abs/2109.14382)


#### 30.2. Overview

![](./model/img/UFO.png)

#### 30.3. Usage Code

```python
from model.attention.UFOAttention import *
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(50,49,512)
    ufo = UFOAttention(d_model=512, d_k=512, d_v=512, h=8)
    output=ufo(input,input,input)
    print(output.shape) #[50, 49, 512]
    
```

***

### 31. ACmix Attention Usage

#### 31.1. Paper

[On the Integration of Self-Attention and Convolution](https://arxiv.org/pdf/2111.14556.pdf)

#### 31.2. Usage Code

```python
from model.attention.ACmixAttention import ACmix
import torch

if __name__ == '__main__':
    input=torch.randn(50,256,7,7)
    acmix = ACmix(in_planes=256, out_planes=256)
    output=acmix(input)
    print(output.shape)
    
```

### 32. MobileViTv2 Attention Usage

#### 32.1. Paper

[Separable Self-attention for Mobile Vision Transformers---ArXiv 2022.06.06](https://arxiv.org/abs/2206.02680)


#### 32.2. Overview

![](./model/img/MobileViTv2.png)

#### 32.3. Usage Code

```python
from model.attention.MobileViTv2Attention import MobileViTv2Attention
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(50,49,512)
    sa = MobileViTv2Attention(d_model=512)
    output=sa(input)
    print(output.shape)
    
```

### 33. DAT Attention Usage

#### 33.1. Paper

[Vision Transformer with Deformable Attention---CVPR2022](https://arxiv.org/abs/2201.00520)

#### 33.2. Usage Code

```python
from model.attention.DAT import DAT
import torch

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = DAT(
        img_size=224,
        patch_size=4,
        num_classes=1000,
        expansion=4,
        dim_stem=96,
        dims=[96, 192, 384, 768],
        depths=[2, 2, 6, 2],
        stage_spec=[['L', 'S'], ['L', 'S'], ['L', 'D', 'L', 'D', 'L', 'D'], ['L', 'D']],
        heads=[3, 6, 12, 24],
        window_sizes=[7, 7, 7, 7] ,
        groups=[-1, -1, 3, 6],
        use_pes=[False, False, True, True],
        dwc_pes=[False, False, False, False],
        strides=[-1, -1, 1, 1],
        sr_ratios=[-1, -1, -1, -1],
        offset_range_factor=[-1, -1, 2, 2],
        no_offs=[False, False, False, False],
        fixed_pes=[False, False, False, False],
        use_dwc_mlps=[False, False, False, False],
        use_conv_patches=False,
        drop_rate=0.0,
        attn_drop_rate=0.0,
        drop_path_rate=0.2,
    )
    output=model(input)
    print(output[0].shape)
    
```

### 34. CrossFormer Attention Usage

#### 34.1. Paper

[CROSSFORMER: A VERSATILE VISION TRANSFORMER HINGING ON CROSS-SCALE ATTENTION---ICLR 2022](https://arxiv.org/pdf/2108.00154.pdf)

#### 34.2. Usage Code

```python
from model.attention.Crossformer import CrossFormer
import torch

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = CrossFormer(img_size=224,
        patch_size=[4, 8, 16, 32],
        in_chans= 3,
        num_classes=1000,
        embed_dim=48,
        depths=[2, 2, 6, 2],
        num_heads=[3, 6, 12, 24],
        group_size=[7, 7, 7, 7],
        mlp_ratio=4.,
        qkv_bias=True,
        qk_scale=None,
        drop_rate=0.0,
        drop_path_rate=0.1,
        ape=False,
        patch_norm=True,
        use_checkpoint=False,
        merge_size=[[2, 4], [2,4], [2, 4]]
    )
    output=model(input)
    print(output.shape)
    
```

### 35. MOATransformer Attention Usage

#### 35.1. Paper

[Aggregating Global Features into Local Vision Transformer](https://arxiv.org/abs/2201.12903)

#### 35.2. Usage Code

```python
from model.attention.MOATransformer import MOATransformer
import torch

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = MOATransformer(
        img_size=224,
        patch_size=4,
        in_chans=3,
        num_classes=1000,
        embed_dim=96,
        depths=[2, 2, 6],
        num_heads=[3, 6, 12],
        window_size=14,
        mlp_ratio=4.,
        qkv_bias=True,
        qk_scale=None,
        drop_rate=0.0,
        drop_path_rate=0.1,
        ape=False,
        patch_norm=True,
        use_checkpoint=False
    )
    output=model(input)
    print(output.shape)
    
```

### 36. CrissCross Attention Usage

#### 36.1. Paper

[CCNet: Criss-Cross Attention for Semantic Segmentation](https://arxiv.org/abs/1811.11721)

#### 36.2. Usage Code

```python
from model.attention.CrissCrossAttention import CrissCrossAttention
import torch

if __name__ == '__main__':
    input=torch.randn(3, 64, 7, 7)
    model = CrissCrossAttention(64)
    outputs = model(input)
    print(outputs.shape)
    
```

### 37. Axial Attention Usage

#### 37.1. Paper

[Axial Attention in Multidimensional Transformers](https://arxiv.org/abs/1912.12180)

#### 37.2. Usage Code

```python
from model.attention.Axial_attention import AxialImageTransformer
import torch

if __name__ == '__main__':
    input=torch.randn(3, 128, 7, 7)
    model = AxialImageTransformer(
        dim = 128,
        depth = 12,
        reversible = True
    )
    outputs = model(input)
    print(outputs.shape)
    
```

***


# Backbone Series

- Pytorch implementation of ["Deep Residual Learning for Image Recognition---CVPR2016 Best Paper"](https://arxiv.org/pdf/1512.03385.pdf)

- Pytorch implementation of ["Aggregated Residual Transformations for Deep Neural Networks---CVPR2017"](https://arxiv.org/abs/1611.05431v2)

- Pytorch implementation of [MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer---ArXiv 2021.10.05](https://arxiv.org/abs/2110.02178)

- Pytorch implementation of [Patches Are All You Need?---ICLR2022 (Under Review)](https://openreview.net/forum?id=TVHS5Y4dNvM)

- Pytorch implementation of [Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer---ArXiv 2021.06.07](https://arxiv.org/abs/2106.03650)

- Pytorch implementation of [ConTNet: Why not use convolution and transformer at the same time?---ArXiv 2021.04.27](https://arxiv.org/abs/2104.13497)

- Pytorch implementation of [Vision Transformers with Hierarchical Attention---ArXiv 2022.06.15](https://arxiv.org/abs/2106.03180)

- Pytorch implementation of [Co-Scale Conv-Attentional Image Transformers---ArXiv 2021.08.26](https://arxiv.org/abs/2104.06399)

- Pytorch implementation of [Conditional Positional Encodings for Vision Transformers](https://arxiv.org/abs/2102.10882)

- Pytorch implementation of [Rethinking Spatial Dimensions of Vision Transformers---ICCV 2021](https://arxiv.org/abs/2103.16302)

- Pytorch implementation of [CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification---ICCV 2021](https://arxiv.org/abs/2103.14899)

- Pytorch implementation of [Transformer in Transformer---NeurIPS 2021](https://arxiv.org/abs/2103.00112)

- Pytorch implementation of [DeepViT: Towards Deeper Vision Transformer](https://arxiv.org/abs/2103.11886)

- Pytorch implementation of [Incorporating Convolution Designs into Visual Transformers](https://arxiv.org/abs/2103.11816)

- Pytorch implementation of [ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases](https://arxiv.org/abs/2103.10697)

- Pytorch implementation of [Augmenting Convolutional networks with attention-based aggregation](https://arxiv.org/abs/2112.13692)

- Pytorch implementation of [Going deeper with Image Transformers---ICCV 2021 (Oral)](https://arxiv.org/abs/2103.17239)

- Pytorch implementation of [Training data-efficient image transformers & distillation through attention---ICML 2021](https://arxiv.org/abs/2012.12877)

- Pytorch implementation of [LeViT: a Vision Transformer in ConvNet’s Clothing for Faster Inference](https://arxiv.org/abs/2104.01136)

- Pytorch implementation of [VOLO: Vision Outlooker for Visual Recognition](https://arxiv.org/abs/2106.13112)

- Pytorch implementation of [Container: Context Aggregation Network---NeurIPS 2021](https://arxiv.org/abs/2106.01401)

- Pytorch implementation of [CMT: Convolutional Neural Networks Meet Vision Transformers---CVPR 2022](https://arxiv.org/abs/2107.06263)

- Pytorch implementation of [Vision Transformer with Deformable Attention---CVPR 2022](https://arxiv.org/abs/2201.00520)


### 1. ResNet Usage
#### 1.1. Paper
["Deep Residual Learning for Image Recognition---CVPR2016 Best Paper"](https://arxiv.org/pdf/1512.03385.pdf)

#### 1.2. Overview
![](./model/img/resnet.png)
![](./model/img/resnet2.jpg)

#### 1.3. Usage Code
```python

from model.backbone.resnet import ResNet50,ResNet101,ResNet152
import torch
if __name__ == '__main__':
    input=torch.randn(50,3,224,224)
    resnet50=ResNet50(1000)
    # resnet101=ResNet101(1000)
    # resnet152=ResNet152(1000)
    out=resnet50(input)
    print(out.shape)

```


### 2. ResNeXt Usage
#### 2.1. Paper

["Aggregated Residual Transformations for Deep Neural Networks---CVPR2017"](https://arxiv.org/abs/1611.05431v2)

#### 2.2. Overview
![](./model/img/resnext.png)

#### 2.3. Usage Code
```python

from model.backbone.resnext import ResNeXt50,ResNeXt101,ResNeXt152
import torch

if __name__ == '__main__':
    input=torch.randn(50,3,224,224)
    resnext50=ResNeXt50(1000)
    # resnext101=ResNeXt101(1000)
    # resnext152=ResNeXt152(1000)
    out=resnext50(input)
    print(out.shape)


```



### 3. MobileViT Usage
#### 3.1. Paper

[MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer---ArXiv 2021.10.05](https://arxiv.org/abs/2110.02178)

#### 3.2. Overview
![](./model/img/mobileViT.jpg)

#### 3.3. Usage Code
```python

from model.backbone.MobileViT import *
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)

    ### mobilevit_xxs
    mvit_xxs=mobilevit_xxs()
    out=mvit_xxs(input)
    print(out.shape)

    ### mobilevit_xs
    mvit_xs=mobilevit_xs()
    out=mvit_xs(input)
    print(out.shape)


    ### mobilevit_s
    mvit_s=mobilevit_s()
    out=mvit_s(input)
    print(out.shape)

```





### 4. ConvMixer Usage
#### 4.1. Paper
[Patches Are All You Need?---ICLR2022 (Under Review)](https://openreview.net/forum?id=TVHS5Y4dNvM)
#### 4.2. Overview
![](./model/img/ConvMixer.png)

#### 4.3. Usage Code
```python

from model.backbone.ConvMixer import *
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    x=torch.randn(1,3,224,224)
    convmixer=ConvMixer(dim=512,depth=12)
    out=convmixer(x)
    print(out.shape)  #[1, 1000]


```

### 5. ShuffleTransformer Usage
#### 5.1. Paper
[Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer](https://arxiv.org/pdf/2106.03650.pdf)

#### 5.2. Usage Code
```python

from model.backbone.ShuffleTransformer import ShuffleTransformer
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    sft = ShuffleTransformer()
    output=sft(input)
    print(output.shape)


```

### 6. ConTNet Usage
#### 6.1. Paper
[ConTNet: Why not use convolution and transformer at the same time?](https://arxiv.org/abs/2104.13497)

#### 6.2. Usage Code
```python

from model.backbone.ConTNet import ConTNet, build_model
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == "__main__":
    model = build_model(use_avgdown=True, relative=True, qkv_bias=True, pre_norm=True)
    input = torch.randn(1, 3, 224, 224)
    out = model(input)
    print(out.shape)


```

### 7. HATNet Usage
#### 7.1. Paper
[Vision Transformers with Hierarchical Attention](https://arxiv.org/abs/2106.03180)

#### 7.2. Usage Code
```python

from model.backbone.HATNet import HATNet
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    hat = HATNet(dims=[48, 96, 240, 384], head_dim=48, expansions=[8, 8, 4, 4],
        grid_sizes=[8, 7, 7, 1], ds_ratios=[8, 4, 2, 1], depths=[2, 2, 6, 3])
    output=hat(input)
    print(output.shape)


```

### 8. CoaT Usage
#### 8.1. Paper
[Co-Scale Conv-Attentional Image Transformers](https://arxiv.org/abs/2104.06399)

#### 8.2. Usage Code
```python

from model.backbone.CoaT import CoaT
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = CoaT(patch_size=4, embed_dims=[152, 152, 152, 152], serial_depths=[2, 2, 2, 2], parallel_depth=6, num_heads=8, mlp_ratios=[4, 4, 4, 4])
    output=model(input)
    print(output.shape) # torch.Size([1, 1000])

```

### 9. PVT Usage
#### 9.1. Paper
[PVT v2: Improved Baselines with Pyramid Vision Transformer](https://arxiv.org/pdf/2106.13797.pdf)

#### 9.2. Usage Code
```python

from model.backbone.PVT import PyramidVisionTransformer
from functools import partial
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = PyramidVisionTransformer(
        patch_size=4, embed_dims=[64, 128, 320, 512], num_heads=[1, 2, 5, 8], mlp_ratios=[8, 8, 4, 4], qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6), depths=[2, 2, 2, 2], sr_ratios=[8, 4, 2, 1])
    output=model(input)
    print(output.shape)

```
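
Several backbone examples pass `norm_layer=partial(nn.LayerNorm, eps=1e-6)`. The dependency-free sketch below illustrates what `functools.partial` does in that position: it returns a factory with `eps` already bound, so the model can later build the layer by supplying only the feature dimension. `LayerNormStub` is a hypothetical stand-in for `nn.LayerNorm`, introduced only so the snippet runs without PyTorch.

```python
from functools import partial


class LayerNormStub:
    """Hypothetical stand-in for nn.LayerNorm (same first two constructor arguments)."""

    def __init__(self, normalized_shape, eps=1e-5):
        self.normalized_shape = normalized_shape
        self.eps = eps


# Bind eps once; the result is a factory that still expects normalized_shape.
norm_layer = partial(LayerNormStub, eps=1e-6)

# A model would call the factory with its own embedding dimension.
layer = norm_layer(64)
print(layer.normalized_shape, layer.eps)
```

Because `partial` lives in the standard library's `functools` module, any script that uses it needs `from functools import partial` alongside the PyTorch imports.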


### 10. CPVT Usage
#### 10.1. Paper
[Conditional Positional Encodings for Vision Transformers](https://arxiv.org/abs/2102.10882)

#### 10.2. Usage Code
```python

from model.backbone.CPVT import CPVTV2
from functools import partial
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = CPVTV2(
        patch_size=4, embed_dims=[64, 128, 320, 512], num_heads=[1, 2, 5, 8], mlp_ratios=[8, 8, 4, 4], qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6), depths=[3, 4, 6, 3], sr_ratios=[8, 4, 2, 1])
    output=model(input)
    print(output.shape)

```

### 11. PIT Usage
#### 11.1. Paper
[Rethinking Spatial Dimensions of Vision Transformers](https://arxiv.org/abs/2103.16302)

#### 11.2. Usage Code
```python

from model.backbone.PIT import PoolingTransformer
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = PoolingTransformer(
        image_size=224,
        patch_size=14,
        stride=7,
        base_dims=[64, 64, 64],
        depth=[3, 6, 4],
        heads=[4, 8, 16],
        mlp_ratio=4
    )
    output=model(input)
    print(output.shape)

```

### 12. CrossViT Usage
#### 12.1. Paper
[CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification](https://arxiv.org/abs/2103.14899)

#### 12.2. Usage Code
```python

from model.backbone.CrossViT import VisionTransformer
from functools import partial
import torch
from torch import nn

if __name__ == "__main__":
    input=torch.randn(1,3,224,224)
    model = VisionTransformer(
        img_size=[240, 224],
        patch_size=[12, 16], 
        embed_dim=[192, 384], 
        depth=[[1, 4, 0], [1, 4, 0], [1, 4, 0]],
        num_heads=[6, 6], 
        mlp_ratio=[4, 4, 1], 
        qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6)
    )
    output=model(input)
    print(output.shape)

```

### 13. TnT Usage
#### 13.1. Paper
[Transformer in Transformer](https://arxiv.org/abs/2103.00112)

#### 13.2. Usage Code
```python

from model.backbone.TnT import TNT
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = TNT(
        img_size=224, 
        patch_size=16, 
        outer_dim=384, 
        inner_dim=24, 
        depth=12,
        outer_num_heads=6, 
        inner_num_heads=4, 
        qkv_bias=False,
        inner_stride=4)
    output=model(input)
    print(output.shape)

```

### 14. DViT Usage
#### 14.1. Paper
[DeepViT: Towards Deeper Vision Transformer](https://arxiv.org/abs/2103.11886)

#### 14.2. Usage Code
```python

from model.backbone.DViT import DeepVisionTransformer
from functools import partial
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = DeepVisionTransformer(
        patch_size=16, embed_dim=384, 
        depth=[False] * 16, 
        apply_transform=[False] * 0 + [True] * 32, 
        num_heads=12, 
        mlp_ratio=3, 
        qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6),
        )
    output=model(input)
    print(output.shape)

```

### 15. CeiT Usage
#### 15.1. Paper
[Incorporating Convolution Designs into Visual Transformers](https://arxiv.org/abs/2103.11816)

#### 15.2. Usage Code
```python

from model.backbone.CeiT import CeIT, Image2Tokens
from functools import partial
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = CeIT(
        hybrid_backbone=Image2Tokens(),
        patch_size=4, 
        embed_dim=192, 
        depth=12, 
        num_heads=3, 
        mlp_ratio=4, 
        qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6)
        )
    output=model(input)
    print(output.shape)

```

### 16. ConViT Usage
#### 16.1. Paper
[ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases](https://arxiv.org/abs/2103.10697)

#### 16.2. Usage Code
```python

from model.backbone.ConViT import VisionTransformer
from functools import partial
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = VisionTransformer(
        num_heads=16,
        norm_layer=partial(nn.LayerNorm, eps=1e-6)
        )
    output=model(input)
    print(output.shape)

```

### 17. CaiT Usage
#### 17.1. Paper
[Going deeper with Image Transformers](https://arxiv.org/abs/2103.17239)

#### 17.2. Usage Code
```python

from model.backbone.CaiT import CaiT
from functools import partial
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = CaiT(
        img_size= 224,
        patch_size=16, 
        embed_dim=192, 
        depth=24, 
        num_heads=4, 
        mlp_ratio=4, 
        qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6),
        init_scale=1e-5,
        depth_token_only=2
        )
    output=model(input)
    print(output.shape)

```

### 18. PatchConvnet Usage
#### 18.1. Paper
[Augmenting Convolutional networks with attention-based aggregation](https://arxiv.org/abs/2112.13692)

#### 18.2. Usage Code
```python

from model.backbone.PatchConvnet import PatchConvnet, ConvStem, Conv_blocks_se
from functools import partial
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = PatchConvnet(
        patch_size=16,
        embed_dim=384,
        depth=60,
        num_heads=1,
        qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6),
        Patch_layer=ConvStem,
        Attention_block=Conv_blocks_se,
        depth_token_only=1,
        mlp_ratio_clstk=3.0,
    )
    output=model(input)
    print(output.shape)

```

### 19. DeiT Usage
#### 19.1. Paper
[Training data-efficient image transformers & distillation through attention](https://arxiv.org/abs/2012.12877)

#### 19.2. Usage Code
```python

from model.backbone.DeiT import DistilledVisionTransformer
from functools import partial
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = DistilledVisionTransformer(
        patch_size=16, 
        embed_dim=384, 
        depth=12, 
        num_heads=6, 
        mlp_ratio=4, 
        qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6)
        )
    output=model(input)
    print(output[0].shape)

```

### 20. LeViT Usage
#### 20.1. Paper
[LeViT: a Vision Transformer in ConvNet’s Clothing for Faster Inference](https://arxiv.org/abs/2104.01136)

#### 20.2. Usage Code
```python

from model.backbone.LeViT import *
import torch
from torch import nn

if __name__ == '__main__':
    for name in specification:
        input=torch.randn(1,3,224,224)
        model = globals()[name](fuse=True, pretrained=False)
        model.eval()
        output = model(input)
        print(output.shape)

```

### 21. VOLO Usage
#### 21.1. Paper
[VOLO: Vision Outlooker for Visual Recognition](https://arxiv.org/abs/2106.13112)

#### 21.2. Usage Code
```python

from model.backbone.VOLO import VOLO
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = VOLO([4, 4, 8, 2],
                 embed_dims=[192, 384, 384, 384],
                 num_heads=[6, 12, 12, 12],
                 mlp_ratios=[3, 3, 3, 3],
                 downsamples=[True, False, False, False],
                 outlook_attention=[True, False, False, False ],
                 post_layers=['ca', 'ca'],
                 )
    output=model(input)
    print(output[0].shape)

```

### 22. Container Usage
#### 22.1. Paper
[Container: Context Aggregation Network](https://arxiv.org/abs/2106.01401)

#### 22.2. Usage Code
```python

from model.backbone.Container import VisionTransformer
from functools import partial
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = VisionTransformer(
        img_size=[224, 56, 28, 14], 
        patch_size=[4, 2, 2, 2], 
        embed_dim=[64, 128, 320, 512], 
        depth=[3, 4, 8, 3], 
        num_heads=16, 
        mlp_ratio=[8, 8, 4, 4], 
        qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6))
    output=model(input)
    print(output.shape)

```

### 23. CMT Usage
#### 23.1. Paper
[CMT: Convolutional Neural Networks Meet Vision Transformers](https://arxiv.org/abs/2107.06263)

#### 23.2. Usage Code
```python

from model.backbone.CMT import CMT_Tiny
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = CMT_Tiny()
    output=model(input)
    print(output[0].shape)

```






# MLP Series

- Pytorch implementation of ["RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition---arXiv 2021.05.05"](https://arxiv.org/pdf/2105.01883v1.pdf)

- Pytorch implementation of ["MLP-Mixer: An all-MLP Architecture for Vision---arXiv 2021.05.17"](https://arxiv.org/pdf/2105.01601.pdf)

- Pytorch implementation of ["ResMLP: Feedforward networks for image classification with data-efficient training---arXiv 2021.05.07"](https://arxiv.org/pdf/2105.03404.pdf)

- Pytorch implementation of ["Pay Attention to MLPs---arXiv 2021.05.17"](https://arxiv.org/abs/2105.08050)


- Pytorch implementation of ["Sparse MLP for Image Recognition: Is Self-Attention Really Necessary?---arXiv 2021.09.12"](https://arxiv.org/abs/2109.05422)

### 1. RepMLP Usage
#### 1.1. Paper
["RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition"](https://arxiv.org/pdf/2105.01883v1.pdf)

#### 1.2. Overview
![](./model/img/repmlp.png)

#### 1.3. Usage Code
```python
from model.mlp.repmlp import RepMLP
import torch
from torch import nn

N=4 #batch size
C=512 #input dim
O=1024 #output dim
H=14 #image height
W=14 #image width
h=7 #patch height
w=7 #patch width
fc1_fc2_reduction=1 #reduction ratio
fc3_groups=8 # groups
repconv_kernels=[1,3,5,7] #kernel list
repmlp=RepMLP(C,O,H,W,h,w,fc1_fc2_reduction,fc3_groups,repconv_kernels=repconv_kernels)
x=torch.randn(N,C,H,W)
repmlp.eval()
for module in repmlp.modules():
    if isinstance(module, nn.BatchNorm2d) or isinstance(module, nn.BatchNorm1d):
        nn.init.uniform_(module.running_mean, 0, 0.1)
        nn.init.uniform_(module.running_var, 0, 0.1)
        nn.init.uniform_(module.weight, 0, 0.1)
        nn.init.uniform_(module.bias, 0, 0.1)

#training result
out=repmlp(x)
#inference result
repmlp.switch_to_deploy()
deployout = repmlp(x)

print(((deployout-out)**2).sum())
```

### 2. MLP-Mixer Usage
#### 2.1. Paper
["MLP-Mixer: An all-MLP Architecture for Vision"](https://arxiv.org/pdf/2105.01601.pdf)

#### 2.2. Overview
![](./model/img/mlpmixer.png)

#### 2.3. Usage Code
```python
from model.mlp.mlp_mixer import MlpMixer
import torch
mlp_mixer=MlpMixer(num_classes=1000,num_blocks=10,patch_size=10,tokens_hidden_dim=32,channels_hidden_dim=1024,tokens_mlp_dim=16,channels_mlp_dim=1024)
input=torch.randn(50,3,40,40)
output=mlp_mixer(input)
print(output.shape)
```
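
As a rough guide to the sizes in these MLP examples, the number of tokens a patch-based model sees is the number of non-overlapping patches in the image. The helper below, `num_patches`, is a hypothetical name used only for illustration and assumes square images and square patches. For the 40x40 input with `patch_size=10` above it gives 16, which appears to line up with `tokens_mlp_dim=16`, although that relationship is inferred from the example rather than stated by the source.

```python
def num_patches(image_size, patch_size):
    """Count non-overlapping square patches in a square image (hypothetical helper)."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


mixer_tokens = num_patches(40, 10)     # the MLP-Mixer example: 40x40 image, 10x10 patches
small_tokens = num_patches(14, 7)      # a 14x14 image with 7x7 patches
imagenet_tokens = num_patches(224, 16) # a common 224x224 / 16x16 configuration

print(mixer_tokens, small_tokens, imagenet_tokens)
```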

***

### 3. ResMLP Usage
#### 3.1. Paper
["ResMLP: Feedforward networks for image classification with data-efficient training"](https://arxiv.org/pdf/2105.03404.pdf)

#### 3.2. Overview
![](./model/img/resmlp.png)

#### 3.3. Usage Code
```python
from model.mlp.resmlp import ResMLP
import torch

input=torch.randn(50,3,14,14)
resmlp=ResMLP(dim=128,image_size=14,patch_size=7,class_num=1000)
out=resmlp(input)
print(out.shape) #the last dimension is class_num
```

***

### 4. gMLP Usage
#### 4.1. Paper
["Pay Attention to MLPs"](https://arxiv.org/abs/2105.08050)

#### 4.2. Overview
![](./model/img/gMLP.jpg)

#### 4.3. Usage Code
```python
from model.mlp.g_mlp import gMLP
import torch

num_tokens=10000
bs=50
len_sen=49
num_layers=6
input=torch.randint(num_tokens,(bs,len_sen)) #bs,len_sen
gmlp = gMLP(num_tokens=num_tokens,len_sen=len_sen,dim=512,d_ff=1024)
output=gmlp(input)
print(output.shape)
```

***

### 5. sMLP Usage
#### 5.1. Paper
["Sparse MLP for Image Recognition: Is Self-Attention Really Necessary?"](https://arxiv.org/abs/2109.05422)

#### 5.2. Overview
![](./model/img/sMLP.jpg)

#### 5.3. Usage Code
```python
from model.mlp.sMLP_block import sMLPBlock
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(50,3,224,224)
    smlp=sMLPBlock(h=224,w=224)
    out=smlp(input)
    print(out.shape)
```

### 6. vip-mlp Usage
#### 6.1. Paper
["Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition"](https://arxiv.org/abs/2106.12368)

#### 6.2. Usage Code
```python
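import importlib
import torch
from torch import nn
from torch.nn import functional as F

# 'vip-mlp' contains a hyphen, so it cannot appear in a regular import statement;
# load the module by its dotted path instead.
vip_mlp = importlib.import_module('model.mlp.vip-mlp')
VisionPermutator = vip_mlp.VisionPermutator
WeightedPermuteMLP = vip_mlp.WeightedPermuteMLP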

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = VisionPermutator(
        layers=[4, 3, 8, 3], 
        embed_dims=[384, 384, 384, 384], 
        patch_size=14, 
        transitions=[False, False, False, False],
        segment_dim=[16, 16, 16, 16], 
        mlp_ratios=[3, 3, 3, 3], 
        mlp_fn=WeightedPermuteMLP
    )
    output=model(input)
    print(output.shape)
```


# Re-Parameter Series

- Pytorch implementation of ["RepVGG: Making VGG-style ConvNets Great Again---CVPR2021"](https://arxiv.org/abs/2101.03697)

- Pytorch implementation of ["ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks---ICCV2019"](https://arxiv.org/abs/1908.03930)

- Pytorch implementation of ["Diverse Branch Block: Building a Convolution as an Inception-like Unit---CVPR2021"](https://arxiv.org/abs/2103.13425)


***

### 1. RepVGG Usage
#### 1.1. Paper
["RepVGG: Making VGG-style ConvNets Great Again"](https://arxiv.org/abs/2101.03697)

#### 1.2. Overview
![](./model/img/repvgg.png)

#### 1.3. Usage Code
```python

from model.rep.repvgg import RepBlock
import torch


input=torch.randn(50,512,49,49)
repblock=RepBlock(512,512)
repblock.eval()
out=repblock(input)
repblock._switch_to_deploy()
out2=repblock(input)
print('difference between vgg and repvgg')
print(((out2-out)**2).sum())
```
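
The equivalence checked above rests on the branch merging described in the RepVGG paper. As a sketch, assuming each branch's BatchNorm has already been folded into its convolution, and that the identity branch exists only when the input and output channel counts match, the three branches collapse into a single 3x3 convolution:

$$
W' = W^{(3)} + \operatorname{pad}\big(W^{(1)}\big) + W^{(\mathrm{id})},
\qquad
b' = b^{(3)} + b^{(1)} + b^{(\mathrm{id})}
$$

Here $\operatorname{pad}$ zero-pads the 1x1 kernel to 3x3, and $W^{(\mathrm{id})}_{o,i,u,v}$ equals 1 when $o = i$ and $(u, v)$ is the kernel center, and 0 otherwise. Whether `RepBlock._switch_to_deploy` performs exactly these steps is an assumption, since its source is not shown here.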



***

### 2. ACNet Usage
#### 2.1. Paper
["ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks"](https://arxiv.org/abs/1908.03930)

#### 2.2. Overview
![](./model/img/acnet.png)

#### 2.3. Usage Code
```python
from model.rep.acnet import ACNet
import torch
from torch import nn

input=torch.randn(50,512,49,49)
acnet=ACNet(512,512)
acnet.eval()
out=acnet(input)
acnet._switch_to_deploy()
out2=acnet(input)
print('difference:')
print(((out2-out)**2).sum())

```



***

### 3. Diverse Branch Block Usage
#### 3.1. Paper
["Diverse Branch Block: Building a Convolution as an Inception-like Unit"](https://arxiv.org/abs/2103.13425)

#### 3.2. Overview
![](./model/img/ddb.png)

#### 3.3. Usage Code
##### 3.3.1 Transform I
```python
from model.rep.ddb import transI_conv_bn
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,64,7,7)
#conv+bn
conv1=nn.Conv2d(64,64,3,padding=1)
bn1=nn.BatchNorm2d(64)
bn1.eval()
out1=bn1(conv1(input))

#conv_fuse
conv_fuse=nn.Conv2d(64,64,3,padding=1)
conv_fuse.weight.data,conv_fuse.bias.data=transI_conv_bn(conv1,bn1)
out2=conv_fuse(input)

print("difference:",((out2-out1)**2).sum().item())
```
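
For reference, the algebra behind folding an inference-mode BatchNorm layer into the preceding convolution can be written as below. The symbols are the usual BN quantities (running mean $\mu$, running variance $\sigma^2$, scale $\gamma$, shift $\beta$, stabilizer $\epsilon$), each taken per output channel. That `transI_conv_bn` follows exactly this form is an assumption, since its source is not shown here.

$$
\mathrm{BN}(W * x + b)
= \gamma\,\frac{(W * x + b) - \mu}{\sqrt{\sigma^2 + \epsilon}} + \beta
= W' * x + b',
\qquad
W' = \frac{\gamma}{\sqrt{\sigma^2 + \epsilon}}\,W,
\quad
b' = \beta + \frac{\gamma\,(b - \mu)}{\sqrt{\sigma^2 + \epsilon}}
$$

The identity only holds when BN uses its stored running statistics, which is why the example calls `bn1.eval()` before comparing outputs.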

##### 3.3.2 Transform II
```python
from model.rep.ddb import transII_conv_branch
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,64,7,7)

#conv+conv
conv1=nn.Conv2d(64,64,3,padding=1)
conv2=nn.Conv2d(64,64,3,padding=1)
out1=conv1(input)+conv2(input)

#conv_fuse
conv_fuse=nn.Conv2d(64,64,3,padding=1)
conv_fuse.weight.data,conv_fuse.bias.data=transII_conv_branch(conv1,conv2)
out2=conv_fuse(input)

print("difference:",((out2-out1)**2).sum().item())
```
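
The branch-addition transform works because convolution is linear in its kernel: summing the outputs of two parallel convolutions with the same shape gives the same result as one convolution whose kernel and bias are the element-wise sums. The pure-Python sketch below shows this on a 1-D signal; `conv1d` is a hypothetical helper written for illustration and is not part of the repository.

```python
def conv1d(signal, kernel, bias=0.0):
    """Valid 1-D cross-correlation with a scalar bias (illustrative stand-in for a conv layer)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(signal) - k + 1)
    ]


signal = [1.0, 2.0, -1.0, 0.5, 3.0]
kernel_a, bias_a = [0.5, -1.0, 2.0], 0.1
kernel_b, bias_b = [1.5, 0.25, -0.5], -0.3

# Two parallel branches whose outputs are summed.
branch_sum = [
    a + b
    for a, b in zip(conv1d(signal, kernel_a, bias_a), conv1d(signal, kernel_b, bias_b))
]

# One fused branch: kernels and biases are added element-wise.
fused_kernel = [a + b for a, b in zip(kernel_a, kernel_b)]
fused = conv1d(signal, fused_kernel, bias_a + bias_b)

max_diff = max(abs(x - y) for x, y in zip(branch_sum, fused))
print("max difference:", max_diff)
```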

##### 3.3.3 Transform III
```python
from model.rep.ddb import transIII_conv_sequential
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,64,7,7)

#conv+conv
conv1=nn.Conv2d(64,64,1,padding=0,bias=False)
conv2=nn.Conv2d(64,64,3,padding=1,bias=False)
out1=conv2(conv1(input))


#conv_fuse
conv_fuse=nn.Conv2d(64,64,3,padding=1,bias=False)
conv_fuse.weight.data=transIII_conv_sequential(conv1,conv2)
out2=conv_fuse(input)

print("difference:",((out2-out1)**2).sum().item())
```
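
For bias-free layers like those in the example above, a 1x1 convolution followed by a KxK convolution composes into a single KxK convolution, because the 1x1 layer is only a linear mix of channels at each pixel. Writing $W^{(1)}$ for the 1x1 kernel and $W^{(2)}$ for the KxK kernel, with $m$ indexing the intermediate channels, the merged kernel is:

$$
W'_{o,i,u,v} = \sum_{m} W^{(2)}_{o,m,u,v}\; W^{(1)}_{m,i,0,0}
$$

This is the general identity; whether `transIII_conv_sequential` computes it in exactly this way is an assumption, as its source is not shown here.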

##### 3.3.4 Transform IV
```python
from model.rep.ddb import transIV_conv_concat
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,64,7,7)

#conv+conv
conv1=nn.Conv2d(64,32,3,padding=1)
conv2=nn.Conv2d(64,32,3,padding=1)
out1=torch.cat([conv1(input),conv2(input)],dim=1)

#conv_fuse
conv_fuse=nn.Conv2d(64,64,3,padding=1)
conv_fuse.weight.data,conv_fuse.bias.data=transIV_conv_concat(conv1,conv2)
out2=conv_fuse(input)

print("difference:",((out2-out1)**2).sum().item())
```

##### 3.3.5 Transform V
```python
from model.rep.ddb import transV_avg
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,64,7,7)

avg=nn.AvgPool2d(kernel_size=3,stride=1)
out1=avg(input)

conv=transV_avg(64,3)
out2=conv(input)

print("difference:",((out2-out1)**2).sum().item())
```
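
Average pooling can be replaced by a convolution because a KxK average is a KxK cross-correlation whose weights are all 1/K². The single-channel sketch below checks this in pure Python; `avg_pool2d` and `conv2d` are hypothetical helpers written for illustration. In the multi-channel case the equivalent kernel would be non-zero only where the input and output channel indices coincide, which is presumably what `transV_avg` builds, though its source is not shown here.

```python
def avg_pool2d(image, k):
    """k x k average pooling with stride 1 and no padding on a 2-D list."""
    h, w = len(image), len(image[0])
    return [
        [sum(image[r + i][c + j] for i in range(k) for j in range(k)) / (k * k)
         for c in range(w - k + 1)]
        for r in range(h - k + 1)
    ]


def conv2d(image, kernel):
    """Single-channel valid cross-correlation with stride 1."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    return [
        [sum(image[r + i][c + j] * kernel[i][j] for i in range(k) for j in range(k))
         for c in range(w - k + 1)]
        for r in range(h - k + 1)
    ]


k = 3
image = [[float((r * 7 + c * 3) % 11) for c in range(5)] for r in range(5)]
uniform_kernel = [[1.0 / (k * k)] * k for _ in range(k)]

pooled = avg_pool2d(image, k)
convolved = conv2d(image, uniform_kernel)
max_diff = max(
    abs(a - b)
    for row_a, row_b in zip(pooled, convolved)
    for a, b in zip(row_a, row_b)
)
print("max difference:", max_diff)
```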


##### 3.3.6 Transform VI
```python
from model.rep.ddb import transVI_conv_scale
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,64,7,7)

#conv+conv
conv1x1=nn.Conv2d(64,64,1)
conv1x3=nn.Conv2d(64,64,(1,3),padding=(0,1))
conv3x1=nn.Conv2d(64,64,(3,1),padding=(1,0))
out1=conv1x1(input)+conv1x3(input)+conv3x1(input)

#conv_fuse
conv_fuse=nn.Conv2d(64,64,3,padding=1)
conv_fuse.weight.data,conv_fuse.bias.data=transVI_conv_scale(conv1x1,conv1x3,conv3x1)
out2=conv_fuse(input)

print("difference:",((out2-out1)**2).sum().item())
```
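
The multi-scale transform can be pictured as zero-padding each smaller kernel so that it sits centered in a 3x3 grid, then adding the padded kernels together. The sketch below does this for one input/output channel pair in pure Python. `pad_to_square` is a hypothetical helper, and whether `transVI_conv_scale` is implemented this way is an assumption.

```python
def pad_to_square(kernel, size):
    """Zero-pad a 2-D kernel (list of rows) so it sits centered in a size x size grid."""
    rows, cols = len(kernel), len(kernel[0])
    top, left = (size - rows) // 2, (size - cols) // 2
    padded = [[0.0] * size for _ in range(size)]
    for r in range(rows):
        for c in range(cols):
            padded[top + r][left + c] = kernel[r][c]
    return padded


k1x1 = [[2.0]]
k1x3 = [[1.0, 3.0, 1.0]]
k3x1 = [[4.0], [5.0], [6.0]]

# Pad every branch kernel to 3x3, then sum them position by position.
padded = [pad_to_square(kernel, 3) for kernel in (k1x1, k1x3, k3x1)]
fused = [[sum(p[r][c] for p in padded) for c in range(3)] for r in range(3)]

for row in fused:
    print(row)
```

The branch biases simply add, as in the branch-addition case.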





# Convolution Series

- Pytorch implementation of ["MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications---CVPR2017"](https://arxiv.org/abs/1704.04861)

- Pytorch implementation of ["Efficientnet: Rethinking model scaling for convolutional neural networks---PMLR2019"](http://proceedings.mlr.press/v97/tan19a.html)

- Pytorch implementation of ["Involution: Inverting the Inherence of Convolution for Visual Recognition---CVPR2021"](https://arxiv.org/abs/2103.06255)

- Pytorch implementation of ["Dynamic Convolution: Attention over Convolution Kernels---CVPR2020 Oral"](https://arxiv.org/abs/1912.03458)

- Pytorch implementation of ["CondConv: Conditionally Parameterized Convolutions for Efficient Inference---NeurIPS2019"](https://arxiv.org/abs/1904.04971)

***

### 1. Depthwise Separable Convolution Usage
#### 1.1. Paper
["MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications"](https://arxiv.org/abs/1704.04861)

#### 1.2. Overview
![](./model/img/DepthwiseSeparableConv.png)

#### 1.3. Usage Code
```python
from model.conv.DepthwiseSeparableConvolution import DepthwiseSeparableConvolution
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,3,224,224)
dsconv=DepthwiseSeparableConvolution(3,64)
out=dsconv(input)
print(out.shape)
```
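
The appeal of a depthwise separable convolution is its lower parameter count. The arithmetic below compares it with a standard convolution for the 3-to-64-channel case used above, assuming a 3x3 kernel and no bias terms. Both are assumptions made for illustration, since the defaults of `DepthwiseSeparableConvolution` are not shown here, and the function names are hypothetical.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution without bias."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


c_in, c_out, k = 3, 64, 3
standard = standard_conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)
print("standard:", standard, "separable:", separable)
print("reduction factor:", round(standard / separable, 2))
```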

***


### 2. MBConv Usage
#### 2.1. Paper
["Efficientnet: Rethinking model scaling for convolutional neural networks"](http://proceedings.mlr.press/v97/tan19a.html)

#### 2.2. Overview
![](./model/img/MBConv.jpg)

#### 2.3. Usage Code
```python
from model.conv.MBConv import MBConvBlock
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,3,224,224)
mbconv=MBConvBlock(ksize=3,input_filters=3,output_filters=512,image_size=224)
out=mbconv(input)
print(out.shape)


```

***


### 3. Involution Usage
#### 3.1. Paper
["Involution: Inverting the Inherence of Convolution for Visual Recognition"](https://arxiv.org/abs/2103.06255)

#### 3.2. Overview
![](./model/img/Involution.png)

#### 3.3. Usage Code
```python
from model.conv.Involution import Involution
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,4,64,64)
involution=Involution(kernel_size=3,in_channel=4,stride=2)
out=involution(input)
print(out.shape)
```

***


### 4. DynamicConv Usage
#### 4.1. Paper
["Dynamic Convolution: Attention over Convolution Kernels"](https://arxiv.org/abs/1912.03458)

#### 4.2. Overview
![](./model/img/DynamicConv.png)

#### 4.3. Usage Code
```python
from model.conv.DynamicConv import *
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(2,32,64,64)
    m=DynamicConv(in_planes=32,out_planes=64,kernel_size=3,stride=1,padding=1,bias=False)
    out=m(input)
    print(out.shape) # 2,64,64,64

```
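
The output shape of the call above can be worked out by hand. The sketch below assumes that `DynamicConv` follows the same shape rule as `torch.nn.Conv2d`; the helper `conv_output_size` is hypothetical and not part of this repository.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial axis, using the nn.Conv2d shape formula."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# Settings from the DynamicConv call above: 64x64 input, kernel 3, stride 1, padding 1.
side = conv_output_size(64, kernel_size=3, stride=1, padding=1)
expected_shape = (2, 64, side, side)  # (batch, out_planes, H_out, W_out)
print(expected_shape)  # (2, 64, 64, 64)

# Hypothetical variation: stride 2 halves the spatial size.
print(conv_output_size(64, kernel_size=3, stride=2, padding=1))  # 32
```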

***


### 5. CondConv Usage
#### 5.1. Paper
["CondConv: Conditionally Parameterized Convolutions for Efficient Inference"](https://arxiv.org/abs/1904.04971)

#### 5.2. Overview
![](./model/img/CondConv.png)

#### 5.3. Usage Code
```python
from model.conv.CondConv import *
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(2,32,64,64)
    m=CondConv(in_planes=32,out_planes=64,kernel_size=3,stride=1,padding=1,bias=False)
    out=m(input)
    print(out.shape)

```

***


================================================
FILE: README_pip.md
================================================
## pip Usage Guide

### Installation

 Install directly with pip so the modules can be used in other tasks.

  ```shell
  pip install fightingcv-attention
  ```

### Demo

#### Using the pip package
```python
import torch
from torch import nn
from torch.nn import functional as F

# Using the pip package

from fightingcv_attention.attention.MobileViTv2Attention import *

if __name__ == '__main__':
    input=torch.randn(50,49,512)
    sa = MobileViTv2Attention(d_model=512)
    output=sa(input)
    print(output.shape)
```
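
Note that the distribution is installed as `fightingcv-attention` (hyphen) but imported as `fightingcv_attention` (underscore). A small standard-library check such as the following can confirm that the package is visible to the current interpreter; the helper name `is_installed` is hypothetical.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found on the import path."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("fightingcv_attention"):
    print("fightingcv_attention is available")
else:
    print("fightingcv_attention not found; run: pip install fightingcv-attention")
```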

## The pip package fightingcv-attention includes the following modules

# Contents

- [Attention Series](#attention-series)
    - [1. External Attention Usage](#1-external-attention-usage)

    - [2. Self Attention Usage](#2-self-attention-usage)

    - [3. Simplified Self Attention Usage](#3-simplified-self-attention-usage)

    - [4. Squeeze-and-Excitation Attention Usage](#4-squeeze-and-excitation-attention-usage)

    - [5. SK Attention Usage](#5-sk-attention-usage)

    - [6. CBAM Attention Usage](#6-cbam-attention-usage)

    - [7. BAM Attention Usage](#7-bam-attention-usage)
    
    - [8. ECA Attention Usage](#8-eca-attention-usage)

    - [9. DANet Attention Usage](#9-danet-attention-usage)

    - [10. Pyramid Split Attention (PSA) Usage](#10-Pyramid-Split-Attention-Usage)

    - [11. Efficient Multi-Head Self-Attention(EMSA) Usage](#11-Efficient-Multi-Head-Self-Attention-Usage)

    - [12. Shuffle Attention Usage](#12-Shuffle-Attention-Usage)
    
    - [13. MUSE Attention Usage](#13-MUSE-Attention-Usage)
  
    - [14. SGE Attention Usage](#14-SGE-Attention-Usage)

    - [15. A2 Attention Usage](#15-A2-Attention-Usage)

    - [16. AFT Attention Usage](#16-AFT-Attention-Usage)

    - [17. Outlook Attention Usage](#17-Outlook-Attention-Usage)

    - [18. ViP Attention Usage](#18-ViP-Attention-Usage)

    - [19. CoAtNet Attention Usage](#19-CoAtNet-Attention-Usage)

    - [20. HaloNet Attention Usage](#20-HaloNet-Attention-Usage)

    - [21. Polarized Self-Attention Usage](#21-Polarized-Self-Attention-Usage)

    - [22. CoTAttention Usage](#22-CoTAttention-Usage)

    - [23. Residual Attention Usage](#23-Residual-Attention-Usage)
  
    - [24. S2 Attention Usage](#24-S2-Attention-Usage)

    - [25. GFNet Attention Usage](#25-GFNet-Attention-Usage)

    - [26. Triplet Attention Usage](#26-TripletAttention-Usage)

    - [27. Coordinate Attention Usage](#27-Coordinate-Attention-Usage)

    - [28. MobileViT Attention Usage](#28-MobileViT-Attention-Usage)

    - [29. ParNet Attention Usage](#29-ParNet-Attention-Usage)

    - [30. UFO Attention Usage](#30-UFO-Attention-Usage)

    - [31. ACmix Attention Usage](#31-Acmix-Attention-Usage)
  
    - [32. MobileViTv2 Attention Usage](#32-MobileViTv2-Attention-Usage)

    - [33. DAT Attention Usage](#33-DAT-Attention-Usage)

    - [34. CrossFormer Attention Usage](#34-CrossFormer-Attention-Usage)

    - [35. MOATransformer Attention Usage](#35-MOATransformer-Attention-Usage)

    - [36. CrissCrossAttention Attention Usage](#36-CrissCrossAttention-Attention-Usage)

    - [37. Axial_attention Attention Usage](#37-Axial_attention-Attention-Usage)

- [Backbone Series](#Backbone-series)

    - [1. ResNet Usage](#1-ResNet-Usage)

    - [2. ResNeXt Usage](#2-ResNeXt-Usage)

    - [3. MobileViT Usage](#3-MobileViT-Usage)

    - [4. ConvMixer Usage](#4-ConvMixer-Usage)

    - [5. ShuffleTransformer Usage](#5-ShuffleTransformer-Usage)

    - [6. ConTNet Usage](#6-ConTNet-Usage)

    - [7. HATNet Usage](#7-HATNet-Usage)

    - [8. CoaT Usage](#8-CoaT-Usage)

    - [9. PVT Usage](#9-PVT-Usage)

    - [10. CPVT Usage](#10-CPVT-Usage)

    - [11. PIT Usage](#11-PIT-Usage)

    - [12. CrossViT Usage](#12-CrossViT-Usage)

    - [13. TnT Usage](#13-TnT-Usage)

    - [14. DViT Usage](#14-DViT-Usage)

    - [15. CeiT Usage](#15-CeiT-Usage)

    - [16. ConViT Usage](#16-ConViT-Usage)

    - [17. CaiT Usage](#17-CaiT-Usage)

    - [18. PatchConvnet Usage](#18-PatchConvnet-Usage)

    - [19. DeiT Usage](#19-DeiT-Usage)

    - [20. LeViT Usage](#20-LeViT-Usage)

    - [21. VOLO Usage](#21-VOLO-Usage)
    
    - [22. Container Usage](#22-Container-Usage)

    - [23. CMT Usage](#23-CMT-Usage)

    - [24. EfficientFormer Usage](#24-EfficientFormer-Usage)


- [MLP Series](#mlp-series)

    - [1. RepMLP Usage](#1-RepMLP-Usage)

    - [2. MLP-Mixer Usage](#2-MLP-Mixer-Usage)

    - [3. ResMLP Usage](#3-ResMLP-Usage)

    - [4. gMLP Usage](#4-gMLP-Usage)

    - [5. sMLP Usage](#5-sMLP-Usage)

    - [6. vip-mlp Usage](#6-vip-mlp-Usage)

- [Re-Parameter(ReP) Series](#Re-Parameter-series)

    - [1. RepVGG Usage](#1-RepVGG-Usage)

    - [2. ACNet Usage](#2-ACNet-Usage)

    - [3. Diverse Branch Block (DBB) Usage](#3-Diverse-Branch-Block-Usage)

- [Convolution Series](#Convolution-series)

    - [1. Depthwise Separable Convolution Usage](#1-Depthwise-Separable-Convolution-Usage)

    - [2. MBConv Usage](#2-MBConv-Usage)

    - [3. Involution Usage](#3-Involution-Usage)

    - [4. DynamicConv Usage](#4-DynamicConv-Usage)

    - [5. CondConv Usage](#5-CondConv-Usage)

***



# Attention Series

- Pytorch implementation of ["Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks---arXiv 2021.05.05"](https://arxiv.org/abs/2105.02358)

- Pytorch implementation of ["Attention Is All You Need---NIPS2017"](https://arxiv.org/pdf/1706.03762.pdf)

- Pytorch implementation of ["Squeeze-and-Excitation Networks---CVPR2018"](https://arxiv.org/abs/1709.01507)

- Pytorch implementation of ["Selective Kernel Networks---CVPR2019"](https://arxiv.org/pdf/1903.06586.pdf)

- Pytorch implementation of ["CBAM: Convolutional Block Attention Module---ECCV2018"](https://openaccess.thecvf.com/content_ECCV_2018/papers/Sanghyun_Woo_Convolutional_Block_Attention_ECCV_2018_paper.pdf)

- Pytorch implementation of ["BAM: Bottleneck Attention Module---BMVC2018"](https://arxiv.org/pdf/1807.06514.pdf)

- Pytorch implementation of ["ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks---CVPR2020"](https://arxiv.org/pdf/1910.03151.pdf)

- Pytorch implementation of ["Dual Attention Network for Scene Segmentation---CVPR2019"](https://arxiv.org/pdf/1809.02983.pdf)

- Pytorch implementation of ["EPSANet: An Efficient Pyramid Split Attention Block on Convolutional Neural Network---arXiv 2021.05.30"](https://arxiv.org/pdf/2105.14447.pdf)

- Pytorch implementation of ["ResT: An Efficient Transformer for Visual Recognition---arXiv 2021.05.28"](https://arxiv.org/abs/2105.13677)

- Pytorch implementation of ["SA-NET: SHUFFLE ATTENTION FOR DEEP CONVOLUTIONAL NEURAL NETWORKS---ICASSP 2021"](https://arxiv.org/pdf/2102.00240.pdf)

- Pytorch implementation of ["MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning---arXiv 2019.11.17"](https://arxiv.org/abs/1911.09483)

- Pytorch implementation of ["Spatial Group-wise Enhance: Improving Semantic Feature Learning in Convolutional Networks---arXiv 2019.05.23"](https://arxiv.org/pdf/1905.09646.pdf)

- Pytorch implementation of ["A2-Nets: Double Attention Networks---NIPS2018"](https://arxiv.org/pdf/1810.11579.pdf)


- Pytorch implementation of ["An Attention Free Transformer---ICLR2021 (Apple New Work)"](https://arxiv.org/pdf/2105.14103v1.pdf)


- Pytorch implementation of [VOLO: Vision Outlooker for Visual Recognition---arXiv 2021.06.24](https://arxiv.org/abs/2106.13112)
  [[Paper analysis]](https://zhuanlan.zhihu.com/p/385561050)

- Pytorch implementation of [Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition---arXiv 2021.06.23](https://arxiv.org/abs/2106.12368)
  [[Paper analysis]](https://mp.weixin.qq.com/s/5gonUQgBho_m2O54jyXF_Q)

- Pytorch implementation of [CoAtNet: Marrying Convolution and Attention for All Data Sizes---arXiv 2021.06.09](https://arxiv.org/abs/2106.04803)
  [[Paper analysis]](https://zhuanlan.zhihu.com/p/385578588)

- Pytorch implementation of [Scaling Local Self-Attention for Parameter Efficient Visual Backbones---CVPR2021 Oral](https://arxiv.org/pdf/2103.12731.pdf)  [[Paper analysis]](https://zhuanlan.zhihu.com/p/388598744)

- Pytorch implementation of [Polarized Self-Attention: Towards High-quality Pixel-wise Regression---arXiv 2021.07.02](https://arxiv.org/abs/2107.00782)  [[Paper analysis]](https://zhuanlan.zhihu.com/p/389770482)

- Pytorch implementation of [Contextual Transformer Networks for Visual Recognition---arXiv 2021.07.26](https://arxiv.org/abs/2107.12292)  [[Paper analysis]](https://zhuanlan.zhihu.com/p/394795481)

- Pytorch implementation of [Residual Attention: A Simple but Effective Method for Multi-Label Recognition---ICCV2021](https://arxiv.org/abs/2108.02456)

- Pytorch implementation of [S²-MLPv2: Improved Spatial-Shift MLP Architecture for Vision---arXiv 2021.08.02](https://arxiv.org/abs/2108.01072) [[Paper analysis]](https://zhuanlan.zhihu.com/p/397003638)

- Pytorch implementation of [Global Filter Networks for Image Classification---arXiv 2021.07.01](https://arxiv.org/abs/2107.00645) 

- Pytorch implementation of [Rotate to Attend: Convolutional Triplet Attention Module---WACV 2021](https://arxiv.org/abs/2010.03045) 

- Pytorch implementation of [Coordinate Attention for Efficient Mobile Network Design ---CVPR 2021](https://arxiv.org/abs/2103.02907)

- Pytorch implementation of [MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer---ArXiv 2021.10.05](https://arxiv.org/abs/2110.02178)

- Pytorch implementation of [Non-deep Networks---ArXiv 2021.10.20](https://arxiv.org/abs/2110.07641)

- Pytorch implementation of [UFO-ViT: High Performance Linear Vision Transformer without Softmax---ArXiv 2021.09.29](https://arxiv.org/abs/2109.14382)

- Pytorch implementation of [Separable Self-attention for Mobile Vision Transformers---ArXiv 2022.06.06](https://arxiv.org/abs/2206.02680)

- Pytorch implementation of [On the Integration of Self-Attention and Convolution---ArXiv 2022.03.14](https://arxiv.org/pdf/2111.14556.pdf)

- Pytorch implementation of [CROSSFORMER: A VERSATILE VISION TRANSFORMER HINGING ON CROSS-SCALE ATTENTION---ICLR 2022](https://arxiv.org/pdf/2108.00154.pdf)

- Pytorch implementation of [Aggregating Global Features into Local Vision Transformer](https://arxiv.org/abs/2201.12903)

- Pytorch implementation of [CCNet: Criss-Cross Attention for Semantic Segmentation](https://arxiv.org/abs/1811.11721)

- Pytorch implementation of [Axial Attention in Multidimensional Transformers](https://arxiv.org/abs/1912.12180)
***


### 1. External Attention Usage
#### 1.1. Paper
["Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks"](https://arxiv.org/abs/2105.02358)

#### 1.2. Overview
![](./model/img/External_Attention.png)

#### 1.3. Usage Code
```python
from fightingcv_attention.attention.ExternalAttention import ExternalAttention
import torch

input=torch.randn(50,49,512)
ea = ExternalAttention(d_model=512,S=8)
output=ea(input)
print(output.shape)
```

***


### 2. Self Attention Usage
#### 2.1. Paper
["Attention Is All You Need"](https://arxiv.org/pdf/1706.03762.pdf)

#### 2.2. Overview
![](./model/img/SA.png)

#### 2.3. Usage Code
```python
from fightingcv_attention.attention.SelfAttention import ScaledDotProductAttention
import torch

input=torch.randn(50,49,512)
sa = ScaledDotProductAttention(d_model=512, d_k=512, d_v=512, h=8)
output=sa(input,input,input)
print(output.shape)
```
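
For readers who want to see the underlying computation, here is a dependency-free sketch of single-head scaled dot-product attention, `softmax(QK^T / sqrt(d_k)) V`, on plain Python lists. It is only an illustration of the formula from the paper: it omits the learned projections, multiple heads, and dropout that a full module would include, and the function names are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Single-head attention: each query returns a weighted mix of the values."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output dimension at a time.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs


q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
out = scaled_dot_product_attention(q, k, v)
print(out)  # the first row leans towards v[0], the second towards v[1]
```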

***

### 3. Simplified Self Attention Usage
#### 3.1. Paper
[None]()

#### 3.2. Overview
![](./model/img/SSA.png)

#### 3.3. Usage Code
```python
from fightingcv_attention.attention.SimplifiedSelfAttention import SimplifiedScaledDotProductAttention
import torch

input=torch.randn(50,49,512)
ssa = SimplifiedScaledDotProductAttention(d_model=512, h=8)
output=ssa(input,input,input)
print(output.shape)

```

***

### 4. Squeeze-and-Excitation Attention Usage
#### 4.1. Paper
["Squeeze-and-Excitation Networks"](https://arxiv.org/abs/1709.01507)

#### 4.2. Overview
![](./model/img/SE.png)

#### 4.3. Usage Code
```python
from fightingcv_attention.attention.SEAttention import SEAttention
import torch

input=torch.randn(50,512,7,7)
se = SEAttention(channel=512,reduction=8)
output=se(input)
print(output.shape)

```
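
To get a feel for the `reduction` argument, the sketch below computes the bottleneck width and weight count of an SE block. It assumes the design described in the SE paper (two fully connected layers, here taken as bias-free); the repository's `SEAttention` may differ in detail, and `se_block_params` is a hypothetical helper.

```python
def se_block_params(channel, reduction):
    """Bottleneck width and weight count of a two-layer SE excitation MLP."""
    hidden = channel // reduction  # width after the squeeze layer
    squeeze = channel * hidden     # channel -> hidden
    excite = hidden * channel      # hidden -> channel
    return hidden, squeeze + excite


hidden, params = se_block_params(channel=512, reduction=8)
print(hidden, params)  # 64 65536
```

A larger `reduction` shrinks the bottleneck and the parameter count, at the cost of a less expressive channel gate.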

***

### 5. SK Attention Usage
#### 5.1. Paper
["Selective Kernel Networks"](https://arxiv.org/pdf/1903.06586.pdf)

#### 5.2. Overview
![](./model/img/SK.png)

#### 5.3. Usage Code
```python
from fightingcv_attention.attention.SKAttention import SKAttention
import torch

input=torch.randn(50,512,7,7)
se = SKAttention(channel=512,reduction=8)
output=se(input)
print(output.shape)

```
***

### 6. CBAM Attention Usage
#### 6.1. Paper
["CBAM: Convolutional Block Attention Module"](https://openaccess.thecvf.com/content_ECCV_2018/papers/Sanghyun_Woo_Convolutional_Block_Attention_ECCV_2018_paper.pdf)

#### 6.2. Overview
![](./model/img/CBAM1.png)

![](./model/img/CBAM2.png)

#### 6.3. Usage Code
```python
from fightingcv_attention.attention.CBAM import CBAMBlock
import torch

input=torch.randn(50,512,7,7)
kernel_size=input.shape[2]
cbam = CBAMBlock(channel=512,reduction=16,kernel_size=kernel_size)
output=cbam(input)
print(output.shape)

```

***

### 7. BAM Attention Usage
#### 7.1. Paper
["BAM: Bottleneck Attention Module"](https://arxiv.org/pdf/1807.06514.pdf)

#### 7.2. Overview
![](./model/img/BAM.png)

#### 7.3. Usage Code
```python
from fightingcv_attention.attention.BAM import BAMBlock
import torch

input=torch.randn(50,512,7,7)
bam = BAMBlock(channel=512,reduction=16,dia_val=2)
output=bam(input)
print(output.shape)

```

***

### 8. ECA Attention Usage
#### 8.1. Paper
["ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks"](https://arxiv.org/pdf/1910.03151.pdf)

#### 8.2. Overview
![](./model/img/ECA.png)

#### 8.3. Usage Code
```python
from fightingcv_attention.attention.ECAAttention import ECAAttention
import torch

input=torch.randn(50,512,7,7)
eca = ECAAttention(kernel_size=3)
output=eca(input)
print(output.shape)

```
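
The usage above fixes `kernel_size=3`. The ECA-Net paper also describes choosing the 1D kernel size adaptively from the channel count, `k = |log2(C)/gamma + b/gamma|` rounded to the nearest odd number, with `gamma=2` and `b=1`. The helper below is a hypothetical sketch of that rule and is not part of this package.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive ECA kernel size from the paper's rule, forced to be odd."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1


for c in (64, 256, 512):
    print(c, eca_kernel_size(c))
```

The result could then be passed as `kernel_size` when constructing `ECAAttention`.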

***

### 9. DANet Attention Usage
#### 9.1. Paper
["Dual Attention Network for Scene Segmentation"](https://arxiv.org/pdf/1809.02983.pdf)

#### 9.2. Overview
![](./model/img/danet.png)

#### 9.3. Usage Code
```python
from fightingcv_attention.attention.DANet import DAModule
import torch

input=torch.randn(50,512,7,7)
danet=DAModule(d_model=512,kernel_size=3,H=7,W=7)
print(danet(input).shape)

```

***

### 10. Pyramid Split Attention Usage

#### 10.1. Paper
["EPSANet: An Efficient Pyramid Split Attention Block on Convolutional Neural Network"](https://arxiv.org/pdf/2105.14447.pdf)

#### 10.2. Overview
![](./model/img/psa.png)

#### 10.3. Usage Code
```python
from fightingcv_attention.attention.PSA import PSA
import torch

input=torch.randn(50,512,7,7)
psa = PSA(channel=512,reduction=8)
output=psa(input)
print(output.shape)

```

***


### 11. Efficient Multi-Head Self-Attention Usage

#### 11.1. Paper
["ResT: An Efficient Transformer for Visual Recognition"](https://arxiv.org/abs/2105.13677)

#### 11.2. Overview
![](./model/img/EMSA.png)

#### 11.3. Usage Code
```python

from fightingcv_attention.attention.EMSA import EMSA
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(50,64,512)
emsa = EMSA(d_model=512, d_k=512, d_v=512, h=8,H=8,W=8,ratio=2,apply_transform=True)
output=emsa(input,input,input)
print(output.shape)
    
```

***


### 12. Shuffle Attention Usage

#### 12.1. Paper
["SA-NET: SHUFFLE ATTENTION FOR DEEP CONVOLUTIONAL NEURAL NETWORKS"](https://arxiv.org/pdf/2102.00240.pdf)

#### 12.2. Overview
![](./model/img/ShuffleAttention.jpg)

#### 12.3. Usage Code
```python

from fightingcv_attention.attention.ShuffleAttention import ShuffleAttention
import torch
from torch import nn
from torch.nn import functional as F


input=torch.randn(50,512,7,7)
se = ShuffleAttention(channel=512,G=8)
output=se(input)
print(output.shape)

    
```


***


### 13. MUSE Attention Usage

#### 13.1. Paper
["MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning"](https://arxiv.org/abs/1911.09483)

#### 13.2. Overview
![](./model/img/MUSE.png)

#### 13.3. Usage Code
```python
from fightingcv_attention.attention.MUSEAttention import MUSEAttention
import torch
from torch import nn
from torch.nn import functional as F


input=torch.randn(50,49,512)
sa = MUSEAttention(d_model=512, d_k=512, d_v=512, h=8)
output=sa(input,input,input)
print(output.shape)

```

***


### 14. SGE Attention Usage

#### 14.1. Paper
[Spatial Group-wise Enhance: Improving Semantic Feature Learning in Convolutional Networks](https://arxiv.org/pdf/1905.09646.pdf)

#### 14.2. Overview
![](./model/img/SGE.png)

#### 14.3. Usage Code
```python
from fightingcv_attention.attention.SGE import SpatialGroupEnhance
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(50,512,7,7)
sge = SpatialGroupEnhance(groups=8)
output=sge(input)
print(output.shape)

```

***


### 15. A2 Attention Usage

#### 15.1. Paper
[A2-Nets: Double Attention Networks](https://arxiv.org/pdf/1810.11579.pdf)

#### 15.2. Overview
![](./model/img/A2.png)

#### 15.3. Usage Code
```python
from fightingcv_attention.attention.A2Atttention import DoubleAttention
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(50,512,7,7)
a2 = DoubleAttention(512,128,128,True)
output=a2(input)
print(output.shape)

```



### 16. AFT Attention Usage

#### 16.1. Paper
[An Attention Free Transformer](https://arxiv.org/pdf/2105.14103v1.pdf)

#### 16.2. Overview
![](./model/img/AFT.jpg)

#### 16.3. Usage Code
```python
from fightingcv_attention.attention.AFT import AFT_FULL
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(50,49,512)
aft_full = AFT_FULL(d_model=512, n=49)
output=aft_full(input)
print(output.shape)

```






### 17. Outlook Attention Usage

#### 17.1. Paper


[VOLO: Vision Outlooker for Visual Recognition](https://arxiv.org/abs/2106.13112)


#### 17.2. Overview
![](./model/img/OutlookAttention.png)

#### 17.3. Usage Code
```python
from fightingcv_attention.attention.OutlookAttention import OutlookAttention
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(50,28,28,512)
outlook = OutlookAttention(dim=512)
output=outlook(input)
print(output.shape)

```


***






### 18. ViP Attention Usage

#### 18.1. Paper


[Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition](https://arxiv.org/abs/2106.12368)


#### 18.2. Overview
![](./model/img/ViP.png)

#### 18.3. Usage Code
```python

from fightingcv_attention.attention.ViP import WeightedPermuteMLP
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(64,8,8,512)
seg_dim=8
vip=WeightedPermuteMLP(512,seg_dim)
out=vip(input)
print(out.shape)

```


***





### 19. CoAtNet Attention Usage

#### 19.1. Paper


[CoAtNet: Marrying Convolution and Attention for All Data Sizes](https://arxiv.org/abs/2106.04803)


#### 19.2. Overview
None


#### 19.3. Usage Code
```python

from fightingcv_attention.attention.CoAtNet import CoAtNet
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,3,224,224)
coatnet=CoAtNet(in_ch=3,image_size=224)
out=coatnet(input)
print(out.shape)

```


***






### 20. HaloNet Attention Usage

#### 20.1. Paper


[Scaling Local Self-Attention for Parameter Efficient Visual Backbones](https://arxiv.org/pdf/2103.12731.pdf)


#### 20.2. Overview

![](./model/img/HaloNet.png)

#### 20.3. Usage Code
```python

from fightingcv_attention.attention.HaloAttention import HaloAttention
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,512,8,8)
halo = HaloAttention(dim=512,
    block_size=2,
    halo_size=1,)
output=halo(input)
print(output.shape)

```
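
The `block_size` and `halo_size` arguments determine how many keys each query block can see. The following sketch works through the arithmetic for the 8x8 feature map used above, following the blocked local attention described in the HaloNet paper; the helper name is hypothetical and the repository's implementation is not inspected here.

```python
def halo_attention_sizes(height, width, block_size, halo_size):
    """Number of blocks, queries per block, and keys per block."""
    if height % block_size or width % block_size:
        raise ValueError("feature map must be divisible by block_size")
    num_blocks = (height // block_size) * (width // block_size)
    queries_per_block = block_size ** 2
    window = block_size + 2 * halo_size  # block extended by the halo on each side
    keys_per_block = window ** 2
    return num_blocks, queries_per_block, keys_per_block


print(halo_attention_sizes(8, 8, block_size=2, halo_size=1))  # (16, 4, 16)
```

With these settings each 2x2 block of queries attends to a 4x4 neighbourhood, so attention cost grows with the window size, not with the full feature map.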


***

### 21. Polarized Self-Attention Usage

#### 21.1. Paper

[Polarized Self-Attention: Towards High-quality Pixel-wise Regression](https://arxiv.org/abs/2107.00782)


#### 21.2. Overview

![](./model/img/PoSA.png)

#### 21.3. Usage Code
```python

from fightingcv_attention.attention.PolarizedSelfAttention import ParallelPolarizedSelfAttention,SequentialPolarizedSelfAttention
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,512,7,7)
psa = SequentialPolarizedSelfAttention(channel=512)
output=psa(input)
print(output.shape)


```


***


### 22. CoTAttention Usage

#### 22.1. Paper

[Contextual Transformer Networks for Visual Recognition---arXiv 2021.07.26](https://arxiv.org/abs/2107.12292) 


#### 22.2. Overview

![](./model/img/CoT.png)

#### 22.3. Usage Code
```python

from fightingcv_attention.attention.CoTAttention import CoTAttention
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(50,512,7,7)
cot = CoTAttention(dim=512,kernel_size=3)
output=cot(input)
print(output.shape)



```

***


### 23. Residual Attention Usage

#### 23.1. Paper

[Residual Attention: A Simple but Effective Method for Multi-Label Recognition---ICCV2021](https://arxiv.org/abs/2108.02456) 


#### 23.2. Overview

![](./model/img/ResAtt.png)

#### 23.3. Usage Code
```python

from fightingcv_attention.attention.ResidualAttention import ResidualAttention
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(50,512,7,7)
resatt = ResidualAttention(channel=512,num_class=1000,la=0.2)
output=resatt(input)
print(output.shape)



```

***



### 24. S2 Attention Usage

#### 24.1. Paper

[S²-MLPv2: Improved Spatial-Shift MLP Architecture for Vision---arXiv 2021.08.02](https://arxiv.org/abs/2108.01072) 


#### 24.2. Overview

![](./model/img/S2Attention.png)

#### 24.3. Usage Code
```python
from fightingcv_attention.attention.S2Attention import S2Attention
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(50,512,7,7)
s2att = S2Attention(channels=512)
output=s2att(input)
print(output.shape)

```

***



### 25. GFNet Attention Usage

#### 25.1. Paper

[Global Filter Networks for Image Classification---arXiv 2021.07.01](https://arxiv.org/abs/2107.00645) 


#### 25.2. Overview

![](./model/img/GFNet.jpg)

#### 25.3. Usage Code - Implemented by [Wenliang Zhao (Author)](https://scholar.google.com/citations?user=lyPWvuEAAAAJ&hl=en)

```python
from fightingcv_attention.attention.gfnet import GFNet
import torch
from torch import nn
from torch.nn import functional as F

x = torch.randn(1, 3, 224, 224)
gfnet = GFNet(embed_dim=384, img_size=224, patch_size=16, num_classes=1000)
out = gfnet(x)
print(out.shape)

```

***


### 26. TripletAttention Usage

#### 26.1. Paper

[Rotate to Attend: Convolutional Triplet Attention Module---WACV 2021](https://arxiv.org/abs/2010.03045) 

#### 26.2. Overview

![](./model/img/triplet.png)

#### 26.3. Usage Code - Implemented by [digantamisra98](https://github.com/digantamisra98)

```python
from fightingcv_attention.attention.TripletAttention import TripletAttention
import torch
from torch import nn
from torch.nn import functional as F
input=torch.randn(50,512,7,7)
triplet = TripletAttention()
output=triplet(input)
print(output.shape)
```


***


### 27. Coordinate Attention Usage

#### 27.1. Paper

[Coordinate Attention for Efficient Mobile Network Design---CVPR 2021](https://arxiv.org/abs/2103.02907)


#### 27.2. Overview

![](./model/img/CoordAttention.png)

#### 27.3. Usage Code - Implemented by [Andrew-Qibin](https://github.com/Andrew-Qibin)

```python
from fightingcv_attention.attention.CoordAttention import CoordAtt
import torch
from torch import nn
from torch.nn import functional as F

inp=torch.rand([2, 96, 56, 56])
inp_dim, oup_dim = 96, 96
reduction=32

coord_attention = CoordAtt(inp_dim, oup_dim, reduction=reduction)
output=coord_attention(inp)
print(output.shape)
```

***


### 28. MobileViT Attention Usage

#### 28.1. Paper

[MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer---ArXiv 2021.10.05](https://arxiv.org/abs/2110.02178)


#### 28.2. Overview

![](./model/img/MobileViTAttention.png)

#### 28.3. Usage Code

```python
from fightingcv_attention.attention.MobileViTAttention import MobileViTAttention
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    m=MobileViTAttention()
    input=torch.randn(1,3,49,49)
    output=m(input)
    print(output.shape)  #output:(1,3,49,49)
    
```

***


### 29. ParNet Attention Usage

#### 29.1. Paper

[Non-deep Networks---ArXiv 2021.10.20](https://arxiv.org/abs/2110.07641)


#### 29.2. Overview

![](./model/img/ParNet.png)

#### 29.3. Usage Code

```python
from fightingcv_attention.attention.ParNetAttention import *
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(50,512,7,7)
    pna = ParNetAttention(channel=512)
    output=pna(input)
    print(output.shape) #50,512,7,7
    
```

***


### 30. UFO Attention Usage

#### 30.1. Paper

[UFO-ViT: High Performance Linear Vision Transformer without Softmax---ArXiv 2021.09.29](https://arxiv.org/abs/2109.14382)


#### 30.2. Overview

![](./model/img/UFO.png)

#### 30.3. Usage Code

```python
from fightingcv_attention.attention.UFOAttention import *
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(50,49,512)
    ufo = UFOAttention(d_model=512, d_k=512, d_v=512, h=8)
    output=ufo(input,input,input)
    print(output.shape) #[50, 49, 512]
    
```

***

### 31. ACmix Attention Usage

#### 31.1. Paper

[On the Integration of Self-Attention and Convolution](https://arxiv.org/pdf/2111.14556.pdf)

#### 31.2. Usage Code

```python
from fightingcv_attention.attention.ACmixAttention import ACmix
import torch

if __name__ == '__main__':
    input=torch.randn(50,256,7,7)
    acmix = ACmix(in_planes=256, out_planes=256)
    output=acmix(input)
    print(output.shape)
    
```

### 32. MobileViTv2 Attention Usage

#### 32.1. Paper

[Separable Self-attention for Mobile Vision Transformers---ArXiv 2022.06.06](https://arxiv.org/abs/2206.02680)


#### 32.2. Overview

![](./model/img/MobileViTv2.png)

#### 32.3. Usage Code

```python
from fightingcv_attention.attention.MobileViTv2Attention import MobileViTv2Attention
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(50,49,512)
    sa = MobileViTv2Attention(d_model=512)
    output=sa(input)
    print(output.shape)
    
```

### 33. DAT Attention Usage

#### 33.1. Paper

[Vision Transformer with Deformable Attention---CVPR2022](https://arxiv.org/abs/2201.00520)

#### 33.2. Usage Code

```python
from fightingcv_attention.attention.DAT import DAT
import torch

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = DAT(
        img_size=224,
        patch_size=4,
        num_classes=1000,
        expansion=4,
        dim_stem=96,
        dims=[96, 192, 384, 768],
        depths=[2, 2, 6, 2],
        stage_spec=[['L', 'S'], ['L', 'S'], ['L', 'D', 'L', 'D', 'L', 'D'], ['L', 'D']],
        heads=[3, 6, 12, 24],
        window_sizes=[7, 7, 7, 7] ,
        groups=[-1, -1, 3, 6],
        use_pes=[False, False, True, True],
        dwc_pes=[False, False, False, False],
        strides=[-1, -1, 1, 1],
        sr_ratios=[-1, -1, -1, -1],
        offset_range_factor=[-1, -1, 2, 2],
        no_offs=[False, False, False, False],
        fixed_pes=[False, False, False, False],
        use_dwc_mlps=[False, False, False, False],
        use_conv_patches=False,
        drop_rate=0.0,
        attn_drop_rate=0.0,
        drop_path_rate=0.2,
    )
    output=model(input)
    print(output[0].shape)
    
```

### 34. CrossFormer Attention Usage

#### 34.1. Paper

[CROSSFORMER: A VERSATILE VISION TRANSFORMER HINGING ON CROSS-SCALE ATTENTION---ICLR 2022](https://arxiv.org/pdf/2108.00154.pdf)

#### 34.2. Usage Code

```python
from fightingcv_attention.attention.Crossformer import CrossFormer
import torch

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = CrossFormer(img_size=224,
        patch_size=[4, 8, 16, 32],
        in_chans= 3,
        num_classes=1000,
        embed_dim=48,
        depths=[2, 2, 6, 2],
        num_heads=[3, 6, 12, 24],
        group_size=[7, 7, 7, 7],
        mlp_ratio=4.,
        qkv_bias=True,
        qk_scale=None,
        drop_rate=0.0,
        drop_path_rate=0.1,
        ape=False,
        patch_norm=True,
        use_checkpoint=False,
        merge_size=[[2, 4], [2,4], [2, 4]]
    )
    output=model(input)
    print(output.shape)
    
```

### 35. MOATransformer Attention Usage

#### 35.1. Paper

[Aggregating Global Features into Local Vision Transformer](https://arxiv.org/abs/2201.12903)

#### 35.2. Usage Code

```python
from fightingcv_attention.attention.MOATransformer import MOATransformer
import torch

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = MOATransformer(
        img_size=224,
        patch_size=4,
        in_chans=3,
        num_classes=1000,
        embed_dim=96,
        depths=[2, 2, 6],
        num_heads=[3, 6, 12],
        window_size=14,
        mlp_ratio=4.,
        qkv_bias=True,
        qk_scale=None,
        drop_rate=0.0,
        drop_path_rate=0.1,
        ape=False,
        patch_norm=True,
        use_checkpoint=False
    )
    output=model(input)
    print(output.shape)
    
```

### 36. CrissCrossAttention Attention Usage

#### 36.1. Paper

[CCNet: Criss-Cross Attention for Semantic Segmentation](https://arxiv.org/abs/1811.11721)

#### 36.2. Usage Code

```python
from fightingcv_attention.attention.CrissCrossAttention import CrissCrossAttention
import torch

if __name__ == '__main__':
    input=torch.randn(3, 64, 7, 7)
    model = CrissCrossAttention(64)
    outputs = model(input)
    print(outputs.shape)
    
```

### 37. Axial_attention Attention Usage

#### 37.1. Paper

[Axial Attention in Multidimensional Transformers](https://arxiv.org/abs/1912.12180)

#### 37.2. Usage Code

```python
from fightingcv_attention.attention.Axial_attention import AxialImageTransformer
import torch

if __name__ == '__main__':
    input=torch.randn(3, 128, 7, 7)
    model = AxialImageTransformer(
        dim = 128,
        depth = 12,
        reversible = True
    )
    outputs = model(input)
    print(outputs.shape)
    
```

***


# Backbone Series

- Pytorch implementation of ["Deep Residual Learning for Image Recognition---CVPR2016 Best Paper"](https://arxiv.org/pdf/1512.03385.pdf)

- Pytorch implementation of ["Aggregated Residual Transformations for Deep Neural Networks---CVPR2017"](https://arxiv.org/abs/1611.05431v2)

- Pytorch implementation of [MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer---ArXiv 2021.10.05](https://arxiv.org/abs/2110.02178)

- Pytorch implementation of [Patches Are All You Need?---ICLR2022 (Under Review)](https://openreview.net/forum?id=TVHS5Y4dNvM)

- Pytorch implementation of [Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer---ArXiv 2021.06.07](https://arxiv.org/abs/2106.03650)

- Pytorch implementation of [ConTNet: Why not use convolution and transformer at the same time?---ArXiv 2021.04.27](https://arxiv.org/abs/2104.13497)

- Pytorch implementation of [Vision Transformers with Hierarchical Attention---ArXiv 2022.06.15](https://arxiv.org/abs/2106.03180)

- Pytorch implementation of [Co-Scale Conv-Attentional Image Transformers---ArXiv 2021.08.26](https://arxiv.org/abs/2104.06399)

- Pytorch implementation of [Conditional Positional Encodings for Vision Transformers](https://arxiv.org/abs/2102.10882)

- Pytorch implementation of [Rethinking Spatial Dimensions of Vision Transformers---ICCV 2021](https://arxiv.org/abs/2103.16302)

- Pytorch implementation of [CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification---ICCV 2021](https://arxiv.org/abs/2103.14899)

- Pytorch implementation of [Transformer in Transformer---NeurIPS 2021](https://arxiv.org/abs/2103.00112)

- Pytorch implementation of [DeepViT: Towards Deeper Vision Transformer](https://arxiv.org/abs/2103.11886)

- Pytorch implementation of [Incorporating Convolution Designs into Visual Transformers](https://arxiv.org/abs/2103.11816)

- Pytorch implementation of [ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases](https://arxiv.org/abs/2103.10697)

- Pytorch implementation of [Augmenting Convolutional networks with attention-based aggregation](https://arxiv.org/abs/2112.13692)

- Pytorch implementation of [Going deeper with Image Transformers---ICCV 2021 (Oral)](https://arxiv.org/abs/2103.17239)

- Pytorch implementation of [Training data-efficient image transformers & distillation through attention---ICML 2021](https://arxiv.org/abs/2012.12877)

- Pytorch implementation of [LeViT: a Vision Transformer in ConvNet’s Clothing for Faster Inference](https://arxiv.org/abs/2104.01136)

- Pytorch implementation of [VOLO: Vision Outlooker for Visual Recognition](https://arxiv.org/abs/2106.13112)

- Pytorch implementation of [Container: Context Aggregation Network---NeurIPS 2021](https://arxiv.org/abs/2106.01401)

- Pytorch implementation of [CMT: Convolutional Neural Networks Meet Vision Transformers---CVPR 2022](https://arxiv.org/abs/2107.06263)

- Pytorch implementation of [Vision Transformer with Deformable Attention---CVPR 2022](https://arxiv.org/abs/2201.00520)

- Pytorch implementation of [EfficientFormer: Vision Transformers at MobileNet Speed](https://arxiv.org/abs/2206.01191)


### 1. ResNet Usage
#### 1.1. Paper
["Deep Residual Learning for Image Recognition---CVPR2016 Best Paper"](https://arxiv.org/pdf/1512.03385.pdf)

#### 1.2. Overview
![](./model/img/resnet.png)
![](./model/img/resnet2.jpg)

#### 1.3. Usage Code
```python

from fightingcv_attention.backbone.resnet import ResNet50,ResNet101,ResNet152
import torch
if __name__ == '__main__':
    input=torch.randn(50,3,224,224)
    resnet50=ResNet50(1000)
    # resnet101=ResNet101(1000)
    # resnet152=ResNet152(1000)
    out=resnet50(input)
    print(out.shape)

```


### 2. ResNeXt Usage
#### 2.1. Paper

["Aggregated Residual Transformations for Deep Neural Networks---CVPR2017"](https://arxiv.org/abs/1611.05431v2)

#### 2.2. Overview
![](./model/img/resnext.png)

#### 2.3. Usage Code
```python

from fightingcv_attention.backbone.resnext import ResNeXt50,ResNeXt101,ResNeXt152
import torch

if __name__ == '__main__':
    input=torch.randn(50,3,224,224)
    resnext50=ResNeXt50(1000)
    # resnext101=ResNeXt101(1000)
    # resnext152=ResNeXt152(1000)
    out=resnext50(input)
    print(out.shape)


```



### 3. MobileViT Usage
#### 3.1. Paper

[MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer---ArXiv 2021.10.05](https://arxiv.org/abs/2110.02178)

#### 3.2. Overview
![](./model/img/mobileViT.jpg)

#### 3.3. Usage Code
```python

from fightingcv_attention.backbone.MobileViT import *
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)

    ### mobilevit_xxs
    mvit_xxs=mobilevit_xxs()
    out=mvit_xxs(input)
    print(out.shape)

    ### mobilevit_xs
    mvit_xs=mobilevit_xs()
    out=mvit_xs(input)
    print(out.shape)


    ### mobilevit_s
    mvit_s=mobilevit_s()
    out=mvit_s(input)
    print(out.shape)

```





### 4. ConvMixer Usage
#### 4.1. Paper
[Patches Are All You Need?---ICLR2022 (Under Review)](https://openreview.net/forum?id=TVHS5Y4dNvM)
#### 4.2. Overview
![](./model/img/ConvMixer.png)

#### 4.3. Usage Code
```python

from fightingcv_attention.backbone.ConvMixer import *
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    x=torch.randn(1,3,224,224)
    convmixer=ConvMixer(dim=512,depth=12)
    out=convmixer(x)
    print(out.shape)  #[1, 1000]


```

### 5. ShuffleTransformer Usage
#### 5.1. Paper
[Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer](https://arxiv.org/pdf/2106.03650.pdf)

#### 5.2. Usage Code
```python

from fightingcv_attention.backbone.ShuffleTransformer import ShuffleTransformer
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    sft = ShuffleTransformer()
    output=sft(input)
    print(output.shape)


```

### 6. ConTNet Usage
#### 6.1. Paper
[ConTNet: Why not use convolution and transformer at the same time?](https://arxiv.org/abs/2104.13497)

#### 6.2. Usage Code
```python

# build_model is called below, so it is imported alongside ConTNet
from fightingcv_attention.backbone.ConTNet import ConTNet, build_model
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == "__main__":
    model = build_model(use_avgdown=True, relative=True, qkv_bias=True, pre_norm=True)
    input = torch.randn(1, 3, 224, 224)
    out = model(input)
    print(out.shape)


```

### 7. HATNet Usage
#### 7.1. Paper
[Vision Transformers with Hierarchical Attention](https://arxiv.org/abs/2106.03180)

#### 7.2. Usage Code
```python

from fightingcv_attention.backbone.HATNet import HATNet
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    hat = HATNet(dims=[48, 96, 240, 384], head_dim=48, expansions=[8, 8, 4, 4],
        grid_sizes=[8, 7, 7, 1], ds_ratios=[8, 4, 2, 1], depths=[2, 2, 6, 3])
    output=hat(input)
    print(output.shape)


```

### 8. CoaT Usage
#### 8.1. Paper
[Co-Scale Conv-Attentional Image Transformers](https://arxiv.org/abs/2104.06399)

#### 8.2. Usage Code
```python

from fightingcv_attention.backbone.CoaT import CoaT
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = CoaT(patch_size=4, embed_dims=[152, 152, 152, 152], serial_depths=[2, 2, 2, 2], parallel_depth=6, num_heads=8, mlp_ratios=[4, 4, 4, 4])
    output=model(input)
    print(output.shape) # torch.Size([1, 1000])

```

### 9. PVT Usage
#### 9.1. Paper
[PVT v2: Improved Baselines with Pyramid Vision Transformer](https://arxiv.org/pdf/2106.13797.pdf)

#### 9.2. Usage Code
```python

from fightingcv_attention.backbone.PVT import PyramidVisionTransformer
import torch
from torch import nn
from functools import partial  # needed for norm_layer below

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = PyramidVisionTransformer(
        patch_size=4, embed_dims=[64, 128, 320, 512], num_heads=[1, 2, 5, 8], mlp_ratios=[8, 8, 4, 4], qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6), depths=[2, 2, 2, 2], sr_ratios=[8, 4, 2, 1])
    output=model(input)
    print(output.shape)

```


### 10. CPVT Usage
#### 10.1. Paper
[Conditional Positional Encodings for Vision Transformers](https://arxiv.org/abs/2102.10882)

#### 10.2. Usage Code
```python

from fightingcv_attention.backbone.CPVT import CPVTV2
import torch
from torch import nn
from functools import partial  # needed for norm_layer below

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = CPVTV2(
        patch_size=4, embed_dims=[64, 128, 320, 512], num_heads=[1, 2, 5, 8], mlp_ratios=[8, 8, 4, 4], qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6), depths=[3, 4, 6, 3], sr_ratios=[8, 4, 2, 1])
    output=model(input)
    print(output.shape)

```

### 11. PIT Usage
#### 11.1. Paper
[Rethinking Spatial Dimensions of Vision Transformers](https://arxiv.org/abs/2103.16302)

#### 11.2. Usage Code
```python

from fightingcv_attention.backbone.PIT import PoolingTransformer
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = PoolingTransformer(
        image_size=224,
        patch_size=14,
        stride=7,
        base_dims=[64, 64, 64],
        depth=[3, 6, 4],
        heads=[4, 8, 16],
        mlp_ratio=4
    )
    output=model(input)
    print(output.shape)

```

### 12. CrossViT Usage
#### 12.1. Paper
[CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification](https://arxiv.org/abs/2103.14899)

#### 12.2. Usage Code
```python

from fightingcv_attention.backbone.CrossViT import VisionTransformer
import torch
from torch import nn
from functools import partial  # needed for norm_layer below

if __name__ == "__main__":
    input=torch.randn(1,3,224,224)
    model = VisionTransformer(
        img_size=[240, 224],
        patch_size=[12, 16], 
        embed_dim=[192, 384], 
        depth=[[1, 4, 0], [1, 4, 0], [1, 4, 0]],
        num_heads=[6, 6], 
        mlp_ratio=[4, 4, 1], 
        qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6)
    )
    output=model(input)
    print(output.shape)

```

### 13. TnT Usage
#### 13.1. Paper
[Transformer in Transformer](https://arxiv.org/abs/2103.00112)

#### 13.2. Usage Code
```python

from fightingcv_attention.backbone.TnT import TNT
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = TNT(
        img_size=224, 
        patch_size=16, 
        outer_dim=384, 
        inner_dim=24, 
        depth=12,
        outer_num_heads=6, 
        inner_num_heads=4, 
        qkv_bias=False,
        inner_stride=4)
    output=model(input)
    print(output.shape)

```

### 14. DViT Usage
#### 14.1. Paper
[DeepViT: Towards Deeper Vision Transformer](https://arxiv.org/abs/2103.11886)

#### 14.2. Usage Code
```python

from fightingcv_attention.backbone.DViT import DeepVisionTransformer
import torch
from torch import nn
from functools import partial  # needed for norm_layer below

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = DeepVisionTransformer(
        patch_size=16, embed_dim=384, 
        depth=[False] * 16, 
        apply_transform=[False] * 0 + [True] * 32, 
        num_heads=12, 
        mlp_ratio=3, 
        qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6),
        )
    output=model(input)
    print(output.shape)

```

### 15. CeiT Usage
#### 15.1. Paper
[Incorporating Convolution Designs into Visual Transformers](https://arxiv.org/abs/2103.11816)

#### 15.2. Usage Code
```python

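# Image2Tokens is used as the hybrid backbone below, so it is imported too
from fightingcv_attention.backbone.CeiT import CeIT, Image2Tokens
import torch
from torch import nn
from functools import partial  # needed for norm_layer below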

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = CeIT(
        hybrid_backbone=Image2Tokens(),
        patch_size=4, 
        embed_dim=192, 
        depth=12, 
        num_heads=3, 
        mlp_ratio=4, 
        qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6)
        )
    output=model(input)
    print(output.shape)

```

### 16. ConViT Usage
#### 16.1. Paper
[ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases](https://arxiv.org/abs/2103.10697)

#### 16.2. Usage Code
```python

from fightingcv_attention.backbone.ConViT import VisionTransformer
import torch
from torch import nn
from functools import partial  # needed for norm_layer below

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = VisionTransformer(
        num_heads=16,
        norm_layer=partial(nn.LayerNorm, eps=1e-6)
        )
    output=model(input)
    print(output.shape)

```

### 17. CaiT Usage
#### 17.1. Paper
[Going deeper with Image Transformers](https://arxiv.org/abs/2103.17239)

#### 17.2. Usage Code
```python

from fightingcv_attention.backbone.CaiT import CaiT
import torch
from torch import nn
from functools import partial  # needed for norm_layer below

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = CaiT(
        img_size= 224,
        patch_size=16, 
        embed_dim=192, 
        depth=24, 
        num_heads=4, 
        mlp_ratio=4, 
        qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6),
        init_scale=1e-5,
        depth_token_only=2
        )
    output=model(input)
    print(output.shape)

```

### 18. PatchConvnet Usage
#### 18.1. Paper
[Augmenting Convolutional networks with attention-based aggregation](https://arxiv.org/abs/2112.13692)

#### 18.2. Usage Code
```python

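# ConvStem and Conv_blocks_se are passed to the constructor below, so they are imported too
from fightingcv_attention.backbone.PatchConvnet import PatchConvnet, ConvStem, Conv_blocks_se
import torch
from torch import nn
from functools import partial  # needed for norm_layer below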

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = PatchConvnet(
        patch_size=16,
        embed_dim=384,
        depth=60,
        num_heads=1,
        qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6),
        Patch_layer=ConvStem,
        Attention_block=Conv_blocks_se,
        depth_token_only=1,
        mlp_ratio_clstk=3.0,
    )
    output=model(input)
    print(output.shape)

```

### 19. DeiT Usage
#### 19.1. Paper
[Training data-efficient image transformers & distillation through attention](https://arxiv.org/abs/2012.12877)

#### 19.2. Usage Code
```python

from fightingcv_attention.backbone.DeiT import DistilledVisionTransformer
import torch
from torch import nn
from functools import partial  # needed for norm_layer below

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = DistilledVisionTransformer(
        patch_size=16, 
        embed_dim=384, 
        depth=12, 
        num_heads=6, 
        mlp_ratio=4, 
        qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6)
        )
    output=model(input)
    print(output[0].shape)

```

### 20. LeViT Usage
#### 20.1. Paper
[LeViT: a Vision Transformer in ConvNet’s Clothing for Faster Inference](https://arxiv.org/abs/2104.01136)

#### 20.2. Usage Code
```python

from fightingcv_attention.backbone.LeViT import *
import torch
from torch import nn

if __name__ == '__main__':
    for name in specification:
        input=torch.randn(1,3,224,224)
        model = globals()[name](fuse=True, pretrained=False)
        model.eval()
        output = model(input)
        print(output.shape)

```

### 21. VOLO Usage
#### 21.1. Paper
[VOLO: Vision Outlooker for Visual Recognition](https://arxiv.org/abs/2106.13112)

#### 21.2. Usage Code
```python

from fightingcv_attention.backbone.VOLO import VOLO
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = VOLO([4, 4, 8, 2],
                 embed_dims=[192, 384, 384, 384],
                 num_heads=[6, 12, 12, 12],
                 mlp_ratios=[3, 3, 3, 3],
                 downsamples=[True, False, False, False],
                 outlook_attention=[True, False, False, False ],
                 post_layers=['ca', 'ca'],
                 )
    output=model(input)
    print(output[0].shape)

```

### 22. Container Usage
#### 22.1. Paper
[Container: Context Aggregation Network](https://arxiv.org/abs/2106.01401)

#### 22.2. Usage Code
```python

from fightingcv_attention.backbone.Container import VisionTransformer
import torch
from torch import nn
from functools import partial  # needed for norm_layer below

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = VisionTransformer(
        img_size=[224, 56, 28, 14], 
        patch_size=[4, 2, 2, 2], 
        embed_dim=[64, 128, 320, 512], 
        depth=[3, 4, 8, 3], 
        num_heads=16, 
        mlp_ratio=[8, 8, 4, 4], 
        qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6))
    output=model(input)
    print(output.shape)

```

### 23. CMT Usage
#### 23.1. Paper
[CMT: Convolutional Neural Networks Meet Vision Transformers](https://arxiv.org/abs/2107.06263)

#### 23.2. Usage Code
```python

from fightingcv_attention.backbone.CMT import CMT_Tiny
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = CMT_Tiny()
    output=model(input)
    print(output[0].shape)

```

### 24. EfficientFormer Usage
#### 24.1. Paper
[EfficientFormer: Vision Transformers at MobileNet Speed](https://arxiv.org/abs/2206.01191)

#### 24.2. Usage Code
```python

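# The depth and width tables are indexed below, so they are imported alongside the model
from fightingcv_attention.backbone.EfficientFormer import EfficientFormer, EfficientFormer_depth, EfficientFormer_width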
import torch
from torch import nn

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = EfficientFormer(
        layers=EfficientFormer_depth['l1'],
        embed_dims=EfficientFormer_width['l1'],
        downsamples=[True, True, True, True],
        vit_num=1,
    )
    output=model(input)
    print(output[0].shape)

```






# MLP Series

- Pytorch implementation of ["RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition---arXiv 2021.05.05"](https://arxiv.org/pdf/2105.01883v1.pdf)

- Pytorch implementation of ["MLP-Mixer: An all-MLP Architecture for Vision---arXiv 2021.05.17"](https://arxiv.org/pdf/2105.01601.pdf)

- Pytorch implementation of ["ResMLP: Feedforward networks for image classification with data-efficient training---arXiv 2021.05.07"](https://arxiv.org/pdf/2105.03404.pdf)

- Pytorch implementation of ["Pay Attention to MLPs---arXiv 2021.05.17"](https://arxiv.org/abs/2105.08050)


- Pytorch implementation of ["Sparse MLP for Image Recognition: Is Self-Attention Really Necessary?---arXiv 2021.09.12"](https://arxiv.org/abs/2109.05422)

### 1. RepMLP Usage
#### 1.1. Paper
["RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition"](https://arxiv.org/pdf/2105.01883v1.pdf)

#### 1.2. Overview
![](./model/img/repmlp.png)

#### 1.3. Usage Code
```python
from fightingcv_attention.mlp.repmlp import RepMLP
import torch
from torch import nn

N=4 #batch size
C=512 #input dim
O=1024 #output dim
H=14 #image height
W=14 #image width
h=7 #patch height
w=7 #patch width
fc1_fc2_reduction=1 #reduction ratio
fc3_groups=8 # groups
repconv_kernels=[1,3,5,7] #kernel list
repmlp=RepMLP(C,O,H,W,h,w,fc1_fc2_reduction,fc3_groups,repconv_kernels=repconv_kernels)
x=torch.randn(N,C,H,W)
repmlp.eval()
for module in repmlp.modules():
    if isinstance(module, nn.BatchNorm2d) or isinstance(module, nn.BatchNorm1d):
        nn.init.uniform_(module.running_mean, 0, 0.1)
        nn.init.uniform_(module.running_var, 0, 0.1)
        nn.init.uniform_(module.weight, 0, 0.1)
        nn.init.uniform_(module.bias, 0, 0.1)

#training result
out=repmlp(x)
#inference result
repmlp.switch_to_deploy()
deployout = repmlp(x)

print(((deployout-out)**2).sum())
```

### 2. MLP-Mixer Usage
#### 2.1. Paper
["MLP-Mixer: An all-MLP Architecture for Vision"](https://arxiv.org/pdf/2105.01601.pdf)

#### 2.2. Overview
![](./model/img/mlpmixer.png)

#### 2.3. Usage Code
```python
from fightingcv_attention.mlp.mlp_mixer import MlpMixer
import torch
mlp_mixer=MlpMixer(num_classes=1000,num_blocks=10,patch_size=10,tokens_hidden_dim=32,channels_hidden_dim=1024,tokens_mlp_dim=16,channels_mlp_dim=1024)
input=torch.randn(50,3,40,40)
output=mlp_mixer(input)
print(output.shape)
```

***

### 3. ResMLP Usage
#### 3.1. Paper
["ResMLP: Feedforward networks for image classification with data-efficient training"](https://arxiv.org/pdf/2105.03404.pdf)

#### 3.2. Overview
![](./model/img/resmlp.png)

#### 3.3. Usage Code
```python
from fightingcv_attention.mlp.resmlp import ResMLP
import torch

input=torch.randn(50,3,14,14)
resmlp=ResMLP(dim=128,image_size=14,patch_size=7,class_num=1000)
out=resmlp(input)
print(out.shape) #the last dimension is class_num
```

***

### 4. gMLP Usage
#### 4.1. Paper
["Pay Attention to MLPs"](https://arxiv.org/abs/2105.08050)

#### 4.2. Overview
![](./model/img/gMLP.jpg)

#### 4.3. Usage Code
```python
from fightingcv_attention.mlp.g_mlp import gMLP
import torch

num_tokens=10000
bs=50
len_sen=49
num_layers=6
input=torch.randint(num_tokens,(bs,len_sen)) #bs,len_sen
gmlp = gMLP(num_tokens=num_tokens,len_sen=len_sen,dim=512,d_ff=1024)
output=gmlp(input)
print(output.shape)
```

***

### 5. sMLP Usage
#### 5.1. Paper
["Sparse MLP for Image Recognition: Is Self-Attention Really Necessary?"](https://arxiv.org/abs/2109.05422)

#### 5.2. Overview
![](./model/img/sMLP.jpg)

#### 5.3. Usage Code
```python
from fightingcv_attention.mlp.sMLP_block import sMLPBlock
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(50,3,224,224)
    smlp=sMLPBlock(h=224,w=224)
    out=smlp(input)
    print(out.shape)
```

### 6. vip-mlp Usage
#### 6.1. Paper
["Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition"](https://arxiv.org/abs/2106.12368)

#### 6.2. Usage Code
```python
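import importlib
import torch
from torch import nn
from torch.nn import functional as F

# A hyphen is not valid in an `import` statement, so the `vip-mlp` module is
# loaded by name instead (assuming the packaged file keeps that name).
vip_mlp = importlib.import_module('fightingcv_attention.mlp.vip-mlp')
VisionPermutator = vip_mlp.VisionPermutator
WeightedPermuteMLP = vip_mlp.WeightedPermuteMLP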

if __name__ == '__main__':
    input=torch.randn(1,3,224,224)
    model = VisionPermutator(
        layers=[4, 3, 8, 3], 
        embed_dims=[384, 384, 384, 384], 
        patch_size=14, 
        transitions=[False, False, False, False],
        segment_dim=[16, 16, 16, 16], 
        mlp_ratios=[3, 3, 3, 3], 
        mlp_fn=WeightedPermuteMLP
    )
    output=model(input)
    print(output.shape)
```


# Re-Parameter Series

- Pytorch implementation of ["RepVGG: Making VGG-style ConvNets Great Again---CVPR2021"](https://arxiv.org/abs/2101.03697)

- Pytorch implementation of ["ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks---ICCV2019"](https://arxiv.org/abs/1908.03930)

- Pytorch implementation of ["Diverse Branch Block: Building a Convolution as an Inception-like Unit---CVPR2021"](https://arxiv.org/abs/2103.13425)


***

### 1. RepVGG Usage
#### 1.1. Paper
["RepVGG: Making VGG-style ConvNets Great Again"](https://arxiv.org/abs/2101.03697)

#### 1.2. Overview
![](./model/img/repvgg.png)

#### 1.3. Usage Code
```python

from fightingcv_attention.rep.repvgg import RepBlock
import torch


input=torch.randn(50,512,49,49)
repblock=RepBlock(512,512)
repblock.eval()
out=repblock(input)
repblock._switch_to_deploy()
out2=repblock(input)
print('difference between vgg and repvgg')
print(((out2-out)**2).sum())
```



***

### 2. ACNet Usage
#### 2.1. Paper
["ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks"](https://arxiv.org/abs/1908.03930)

#### 2.2. Overview
![](./model/img/acnet.png)

#### 2.3. Usage Code
```python
from fightingcv_attention.rep.acnet import ACNet
import torch
from torch import nn

input=torch.randn(50,512,49,49)
acnet=ACNet(512,512)
acnet.eval()
out=acnet(input)
acnet._switch_to_deploy()
out2=acnet(input)
print('difference:')
print(((out2-out)**2).sum())

```



***

### 3. Diverse Branch Block Usage
#### 3.1. Paper
["Diverse Branch Block: Building a Convolution as an Inception-like Unit"](https://arxiv.org/abs/2103.13425)

#### 3.2. Overview
![](./model/img/ddb.png)

#### 3.3. Usage Code
##### 3.3.1 Transform I
```python
from fightingcv_attention.rep.ddb import transI_conv_bn
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,64,7,7)
#conv+bn
conv1=nn.Conv2d(64,64,3,padding=1)
bn1=nn.BatchNorm2d(64)
bn1.eval()
out1=bn1(conv1(input))

#conv_fuse
conv_fuse=nn.Conv2d(64,64,3,padding=1)
conv_fuse.weight.data,conv_fuse.bias.data=transI_conv_bn(conv1,bn1)
out2=conv_fuse(input)

print("difference:",((out2-out1)**2).sum().item())
```
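The conv+bn fusion checked above rests on a short piece of arithmetic. The following is a minimal, dependency-free sketch of that arithmetic for a single output channel, assuming eval-mode batch norm with fixed running statistics. `fuse_conv_bn` is a hypothetical helper written for illustration, not the library's `transI_conv_bn`, whose internals may differ.

```python
import math

def fuse_conv_bn(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold eval-mode batch norm into one output channel's weights and bias."""
    scale = gamma / math.sqrt(var + eps)
    # BN(w.x + b) = scale * (w.x + b - mean) + beta = (scale * w).x + (b - mean) * scale + beta
    return [w * scale for w in weight], (bias - mean) * scale + beta

# One output channel of a convolution is a dot product plus a bias.
weight, bias = [0.2, -0.5, 1.0], 0.1
gamma, beta, mean, var = 1.5, -0.3, 0.4, 2.0
x = [1.0, 2.0, 3.0]

# Reference path: convolution followed by batch norm.
conv_out = sum(w * v for w, v in zip(weight, x)) + bias
bn_out = gamma * (conv_out - mean) / math.sqrt(var + 1e-5) + beta

# Fused path: a single convolution with rescaled weights and bias.
fused_w, fused_b = fuse_conv_bn(weight, bias, gamma, beta, mean, var)
fused_out = sum(w * v for w, v in zip(fused_w, x)) + fused_b

print("difference:", abs(fused_out - bn_out))
```

The same scale and shift are applied independently to every output channel, which is why the fused layer can reuse the shape of the original convolution.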

##### 3.3.2 Transform II
```python
from fightingcv_attention.rep.ddb import transII_conv_branch
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,64,7,7)

#conv+conv
conv1=nn.Conv2d(64,64,3,padding=1)
conv2=nn.Conv2d(64,64,3,padding=1)
out1=conv1(input)+conv2(input)

#conv_fuse
conv_fuse=nn.Conv2d(64,64,3,padding=1)
conv_fuse.weight.data,conv_fuse.bias.data=transII_conv_branch(conv1,conv2)
out2=conv_fuse(input)

print("difference:",((out2-out1)**2).sum().item())
```

##### 3.3.3 Transform III
```python
from fightingcv_attention.rep.ddb import transIII_conv_sequential
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,64,7,7)

#conv+conv
conv1=nn.Conv2d(64,64,1,padding=0,bias=False)
conv2=nn.Conv2d(64,64,3,padding=1,bias=False)
out1=conv2(conv1(input))


#conv_fuse
conv_fuse=nn.Conv2d(64,64,3,padding=1,bias=False)
conv_fuse.weight.data=transIII_conv_sequential(conv1,conv2)
out2=conv_fuse(input)

print("difference:",((out2-out1)**2).sum().item())
```

##### 3.3.4 Transform IV
```python
from fightingcv_attention.rep.ddb import transIV_conv_concat
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,64,7,7)

#conv+conv
conv1=nn.Conv2d(64,32,3,padding=1)
conv2=nn.Conv2d(64,32,3,padding=1)
out1=torch.cat([conv1(input),conv2(input)],dim=1)

#conv_fuse
conv_fuse=nn.Conv2d(64,64,3,padding=1)
conv_fuse.weight.data,conv_fuse.bias.data=transIV_conv_concat(conv1,conv2)
out2=conv_fuse(input)

print("difference:",((out2-out1)**2).sum().item())
```

##### 3.3.5 Transform V
```python
from fightingcv_attention.rep.ddb import transV_avg
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,64,7,7)

avg=nn.AvgPool2d(kernel_size=3,stride=1)
out1=avg(input)

conv=transV_avg(64,3)
out2=conv(input)

print("difference:",((out2-out1)**2).sum().item())
```
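Average pooling can be written as a convolution because a k x k mean is a weighted sum with every weight equal to 1/k^2, applied to each channel separately. The sketch below illustrates this with plain Python lists on a tiny input. All three helpers are hypothetical names introduced here, and the library's `transV_avg` may build its kernel differently.

```python
def avg_pool_as_conv_kernel(channels, k):
    """kernel[o][i] is a k x k grid: 1/k^2 when o == i, zeros otherwise."""
    fill = 1.0 / (k * k)
    return [[[[fill if o == i else 0.0 for _ in range(k)] for _ in range(k)]
             for i in range(channels)] for o in range(channels)]

def conv_valid(x, kernel):
    """Stride-1, no-padding convolution. x: C x H x W; kernel: O x C x k x k."""
    k = len(kernel[0][0])
    h_out, w_out = len(x[0]) - k + 1, len(x[0][0]) - k + 1
    return [[[sum(kernel[o][i][a][b] * x[i][r + a][c + b]
                  for i in range(len(x)) for a in range(k) for b in range(k))
              for c in range(w_out)] for r in range(h_out)]
            for o in range(len(kernel))]

def avg_pool(x, k):
    """Stride-1, no-padding average pooling applied to each channel."""
    h_out, w_out = len(x[0]) - k + 1, len(x[0][0]) - k + 1
    return [[[sum(ch[r + a][c + b] for a in range(k) for b in range(k)) / (k * k)
              for c in range(w_out)] for r in range(h_out)] for ch in x]

# Two channels of a 4x4 input with distinct values per channel.
x = [[[float(r * 4 + c + 10 * ch) for c in range(4)] for r in range(4)] for ch in range(2)]
pooled = avg_pool(x, 3)
conved = conv_valid(x, avg_pool_as_conv_kernel(2, 3))

diff = sum((p - q) ** 2 for pc, qc in zip(pooled, conved)
           for pr, qr in zip(pc, qc) for p, q in zip(pr, qr))
print("difference:", diff)
```

Keeping the off-diagonal channel pairs at zero is what stops the convolution from mixing channels, which pooling never does.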


##### 3.3.6 Transform VI
```python
from fightingcv_attention.rep.ddb import transVI_conv_scale
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,64,7,7)

#conv+conv
conv1x1=nn.Conv2d(64,64,1)
conv1x3=nn.Conv2d(64,64,(1,3),padding=(0,1))
conv3x1=nn.Conv2d(64,64,(3,1),padding=(1,0))
out1=conv1x1(input)+conv1x3(input)+conv3x1(input)

#conv_fuse
conv_fuse=nn.Conv2d(64,64,3,padding=1)
conv_fuse.weight.data,conv_fuse.bias.data=transVI_conv_scale(conv1x1,conv1x3,conv3x1)
out2=conv_fuse(input)

print("difference:",((out2-out1)**2).sum().item())
```





# Convolution Series

- Pytorch implementation of ["MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications---CVPR2017"](https://arxiv.org/abs/1704.04861)

- Pytorch implementation of ["Efficientnet: Rethinking model scaling for convolutional neural networks---PMLR2019"](http://proceedings.mlr.press/v97/tan19a.html)

- Pytorch implementation of ["Involution: Inverting the Inherence of Convolution for Visual Recognition---CVPR2021"](https://arxiv.org/abs/2103.06255)

- Pytorch implementation of ["Dynamic Convolution: Attention over Convolution Kernels---CVPR2020 Oral"](https://arxiv.org/abs/1912.03458)

- Pytorch implementation of ["CondConv: Conditionally Parameterized Convolutions for Efficient Inference---NeurIPS2019"](https://arxiv.org/abs/1904.04971)

***

### 1. Depthwise Separable Convolution Usage
#### 1.1. Paper
["MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications"](https://arxiv.org/abs/1704.04861)

#### 1.2. Overview
![](./model/img/DepthwiseSeparableConv.png)

#### 1.3. Usage Code
```python
from fightingcv_attention.conv.DepthwiseSeparableConvolution import DepthwiseSeparableConvolution
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,3,224,224)
dsconv=DepthwiseSeparableConvolution(3,64)
out=dsconv(input)
print(out.shape)
```

***


### 2. MBConv Usage
#### 2.1. Paper
["Efficientnet: Rethinking model scaling for convolutional neural networks"](http://proceedings.mlr.press/v97/tan19a.html)

#### 2.2. Overview
![](./model/img/MBConv.jpg)

#### 2.3. Usage Code
```python
from fightingcv_attention.conv.MBConv import MBConvBlock
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,3,224,224)
mbconv=MBConvBlock(ksize=3,input_filters=3,output_filters=512,image_size=224)
out=mbconv(input)
print(out.shape)


```

***


### 3. Involution Usage
#### 3.1. Paper
["Involution: Inverting the Inherence of Convolution for Visual Recognition"](https://arxiv.org/abs/2103.06255)

#### 3.2. Overview
![](./model/img/Involution.png)

#### 3.3. Usage Code
```python
from fightingcv_attention.conv.Involution import Involution
import torch
from torch import nn
from torch.nn import functional as F

input=torch.randn(1,4,64,64)
involution=Involution(kernel_size=3,in_channel=4,stride=2)
out=involution(input)
print(out.shape)
```

***


### 4. DynamicConv Usage
#### 4.1. Paper
["Dynamic Convolution: Attention over Convolution Kernels"](https://arxiv.org/abs/1912.03458)

#### 4.2. Overview
![](./model/img/DynamicConv.png)

#### 4.3. Usage Code
```python
from fightingcv_attention.conv.DynamicConv import *
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(2,32,64,64)
    m=DynamicConv(in_planes=32,out_planes=64,kernel_size=3,stride=1,padding=1,bias=False)
    out=m(input)
    print(out.shape) # 2,64,64,64

```

***


### 5. CondConv Usage
#### 5.1. Paper
["CondConv: Conditionally Parameterized Convolutions for Efficient Inference"](https://arxiv.org/abs/1904.04971)

#### 5.2. Overview
![](./model/img/CondConv.png)

#### 5.3. Usage Code
```python
from fightingcv_attention.conv.CondConv import *
import torch
from torch import nn
from torch.nn import functional as F





if __name__ == '__main__':
    input=torch.randn(2,32,64,64)
    m=CondConv(in_planes=32,out_planes=64,kernel_size=3,stride=1,padding=1,bias=False)
    out=m(input)
    print(out.shape)

```


================================================
FILE: main.py
================================================
from model.attention.MobileViTv2Attention import *
import torch
from torch import nn
from torch.nn import functional as F

if __name__ == '__main__':
    input=torch.randn(50,49,512)
    sa = MobileViTv2Attention(d_model=512)
    output=sa(input)
    print(output.shape)
 

================================================
FILE: model/.vscode/settings.json
================================================
{
    "python.pythonPath": "D:\\Anaconda\\python.exe"
}

================================================
FILE: model/__init__.py
================================================

def test():
    print ("hello world")

if __name__ == '__main__':
    test()

================================================
FILE: model/analysis/Attention.md
================================================
## Content
  
- [1. External Attention](#1-external-attention)

- [2. Self Attention](#2-self-attention)

- [3. Squeeze-and-Excitation(SE) Attention](#3-squeeze-and-excitationse-attention)

- [4. Selective Kernel(SK) Attention](#4-selective-kernelsk-attention)

- [5. CBAM Attention](#5-cbam-attention)

- [6. BAM Attention](#6-bam-attention)

- [7. ECA Attention](#7-eca-attention)

- [8. DANet Attention](#8-danet-attention)

- [9. Pyramid Split Attention(PSA)](#9-pyramid-split-attentionpsa)

- [10. Efficient Multi-Head Self-Attention(EMSA)](#10-efficient-multi-head-self-attentionemsa)

- [Write at the end](#Write_at_the_end)



## 1. External Attention

### 1.1. Citation

Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks.---arXiv 2021.05.05

Address:[https://arxiv.org/abs/2105.02358](https://arxiv.org/abs/2105.02358)

### 1.2. Model Structure

![](./img/External_Attention.png)

### 1.3. Brief
This paper was posted to arXiv in May 2021. It addresses two weaknesses of Self-Attention (SA): (1) its O(n^2) computational complexity; and (2) the fact that SA computes attention between different positions within a single sample, ignoring the relationships between different samples. The paper therefore uses two serial MLP layers as memory units, which reduces the computational complexity to O(n). Because these two memory units are learned from all of the training data, they also implicitly capture the relationships between different samples.
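The idea can be sketched without any framework. The following is a minimal illustration using plain Python lists, assuming two learned memories `m_k` and `m_v` with `S` slots each and the double normalization described in the paper (softmax over tokens, then L1 normalization over slots). The function and variable names are hypothetical and this is not the repository's `ExternalAttention` module.

```python
import math
import random

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def external_attention(x, m_k, m_v):
    """x: n tokens of width d; m_k and m_v: S memory slots of width d -> n x d output."""
    n, s = len(x), len(m_k)
    # attn[i][j] = <x_i, m_k[j]>; the cost is O(n * S * d), linear in n for fixed S.
    attn = [[sum(a * b for a, b in zip(tok, key)) for key in m_k] for tok in x]
    # Assumed double normalization: softmax over the token axis for each slot ...
    cols = [softmax([attn[i][j] for i in range(n)]) for j in range(s)]
    attn = [[cols[j][i] for j in range(s)] for i in range(n)]
    # ... then L1-normalize each token's row over the S slots.
    attn = [[a / sum(row) for a in row] for row in attn]
    # Read out: out_i = sum_j attn[i][j] * m_v[j]
    d = len(m_v[0])
    return [[sum(row[j] * m_v[j][c] for j in range(s)) for c in range(d)] for row in attn]

random.seed(0)
n, d, S = 5, 4, 3
x = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
m_k = [[random.gauss(0, 1) for _ in range(d)] for _ in range(S)]
m_v = [[random.gauss(0, 1) for _ in range(d)] for _ in range(S)]
out = external_attention(x, m_k, m_v)
print(len(out), len(out[0]))  # 5 4
```

Because the memories do not depend on the input, no n x n attention map is ever formed; each token is only compared with the `S` memory slots.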

### 1.4. Usage

```python
from attention.ExternalAttention import ExternalAttention
import torch


input=torch.randn(50,49,512)
ea = ExternalAttention(d_model=512,S=8)
output=ea(input)
print(output.shape)
```





## 2. Self Attention

### 2.1. Citation

Attention Is All You Need---NeurIPS2017

Address:[https://arxiv.org/abs/1706.03762](https://arxiv.org/abs/1706.03762)

### 2.2. Model Structure

![](./img/SA.png)

### 2.3. Brief
This paper was published by Google at NeurIPS 2017 and has been highly influential across CV, NLP, and multi-modal research; at the time of writing it had more than 22,000 citations. The Self-Attention proposed in the Transformer is a form of attention that computes weights between different positions of a feature and uses them to update that feature. The input feature is first mapped to three features, Q, K, and V, through FC layers. Q and K are then multiplied to obtain the attention map, and the attention map is multiplied with V to obtain the weighted feature. Finally, the result is mapped through another FC layer to produce the new feature. (There are many very good explanations of the Transformer and Self-Attention online, so I won't give a detailed introduction here.)
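
The steps above can be sketched as a single-head module. This is a simplified illustration under stated assumptions: the class name `SingleHeadSelfAttention` is hypothetical, only one head is used, and the scaling by the square root of the feature size plus the softmax follow the scaled dot-product formulation of the cited paper.

```python
import math

import torch
from torch import nn


class SingleHeadSelfAttention(nn.Module):
    """Illustrative single-head self-attention (no masking, no dropout)."""

    def __init__(self, d_model: int):
        super().__init__()
        self.fc_q = nn.Linear(d_model, d_model)
        self.fc_k = nn.Linear(d_model, d_model)
        self.fc_v = nn.Linear(d_model, d_model)
        self.fc_o = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor):
        q, k, v = self.fc_q(x), self.fc_k(x), self.fc_v(x)        # each (batch, n, d_model)
        scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))  # (batch, n, n)
        attn = torch.softmax(scores, dim=-1)                      # each row sums to 1
        return self.fc_o(attn @ v), attn


x = torch.randn(2, 49, 64)
out, attn = SingleHeadSelfAttention(d_model=64)(x)
print(out.shape, attn.shape)
```

The `(n, n)` attention map is what makes the cost quadratic in the number of positions.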

### 2.4. Usage

```python
from attention.SelfAttention import ScaledDotProductAttention
import torch

input=torch.randn(50,49,512)
sa = ScaledDotProductAttention(d_model=512, d_k=512, d_v=512, h=8)
output=sa(input,input,input)
print(output.shape)
```





## 3. Squeeze-and-Excitation(SE) Attention

### 3.1. Citation

Squeeze-and-Excitation Networks---CVPR2018

Address:[https://arxiv.org/abs/1709.01507](https://arxiv.org/abs/1709.01507)

### 3.2. Model Structure

![](./img/SE.png)

### 3.3. Brief
This is a highly influential CVPR 2018 paper, with more than 7,000 citations at the time of writing. It focuses on channel attention, and its simple, effective structure set off a wave of channel-attention work. In keeping with the saying that great truths are simple, the idea is very straightforward. First, AdaptiveAvgPool is applied over the spatial dimensions. The channel attention is then learned by two FC layers and normalized with a Sigmoid to obtain the Channel Attention Map. Finally, the Channel Attention Map is multiplied with the original features to obtain the weighted features.
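
A minimal sketch of these three steps (pool, two FC layers with a Sigmoid, rescale) is shown here. The class name `TinySE`, the ReLU between the two FC layers, and the bias-free layers are assumptions for illustration; the repository's `SEAttention` may differ in detail.

```python
import torch
from torch import nn


class TinySE(nn.Module):
    """Illustrative squeeze-and-excitation block."""

    def __init__(self, channel: int, reduction: int = 8):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: (b, c, h, w) -> (b, c, 1, 1)
        self.fc = nn.Sequential(             # excitation: two FC layers
            nn.Linear(channel, channel // reduction, bias=False),
            nn.ReLU(),
            nn.Linear(channel // reduction, channel, bias=False),
            nn.Sigmoid(),                    # channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.pool(x).view(b, c)
        w = self.fc(w).view(b, c, 1, 1)
        return x * w                         # broadcast over the spatial dimensions


x = torch.randn(2, 64, 7, 7)
out = TinySE(channel=64, reduction=8)(x)
print(out.shape)
```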

### 3.4. Usage

```python
from attention.SEAttention import SEAttention
import torch

input=torch.randn(50,512,7,7)
se = SEAttention(channel=512,reduction=8)
output=se(input)
print(output.shape)
```



 

## 4. Selective Kernel(SK) Attention

### 4.1. Citation

Selective Kernel Networks---CVPR2019

Address:[https://arxiv.org/pdf/1903.06586.pdf](https://arxiv.org/pdf/1903.06586.pdf)

### 4.2. Model Structure

![](./img/SK.png)

### 4.3. Brief
This is a CVPR 2019 paper that builds on the ideas of SENet. In a traditional CNN, each convolutional layer uses convolution kernels of a single size, which limits the expressive power of the model. The "wider" structure of Inception has also shown that learning with several different convolution kernels can indeed improve expressive power. Drawing on the idea of SENet, the author dynamically computes channel weights for each convolution kernel and uses them to dynamically fuse the results of the different kernels.

I personally think this method can still be called lightweight because the parameters are shared when channel attention is applied to the features of the different kernels (the features are fused before the attention is computed, so the results of the different convolution kernels share the parameters of one SE module).

The method is divided into three parts: Split, Fuse, and Select. Split is a multi-branch operation that convolves the input with different convolution kernels to obtain different features. Fuse uses the SE structure to obtain the channel attention matrices (N convolution kernels yield N attention matrices; the parameters of this step are shared across all the features), which gives the SE-weighted features of the different kernels. Select sums these features.
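
The Split, Fuse, and Select steps can be sketched as follows. This is an illustrative simplification: the class name `TinySK`, the two kernel sizes, the hidden size `d`, and the softmax across branches are assumptions for the example rather than a copy of the repository's `SKAttention`.

```python
import torch
from torch import nn


class TinySK(nn.Module):
    """Illustrative selective-kernel attention with a shared squeeze layer."""

    def __init__(self, channel: int, kernels=(3, 5), reduction: int = 8):
        super().__init__()
        # Split: one convolution per kernel size, padded to keep the spatial size.
        self.convs = nn.ModuleList(
            [nn.Conv2d(channel, channel, k, padding=k // 2) for k in kernels]
        )
        d = max(channel // reduction, 4)
        self.fc = nn.Linear(channel, d)  # shared by all branches
        self.fcs = nn.ModuleList([nn.Linear(d, channel) for _ in kernels])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = torch.stack([conv(x) for conv in self.convs], dim=0)  # (K, b, c, h, w)
        fused = feats.sum(dim=0)                                      # Fuse the branches
        z = self.fc(fused.mean(dim=(2, 3)))                           # (b, d)
        logits = torch.stack([fc(z) for fc in self.fcs], dim=0)       # (K, b, c)
        weights = torch.softmax(logits, dim=0)                        # compete across kernels
        return (weights[..., None, None] * feats).sum(dim=0)          # Select: weighted sum


x = torch.randn(2, 16, 7, 7)
out = TinySK(channel=16)(x)
print(out.shape)
```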

### 4.4. Usage

```python
from attention.SKAttention import SKAttention
import torch

input=torch.randn(50,512,7,7)
sk = SKAttention(channel=512,reduction=8)
output=sk(input)
print(output.shape)
```



 

## 5. CBAM Attention

### 5.1. Citation

CBAM: Convolutional Block Attention Module---ECCV2018

Address:[https://openaccess.thecvf.com/content_ECCV_2018/papers/Sanghyun_Woo_Convolutional_Block_Attention_ECCV_2018_paper.pdf](https://openaccess.thecvf.com/content_ECCV_2018/papers/Sanghyun_Woo_Convolutional_Block_Attention_ECCV_2018_paper.pdf)

### 5.2. Model Structure

![](./img/CBAM1.png)

![](./img/CBAM2.png)

### 5.3. Brief
This is an ECCV 2018 paper. It uses Channel Attention and Spatial Attention together and connects the two in series (the paper also reports ablation experiments for the parallel arrangement and for both sequential orders).

For Channel Attention, the overall structure is still similar to SE, but the author argues that AvgPool and MaxPool capture different information. The original features are therefore pooled over the spatial dimensions with both AvgPool and MaxPool, and the SE structure is then used to extract channel attention from each pooled feature. Note that the parameters are shared between the two. The two results are then added and normalized to obtain the attention matrix.

Spatial Attention is similar to Channel Attention. The features are pooled twice along the channel dimension, the two pooled maps are concatenated, and a 7x7 convolution is used to extract the Spatial Attention (a 7x7 kernel is used because extracting spatial attention requires a sufficiently large convolution kernel). A final normalization gives the spatial attention matrix.
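
A compact sketch of the two stages in series is given here. It is an illustration under assumptions: the class name `TinyCBAM` is hypothetical, the shared MLP is written with 1x1 convolutions, and a sigmoid is used for both normalizations; the repository's `CBAMBlock` may differ (for example, it may add a residual connection).

```python
import torch
from torch import nn


class TinyCBAM(nn.Module):
    """Illustrative CBAM: channel attention followed by spatial attention."""

    def __init__(self, channel: int, reduction: int = 16, kernel_size: int = 7):
        super().__init__()
        # One MLP shared by the average-pooled and max-pooled descriptors.
        self.mlp = nn.Sequential(
            nn.Conv2d(channel, channel // reduction, 1, bias=False),
            nn.ReLU(),
            nn.Conv2d(channel // reduction, channel, 1, bias=False),
        )
        self.spatial = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Channel attention: pool over the spatial dimensions, share the MLP, add.
        avg = self.mlp(x.mean(dim=(2, 3), keepdim=True))
        mx = self.mlp(x.amax(dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)
        # Spatial attention: pool over channels, concatenate, apply the 7x7 convolution.
        pooled = torch.cat(
            [x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1
        )
        return x * torch.sigmoid(self.spatial(pooled))


x = torch.randn(2, 32, 7, 7)
out = TinyCBAM(channel=32, reduction=16, kernel_size=7)(x)
print(out.shape)
```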

### 5.4. Usage

```python
from attention.CBAM import CBAMBlock
import torch

input=torch.randn(50,512,7,7)
kernel_size=input.shape[2]
cbam = CBAMBlock(channel=512,reduction=16,kernel_size=kernel_size)
output=cbam(input)
print(output.shape)
```



 

## 6. BAM Attention

### 6.1. Citation

BAM: Bottleneck Attention Module---BMVC2018

Address:[https://arxiv.org/pdf/1807.06514.pdf](https://arxiv.org/pdf/1807.06514.pdf)

### 6.2. Model Structure

![](./img/BAM.png)

### 6.3. Brief
This work is by the same authors as CBAM and appeared at about the same time. It is very similar to CBAM and is also a dual-attention design. The difference is that CBAM applies the two attentions in series, while BAM directly adds the two attention matrices.

For Channel Attention, the structure is basically the same as SE. For Spatial Attention, the features are first reduced along the channel dimension, then a 3x3 dilated (atrous) convolution is applied twice, and finally a 1x1 convolution produces the Spatial Attention matrix.

Finally, the Channel Attention and Spatial Attention matrices are added (using broadcasting) and normalized. This gives an attention matrix that combines spatial and channel information.
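
The broadcast addition of the two branches can be sketched as follows. The class name `TinyBAM`, the 1x1 channel reduction in the spatial branch, the sigmoid normalization, and the residual form `x + x * weight` are assumptions for this illustration and are not taken from the text above.

```python
import torch
from torch import nn


class TinyBAM(nn.Module):
    """Illustrative BAM: channel and spatial branches added by broadcasting."""

    def __init__(self, channel: int, reduction: int = 16, dia_val: int = 2):
        super().__init__()
        mid = channel // reduction
        self.channel_att = nn.Sequential(  # SE-like branch
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channel, mid), nn.ReLU(), nn.Linear(mid, channel),
        )
        self.spatial_att = nn.Sequential(
            nn.Conv2d(channel, mid, 1), nn.ReLU(),
            # padding == dilation keeps the spatial size for a 3x3 kernel
            nn.Conv2d(mid, mid, 3, padding=dia_val, dilation=dia_val), nn.ReLU(),
            nn.Conv2d(mid, mid, 3, padding=dia_val, dilation=dia_val), nn.ReLU(),
            nn.Conv2d(mid, 1, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        ca = self.channel_att(x).view(b, c, 1, 1)  # (b, c, 1, 1)
        sa = self.spatial_att(x)                   # (b, 1, h, w)
        weight = torch.sigmoid(ca + sa)            # broadcasts to (b, c, h, w)
        return x + x * weight


x = torch.randn(2, 32, 7, 7)
out = TinyBAM(channel=32, reduction=16, dia_val=2)(x)
print(out.shape)
```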

### 6.4. Usage

```python
from attention.BAM import BAMBlock
import torch

input=torch.randn(50,512,7,7)
bam = BAMBlock(channel=512,reduction=16,dia_val=2)
output=bam(input)
print(output.shape)
```





## 7. ECA Attention

### 7.1. Citation

ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks---CVPR2020

Address:[https://arxiv.org/pdf/1910.03151.pdf](https://arxiv.org/pdf/1910.03151.pdf)

### 7.2. Model Structure

![](./img/ECA.png)

### 7.3. Brief
This is a CVPR 2020 paper.

As shown in the figure above, SE uses two fully connected layers to implement channel attention, while ECA needs only one convolution. The author's reasoning is twofold: it is not necessary to compute attention between all channels, and two fully connected layers introduce too many parameters and too much computation.

Therefore, after AvgPool, the author uses only a one-dimensional convolution with a receptive field of k (equivalent to computing attention only among k adjacent channels), which greatly reduces the number of parameters and the amount of computation. In other words, SE is a global attention, while ECA is a local attention.
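
A minimal sketch of this idea, together with a rough parameter comparison, is shown here. The class name `TinyECA` and the bias-free convolution are assumptions for illustration; the SE figure assumes two bias-free FC layers with 512 channels and a reduction of 8.

```python
import torch
from torch import nn


class TinyECA(nn.Module):
    """Illustrative ECA: a 1D convolution across neighbouring channels."""

    def __init__(self, kernel_size: int = 3):
        super().__init__()
        self.conv = nn.Conv1d(
            1, 1, kernel_size, padding=(kernel_size - 1) // 2, bias=False
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = x.mean(dim=(2, 3))           # AvgPool over space: (b, c)
        y = self.conv(y.unsqueeze(1))    # channels are treated as a 1D sequence: (b, 1, c)
        w = torch.sigmoid(y).squeeze(1)  # (b, c)
        return x * w[:, :, None, None]


eca = TinyECA(kernel_size=3)
x = torch.randn(2, 512, 7, 7)
out = eca(x)

eca_params = sum(p.numel() for p in eca.parameters())
se_params = 2 * 512 * (512 // 8)  # two bias-free FC layers, reduction 8 (assumption)
print(out.shape, eca_params, se_params)
```

Under these assumptions the ECA layer has 3 learnable parameters, compared with 65,536 for the SE layers.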

### 7.4. Usage

```python
from attention.ECAAttention import ECAAttention
import torch

input=torch.randn(50,512,7,7)
eca = ECAAttention(kernel_size=3)
output=eca(input)
print(output.shape)
```



 

## 8. DANet Attention

### 8.1. Citation

Dual Attention Network for Scene Segmentation---CVPR2019

Address:[https://arxiv.org/pdf/1809.02983.pdf](https://arxiv.org/pdf/1809.02983.pdf)

### 8.2. Model Structure

![](./img/danet.png)


### 8.3. Brief
This is a CVPR 2019 paper. The idea is very simple: apply self-attention to the task of scene segmentation. Ordinary self-attention models the relationships between positions; this paper extends it by adding a channel-attention branch. The channel branch works in the same way as self-attention, except that the three Linear layers that generate Q, K, and V are removed. Finally, the features produced by the two attention branches are summed element-wise.
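
The two branches can be sketched as follows. This is a simplified illustration: the class name `TinyDualAttention` and the scaling factors are assumptions, and any convolutions applied before the branches (the repository's `DAModule` takes a `kernel_size` argument) are omitted.

```python
import math

import torch
from torch import nn


class TinyDualAttention(nn.Module):
    """Illustrative position branch plus projection-free channel branch."""

    def __init__(self, d_model: int):
        super().__init__()
        self.fc_q = nn.Linear(d_model, d_model)
        self.fc_k = nn.Linear(d_model, d_model)
        self.fc_v = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Position branch: self-attention between the h*w positions.
        tokens = x.flatten(2).transpose(1, 2)                        # (b, n, c)
        q, k, v = self.fc_q(tokens), self.fc_k(tokens), self.fc_v(tokens)
        pos_attn = torch.softmax(q @ k.transpose(1, 2) / math.sqrt(c), dim=-1)
        pos = (pos_attn @ v).transpose(1, 2).reshape(b, c, h, w)
        # Channel branch: same operation between channels, without Q/K/V projections.
        chans = x.flatten(2)                                         # (b, c, n)
        ch_attn = torch.softmax(
            chans @ chans.transpose(1, 2) / math.sqrt(h * w), dim=-1
        )                                                            # (b, c, c)
        ch = (ch_attn @ chans).reshape(b, c, h, w)
        return pos + ch                                              # element-wise sum


x = torch.randn(2, 32, 7, 7)
out = TinyDualAttention(d_model=32)(x)
print(out.shape)
```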

### 8.4. Usage

```python
from attention.DANet import DAModule
import torch

input=torch.randn(50,512,7,7)
danet=DAModule(d_model=512,kernel_size=3,H=7,W=7)
print(danet(input).shape)
```



 

## 9. Pyramid Split Attention(PSA)

### 9.1. Citation

EPSANet: An Efficient Pyramid Split Attention Block on Convolutional Neural Network---arXiv 2021.05.30

Address:[https://arxiv.org/pdf/2105.14447.pdf](https://arxiv.org/pdf/2105.14447.pdf)

### 9.2. Model Structure

![](./img/psa.png)


### 9.3. Brief
This paper was uploaded to arXiv by Shenzhen University on May 30, 2021. Its goal is to capture and exploit spatial information at different scales in order to enrich the feature space. The network structure is relatively simple and consists of four steps. First, the original feature is split into n groups along the channel dimension, and each group is convolved at a different scale to obtain the new feature W1. Second, SE is applied to these features to obtain a different Attention Map for each group. Third, softmax is applied across the groups. Fourth, the resulting attention is multiplied with the feature W1.
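
The four steps can be sketched as follows. This is an illustration under assumptions: the class name `TinyPSA`, the kernel sizes (3, 5, 7, 9 for four groups), and the SE layout are chosen for the example and may differ from the repository's `PSA` class.

```python
import torch
from torch import nn


class TinyPSA(nn.Module):
    """Illustrative pyramid split attention over S channel groups."""

    def __init__(self, channel: int, reduction: int = 4, S: int = 4):
        super().__init__()
        group = channel // S
        self.S = S
        # Step 1: one convolution per group, with a growing kernel size.
        self.convs = nn.ModuleList(
            [nn.Conv2d(group, group, kernel_size=2 * i + 3, padding=i + 1) for i in range(S)]
        )
        # Step 2: one SE block per group.
        self.se_blocks = nn.ModuleList([
            nn.Sequential(
                nn.AdaptiveAvgPool2d(1),
                nn.Conv2d(group, group // reduction, 1), nn.ReLU(),
                nn.Conv2d(group // reduction, group, 1), nn.Sigmoid(),
            )
            for _ in range(S)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        groups = torch.chunk(x, self.S, dim=1)
        feats = torch.stack(
            [conv(g) for conv, g in zip(self.convs, groups)], dim=1
        )                                                    # (b, S, c/S, h, w)
        att = torch.stack(
            [se(feats[:, i]) for i, se in enumerate(self.se_blocks)], dim=1
        )                                                    # (b, S, c/S, 1, 1)
        att = torch.softmax(att, dim=1)                      # Step 3: softmax across groups
        return (feats * att).flatten(1, 2)                   # Step 4: reweight, back to (b, c, h, w)


x = torch.randn(2, 32, 7, 7)
out = TinyPSA(channel=32, reduction=4, S=4)(x)
print(out.shape)
```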

### 9.4. Usage

```python
from attention.PSA import PSA
import torch

input=torch.randn(50,512,7,7)
psa = PSA(channel=512,reduction=8)
output=psa(input)
print(output.shape)
```



 

## 10. Efficient Multi-Head Self-Attention
SYMBOL INDEX (1441 symbols across 83 files)

FILE: model/__init__.py
  function test (line 2) | def test():

FILE: model/attention/A2Atttention.py
  class DoubleAttention (line 9) | class DoubleAttention(nn.Module):
    method __init__ (line 11) | def __init__(self, in_channels,c_m,c_n,reconstruct = True):
    method init_weights (line 25) | def init_weights(self):
    method forward (line 39) | def forward(self, x):

FILE: model/attention/ACmixAttention.py
  function position (line 5) | def position(H, W, is_cuda=True):
  function stride (line 16) | def stride(x, stride):
  function init_rate_half (line 20) | def init_rate_half(tensor):
  function init_rate_0 (line 24) | def init_rate_0(tensor):
  class ACmix (line 29) | class ACmix(nn.Module):
    method __init__ (line 30) | def __init__(self, in_planes, out_planes, kernel_att=7, head=4, kernel...
    method reset_parameters (line 58) | def reset_parameters(self):
    method forward (line 68) | def forward(self, x):

FILE: model/attention/AFT.py
  class AFT_FULL (line 8) | class AFT_FULL(nn.Module):
    method __init__ (line 10) | def __init__(self, d_model,n=49,simple=False):
    method init_weights (line 27) | def init_weights(self):
    method forward (line 41) | def forward(self, input):

FILE: model/attention/Axial_attention.py
  class Deterministic (line 9) | class Deterministic(nn.Module):
    method __init__ (line 10) | def __init__(self, net):
    method record_rng (line 18) | def record_rng(self, *args):
    method forward (line 24) | def forward(self, *args, record_rng = False, set_rng = False, **kwargs):
  class ReversibleBlock (line 43) | class ReversibleBlock(nn.Module):
    method __init__ (line 44) | def __init__(self, f, g):
    method forward (line 49) | def forward(self, x, f_args = {}, g_args = {}):
    method backward_pass (line 59) | def backward_pass(self, y, dy, f_args = {}, g_args = {}):
  class IrreversibleBlock (line 97) | class IrreversibleBlock(nn.Module):
    method __init__ (line 98) | def __init__(self, f, g):
    method forward (line 103) | def forward(self, x, f_args, g_args):
  class _ReversibleFunction (line 109) | class _ReversibleFunction(Function):
    method forward (line 111) | def forward(ctx, x, blocks, kwargs):
    method backward (line 120) | def backward(ctx, dy):
  class ReversibleSequence (line 127) | class ReversibleSequence(nn.Module):
    method __init__ (line 128) | def __init__(self, blocks, ):
    method forward (line 132) | def forward(self, x, arg_route = (True, True), **kwargs):
  function exists (line 142) | def exists(val):
  function map_el_ind (line 145) | def map_el_ind(arr, ind):
  function sort_and_return_indices (line 148) | def sort_and_return_indices(arr):
  function calculate_permutations (line 157) | def calculate_permutations(num_dimensions, emb_dim):
  class ChanLayerNorm (line 174) | class ChanLayerNorm(nn.Module):
    method __init__ (line 175) | def __init__(self, dim, eps = 1e-5):
    method forward (line 181) | def forward(self, x):
  class PreNorm (line 186) | class PreNorm(nn.Module):
    method __init__ (line 187) | def __init__(self, dim, fn):
    method forward (line 192) | def forward(self, x):
  class Sequential (line 196) | class Sequential(nn.Module):
    method __init__ (line 197) | def __init__(self, blocks):
    method forward (line 201) | def forward(self, x):
  class PermuteToFrom (line 207) | class PermuteToFrom(nn.Module):
    method __init__ (line 208) | def __init__(self, permutation, fn):
    method forward (line 215) | def forward(self, x, **kwargs):
  class AxialPositionalEmbedding (line 234) | class AxialPositionalEmbedding(nn.Module):
    method __init__ (line 235) | def __init__(self, dim, shape, emb_dim_index = 1):
    method forward (line 250) | def forward(self, x):
  class SelfAttention (line 257) | class SelfAttention(nn.Module):
    method __init__ (line 258) | def __init__(self, dim, heads, dim_heads = None):
    method forward (line 268) | def forward(self, x, kv = None):
  class AxialAttention (line 287) | class AxialAttention(nn.Module):
    method __init__ (line 288) | def __init__(self, dim, num_dimensions = 2, heads = 8, dim_heads = Non...
    method forward (line 302) | def forward(self, x):
  class AxialImageTransformer (line 316) | class AxialImageTransformer(nn.Module):
    method __init__ (line 317) | def __init__(self, dim, depth, heads = 8, dim_heads = None, dim_index ...
    method forward (line 340) | def forward(self, x):

FILE: model/attention/BAM.py
  class Flatten (line 6) | class Flatten(nn.Module):
    method forward (line 7) | def forward(self,x):
  class ChannelAttention (line 10) | class ChannelAttention(nn.Module):
    method __init__ (line 11) | def __init__(self,channel,reduction=16,num_layers=3):
    method forward (line 28) | def forward(self, x) :
  class SpatialAttention (line 34) | class SpatialAttention(nn.Module):
    method __init__ (line 35) | def __init__(self,channel,reduction=16,num_layers=3,dia_val=2):
    method forward (line 47) | def forward(self, x) :
  class BAMBlock (line 55) | class BAMBlock(nn.Module):
    method __init__ (line 57) | def __init__(self, channel=512,reduction=16,dia_val=2):
    method init_weights (line 64) | def init_weights(self):
    method forward (line 78) | def forward(self, x):

FILE: model/attention/CBAM.py
  class ChannelAttention (line 8) | class ChannelAttention(nn.Module):
    method __init__ (line 9) | def __init__(self,channel,reduction=16):
    method forward (line 20) | def forward(self, x) :
  class SpatialAttention (line 28) | class SpatialAttention(nn.Module):
    method __init__ (line 29) | def __init__(self,kernel_size=7):
    method forward (line 34) | def forward(self, x) :
  class CBAMBlock (line 44) | class CBAMBlock(nn.Module):
    method __init__ (line 46) | def __init__(self, channel=512,reduction=16,kernel_size=49):
    method init_weights (line 52) | def init_weights(self):
    method forward (line 66) | def forward(self, x):

FILE: model/attention/CoAtNet.py
  class CoAtNet (line 9) | class CoAtNet(nn.Module):
    method __init__ (line 10) | def __init__(self,in_ch,image_size,out_chs=[64,96,192,384,768]):
    method forward (line 56) | def forward(self, x) :

FILE: model/attention/CoTAttention.py
  class CoTAttention (line 11) | class CoTAttention(nn.Module):
    method __init__ (line 13) | def __init__(self, dim=512,kernel_size=3):
    method forward (line 37) | def forward(self, x):

FILE: model/attention/CoordAttention.py
  class h_sigmoid (line 5) | class h_sigmoid(nn.Module):
    method __init__ (line 6) | def __init__(self, inplace=True):
    method forward (line 10) | def forward(self, x):
  class h_swish (line 13) | class h_swish(nn.Module):
    method __init__ (line 14) | def __init__(self, inplace=True):
    method forward (line 18) | def forward(self, x):
  class CoordAtt (line 21) | class CoordAtt(nn.Module):
    method __init__ (line 22) | def __init__(self, inp, oup, reduction=32):
    method forward (line 37) | def forward(self, x):

FILE: model/attention/CrissCrossAttention.py
  function INF (line 11) | def INF(B,H,W):
  class CrissCrossAttention (line 15) | class CrissCrossAttention(nn.Module):
    method __init__ (line 17) | def __init__(self, in_dim):
    method forward (line 27) | def forward(self, x):

FILE: model/attention/Crossformer.py
  class Mlp (line 7) | class Mlp(nn.Module):
    method __init__ (line 8) | def __init__(self, in_features, hidden_features=None, out_features=Non...
    method forward (line 17) | def forward(self, x):
  class DynamicPosBias (line 25) | class DynamicPosBias(nn.Module):
    method __init__ (line 26) | def __init__(self, dim, num_heads, residual):
    method forward (line 47) | def forward(self, biases):
    method flops (line 57) | def flops(self, N):
  class Attention (line 64) | class Attention(nn.Module):
    method __init__ (line 77) | def __init__(self, dim, group_size, num_heads, qkv_bias=True, qk_scale...
    method forward (line 118) | def forward(self, x, mask=None):
    method extra_repr (line 154) | def extra_repr(self) -> str:
    method flops (line 157) | def flops(self, N):
  class CrossFormerBlock (line 173) | class CrossFormerBlock(nn.Module):
    method __init__ (line 192) | def __init__(self, dim, input_resolution, num_heads, group_size=7, lsd...
    method forward (line 223) | def forward(self, x):
    method extra_repr (line 257) | def extra_repr(self) -> str:
    method flops (line 261) | def flops(self):
  class PatchMerging (line 275) | class PatchMerging(nn.Module):
    method __init__ (line 284) | def __init__(self, input_resolution, dim, norm_layer=nn.LayerNorm, pat...
    method forward (line 302) | def forward(self, x):
    method extra_repr (line 321) | def extra_repr(self) -> str:
    method flops (line 324) | def flops(self):
  class Stage (line 336) | class Stage(nn.Module):
    method __init__ (line 356) | def __init__(self, dim, input_resolution, depth, num_heads, group_size,
    method forward (line 388) | def forward(self, x):
    method extra_repr (line 398) | def extra_repr(self) -> str:
    method flops (line 401) | def flops(self):
  class PatchEmbed (line 410) | class PatchEmbed(nn.Module):
    method __init__ (line 421) | def __init__(self, img_size=224, patch_size=[4], in_chans=3, embed_dim...
    method forward (line 448) | def forward(self, x):
    method flops (line 462) | def flops(self):
  class CrossFormer (line 476) | class CrossFormer(nn.Module):
    method __init__ (line 501) | def __init__(self, img_size=224, patch_size=[4], in_chans=3, num_class...
    method _init_weights (line 565) | def _init_weights(self, m):
    method no_weight_decay (line 575) | def no_weight_decay(self):
    method no_weight_decay_keywords (line 579) | def no_weight_decay_keywords(self):
    method forward_features (line 582) | def forward_features(self, x):
    method forward (line 596) | def forward(self, x):
    method flops (line 601) | def flops(self):

FILE: model/attention/DANet.py
  class PositionAttentionModule (line 8) | class PositionAttentionModule(nn.Module):
    method __init__ (line 10) | def __init__(self,d_model=512,kernel_size=3,H=7,W=7):
    method forward (line 15) | def forward(self,x):
  class ChannelAttentionModule (line 23) | class ChannelAttentionModule(nn.Module):
    method __init__ (line 25) | def __init__(self,d_model=512,kernel_size=3,H=7,W=7):
    method forward (line 30) | def forward(self,x):
  class DAModule (line 40) | class DAModule(nn.Module):
    method __init__ (line 42) | def __init__(self,d_model=512,kernel_size=3,H=7,W=7):
    method forward (line 47) | def forward(self,input):

FILE: model/attention/DAT.py
  class LocalAttention (line 19) | class LocalAttention(nn.Module):
    method __init__ (line 21) | def __init__(self, dim, heads, window_size, attn_drop, proj_drop):
    method forward (line 55) | def forward(self, x, mask=None):
  class ShiftWindowAttention (line 92) | class ShiftWindowAttention(LocalAttention):
    method __init__ (line 94) | def __init__(self, dim, heads, window_size, attn_drop, proj_drop, shif...
    method forward (line 120) | def forward(self, x):
  class DAttentionBaseline (line 129) | class DAttentionBaseline(nn.Module):
    method __init__ (line 131) | def __init__(
    method _get_ref_points (line 205) | def _get_ref_points(self, H_key, W_key, B, dtype, device):
    method forward (line 218) | def forward(self, x):
  class TransformerMLP (line 297) | class TransformerMLP(nn.Module):
    method __init__ (line 299) | def __init__(self, channels, expansion, drop):
    method forward (line 312) | def forward(self, x):
  class LayerNormProxy (line 320) | class LayerNormProxy(nn.Module):
    method __init__ (line 322) | def __init__(self, dim):
    method forward (line 327) | def forward(self, x):
  class TransformerMLPWithConv (line 333) | class TransformerMLPWithConv(nn.Module):
    method __init__ (line 335) | def __init__(self, channels, expansion, drop):
    method forward (line 348) | def forward(self, x):
  class TransformerStage (line 355) | class TransformerStage(nn.Module):
    method __init__ (line 357) | def __init__(self, fmap_size, window_size, ns_per_pt,
    method forward (line 405) | def forward(self, x):
  class DAT (line 424) | class DAT(nn.Module):
    method __init__ (line 426) | def __init__(self, img_size=224, patch_size=4, num_classes=1000, expan...
    method reset_parameters (line 489) | def reset_parameters(self):
    method load_pretrained (line 497) | def load_pretrained(self, state_dict):
    method no_weight_decay (line 537) | def no_weight_decay(self):
    method no_weight_decay_keywords (line 541) | def no_weight_decay_keywords(self):
    method forward (line 544) | def forward(self, x):

FILE: model/attention/ECAAttention.py
  class ECAAttention (line 9) | class ECAAttention(nn.Module):
    method __init__ (line 11) | def __init__(self, kernel_size=3):
    method init_weights (line 17) | def init_weights(self):
    method forward (line 31) | def forward(self, x):

FILE: model/attention/EMSA.py
  class EMSA (line 8) | class EMSA(nn.Module):
    method __init__ (line 10) | def __init__(self, d_model, d_k, d_v, h,dropout=.1,H=7,W=7,ratio=3,app...
    method init_weights (line 42) | def init_weights(self):
    method forward (line 56) | def forward(self, queries, keys, values, attention_mask=None, attentio...

FILE: model/attention/ExternalAttention.py
  class ExternalAttention (line 8) | class ExternalAttention(nn.Module):
    method __init__ (line 10) | def __init__(self, d_model,S=64):
    method init_weights (line 18) | def init_weights(self):
    method forward (line 32) | def forward(self, queries):

FILE: model/attention/HaloAttention.py
  function to (line 9) | def to(x):
  function pair (line 12) | def pair(x):
  function expand_dim (line 15) | def expand_dim(t, dim, k):
  function rel_to_abs (line 21) | def rel_to_abs(x):
  function relative_logits_1d (line 34) | def relative_logits_1d(q, rel_k):
  class RelPosEmb (line 46) | class RelPosEmb(nn.Module):
    method __init__ (line 47) | def __init__(
    method forward (line 61) | def forward(self, q):
  class HaloAttention (line 75) | class HaloAttention(nn.Module):
    method __init__ (line 76) | def __init__(
    method forward (line 107) | def forward(self, x):

FILE: model/attention/MOATransformer.py
  class Mlp (line 13) | class Mlp(nn.Module):
    method __init__ (line 14) | def __init__(self, in_features, hidden_features=None, out_features=Non...
    method forward (line 23) | def forward(self, x):
  function window_partition (line 32) | def window_partition(x, window_size):
  function window_reverse (line 50) | def window_reverse(windows, window_size, H, W):
  class WindowAttention (line 67) | class WindowAttention(nn.Module):
    method __init__ (line 81) | def __init__(self, dim, window_size, num_heads, qkv_bias=True, qk_scal...
    method forward (line 120) | def forward(self, x):
    method extra_repr (line 149) | def extra_repr(self) -> str:
    method flops (line 152) | def flops(self, N):
  class GlobalAttention (line 165) | class GlobalAttention(nn.Module):
    method __init__ (line 178) | def __init__(self, dim, window_size, input_resolution,num_heads, qkv_b...
    method forward (line 238) | def forward(self, x, H, W):
    method extra_repr (line 284) | def extra_repr(self) -> str:
    method flops (line 287) | def flops(self, N):
  class LocalTransformerBlock (line 301) | class LocalTransformerBlock(nn.Module):
    method __init__ (line 320) | def __init__(self, dim, input_resolution, num_heads, window_size=7,
    method forward (line 348) | def forward(self, x):
    method extra_repr (line 377) | def extra_repr(self) -> str:
    method flops (line 381) | def flops(self):
  class PatchMerging (line 396) | class PatchMerging(nn.Module):
    method __init__ (line 405) | def __init__(self, input_resolution, dim, norm_layer=nn.LayerNorm):
    method forward (line 412) | def forward(self, x):
    method extra_repr (line 435) | def extra_repr(self) -> str:
    method flops (line 438) | def flops(self):
  class BasicLayer (line 445) | class BasicLayer(nn.Module):
    method __init__ (line 465) | def __init__(self, dim, input_resolution, depth, num_heads, window_size,
    method forward (line 508) | def forward(self, x):
    method extra_repr (line 539) | def extra_repr(self) -> str:
    method flops (line 542) | def flops(self):
  class PatchEmbed (line 551) | class PatchEmbed(nn.Module):
    method __init__ (line 562) | def __init__(self, img_size=224, patch_size=4, in_chans=3, embed_dim=9...
    method forward (line 581) | def forward(self, x):
    method flops (line 591) | def flops(self):
  class MOATransformer (line 599) | class MOATransformer(nn.Module):
    method __init__ (line 625) | def __init__(self, img_size=224, patch_size=4, in_chans=3, num_classes...
    method _init_weights (line 684) | def _init_weights(self, m):
    method no_weight_decay (line 694) | def no_weight_decay(self):
    method no_weight_decay_keywords (line 698) | def no_weight_decay_keywords(self):
    method forward_features (line 701) | def forward_features(self, x):
    method forward (line 715) | def forward(self, x):
    method flops (line 720) | def flops(self):

FILE: model/attention/MUSEAttention.py
  class Depth_Pointwise_Conv1d (line 8) | class Depth_Pointwise_Conv1d(nn.Module):
    method __init__ (line 9) | def __init__(self,in_ch,out_ch,k):
    method forward (line 27) | def forward(self,x):
  class MUSEAttention (line 33) | class MUSEAttention(nn.Module):
    method __init__ (line 35) | def __init__(self, d_model, d_k, d_v, h,dropout=.1):
    method init_weights (line 59) | def init_weights(self):
    method forward (line 73) | def forward(self, queries, keys, values, attention_mask=None, attentio...

FILE: model/attention/MobileViTAttention.py
  class PreNorm (line 6) | class PreNorm(nn.Module):
    method __init__ (line 7) | def __init__(self,dim,fn):
    method forward (line 11) | def forward(self,x,**kwargs):
  class FeedForward (line 14) | class FeedForward(nn.Module):
    method __init__ (line 15) | def __init__(self,dim,mlp_dim,dropout) :
    method forward (line 24) | def forward(self,x):
  class Attention (line 27) | class Attention(nn.Module):
    method __init__ (line 28) | def __init__(self,dim,heads,head_dim,dropout):
    method forward (line 44) | def forward(self,x):
  class Transformer (line 57) | class Transformer(nn.Module):
    method __init__ (line 58) | def __init__(self,dim,depth,heads,head_dim,mlp_dim,dropout=0.):
    method forward (line 68) | def forward(self,x):
  class MobileViTAttention (line 75) | class MobileViTAttention(nn.Module):
    method __init__ (line 76) | def __init__(self,in_channel=3,dim=512,kernel_size=3,patch_size=7):
    method forward (line 87) | def forward(self,x):

FILE: model/attention/MobileViTv2Attention.py
  class MobileViTv2Attention (line 8) | class MobileViTv2Attention(nn.Module):
    method __init__ (line 13) | def __init__(self, d_model):
    method init_weights (line 30) | def init_weights(self):
    method forward (line 44) | def forward(self, input):

FILE: model/attention/OutlookAttention.py
  class OutlookAttention (line 8) | class OutlookAttention(nn.Module):
    method __init__ (line 10) | def __init__(self,dim,num_heads=1,kernel_size=3,padding=1,stride=1,qkv...
    method forward (line 31) | def forward(self, x) :

FILE: model/attention/PSA.py
  class PSA (line 8) | class PSA(nn.Module):
    method __init__ (line 10) | def __init__(self, channel=512,reduction=4,S=4):
    method init_weights (line 31) | def init_weights(self):
    method forward (line 45) | def forward(self, x):

FILE: model/attention/ParNetAttention.py
  class ParNetAttention (line 8) | class ParNetAttention(nn.Module):
    method __init__ (line 10) | def __init__(self, channel=512):
    method forward (line 29) | def forward(self, x):

FILE: model/attention/PolarizedSelfAttention.py
  class ParallelPolarizedSelfAttention (line 8) | class ParallelPolarizedSelfAttention(nn.Module):
    method __init__ (line 10) | def __init__(self, channel=512):
    method forward (line 23) | def forward(self, x):
  class SequentialPolarizedSelfAttention (line 54) | class SequentialPolarizedSelfAttention(nn.Module):
    method __init__ (line 56) | def __init__(self, channel=512):
    method forward (line 69) | def forward(self, x):

FILE: model/attention/ResidualAttention.py
  class ResidualAttention (line 8) | class ResidualAttention(nn.Module):
    method __init__ (line 10) | def __init__(self, channel=512 , num_class=1000,la=0.2):
    method forward (line 15) | def forward(self, x):

FILE: model/attention/S2Attention.py
  function spatial_shift1 (line 7) | def spatial_shift1(x):
  function spatial_shift2 (line 16) | def spatial_shift2(x):
  class SplitAttention (line 25) | class SplitAttention(nn.Module):
    method __init__ (line 26) | def __init__(self,channel=512,k=3):
    method forward (line 35) | def forward(self,x_all):
  class S2Attention (line 48) | class S2Attention(nn.Module):
    method __init__ (line 50) | def __init__(self, channels=512 ):
    method forward (line 56) | def forward(self, x):

FILE: model/attention/SEAttention.py
  class SEAttention (line 8) | class SEAttention(nn.Module):
    method __init__ (line 10) | def __init__(self, channel=512,reduction=16):
    method init_weights (line 21) | def init_weights(self):
    method forward (line 35) | def forward(self, x):

FILE: model/attention/SGE.py
  class SpatialGroupEnhance (line 8) | class SpatialGroupEnhance(nn.Module):
    method __init__ (line 10) | def __init__(self, groups):
    method init_weights (line 20) | def init_weights(self):
    method forward (line 34) | def forward(self, x):

FILE: model/attention/SKAttention.py
  class SKAttention (line 9) | class SKAttention(nn.Module):
    method __init__ (line 11) | def __init__(self, channel=512,kernels=[1,3,5,7],reduction=16,group=1,...
    method forward (line 31) | def forward(self, x):

FILE: model/attention/SelfAttention.py
  class ScaledDotProductAttention (line 8) | class ScaledDotProductAttention(nn.Module):
    method __init__ (line 13) | def __init__(self, d_model, d_k, d_v, h,dropout=.1):
    method init_weights (line 35) | def init_weights(self):
    method forward (line 49) | def forward(self, queries, keys, values, attention_mask=None, attentio...

FILE: model/attention/ShuffleAttention.py
  class ShuffleAttention (line 8) | class ShuffleAttention(nn.Module):
    method __init__ (line 10) | def __init__(self, channel=512,reduction=16,G=8):
    method init_weights (line 23) | def init_weights(self):
    method channel_shuffle (line 39) | def channel_shuffle(x, groups):
    method forward (line 49) | def forward(self, x):

FILE: model/attention/SimAM.py
  class SimAM (line 5) | class SimAM(torch.nn.Module):
    method __init__ (line 6) | def __init__(self, channels = None, e_lambda = 1e-4):
    method __repr__ (line 12) | def __repr__(self):
    method get_module_name (line 18) | def get_module_name():
    method forward (line 21) | def forward(self, x):

FILE: model/attention/SimplifiedSelfAttention.py
  class SimplifiedScaledDotProductAttention (line 8) | class SimplifiedScaledDotProductAttention(nn.Module):
    method __init__ (line 13) | def __init__(self, d_model, h,dropout=.1):
    method init_weights (line 35) | def init_weights(self):
    method forward (line 49) | def forward(self, queries, keys, values, attention_mask=None, attentio...

FILE: model/attention/TripletAttention.py
  class BasicConv (line 4) | class BasicConv(nn.Module):
    method __init__ (line 5) | def __init__(self, in_planes, out_planes, kernel_size, stride=1, paddi...
    method forward (line 12) | def forward(self, x):
  class ZPool (line 20) | class ZPool(nn.Module):
    method forward (line 21) | def forward(self, x):
  class AttentionGate (line 24) | class AttentionGate(nn.Module):
    method __init__ (line 25) | def __init__(self):
    method forward (line 30) | def forward(self, x):
  class TripletAttention (line 36) | class TripletAttention(nn.Module):
    method __init__ (line 37) | def __init__(self, no_spatial=False):
    method forward (line 44) | def forward(self, x):

FILE: model/attention/UFOAttention.py
  function XNorm (line 8) | def XNorm(x,gamma):
  class UFOAttention (line 13) | class UFOAttention(nn.Module):
    method __init__ (line 18) | def __init__(self, d_model, d_k, d_v, h,dropout=.1):
    method init_weights (line 41) | def init_weights(self):
    method forward (line 55) | def forward(self, queries, keys, values):

FILE: model/attention/ViP.py
  class MLP (line 5) | class MLP(nn.Module):
    method __init__ (line 6) | def __init__(self,in_features,hidden_features,out_features,act_layer=n...
    method forward (line 13) | def forward(self, x) :
  class WeightedPermuteMLP (line 16) | class WeightedPermuteMLP(nn.Module):
    method __init__ (line 17) | def __init__(self,dim,seg_dim=8, qkv_bias=False, proj_drop=0.):
    method forward (line 30) | def forward(self,x) :

FILE: model/attention/gfnet.py
  class PatchEmbed (line 6) | class PatchEmbed(nn.Module):
    method __init__ (line 9) | def __init__(self, img_size=224, patch_size=16, in_chans=3, embed_dim=...
    method forward (line 19) | def forward(self, x):
  class GlobalFilter (line 27) | class GlobalFilter(nn.Module):
    method __init__ (line 28) | def __init__(self, dim, h=14, w=8):
    method forward (line 34) | def forward(self, x, spatial_size=None):
  class Mlp (line 53) | class Mlp(nn.Module):
    method __init__ (line 54) | def __init__(self, in_features, hidden_features=None, out_features=Non...
    method forward (line 63) | def forward(self, x):
  class Block (line 71) | class Block(nn.Module):
    method __init__ (line 72) | def __init__(self, dim, mlp_ratio=4., drop=0., drop_path=0., act_layer...
    method forward (line 81) | def forward(self, x):
  class GFNet (line 86) | class GFNet(nn.Module):
    method __init__ (line 87) | def __init__(self, embed_dim=384, img_size=224, patch_size=16, mlp_rat...
    method forward (line 104) | def forward(self, x):

FILE: model/backbone/CMT.py
  function _cfg (line 21) | def _cfg(url='', **kwargs):
  class SwishImplementation (line 33) | class SwishImplementation(torch.autograd.Function):
    method forward (line 35) | def forward(ctx, i):
    method backward (line 41) | def backward(ctx, grad_output):
  class MemoryEfficientSwish (line 47) | class MemoryEfficientSwish(nn.Module):
    method forward (line 48) | def forward(self, x):
  class Mlp (line 52) | class Mlp(nn.Module):
    method __init__ (line 53) | def __init__(self, in_features, hidden_features=None, out_features=Non...
    method forward (line 71) | def forward(self, x, H, W):
  class Attention (line 85) | class Attention(nn.Module):
    method __init__ (line 86) | def __init__(self, dim, num_heads=8, qkv_bias=False, qk_scale=None,
    method forward (line 109) | def forward(self, x, H, W, relative_pos):
  class Block (line 131) | class Block(nn.Module):
    method __init__ (line 132) | def __init__(self, dim, num_heads, mlp_ratio=4., qkv_bias=False, qk_sc...
    method forward (line 146) | def forward(self, x, H, W, relative_pos):
  class PatchEmbed (line 156) | class PatchEmbed(nn.Module):
    method __init__ (line 159) | def __init__(self, img_size=224, patch_size=16, in_chans=3, embed_dim=...
    method forward (line 175) | def forward(self, x):
  class CMT (line 187) | class CMT(nn.Module):
    method __init__ (line 188) | def __init__(self, img_size=224, in_chans=3, num_classes=1000, embed_d...
    method _init_weights (line 276) | def _init_weights(self, m):
    method update_temperature (line 292) | def update_temperature(self):
    method no_weight_decay (line 298) | def no_weight_decay(self):
    method get_classifier (line 301) | def get_classifier(self):
    method reset_classifier (line 304) | def reset_classifier(self, num_classes, global_pool=''):
    method forward_features (line 308) | def forward_features(self, x):
    method forward (line 350) | def forward(self, x):
  function resize_pos_embed (line 356) | def resize_pos_embed(posemb, posemb_new):
  function checkpoint_filter_fn (line 376) | def checkpoint_filter_fn(state_dict, model):
  function _create_cmt_model (line 394) | def _create_cmt_model(pretrained=False, distilled=False, **kwargs):
  function cmt_ti (line 419) | def cmt_ti(pretrained=False, **kwargs):
  function cmt_xs (line 428) | def cmt_xs(pretrained=False, **kwargs):
  function cmt_s (line 439) | def cmt_s(pretrained=False, **kwargs):
  function cmt_b (line 450) | def cmt_b(pretrained=False, **kwargs):
  function CMT_Tiny (line 461) | def CMT_Tiny(pretrained=False, **kwargs):

FILE: model/backbone/CPVT.py
  class Mlp (line 13) | class Mlp(nn.Module):
    method __init__ (line 14) | def __init__(self, in_features, hidden_features=None, out_features=Non...
    method forward (line 23) | def forward(self, x):
  class GroupAttention (line 32) | class GroupAttention(nn.Module):
    method __init__ (line 36) | def __init__(self, dim, num_heads=8, qkv_bias=False, qk_scale=None, at...
    method forward (line 52) | def forward(self, x, H, W):
  class Attention (line 74) | class Attention(nn.Module):
    method __init__ (line 78) | def __init__(self, dim, num_heads=8, qkv_bias=False, qk_scale=None, at...
    method forward (line 98) | def forward(self, x, H, W):
  class Block (line 122) | class Block(nn.Module):
    method __init__ (line 124) | def __init__(self, dim, num_heads, mlp_ratio=4., qkv_bias=False, qk_sc...
    method forward (line 137) | def forward(self, x, H, W):
  class SBlock (line 144) | class SBlock(TimmBlock):
    method __init__ (line 145) | def __init__(self, dim, num_heads, mlp_ratio=4., qkv_bias=False, qk_sc...
    method forward (line 150) | def forward(self, x, H, W):
  class GroupBlock (line 154) | class GroupBlock(TimmBlock):
    method __init__ (line 155) | def __init__(self, dim, num_heads, mlp_ratio=4., qkv_bias=False, qk_sc...
    method forward (line 165) | def forward(self, x, H, W):
  class PatchEmbed (line 171) | class PatchEmbed(nn.Module):
    method __init__ (line 175) | def __init__(self, img_size=224, patch_size=16, in_chans=3, embed_dim=...
    method forward (line 189) | def forward(self, x):
  class PyramidVisionTransformer (line 200) | class PyramidVisionTransformer(nn.Module):
    method __init__ (line 201) | def __init__(self, img_size=224, patch_size=16, in_chans=3, num_classe...
    method reset_drop_path (line 251) | def reset_drop_path(self, drop_path_rate):
    method _init_weights (line 259) | def _init_weights(self, m):
    method no_weight_decay (line 269) | def no_weight_decay(self):
    method get_classifier (line 272) | def get_classifier(self):
    method reset_classifier (line 275) | def reset_classifier(self, num_classes, global_pool=''):
    method forward_features (line 279) | def forward_features(self, x):
    method forward (line 297) | def forward(self, x):
  class PosCNN (line 305) | class PosCNN(nn.Module):
    method __init__ (line 306) | def __init__(self, in_chans, embed_dim=768, s=1):
    method forward (line 311) | def forward(self, x, H, W):
    method no_weight_decay (line 322) | def no_weight_decay(self):
  class CPVTV2 (line 326) | class CPVTV2(PyramidVisionTransformer):
    method __init__ (line 333) | def __init__(self, img_size=224, patch_size=4, in_chans=3, num_classes...
    method _init_weights (line 347) | def _init_weights(self, m):
    method no_weight_decay (line 366) | def no_weight_decay(self):
    method forward_features (line 369) | def forward_features(self, x):
  class PCPVT (line 387) | class PCPVT(CPVTV2):
    method __init__ (line 388) | def __init__(self, img_size=224, patch_size=4, in_chans=3, num_classes...
  class ALTGVT (line 397) | class ALTGVT(PCPVT):
    method __init__ (line 401) | def __init__(self, img_size=224, patch_size=4, in_chans=3, num_classes...
  function _conv_filter (line 425) | def _conv_filter(state_dict, patch_size=16):
  function pcpvt_small_v0 (line 437) | def pcpvt_small_v0(pretrained=False, **kwargs):
  function pcpvt_base_v0 (line 447) | def pcpvt_base_v0(pretrained=False, **kwargs):
  function pcpvt_large_v0 (line 457) | def pcpvt_large_v0(pretrained=False, **kwargs):

FILE: model/backbone/CaiT.py
  class Class_Attention (line 20) | class Class_Attention(nn.Module):
    method __init__ (line 23) | def __init__(self, dim, num_heads=8, qkv_bias=False, qk_scale=None, at...
    method forward (line 37) | def forward(self, x ):
  class LayerScale_Block_CA (line 56) | class LayerScale_Block_CA(nn.Module):
    method __init__ (line 59) | def __init__(self, dim, num_heads, mlp_ratio=4., qkv_bias=False, qk_sc...
    method forward (line 74) | def forward(self, x, x_cls):
  class Attention_talking_head (line 86) | class Attention_talking_head(nn.Module):
    method __init__ (line 89) | def __init__(self, dim, num_heads=8, qkv_bias=False, qk_scale=None, at...
    method forward (line 110) | def forward(self, x):
  class LayerScale_Block (line 129) | class LayerScale_Block(nn.Module):
    method __init__ (line 132) | def __init__(self, dim, num_heads, mlp_ratio=4., qkv_bias=False, qk_sc...
    method forward (line 146) | def forward(self, x):
  class CaiT (line 154) | class CaiT(nn.Module):
    method __init__ (line 157) | def __init__(self, img_size=224, patch_size=16, in_chans=3, num_classe...
    method _init_weights (line 212) | def _init_weights(self, m):
    method no_weight_decay (line 222) | def no_weight_decay(self):
    method forward_features (line 226) | def forward_features(self, x):
    method forward (line 247) | def forward(self, x):
  function cait_XXS24_224 (line 255) | def cait_XXS24_224(pretrained=False, **kwargs):
  function cait_XXS24 (line 277) | def cait_XXS24(pretrained=False, **kwargs):
  function cait_XXS36_224 (line 298) | def cait_XXS36_224(pretrained=False, **kwargs):
  function cait_XXS36 (line 320) | def cait_XXS36(pretrained=False, **kwargs):
  function cait_XS24 (line 342) | def cait_XS24(pretrained=False, **kwargs):
  function cait_S24_224 (line 367) | def cait_S24_224(pretrained=False, **kwargs):
  function cait_S24 (line 389) | def cait_S24(pretrained=False, **kwargs):
  function cait_S36 (line 411) | def cait_S36(pretrained=False, **kwargs):
  function cait_M36 (line 433) | def cait_M36(pretrained=False, **kwargs):
  function cait_M48 (line 456) | def cait_M48(pretrained=False, **kwargs):

FILE: model/backbone/CeiT.py
  class Image2Tokens (line 18) | class Image2Tokens(nn.Module):
    method __init__ (line 19) | def __init__(self, in_chans=3, out_chans=64, kernel_size=7, stride=2):
    method forward (line 26) | def forward(self, x):
  class Mlp (line 33) | class Mlp(nn.Module):
    method __init__ (line 34) | def __init__(self, in_features, hidden_features=None, out_features=Non...
    method forward (line 43) | def forward(self, x):
  class LocallyEnhancedFeedForward (line 52) | class LocallyEnhancedFeedForward(nn.Module):
    method __init__ (line 53) | def __init__(self, in_features, hidden_features=None, out_features=Non...
    method forward (line 76) | def forward(self, x):
  class Attention (line 101) | class Attention(nn.Module):
    method __init__ (line 102) | def __init__(self, dim, num_heads=8, qkv_bias=False, qk_scale=None, at...
    method forward (line 115) | def forward(self, x):
  class AttentionLCA (line 131) | class AttentionLCA(Attention):
    method __init__ (line 132) | def __init__(self, dim, num_heads=8, qkv_bias=False, qk_scale=None, at...
    method forward (line 137) | def forward(self, x):
  class Block (line 164) | class Block(nn.Module):
    method __init__ (line 166) | def __init__(self, dim, num_heads, mlp_ratio=4., qkv_bias=False, qk_sc...
    method forward (line 191) | def forward(self, x):
  class HybridEmbed (line 203) | class HybridEmbed(nn.Module):
    method __init__ (line 207) | def __init__(self, backbone, img_size=224, patch_size=16, feature_size...
    method forward (line 236) | def forward(self, x):
  class CeIT (line 244) | class CeIT(nn.Module):
    method __init__ (line 245) | def __init__(self,
    method _init_weights (line 325) | def _init_weights(self, m):
    method no_weight_decay (line 335) | def no_weight_decay(self):
    method get_classifier (line 338) | def get_classifier(self):
    method reset_classifier (line 341) | def reset_classifier(self, num_classes, global_pool=''):
    method forward_features (line 345) | def forward_features(self, x):
    method forward (line 367) | def forward(self, x):
  function ceit_tiny_patch16_224 (line 374) | def ceit_tiny_patch16_224(pretrained=False, **kwargs):
  function ceit_small_patch16_224 (line 390) | def ceit_small_patch16_224(pretrained=False, **kwargs):
  function ceit_base_patch16_224 (line 406) | def ceit_base_patch16_224(pretrained=False, **kwargs):
  function ceit_tiny_patch16_384 (line 422) | def ceit_tiny_patch16_384(pretrained=False, **kwargs):
  function ceit_small_patch16_384 (line 438) | def ceit_small_patch16_384(pretrained=False, **kwargs):

FILE: model/backbone/CoaT.py
  function _cfg_coat (line 29) | def _cfg_coat(url='', **kwargs):
  class Mlp (line 40) | class Mlp(nn.Module):
    method __init__ (line 42) | def __init__(self, in_features, hidden_features=None, out_features=Non...
    method forward (line 51) | def forward(self, x):
  class ConvRelPosEnc (line 60) | class ConvRelPosEnc(nn.Module):
    method __init__ (line 62) | def __init__(self, Ch, h, window):
    method forward (line 97) | def forward(self, q, v, size):
  class FactorAtt_ConvRelPosEnc (line 119) | class FactorAtt_ConvRelPosEnc(nn.Module):
    method __init__ (line 121) | def __init__(self, dim, num_heads=8, qkv_bias=False, qk_scale=None, at...
    method forward (line 135) | def forward(self, x, size):
  class ConvPosEnc (line 161) | class ConvPosEnc(nn.Module):
    method __init__ (line 165) | def __init__(self, dim, k=3):
    method forward (line 169) | def forward(self, x, size):
  class SerialBlock (line 188) | class SerialBlock(nn.Module):
    method __init__ (line 191) | def __init__(self, dim, num_heads, mlp_ratio=4., qkv_bias=False, qk_sc...
    method forward (line 210) | def forward(self, x, size):
  class ParallelBlock (line 225) | class ParallelBlock(nn.Module):
    method __init__ (line 227) | def __init__(self, dims, num_heads, mlp_ratios=[], qkv_bias=False, qk_...
    method upsample (line 261) | def upsample(self, x, output_size, size):
    method downsample (line 265) | def downsample(self, x, output_size, size):
    method interpolate (line 269) | def interpolate(self, x, output_size, size):
    method forward (line 286) | def forward(self, x1, x2, x3, x4, sizes):
  class PatchEmbed (line 327) | class PatchEmbed(nn.Module):
    method __init__ (line 329) | def __init__(self, patch_size=16, in_chans=3, embed_dim=768):
    method forward (line 337) | def forward(self, x):
  class CoaT (line 347) | class CoaT(nn.Module):
    method __init__ (line 349) | def __init__(self, patch_size=16, in_chans=3, num_classes=1000, embed_...
    method _init_weights (line 461) | def _init_weights(self, m):
    method no_weight_decay (line 471) | def no_weight_decay(self):
    method get_classifier (line 474) | def get_classifier(self):
    method reset_classifier (line 477) | def reset_classifier(self, num_classes, global_pool=''):
    method insert_cls (line 481) | def insert_cls(self, x, cls_token):
    method remove_cls (line 487) | def remove_cls(self, x):
    method forward_features (line 491) | def forward_features(self, x0):
    method forward (line 578) | def forward(self, x):
  function coat_tiny (line 589) | def coat_tiny(**kwargs):
  function coat_mini (line 595) | def coat_mini(**kwargs):
  function coat_small (line 601) | def coat_small(**kwargs):
  function coat_lite_tiny (line 608) | def coat_lite_tiny(**kwargs):
  function coat_lite_mini (line 614) | def coat_lite_mini(**kwargs):
  function coat_lite_small (line 620) | def coat_lite_small(**kwargs):
  function coat_lite_medium (line 626) | def coat_lite_medium(**kwargs):

FILE: model/backbone/ConTNet.py
  function _no_grad_trunc_normal_ (line 20) | def _no_grad_trunc_normal_(tensor, mean, std, a, b):
  function trunc_normal_ (line 56) | def trunc_normal_(tensor, mean=0., std=1., a=-2., b=2.):
  function fixed_padding (line 76) | def fixed_padding(inputs, kernel_size, dilation):
  class ConvBN (line 84) | class ConvBN(nn.Sequential):
    method __init__ (line 85) | def __init__(self, in_planes, out_planes, kernel_size, stride=1, group...
  class MHSA (line 99) | class MHSA(nn.Module):
    method __init__ (line 104) | def __init__(self,
    method forward (line 139) | def forward(self, x):
  class MLP (line 163) | class MLP(nn.Module):
    method __init__ (line 168) | def __init__(self,
    method forward (line 179) | def forward(self, x):
  class STE (line 189) | class STE(nn.Module):
    method __init__ (line 195) | def __init__(self,
    method forward (line 222) | def forward(self, x):
  class ConTBlock (line 250) | class ConTBlock(nn.Module):
    method __init__ (line 254) | def __init__(self,
    method forward (line 283) | def forward(self, x):
  class ConTNet (line 297) | class ConTNet(nn.Module):
    method __init__ (line 301) | def __init__(self,
    method _make_layer (line 382) | def _make_layer(self,
    method _initialize_weights (line 415) | def _initialize_weights(self):
    method forward (line 428) | def forward(self, x):
  function create_ConTNet_Ti (line 440) | def create_ConTNet_Ti(kwargs):
  function create_ConTNet_S (line 450) | def create_ConTNet_S(kwargs):
  function create_ConTNet_M (line 460) | def create_ConTNet_M(kwargs):
  function create_ConTNet_B (line 470) | def create_ConTNet_B(kwargs):
  function build_model (line 480) | def build_model(use_avgdown, relative, qkv_bias, pre_norm):

FILE: model/backbone/ConViT.py
  class Mlp (line 26) | class Mlp(nn.Module):
    method __init__ (line 27) | def __init__(self, in_features, hidden_features=None, out_features=Non...
    method _init_weights (line 37) | def _init_weights(self, m):
    method forward (line 46) | def forward(self, x):
  class GPSA (line 55) | class GPSA(nn.Module):
    method __init__ (line 56) | def __init__(self, dim, num_heads=8, qkv_bias=False, qk_scale=None, at...
    method _init_weights (line 77) | def _init_weights(self, m):
    method forward (line 86) | def forward(self, x):
    method get_attention (line 98) | def get_attention(self, x):
    method get_attention_map (line 114) | def get_attention_map(self, x, return_map = False):
    method local_init (line 125) | def local_init(self, locality_strength=1.):
    method get_rel_indices (line 140) | def get_rel_indices(self, num_patches):
  class MHSA (line 154) | class MHSA(nn.Module):
    method __init__ (line 155) | def __init__(self, dim, num_heads=8, qkv_bias=False, qk_scale=None, at...
    method _init_weights (line 167) | def _init_weights(self, m):
    method get_attention_map (line 176) | def get_attention_map(self, x, return_map = False):
    method forward (line 200) | def forward(self, x):
  class Block (line 214) | class Block(nn.Module):
    method __init__ (line 216) | def __init__(self, dim, num_heads,  mlp_ratio=4., qkv_bias=False, qk_s...
    method forward (line 230) | def forward(self, x):
  class PatchEmbed (line 236) | class PatchEmbed(nn.Module):
    method __init__ (line 239) | def __init__(self, img_size=224, patch_size=16, in_chans=3, embed_dim=...
    method forward (line 250) | def forward(self, x):
    method _init_weights (line 256) | def _init_weights(self, m):
  class HybridEmbed (line 266) | class HybridEmbed(nn.Module):
    method __init__ (line 269) | def __init__(self, backbone, img_size=224, feature_size=None, in_chans...
    method forward (line 291) | def forward(self, x):
  class VisionTransformer (line 298) | class VisionTransformer(nn.Module):
    method __init__ (line 301) | def __init__(self, img_size=224, patch_size=16, in_chans=3, num_classe...
    method _init_weights (line 350) | def _init_weights(self, m):
    method no_weight_decay (line 360) | def no_weight_decay(self):
    method get_classifier (line 363) | def get_classifier(self):
    method reset_classifier (line 366) | def reset_classifier(self, num_classes, global_pool=''):
    method forward_features (line 370) | def forward_features(self, x):
    method forward (line 388) | def forward(self, x):
  function convit_tiny (line 395) | def convit_tiny(pretrained=False, **kwargs):
  function convit_small (line 411) | def convit_small(pretrained=False, **kwargs):
  function convit_base (line 427) | def convit_base(pretrained=False, **kwargs):

FILE: model/backbone/Container.py
  class Mlp (line 17) | class Mlp(nn.Module):
    method __init__ (line 19) | def __init__(self, in_features, hidden_features=None, out_features=Non...
    method forward (line 28) | def forward(self, x):
  class CMlp (line 36) | class CMlp(nn.Module):
    method __init__ (line 38) | def __init__(self, in_features, hidden_features=None, out_features=Non...
    method forward (line 47) | def forward(self, x):
  class Attention (line 57) | class Attention(nn.Module):
    method __init__ (line 59) | def __init__(self, dim, num_heads=8, qkv_bias=False, qk_scale=None, at...
    method forward (line 71) | def forward(self, x):
  class Attention_pure (line 86) | class Attention_pure(nn.Module):
    method __init__ (line 88) | def __init__(self, dim, num_heads=8, qkv_bias=False, qk_scale=None, at...
    method forward (line 98) | def forward(self, x):
  class MixBlock (line 112) | class MixBlock(nn.Module):
    method __init__ (line 114) | def __init__(self, dim, num_heads, mlp_ratio=4., qkv_bias=False, qk_sc...
    method forward (line 135) | def forward(self, x):
  class CBlock (line 154) | class CBlock(nn.Module):
    method __init__ (line 156) | def __init__(self, dim, num_heads, mlp_ratio=4., qkv_bias=False, qk_sc...
    method forward (line 170) | def forward(self, x):
  class Block (line 176) | class Block(nn.Module):
    method __init__ (line 178) | def __init__(self, dim, num_heads, mlp_ratio=4., qkv_bias=False, qk_sc...
    method forward (line 192) | def forward(self, x):
  class PatchEmbed (line 199) | class PatchEmbed(nn.Module):
    method __init__ (line 203) | def __init__(self, img_size=224, patch_size=16, in_chans=3, embed_dim=...
    method forward (line 214) | def forward(self, x):
  class HybridEmbed (line 226) | class HybridEmbed(nn.Module):
    method __init__ (line 228) | def __init__(self, backbone, img_size=224, feature_size=None, in_chans...
    method forward (line 257) | def forward(self, x):
  class VisionTransformer (line 265) | class VisionTransformer(nn.Module):
    method __init__ (line 271) | def __init__(self, img_size=[224, 56, 28, 14], patch_size=[4, 2, 2, 2]...
    method _init_weights (line 356) | def _init_weights(self, m):
    method no_weight_decay (line 366) | def no_weight_decay(self):
    method get_classifier (line 369) | def get_classifier(self):
    method reset_classifier (line 372) | def reset_classifier(self, num_classes, global_pool=''):
    method forward_features (line 376) | def forward_features(self, x):
    method forward (line 395) | def forward(self, x):
  function container_v1_light (line 404) | def container_v1_light(pretrained=False, **kwargs):

FILE: model/backbone/ConvMixer.py
  class Residual (line 6) | class Residual(nn.Module):
    method __init__ (line 7) | def __init__(self,fn):
    method forward (line 10) | def forward(self,x):
  function ConvMixer (line 13) | def ConvMixer(dim,depth,kernel_size=9,patch_size=7,num_classes=1000):

FILE: model/backbone/CrossViT.py
  class PatchEmbed (line 36) | class PatchEmbed(nn.Module):
    method __init__ (line 39) | def __init__(self, img_size=224, patch_size=16, in_chans=3, embed_dim=...
    method forward (line 67) | def forward(self, x):
  class CrossAttention (line 76) | class CrossAttention(nn.Module):
    method __init__ (line 77) | def __init__(self, dim, num_heads=8, qkv_bias=False, qk_scale=None, at...
    method forward (line 91) | def forward(self, x):
  class CrossAttentionBlock (line 108) | class CrossAttentionBlock(nn.Module):
    method __init__ (line 110) | def __init__(self, dim, num_heads, mlp_ratio=4., qkv_bias=False, qk_sc...
    method forward (line 124) | def forward(self, x):
  class MultiScaleBlock (line 132) | class MultiScaleBlock(nn.Module):
    method __init__ (line 134) | def __init__(self, dim, patches, depth, num_heads, mlp_ratio, qkv_bias...
    method forward (line 186) | def forward(self, x):
  function _compute_num_patches (line 201) | def _compute_num_patches(img_size, patches):
  class VisionTransformer (line 205) | class VisionTransformer(nn.Module):
    method __init__ (line 208) | def __init__(self, img_size=(224, 224), patch_size=(8, 16), in_chans=3...
    method _init_weights (line 263) | def _init_weights(self, m):
    method no_weight_decay (line 273) | def no_weight_decay(self):
    method get_classifier (line 279) | def get_classifier(self):
    method reset_classifier (line 282) | def reset_classifier(self, num_classes, global_pool=''):
    method forward_features (line 286) | def forward_features(self, x):
    method forward (line 307) | def forward(self, x):
  function crossvit_tiny_224 (line 317) | def crossvit_tiny_224(pretrained=False, **kwargs):
  function crossvit_small_224 (line 330) | def crossvit_small_224(pretrained=False, **kwargs):
  function crossvit_base_224 (line 343) | def crossvit_base_224(pretrained=False, **kwargs):
  function crossvit_9_224 (line 356) | def crossvit_9_224(pretrained=False, **kwargs):
  function crossvit_15_224 (line 369) | def crossvit_15_224(pretrained=False, **kwargs):
  function crossvit_18_224 (line 382) | def crossvit_18_224(pretrained=False, **kwargs):
  function crossvit_9_dagger_224 (line 395) | def crossvit_9_dagger_224(pretrained=False, **kwargs):
  function crossvit_15_dagger_224 (line 407) | def crossvit_15_dagger_224(pretrained=False, **kwargs):
  function crossvit_15_dagger_384 (line 419) | def crossvit_15_dagger_384(pretrained=False, **kwargs):
  function crossvit_18_dagger_224 (line 431) | def crossvit_18_dagger_224(pretrained=False, **kwargs):
  function crossvit_18_dagger_384 (line 443) | def crossvit_18_dagger_384(pretrained=False, **kwargs):

FILE: model/backbone/DViT.py
  class Mlp (line 23) | class Mlp(nn.Module):
    method __init__ (line 24) | def __init__(self, in_features, hidden_features=None, out_features=Non...
    method forward (line 34) | def forward(self, x):
  class Attention (line 42) | class Attention(nn.Module):
    method __init__ (line 43) | def __init__(self, dim, num_heads=8, qkv_bias=False, qk_scale=None, at...
    method forward (line 56) | def forward(self, x, atten=None):
  class ReAttention (line 70) | class ReAttention(nn.Module):
    method __init__ (line 75) | def __init__(self, dim, num_heads=8, qkv_bias=False, qk_scale=None, at...
    method forward (line 94) | def forward(self, x, atten=None):
  class Block (line 110) | class Block(nn.Module):
    method __init__ (line 112) | def __init__(self, dim, num_heads, mlp_ratio=4., qkv_bias=False, qk_sc...
    method forward (line 135) | def forward(self, x, atten=None):
  class PatchEmbed_CNN (line 147) | class PatchEmbed_CNN(nn.Module):
    method __init__ (line 151) | def __init__(self, img_size=224, patch_size=16, in_chans=3, embed_dim=...
    method forward (line 170) | def forward(self, x):
  class PatchEmbed (line 182) | class PatchEmbed(nn.Module):
    method __init__ (line 186) | def __init__(self, img_size=224, patch_size=16, in_chans=3, embed_dim=...
    method forward (line 197) | def forward(self, x):
  class HybridEmbed (line 206) | class HybridEmbed(nn.Module):
    method __init__ (line 210) | def __init__(self, backbone, img_size=224, feature_size=None, in_chans...
    method forward (line 231) | def forward(self, x):
  function _cfg (line 237) | def _cfg(url='', **kwargs):
  class DeepVisionTransformer (line 269) | class DeepVisionTransformer(nn.Module):
    method __init__ (line 272) | def __init__(self, img_size=224, patch_size=16, in_chans=3, num_classe...
    method _init_weights (line 314) | def _init_weights(self, m):
    method no_weight_decay (line 324) | def no_weight_decay(self):
    method get_classifier (line 327) | def get_classifier(self):
    method reset_classifier (line 330) | def reset_classifier(self, num_classes, global_pool=''):
    method forward_features (line 334) | def forward_features(self, x):
    method forward (line 356) | def forward(self, x):
  function deepvit_patch16_224_re_attn_16b (line 368) | def deepvit_patch16_224_re_attn_16b(pretrained=False, **kwargs):
  function deepvit_patch16_224_re_attn_24b (line 381) | def deepvit_patch16_224_re_attn_24b(pretrained=False, **kwargs):
  function deepvit_patch16_224_re_attn_32b (line 394) | def deepvit_patch16_224_re_attn_32b(pretrained=False, **kwargs):
  function deepvit_S (line 406) | def deepvit_S(pretrained=False, **kwargs):
  function deepvit_L (line 418) | def deepvit_L(pretrained=False, **kwargs):
  function deepvit_L_384 (line 431) | def deepvit_L_384(pretrained=False, **kwargs):

FILE: model/backbone/DeiT.py
  class DistilledVisionTransformer (line 21) | class DistilledVisionTransformer(VisionTransformer):
    method __init__ (line 22) | def __init__(self, *args, **kwargs):
    method forward_features (line 33) | def forward_features(self, x):
    method forward (line 52) | def forward(self, x):
  function deit_tiny_patch16_224 (line 64) | def deit_tiny_patch16_224(pretrained=False, **kwargs):
  function deit_small_patch16_224 (line 79) | def deit_small_patch16_224(pretrained=False, **kwargs):
  function deit_base_patch16_224 (line 94) | def deit_base_patch16_224(pretrained=False, **kwargs):
  function deit_tiny_distilled_patch16_224 (line 109) | def deit_tiny_distilled_patch16_224(pretrained=False, **kwargs):
  function deit_small_distilled_patch16_224 (line 124) | def deit_small_distilled_patch16_224(pretrained=False, **kwargs):
  function deit_base_distilled_patch16_224 (line 139) | def deit_base_distilled_patch16_224(pretrained=False, **kwargs):
  function deit_base_patch16_384 (line 154) | def deit_base_patch16_384(pretrained=False, **kwargs):
  function deit_base_distilled_patch16_384 (line 169) | def deit_base_distilled_patch16_384(pretrained=False, **kwargs):

FILE: model/backbone/EfficientFormer.py
  class Attention (line 30) | class Attention(torch.nn.Module):
    method __init__ (line 31) | def __init__(self, dim=384, key_dim=32, num_heads=8,
    method train (line 62) | def train(self, mode=True):
    method forward (line 69) | def forward(self, x):  # x (B,N,C)
  function stem (line 89) | def stem(in_chs, out_chs):
  class Embedding (line 99) | class Embedding(nn.Module):
    method __init__ (line 106) | def __init__(self, patch_size=16, stride=16, padding=0,
    method forward (line 116) | def forward(self, x):
  class Flat (line 122) | class Flat(nn.Module):
    method __init__ (line 124) | def __init__(self, ):
    method forward (line 127) | def forward(self, x):
  class Pooling (line 132) | class Pooling(nn.Module):
    method __init__ (line 138) | def __init__(self, pool_size=3):
    method forward (line 143) | def forward(self, x):
  class LinearMlp (line 147) | class LinearMlp(nn.Module):
    method __init__ (line 151) | def __init__(self, in_features, hidden_features=None, out_features=Non...
    method forward (line 162) | def forward(self, x):
  class Mlp (line 171) | class Mlp(nn.Module):
    method __init__ (line 177) | def __init__(self, in_features, hidden_features=None,
    method _init_weights (line 191) | def _init_weights(self, m):
    method forward (line 197) | def forward(self, x):
  class Meta3D (line 212) | class Meta3D(nn.Module):
    method __init__ (line 214) | def __init__(self, dim, mlp_ratio=4.,
    method forward (line 237) | def forward(self, x):
  class Meta4D (line 252) | class Meta4D(nn.Module):
    method __init__ (line 254) | def __init__(self, dim, pool_size=3, mlp_ratio=4.,
    method forward (line 274) | def forward(self, x):
  function meta_blocks (line 289) | def meta_blocks(dim, index, layers,
  class EfficientFormer (line 323) | class EfficientFormer(nn.Module):
    method __init__ (line 325) | def __init__(self, layers, embed_dims=None,
    method cls_init_weights (line 403) | def cls_init_weights(self, m):
    method init_weights (line 411) | def init_weights(self, pretrained=None):
    method forward_tokens (line 441) | def forward_tokens(self, x):
    method forward (line 453) | def forward(self, x):
  function _cfg (line 470) | def _cfg(url='', **kwargs):
  function efficientformer_l1 (line 482) | def efficientformer_l1(pretrained=False, **kwargs):
  function efficientformer_l3 (line 494) | def efficientformer_l3(pretrained=False, **kwargs):
  function efficientformer_l7 (line 506) | def efficientformer_l7(pretrained=False, **kwargs):

FILE: model/backbone/HATNet.py
  class InvertedResidual (line 11) | class InvertedResidual(nn.Module):
    method __init__ (line 12) | def __init__(self, in_dim, hidden_dim=None, out_dim=None, kernel_size=3,
    method forward (line 33) | def forward(self, x):
  class Attention (line 43) | class Attention(nn.Module):
    method __init__ (line 44) | def __init__(self, dim, head_dim, grid_size=1, ds_ratio=1, drop=0.):
    method forward (line 65) | def forward(self, x):
  class Block (line 105) | class Block(nn.Module):
    method __init__ (line 106) | def __init__(self, dim, head_dim, grid_size=1, ds_ratio=1, expansion=4,
    method forward (line 114) | def forward(self, x):
  class Downsample (line 120) | class Downsample(nn.Module):
    method __init__ (line 121) | def __init__(self, in_dim, out_dim, kernel_size=3):
    method forward (line 126) | def forward(self, x):
  class HATNet (line 131) | class HATNet(nn.Module):
    method __init__ (line 132) | def __init__(self, img_size=224, in_chans=3, num_classes=1000, dims=[6...
    method reset_drop_path (line 166) | def reset_drop_path(self, drop_path_rate):
    method _init_weights (line 174) | def _init_weights(self, m):
    method forward (line 183) | def forward(self, x):

FILE: model/backbone/LeViT.py
  function replace_batchnorm (line 15) | def replace_batchnorm(net):
  function LeViT_128S (line 48) | def LeViT_128S(num_classes=1000, distillation=True,
  function LeViT_128 (line 55) | def LeViT_128(num_classes=1000, distillation=True,
  function LeViT_192 (line 62) | def LeViT_192(num_classes=1000, distillation=True,
  function LeViT_256 (line 69) | def LeViT_256(num_classes=1000, distillation=True,
  function LeViT_384 (line 76) | def LeViT_384(num_classes=1000, distillation=True,
  class Conv2d_BN (line 85) | class Conv2d_BN(torch.nn.Sequential):
    method __init__ (line 86) | def __init__(self, a, b, ks=1, stride=1, pad=0, dilation=1,
    method fuse (line 102) | def fuse(self):
  class Linear_BN (line 115) | class Linear_BN(torch.nn.Sequential):
    method __init__ (line 116) | def __init__(self, a, b, bn_weight_init=1, resolution=-100000):
    method fuse (line 129) | def fuse(self):
    method forward (line 140) | def forward(self, x):
  class BN_Linear (line 146) | class BN_Linear(torch.nn.Sequential):
    method __init__ (line 147) | def __init__(self, a, b, bias=True, std=0.02):
    method fuse (line 159) | def fuse(self):
  function b16 (line 175) | def b16(n, activation, resolution=224):
  class Residual (line 186) | class Residual(torch.nn.Module):
    method __init__ (line 187) | def __init__(self, m, drop):
    method forward (line 192) | def forward(self, x):
  class Attention (line 200) | class Attention(torch.nn.Module):
    method __init__ (line 201) | def __init__(self, dim, key_dim, num_heads=8,
    method train (line 242) | def train(self, mode=True):
    method forward (line 249) | def forward(self, x):  # x (B,N,C)
  class Subsample (line 270) | class Subsample(torch.nn.Module):
    method __init__ (line 271) | def __init__(self, stride, resolution):
    method forward (line 276) | def forward(self, x):
  class AttentionSubsample (line 283) | class AttentionSubsample(torch.nn.Module):
    method __init__ (line 284) | def __init__(self, in_dim, out_dim, key_dim, num_heads=8,
    method train (line 342) | def train(self, mode=True):
    method forward (line 349) | def forward(self, x):
  class LeViT (line 368) | class LeViT(torch.nn.Module):
    method __init__ (line 372) | def __init__(self, img_size=224,
    method no_weight_decay (line 455) | def no_weight_decay(self):
    method forward (line 458) | def forward(self, x):
  function model_factory (line 472) | def model_factory(C, D, X, N, drop_path, weights,

FILE: model/backbone/MobileNetV3.py
  function _cfg (line 26) | def _cfg(url='', **kwargs):
  class MobileNetV3 (line 107) | class MobileNetV3(nn.Module):
    method __init__ (line 119) | def __init__(
    method as_sequential (line 157) | def as_sequential(self):
    method group_matcher (line 165) | def group_matcher(self, coarse=False):
    method set_grad_checkpointing (line 172) | def set_grad_checkpointing(self, enable=True):
    method get_classifier (line 176) | def get_classifier(self):
    method reset_classifier (line 179) | def reset_classifier(self, num_classes, global_pool='avg'):
    method forward_features (line 186) | def forward_features(self, x):
    method forward_head (line 195) | def forward_head(self, x, pre_logits: bool = False):
    method forward (line 207) | def forward(self, x):
  class MobileNetV3Features (line 212) | class MobileNetV3Features(nn.Module):
    method __init__ (line 218) | def __init__(
    method forward (line 252) | def forward(self, x) -> List[torch.Tensor]:
  function _create_mnv3 (line 270) | def _create_mnv3(variant, pretrained=False, **kwargs):
  function _gen_mobilenet_v3_rw (line 287) | def _gen_mobilenet_v3_rw(variant, channel_multiplier=1.0, pretrained=Fal...
  function _gen_mobilenet_v3 (line 322) | def _gen_mobilenet_v3(variant, channel_multiplier=1.0, pretrained=False,...
  function _gen_fbnetv3 (line 416) | def _gen_fbnetv3(variant, channel_multiplier=1.0, pretrained=False, **kw...
  function _gen_lcnet (line 476) | def _gen_lcnet(variant, channel_multiplier=1.0, pretrained=False, **kwar...
  function _gen_lcnet (line 511) | def _gen_lcnet(variant, channel_multiplier=1.0, pretrained=False, **kwar...
  function mobilenetv3_large_075 (line 548) | def mobilenetv3_large_075(pretrained=False, **kwargs):
  function mobilenetv3_large_100 (line 555) | def mobilenetv3_large_100(pretrained=False, **kwargs):
  function mobilenetv3_large_100_miil (line 562) | def mobilenetv3_large_100_miil(pretrained=False, **kwargs):
  function mobilenetv3_large_100_miil_in21k (line 571) | def mobilenetv3_large_100_miil_in21k(pretrained=False, **kwargs):
  function mobilenetv3_small_050 (line 580) | def mobilenetv3_small_050(pretrained=False, **kwargs):
  function mobilenetv3_small_075 (line 587) | def mobilenetv3_small_075(pretrained=False, **kwargs):
  function mobilenetv3_small_100 (line 594) | def mobilenetv3_small_100(pretrained=False, **kwargs):
  function mobilenetv3_rw (line 601) | def mobilenetv3_rw(pretrained=False, **kwargs):
  function tf_mobilenetv3_large_075 (line 611) | def tf_mobilenetv3_large_075(pretrained=False, **kwargs):
  function tf_mobilenetv3_large_100 (line 620) | def tf_mobilenetv3_large_100(pretrained=False, **kwargs):
  function tf_mobilenetv3_large_minimal_100 (line 629) | def tf_mobilenetv3_large_minimal_100(pretrained=False, **kwargs):
  function tf_mobilenetv3_small_075 (line 638) | def tf_mobilenetv3_small_075(pretrained=False, **kwargs):
  function tf_mobilenetv3_small_100 (line 647) | def tf_mobilenetv3_small_100(pretrained=False, **kwargs):
  function tf_mobilenetv3_small_minimal_100 (line 656) | def tf_mobilenetv3_small_minimal_100(pretrained=False, **kwargs):
  function fbnetv3_b (line 665) | def fbnetv3_b(pretrained=False, **kwargs):
  function fbnetv3_d (line 672) | def fbnetv3_d(pretrained=False, **kwargs):
  function fbnetv3_g (line 679) | def fbnetv3_g(pretrained=False, **kwargs):
  function lcnet_035 (line 686) | def lcnet_035(pretrained=False, **kwargs):
  function lcnet_050 (line 693) | def lcnet_050(pretrained=False, **kwargs):
  function lcnet_075 (line 700) | def lcnet_075(pretrained=False, **kwargs):
  function lcnet_100 (line 707) | def lcnet_100(pretrained=False, **kwargs):
  function lcnet_150 (line 714) | def lcnet_150(pretrained=False, **kwargs):

FILE: model/backbone/MobileViT.py
  function conv_bn (line 9) | def conv_bn(inp,oup,kernel_size=3,stride=1):
  class PreNorm (line 16) | class PreNorm(nn.Module):
    method __init__ (line 17) | def __init__(self,dim,fn):
    method forward (line 21) | def forward(self,x,**kwargs):
  class FeedForward (line 24) | class FeedForward(nn.Module):
    method __init__ (line 25) | def __init__(self,dim,mlp_dim,dropout) :
    method forward (line 34) | def forward(self,x):
  class Attention (line 37) | class Attention(nn.Module):
    method __init__ (line 38) | def __init__(self,dim,heads,head_dim,dropout):
    method forward (line 54) | def forward(self,x):
  class Transformer (line 67) | class Transformer(nn.Module):
    method __init__ (line 68) | def __init__(self,dim,depth,heads,head_dim,mlp_dim,dropout=0.):
    method forward (line 78) | def forward(self,x):
  class MobileViTAttention (line 85) | class MobileViTAttention(nn.Module):
    method __init__ (line 86) | def __init__(self,in_channel=3,dim=512,kernel_size=3,patch_size=7,dept...
    method forward (line 97) | def forward(self,x):
  class MV2Block (line 117) | class MV2Block(nn.Module):
    method __init__ (line 118) | def __init__(self,inp,out,stride=1,expansion=4):
    method forward (line 144) | def forward(self,x):
  class MobileViT (line 151) | class MobileViT(nn.Module):
    method __init__ (line 152) | def __init__(self,image_size,dims,channels,num_classes,depths=[2,4,3],...
    method forward (line 179) | def forward(self,x):
  function mobilevit_xxs (line 199) | def mobilevit_xxs():
  function mobilevit_xs (line 204) | def mobilevit_xs():
  function mobilevit_s (line 209) | def mobilevit_s():
  function count_paratermeters (line 215) | def count_paratermeters(model):

FILE: model/backbone/PIT.py
  class Transformer (line 15) | class Transformer(nn.Module):
    method __init__ (line 16) | def __init__(self, base_dim, depth, heads, mlp_ratio,
    method forward (line 38) | def forward(self, x, cls_tokens):
  class conv_head_pooling (line 54) | class conv_head_pooling(nn.Module):
    method __init__ (line 55) | def __init__(self, in_feature, out_feature, stride,
    method forward (line 64) | def forward(self, x, cls_token):
  class conv_embedding (line 72) | class conv_embedding(nn.Module):
    method __init__ (line 73) | def __init__(self, in_channels, out_channels, patch_size,
    method forward (line 79) | def forward(self, x):
  class PoolingTransformer (line 84) | class PoolingTransformer(nn.Module):
    method __init__ (line 85) | def __init__(self, image_size, patch_size, stride, base_dims, depth, h...
    method _init_weights (line 149) | def _init_weights(self, m):
    method no_weight_decay (line 155) | def no_weight_decay(self):
    method get_classifier (line 158) | def get_classifier(self):
    method reset_classifier (line 161) | def reset_classifier(self, num_classes, global_pool=''):
    method forward_features (line 168) | def forward_features(self, x):
    method forward (line 184) | def forward(self, x):
  class DistilledPoolingTransformer (line 190) | class DistilledPoolingTransformer(PoolingTransformer):
    method __init__ (line 191) | def __init__(self, *args, **kwargs):
    method forward (line 205) | def forward(self, x):
  function pit_b (line 215) | def pit_b(pretrained, **kwargs):
  function pit_s (line 233) | def pit_s(pretrained, **kwargs):
  function pit_xs (line 252) | def pit_xs(pretrained, **kwargs):
  function pit_ti (line 270) | def pit_ti(pretrained, **kwargs):
  function pit_b_distilled (line 289) | def pit_b_distilled(pretrained, **kwargs):
  function pit_s_distilled (line 308) | def pit_s_distilled(pretrained, **kwargs):
  function pit_xs_distilled (line 327) | def pit_xs_distilled(pretrained, **kwargs):
  function pit_ti_distilled (line 346) | def pit_ti_distilled(pretrained, **kwargs):

FILE: model/backbone/PVT.py
  class Mlp (line 15) | class Mlp(nn.Module):
    method __init__ (line 16) | def __init__(self, in_features, hidden_features=None, out_features=Non...
    method forward (line 25) | def forward(self, x):
  class Attention (line 34) | class Attention(nn.Module):
    method __init__ (line 35) | def __init__(self, dim, num_heads=8, qkv_bias=False, qk_scale=None, at...
    method forward (line 55) | def forward(self, x, H, W):
  class Block (line 79) | class Block(nn.Module):
    method __init__ (line 81) | def __init__(self, dim, num_heads, mlp_ratio=4., qkv_bias=False, qk_sc...
    method forward (line 95) | def forward(self, x, H, W):
  class PatchEmbed (line 102) | class PatchEmbed(nn.Module):
    method __init__ (line 106) | def __init__(self, img_size=224, patch_size=16, in_chans=3, embed_dim=...
    method forward (line 120) | def forward(self, x):
  class PyramidVisionTransformer (line 130) | class PyramidVisionTransformer(nn.Module):
    method __init__ (line 131) | def __init__(self, img_size=224, patch_size=16, in_chans=3, num_classe...
    method _init_weights (line 180) | def _init_weights(self, m):
    method no_weight_decay (line 190) | def no_weight_decay(self):
    method get_classifier (line 194) | def get_classifier(self):
    method reset_classifier (line 197) | def reset_classifier(self, num_classes, global_pool=''):
    method _get_pos_embed (line 201) | def _get_pos_embed(self, pos_embed, patch_embed, H, W):
    method forward_features (line 209) | def forward_features(self, x):
    method forward (line 237) | def forward(self, x):
  function _conv_filter (line 244) | def _conv_filter(state_dict, patch_size=16):
  function pvt_tiny (line 256) | def pvt_tiny(pretrained=False, **kwargs):
  function pvt_small (line 267) | def pvt_small(pretrained=False, **kwargs):
  function pvt_medium (line 277) | def pvt_medium(pretrained=False, **kwargs):
  function pvt_large (line 288) | def pvt_large(pretrained=False, **kwargs):
  function pvt_huge_v2 (line 299) | def pvt_huge_v2(pretrained=False, **kwargs):

FILE: model/backbone/PatchConvnet.py
  class Mlp (line 18) | class Mlp(nn.Module):
    method __init__ (line 19) | def __init__(
    method forward (line 35) | def forward(self, x: torch.Tensor) -> torch.Tensor:
  class Learned_Aggregation_Layer (line 44) | class Learned_Aggregation_Layer(nn.Module):
    method __init__ (line 45) | def __init__(
    method forward (line 68) | def forward(self, x: torch.Tensor) -> torch.Tensor:
  class Learned_Aggregation_Layer_multi (line 88) | class Learned_Aggregation_Layer_multi(nn.Module):
    method __init__ (line 89) | def __init__(
    method forward (line 112) | def forward(self, x: torch.Tensor) -> torch.Tensor:
  class Layer_scale_init_Block_only_token (line 143) | class Layer_scale_init_Block_only_token(nn.Module):
    method __init__ (line 144) | def __init__(
    method forward (line 173) | def forward(self, x: torch.Tensor, x_cls: torch.Tensor) -> torch.Tensor:
  class Conv_blocks_se (line 180) | class Conv_blocks_se(nn.Module):
    method __init__ (line 181) | def __init__(self, dim: int):
    method forward (line 193) | def forward(self, x: torch.Tensor) -> torch.Tensor:
  class Layer_scale_init_Block (line 204) | class Layer_scale_init_Block(nn.Module):
    method __init__ (line 205) | def __init__(
    method forward (line 220) | def forward(self, x: torch.Tensor) -> torch.Tensor:
  function conv3x3 (line 224) | def conv3x3(in_planes: int, out_planes: int, stride: int = 1) -> nn.Sequ...
  class ConvStem (line 231) | class ConvStem(nn.Module):
    method __init__ (line 234) | def __init__(self, img_size: int = 224, patch_size: int = 16, in_chans...
    method forward (line 253) | def forward(self, x: torch.Tensor, padding_size: Optional[int] = None)...
  class PatchConvnet (line 259) | class PatchConvnet(nn.Module):
    method __init__ (line 260) | def __init__(
    method _init_weights (line 360) | def _init_weights(self, m):
    method no_weight_decay (line 370) | def no_weight_decay(self):
    method get_classifier (line 373) | def get_classifier(self):
    method get_num_layers (line 376) | def get_num_layers(self):
    method reset_classifier (line 379) | def reset_classifier(self, num_classes: int, global_pool: str = ''):
    method forward_features (line 383) | def forward_features(self, x: torch.Tensor) -> torch.Tensor:
    method forward (line 402) | def forward(self, x: torch.Tensor) -> torch.Tensor:
  function S60 (line 416) | def S60(pretrained: bool = False, **kwargs):
  function S120 (line 435) | def S120(pretrained: bool = False, **kwargs):
  function B60 (line 454) | def B60(pretrained: bool = False, **kwargs):
  function B120 (line 471) | def B120(pretrained: bool = False, **kwargs):
  function L60 (line 489) | def L60(pretrained: bool = False, **kwargs):
  function L120 (line 508) | def L120(pretrained: bool = False, **kwargs):
  function S60_multi (line 527) | def S60_multi(pretrained: bool = False, **kwargs):

FILE: model/backbone/ShuffleTransformer.py
  class Mlp (line 8) | class Mlp(nn.Module):
    method __init__ (line 9) | def __init__(self, in_features, hidden_features=None, out_features=Non...
    method forward (line 19) | def forward(self, x):
  class Attention (line 27) | class Attention(nn.Module):
    method __init__ (line 28) | def __init__(self, dim, num_heads, window_size=1, shuffle=False, qkv_b...
    method forward (line 65) | def forward(self, x):
  class Block (line 95) | class Block(nn.Module):
    method __init__ (line 96) | def __init__(self, dim, out_dim, num_heads, window_size=1, shuffle=Fal...
    method forward (line 111) | def forward(self, x):
  class PatchMerging (line 118) | class PatchMerging(nn.Module):
    method __init__ (line 119) | def __init__(self, dim, out_dim, norm_layer=nn.BatchNorm2d):
    method forward (line 126) | def forward(self, x):
    method extra_repr (line 131) | def extra_repr(self) -> str:
  class StageModule (line 135) | class StageModule(nn.Module):
    method __init__ (line 136) | def __init__(self, layers, dim, out_dim, num_heads, window_size=1, shu...
    method forward (line 159) | def forward(self, x):
  class PatchEmbedding (line 169) | class PatchEmbedding(nn.Module):
    method __init__ (line 170) | def __init__(self, inter_channel=32, out_channels=48):
    method forward (line 184) | def forward(self, x):
  class ShuffleTransformer (line 189) | class ShuffleTransformer(nn.Module):
    method __init__ (line 190) | def __init__(self, img_size=224, in_chans=3, num_classes=1000, token_d...
    method _init_weights (line 227) | def _init_weights(self, m):
    method no_weight_decay (line 237) | def no_weight_decay(self):
    method no_weight_decay_keywords (line 241) | def no_weight_decay_keywords(self):
    method get_classifier (line 244) | def get_classifier(self):
    method reset_classifier (line 247) | def reset_classifier(self, num_classes, global_pool=''):
    method forward_features (line 251) | def forward_features(self, x):
    method forward (line 268) | def forward(self, x):

FILE: model/backbone/TnT.py
  function _cfg (line 15) | def _cfg(url='', **kwargs):
  function make_divisible (line 36) | def make_divisible(v, divisor=8, min_value=None):
  class Mlp (line 45) | class Mlp(nn.Module):
    method __init__ (line 46) | def __init__(self, in_features, hidden_features=None, out_features=Non...
    method forward (line 55) | def forward(self, x):
  class SE (line 64) | class SE(nn.Module):
    method __init__ (line 65) | def __init__(self, dim, hidden_ratio=None):
    method forward (line 78) | def forward(self, x):
  class Attention (line 85) | class Attention(nn.Module):
    method __init__ (line 86) | def __init__(self, dim, hidden_dim, num_heads=8, qkv_bias=False, qk_sc...
    method forward (line 101) | def forward(self, x):
  class Block (line 117) | class Block(nn.Module):
    method __init__ (line 120) | def __init__(self, outer_dim, inner_dim, outer_num_heads, inner_num_he...
    method forward (line 153) | def forward(self, inner_tokens, outer_tokens):
  class PatchEmbed (line 169) | class PatchEmbed(nn.Module):
    method __init__ (line 172) | def __init__(self, img_size=224, patch_size=16, in_chans=3, outer_dim=...
    method forward (line 186) | def forward(self, x):
  class TNT (line 198) | class TNT(nn.Module):
    method __init__ (line 201) | def __init__(self, img_size=224, patch_size=16, in_chans=3, num_classe...
    method _init_weights (line 253) | def _init_weights(self, m):
    method no_weight_decay (line 263) | def no_weight_decay(self):
    method get_classifier (line 266) | def get_classifier(self):
    method reset_classifier (line 269) | def reset_classifier(self, num_classes, global_pool=''):
    method forward_features (line 273) | def forward_features(self, x):
    method forward (line 289) | def forward(self, x):
  function _conv_filter (line 295) | def _conv_filter(state_dict, patch_size=16):
  function tnt_s_patch16_224 (line 306) | def tnt_s_patch16_224(pretrained=False, **kwargs):
  function tnt_b_patch16_224 (line 326) | def tnt_b_patch16_224(pretrained=False, **kwargs):

FILE: model/backbone/VOLO.py
  function _cfg (line 28) | def _cfg(url='', **kwargs):
  class OutlookAttention (line 45) | class OutlookAttention(nn.Module):
    method __init__ (line 54) | def __init__(self, dim, num_heads, kernel_size=3, padding=1, stride=1,
    method forward (line 74) | def forward(self, x):
  class Outlooker (line 103) | class Outlooker(nn.Module):
    method __init__ (line 113) | def __init__(self, dim, kernel_size, padding, stride=1,
    method forward (line 134) | def forward(self, x):
  class Mlp (line 140) | class Mlp(nn.Module):
    method __init__ (line 143) | def __init__(self, in_features, hidden_features=None,
    method forward (line 154) | def forward(self, x):
  class Attention (line 163) | class Attention(nn.Module):
    method __init__ (line 166) | def __init__(self, dim,  num_heads=8, qkv_bias=False,
    method forward (line 178) | def forward(self, x):
  class Transformer (line 197) | class Transformer(nn.Module):
    method __init__ (line 202) | def __init__(self, dim, num_heads, mlp_ratio=4., qkv_bias=False,
    method forward (line 220) | def forward(self, x):
  class ClassAttention (line 226) | class ClassAttention(nn.Module):
    method __init__ (line 231) | def __init__(self, dim, num_heads=8, head_dim=None, qkv_bias=False,
    method forward (line 250) | def forward(self, x):
  class ClassBlock (line 269) | class ClassBlock(nn.Module):
    method __init__ (line 275) | def __init__(self, dim, num_heads, head_dim=None, mlp_ratio=4.,
    method forward (line 293) | def forward(self, x):
  function get_block (line 300) | def get_block(block_type, **kargs):
  function rand_bbox (line 308) | def rand_bbox(size, lam, scale=1):
  class PatchEmbed (line 331) | class PatchEmbed(nn.Module):
    method __init__ (line 337) | def __init__(self, img_size=224, stem_conv=False, stem_stride=1,
    method forward (line 365) | def forward(self, x):
  class Downsample (line 372) | class Downsample(nn.Module):
    method __init__ (line 376) | def __init__(self, in_embed_dim, out_embed_dim, patch_size):
    method forward (line 381) | def forward(self, x):
  function outlooker_blocks (line 388) | def outlooker_blocks(block_fn, index, dim, layers, num_heads=1, kernel_s...
  function transformer_blocks (line 409) | def transformer_blocks(block_fn, index, dim, layers, num_heads, mlp_rati...
  class VOLO (line 433) | class VOLO(nn.Module):
    method __init__ (line 458) | def __init__(self, layers, img_size=224, in_chans=3, num_classes=1000,...
    method _init_weights (line 548) | def _init_weights(self, m):
    method no_weight_decay (line 558) | def no_weight_decay(self):
    method get_classifier (line 561) | def get_classifier(self):
    method reset_classifier (line 564) | def reset_classifier(self, num_classes):
    method forward_embeddings (line 569) | def forward_embeddings(self, x):
    method forward_tokens (line 576) | def forward_tokens(self, x):
    method forward_cls (line 587) | def forward_cls(self, x):
    method forward (line 595) | def forward(self, x):
  function volo_d1 (line 649) | def volo_d1(pretrained=False, **kwargs):
  function volo_d2 (line 682) | def volo_d2(pretrained=False, **kwargs):
  function volo_d3 (line 705) | def volo_d3(pretrained=False, **kwargs):
  function volo_d4 (line 728) | def volo_d4(pretrained=False, **kwargs):
  function volo_d5 (line 751) | def volo_d5(pretrained=False, **kwargs):

FILE: model/backbone/convnextv2.py
  class Block (line 13) | class Block(nn.Module):
    method __init__ (line 20) | def __init__(self, dim, drop_path=0.):
    method forward (line 30) | def forward(self, x):
  class LayerNorm (line 44) | class LayerNorm(nn.Module):
    method __init__ (line 50) | def __init__(self, normalized_shape, eps=1e-6, data_format="channels_l...
    method forward (line 60) | def forward(self, x):
  class GRN (line 70) | class GRN(nn.Module):
    method __init__ (line 73) | def __init__(self, dim):
    method forward (line 78) | def forward(self, x):
  class ConvNeXtV2 (line 84) | class ConvNeXtV2(nn.Module):
    method __init__ (line 95) | def __init__(self, in_chans=3, num_classes=1000,
    method _init_weights (line 131) | def _init_weights(self, m):
    method forward_features (line 136) | def forward_features(self, x):
    method forward (line 142) | def forward(self, x):
  function convnextv2_atto (line 147) | def convnextv2_atto(**kwargs):
  function convnextv2_femto (line 151) | def convnextv2_femto(**kwargs):
  function convnext_pico (line 155) | def convnext_pico(**kwargs):
  function convnextv2_nano (line 159) | def convnextv2_nano(**kwargs):
  function convnextv2_tiny (line 163) | def convnextv2_tiny(**kwargs):
  function convnextv2_base (line 167) | def convnextv2_base(**kwargs):
  function convnextv2_large (line 171) | def convnextv2_large(**kwargs):
  function convnextv2_huge (line 175) | def convnextv2_huge(**kwargs):

FILE: model/backbone/resnet.py
  class BottleNeck (line 10) | class BottleNeck(nn.Module):
    method __init__ (line 12) | def __init__(self,in_channel,channel,stride=1,downsample=None):
    method forward (line 29) | def forward(self,x):
  class ResNet (line 43) | class ResNet(nn.Module):
    method __init__ (line 44) | def __init__(self,block,layers,num_classes=1000):
    method forward (line 65) | def forward(self,x):
    method _make_layer (line 86) | def _make_layer(self,block,channel,blocks,stride=1):
  function ResNet50 (line 101) | def ResNet50(num_classes=1000):
  function ResNet101 (line 105) | def ResNet101(num_classes=1000):
  function ResNet152 (line 109) | def ResNet152(num_classes=1000):

FILE: model/backbone/resnext.py
  class BottleNeck (line 10) | class BottleNeck(nn.Module):
    method __init__ (line 12) | def __init__(self,in_channel,channel,stride=1,C=32,downsample=None):
    method forward (line 29) | def forward(self,x):
  class ResNeXt (line 43) | class ResNeXt(nn.Module):
    method __init__ (line 44) | def __init__(self,block,layers,num_classes=1000):
    method forward (line 65) | def forward(self,x):
    method _make_layer (line 86) | def _make_layer(self,block,channel,blocks,stride=1):
  function ResNeXt50 (line 101) | def ResNeXt50(num_classes=1000):
  function ResNeXt101 (line 105) | def ResNeXt101(num_classes=1000):
  function ResNeXt152 (line 109) | def ResNeXt152(num_classes=1000):

FILE: model/backbone/swin_transformer.py
  function _cfg (line 35) | def _cfg(url='', **kwargs):
  function window_partition (line 99) | def window_partition(x, window_size: int):
  function window_reverse (line 114) | def window_reverse(windows, window_size: int, H: int, W: int):
  function get_relative_position_index (line 130) | def get_relative_position_index(win_h, win_w):
  class WindowAttention (line 142) | class WindowAttention(nn.Module):
    method __init__ (line 155) | def __init__(self, dim, num_heads, head_dim=None, window_size=7, qkv_b...
    method _get_rel_pos_bias (line 181) | def _get_rel_pos_bias(self) -> torch.Tensor:
    method forward (line 187) | def forward(self, x, mask: Optional[torch.Tensor] = None):
  class SwinTransformerBlock (line 217) | class SwinTransformerBlock(nn.Module):
    method __init__ (line 235) | def __init__(
    method forward (line 284) | def forward(self, x):
  class PatchMerging (line 324) | class PatchMerging(nn.Module):
    method __init__ (line 332) | def __init__(self, input_resolution, dim, out_dim=None, norm_layer=nn....
    method forward (line 340) | def forward(self, x):
  class BasicLayer (line 364) | class BasicLayer(nn.Module):
    method __init__ (line 382) | def __init__(
    method forward (line 408) | def forward(self, x):
  class SwinTransformer (line 418) | class SwinTransformer(nn.Module):
    method __init__ (line 442) | def __init__(
    method init_weights (line 502) | def init_weights(self, mode=''):
    method no_weight_decay (line 510) | def no_weight_decay(self):
    method group_matcher (line 518) | def group_matcher(self, coarse=False):
    method set_grad_checkpointing (line 529) | def set_grad_checkpointing(self, enable=True):
    method get_classifier (line 534) | def get_classifier(self):
    method reset_classifier (line 537) | def reset_classifier(self, num_classes, global_pool=None):
    method forward_features (line 544) | def forward_features(self, x):
    method forward_head (line 553) | def forward_head(self, x, pre_logits: bool = False):
    method forward (line 558) | def forward(self, x):
  function _create_swin_transformer (line 564) | def _create_swin_transformer(variant, pretrained=False, **kwargs):
  function swin_base_patch4_window12_384 (line 574) | def swin_base_patch4_window12_384(pretrained=False, **kwargs):
  function swin_base_patch4_window7_224 (line 583) | def swin_base_patch4_window7_224(pretrained=False, **kwargs):
  function swin_large_patch4_window12_384 (line 592) | def swin_large_patch4_window12_384(pretrained=False, **kwargs):
  function swin_large_patch4_window7_224 (line 601) | def swin_large_patch4_window7_224(pretrained=False, **kwargs):
  function swin_small_patch4_window7_224 (line 610) | def swin_small_patch4_window7_224(pretrained=False, **kwargs):
  function swin_tiny_patch4_window7_224 (line 619) | def swin_tiny_patch4_window7_224(pretrained=False, **kwargs):
  function swin_base_patch4_window12_384_in22k (line 628) | def swin_base_patch4_window12_384_in22k(pretrained=False, **kwargs):
  function swin_base_patch4_window7_224_in22k (line 637) | def swin_base_patch4_window7_224_in22k(pretrained=False, **kwargs):
  function swin_large_patch4_window12_384_in22k (line 646) | def swin_large_patch4_window12_384_in22k(pretrained=False, **kwargs):
  function swin_large_patch4_window7_224_in22k (line 655) | def swin_large_patch4_window7_224_in22k(pretrained=False, **kwargs):
  function swin_s3_tiny_224 (line 664) | def swin_s3_tiny_224(pretrained=False, **kwargs):
  function swin_s3_small_224 (line 674) | def swin_s3_small_224(pretrained=False, **kwargs):
  function swin_s3_base_224 (line 684) | def swin_s3_base_224(pretrained=False, **kwargs):

FILE: model/backbone/swin_transformer_v2.py
  function _cfg (line 30) | def _cfg(url='', **kwargs):
  function window_partition (line 94) | def window_partition(x, window_size: Tuple[int, int]):
  function window_reverse (line 109) | def window_reverse(windows, window_size: Tuple[int, int], img_size: Tupl...
  class WindowAttention (line 125) | class WindowAttention(nn.Module):
    method __init__ (line 138) | def __init__(
    method forward (line 202) | def forward(self, x, mask: Optional[torch.Tensor] = None):
  class SwinTransformerBlock (line 244) | class SwinTransformerBlock(nn.Module):
    method __init__ (line 262) | def __init__(
    method _calc_window_shift (line 311) | def _calc_window_shift(self, target_window_size, target_shift_size) ->...
    method _attn (line 318) | def _attn(self, x):
    method forward (line 350) | def forward(self, x):
  class PatchMerging (line 356) | class PatchMerging(nn.Module):
    method __init__ (line 364) | def __init__(self, input_resolution, dim, norm_layer=nn.LayerNorm):
    method forward (line 371) | def forward(self, x):
  class BasicLayer (line 396) | class BasicLayer(nn.Module):
    method __init__ (line 414) | def __init__(
    method forward (line 445) | def forward(self, x):
    method _init_respostnorm (line 454) | def _init_respostnorm(self):
  class SwinTransformerV2 (line 462) | class SwinTransformerV2(nn.Module):
    method __init__ (line 487) | def __init__(
    method _init_weights (line 550) | def _init_weights(self, m):
    method no_weight_decay (line 557) | def no_weight_decay(self):
    method group_matcher (line 565) | def group_matcher(self, coarse=False):
    method set_grad_checkpointing (line 576) | def set_grad_checkpointing(self, enable=True):
    method get_classifier (line 581) | def get_classifier(self):
    method reset_classifier (line 584) | def reset_classifier(self, num_classes, global_pool=None):
    method forward_features (line 591) | def forward_features(self, x):
    method forward_head (line 603) | def forward_head(self, x, pre_logits: bool = False):
    method forward (line 608) | def forward(self, x):
  function checkpoint_filter_fn (line 614) | def checkpoint_filter_fn(state_dict, model):
  function _create_swin_transformer_v2 (line 626) | def _create_swin_transformer_v2(variant, pretrained=False, **kwargs):
  function swinv2_tiny_window16_256 (line 635) | def swinv2_tiny_window16_256(pretrained=False, **kwargs):
  function swinv2_tiny_window8_256 (line 644) | def swinv2_tiny_window8_256(pretrained=False, **kwargs):
  function swinv2_small_window16_256 (line 653) | def swinv2_small_window16_256(pretrained=False, **kwargs):
  function swinv2_small_window8_256 (line 662) | def swinv2_small_window8_256(pretrained=False, **kwargs):
  function swinv2_base_window16_256 (line 671) | def swinv2_base_window16_256(pretrained=False, **kwargs):
  function swinv2_base_window8_256 (line 680) | def swinv2_base_window8_256(pretrained=False, **kwargs):
  function swinv2_base_window12_192_22k (line 689) | def swinv2_base_window12_192_22k(pretrained=False, **kwargs):
  function swinv2_base_window12to16_192to256_22kft1k (line 698) | def swinv2_base_window12to16_192to256_22kft1k(pretrained=False, **kwargs):
  function swinv2_base_window12to24_192to384_22kft1k (line 709) | def swinv2_base_window12to24_192to384_22kft1k(pretrained=False, **kwargs):
  function swinv2_large_window12_192_22k (line 720) | def swinv2_large_window12_192_22k(pretrained=False, **kwargs):
  function swinv2_large_window12to16_192to256_22kft1k (line 729) | def swinv2_large_window12to16_192to256_22kft1k(pretrained=False, **kwargs):
  function swinv2_large_window12to24_192to384_22kft1k (line 740) | def swinv2_large_window12to24_192to384_22kft1k(pretrained=False, **kwargs):

FILE: model/backbone/swin_transformer_v2_cr.py
  function _cfg (line 46) | def _cfg(url='', **kwargs):
  function bchw_to_bhwc (line 100) | def bchw_to_bhwc(x: torch.Tensor) -> torch.Tensor:
  function bhwc_to_bchw (line 105) | def bhwc_to_bchw(x: torch.Tensor) -> torch.Tensor:
  function window_partition (line 110) | def window_partition(x, window_size: Tuple[int, int]):
  function window_reverse (line 125) | def window_reverse(windows, window_size: Tuple[int, int], img_size: Tupl...
  class WindowMultiHeadAttention (line 141) | class WindowMultiHeadAttention(nn.Module):
    method __init__ (line 153) | def __init__(
    method _make_pair_wise_relative_positions (line 187) | def _make_pair_wise_relative_positions(self) -> None:
    method update_input_size (line 199) | def update_input_size(self, new_window_size: int, **kwargs: Any) -> None:
    method _relative_positional_encodings (line 209) | def _relative_positional_encodings(self) -> torch.Tensor:
    method _forward_sequential (line 223) | def _forward_sequential(
    method _forward_batch (line 233) | def _forward_batch(
    method forward (line 265) | def forward(self, x: torch.Tensor, mask: Optional[torch.Tensor] = None...
  class SwinTransformerBlock (line 279) | class SwinTransformerBlock(nn.Module):
    method __init__ (line 296) | def __init__(
    method _calc_window_shift (line 349) | def _calc_window_shift(self, target_window_size):
    method _make_attention_mask (line 354) | def _make_attention_mask(self) -> None:
    method init_weights (line 380) | def init_weights(self):
    method update_input_size (line 386) | def update_input_size(self, new_window_size: Tuple[int, int], new_feat...
    method _shifted_window_attn (line 399) | def _shifted_window_attn(self, x):
    method forward (line 434) | def forward(self, x: torch.Tensor) -> torch.Tensor:
  class PatchMerging (line 448) | class PatchMerging(nn.Module):
    method __init__ (line 455) | def __init__(self, dim: int, norm_layer: Type[nn.Module] = nn.LayerNor...
    method forward (line 460) | def forward(self, x: torch.Tensor) -> torch.Tensor:
  class PatchEmbed (line 476) | class PatchEmbed(nn.Module):
    method __init__ (line 478) | def __init__(self, img_size=224, patch_size=16, in_chans=3, embed_dim=...
    method forward (line 490) | def forward(self, x):
  class SwinTransformerStage (line 499) | class SwinTransformerStage(nn.Module):
    method __init__ (line 518) | def __init__(
    method update_input_size (line 569) | def update_input_size(self, new_window_size: int, new_feat_size: Tuple...
    method forward (line 581) | def forward(self, x: torch.Tensor) -> torch.Tensor:
  class SwinTransformerV2Cr (line 603) | class SwinTransformerV2Cr(nn.Module):
    method __init__ (line 627) | def __init__(
    method update_input_size (line 700) | def update_input_size(
    method group_matcher (line 729) | def group_matcher(self, coarse=False):
    method set_grad_checkpointing (line 739) | def set_grad_checkpointing(self, enable=True):
    method get_classifier (line 744) | def get_classifier(self) -> nn.Module:
    method reset_classifier (line 751) | def reset_classifier(self, num_classes: int, global_pool: Optional[str...
    method forward_features (line 762) | def forward_features(self, x: torch.Tensor) -> torch.Tensor:
    method forward_head (line 767) | def forward_head(self, x, pre_logits: bool = False):
    method forward (line 772) | def forward(self, x: torch.Tensor) -> torch.Tensor:
  function init_weights (line 778) | def init_weights(module: nn.Module, name: str = ''):
  function checkpoint_filter_fn (line 795) | def checkpoint_filter_fn(state_dict, model):
  function _create_swin_transformer_v2_cr (line 810) | def _create_swin_transformer_v2_cr(variant, pretrained=False, **kwargs):
  function swinv2_cr_tiny_384 (line 822) | def swinv2_cr_tiny_384(pretrained=False, **kwargs):
  function swinv2_cr_tiny_224 (line 834) | def swinv2_cr_tiny_224(pretrained=False, **kwargs):
  function swinv2_cr_tiny_ns_224 (line 846) | def swinv2_cr_tiny_ns_224(pretrained=False, **kwargs):
  function swinv2_cr_small_384 (line 861) | def swinv2_cr_small_384(pretrained=False, **kwargs):
  function swinv2_cr_small_224 (line 874) | def swinv2_cr_small_224(pretrained=False, **kwargs):
  function swinv2_cr_small_ns_224 (line 886) | def swinv2_cr_small_ns_224(pretrained=False, **kwargs):
  function swinv2_cr_base_384 (line 899) | def swinv2_cr_base_384(pretrained=False, **kwargs):
  function swinv2_cr_base_224 (line 911) | def swinv2_cr_base_224(pretrained=False, **kwargs):
  function swinv2_cr_base_ns_224 (line 923) | def swinv2_cr_base_ns_224(pretrained=False, **kwargs):
  function swinv2_cr_large_384 (line 936) | def swinv2_cr_large_384(pretrained=False, **kwargs):
  function swinv2_cr_large_224 (line 949) | def swinv2_cr_large_224(pretrained=False, **kwargs):
  function swinv2_cr_huge_384 (line 961) | def swinv2_cr_huge_384(pretrained=False, **kwargs):
  function swinv2_cr_huge_224 (line 974) | def swinv2_cr_huge_224(pretrained=False, **kwargs):
  function swinv2_cr_giant_384 (line 987) | def swinv2_cr_giant_384(pretrained=False, **kwargs):
  function swinv2_cr_giant_224 (line 1001) | def swinv2_cr_giant_224(pretrained=False, **kwargs):

FILE: model/conv/CondConv.py
  class Attention (line 5) | class Attention(nn.Module):
    method __init__ (line 6) | def __init__(self,in_planes,K,init_weight=True):
    method _initialize_weights (line 16) | def _initialize_weights(self):
    method forward (line 26) | def forward(self,x):
  class CondConv (line 31) | class CondConv(nn.Module):
    method __init__ (line 32) | def __init__(self,in_planes,out_planes,kernel_size,stride,padding=0,di...
    method _initialize_weights (line 56) | def _initialize_weights(self):
    method forward (line 60) | def forward(self,x):

FILE: model/conv/DepthwiseSeparableConvolution.py
  class DepthwiseSeparableConvolution (line 4) | class DepthwiseSeparableConvolution(nn.Module):
    method __init__ (line 5) | def __init__(self,in_ch,out_ch,kernel_size=3,stride=1,padding=1):
    method forward (line 24) | def forward(self, x):

FILE: model/conv/DynamicConv.py
  class Attention (line 5) | class Attention(nn.Module):
    method __init__ (line 6) | def __init__(self,in_planes,ratio,K,temprature=30,init_weight=True):
    method update_temprature (line 21) | def update_temprature(self):
    method _initialize_weights (line 25) | def _initialize_weights(self):
    method forward (line 35) | def forward(self,x):
  class DynamicConv (line 40) | class DynamicConv(nn.Module):
    method __init__ (line 41) | def __init__(self,in_planes,out_planes,kernel_size,stride,padding=0,di...
    method _initialize_weights (line 65) | def _initialize_weights(self):
    method forward (line 69) | def forward(self,x):

FILE: model/conv/HorNet.py
  function get_dwconv (line 9) | def get_dwconv(dim, kernel, bias):
  class GlobalLocalFilter (line 12) | class GlobalLocalFilter(nn.Module):
    method __init__ (line 14) | def __init__(self, dim, h=14, w=8):
    method forward (line 22) | def forward(self, x):
  class gnconv (line 45) | class gnconv(nn.Module):
    method __init__ (line 46) | def __init__(self, dim, order=5, gflayer=None, h=14, w=8, s=1.0):
    method forward (line 66) | def forward(self, x, mask=None, dummy=False):
  class Block (line 84) | class Block(nn.Module):
    method __init__ (line 87) | def __init__(self, dim, drop_path=0., layer_scale_init_value=1e-6, gnc...
    method forward (line 104) | def forward(self, x):
  class HorNet (line 126) | class HorNet(nn.Module):
    method __init__ (line 127) | def __init__(self, in_chans=3, num_classes=1000,
    method _init_weights (line 176) | def _init_weights(self, m):
    method forward_features (line 188) | def forward_features(self, x):
    method forward (line 195) | def forward(self, x):
  class LayerNorm (line 200) | class LayerNorm(nn.Module):
    method __init__ (line 206) | def __init__(self, normalized_shape, eps=1e-6, data_format="channels_l...
    method forward (line 216) | def forward(self, x):
  function hornet_tiny_7x7 (line 227) | def hornet_tiny_7x7(pretrained=False,in_22k=False, **kwargs):
  function hornet_tiny_gf (line 241) | def hornet_tiny_gf(pretrained=False,in_22k=False, **kwargs):
  function hornet_small_7x7 (line 255) | def hornet_small_7x7(pretrained=False,in_22k=False, **kwargs):
  function hornet_small_gf (line 269) | def hornet_small_gf(pretrained=False,in_22k=False, **kwargs):
  function hornet_base_7x7 (line 283) | def hornet_base_7x7(pretrained=False,in_22k=False, **kwargs):
  function hornet_base_gf (line 297) | def hornet_base_gf(pretrained=False,in_22k=False, **kwargs):
  function hornet_base_gf_img384 (line 311) | def hornet_base_gf_img384(pretrained=False,in_22k=False, **kwargs):
  function hornet_large_7x7 (line 325) | def hornet_large_7x7(pretrained=False,in_22k=False, **kwargs):
  function hornet_large_gf (line 339) | def hornet_large_gf(pretrained=False,in_22k=False, **kwargs):
  function hornet_large_gf_img384 (line 353) | def hornet_large_gf_img384(pretrained=False,in_22k=False, **kwargs):

FILE: model/conv/Involution.py
  class Involution (line 9) | class Involution(nn.Module):
    method __init__ (line 10) | def __init__(self, kernel_size, in_channel=4, stride=1, group=1,ratio=4):
    method forward (line 34) | def forward(self, inputs):

FILE: model/conv/MBConv.py
  class SwishImplementation (line 8) | class SwishImplementation(torch.autograd.Function):
    method forward (line 10) | def forward(ctx, i):
    method backward (line 16) | def backward(ctx, grad_output):
  class MemoryEfficientSwish (line 21) | class MemoryEfficientSwish(nn.Module):
    method forward (line 22) | def forward(self, x):
  function drop_connect (line 26) | def drop_connect(inputs, p, training):
  function get_same_padding_conv2d (line 38) | def get_same_padding_conv2d(image_size=None):
  function get_width_and_height_from_size (line 41) | def get_width_and_height_from_size(x):
  function calculate_output_image_size (line 47) | def calculate_output_image_size(input_image_size, stride):
  class Conv2dStaticSamePadding (line 60) | class Conv2dStaticSamePadding(nn.Conv2d):
    method __init__ (line 63) | def __init__(self, in_channels, out_channels, kernel_size, image_size=...
    method forward (line 80) | def forward(self, x):
  class Identity (line 85) | class Identity(nn.Module):
    method __init__ (line 86) | def __init__(self, ):
    method forward (line 89) | def forward(self, input):
  class MBConvBlock (line 94) | class MBConvBlock(nn.Module):
    method __init__ (line 98) | def __init__(self, ksize, input_filters, output_filters, expand_ratio=...
    method forward (line 140) | def forward(self, inputs, drop_connect_rate=None):

FILE: model/mlp/g_mlp.py
  function exist (line 6) | def exist(x):
  class Residual (line 9) | class Residual(nn.Module):
    method __init__ (line 10) | def __init__(self,fn):
    method forward (line 14) | def forward(self,x):
  class SpatialGatingUnit (line 17) | class SpatialGatingUnit(nn.Module):
    method __init__ (line 18) | def __init__(self,dim,len_sen):
    method forward (line 26) | def forward(self,x):
  class gMLP (line 35) | class gMLP(nn.Module):
    method __init__ (line 36) | def __init__(self,num_tokens=None,len_sen=49,dim=512,d_ff=1024,num_lay...
    method forward (line 58) | def forward(self,x):

FILE: model/mlp/mlp_mixer.py
  class MlpBlock (line 4) | class MlpBlock(nn.Module):
    method __init__ (line 5) | def __init__(self,input_dim,mlp_dim=512) :
    method forward (line 11) | def forward(self,x):
  class MixerBlock (line 17) | class MixerBlock(nn.Module):
    method __init__ (line 18) | def __init__(self,tokens_mlp_dim=16,channels_mlp_dim=1024,tokens_hidde...
    method forward (line 24) | def forward(self,x):
  class MlpMixer (line 39) | class MlpMixer(nn.Module):
    method __init__ (line 40) | def __init__(self,num_classes,num_blocks,patch_size,tokens_hidden_dim,...
    method forward (line 54) | def forward(self,x):

FILE: model/mlp/repmlp.py
  function setup_seed (line 9) | def setup_seed(seed):
  class RepMLP (line 16) | class RepMLP(nn.Module):
    method __init__ (line 17) | def __init__(self,C,O,H,W,h,w,fc1_fc2_reduction=1,fc3_groups=8,repconv...
    method switch_to_deploy (line 69) | def switch_to_deploy(self):
    method get_equivalent_fc1_fc3_params (line 95) | def get_equivalent_fc1_fc3_params(self):
    method _conv_to_fc (line 148) | def _conv_to_fc(self,conv_kernel, conv_bias):
    method _fuse_bn (line 156) | def _fuse_bn(self, conv_or_fc, bn):
    method forward (line 166) | def forward(self,x) :

FILE: model/mlp/resmlp.py
  class Rearange (line 5) | class Rearange(nn.Module):
    method __init__ (line 6) | def __init__(self,image_size=14,patch_size=7) :
    method forward (line 15) | def forward(self,x):
  class Affine (line 24) | class Affine(nn.Module):
    method __init__ (line 25) | def __init__(self, channel):
    method forward (line 30) | def forward(self, x):
  class PreAffinePostLayerScale (line 33) | class PreAffinePostLayerScale(nn.Module): # https://arxiv.org/abs/2103.1...
    method __init__ (line 34) | def __init__(self, dim, depth, fn):
    method forward (line 48) | def forward(self, x):
  class ResMLP (line 52) | class ResMLP(nn.Module):
    method __init__ (line 53) | def __init__(self,dim=128,image_size=14,patch_size=7,expansion_factor=...
    method forward (line 74) | def forward(self, x) :

FILE: model/mlp/sMLP_block.py
  class sMLPBlock (line 10) | class sMLPBlock(nn.Module):
    method __init__ (line 11) | def __init__(self,h=224,w=224,c=3):
    method forward (line 17) | def forward(self,x):

FILE: model/mlp/vip-mlp.py
  function _cfg (line 8) | def _cfg(url='', **kwargs):
  class Mlp (line 24) | class Mlp(nn.Module):
    method __init__ (line 25) | def __init__(self, in_features, hidden_features=None, out_features=Non...
    method forward (line 34) | def forward(self, x):
  class WeightedPermuteMLP (line 42) | class WeightedPermuteMLP(nn.Module):
    method __init__ (line 43) | def __init__(self, dim, segment_dim=8, qkv_bias=False, qk_scale=None, ...
    method forward (line 58) | def forward(self, x):
  class PermutatorBlock (line 81) | class PermutatorBlock(nn.Module):
    method __init__ (line 83) | def __init__(self, dim, segment_dim, mlp_ratio=4., qkv_bias=False, qk_...
    method forward (line 97) | def forward(self, x):
  class PatchEmbed (line 102) | class PatchEmbed(nn.Module):
    method __init__ (line 105) | def __init__(self, img_size=224, patch_size=16, in_chans=3, embed_dim=...
    method forward (line 109) | def forward(self, x):
  class Downsample (line 114) | class Downsample(nn.Module):
    method __init__ (line 117) | def __init__(self, in_embed_dim, out_embed_dim, patch_size):
    method forward (line 121) | def forward(self, x):
  function basic_blocks (line 127) | def basic_blocks(dim, index, layers, segment_dim, mlp_ratio=3., qkv_bias...
  class VisionPermutator (line 140) | class VisionPermutator(nn.Module):
    method __init__ (line 143) | def __init__(self, layers, img_size=224, patch_size=4, in_chans=3, num...
    method _init_weights (line 174) | def _init_weights(self, m):
    method get_classifier (line 183) | def get_classifier(self):
    method reset_classifier (line 186) | def reset_classifier(self, num_classes, global_pool=''):
    method forward_embeddings (line 190) | def forward_embeddings(self, x):
    method forward_tokens (line 196) | def forward_tokens(self,x):
    method forward (line 203) | def forward(self, x):
  function vip_s14 (line 214) | def vip_s14(pretrained=False, **kwargs):
  function vip_s7 (line 226) | def vip_s7(pretrained=False, **kwargs):
  function vip_m7 (line 238) | def vip_m7(pretrained=False, **kwargs):
  function vip_l7 (line 252) | def vip_l7(pretrained=False, **kwargs):

FILE: model/rep/acnet.py
  function setup_seed (line 8) | def setup_seed(seed):
  function _conv_bn (line 15) | def _conv_bn(input_channel,output_channel,kernel_size=3,padding=1,stride...
  class ACNet (line 21) | class ACNet(nn.Module):
    method __init__ (line 22) | def __init__(self,input_channel,output_channel,kernel_size=3,groups=1,...
    method forward (line 43) | def forward(self, inputs):
    method _switch_to_deploy (line 52) | def _switch_to_deploy(self):
    method _pad_1x3_kernel (line 72) | def _pad_1x3_kernel(self,kernel):
    method _pad_3x1_kernel (line 79) | def _pad_3x1_kernel(self,kernel):
    method _get_equivalent_kernel_bias (line 87) | def _get_equivalent_kernel_bias(self):
    method _fuse_conv_bn (line 95) | def _fuse_conv_bn(self,branch):

FILE: model/rep/ddb.py
  function transI_conv_bn (line 5) | def transI_conv_bn(conv, bn):
  function transII_conv_branch (line 17) | def transII_conv_branch(conv1, conv2):
  function transIII_conv_sequential (line 23) | def transIII_conv_sequential(conv1, conv2):
  function transIV_conv_concat (line 28) | def transIV_conv_concat(conv1, conv2):
  function transV_avg (line 35) | def transV_avg(channel,kernel):
  function transVI_conv_scale (line 42) | def transVI_conv_scale(conv1, conv2, conv3):

FILE: model/rep/repvgg.py
  function setup_seed (line 8) | def setup_seed(seed):
  function _conv_bn (line 15) | def _conv_bn(input_channel,output_channel,kernel_size=3,padding=1,stride...
  class RepBlock (line 21) | class RepBlock(nn.Module):
    method __init__ (line 22) | def __init__(self,input_channel,output_channel,kernel_size=3,groups=1,...
    method forward (line 47) | def forward(self, inputs):
    method _switch_to_deploy (line 61) | def _switch_to_deploy(self):
    method _pad_1x1_kernel (line 80) | def _pad_1x1_kernel(self,kernel):
    method _get_equivalent_kernel_bias (line 88) | def _get_equivalent_kernel_bias(self):
    method _fuse_conv_bn (line 96) | def _fuse_conv_bn(self,branch):
Condensed preview of 106 files, each showing path, character count, and a content snippet (full structured content: 1,132K chars).
[
  {
    "path": "LICENSE",
    "chars": 1070,
    "preview": "MIT License\n\nCopyright (c) 2021 xmu-xiaoma666\n\nPermission is hereby granted, free of charge, to any person obtaining a c"
  },
  {
    "path": "README.md",
    "chars": 63325,
    "preview": "\n<img src=\"./FightingCVimg/LOGO.gif\" height=\"200\" width=\"400\"/>\n\n简体中文 | [English](./README_EN.md)\n\n# FightingCV Codebase, including "
  },
  {
    "path": "README_EN.md",
    "chars": 61535,
    "preview": "\n<img src=\"./FightingCVimg/LOGO.gif\" height=\"200\" width=\"400\"/>\n\nEnglish | [简体中文](./README.md)\n\n# FightingCV Codebase Fo"
  },
  {
    "path": "README_pip.md",
    "chars": 60443,
    "preview": "## pip Usage Guide\n\n### Installation\n\n Install directly with pip so it can be used in other tasks\n\n  ```shell\n  pip install fightingcv-attention\n  ```\n\n### Demo\n\n#### Using pip "
  },
  {
    "path": "main.py",
    "chars": 272,
    "preview": "from model.attention.MobileViTv2Attention import *\nimport torch\nfrom torch import nn\nfrom torch.nn import functional as "
  },
  {
    "path": "model/.vscode/settings.json",
    "chars": 55,
    "preview": "{\n    \"python.pythonPath\": \"D:\\\\Anaconda\\\\python.exe\"\n}"
  },
  {
    "path": "model/__init__.py",
    "chars": 77,
    "preview": "\ndef test():\n    print (\"hello world\")\n\nif __name__ == '__main__':\n    test()"
  },
  {
    "path": "model/analysis/Attention.md",
    "chars": 13687,
    "preview": "## Content\n  \n- [1. External Attention](#1-external-attention)\n\n- [2. Self Attention](#2-self-attention)\n\n- [3. Squeeze-"
  },
  {
    "path": "model/analysis/注意力机制.md",
    "chars": 8283,
    "preview": "## Contents\n  \n- [1. External Attention](#1-external-attention)\n\n- [2. Self Attention](#2-self-attention)\n\n- [3. Squeeze-and-E"
  },
  {
    "path": "model/analysis/重参数机制.md",
    "chars": 9369,
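    "preview": "[toc]\n\n## [Preface]\n\nI recently read Xiaohan Ding's series of papers on re-parameterization and found the idea really elegant. It puts all of the cost into training, so that at test time both the network parameters and the computation are reduced. Some analyses of these papers are already available online; in order to let "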
  },
  {
    "path": "model/attention/A2Atttention.py",
    "chars": 2081,
    "preview": "import numpy as np\nimport torch\nfrom torch import nn\nfrom torch.nn import init\nfrom torch.nn import functional as F\n\n\n\nc"
  },
  {
    "path": "model/attention/ACmixAttention.py",
    "chars": 4717,
    "preview": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\ndef position(H, W, is_cuda=True):\n    if is_cuda:\n  "
  },
  {
    "path": "model/attention/AFT.py",
    "chars": 1880,
    "preview": "import numpy as np\nimport torch\nfrom torch import nn\nfrom torch.nn import init\n\n\n\nclass AFT_FULL(nn.Module):\n\n    def __"
  },
  {
    "path": "model/attention/Axial_attention.py",
    "chars": 12360,
    "preview": "import torch\nfrom torch import nn\nfrom operator import itemgetter\n# from axial_attention.reversible import ReversibleSeq"
  },
  {
    "path": "model/attention/BAM.py",
    "chars": 3211,
    "preview": "import numpy as np\nimport torch\nfrom torch import nn\nfrom torch.nn import init\n\nclass Flatten(nn.Module):\n    def forwar"
  },
  {
    "path": "model/attention/CBAM.py",
    "chars": 2431,
    "preview": "import numpy as np\nimport torch\nfrom torch import nn\nfrom torch.nn import init\n\n\n\nclass ChannelAttention(nn.Module):\n   "
  },
  {
    "path": "model/attention/CoAtNet.py",
    "chars": 2714,
    "preview": "from torch import nn, sqrt\nimport torch\nimport sys\nfrom math import sqrt\nsys.path.append('.')\nfrom model.conv.MBConv imp"
  },
  {
    "path": "model/attention/CoTAttention.py",
    "chars": 1599,
    "preview": "import numpy as np\nimport torch\nfrom torch import flatten, nn\nfrom torch.nn import init\nfrom torch.nn.modules.activation"
  },
  {
    "path": "model/attention/CoordAttention.py",
    "chars": 1600,
    "preview": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\nclass h_sigmoid(nn.Module):\n    def __init__(self, i"
  },
  {
    "path": "model/attention/CrissCrossAttention.py",
    "chars": 2669,
    "preview": "'''\nThis code is borrowed from Serge-weihao/CCNet-Pure-Pytorch\n'''\n\nimport torch\nimport torch.nn as nn\nimport torch.nn.f"
  },
  {
    "path": "model/attention/Crossformer.py",
    "chars": 25891,
    "preview": "import torch\nimport torch.nn as nn\nimport torch.utils.checkpoint as checkpoint\nfrom timm.models.layers import DropPath, "
  },
  {
    "path": "model/attention/DANet.py",
    "chars": 1908,
    "preview": "import numpy as np\nimport torch\nfrom torch import nn\nfrom torch.nn import init\nfrom model.attention.SelfAttention import"
  },
  {
    "path": "model/attention/DAT.py",
    "chars": 23157,
    "preview": "# --------------------------------------------------------\n# Swin Transformer\n# Copyright (c) 2021 Microsoft\n# Licensed "
  },
  {
    "path": "model/attention/ECAAttention.py",
    "chars": 1356,
    "preview": "import numpy as np\nimport torch\nfrom torch import nn\nfrom torch.nn import init\nfrom collections import OrderedDict\n\n\n\ncl"
  },
  {
    "path": "model/attention/EMSA.py",
    "chars": 3665,
    "preview": "import numpy as np\nimport torch\nfrom torch import nn\nfrom torch.nn import init\n\n\n\nclass EMSA(nn.Module):\n\n    def __init"
  },
  {
    "path": "model/attention/ExternalAttention.py",
    "chars": 1299,
    "preview": "import numpy as np\nimport torch\nfrom torch import nn\nfrom torch.nn import init\n\n\n\nclass ExternalAttention(nn.Module):\n\n "
  },
  {
    "path": "model/attention/HaloAttention.py",
    "chars": 5193,
    "preview": "import torch\nfrom torch import nn, einsum\nimport torch.nn.functional as F\n\nfrom einops import rearrange, repeat\n\n# relat"
  },
  {
    "path": "model/attention/MOATransformer.py",
    "chars": 30270,
    "preview": "\n# --------------------------------------------------------\n# Adopted from Swin Transformer\n# Modified by Krushi Patel\n#"
  },
  {
    "path": "model/attention/MUSEAttention.py",
    "chars": 3574,
    "preview": "import numpy as np\nimport torch\nfrom torch import nn\nfrom torch.nn import init\n\n\n\nclass Depth_Pointwise_Conv1d(nn.Module"
  },
  {
    "path": "model/attention/MobileViTAttention.py",
    "chars": 3388,
    "preview": "from torch import nn\nimport torch\nfrom einops import rearrange\n\n\nclass PreNorm(nn.Module):\n    def __init__(self,dim,fn)"
  },
  {
    "path": "model/attention/MobileViTv2Attention.py",
    "chars": 1943,
    "preview": "import numpy as np\nimport torch\nfrom torch import nn\nfrom torch.nn import init\n\n\n\nclass MobileViTv2Attention(nn.Module):"
  },
  {
    "path": "model/attention/OutlookAttention.py",
    "chars": 2183,
    "preview": "import numpy as np\nimport torch\nfrom torch import nn\nfrom torch.nn import init\nimport math\nfrom torch.nn import function"
  },
  {
    "path": "model/attention/PSA.py",
    "chars": 2231,
    "preview": "import numpy as np\nimport torch\nfrom torch import nn\nfrom torch.nn import init\n\n\n\nclass PSA(nn.Module):\n\n    def __init_"
  },
  {
    "path": "model/attention/ParNetAttention.py",
    "chars": 997,
    "preview": "import numpy as np\nimport torch\nfrom torch import nn\nfrom torch.nn import init\n\n\n\nclass ParNetAttention(nn.Module):\n\n   "
  },
  {
    "path": "model/attention/PolarizedSelfAttention.py",
    "chars": 3989,
    "preview": "import numpy as np\nimport torch\nfrom torch import nn\nfrom torch.nn import init\n\n\n\nclass ParallelPolarizedSelfAttention(n"
  },
  {
    "path": "model/attention/ResidualAttention.py",
    "chars": 786,
    "preview": "import numpy as np\nimport torch\nfrom torch import nn\nfrom torch.nn import init\n\n\n\nclass ResidualAttention(nn.Module):\n\n "
  },
  {
    "path": "model/attention/S2Attention.py",
    "chars": 2137,
    "preview": "import numpy as np\nimport torch\nfrom torch import nn\nfrom torch.nn import init\n\n\ndef spatial_shift1(x):\n    b,w,h,c = x."
  },
  {
    "path": "model/attention/SEAttention.py",
    "chars": 1367,
    "preview": "import numpy as np\nimport torch\nfrom torch import nn\nfrom torch.nn import init\n\n\n\nclass SEAttention(nn.Module):\n\n    def"
  },
  {
    "path": "model/attention/SGE.py",
    "chars": 1738,
    "preview": "import numpy as np\nimport torch\nfrom torch import nn\nfrom torch.nn import init\n\n\n\nclass SpatialGroupEnhance(nn.Module):\n"
  },
  {
    "path": "model/attention/SKAttention.py",
    "chars": 1803,
    "preview": "import numpy as np\nimport torch\nfrom torch import nn\nfrom torch.nn import init\nfrom collections import OrderedDict\n\n\n\ncl"
  },
  {
    "path": "model/attention/SelfAttention.py",
    "chars": 3004,
    "preview": "import numpy as np\nimport torch\nfrom torch import nn\nfrom torch.nn import init\n\n\n\nclass ScaledDotProductAttention(nn.Mod"
  },
  {
    "path": "model/attention/ShuffleAttention.py",
    "chars": 2560,
    "preview": "import numpy as np\nimport torch\nfrom torch import nn\nfrom torch.nn import init\nfrom torch.nn.parameter import Parameter\n"
  },
  {
    "path": "model/attention/SimAM.py",
    "chars": 882,
    "preview": "import torch\nimport torch.nn as nn\n\n\nclass SimAM(torch.nn.Module):\n    def __init__(self, channels = None, e_lambda = 1e"
  },
  {
    "path": "model/attention/SimplifiedSelfAttention.py",
    "chars": 2853,
    "preview": "import numpy as np\nimport torch\nfrom torch import nn\nfrom torch.nn import init\n\n\n\nclass SimplifiedScaledDotProductAttent"
  },
  {
    "path": "model/attention/TripletAttention.py",
    "chars": 2292,
    "preview": "import torch\nimport torch.nn as nn\n\nclass BasicConv(nn.Module):\n    def __init__(self, in_planes, out_planes, kernel_siz"
  },
  {
    "path": "model/attention/UFOAttention.py",
    "chars": 2510,
    "preview": "import numpy as np\nimport torch\nfrom torch import nn\nfrom torch.functional import norm\nfrom torch.nn import init\n\n\ndef X"
  },
  {
    "path": "model/attention/ViP.py",
    "chars": 1883,
    "preview": "import torch\nfrom torch import nn\n\n\nclass MLP(nn.Module):\n    def __init__(self,in_features,hidden_features,out_features"
  },
  {
    "path": "model/attention/gfnet.py",
    "chars": 4025,
    "preview": "import torch\nfrom torch import nn\nimport math\nfrom timm.models.layers import DropPath, to_2tuple\n\nclass PatchEmbed(nn.Mo"
  },
  {
    "path": "model/backbone/CMT.py",
    "chars": 19677,
    "preview": "## Author: Jianyuan Guo (jyguo@pku.edu.cn)\n\nimport math\nimport logging\nfrom functools import partial\nfrom collections im"
  },
  {
    "path": "model/backbone/CPVT.py",
    "chars": 19249,
    "preview": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nfrom functools import partial\n\nfrom timm.models.layer"
  },
  {
    "path": "model/backbone/CaiT.py",
    "chars": 18233,
    "preview": "# Copyright (c) 2015-present, Facebook, Inc.\n# All rights reserved.\n\nimport torch\nimport torch.nn as nn\nfrom functools i"
  },
  {
    "path": "model/backbone/CeiT.py",
    "chars": 17749,
    "preview": "import math\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nfrom functools import partial\n\nfrom timm."
  },
  {
    "path": "model/backbone/CoaT.py",
    "chars": 27470,
    "preview": "\"\"\" \nCoaT architecture.\n\nModified from timm/models/vision_transformer.py\n\"\"\"\n\nimport torch\nimport torch.nn as nn\nimport "
  },
  {
    "path": "model/backbone/ConTNet.py",
    "chars": 18774,
    "preview": "import torch.nn as nn\nimport torch.nn.functional as F\nimport torch\n\nfrom einops.layers.torch import Rearrange\nfrom einop"
  },
  {
    "path": "model/backbone/ConViT.py",
    "chars": 17440,
    "preview": "# Copyright (c) 2015-present, Facebook, Inc.\n# All rights reserved.\n#\n# This source code is licensed under the CC-by-NC "
  },
  {
    "path": "model/backbone/Container.py",
    "chars": 19206,
    "preview": "import torch\nimport torch.nn as nn\nfrom functools import partial\nimport math\nfrom timm.models.vision_transformer import "
  },
  {
    "path": "model/backbone/ConvMixer.py",
    "chars": 1127,
    "preview": "import torch.nn as nn\nfrom torch.nn.modules.activation import GELU\nimport torch\nfrom torch.nn.modules.pooling import Ada"
  },
  {
    "path": "model/backbone/CrossViT.py",
    "chars": 22312,
    "preview": "# Copyright IBM All Rights Reserved.\n# SPDX-License-Identifier: Apache-2.0\n\n\n\"\"\"\nModifed from Timm. https://github.com/r"
  },
  {
    "path": "model/backbone/DViT.py",
    "chars": 19063,
    "preview": "\"\"\" \nCode for DeepViT. The implementation has heavy reference to timm.\n\"\"\"\nimport torch\nimport torch.nn as nn\nfrom funct"
  },
  {
    "path": "model/backbone/DeiT.py",
    "chars": 7300,
    "preview": "# Copyright (c) 2015-present, Facebook, Inc.\n# All rights reserved.\nimport torch\nimport torch.nn as nn\nimport numpy as n"
  },
  {
    "path": "model/backbone/EfficientFormer.py",
    "chars": 17785,
    "preview": "\"\"\"\nEfficientFormer\n\"\"\"\nimport os\nimport copy\nimport torch\nimport torch.nn as nn\n\nfrom typing import Dict\nimport itertoo"
  },
  {
    "path": "model/backbone/HATNet.py",
    "chars": 7879,
    "preview": "from pyexpat import model\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\nfrom timm.models.layers im"
  },
  {
    "path": "model/backbone/LeViT.py",
    "chars": 19077,
    "preview": "# Copyright (c) 2015-present, Facebook, Inc.\n# All rights reserved.\n\n# Modified from\n# https://github.com/rwightman/pyto"
  },
  {
    "path": "model/backbone/MobileNetV3.py",
    "chars": 29193,
    "preview": "\"\"\" MobileNet V3\nA PyTorch impl of MobileNet-V3, compatible with TF weights from official impl.\nPaper: Searching for Mob"
  },
  {
    "path": "model/backbone/MobileViT.py",
    "chars": 7734,
    "preview": "from torch import nn\nimport torch\nfrom torch.nn.modules import conv\nfrom torch.nn.modules.conv import Conv2d\nfrom einops"
  },
  {
    "path": "model/backbone/PIT.py",
    "chars": 10993,
    "preview": "# PiT\n# Copyright 2021-present NAVER Corp.\n# Apache License v2.0\n\nimport torch\nfrom einops import rearrange\nfrom torch i"
  },
  {
    "path": "model/backbone/PVT.py",
    "chars": 11996,
    "preview": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nfrom functools import partial\n\nfrom timm.models.layer"
  },
  {
    "path": "model/backbone/PatchConvnet.py",
    "chars": 17344,
    "preview": "# Copyright (c) 2015-present, Facebook, Inc.\n# All rights reserved.\n\n\nfrom functools import partial\nfrom typing import O"
  },
  {
    "path": "model/backbone/ShuffleTransformer.py",
    "chars": 12729,
    "preview": "import torch\nfrom torch import nn, einsum\nfrom einops import rearrange, repeat\nimport torch.utils.checkpoint as checkpoi"
  },
  {
    "path": "model/backbone/TnT.py",
    "chars": 14841,
    "preview": "# 2021.06.15-Changed for implementation of TNT model\n#            Huawei Technologies Co., Ltd. <foss@huawei.com>\nimport"
  },
  {
    "path": "model/backbone/VOLO.py",
    "chars": 29687,
    "preview": "# Copyright 2021 Sea Limited.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this"
  },
  {
    "path": "model/backbone/convnextv2.py",
    "chars": 6982,
    "preview": "# Copyright (c) Meta Platforms, Inc. and affiliates.\n\n# All rights reserved.\n\n# This source code is licensed under the l"
  },
  {
    "path": "model/backbone/resnet.py",
    "chars": 3876,
    "preview": "import torch\nfrom torch import nn\n\n\n\"\"\"\n    # in_channel:输入block之前的通道数\n    # channel:在block中间处理的时候的通道数(这个值是输出维度的1/4)\n   "
  },
  {
    "path": "model/backbone/resnext.py",
    "chars": 3911,
    "preview": "import torch\nfrom torch import nn\n\n\n\"\"\"\n    # in_channel:输入block之前的通道数\n    # channel:在block中间处理的时候的通道数(这个值是输出维度的1/4)\n   "
  },
  {
    "path": "model/backbone/swin_transformer.py",
    "chars": 29374,
    "preview": "\"\"\" Swin Transformer\nA PyTorch impl of : `Swin Transformer: Hierarchical Vision Transformer using Shifted Windows`\n    -"
  },
  {
    "path": "model/backbone/swin_transformer_v2.py",
    "chars": 31568,
    "preview": "\"\"\" Swin Transformer V2\nA PyTorch impl of : `Swin Transformer V2: Scaling Up Capacity and Resolution`\n    - https://arxi"
  },
  {
    "path": "model/backbone/swin_transformer_v2_cr.py",
    "chars": 41510,
    "preview": "\"\"\" Swin Transformer V2\nA PyTorch impl of : `Swin Transformer V2: Scaling Up Capacity and Resolution`\n    - https://arxi"
  },
  {
    "path": "model/conv/CondConv.py",
    "chars": 3097,
    "preview": "import torch\nfrom torch import nn\nfrom torch.nn import functional as F\n\nclass Attention(nn.Module):\n    def __init__(sel"
  },
  {
    "path": "model/conv/DepthwiseSeparableConvolution.py",
    "chars": 899,
    "preview": "import torch\nfrom torch import nn\n\nclass DepthwiseSeparableConvolution(nn.Module):\n    def __init__(self,in_ch,out_ch,ke"
  },
  {
    "path": "model/conv/DynamicConv.py",
    "chars": 3498,
    "preview": "import torch\nfrom torch import nn\nfrom torch.nn import functional as F\n\nclass Attention(nn.Module):\n    def __init__(sel"
  },
  {
    "path": "model/conv/HorNet.py",
    "chars": 13213,
    "preview": "from functools import partial\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nfrom timm.models.layers"
  },
  {
    "path": "model/conv/Involution.py",
    "chars": 1907,
    "preview": "import math\nfrom functools import partial\n\nimport torch\nfrom torch import nn, select\nfrom torch.nn import functional as "
  },
  {
    "path": "model/conv/MBConv.py",
    "chars": 6757,
    "preview": "import math\nfrom functools import partial\n\nimport torch\nfrom torch import nn\nfrom torch.nn import functional as F\n\nclass"
  },
  {
    "path": "model/fighingcv.egg-info/PKG-INFO",
    "chars": 61077,
    "preview": "Metadata-Version: 2.1\nName: fighingcv\nVersion: 1.0.0\nSummary: Client library to download and publish models, datasets an"
  },
  {
    "path": "model/fighingcv.egg-info/SOURCES.txt",
    "chars": 262,
    "preview": "LICENSE\nREADME.md\nsetup.py\nmodel/fighingcv.egg-info/PKG-INFO\nmodel/fighingcv.egg-info/SOURCES.txt\nmodel/fighingcv.egg-in"
  },
  {
    "path": "model/fighingcv.egg-info/dependency_links.txt",
    "chars": 1,
    "preview": "\n"
  },
  {
    "path": "model/fighingcv.egg-info/entry_points.txt",
    "chars": 83,
    "preview": "[console_scripts]\nhuggingface-cli = huggingface_hub.commands.huggingface_cli:main\n\n"
  },
  {
    "path": "model/fighingcv.egg-info/requires.txt",
    "chars": 532,
    "preview": "filelock\nrequests\ntqdm\npyyaml>=5.1\ntyping-extensions>=3.7.4.3\npackaging>=20.9\n\n[:python_version < \"3.8\"]\nimportlib_metad"
  },
  {
    "path": "model/fighingcv.egg-info/top_level.txt",
    "chars": 1,
    "preview": "\n"
  },
  {
    "path": "model/huggingface_hub.egg-info/PKG-INFO",
    "chars": 61083,
    "preview": "Metadata-Version: 2.1\nName: huggingface-hub\nVersion: 1.0.0\nSummary: Client library to download and publish models, datas"
  },
  {
    "path": "model/huggingface_hub.egg-info/SOURCES.txt",
    "chars": 298,
    "preview": "LICENSE\nREADME.md\nsetup.py\nmodel/huggingface_hub.egg-info/PKG-INFO\nmodel/huggingface_hub.egg-info/SOURCES.txt\nmodel/hugg"
  },
  {
    "path": "model/huggingface_hub.egg-info/dependency_links.txt",
    "chars": 1,
    "preview": "\n"
  },
  {
    "path": "model/huggingface_hub.egg-info/entry_points.txt",
    "chars": 83,
    "preview": "[console_scripts]\nhuggingface-cli = huggingface_hub.commands.huggingface_cli:main\n\n"
  },
  {
    "path": "model/huggingface_hub.egg-info/requires.txt",
    "chars": 532,
    "preview": "filelock\nrequests\ntqdm\npyyaml>=5.1\ntyping-extensions>=3.7.4.3\npackaging>=20.9\n\n[:python_version < \"3.8\"]\nimportlib_metad"
  },
  {
    "path": "model/huggingface_hub.egg-info/top_level.txt",
    "chars": 1,
    "preview": "\n"
  },
  {
    "path": "model/mlp/g_mlp.py",
    "chars": 2022,
    "preview": "from collections import OrderedDict\nimport torch\nfrom torch import nn\n\n\ndef exist(x):\n    return x is not None\n\nclass Re"
  },
  {
    "path": "model/mlp/mlp_mixer.py",
    "chars": 2836,
    "preview": "import torch\nfrom torch import nn\n\nclass MlpBlock(nn.Module):\n    def __init__(self,input_dim,mlp_dim=512) :\n        sup"
  },
  {
    "path": "model/mlp/repmlp.py",
    "chars": 9244,
    "preview": "import torch\nfrom torch import nn\nfrom collections import OrderedDict\nfrom torch.nn import functional as F\nimport numpy "
  },
  {
    "path": "model/mlp/resmlp.py",
    "chars": 2728,
    "preview": "import torch\nfrom torch import nn\n\n\nclass Rearange(nn.Module):\n    def __init__(self,image_size=14,patch_size=7) :\n     "
  },
  {
    "path": "model/mlp/sMLP_block.py",
    "chars": 641,
    "preview": "import torch\nfrom torch import nn\n\n\n\n\n\n\n\nclass sMLPBlock(nn.Module):\n    def __init__(self,h=224,w=224,c=3):\n        sup"
  },
  {
    "path": "model/mlp/vip-mlp.py",
    "chars": 10038,
    "preview": "import torch\nimport torch.nn as nn\n\nfrom timm.data import IMAGENET_DEFAULT_MEAN, IMAGENET_DEFAULT_STD\nfrom timm.models.l"
  },
  {
    "path": "model/rep/acnet.py",
    "chars": 4333,
    "preview": "import torch\nfrom torch import mean, nn\nfrom collections import OrderedDict\nfrom torch.nn import functional as F\nimport "
  },
  {
    "path": "model/rep/ddb.py",
    "chars": 2008,
    "preview": "from torch import conv2d, nn\nimport torch\nfrom torch.nn import functional as F\n\ndef transI_conv_bn(conv, bn):\n\n    std ="
  },
  {
    "path": "model/rep/mobileone.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "model/rep/repvgg.py",
    "chars": 5352,
    "preview": "import torch\nfrom torch import mean, nn\nfrom collections import OrderedDict\nfrom torch.nn import functional as F\nimport "
  },
  {
    "path": "setup.py",
    "chars": 1220,
    "preview": "from setuptools import find_packages, setup\n\n\n\nsetup(\n    name=\"fighingcv\",\n    version=\"1.0.0\",\n    author=\"xmu-xiaoma6"
  }
]
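
Each entry in the index above has the same three fields: `path`, `chars`, and a truncated `preview`. As a minimal sketch, assuming the array has been saved as a JSON string, the snippet below totals character counts per directory and lists empty files. The names `SAMPLE_INDEX`, `chars_per_directory`, and `empty_files` are hypothetical and not part of the repository or the extraction tool, and the sample previews are abbreviated.

```python
import json
from collections import defaultdict

# Hypothetical sample: three entries taken from the index above,
# with the preview strings shortened for readability.
SAMPLE_INDEX = """
[
  {"path": "model/attention/BAM.py", "chars": 3211, "preview": "import numpy as np"},
  {"path": "model/attention/CBAM.py", "chars": 2431, "preview": "import numpy as np"},
  {"path": "model/rep/mobileone.py", "chars": 0, "preview": ""}
]
"""


def chars_per_directory(index_json: str) -> dict:
    """Sum the `chars` field of every entry, grouped by parent directory."""
    totals = defaultdict(int)
    for entry in json.loads(index_json):
        # Everything before the last "/" is the directory; "." for top-level files.
        directory = entry["path"].rpartition("/")[0] or "."
        totals[directory] += entry["chars"]
    return dict(totals)


def empty_files(index_json: str) -> list:
    """Return the paths of entries whose `chars` count is zero."""
    return [e["path"] for e in json.loads(index_json) if e["chars"] == 0]


if __name__ == "__main__":
    print(chars_per_directory(SAMPLE_INDEX))
    print(empty_files(SAMPLE_INDEX))
```

Run against the full index, the second helper would flag placeholder files such as `model/rep/mobileone.py`, which is listed with 0 characters.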

About this extraction

This page contains the full source code of the xmu-xiaoma666/External-Attention-pytorch GitHub repository, extracted and formatted as plain text. The extraction includes 106 files (1.0 MB), approximately 316.3k tokens, and a symbol index with 1441 extracted functions, classes, methods, constants, and types.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.
