[
  {
    "path": "CVPR2021.md",
    "content": "# CVPR2021-Papers-with-Code-Demo\n\n :star_and_crescent:**论文下载：https://pan.baidu.com/share/init?surl=gjfUQlPf73MCk4vM8VbzoA**\n\n**密码：aicv**\n\n:star2: [CVPR 2021](http://cvpr2021.thecvf.com/) 持续更新最新论文/paper和相应的开源代码/code！\n\n:car: CVPR 2021 收录列表：http://cvpr2021.thecvf.com/sites/default/files/2021-03/accepted_paper_ids.txt\n\nhttp://cvpr2021.thecvf.com/node/142\n\n:car: 官网链接：http://cvpr2021.thecvf.com/\n\n> :timer_clock: 时间：2021年6月19日-6月25日\n> :watch: 论文/paper接收公布时间：2021年2月28日\n\n> :hand: ​注：欢迎各位大佬提交issue，分享CVPR 2021论文/paper和开源项目！共同完善这个项目\n>\n> :airplane: 为了方便下载，已将论文/paper存储在文件夹中 :heavy_check_mark: 表示论文/paper[已下载 / Paper Download](https://github.com/DWCTOD/CVPR2021-Papers-with-Code-Demo/tree/main/CVPR2021-paper-download)\n\n### **:fireworks: 欢迎进群** | Welcome\n\nCVPR 2021 论文/paper交流群已成立！已经收录的同学，可以添加微信：**nvshenj125**，请备注：**CVPR+姓名+学校/公司名称**！一定要根据格式申请，可以拉你进群。\n\n<a name=\"Contents\"></a>\n\n### :hammer: **目录 |Table of Contents（点击直接跳转）**\n- [Backbone](#Backbone)\n\n- [NAS](#NAS)\n\n- [图神经网络 / Graph Neural Networks](#GNN)\n\n- [胶囊网络 / Capsule Network](#CapsuleNetwork)\n\n- [图像分类 / Image Classification](#ImageClassification)\n\n- [目标检测/Object Detection](#ObjectDetection)\n\n- [目标跟踪/Object Tracking](#ObjectTracking)\n\n- [语义分割/Segmentation](#Segmentation)\n\n- [弱监督语义分割/Weakly Supervised Semantic Segmentation](#WSSS)\n\n- [医学图像分割](#MedicalImageSegmentation)\n\n- [视频目标分割/Video Object Segmentation](#VideoObjectSegmentation)\n\n- [交互式视频目标分割/Interactive Video Object Segmentation](#InteractiveVideoObjectSegmentation)\n\n- [Visual Transformer](#VisualTransformer)\n\n- [人脸识别/Face Recognition](#FaceRecognition)\n\n- [人脸检测/Face Detection](#FaceDetection)\n\n- [人脸活体检测/Face Anti-Spoofing](#FaceAnti-Spoofing)\n\n- [人脸年龄估计/Age Estimation](#AgeEstimation)\n\n- [人脸表情识别/Facial Expression Recognition](#FacialExpressionRecognition)\n\n- [人脸属性识别/Facial Attribute Recognition](#FacialAttributeRecognition)\n\n- [人体姿态估计/Human Pose Estimation](#HumanPoseEstimation)\n\n- [6D位姿估计 /6D 
Pose Estimation](#6DPoseEstimation)\n\n- [手势姿态估计（重建）/Hand Pose Estimation( Hand Mesh Recovery)](#HandPoseEstimation)\n\n- [行人重识别/Person Re-identification](#PersonRe-identification)\n\n- [行人搜索/Person Search](#PersonSearch)\n\n- [人群计数 / Crowd Counting](#CrowdCounting)\n\n- [GAN](#GAN)\n\n- [彩妆迁移 / Color-Pattern Makeup Transfer](#CPM)\n\n- [字体生成 / Font Generation](#FontGeneration)\n\n- [场景文本检测、识别/Scene Text Detection/Recognition](#OCR)\n\n- [图像、视频检索 / Image Retrieval/Video retrieval](#Retrieval)\n\n- [Image Animation](#ImageAnimation)\n\n- [抠图/Image Matting](#ImageMatting)\n\n- [超分辨率/Super Resolution](#SuperResolution)\n\n- [图像复原/Image Restoration](#ImageRestoration)\n\n- [图像补全/Image Inpainting](#ImageInpainting)\n\n- [图像编辑/Image Editing](#ImageEditing)\n\n- [图像匹配/Image Matching](#ImageMatching)\n\n- [图像融合/Image Blending](#ImageBlending)\n\n- [图像去雾/Image Dehazing](#ImageDehazing)\n\n- [反光去除/Reflection Removal](#ReflectionRemoval)\n\n- [车道线检测/Lane Detection](#LaneDetection)\n\n- [流体重建/Fluid Reconstruction](#FluidReconstruction)\n\n- [视频插帧/Frame Interpolation](#FrameInterpolation)\n\n- [ 3D点云/3D point cloud]( #3DPointCloud)\n\n- [标签噪声 / Label-Noise](#Label-Noise)\n\n- [对抗样本 Adversarial Examples](#AdversarialExamples)\n\n  \n\n【目录 可直接跳转】\n\n### :key: **关键词** | Keywords\n【TODO】\n\n<a name=\"Backbone\"></a>\n\n## Backbone\n\n:heavy_check_mark:  **Coordinate Attention for Efficient Mobile Network Design** \n\n解读：[ CA：新注意力机制](https://zhuanlan.zhihu.com/p/353764183)\n\n论文/paper：https://arxiv.org/abs/2103.02907\n\n代码/code： https://github.com/Andrew-Qibin/CoordAttention\n\n:heavy_check_mark:**Diverse Branch Block: Building a Convolution as an Inception-like Unit**\n\n解读：[ACNet续作——CVPR2021:Diverse Branch Block](https://zhuanlan.zhihu.com/p/360939086)\n\n论文/paper： https://arxiv.org/abs/2103.13425\n\n代码/code： https://github.com/DingXiaoH/DiverseBranchBlock\n\n:heavy_check_mark:**Involution: Inverting the Inherence of Convolution for Visual 
Recognition**\n\n解读：[我卷我自己——CVPR2021:Involution](https://zhuanlan.zhihu.com/p/357408252)\n\n论文/paper：  https://arxiv.org/abs/2103.06255\n\n代码/code：  https://github.com/d-li14/involution\n\n :heavy_check_mark: **Inception Convolution with Efficient Dilation Search** \n\n解读：[IC-Conv使用高效空洞搜索Inception卷积](https://zhuanlan.zhihu.com/p/340506749)\n\n论文/paper：https://arxiv.org/pdf/2012.13587.pdf \n\n代码/code：https://github.com/yifan123/IC-Conv\n\n:heavy_check_mark: **RepVGG: Making VGG-style ConvNets Great Again**\n\n解读： [RepVGG：极简架构，SOTA性能，让VGG式模型再次伟大](https://zhuanlan.zhihu.com/p/308159909)| [RepVGG-重参数化Backbone](https://blog.csdn.net/weixin_42096202/article/details/112546234)\n\n论文/paper： https://arxiv.org/abs/2101.03697\n\n代码/code： https://github.com/DingXiaoH/RepVGG\n\n:heavy_check_mark:**ReXNet: Diminishing Representational Bottleneck on Convolutional Neural Network**\n\n解读：[ReXNet: Diminishing Representational Bottleneck on Convolutional Neural Network 论文/paper学习](https://blog.csdn.net/calvinpaean/article/details/107120053)\n\n论文/paper： https://arxiv.org/abs/2007.00992\n\n代码/code： https://github.com/clovaai/rexnet\n\n:heavy_check_mark:**Scaling Local Self-Attention For Parameter Efficient Visual Backbones**\n\n论文/paper：https://arxiv.org/abs/2103.12731\n\n代码/code：https://github.com/lucidrains/halonet-pytorch | [作者主页](https://shlens.github.io/)\n\n[返回目录/back](#Contents)\n\n<a name=\"NAS\"></a> \n\n## NAS\n\n:heavy_check_mark:**AttentiveNAS: Improving Neural Architecture Search via Attentive**\n\n论文/paper： https://arxiv.org/abs/2011.09011\n\n代码/code：None\n\n:heavy_check_mark:**Combined Depth Space based Architecture Search For Person Re-identification**\n\n论文/paper： https://arxiv.org/abs/2104.04163\n\n代码/code：None\n\n:heavy_check_mark:**Contrastive Neural Architecture Search with Neural Architecture Comparators**\n\n论文/paper： https://arxiv.org/abs/2103.05471\n\n代码/code：https://github.com/chenyaofo/CTNAS\n\n:heavy_check_mark:**DiNTS: Differentiable Neural Network 
Topology Search for 3D Medical Image Segmentation** (CVPR 2021)\n\n论文/paper：https://arxiv.org/abs/2103.15954\n\n代码/code：None\n\n:heavy_check_mark:**HourNAS: Extremely Fast Neural Architecture Search Through an Hourglass Lens**\n\n论文/paper：https://arxiv.org/abs/2005.14446\n\n代码/code：None\n\n**HR-NAS: Searching Efficient High-Resolution Neural Architectures with Transformers** (CVPR2021)\n\n论文/paper：None\n\n代码/code：https://github.com/dingmyu/HR-NAS\n\n:heavy_check_mark:**Inception Convolution with Efficient Dilation Search**\n\n论文/paper：https://arxiv.org/abs/2012.13587\n\n代码/code：None\n\n**Joint-DetNAS: Upgrade Your Detector with NAS, Pruning and Dynamic Distillation**\n\n论文/paper：None\n\n代码/code： None\n\n:heavy_check_mark:**Landmark Regularization: Ranking Guided Super-Net Training in Neural Architecture Search**\n\n论文/paper：https://arxiv.org/abs/2104.05309\n\n代码/code： https://github.com/kcyu2014/nas-landmarkreg\n\n:heavy_check_mark:**NetAdaptV2: Efficient Neural Architecture Search with Fast Super-Network Training and Architecture Optimization**\n\n论文/paper：https://arxiv.org/abs/2104.00031 | [主页/Homepage](http://web.mit.edu/netadapt/)\n\n代码/code： None\n\n:heavy_check_mark:**Neural Architecture Search with Random Labels**\n\n论文/paper：https://arxiv.org/abs/2101.11834\n\n代码/code：https://github.com/megvii-model/RLNAS\n\n:heavy_check_mark:**OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection**\n\n论文/paper：https://arxiv.org/abs/2103.04507\n\n代码/code：https://github.com/VDIGPKU/OPANAS\n\n:heavy_check_mark:**Prioritized Architecture Sampling with Monto-Carlo Tree Search**\n\n论文/paper：https://arxiv.org/abs/2103.11922\n\n代码/code：https://github.com/xiusu/NAS-Bench-Macro\n\n:heavy_check_mark:**ReNAS: Relativistic Evaluation of Neural Architecture Search**\n\n论文/paper：https://arxiv.org/abs/1910.01523\n\n代码/code：None\n\n:heavy_check_mark:**Searching by Generating: Flexible and Efficient One-Shot NAS with Architecture 
Generator**\n\n论文/paper：https://arxiv.org/abs/2103.07289\n\n代码/code：https://github.com/eric8607242/SGNAS\n\n:heavy_check_mark:**Towards Improving the Consistency, Efficiency, and Flexibility of Differentiable Neural Architecture Search**\n\n论文/paper：https://arxiv.org/abs/2101.11342\n\n代码/code：None\n\n[返回目录/back](#Contents)\n\n<a name=\"GNN\"></a> \n\n## 图神经网络 / Graph Neural Networks\n\n:heavy_check_mark:**A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts**\n\n论文/paper：https://arxiv.org/abs/2105.00290\n\n代码/code：None\n\n:heavy_check_mark:**Binary Graph Neural Networks**\n\n论文/paper：https://arxiv.org/abs/2012.15823\n\n代码/code：https://github.com/mbahri/binary_gnn\n\n:heavy_check_mark:**Quantifying Explainers of Graph Neural Networks in Computational Pathology**\n\n论文/paper：https://arxiv.org/pdf/2011.12646.pdf\n\n代码/code：https://github.com/histocartography/patho-quant-explainer\n\n:heavy_check_mark:**Sequential Graph Convolutional Network for Active Learning**\n\n论文/paper：https://arxiv.org/pdf/2006.10219.pdf\n\n代码/code：None\n\n[返回目录/back](#Contents)\n\n<a name=\"CapsuleNetwork\"></a> \n\n# 胶囊网络 / Capsule Network\n\n:heavy_check_mark:**Capsule Network is Not More Robust than Convolutional Network**\n\n论文/paper： https://arxiv.org/abs/2103.15459\n\n代码/code：None\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageClassification\"></a> \n\n# 图像分类 / Image Classification\n\n:heavy_check_mark:**Correlated Input-Dependent Label Noise in Large-Scale Image Classification** (CVPR 2021 Oral)\n\n论文/paper： https://arxiv.org/abs/2105.10305\n\n代码/code：https://github.com/google/uncertainty-baselines/tree/master/baselines/imagenet\n\n[返回目录/back](#Contents)\n\n<a name=\"ObjectDetection\"></a> \n\n## 目标检测/Object Detection\n\n**Adaptive Class Suppression Loss for Long-Tail Object Detection**\n\n论文/paper：https://arxiv.org/abs/2104.00885\n\n代码/code：https://github.com/CASIA-IVA-Lab/ACSL\n\n:heavy_check_mark:**Back-tracing Representative Points for 
Voting-based 3D Object Detection in Point Clouds**\n\n论文/paper： https://arxiv.org/abs/2104.06114\n\n代码/code：https://github.com/cheng052/BRNet\n\n:heavy_check_mark: **Categorical Depth Distribution Network for Monocular 3D Object Detection**\n\n论文/paper： https://arxiv.org/abs/2103.01100\n\n代码/code：https://github.com/TRAILab/CaDDN\n\n:heavy_check_mark:**Center-based 3D Object Detection and Tracking**\n\n论文/paper：https://arxiv.org/abs/2006.11275\n\n代码/code：https://github.com/tianweiy/CenterPoint\n\n**Data-Uncertainty Guided Multi-Phase Learning for Semi-Supervised Object Detection**\n\n论文/paper：https://arxiv.org/abs/2103.16368\n\n代码/code：None\n\n**DAP: Detection-Aware Pre-training with Weak Supervision**\n\n论文/paper：https://arxiv.org/abs/2103.16651\n\n代码/code：None\n\n:heavy_check_mark:**Depth from Camera Motion and Object Detection**\n\n论文/paper：https://arxiv.org/abs/2103.01468\n\n代码/code：https://github.com/griffbr/ODMD\n\n:heavy_check_mark:**Dense Label Encoding for Boundary Discontinuity Free Rotation Detection**\n\n论文/paper：https://arxiv.org/abs/2011.09670\n\n代码/code：https://github.com/yangxue0827/RotationDetection\n\n**Dense Relation Distillation with Context-aware Aggregation for Few-Shot Object Detection**\n\n论文/paper：https://arxiv.org/abs/2103.17115\n\n代码/code：https://github.com/hzhupku/DCNet\n\n:heavy_check_mark:**Distilling Object Detectors via Decoupled Features**\n\n论文/paper：https://arxiv.org/abs/2103.14475\n\n代码/code：https://github.com/ggjy/DeFeat.pytorch\n\n:heavy_check_mark: **Dogfight: Detecting Drones from Drone Videos**\n\n论文/paper：https://arxiv.org/pdf/2103.17242\n\n代码/code：None\n\n:heavy_check_mark:**Domain-Specific Suppression for Adaptive Object Detection**\n\n论文/paper：https://arxiv.org/abs/2105.03570\n\n代码/code：None\n\n:heavy_check_mark:**End-to-End Object Detection with Fully Convolutional Network**\n\n论文/paper： https://arxiv.org/abs/2012.03544\n\n代码/code： https://github.com/Megvii-BaseDetection/DeFCN\n\n:heavy_check_mark:**FSCE: Few-Shot Object 
Detection via Contrastive Proposal Encoding**\n\n论文/paper：https://arxiv.org/abs/2103.05950\n\n代码/code：https://github.com/MegviiDetection/FSCE\n\n:heavy_check_mark:**General Instance Distillation for Object Detection**\n\n论文/paper：https://arxiv.org/abs/2103.02340\n\n代码/code：None\n\n:heavy_check_mark:**Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection**\n\n解读：[大白话 Generalized Focal Loss V2](https://zhuanlan.zhihu.com/p/313684358)\n\n论文/paper：https://arxiv.org/abs/2011.12885\n\n代码/code：https://github.com/implus/GFocalV2\n\n:heavy_check_mark:**GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection**\n\n视频demo：https://www.youtube.com/watch?v=PWctKkyWrno\n\n论文/paper：https://arxiv.org/abs/2103.17202\n\n代码/code：https://github.com/abhi1kumar/groomed_nms\n\n:heavy_check_mark:**Group Collaborative Learning for Co-Salient Object Detection**\n\n论文/paper：https://arxiv.org/abs/2104.01108\n\n代码/code：https://github.com/fanq15/GCoNet\n\n:heavy_check_mark:**I^3Net: Implicit Instance-Invariant Network for Adapting One-Stage Object Detectors**\n\n论文/paper：https://arxiv.org/abs/2103.13757\n\n代码/code：None\n\n**Improved Handling of Motion Blur in Online Object Detection**\n\n论文/paper：https://arxiv.org/abs/2011.14448\n\n代码/code：None\n\n**IQDet: Instance-wise Quality Distribution Sampling for Object Detection**\n\n论文/paper：https://arxiv.org/abs/2104.06936\n\n代码/code：None\n\n:heavy_check_mark:**Instant-Teaching: An End-to-End Semi-Supervised Object Detection Framework**\n\n论文/paper：https://arxiv.org/abs/2103.11402\n\n代码/code：None\n\n:heavy_check_mark:**Instance Localization for Self-supervised Detection Pretraining**\n\n论文/paper： https://arxiv.org/abs/2102.08318\n\n代码/code：https://github.com/limbo0000/InstanceLoc\n\n:heavy_check_mark:**MeGA-CDA: Memory Guided Attention for Category-Aware Unsupervised Domain Adaptive Object Detection**\n\n论文/paper： 
https://arxiv.org/abs/2103.04224\n\n代码/code：None\n\n:heavy_check_mark:**Multiple Instance Active Learning for Object Detection**\n\n论文/paper：https://github.com/yuantn/MIAL/raw/master/paper.pdf\n\n代码/code：https://github.com/yuantn/MI-AOD\n\n**Line Segment Detection Using Transformers without Edges**（CVPR 2021 Oral）\n\n\n\n论文/paper： https://arxiv.org/abs/2101.01909\n\n代码/code：None\n\n**Open-Vocabulary Object Detection Using Captions** （CVPR 2021 Oral）\n\n论文/paper： https://arxiv.org/abs/2011.10678\n\n代码/code：None\n\n:heavy_check_mark:**OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection**\n\n论文/paper： https://arxiv.org/abs/2103.04507\n\n代码/code：https://github.com/VDIGPKU/OPANAS\n\n:heavy_check_mark:**OTA: Optimal Transport Assignment for Object Detection**\n\n论文/paper：https://arxiv.org/abs/2103.14259\n\n代码/code：https://github.com/Megvii-BaseDetection/OTA\n\n **Positive-Unlabeled Data Purification in the Wild for Object Detection**\n\n论文/paper：https://openaccess.thecvf.com/content/CVPR2021/papers/Guo_Positive-Unlabeled_Data_Purification_in_the_Wild_for_Object_Detection_CVPR_2021_paper.pdf\n\n代码/code：None\n\n**PSRR-MaxpoolNMS: Pyramid Shifted MaxpoolNMS with Relationship Recovery**\n\n论文/paper：https://arxiv.org/abs/2105.12990\n\n代码/code：None\n\n:heavy_check_mark:**Robust and Accurate Object Detection via Adversarial Learning**\n\n论文/paper：https://arxiv.org/abs/2103.13886\n\n代码/code：None\n\n:heavy_check_mark:**ReDet: A Rotation-equivariant Detector for Aerial Object Detection**\n\n论文/paper：https://arxiv.org/abs/2103.07733\n\n代码/code：https://github.com/csuhan/ReDet\n\n:heavy_check_mark:**Scale-aware Automatic Augmentation for Object Detection**\n\n论文/paper：https://arxiv.org/abs/2103.17220\n\n代码/code： https://github.com/Jia-Research-Lab/SA-AutoAug\n\n:heavy_check_mark:**Semantic Relation Reasoning for Shot-Stable Few-Shot Object 
Detection**\n\n解读：[解读1](https://blog.csdn.net/qidailiming1994/article/details/104881661?utm_medium=distribute.pc_relevant_t0.none-task-blog-2%7Edefault%7EBlogCommendFromBaidu%7Edefault-1.baidujs&dist_request_id=&depth_1-utm_source=distribute.pc_relevant_t0.none-task-blog-2%7Edefault%7EBlogCommendFromBaidu%7Edefault-1.baidujs) | [解读2](https://blog.csdn.net/smile909/article/details/115257127)\n\n论文/paper：https://arxiv.org/abs/2103.01903\n\n代码/code：None\n\n:heavy_check_mark:**Sparse R-CNN: End-to-End Object Detection with Learnable Proposals**\n\n论文/paper：https://arxiv.org/abs/2011.12450\n\n代码/code：https://github.com/PeizeSun/SparseR-CNN\n\n:heavy_check_mark:**There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge**\n\n论文/paper：https://arxiv.org/abs/2103.01353\n\n代码/code：http://rl.uni-freiburg.de/research/multimodal-distill\n\n:heavy_check_mark:**Towards Open World Object Detection** （CVPR 2021 Oral）\n\n论文/paper：https://arxiv.org/abs/2103.02603\n\n代码/code：https://github.com/JosephKJ/OWOD\n\n:heavy_check_mark:**Uncertainty-aware Joint Salient Object and Camouflaged Object Detection**\n\n论文/paper：https://arxiv.org/abs/2104.02628\n\n代码/code：https://github.com/JingZhang617/Joint_COD_SOD\n\n:heavy_check_mark:**UP-DETR: Unsupervised Pre-training for Object Detection with Transformers** (CVPR2021 Oral)\n\n解读：https://zhuanlan.zhihu.com/p/356349626 | [华南理工和微信AI提出的无监督预训练检测器UP-DETR](https://www.zhihu.com/question/432321109/answer/1606004872)\n\n论文/paper： https://arxiv.org/abs/2011.09094\n\n代码/code：https://github.com/dddzg/up-detr\n\n:heavy_check_mark:**YOLOF：You Only Look One-level Feature**\n\n论文/paper：https://arxiv.org/abs/2103.09460\n\n代码/code：https://github.com/megvii-model/YOLOF\n\n[返回目录/back](#Contents)\n\n\n\n<a name=\"ObjectTracking\"></a> \n\n## 目标跟踪/Object Tracking\n\n:heavy_check_mark:**IoU Attack: Towards Temporally Coherent Black-Box Adversarial Attack for Visual Object 
Tracking**\n\n论文/paper：https://arxiv.org/abs/2103.14938\n\n代码/code：https://github.com/VISION-SJTU/IoUattack\n\n:heavy_check_mark:**Graph Attention Tracking**\n\n论文/paper：https://arxiv.org/abs/2011.11204\n\n代码/code：https://github.com/ohhhyeahhh/SiamGAT\n\n:heavy_check_mark:**Learning a Proposal Classifier for Multiple Object Tracking**\n\n论文/paper：https://arxiv.org/abs/2103.07889\n\n代码/code：https://github.com/daip13/LPC_MOT\n\n:heavy_check_mark:**Probabilistic Tracklet Scoring and Inpainting for Multiple Object Tracking**\n\n论文/paper：https://arxiv.org/abs/2012.02337\n\n代码/code：https://github.com/fatemeh-slh/ArTIST\n\n:heavy_check_mark:**Rotation Equivariant Siamese Networks for Tracking**\n\n论文/paper： https://arxiv.org/abs/2012.13078\n\n代码/code：https://github.com/dkgupta90/re-siamnet\n\n:heavy_check_mark: **Towards More Flexible and Accurate Object Tracking with Natural Language: Algorithms and Benchmark**\n\n视频demo：[bilibili](https://www.bilibili.com/video/BV1bh411D72N/)| [youtube](https://www.youtube.com/watch?v=7lvVDlkkff0&ab_channel=XiaoWang)\n\n论文/paper：https://arxiv.org/pdf/2103.16746.pdf | [Homepage](https://sites.google.com/view/langtrackbenchmark/) \n\n代码/code： https://github.com/wangxiao5791509/TNL2K_evaluation_toolkit\n\n:heavy_check_mark:**Track to Detect and Segment: An Online Multi-Object Tracker**\n\n视频demo： [youtube](https://youtu.be/oGNtSFHRZJA) | [bilibili](https://www.bilibili.com/video/BV1f5411c774)\n\n论文/paper：https://arxiv.org/pdf/2103.08808.pdf | [主页/Homepage](https://jialianwu.com/projects/TraDeS.html)\n\n代码/code：https://github.com/JialianW/TraDeS\n\n:heavy_check_mark:**Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking** (CVPR 2021 Oral)\n\n论文/paper：https://arxiv.org/abs/2103.11681\n\n代码/code： https://github.com/594422814/TransformerTrack\n\n:heavy_check_mark:**Transformer Tracking**\n\n论文/paper：https://arxiv.org/abs/2103.15436\n\n代码/code： 
https://github.com/chenxin-dlut/TransT\n\n\n\n[返回目录/back](#Contents)\n\n\n\n<a name=\"Segmentation\"></a> \n\n## 语义分割/Segmentation\n\n:heavy_check_mark:**Anti-Adversarially Manipulated Attributions for Weakly and Semi-Supervised Semantic Segmentation**\n\n论文/paper：https://arxiv.org/abs/2103.08896 | [主页/Homepage](https://www.semion.io/doc/anti-adversarially-manipulated-attributions-for-weakly-and-semi-supervised-semantic-segmentation)\n\n代码/code：https://github.com/jbeomlee93/AdvCAM\n\n:heavy_check_mark:**A^2-FPN: Attention Aggregation based Feature Pyramid Network for Instance Segmentation**\n\n论文/paper：https://arxiv.org/abs/2105.03186\n\n代码/code：None\n\n:heavy_check_mark: **BBAM: Bounding Box Attribution Map for Weakly Supervised Semantic and Instance Segmentation**\n\n论文/paper：https://arxiv.org/abs/2103.08907\n\n代码/code：https://github.com/jbeomlee93/BBAM\n\n:heavy_check_mark:**Bidirectional Projection Network for Cross Dimension Scene Understanding** (CVPR 2021 Oral)\n\n论文/paper：https://arxiv.org/abs/2103.14326\n\n代码/code：https://github.com/wbhu/BPNet\n\n:heavy_check_mark:**Boundary IoU: Improving Object-Centric Image Segmentation Evaluation**\n\n论文/paper：https://arxiv.org/abs/2103.16562 | [主页/Homepage](https://bowenc0221.github.io/boundary-iou/)\n\n代码/code：https://github.com/bowenc0221/boundary-iou-api\n\n:heavy_check_mark:**Capturing Omni-Range Context for Omnidirectional Segmentation**\n\n论文/paper：https://arxiv.org/abs/2103.05687\n\n代码/code：None | [作者主页/Homepage](http://www.yangkailun.com/)\n\n:heavy_check_mark:**Coarse-to-Fine Domain Adaptive Semantic Segmentation with Photometric Alignment and Category-Center Regularization**\n\n论文/paper：https://arxiv.org/abs/2103.13041\n\n代码/code：None\n\n:heavy_check_mark:**Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations**\n\n论文/paper：https://arxiv.org/abs/2103.06342\n\n代码/code：None\n\n:heavy_check_mark: **Cross-Dataset Collaborative Learning for Semantic 
Segmentation**\n\n论文/paper：https://arxiv.org/abs/2103.11351\n\n代码/code：None\n\n:heavy_check_mark:**Cross-View Regularization for Domain Adaptive Panoptic Segmentation**\n\n论文/paper：https://arxiv.org/abs/2103.02584\n\n代码/code：None\n\n:heavy_check_mark:**Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation**\n\n论文/paper： https://arxiv.org/abs/2011.10033\n\n代码/code： https://github.com/xinge008/Cylinder3D\n\n:heavy_check_mark:**DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation**\n\n论文/paper：https://arxiv.org/abs/2011.09876\n\n代码/code：https://github.com/aliyun/DCT-Mask\n\n:heavy_check_mark:**Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers**\n\n论文/paper：https://arxiv.org/abs/2103.12340\n\n代码/code：https://github.com/lkeab/BCNet\n\n:heavy_check_mark:**DiNTS: Differentiable Neural Network Topology Search for 3D Medical Image Segmentation**(CVPR2021 Oral)\n\n论文/paper：https://arxiv.org/abs/2103.15954\n\n代码/code：None\n\n:heavy_check_mark:**End-to-End Video Instance Segmentation with Transformers** (CVPR2021 Oral)\n\n论文/paper： https://arxiv.org/abs/2011.14503\n\n代码/code：https://github.com/Epiphqny/VisTR\n\n:heavy_check_mark:**FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space**\n\n论文/paper：https://arxiv.org/abs/2103.06030\n\n代码/code：https://github.com/liuquande/FedDG-ELCFS\n\n:heavy_check_mark: **Fully Convolutional Networks for Panoptic Segmentation** (CVPR2021 Oral)\n\n论文/paper：https://arxiv.org/abs/2012.00720\n\n代码/code：https://github.com/Jia-Research-Lab/PanopticFCN\n\n:heavy_check_mark:**Incremental Few-Shot Instance Segmentation**\n\n论文/paper：https://arxiv.org/abs/2105.05312\n\n代码/code：https://github.com/danganea/iMTFA\n\n:heavy_check_mark:**Learning to Recommend Frame for Interactive Video Object Segmentation in the 
Wild**\n\n解读：https://www.zhihu.com/question/446299297/answer/1769080299\n\n论文/paper：https://arxiv.org/abs/2103.10391\n\n代码/code：https://github.com/svip-lab/IVOS-W\n\n:heavy_check_mark:**Learning Statistical Texture for Semantic Segmentation**\n\n论文/paper：https://arxiv.org/abs/2103.04133\n\n代码/code： None\n\n:heavy_check_mark:**Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation**\n\n论文/paper：https://arxiv.org/abs/2104.05239\n\n代码/code： https://github.com/tinyalpha/BPR\n\n:heavy_check_mark:**MetaCorrection: Domain-aware Meta Loss Correction for Unsupervised Domain Adaptation in Semantic Segmentation**\n\n论文/paper：https://arxiv.org/abs/2103.05254\n\n代码/code：None\n\n:heavy_check_mark:**Multi-Scale Aligned Distillation for Low-Resolution Detection**\n\n论文/paper：https://jiaya.me/papers/ms_align_distill_cvpr21.pdf\n\n代码/code：https://github.com/Jia-Research-Lab/MSAD\n\n:heavy_check_mark:**Multi-Source Domain Adaptation with Collaborative Learning for Semantic Segmentation**\n\n论文/paper：https://arxiv.org/abs/2103.04717\n\n代码/code：None\n\n:heavy_check_mark:**Non-Salient Region Object Mining for Weakly Supervised Semantic Segmentation**\n\n论文/paper： https://arxiv.org/abs/2103.14581\n\n代码/code：None\n\n:heavy_check_mark:**Panoptic-PolarNet: Proposal-free LiDAR Point Cloud Panoptic Segmentation**\n\n论文/paper：https://arxiv.org/abs/2103.14962\n\n代码/code：https://github.com/edwardzhou130/Panoptic-PolarNet\n\n:heavy_check_mark:**PLOP: Learning without Forgetting for Continual Semantic Segmentation**\n\n论文/paper：https://arxiv.org/abs/2011.11390\n\n代码/code：https://github.com/arthurdouillard/CVPR2021_PLOP\n\n:heavy_check_mark:**Prototypical Pseudo Label Denoising and Target Structure Learning for Domain Adaptive Semantic Segmentation**\n\n论文/paper：https://arxiv.org/abs/2101.10979\n\n代码/code：https://github.com/microsoft/ProDA\n\n:heavy_check_mark:**RefineMask: Towards High-Quality Instance Segmentation with Fine-Grained 
Features**\n\n论文/paper：https://arxiv.org/abs/2104.08569\n\n代码/code：https://github.com/zhanggang001/RefineMask/\n\n:heavy_check_mark:**Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers**\n\n论文/paper：https://arxiv.org/abs/2012.15840\n\n代码/code：https://github.com/fudan-zvg/SETR\n\n:heavy_check_mark:**RobustNet: Improving Domain Generalization in Urban-Scene Segmentation via Instance Selective Whitening**\n\n论文/paper：https://arxiv.org/abs/2103.15597\n\n代码/code：https://github.com/shachoi/RobustNet\n\n:heavy_check_mark:**Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion**\n\n论文/paper： https://arxiv.org/abs/2103.07074\n\n代码/code：https://github.com/ShiQiu0419/BAAF-Net\n\n:heavy_check_mark:**Semi-supervised Domain Adaptation based on Dual-level Domain Mixing for Semantic Segmentation**\n\n解读：https://blog.csdn.net/msy1412/article/details/114938788\n\n论文/paper： https://arxiv.org/abs/2103.04705\n\n代码/code：None\n\n:heavy_check_mark: **SG-Net: Spatial Granularity Network for One-Stage Video Instance Segmentation**\n\n视频demo：[youtube](https://www.youtube.com/watch?v=zft0T3YUgpM)\n\n论文/paper：https://arxiv.org/abs/2103.10284\n\n代码/code：https://github.com/goodproj13/SG-Net\n\n:heavy_check_mark:**Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges**\n\n论文/paper：http://arxiv.org/abs/2009.03137 | [主页 | Homepage](https://github.com/QingyongHu/SensatUrban)\n\n代码/code：https://github.com/QingyongHu/SensatUrban\n\n**Zero-Shot Instance Segmentation**\n\n解读：[CVPR 2021 | 创新奇智首次提出零样本实例分割，助力解决工业场景数据瓶颈难题](https://www.leiphone.com/news/202103/ioJRMuZHucU7cLp4.html)\n\n论文/paper：None\n\n代码/code：None\n\n[返回目录/back](#Contents)\n\n<a name=\"WSSS\"></a>\n\n## 弱监督语义分割/Weakly Supervised Semantic Segmentation\n\n:heavy_check_mark:**Railroad Is Not a Train: Saliency As Pseudo-Pixel Supervision for Weakly Supervised Semantic 
Segmentation**\n\n论文/paper：https://arxiv.org/abs/2105.08965\n\n代码/code：https://github.com/halbielee/EPS\n\n[返回目录/back](#Contents)\n\n<a name=\"MedicalImageSegmentation\"></a>\n\n# 医学图像分割/Medical Image Segmentation\n\n:heavy_check_mark:**FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space**\n\n论文/paper：https://arxiv.org/abs/2103.06030\n\n代码/code：https://github.com/liuquande/FedDG-ELCFS\n\n:heavy_check_mark:**DiNTS: Differentiable Neural Network Topology Search for 3D Medical Image Segmentation**\n\n论文/paper：https://arxiv.org/abs/2103.15954\n\n代码/code：None\n\n[返回目录/back](#Contents)\n\n<a name=\"VideoObjectSegmentation\"></a>\n\n# 视频目标分割/Video Object Segmentation\n\n:heavy_check_mark:**Learning Position and Target Consistency for Memory-based Video Object Segmentation**\n\n解读：[论文翻译](https://blog.csdn.net/weixin_39778049/article/details/116234177)\n\n论文/paper：https://arxiv.org/abs/2104.04329\n\n代码/code：None\n\n:heavy_check_mark:**SSTVOS: Sparse Spatiotemporal Transformers for Video Object Segmentation** (CVPR 2021 Oral)\n\n论文/paper：https://arxiv.org/abs/2101.08833\n\n代码/code：https://github.com/dukebw/SSTVOS\n\n:heavy_check_mark:**VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild**\n\n论文/paper：https://www.vspwdataset.com/CVPR2021__miao.pdf | [主页 / Homepage](https://www.vspwdataset.com/)\n\n代码/code：https://github.com/VSPW-dataset/VSPW_code\n\n[返回目录/back](#Contents)\n\n<a name=\"InteractiveVideoObjectSegmentation\"></a>\n\n# 交互式视频目标分割/Interactive Video Object Segmentation\n\n:heavy_check_mark:**Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild**\n\n解读：https://www.zhihu.com/question/446299297/answer/1769080299\n\n论文/paper：https://arxiv.org/abs/2103.10391\n\n代码/code：https://github.com/svip-lab/IVOS-W\n\n:heavy_check_mark:**Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware 
Fusion**\n\n视频/demo：https://hkchengrex.github.io/MiVOS/video.html#partb\n\n论文/paper：https://arxiv.org/abs/2103.07941 | [主页/ Homepage]( https://hkchengrex.github.io/MiVOS/)\n\n代码/code： https://github.com/hkchengrex/MiVOS\n\n[返回目录/back](#Contents)\n\n<a name=\"VisualTransformer\"></a>\n\n# Visual Transformer\n\n:heavy_check_mark:**End-to-End Human Object Interaction Detection with HOI Transformer**\n\n论文/paper： https://arxiv.org/abs/2103.04503\n\n代码/code：https://github.com/bbepoch/HoiTransformer\n\n:heavy_check_mark:**End-to-End Video Instance Segmentation with Transformers** (CVPR2021 Oral)\n\n论文/paper： https://arxiv.org/abs/2011.14503\n\n代码/code：https://github.com/Epiphqny/VisTR\n\n:heavy_check_mark:**HOTR: End-to-End Human-Object Interaction Detection with Transformers**\n\n论文/paper：https://arxiv.org/abs/2104.13682\n\n代码/code：https://github.com/bbepoch/HoiTransformer\n\n**HR-NAS: Searching Efficient High-Resolution Neural Architectures with Transformers** (CVPR 2021 Oral)\n\n论文/paper：None | [报道](https://news.ruc.edu.cn/archives/308662)\n\n代码/code：https://github.com/dingmyu/HR-NAS\n\n:heavy_check_mark:**Kaleido-BERT：Vision-Language Pre-training on Fashion Domain**\n\n视频讲解：http://dpfan.net/wp-content/uploads/Kaleido-BERT.mp4 \n\n论文/paper：https://arxiv.org/abs/2103.16110 | [中文版 / Chinese](https://github.com/mczhuge/Kaleido-BERT/blob/main/CVPR2021_KaleidoBERT_Chinese.pdf)\n\n代码/code： https://github.com/mczhuge/Kaleido-BERT\n\n:heavy_check_mark:**LoFTR: Detector-Free Local Feature Matching with Transformers**\n\n论文/paper： https://arxiv.org/abs/2104.00680 | [主页 / Homepage]( https://zju3dv.github.io/loftr/)\n\n代码/code：https://github.com/zju3dv/LoFTR\n\n:heavy_check_mark:**Multi-Modal Fusion Transformer for End-to-End Autonomous Driving**\n\n视频/demo：[youtube](https://www.youtube.com/watch?v=WxadQyQ2gMs)\n\n论文/paper： https://arxiv.org/abs/2104.09224\n\n代码/code： https://github.com/autonomousvision/transfuser\n\n:heavy_check_mark:**MIST: Multiple Instance Spatial Transformer 
Network**\n\n论文/paper：https://arxiv.org/abs/1811.10725\n\n代码/code：None\n\n:heavy_check_mark: **Multimodal Motion Prediction with Stacked Transformers**\n\n视频demo：[bilibili](https://www.bilibili.com/video/BV1bV411e7F7/) | [youtube](https://youtu.be/oUZQBGOEBMg)\n\n论文/paper：https://arxiv.org/pdf/2103.11624.pdf\n\n代码/code：https://github.com/decisionforce/mmTransformer\n\n:heavy_check_mark:**Pre-Trained Image Processing Transformer**\n\n论文/paper： https://arxiv.org/abs/2012.00364\n\n代码/code：https://github.com/huawei-noah/Pretrained-IPT\n\n:heavy_check_mark:**Pose Recognition with Cascade Transformers**\n\n论文/paper：https://arxiv.org/abs/2104.06976\n\n代码/code：https://github.com/mlpc-ucsd/PRTR\n\n:heavy_check_mark:**Revamping cross-modal recipe retrieval with hierarchical Transformers and self-supervised learning**\n\n论文/paper：https://www.amazon.science/publications/revamping-cross-modal-recipe-retrieval-with-hierarchical-transformers-and-self-supervised-learning\n\n代码/code：https://github.com/amzn/image-to-recipe-transformers\n\n:heavy_check_mark:**Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers**\n\n论文/paper：https://arxiv.org/abs/2012.15840\n\n代码/code：https://github.com/fudan-zvg/SETR\n\n:heavy_check_mark:**Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking** (CVPR 2021 Oral)\n\n论文/paper：https://arxiv.org/abs/2103.11681\n\n代码/code： https://github.com/594422814/TransformerTrack\n\n:heavy_check_mark:**Thinking Fast and Slow: Efficient Text-to-Visual Retrieval with Transformers**\n\n论文/paper： https://arxiv.org/abs/2103.16553\n\n代码/code：None\n\n:heavy_check_mark:**Transformer Tracking**\n\n论文/paper：https://arxiv.org/abs/2103.15436\n\n代码/code： https://github.com/chenxin-dlut/TransT\n\n:heavy_check_mark:**Transformer Interpretability Beyond Attention 
Visualization**\n\n论文/paper：https://arxiv.org/abs/2012.09838\n\n代码/code：https://github.com/hila-chefer/Transformer-Explainability\n\n:heavy_check_mark:**UP-DETR: Unsupervised Pre-training for Object Detection with Transformers** (CVPR2021 Oral)\n\n解读：https://zhuanlan.zhihu.com/p/356349626 | [华南理工和微信AI提出的无监督预训练检测器UP-DETR](https://www.zhihu.com/question/432321109/answer/1606004872)\n\n论文/paper： https://arxiv.org/abs/2011.09094\n\n代码/code：https://github.com/dddzg/up-detr\n\n:heavy_check_mark:**Variational Transformer Networks for Layout Generation**\n\n论文/paper：https://arxiv.org/abs/2104.02416\n\n代码/code：None\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"FaceRecognition\"></a>\n\n# 人脸识别/Face Recognition\n\n:heavy_check_mark:**A 3D GAN for Improved Large-pose Facial Recognition**\n\n论文/paper：https://arxiv.org/abs/2012.10545\n\n代码/code：None\n\n :heavy_check_mark: **MagFace: A Universal Representation for Face Recognition and Quality Assessment** (CVPR 2021 Oral)\n\n论文/paper：https://arxiv.org/abs/2103.06627\n\n代码/code：https://github.com/IrvingMeng/MagFace\n\n:heavy_check_mark: **WebFace260M: A Benchmark Unveiling the Power of Million-Scale Deep Face Recognition**\n\n论文/paper：https://arxiv.org/abs/2103.04098 | [主页 | Homepage](https://www.face-benchmark.org/) \n\n代码/code：None\n\n:heavy_check_mark: **When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework**(CVPR 2021 Oral)\n\n论文/paper：https://arxiv.org/abs/2103.01520\n\n代码/code：https://github.com/Hzzone/MTLFace\n\n[返回目录/back](#Contents)\n\n<a name=\"FaceDetection\"></a>\n\n# 人脸检测/Face Detection\n\n:heavy_check_mark: **CRFace: Confidence Ranker for Model-Agnostic Face Detection Refinement**\n\n论文/paper：https://arxiv.org/abs/2103.07017\n\n代码/code：None\n\n:heavy_check_mark: **HLA-Face: Joint High-Low Adaptation for Low Light Face Detection**\n\n解读：[北京大学 | HLA-Face：用于暗光人脸检测的联合高-低自适应（CVPR2021](https://hub.baai.ac.cn/view/7478)\n\n论文/paper： https://arxiv.org/abs/2104.01984 |[主页 | Homepage]( 
https://daooshee.github.io/HLA-Face-Website/) \n\n代码/code：https://github.com/daooshee/HLA-Face-Code\n\n[返回目录/back](#Contents)\n\n<a name=\"FaceAnti-Spoofing\"></a>\n\n# 人脸活体检测/Face Anti-Spoofing\n\n:heavy_check_mark: **Cross Modal Focal Loss for RGBD Face Anti-Spoofing**\n\n解读：[论文翻译](https://blog.csdn.net/bianlongpeng/article/details/114935866) | [CVPR2021 人脸防伪相关paper](https://zhuanlan.zhihu.com/p/355208676)\n\n论文/paper：https://arxiv.org/abs/2103.00948\n\n代码/code：None\n\n:heavy_check_mark: **Frequency-aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection**\n\n论文/paper：https://arxiv.org/abs/2103.09096\n\n代码/code：None\n\n:heavy_check_mark:**ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis**\n\n解读：[CVPR 2021 Oral｜商汤提出最新深伪评测基准 ForgeryNet：左右互搏——论火眼金睛的修炼](https://zhuanlan.zhihu.com/p/357301664)\n\n论文/paper：https://arxiv.org/abs/2103.05630 | [主页 | Homepage](https://yinanhe.github.io/projects/forgerynet.html)\n\n代码/code：None\n\n:heavy_check_mark: **MagDR: Mask-guided Detection and Reconstruction for Defending Deepfakes**\n\n解读：[Deepfake防御新思路有了！腾讯首次公开MagDR框架，已被AI顶会接收](https://cloud.tencent.com/developer/news/797874)\n\n论文/paper：https://arxiv.org/abs/2103.14211\n\n代码/code：None\n\n:heavy_check_mark: **Multi-attentional Deepfake Detection** (CVPR2021 preview)\n\n论文/paper：https://arxiv.org/abs/2103.02406\n\n代码/code：https://github.com/yoctta/multiple-attention\n\n[返回目录/back](#Contents)\n\n<a name=\"AgeEstimation\"></a>\n\n# 人脸年龄估计/Age Estimation\n\n:heavy_check_mark:**Continuous Face Aging via Self-estimated Residual Age Embedding**\n\n论文/paper：https://arxiv.org/abs/2105.00020\n\n代码/code：\n\n:heavy_check_mark: **PML: Progressive Margin Loss for Long-tailed Age Classification**\n\n论文/paper：https://arxiv.org/abs/2103.02140\n\n代码/code：None\n\n[返回目录/back](#Contents)\n\n<a name=\"FacialExpressionRecognition\"></a>\n\n# 人脸表情识别/Facial Expression Recognition\n\n:heavy_check_mark: **Affective Processes: stochastic modelling 
of temporal context for emotion and facial expression recognition**\n\n论文/paper：https://arxiv.org/abs/2103.13372\n\n代码/code： None\n\n:heavy_check_mark:**Feature Decomposition and Reconstruction Learning for Effective Facial Expression Recognition**\n\n论文/paper：https://arxiv.org/abs/2104.05160 | [厦大新闻](https://cs.xmu.edu.cn/info/1045/1926.htm)\n\n代码/code： None\n\n[返回目录/back](#Contents)\n\n<a name=\"FacialAttributeRecognition\"></a>\n\n# 人脸属性识别/Facial Attribute Recognition\n\n **Learning Spatial-Semantic Relationship for Facial Attribute Recognition with Limited Labeled Data**\n\n论文/paper：None | [厦大新闻](https://cs.xmu.edu.cn/info/1045/1926.htm)\n\n代码/code： None\n\n<a name=\"HumanPoseEstimation\"></a>\n\n# 人体姿态估计/Human Pose Estimation\n\n:heavy_check_mark:**Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video**\n\n视频demo：[youtube](https://www.youtube.com/watch?v=WB3nTnSQDII) | [bilibili](https://www.bilibili.com/video/BV1vQ4y1d71M/)\n\n论文/paper：https://arxiv.org/abs/2011.08627\n\n代码/code：https://github.com/hongsukchoi/TCMR_RELEASE\n\n :heavy_check_mark: **Deep Dual Consecutive Network for Human Pose Estimation**\n\n视频demo：[bilibili](https://www.bilibili.com/video/BV1iy4y177Mf?from=search&seid=10675595845989092503)\n\n论文/paper：https://arxiv.org/abs/2103.07254\n\n代码/code：https://github.com/Pose-Group/DCPose\n\n:heavy_check_mark:**HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation**\n\n论文/paper：https://arxiv.org/abs/2011.14672 | [主页/Homepage](https://jeffli.site/HybrIK/)\n\n代码/code： https://github.com/Jeff-sjtu/HybrIK\n\n:heavy_check_mark:**Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors**\n\n视频demo：[video](http://virtualhumans.mpi-inf.mpg.de/hps/hps_video.mp4)\n\n论文/paper：https://arxiv.org/abs/2103.17265\n\n代码/code：| [主页/Homepage](http://virtualhumans.mpi-inf.mpg.de/hps/)\n\n:heavy_check_mark: **Monocular 3D 
Multi-Person Pose Estimation by Integrating Top-Down and Bottom-Up Networks**\n\n论文/paper： https://arxiv.org/abs/2104.01797\n\n代码/code：https://github.com/3dpose/3D-Multi-Person-Pose\n\n:heavy_check_mark:**PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation**（CVPR 2021 Oral）\n\n论文/paper： https://arxiv.org/abs/2105.02465\n\n代码/code：https://github.com/jfzhang95/PoseAug\n\n[返回目录/back](#Contents)\n\n<a name=\"6DPoseEstimation\"></a>\n\n# 6D位姿估计 /6D Pose Estimation\n\n:heavy_check_mark:**FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation**\n\n视频demo：[bilibili](https://www.bilibili.com/video/BV1YU4y1a7Kp?from=search&seid=8306279574921937158) | [youtube](https://www.youtube.com/watch?v=SSi2TnyD6Is)\n\n论文/paper：https://arxiv.org/abs/2103.02242\n\n代码/code： https://github.com/ethnhe/FFB6D\n\n:heavy_check_mark:**FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism** (CVPR2021 Oral)\n\n论文/paper：https://arxiv.org/abs/2103.07054\n\n代码/code：https://github.com/DC1991/FS-Net\n\n:heavy_check_mark:**GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation**\n\n论文/paper：http://arxiv.org/abs/2102.12145\n\n代码/code：https://git.io/GDR-Net\n\n[返回目录/back](#Contents)\n\n<a name=\"HandPoseEstimation\"></a>\n\n## 手势姿态估计（重建）/Hand Pose Estimation (Hand Mesh Recovery)\n\n:heavy_check_mark:**Camera-Space Hand Mesh Recovery via Semantic Aggregation and Adaptive 2D-1D Registration**\n\n论文/paper：https://arxiv.org/abs/2103.02845\n\n代码/code：https://github.com/SeanChenxy/HandMesh\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"PersonRe-identification\"></a>\n\n## 行人重识别/Person Re-identification\n\n:heavy_check_mark:**Combined Depth Space based Architecture Search For Person Re-identification**\n\n论文/paper： https://arxiv.org/abs/2104.04163\n\n代码/code：None\n\n:heavy_check_mark:**Group-aware Label Transfer for Domain 
Adaptive Person Re-identification**\n\n论文/paper： https://arxiv.org/abs/2103.12366\n\n代码/code：https://github.com/zkcys001/UDAStrongBaseline\n\n:heavy_check_mark:**Intra-Inter Camera Similarity for Unsupervised Person Re-Identification**\n\n论文/paper： https://arxiv.org/abs/2103.11658\n\n代码/code：https://github.com/SY-Xuan/IICS\n\n:heavy_check_mark:**Joint Noise-Tolerant Learning and Meta Camera Shift Adaptation for Unsupervised Person Re-Identification**\n\n论文/paper： https://arxiv.org/abs/2103.04618\n\n代码/code：https://github.com/FlyingRoastDuck/MetaCam_DSCE\n\n:heavy_check_mark:**Learning to Generalize Unseen Domains via Memory-based Multi-Source Meta-Learning for Person Re-Identification**\n\n论文/paper：https://arxiv.org/abs/2012.00417\n\n代码/code：https://github.com/HeliosZhao/M3L\n\n:heavy_check_mark:**Lifelong Person Re-Identification via Adaptive Knowledge Accumulation**\n\n论文/paper：https://arxiv.org/abs/2103.12462\n\n代码/code：https://github.com/TPCD/LifelongReID\n\n:heavy_check_mark:**Meta Batch-Instance Normalization for Generalizable Person Re-Identification**\n\n论文/paper： https://arxiv.org/abs/2011.14670\n\n代码/code：https://github.com/bismex/MetaBIN\n\n:heavy_check_mark:**Neural Feature Search for RGB-Infrared Person Re-Identification**\n\n论文/paper：https://arxiv.org/abs/2104.02366\n\n代码/code：None\n\n:heavy_check_mark:**Unsupervised Multi-Source Domain Adaptation for Person Re-Identification**\n\n论文/paper：https://arxiv.org/abs/2104.12961\n\n代码/code：None\n\n:heavy_check_mark:**Watching You: Global-guided Reciprocal Learning for Video-based Person Re-identification**\n\n论文/paper：https://arxiv.org/abs/2103.04337\n\n代码/code：https://github.com/flysnowtiger/GRL\n\n**Coarse-To-Fine Person Re-Identification With Auxiliary-Domain Classification and Second-Order Information Bottleneck**\n\n论文/paper：None\n\n代码/code：None\n\n**Farewell to Mutual Information: Variational Distillation for Cross-Modal Person Re-Identification**\n\n论文/paper：None\n\n代码/code：None\n\n**Joint Generative 
and Contrastive Learning for Unsupervised Person Re-Identification**\n\n论文/paper：None\n\n代码/code：None\n\n**BiCnet-TKS: Learning Efficient Spatial-Temporal Representation for Video Person Re-Identification**\n\n论文/paper：None\n\n代码/code：None\n\n**Person30K: A Dual-Meta Generalization Network for Person Re-Identification**\n\n论文/paper：None\n\n代码/code：None\n\n**Diverse Part Discovery: Occluded Person Re-Identification With Part-Aware Transformer**\n\n论文/paper：None\n\n代码/code：None\n\n**Discover Cross-Modality Nuances for Visible-Infrared Person Re-Identification**\n\n论文/paper：None\n\n代码/code：None\n\n**Spatial-Temporal Correlation and Topology Learning for Person Re-Identification in Videos**\n\n论文/paper：None\n\n代码/code：None\n\n**Learning 3D Shape Feature for Texture-Insensitive Person Re-Identification**\n\n论文/paper：None\n\n代码/code：None\n\n**Partial Person Re-Identification With Part-Part Correspondence Learning**\n\n论文/paper：None\n\n代码/code：None\n\n**Fine-Grained Shape-Appearance Mutual Learning for Cloth-Changing Person Re-Identification**\n\n论文/paper：None\n\n代码/code：None\n\n**UnrealPerson: An Adaptive Pipeline Towards Costless Person Re-Identification**\n\n论文/paper：None\n\n代码/code：None\n\n**Person Re-Identification Using Heterogeneous Local Graph Attention Networks**\n\n论文/paper：None\n\n代码/code：None\n\n**Wide-Baseline Multi-Camera Calibration Using Person Re-Identification**\n\n论文/paper：None\n\n代码/code：None\n\n**Unsupervised Pre-Training for Person Re-Identification**\n\n论文/paper：None\n\n代码/code：None\n\n**Generalizable Person Re-Identification With Relevance-Aware Mixture of Experts**\n\n论文/paper：None\n\n代码/code：None\n\n[返回目录/back](#Contents)\n\n<a name=\"PersonSearch\"></a>\n\n# 行人搜索/Person Search\n\n:heavy_check_mark:**Anchor-Free Person Search**\n\n解读：https://zhuanlan.zhihu.com/p/359617800\n\n论文/paper：https://arxiv.org/abs/2103.11617\n\n代码/code：https://github.com/daodaofr/AlignPS\n\n<a name=\"CrowdCounting\"></a>\n\n## 人群计数 / Crowd 
Counting\n\n:heavy_check_mark:**Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting**\n\n论文/paper：https://arxiv.org/abs/2012.04529\n\n代码/code：https://github.com/chen-judge/RGBTCrowdCounting\n\n:heavy_check_mark:**Detection, Tracking, and Counting Meets Drones in Crowds: A Benchmark**\n\n论文/paper：https://arxiv.org/abs/2105.02440\n\n代码/code：https://github.com/VisDrone/DroneCrowd\n\n[返回目录/back](#Contents)\n\n<a name=\"GAN\"></a>\n\n## GAN\n\n:heavy_check_mark: **Anycost GANs for Interactive Image Synthesis and Editing**\n\n视频/demo：[youtube](https://www.youtube.com/watch?v=_yEziPl9AkM)\n\n论文/paper：https://arxiv.org/abs/2103.03243\n\n代码/code：https://github.com/mit-han-lab/anycost-gan\n\n:heavy_check_mark:**A 3D GAN for Improved Large-pose Facial Recognition**\n\n论文/paper：https://arxiv.org/abs/2012.10545\n\n代码/code：None            \n\n:heavy_check_mark:**Closed-Form Factorization of Latent Semantics in GANs** （CVPR 2021 Oral）\n\n视频/demo：[youtube](https://www.youtube.com/watch?v=OFHW2WbXXIQ)\n\n论文/paper：https://arxiv.org/abs/2007.06600 | [主页 | Homepage](https://genforce.github.io/sefa/)\n\n代码/code：https://github.com/genforce/sefa\n\n:heavy_check_mark: **CoMoGAN: continuous model-guided image-to-image translation** \n\n论文/paper：https://arxiv.org/abs/2103.06879\n\n代码/code：https://github.com/cv-rits/CoMoGAN\n\n :heavy_check_mark: **DeFLOCNet: Deep Image Editing via Flexible Low-level Controls**\n\n论文/paper：https://arxiv.org/abs/2103.12723\n\n代码/code：https://github.com/KumapowerLIU/DeFLOCNet\n\n :heavy_check_mark: **DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network**\n\n解读：https://zhuanlan.zhihu.com/p/366231002\n\n论文/paper：https://arxiv.org/abs/2103.07893\n\n代码/code：https://github.com/ruiliu-ai/DivCo\n\n :heavy_check_mark: **Diverse Semantic Image Synthesis via Probability Distribution Modeling**\n\n论文/paper：https://arxiv.org/abs/2103.06878\n\n代码/code：https://github.com/tzt101/INADE\n\n 
:heavy_check_mark: **Drafting and Revision: Laplacian Pyramid Network for Fast High-Quality Artistic Style Transfer**\n\n论文/paper：https://arxiv.org/abs/2104.05376\n\n代码/code： https://github.com/PaddlePaddle/PaddleGAN/\n\n :heavy_check_mark: **Efficient Conditional GAN Transfer with Knowledge Propagation across Classes**\n\n论文/paper：https://arxiv.org/abs/2102.06696\n\n代码/code：https://github.com/mshahbazi72/cGANTransfer\n\n:heavy_check_mark: **Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation**\n\n论文/paper：https://arxiv.org/abs/2008.00951 | [主页 | Homepage](https://eladrich.github.io/pixel2style2pixel/)\n\n代码/code：https://github.com/eladrich/pixel2style2pixel\n\n**Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing**\n\n论文/paper：https://yunjey.github.io/\n\n代码/code：None\n\n:heavy_check_mark:**Generative Hierarchical Features from Synthesizing Images** (CVPR 2021 (Oral))\n\n论文/paper：https://arxiv.org/abs/2007.10379 | [主页 | Homepage](https://genforce.github.io/ghfeat/)\n\n代码/code：https://github.com/genforce/ghfeat\n\n:heavy_check_mark:**Hijack-GAN: Unintended-Use of Pretrained, Black-Box GANs**\n\n论文/paper：https://arxiv.org/abs/2011.14107 | [主页 | Homepage](https://a514514772.github.io/hijackgan/)\n\n代码/code：https://github.com/a514514772/hijackgan\n\n:heavy_check_mark:**HistoGAN: Controlling Colors of GAN-Generated and Real Images via Color Histograms**\n\n论文/paper：https://arxiv.org/abs/2011.11731\n\n代码/code：https://github.com/mahmoudnafifi/HistoGAN\n\n:heavy_check_mark:**High-Resolution Photorealistic Image Translation in Real-Time: A Laplacian Pyramid Translation Network**\n\n论文/paper： https://arxiv.org/abs/2105.09188\n\n代码/code：https://github.com/csjliang/LPTN\n\n:heavy_check_mark:**HumanGAN: A Generative Model of Humans Images**\n\n论文/paper：https://arxiv.org/abs/2103.06902\n\n代码/code：None\n\n:heavy_check_mark:**ID-Unet: Iterative Soft and Hard Deformation for View 
Synthesis**\n\n论文/paper：https://arxiv.org/abs/2103.02264\n\n代码/code： https://github.com/MingyuY/Iterative-view-synthesis\n\n:heavy_check_mark:**Image-to-image Translation via Hierarchical Style Disentanglement**\n\n论文/paper：https://arxiv.org/abs/2103.01456\n\n代码/code：https://github.com/imlixinyang/HiSD\n\n:heavy_check_mark:**LOHO: Latent Optimization of Hairstyles via Orthogonalization**\n\n论文/paper：https://arxiv.org/abs/2103.03891\n\n代码/code：https://github.com/dukebw/LOHO\n\n**PD-GAN: Probabilistic Diverse GAN for Image Inpainting**\n\n论文/paper：None\n\n代码/code：https://github.com/KumapowerLIU/PD-GAN\n\n:heavy_check_mark: **PISE: Person Image Synthesis and Editing with Decoupled GAN**\n\n论文/paper： https://arxiv.org/abs/2103.04023\n\n代码/code： https://github.com/Zhangjinso/PISE\n\n:heavy_check_mark:**pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis** (CVPR 2021 (Oral))\n\n论文/paper：https://arxiv.org/abs/2012.00926 | [主页 | Homepage]( https://marcoamonteiro.github.io/pi-GAN-website/)\n\n代码/code：None\n\n:heavy_check_mark:**Regularizing Generative Adversarial Networks under Limited Data**\n\n论文/paper：https://arxiv.org/abs/2104.03310 | [主页  | Homepage](https://hytseng0509.github.io/lecam-gan/)\n\n代码/code：https://github.com/google/lecam-gan\n\n:heavy_check_mark:**Teachers Do More Than Teach: Compressing Image-to-Image Models**\n\n论文/paper：https://arxiv.org/abs/2103.03467\n\n代码/code：https://github.com/snap-research/CAT\n\n:heavy_check_mark:**TediGAN: Text-Guided Diverse Image Generation and Manipulation**\n\n论文/paper：https://arxiv.org/abs/2012.03308\n\n代码/code：https://github.com/weihaox/TediGAN\n\n:heavy_check_mark: **Training Generative Adversarial Networks in One Stage**\n\n论文/paper：https://arxiv.org/abs/2103.00430\n\n代码/code：None\n\n:heavy_check_mark:**Towards Real-World Blind Face Restoration with Generative Facial Prior**\n\n解读：[解读1](https://www.linkresearcher.com/theses/77dbb9e3-bbdf-4f7e-83f1-da797cb98667) | 
[2](https://zhuanlan.zhihu.com/p/344602972)\n\n论文/paper： https://arxiv.org/abs/2101.04061\n\n代码/code：None\n\n[返回目录/back](#Contents)\n\n<a name=\"CPM\"></a>\n\n## 彩妆迁移 / Color-Pattern Makeup Transfer\n\n:heavy_check_mark:**Lipstick ain't enough: Beyond Color-Matching for In-the-Wild Makeup Transfer**\n\n论文/paper：https://arxiv.org/abs/2104.01867\n\n代码/code：https://github.com/VinAIResearch/CPM\n\n:heavy_check_mark:**Spatially-invariant Style-codes Controlled Makeup Transfer**\n\n论文/paper：http://www.shengfenghe.com/?attachment_id=26109&download=1\n\n代码/code：https://github.com/makeuptransfer/SCGAN\n\n[返回目录/back](#Contents)\n\n<a name=\"FontGeneration\"></a>\n\n## 字体生成 / Font Generation\n\n:heavy_check_mark:**DG-Font: Deformable Generative Networks for Unsupervised Font Generation**\n\n论文/paper：https://arxiv.org/abs/2104.03064\n\n代码/code：https://github.com/ecnuycxie/DG-Font\n\n[返回目录/back](#Contents)\n\n<a name=\"OCR\"></a>\n\n## OCR\n\n[CVPR 2021 OCR领域相关22篇论文速览](https://mp.weixin.qq.com/s?__biz=MzU4NTY4Mzg1Mw==&mid=2247508038&idx=3&sn=1d6715d16606a894287945e280d51367&chksm=fd8426a4caf3afb2619229f05a007b454f7eb53f1a88f663e9f61528863ce881c42c00d30d30&scene=27#wechat_redirect)\n\n### **文字图像处理（超分辨率增强、文字分割、文档版面分析）**\n\n**Scene Text Telescope: Text-Focused Scene Image Super-Resolution**\n\n论文/paper：https://openaccess.thecvf.com/content/CVPR2021/papers/Chen_Scene_Text_Telescope_Text-Focused_Scene_Image_Super-Resolution_CVPR_2021_paper.pdf\n\n代码/code：None\n\n**Rethinking Text Segmentation: A Novel Dataset and a Text-Specific Refinement Approach**\n\n论文/paper：http://arxiv.org/abs/2011.14021\n\n代码/code：None\n\n**Variational Transformer Networks for Layout Generation**\n\n论文/paper：http://arxiv.org/abs/2104.02416\n\n代码/code：None\n\n### 场景文本检测、识别/Scene Text Detection/Recognition\n\n:heavy_check_mark:**Fourier Contour Embedding for Arbitrary-Shaped Text Detection**\n\n论文/paper：https://arxiv.org/abs/2104.10442\n\n代码/code：https://github.com/open-mmlab/mmocr 
(https://www.pazhoulab.com/2021/04/1704/ 官方介绍将于 mmocr 开源)\n\n:heavy_check_mark:**MOST: A Multi-Oriented Scene Text Detector with Localization Refinement**\n\n解读：https://zhuanlan.zhihu.com/p/362679808\n\n论文/paper：https://arxiv.org/abs/2104.01070\n\n代码/code：None\n\n**Progressive Contour Regression for Arbitrary-Shape Scene Text Detection**\n\n论文/paper：https://openaccess.thecvf.com/content/CVPR2021/papers/Dai_Progressive_Contour_Regression_for_Arbitrary-Shape_Scene_Text_Detection_CVPR_2021_paper.pdf\n\n代码/code：None\n\n**Semantic-Aware Video Text Detection**\n\n论文/paper：https://openaccess.thecvf.com/content/CVPR2021/papers/Feng_Semantic-Aware_Video_Text_Detection_CVPR_2021_paper.pdf\n\n代码/code：None\n\n:heavy_check_mark:**What If We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels**\n\n论文/paper：https://arxiv.org/abs/2103.04400\n\n代码/code：https://github.com/ku21fan/STR-Fewer-Labels\n\n### **端到端文字识别**\n\n **Implicit Feature Alignment: Learn To Convert Text Recognizer to Text Spotter**\n\n论文/paper：http://arxiv.org/abs/2106.05920\n\n代码/code：https://github.com/WangTianwei/Implicit-feature-alignment\n\n**A Multiplexed Network for End-to-End, Multilingual OCR**\n\n论文/paper：http://arxiv.org/abs/2103.15992\n\n代码/code：None\n\n### **手写文字分析与识别**\n\n:heavy_check_mark:**MetaHTR: Towards Writer-Adaptive Handwritten Text Recognition**\n\n论文/paper：https://arxiv.org/abs/2104.01876 | [主页 / Homepage](https://aneeshan95.github.io/)\n\n代码/code：None\n\n**Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting**\n\n论文/paper：https://arxiv.org/abs/2103.13716\n\n代码/code：https://github.com/AyanKumarBhunia/Self-Supervised-Learning-for-Sketch\n\n### **其它（文档图像预训练模型，Text VQA、数据集，Retrieval , 应用）**\n\n**SelfDoc: Self-Supervised Document Representation Learning**\n\n论文/paper：http://arxiv.org/abs/2106.03331\n\n代码/code：None\n\n**TAP: Text-Aware Pre-Training for Text-VQA and 
Text-Caption**\n\n论文/paper：http://arxiv.org/abs/2012.04638\n\n代码/code：None\n\n**Improving OCR-Based Image Captioning by Incorporating Geometrical Relationship**\n\n论文/paper：https://openaccess.thecvf.com/content/CVPR2021/papers/Wang_Improving_OCR-Based_Image_Captioning_by_Incorporating_Geometrical_Relationship_CVPR_2021_paper.pdf\n\n代码/code：None\n\n**Deep Texture Recognition via Exploiting Cross-Layer Statistical Self-Similarity**\n\n论文/paper：None | 介绍：https://www.pazhoulab.com/2021/04/1704/\n\n代码/code：None\n\n:heavy_check_mark:**Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition**\n\n论文/paper：https://arxiv.org/abs/2103.06495\n\n代码/code：https://github.com/FangShancheng/ABINet\n\n:heavy_check_mark: **Scene Text Retrieval via Joint Text Detection and Similarity Learning**\n\n论文/paper：https://arxiv.org/abs/2104.01552\n\n代码/code：https://github.com/lanfeng4659/STR-TDSL\n\n**Self-attention based Text Knowledge Mining for Text Detection**\n\n论文/paper：None\n\n代码/code：https://github.com/CVI-SZU/STKM\n\n:heavy_check_mark:**TextOCR: Towards large-scale end-to-end reasoning for arbitrary-shaped scene text**\n\n论文/paper：https://arxiv.org/abs/2105.05486 | [主页 / Homepage](https://textvqa.org/textocr)\n\n代码/code：None\n\n:heavy_check_mark:**Towards Accurate Text-based Image Captioning with Content Diversity Exploration**\n\n论文/paper：https://github.com/guanghuixu/AnchorCaptioner/blob/main/projects/TextCap_CVPR.pdf\n\n代码/code：https://github.com/guanghuixu/AnchorCaptioner\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"Retrieval\"></a>\n\n## 图像、视频检索 / Image Retrieval/Video retrieval\n\n:heavy_check_mark:**StyleMeUp: Towards Style-Agnostic Sketch-Based Image Retrieval**\n\n论文/paper：https://arxiv.org/abs/2103.15706\n\n代码/code：None\n\n:heavy_check_mark:**QAIR: Practical Query-efficient Black-Box Attacks for Image Retrieval**\n\n论文/paper：https://arxiv.org/abs/2103.02927\n\n代码/code：None\n\n:heavy_check_mark:**On Semantic Similarity in Video 
Retrieval**\n\n论文/paper：https://arxiv.org/abs/2103.10095 | [主页 / Homepage](https://mwray.github.io/SSVR/)\n\n代码/code：https://github.com/mwray/Semantic-Video-Retrieval\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageAnimation\"></a>\n\n## Image Animation\n\n:heavy_check_mark: **Motion Representations for Articulated Animation**\n\n视频/demo：[youtube](https://www.youtube.com/watch?v=gpBYN8t8_yY)\n\n论文/paper： https://arxiv.org/abs/2104.11280 | [主页 | Homepage](https://snap-research.github.io/articulated-animation/)\n\n代码/code：https://github.com/snap-research/articulated-animation\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageMatting\"></a>\n\n## 抠图/Image Matting/Video Matting\n\n:heavy_check_mark:**Deep Video Matting via Spatio-Temporal Alignment and Aggregation**\n\n论文/paper：https://arxiv.org/abs/2104.11208\n\n代码/code：https://github.com/nowsyn/DVM\n\n:heavy_check_mark:**Improved Image Matting via Real-time User Clicks and Uncertainty Estimation**\n\n论文/paper：https://arxiv.org/abs/2012.08323\n\n代码/code：None\n\n**Mask Guided Matting via Progressive Refinement Network**\n\n论文/paper：https://arxiv.org/abs/2012.06722\n\n代码/code：https://github.com/yucornetto/MGMatting\n\n:heavy_check_mark:**Omnimatte: Associating Objects and Their Effects in Video**\n\n论文/paper：https://arxiv.org/abs/2105.06993 | [主页/Homepage](https://omnimatte.github.io/)\n\n代码/code：None\n\n:heavy_check_mark:**Real-Time High Resolution Background Matting**\n\n视频/demo：[youtube](https://youtu.be/oMfPTeYDF9g)\n\n论文/paper：https://arxiv.org/abs/2012.07810 | [主页/Homepage](https://grail.cs.washington.edu/projects/background-matting-v2/)\n\n代码/code：https://github.com/PeterL1n/BackgroundMattingV2\n\n:heavy_check_mark:**Semantic Image Matting**\n\n论文/paper：https://arxiv.org/abs/2104.08201\n\n代码/code：https://github.com/nowsyn/SIM\n\n[返回目录/back](#Contents)\n\n<a name=\"SuperResolution\"></a>\n\n# 超分辨率/Super Resolution\n\n:heavy_check_mark:**AdderSR: Towards Energy Efficient Image 
Super-Resolution**\n\n论文/paper：https://arxiv.org/abs/2009.08891\n\n代码/code：None\n\n:heavy_check_mark:**BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond**\n\n论文/paper：https://arxiv.org/abs/2012.02181| [主页 / Homepage](https://ckkelvinchan.github.io/projects/BasicVSR/)\n\n代码/code：https://github.com/ckkelvinchan/BasicVSR-IconVSR\n\n:heavy_check_mark:**ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic**\n\n论文/paper：https://arxiv.org/abs/2103.04039\n\n代码/code：https://github.com/Xiangtaokong/ClassSR\n\n:heavy_check_mark:**Cross-MPI: Cross-scale Stereo for Image Super-Resolution using Multiplane Images**\n\n解读：[CVPR 2021，Cross-MPI以底层场景结构为线索的端到端网络，在大分辨率（x8）差距下也可完成高保真的超分辨率](https://zhuanlan.zhihu.com/p/354752197)\n\n论文/paper：https://arxiv.org/abs/2011.14631 | [主页 / Homepage](http://www.liuyebin.com/crossMPI/crossMPI.html)\n\n代码/code：None\n\n:heavy_check_mark: **Data-Free Knowledge Distillation For Image Super-Resolution**\n\n论文/paper：https://openaccess.thecvf.com/content/WACV2021/papers/Chawla_Data-Free_Knowledge_Distillation_for_Object_Detection_WACV_2021_paper.pdf\n\n代码/code：None\n\n:heavy_check_mark:**Exploring Sparsity in Image Super-Resolution for Efficient Inference**\n\n论文/paper：https://arxiv.org/abs/2006.09603\n\n代码/code：https://github.com/LongguangWang/SMSR\n\n:heavy_check_mark:**Flow-based Kernel Prior with Application to Blind Super-Resolution**\n\n论文/paper：https://arxiv.org/pdf/2103.15977.pdf\n\n代码/code：https://github.com/JingyunLiang/FKP\n\n:heavy_check_mark:**GLEAN: Generative Latent Bank for Large-Factor Image Super-Resolution**\n\n论文/paper：https://ckkelvinchan.github.io/papers/glean.pdf | [主页 / Homepage](https://ckkelvinchan.github.io/projects/GLEAN/)\n\n代码/code：None\n\n:heavy_check_mark:**Interpreting Super-Resolution Networks with Local Attribution Maps**\n\n解读：https://arxiv.org/pdf/2011.11036.pdf  |  
[CVPR2021-顾津锦：使用局部归因图理解和可视化超分辨网络](https://zhuanlan.zhihu.com/p/363139999)\n\n论文/paper：https://x-lowlevel-vision.github.io/lam.html\n\n代码/code：None | [colab](https://colab.research.google.com/drive/1ZodQ8CRCfHw0y6BweG9zB3YrK_lYWcDk?usp=sharing)\n\n**LAU-Net: Latitude Adaptive Upscaling Network for Omnidirectional Image Super-resolution**\n\n论文/paper：None | [主页/Homepage](http://buaamc2.net/)\n\n代码/code：https://github.com/wangh-allen/LAU-Net\n\n:heavy_check_mark:**Learning Continuous Image Representation with Local Implicit Image Function**\n\n论文/paper：https://arxiv.org/abs/2012.09161 | [主页/Homepage](https://yinboc.github.io/liif/)\n\n代码/code：https://github.com/yinboc/liif\n\n:heavy_check_mark:**MASA-SR: Matching Acceleration and Spatial Adaptation for Reference-Based Image Super-Resolution**\n\n论文/paper：https://jiaya.me/papers/masasr_cvpr21.pdf\n\n代码/code：https://github.com/Jia-Research-Lab/MASA-SR\n\n:heavy_check_mark:**Robust Reference-based Super-Resolution via C²-Matching**\n\n论文/paper：https://arxiv.org/abs/2106.01863 | [主页 / Homepage](https://ckkelvinchan.github.io/)\n\n代码/code：https://github.com/yumingj/C2-Matching\n\n**Temporal Modulation Network for Controllable Space-Time Video Super-Resolution**\n\n论文/paper：None | [主页 / Homepage](https://csjunxu.github.io/)\n\n代码/code：https://github.com/CS-GangXu/CS-GangXu\n\n:heavy_check_mark: **Unsupervised Degradation Representation Learning for Blind Super-Resolution**\n\n论文/paper：https://arxiv.org/abs/2104.00416\n\n代码/code：https://github.com/LongguangWang/DASR\n\n:heavy_check_mark:**Video Rescaling Networks with Joint Optimization Strategies for Downscaling and Upscaling**\n\n论文/paper：https://arxiv.org/abs/2103.14858 | [主页 / Homepage](https://ding3820.github.io/MIMO-VRN/) \n\n代码/code：https://github.com/ding3820/MIMO-VRN\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageRestoration\"></a>\n\n# 图像复原/Image Restoration\n\n:heavy_check_mark:**Multi-Stage Progressive Image 
Restoration**\n\n论文/paper：https://arxiv.org/abs/2102.02808\n\n代码/code：https://github.com/swz30/MPRNet\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageInpainting\"></a>\n\n# 图像补全/Image Inpainting\n\n**Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE**\n\n论文/paper：https://arxiv.org/abs/2103.10022\n\n代码/code：https://github.com/USTC-JialunPeng/Diverse-Structure-Inpainting\n\n**Image Inpainting Guided by Coherence Priors of Semantics and Textures**\n\n论文/paper：https://arxiv.org/abs/2012.08054\n\n代码/code：None\n\n:heavy_check_mark:**Image Inpainting with External-internal Learning and Monochromic Bottleneck**\n\n解读：https://zhuanlan.zhihu.com/p/369717594\n\n论文/paper：https://arxiv.org/abs/2104.09068\n\n代码/code：https://github.com/Tengfei-Wang/external-internal-inpainting\n\n**PD-GAN: Probabilistic Diverse GAN for Image Inpainting**\n\n论文/paper： https://openaccess.thecvf.com/content/CVPR2021/papers/Liu_PD-GAN_Probabilistic_Diverse_GAN_for_Image_Inpainting_CVPR_2021_paper.pdf\n\n代码/code：https://github.com/KumapowerLIU/PD-GAN\n\n:heavy_check_mark:**TransFill: Reference-guided Image Inpainting by Merging Multiple Color and Spatial Transformations**\n\n论文/paper： https://arxiv.org/abs/2103.15982 | [主页 / Homepage](https://yzhouas.github.io/projects/TransFill/index.html)\n\n代码/code：None\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageEditing\"></a>\n\n# 图像编辑/Image Editing\n\n:heavy_check_mark:**Anycost GANs for Interactive Image Synthesis and Editing**\n\n论文/paper：https://arxiv.org/abs/2103.03243\n\n代码/code：https://github.com/mit-han-lab/anycost-gan\n\n:heavy_check_mark:**DeFLOCNet: Deep Image Editing via Flexible Low-level Controls**\n\n论文/paper：https://arxiv.org/abs/2103.12723 \n\n代码/code：https://github.com/KumapowerLIU/DeFLOCNet\n\n:heavy_check_mark:**Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing**\n\n视频/demo：[youtube](https://www.youtube.com/watch?v=qCapNyRA_Ng)\n\n论文/paper：None | [主页 / 
Homepage](https://yunjey.github.io/)\n\n代码/code：None\n\n:heavy_check_mark:**High-Fidelity and Arbitrary Face Editing**\n\n论文/paper：https://arxiv.org/abs/2103.15814\n\n代码/code：None\n\n:heavy_check_mark:**PISE: Person Image Synthesis and Editing with Decoupled GAN**\n\n论文/paper：https://arxiv.org/abs/2103.04023\n\n代码/code：https://github.com/Zhangjinso/PISE\n\n:heavy_check_mark:**StyleMapGAN: Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing**\n\n视频/demo：[youtube](https://www.youtube.com/watch?v=qCapNyRA_Ng)\n\n论文/paper：https://arxiv.org/abs/2104.14754\n\n代码/code：https://github.com/naver-ai/StyleMapGAN\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageMatching\"></a>\n\n# 图像匹配/Image Matching\n\n:heavy_check_mark:**LoFTR: Detector-Free Local Feature Matching with Transformers**\n\n论文/paper： https://arxiv.org/abs/2104.00680 | [主页 / Homepage]( https://zju3dv.github.io/loftr/)\n\n代码/code：https://github.com/zju3dv/LoFTR\n\n:heavy_check_mark:**Convolutional Hough Matching Networks** （CVPR 2021 Oral）\n\n论文/paper：https://arxiv.org/abs/2103.16831 | [主页 / Homepage]( http://cvlab.postech.ac.kr/research/CHM/)\n\n代码/code：None\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageBlending\"></a>\n\n# 图像融合/Image Blending\n\n:heavy_check_mark:**Bridging the Visual Gap: Wide-Range Image Blending**\n\n论文/paper：https://arxiv.org/abs/2103.15149\n\n代码/code：https://github.com/julia0607/Wide-Range-Image-Blending\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageDehazing\"></a>\n\n## 图像去雾/Image Dehazing\n\n:heavy_check_mark:**Contrastive Learning for Compact Single Image Dehazing**\n\n解读：[对比学习去雾](https://www.jianshu.com/p/2b7fc8149ab2)\n\n论文/paper：https://arxiv.org/abs/2104.09367\n\n代码/code： https://github.com/GlassyWu/AECR-Net\n\n[返回目录/back](#Contents)\n\n<a name=\"ReflectionRemoval\"></a>\n\n## 反光去除/Reflection Removal\n\n:heavy_check_mark:**Panoramic Image Reflection Removal**\n\n视频/demo：[youtube](https://www.youtube.com/watch?v=ZumVEWN8Ito) | 
[bilibili](https://www.bilibili.com/video/BV1vv41177oc/)\n\n论文/paper：http://ci.idm.pku.edu.cn/CVPR21c.pdf\n\n代码/code：None \n\n :heavy_check_mark: **Robust Reflection Removal with Reflection-free Flash-only Cues**\n\n解读：[CVPR 2021 | 港科大：如何利用闪光图像(flash image)来去除反光？](https://zhuanlan.zhihu.com/p/358337679)\n\n论文/paper：https://arxiv.org/pdf/2103.04273\n\n代码/code：https://github.com/ChenyangLEI/flash-reflection-removal\n\n[返回目录/back](#Contents)\n\n<a name=\"LaneDetection\"></a>\n\n## 车道线检测/Lane Detection\n\n:heavy_check_mark:**Keep your Eyes on the Lane: Real-time Attention-guided Lane Detection**\n\n视频/demo：[youtube](https://www.youtube.com/watch?v=ghs93acwkBQ)\n\n论文/paper：https://arxiv.org/abs/2010.12035\n\n代码/code：https://github.com/lucastabelini/LaneATT\n\n:heavy_check_mark:**Focus on Local: Detecting Lane Marker from Bottom Up via Key Point**\n\n论文/paper：https://arxiv.org/abs/2105.13680\n\n代码/code：None\n\n[返回目录/back](#Contents)\n\n<a name=\"FluidReconstruction\"></a>\n\n## 流体重建/Fluid Reconstruction\n\n:heavy_check_mark:**Global Transport for Fluid Reconstruction with Learned Self-Supervision**\n\n视频/demo：[youtube](https://www.youtube.com/watch?v=h-JhMqcuOQA)\n\n论文/paper：https://arxiv.org/abs/2104.06031 | [主页 | Homepage](https://ge.in.tum.de/publications/2021-franz-globtrans/)\n\n代码/code：https://github.com/tum-pbs/Global-Flow-Transport\n\n[返回目录/back](#Contents)\n\n<a name=\"FrameInterpolation\"></a>\n\n## 视频插帧/Frame Interpolation\n\n:heavy_check_mark:**CDFI: Compression-Driven Network Design for Frame Interpolation**\n\n视频demo：[youtube](https://www.youtube.com/watch?v=KEUcw4xoB5E)\n\n论文/paper：https://arxiv.org/abs/2103.10559\n\n代码/code： https://github.com/tding1/CDFI\n\n:heavy_check_mark:**Deep Animation Video Interpolation in the Wild**\n\n论文/paper：https://arxiv.org/abs/2104.02495\n\n代码/code：https://github.com/lisiyao21/AnimeInterp/\n\n:heavy_check_mark:**FLAVR: Flow-Agnostic Video Representations for Fast Frame 
Interpolation**\n\n视频demo：[youtube](https://www.youtube.com/watch?v=TcQd0LCLCzo)\n\n论文/paper：https://arxiv.org/abs/2012.08512 | [主页 | Homepage](https://tarun005.github.io/FLAVR/)\n\n代码/code：https://github.com/tarun005/FLAVR\n\n[返回目录/back](#Contents)\n\n<a name=\"3DPointCloud\"></a>\n\n## 3D点云/3D point cloud\n\n:heavy_check_mark:**3D AffordanceNet: A Benchmark for Visual Object Affordance Understanding**\n\n论文/paper： https://arxiv.org/abs/2103.16397\n\n代码/code：https://github.com/Gorilla-Lab-SCUT/AffordanceNet\n\n[返回目录/back](#Contents)\n\n<a name=\"Label-Noise\"></a>\n\n# 标签噪声 Label-Noise\n\n:heavy_check_mark:**A Second-Order Approach to Learning with Instance-Dependent Label Noise** (CVPR 2021 Oral)\n\n论文/paper： https://arxiv.org/abs/2012.11854\n\n代码/code：https://github.com/UCSC-REAL/CAL\n\n[返回目录/back](#Contents)\n\n<a name=\"AdversarialExamples\"></a>\n\n# 对抗样本 / Adversarial Examples\n\n:heavy_check_mark:**Enhancing the Transferability of Adversarial Attacks through Variance Tuning**\n\n论文/paper： https://arxiv.org/abs/2103.15571\n\n代码/code：https://github.com/JHL-HUST/VT\n\n:heavy_check_mark:**LiBRe: A Practical Bayesian Approach to Adversarial Detection**\n\n论文/paper： https://arxiv.org/abs/2103.14835\n\n代码/code：None\n\n:heavy_check_mark:**Natural Adversarial Examples**\n\n论文/paper： https://arxiv.org/abs/1907.07174\n\n代码/code：https://github.com/hendrycks/natural-adv-examples\n\n [返回目录/back](#Contents)\n\n<a name=\"Other\"></a>\n\n### 其他/Other\n\n:heavy_check_mark:**Domain Consensus Clustering for Universal Domain Adaptation**\n\n论文/paper： http://reler.net/papers/guangrui_cvpr2021.pdf\n\n代码/code：https://github.com/Solacex/Domain-Consensus-Clustering\n\n\n\n:heavy_check_mark:**Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation**\n\n视频demo：[youtube](https://www.youtube.com/watch?v=lNQQHIggnUg) | [bilibili](https://www.bilibili.com/video/BV1eh411S77C)\n\n论文/paper： |  [主页 | 
Homepage](https://hangz-nju-cuhk.github.io/projects/PC-AVS)\n\n代码/code：https://github.com/Hangz-nju-cuhk/Talking-Face_PC-AVS\n\n[返回目录/back](#Contents)\n\n:heavy_check_mark:**Layout-Guided Novel View Synthesis from a Single Indoor Panorama**\n\n解读：https://www.zhihu.com/question/446299297/answer/1769080299\n\n论文/paper：https://arxiv.org/abs/2103.17022\n\n代码/code：https://github.com/bluestyle97/PNVS\n\n参考：\n\nhttps://github.com/extreme-assistant/CVPR2021-Paper-Code-Interpretation/blob/master/CVPR2021.md\n\nhttps://github.com/Kobaayyy/Awesome-CVPR2021-CVPR2020-Low-Level-Vision/blob/master/CVPR2021.md\n\nhttps://github.com/Qingcsai/awesome-Deepfakes\n\nhttps://github.com/592McAvoy/fake-face-detection\n\nhttps://githubmemory.com/repo/lkeab/CVPR-2021-Papers\n\nhttps://github.com/52CV/CVPR-2021-Papers\n\nhttps://github.com/amusi/CVPR2021-Papers-with-Code\n\nhttps://github.com/xialeiliu/Awesome-Incremental-Learning\n\nhttps://www.codetd.com/en/article/12856576\n\n[如何评价CVPR 2021的论文/paper接收结果？](https://www.zhihu.com/question/446299297/answer/1756260197)\n\nhttps://duoli.org/\n\nhttps://cv.snu.ac.kr/index.php/publication-international/\n\nhttps://github.com/bismex/Awesome-person-re-identification#3-CVPR2021\n\nhttps://github.com/yflv-yanxia/scene_text\n\nhttps://githubmemory.com/repo/FLHonker/Awesome-Knowledge-Distillation\n"
  },
  {
    "path": "CVPR2022.md",
    "content": "# CVPR2022-Papers-with-Code-Demo\n\n :star_and_crescent:**CVPR2021论文下载：https://pan.baidu.com/share/init?surl=gjfUQlPf73MCk4vM8VbzoA**\n\n**密码：aicv**\n\n :star_and_crescent:**CVPR2022论文下载：添加微信: nvshenj125, 备注 CVPR 2022 即可获取全部论文pdf**\n \n :star_and_crescent:**福利 注册即可领取 200 块计算资源 : https://www.bkunyun.com/wap/console?source=aistudy**\n [使用说明](https://mp.weixin.qq.com/s?__biz=MzU4NTY4Mzg1Mw==&amp;mid=2247521550&amp;idx=1&amp;sn=db4c7f609bd61ae7734b9e012a763f98&amp;chksm=fd8413eccaf39afa686f69f2df2463f4a6a8233ba3b3edf698513bbee556c9f6c21e835b8eb8&token=705359263&lang=zh_CN#rd)\n\n\n欢迎关注公众号：AI算法与图像处理\n\n:star2: [CVPR 2022](https://cvpr2022.thecvf.com/) 持续更新最新论文/paper和相应的开源代码/code！\n\n:car: CVPR 2022 收录列表ID：https://drive.google.com/file/d/15JFhfPboKdUcIH9LdbCMUFmGq_JhaxhC/view\n\n:car: 官网链接：http://cvpr2022.thecvf.com/\n\nB站demo：https://space.bilibili.com/288489574\n\n> :hand: ​注：欢迎各位大佬提交issue，分享CVPR 2022论文/paper和开源项目！共同完善这个项目\n>\n> 往年顶会论文汇总：\n>\n> [CVPR2021](https://github.com/DWCTOD/CVPR2022-Papers-with-Code-Demo/blob/main/CVPR2021.md)\n>\n> [ICCV2021](https://github.com/DWCTOD/ICCV2021-Papers-with-Code-Demo)\n\n### **:fireworks: 欢迎进群** | Welcome\n\nCVPR 2022 论文/paper交流群已成立！已经收录的同学，可以添加微信：**nvshenj125**，请备注：**CVPR+姓名+学校/公司名称**！一定要根据格式申请，可以拉你进群。\n\n<a name=\"Contents\"></a>\n\n\n\n### :hammer: **目录 |Table of Contents（点击直接跳转）**\n\n<details open>\n<summary> 目录（右侧点击可折叠）</summary>\n\n- [Backbone](#Backbone)\n- [数据集/Dataset](#Dataset)\n- [NAS](#NAS)\n- [Knowledge Distillation](#KnowledgeDistillation)\n- [多模态 / Multimodal ](#Multimodal )\n- [对比学习/Contrastive Learning](#ContrastiveLearning)\n- [图神经网络 / Graph Neural Networks](#GNN)\n- [胶囊网络 / Capsule Network](#CapsuleNetwork)\n- [图像分类 / Image Classification](#ImageClassification)\n- [目标检测/Object Detection](#ObjectDetection)\n- [目标跟踪/Object Tracking](#ObjectTracking)\n- [轨迹预测/Trajectory Prediction](#TrajectoryPrediction)\n- [语义分割/Segmentation](#Segmentation)\n- [弱监督语义分割/Weakly Supervised Semantic 
Segmentation](#WSSS)\n- [医学图像分割](#MedicalImageSegmentation)\n- [视频目标分割/Video Object Segmentation](#VideoObjectSegmentation)\n- [交互式视频目标分割/Interactive Video Object Segmentation](#InteractiveVideoObjectSegmentation)\n- [Visual Transformer](#VisualTransformer)\n- [深度估计/Depth Estimation](#DepthEstimation)\n- [人脸识别/Face Recognition](#FaceRecognition)\n- [人脸检测/Face Detection](#FaceDetection)\n- [人脸活体检测/Face Anti-Spoofing](#FaceAnti-Spoofing)\n- [人脸年龄估计/Age Estimation](#AgeEstimation)\n- [人脸表情识别/Facial Expression Recognition](#FacialExpressionRecognition)\n- [人脸属性识别/Facial Attribute Recognition](#FacialAttributeRecognition)\n- [人脸编辑/Facial Editing](#FacialEditing)\n- [换脸/Face Swap](#FaceSwap)\n- [人体姿态估计/Human Pose Estimation](#HumanPoseEstimation)\n- [6D位姿估计 /6D Pose Estimation](#6DPoseEstimation)\n- [手势姿态估计（重建）/Hand Pose Estimation( Hand Mesh Recovery)](#HandPoseEstimation)\n- [视频动作检测/Video Action Detection](#VideoActionDetection)\n- [手语翻译/Sign Language Translation](#SignLanguageTranslation)\n- [3D人体重建](#3D人体重建)\n- [行人重识别/Person Re-identification](#PersonRe-identification)\n- [行人搜索/Person Search](#PersonSearch)\n- [人群计数 / Crowd Counting](#CrowdCounting)\n- [GAN](#GAN)\n- [彩妆迁移 / Color-Pattern Makeup Transfer](#CPM)\n- [字体生成 / Font Generation](#FontGeneration)\n- [场景文本检测、识别/Scene Text Detection/Recognition](#OCR)\n- [图像、视频检索 / Image Retrieval/Video retrieval](#Retrieval)\n- [Image Animation](#ImageAnimation)\n- [抠图/Image Matting](#ImageMatting)\n- [超分辨率/Super Resolution](#SuperResolution)\n- [图像复原/Image Restoration](#ImageRestoration)\n- [图像补全/Image Inpainting](#ImageInpainting)\n- [图像去噪/Image Denoising](#ImageDenoising)\n- [图像编辑/Image Editing](#ImageEditing)\n- [图像拼接/Image stitching](#Imagestitching)\n- [图像匹配/Image Matching](#ImageMatching)\n- [图像融合/Image Blending](#ImageBlending)\n- [图像去雾/Image Dehazing](#ImageDehazing)\n- [图像压缩/Image Compression](#ImageCompression)\n- [反光去除/Reflection Removal](#ReflectionRemoval)\n- [车道线检测/Lane Detection](#LaneDetection)\n- [自动驾驶 / 
Autonomous Driving](#AutonomousDriving)\n- [流体重建/Fluid Reconstruction](#FluidReconstruction)\n- [场景重建 / Scene Reconstruction](#SceneReconstruction)\n- [视频插帧/Frame Interpolation](#FrameInterpolation)\n- [视频超分 / Video Super-Resolution](#VideoSuper-Resolution)\n- [ 3D点云/3D point cloud]( #3DPointCloud)\n- [标签噪声 / Label-Noise](#Label-Noise)\n- [对抗样本/Adversarial Examples](#AdversarialExamples)\n- [其他/Other](#Other)\n\n\n</details>\n\n<a name=\"Backbone\"></a>\n\n## Backbone\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"Dataset\"></a> \n\n## 数据集/Dataset\n\n**3MASSIV: Multilingual, Multimodal and Multi-Aspect dataset of Social Media Short Videos**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14456\n- 代码/Code: None\n\n**Assembly101: A Large-Scale Multi-View Video Dataset for Understanding Procedural Activities**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14712\n- 代码/Code: None\n\n**DynamicEarthNet: Daily Multi-Spectral Satellite Dataset for Semantic Change Segmentation**\n\n- 论文/Paper: http://arxiv.org/abs/2203.12560\n- 代码/Code: https://mediatum.ub.tum.de/1650201\n\n**Dataset Distillation by Matching Training Trajectories**\n\n- 论文/Paper: http://arxiv.org/abs/2203.11932\n- 代码/Code: https://github.com/GeorgeCazenavette/mtt-distillation\n\n**FERV39k: A Large-Scale Multi-Scene Dataset for Facial Expression Recognition in Videos**\n\n- 论文/Paper：https://arxiv.org/abs/2203.09463\n\n- 代码/Code：\n\n**GrainSpace: A Large-scale Dataset for Fine-grained and Domain-adaptive Recognition of Cereal Grains**\n\n- 论文/Paper：https://arxiv.org/abs/2203.05306\n- 代码/Code：https://github.com/hellodfan/GrainSpace\n\n**STCrowd: A Multimodal Dataset for Pedestrian Perception in Crowded Scenes**\n\n论文/Paper: http://arxiv.org/pdf/2204.01026\n\n代码/Code: https://github.com/4dvlab/stcrowd\n\n**ObjectFolder 2.0: A Multisensory Object Dataset for Sim2Real Transfer**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02389\n- 代码/Code: None\n\n**BEHAVE: Dataset and Method for Tracking Human Object Interactions**\n\n- 论文/Paper: 
http://arxiv.org/pdf/2204.06950\n- 代码/Code: None\n\n**SoccerNet-Tracking: Multiple Object Tracking Dataset and Benchmark in Soccer Videos**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.06918\n- 代码/Code: None\n\n**Hephaestus: A large scale multitask dataset towards InSAR understanding**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.09435\n- 代码/Code: None\n\n**A New Dataset and Transformer for Stereoscopic Video Super-Resolution**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.10039\n- 代码/Code: https://github.com/H-deep/Trans-SVSR/\n\n[返回目录/back](#Contents)\n\n<a name=\"NAS\"></a> \n\n## NAS\n\n\n\n**Optimizing Elimination Templates by Greedy Parameter Search**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14901\n- 代码/Code: None\n\n**Searching for Network Width with Bilaterally Coupled Network**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.13714\n- 代码/Code: None\n\n**Arch-Graph: Acyclic Architecture Relation Predictor for Task-Transferable Neural Architecture Search**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.05941\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"KnowledgeDistillation\"></a> \n\n## Knowledge Distillation\n\n**Decoupled Knowledge Distillation**\n\n- 论文/Paper：https://arxiv.org/abs/2203.08679\n- 代码/Code：https://github.com/megvii-research/mdistiller\n\n**Knowledge Distillation with the Reused Teacher Classifier**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14001\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"Multimodal\"></a> \n\n## 多模态 / Multimodal\n\n**Balanced Multimodal Learning via On-the-fly Gradient Modulation**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15332\n- 代码/Code: None\n\n**Conditional Prompt Learning for Vision-Language Models**\n\n- 论文/Paper：https://arxiv.org/abs/2203.05557\n- 代码/Code：https://github.com/KaiyangZhou/CoOp\n\n**Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation**\n\n- 论文/Paper: http://arxiv.org/abs/2203.13161\n- 代码/Code: None\n\n**Motron: Multimodal Probabilistic Human Motion Forecasting**\n\n- 
论文/Paper：https://arxiv.org/abs/2203.04132\n- 代码/Code：\n\n**StyleT2I: Toward Compositional and High-Fidelity Text-to-Image Synthesis**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15799\n- 代码/Code: https://github.com/zhihengli-UR/StyleT2I\n\n**Text2Pos: Text-to-Point-Cloud Cross-Modal Localization**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15125\n- 代码/Code: None\n\n**Towards Implicit Text-Guided 3D Shape Generation**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14622\n- 代码/Code: None\n\n**UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection**\n\n- 论文/Paper: http://arxiv.org/abs/2203.12745\n- 代码/Code: None\n\n**Versatile Multi-Modal Pre-Training for Human-Centric Perception**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.13815\n- 代码/Code: None\n\n**X-Pool: Cross-Modal Language-Video Attention for Text-Video Retrieval**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15086\n- 代码/Code: https://github.com/layer6ai-labs/xpool\n\n**ViSTA: Vision and Scene Text Aggregation for Cross-Modal Retrieval**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16778\n- 代码/Code: None\n\n**STCrowd: A Multimodal Dataset for Pedestrian Perception in Crowded Scenes**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.01026\n\n- 代码/Code: https://github.com/4dvlab/stcrowd\n\n**XMP-Font: Self-Supervised Cross-Modality Pre-training for Few-Shot Font Generation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.05084\n- 代码/Code: None\n\n**Robust Cross-Modal Representation Learning with Progressive Self-Distillation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.04588\n- 代码/Code: None\n\n**Multimodal Transformer for Nursing Activity Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.04564\n- 代码/Code: https://github.com/Momilijaz96/MMT_for_NCRC\n\n**Probabilistic Compositional Embeddings for Multimodal Image Retrieval**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.05845\n- 代码/Code: https://github.com/andreineculai/MPC\n\n**Are Multimodal Transformers Robust to Missing Modality?**\n\n- 论文/Paper: 
http://arxiv.org/pdf/2204.05454\n- 代码/Code: None\n\n**Multimodal Token Fusion for Vision Transformers**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.08721\n- 代码/Code: None\n\n**Transformer Decoders with MultiModal Regularization for Cross-Modal Food Retrieval**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.09730\n- 代码/Code: https://github.com/mshukor/TFood\n\n**CLIP-Art: Contrastive Pre-training for Fine-Grained Art Classification**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.14244\n- 代码/Code: https://github.com/KeremTurgutlu/clip_art\n\n**Vision-Language Pre-Training for Boosting Scene Text Detectors**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.13867\n- 代码/Code: None\n\n**Cross-modal Representation Learning for Zero-shot Action Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.01657\n- 代码/Code: None\n\n**Episodic Memory Question Answering**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.01652\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"ContrastiveLearning\"></a> \n\n## Contrastive Learning\n\n**Contrastive learning of Class-agnostic Activation Map for Weakly Supervised Object Localization and Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.13505\n- 代码/Code: None\n\n**ContrastMask: Contrastive Learning to Segment Every Thing**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.09775\n- 代码/Code: None\n\n**Fair Contrastive Learning for Facial Attribute Classification**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16209\n- 代码/Code: https://github.com/sungho-coolg/fscl\n\n**Frame-wise Action Representations for Long Videos via Sequence Contrastive Learning**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14957\n- 代码/Code: None\n\n**Rethinking Minimal Sufficient Representation in Contrastive Learning**\n\n- 论文/Paper：https://arxiv.org/abs/2203.07004\n\n- 代码/Code：https://github.com/Haoqing-Wang/InfoCL\n\n**Selective-Supervised Contrastive Learning with Noisy Labels**\n\n- 论文/Paper：https://arxiv.org/abs/2203.04181\n- 代码/Code：https://github.com/ShikunLi/Sel-CL\n\n**Unsupervised 
Deraining: Where Contrastive Learning Meets Self-similarity**\n\n- 论文/Paper: http://arxiv.org/abs/2203.11509\n- 代码/Code: None\n\n**Fine-grained Temporal Contrastive Learning for Weakly-supervised Temporal Action Localization**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16800\n- 代码/Code: https://github.com/MengyuanChen21/CVPR2022-FTCL\n\n**Unified Contrastive Learning in Image-Text-Label Space**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.03610\n- 代码/Code: https://github.com/microsoft/unicl\n\n**Probabilistic Representations for Video Contrastive Learning**\n\n- 论文/Paper: http://arxiv.org/abs/2204.03946\n- 代码/Code: None\n\n**Use All The Labels: A Hierarchical Multi-Label Contrastive Learning Framework**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.13207\n- 代码/Code: https://github.com/salesforce/hierarchicalContrastiveLearning\n\n**UTC: A Unified Transformer with Inter-Task Contrastive Learning for Visual Dialog**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.00423\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"GNN\"></a> \n\n## 图神经网络 / Graph Neural Networks\n\n**Lifelong Graph Learning**\n\n- 论文/paper：https://arxiv.org/abs/2009.00647\n- 代码/code：https://github.com/wang-chen/LGL\n\n**Long-term Visual Map Sparsification with Heterogeneous GNN**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15182\n- 代码/Code: None\n\n**SkinningNet: Two-Stream Graph Convolutional Neural Network for Skinning Prediction of Synthetic Characters**\n\n- 论文/paper：https://arxiv.org/abs/2203.04746\n- 代码/code：https://imatge-upc.github.io/skinningnet/\n\n[返回目录/back](#Contents)\n\n<a name=\"CapsuleNetwork\"></a> \n\n## 胶囊网络 / Capsule Network\n\n**HP-Capsule: Unsupervised Face Part Discovery by Hierarchical Parsing Capsule Network**\n\n- 论文/Paper: http://arxiv.org/abs/2203.10699\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageClassification\"></a> \n\n## 图像分类 / Image Classification\n\n**CAD: Co-Adapting Discriminative Features for Improved Few-Shot Classification**\n\n- 论文/Paper: 
http://arxiv.org/pdf/2203.13465\n- 代码/Code: None\n\n**Integrative Few-Shot Learning for Classification and Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15712\n- 代码/Code: None\n\n**Matching Feature Sets for Few-Shot Image Classification**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00949\n- 代码/Code: None\n\n**Joint Distribution Matters: Deep Brownian Distance Covariance for Few-Shot Classification**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.04567\n- 代码/Code: None\n\n**Regression or Classification? Reflection on BP prediction from PPG data using Deep Neural Networks in the scope of practical applications**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.05605\n- 代码/Code: None\n\n**Revisiting Vicinal Risk Minimization for Partially Supervised Multi-Label Classification Under Data Scarcity**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.08954\n- 代码/Code: None\n\n**Self-supervised Learning for Sonar Image Classification**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.09323\n- 代码/Code: https://github.com/agrija9/ssl-sonar-images\n\n**Generating Representative Samples for Few-Shot Classification**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.02918\n- 代码/Code: https://github.com/cvlab-stonybrook/fsl-rsvae\n\n[返回目录/back](#Contents)\n\n<a name=\"ObjectDetection\"></a> \n\n## 目标检测/Object Detection\n\n**A Dual Weighting Label Assignment Scheme for Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.09730\n- 代码/Code: https://github.com/strongwolf/dw\n\n**Implicit Motion Handling for Video Camouflaged Object Detection**\n\n- 论文/Paper：https://arxiv.org/abs/2203.07363\n\n- 代码/Code：\n\n**Democracy Does Matter: Comprehensive Feature Mining for Co-Salient Object Detection**\n\n- 论文/Paper：https://arxiv.org/abs/2203.05787\n- 代码/Code：\n\n**DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection**\n\n- 论文/Paper：https://arxiv.org/abs/2203.08195\n\n- 代码/Code：https://github.com/tensorflow/lingvo/tree/master/lingvo/\n\n**Efficient Two-Stage Detection of Human-Object Interactions 
with a Novel Unary-Pairwise Transformer**\n\n- 论文/paper：https://arxiv.org/abs/2112.01838 | [主页](https://fredzzhang.com/unary-pairwise-transformers/)\n- 代码/code：https://github.com/fredzzhang/upt\n\n**Expanding Low-Density Latent Regions for Open-Set Object Detection**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14911\n- 代码/Code: None\n\n**Ev-TTA: Test-Time Adaptation for Event-Based Object Recognition**\n\n- 论文/Paper: http://arxiv.org/abs/2203.12247\n- 代码/Code: None\n\n**Canonical Voting: Towards Robust Oriented Bounding Box Detection in 3D Scenes**\n\n- 论文/paper：https://arxiv.org/abs/2011.12001\n- 代码/code：https://github.com/qq456cvb/CanonicalVoting\n\n**Back to Reality: Weakly-supervised 3D Object Detection with Shape-guided Label Enhancement**\n\n- 论文/Paper：https://arxiv.org/abs/2203.05238\n- 代码/Code：https://github.com/xuxw98/BackToReality\n\n**LiDAR Snowfall Simulation for Robust 3D Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15118\n- 代码/Code: None\n\n**Learning to Prompt for Open-Vocabulary Object Detection with Vision-Language Model**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14940\n- 代码/Code: None\n\n**Knowledge Distillation as Efficient Pre-training: Faster Convergence, Higher Data-efficiency, and Better Transferability**\n\n- 论文/Paper：https://arxiv.org/abs/2203.05180\n- 代码/Code：https://github.com/CVMI-Lab/KDEP\n\n**Optimal Correction Cost for Object Detection Evaluation**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14438\n- 代码/Code: None\n\n**Point2Seq: Detecting 3D Objects as Sequences**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.13394\n- 代码/Code: None\n\n**Point Density-Aware Voxels for LiDAR 3D Object Detection**\n\n- 论文/Paper：https://arxiv.org/abs/2203.05662\n- 代码/Code：https://github.com/TRAILab/PDV\n\n**MonoJSG: Joint Semantic and Geometric Cost Volume for Monocular 3D Object Detection**\n\n- 论文/Paper：https://arxiv.org/abs/2203.08563\n- 代码/Code：\n\n**MonoDTR: Monocular 3D Object Detection with Depth-Aware Transformer**\n\n- 论文/Paper: 
http://arxiv.org/abs/2203.10981\n- 代码/Code: None\n\n**MonoDETR: Depth-aware Transformer for Monocular 3D Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.13310\n\n**Real-time Object Detection for Streaming Perception**\n\n- 论文/Paper: http://arxiv.org/abs/2203.12338\n- 代码/Code: https://github.com/yancie-yjr/StreamYOLO\n\n**SIOD: Single Instance Annotated Per Category Per Image for Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15353\n- 代码/Code: None\n\n**SIGMA: Semantic-complete Graph Matching for Domain Adaptive Object Detection**\n\n- 论文/Paper：https://arxiv.org/abs/2203.06398\n- 代码/Code：https://github.com/CityU-AIM-Group/SIGMA\n\n**Sparse Fuse Dense: Towards High Quality 3D Detection with Depth Completion**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.09780\n- 代码/Code: None\n\n**Task-specific Inconsistency Alignment for Domain Adaptive Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15345\n- 代码/Code: None\n\n**TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers**\n\n- 论文/Paper: http://arxiv.org/abs/2203.11496\n- 代码/Code: https://github.com/XuyangBai/TransFusion\n\n**VISTA: Boosting 3D Object Detection via Dual Cross-VIew SpaTial Attention**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.09704\n- 代码/Code: https://github.com/gorilla-lab-scut/vista\n\n**Voxel Set Transformer: A Set-to-Set Approach to 3D Object Detection from Point Clouds**\n\n- 论文/Paper: http://arxiv.org/abs/2203.10314\n- 代码/Code: None\n\n**Rope3D: The Roadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.13608\n- 代码/Code: None\n\n**Understanding 3D Object Articulation in Internet Videos**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16531\n- 代码/Code: None\n\n**AdaMixer: A Fast-Converging Query-Based Object Detector**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16507\n- 代码/Code: https://github.com/mcg-nju/adamixer\n\n**Forecasting from LiDAR via Future Object Detection**\n\n- 
论文/Paper: http://arxiv.org/pdf/2203.16297\n- 代码/Code: None\n\n**Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16220\n- 代码/Code: https://github.com/dlut-dimt/tardal\n\n**Learning of Global Objective for Network Flow in Multi-Object Tracking**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16210\n- 代码/Code: None\n\n**FLOAT: Factorized Learning of Object Attributes for Improved Multi-object Multi-part Scene Parsing**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16168\n- 代码/Code: None\n\n**Omni-DETR: Omni-Supervised Object Detection with Transformers**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16089\n- 代码/Code: None\n\n**Learning to Detect Mobile Objects from LiDAR Scans Without Labels**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15882\n- 代码/Code: https://github.com/yurongyou/modest\n\n**Multi-Granularity Alignment Domain Adaptation for Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16897\n- 代码/Code: None\n\n**CAT-Det: Contrastively Augmented Transformer for Multi-modal 3D Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00325\n\n- 代码/Code: None\n\n**R(Det)^2: Randomized Decision Routing for Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00794\n\n- 代码/Code: None\n\n**Homography Loss for Monocular 3D Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00754\n- 代码/Code: https://github.com/gujiaqivadin/HomographyLoss\n\n**Overcoming Catastrophic Forgetting in Incremental Object Detection via Elastic Response Distillation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02136\n- 代码/Code: None\n\n**Towards Robust Adaptive Object Detection under Noisy Annotations**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02620\n- 代码/Code: None\n\n**Towards Open-Set Object Detection and Discovery**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.05604\n- 代码/Code: None\n\n**DAIR-V2X: A Large-Scale Dataset for Vehicle-Infrastructure Cooperative 
3D Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.05575\n- 代码/Code: https://github.com/AIR-THU/DAIR-V2X\n\n**HyperDet3D: Learning a Scene-conditioned 3D Object Detector**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.05599\n- 代码/Code: None\n\n**Dense Learning based Semi-Supervised Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.07300\n- 代码/Code: https://github.com/chenbinghui1/DSL\n\n**Entropy-based Active Learning for Object Detection with Progressive Diversity Constraint**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.07965\n- 代码/Code: None\n\n**Target-Relevant Knowledge Preservation for Multi-Source Domain Adaptive Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.07964\n- 代码/Code: None\n\n**Modeling Missing Annotations for Incremental Learning in Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.08766\n- 代码/Code: https://github.com/fcdl94/MMA\n\n**Augmentation of Atmospheric Turbulence Effects on Thermal Adapted Object Detection Models**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.08745\n- 代码/Code: None\n\n**Focal Sparse Convolutional Networks for 3D Object Detection**\n\n- 论文/Paper: http://arxiv.org/abs/2204.12463\n- 代码/Code: http://github.com/dvlab-research/FocalsConv\n\n**Rotationally Equivariant 3D Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.13630\n- 代码/Code: None\n\n**Cross Domain Object Detection by Target-Perceived Dual Branch Distillation**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.01291\n- 代码/Code: https://github.com/feobi1999/tdd\n\n**Dynamic Sparse R-CNN**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.02101\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n\n\n<a name=\"ObjectTracking\"></a> \n\n## 目标跟踪/Object Tracking\n\n**DanceTrack: Multi-Object Tracking in Uniform Appearance and Diverse Motion**\n\n- 论文/Paper：https://arxiv.org/abs/2111.14690\n- 代码/Code：https://github.com/DanceTrack/DanceTrack\n\n**Global Tracking Transformers**\n\n- 论文/Paper: http://arxiv.org/abs/2203.13250\n- 代码/Code: None\n\n**MixFormer: 
End-to-End Tracking with Iterative Mixed Attention**\n\n- 论文/Paper: http://arxiv.org/abs/2203.11082\n- 代码/Code: None\n\n**Transforming Model Prediction for Tracking**\n\n- 论文/Paper: http://arxiv.org/abs/2203.11192\n- 代码/Code: None\n\n**TCTrack: Temporal Contexts for Aerial Tracking**\n\n- 论文/Paper：https://arxiv.org/abs/2203.01885\n- 代码/Code：https://github.com/vision4robotics/TCTrack\n\n**Unified Transformer Tracker for Object Tracking**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15175\n- 代码/Code: None\n\n**Learning of Global Objective for Network Flow in Multi-Object Tracking**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16210\n- 代码/Code: None\n\n**Global Tracking via Ensemble of Local Trackers**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16092\n- 代码/Code: https://github.com/zikunzhou/gtelt\n\n**MeMOT: Multi-Object Tracking with Memory**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16761\n- 代码/Code: None\n\n**Unsupervised Learning of Accurate Siamese Tracking**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.01475\n- 代码/Code: https://github.com/florinshum/ulast\n\n**Visible-Thermal UAV Tracking: A Large-Scale Benchmark and New Baseline**\n\n- 论文/Paper: http://arxiv.org/abs/2204.04120\n- 代码/Code: None\n\n**BEHAVE: Dataset and Method for Tracking Human Object Interactions**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.06950\n- 代码/Code: None\n\n**SoccerNet-Tracking: Multiple Object Tracking Dataset and Benchmark in Soccer Videos**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.06918\n- 代码/Code: None\n\n**Detecting, Tracking and Counting Motorcycle Rider Traffic Violations on Unconstrained Roads**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.08364\n- 代码/Code: None\n\n### 3D Object Tracking\n\n**Iterative Corresponding Geometry: Fusing Region and Depth for Highly Efficient 3D Tracking of Textureless Objects**\n\n- 视频/Demo：[Youtube](https://www.youtube.com/watch?v=qMr1RHCsnDk) \n- 论文/Paper：https://arxiv.org/abs/2203.05334\n- 代码/Code：https://github.com/DLR-RM/3DObjectTracking\n\n**Multi-Camera Multiple 3D 
Object Tracking on the Move for Autonomous Vehicles**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.09151\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"TrajectoryPrediction\"></a> \n\n## 轨迹预测/Trajectory Prediction\n\n**How many Observations are Enough? Knowledge Distillation for Trajectory Forecasting**\n\n- 论文/Paper：https://arxiv.org/abs/2203.04781\n- 代码/Code：\n\n**Non-Probability Sampling Network for Stochastic Human Trajectory Prediction**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.13471\n- 代码/Code: None\n\n**Remember Intentions: Retrospective-Memory-based Trajectory Prediction**\n\n- 论文/Paper: http://arxiv.org/abs/2203.11474\n- 代码/Code: None\n\n**Stochastic Trajectory Prediction via Motion Indeterminacy Diffusion**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.13777\n- 代码/Code: None\n\n**Goal-driven Self-Attentive Recurrent Networks for Trajectory Prediction**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.11561\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"Segmentation\"></a> \n\n## 语义分割/Segmentation\n\n**Class-Balanced Pixel-Level Self-Labeling for Domain Adaptive Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.09744\n- 代码/Code: https://github.com/lslrh/cpsl\n\n**Deep Hierarchical Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14335\n- 代码/Code: None\n\n**E2EC: An End-to-End Contour-based Method for High-Quality High-Speed Instance Segmentation**\n\n- 论文/Paper：https://arxiv.org/abs/2203.04074\n- 代码/Code：https://github.com/zhang-tao-whu/e2ec\n\n**Hyperbolic Image Segmentation**\n\n- 论文/Paper：https://arxiv.org/abs/2203.05898\n- 代码/Code：\n\n**Mask Transfiner for High-Quality Instance Segmentation**\n\n- 论文/Paper： https://arxiv.org/abs/2111.13673\n- 代码/Code：https://github.com/SysCV/transfiner\n\n**Noisy Boundaries: Lemon or Lemonade for Semi-supervised Instance Segmentation?**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.13427\n- 代码/Code: None\n\n**Rethinking Semantic Segmentation: A Prototype View**\n\n- 论文/Paper: 
http://arxiv.org/pdf/2203.15102\n- 代码/Code: None\n\n**Regional Semantic Contrast and Aggregation for Weakly Supervised Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.09653\n- 代码/Code: https://github.com/maeve07/rca\n\n**Representation Compensation Networks for Continual Semantic Segmentation**\n\n- 论文/Paper：https://arxiv.org/abs/2203.05402\n- 代码/Code：https://github.com/zhangchbin/RCIL\n\n**SimT: Handling Open-set Noise for Domain Adaptive Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15202\n- 代码/Code: None\n\n**Semantic Segmentation by Early Region Proxy**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14043\n- 代码/Code: None\n\n**Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels**\n\n- 论文/Paper：https://arxiv.org/abs/2203.03884\n- 代码/Code：\n\n**SharpContour: A Contour-based Boundary Refinement Approach for Efficient and Accurate Instance Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.13312\n- 代码/Code: None\n\n**ST++: Make Self-training Work Better for Semi-supervised Semantic Segmentation**\n\n- 论文/paper：https://arxiv.org/abs/2106.05095\n- 代码/code：https://github.com/LiheYoung/ST-PlusPlus\n\n**Scribble-Supervised LiDAR Semantic Segmentation**\n\n- 论文/Paper：https://arxiv.org/abs/2203.08537\n- 代码/Code：https://github.com/ouenal/scribblekitti\n\n**Sparse Instance Activation for Real-Time Instance Segmentation**\n\n- 论文/Paper: http://arxiv.org/abs/2203.12827\n- 代码/Code: None\n\n**Tree Energy Loss: Towards Sparsely Annotated Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/abs/2203.10739\n- 代码/Code: None\n\n**Towards Fewer Annotations: Active Learning via Region Impurity and Prediction Uncertainty for Domain Adaptive Semantic Segmentation**\n\n- 论文/Paper: https://arxiv.org/pdf/2111.12940.pdf\n- 代码/Code:https://github.com/BIT-DA/RIPU\n\n**Weakly Supervised Semantic Segmentation using Out-of-Distribution Data**\n\n- 论文/Paper：https://arxiv.org/abs/2203.03860\n- 代码/Code：None\n\n**ReSTR: Convolution-free 
Referring Image Segmentation Using Transformers**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16768\n- 代码/Code: None\n\n**FIFO: Learning Fog-invariant Features for Foggy Scene Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.01587\n\n- 代码/Code: None\n\n**WildNet: Learning Domain Generalized Semantic Segmentation from the Wild**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.01446\n\n- 代码/Code: https://github.com/suhyeonlee/wildnet\n\n**Semantic-Aware Domain Generalized Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00822\n- 代码/Code: https://github.com/leolyj/san-saw\n\n**FocalClick: Towards Practical Interactive Image Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02574\n- 代码/Code: https://github.com/XavierCHEN34/ClickSEG\n\n**Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02547\n- 代码/Code: None\n\n**Pin the Memory: Learning to Generalize Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.03609\n- 代码/Code: None\n\n**Coarse-to-Fine Feature Mining for Video Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.03330\n- 代码/Code: https://github.com/guoleisun/vss-cffm\n\n**L2G: A Simple Local-to-Global Knowledge Transfer Framework for Weakly Supervised Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.03206\n- 代码/Code: https://github.com/PengtaoJiang/L2G\n\n**Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.04656\n- 代码/Code: https://github.com/lxtGH/Video-K-Net\n\n**NightLab: A Dual-level Architecture with Hardness Detection for Segmentation at Night**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.05538\n- 代码/Code: None\n\n**TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.05525\n- 代码/Code: https://github.com/hustvl/TopFormer\n\n**Panoptic, Instance and Semantic Relations: A Relational Context Encoder to Enhance 
Panoptic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.05370\n- 代码/Code: None\n\n**Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From Learned Pairwise Affinity**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.06107\n- 代码/Code: None\n\n**Joint Forecasting of Panoptic Segmentations with Difference Attention (Oral)**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.07157\n- 代码/Code: None\n\n**Cross-Image Relational Knowledge Distillation for Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.06986\n- 代码/Code: https://github.com/winycg/cirkd\n\n**Learning Multi-View Aggregation In the Wild for Large-Scale 3D Semantic Segmentation (Oral)**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.07548\n- 代码/Code: https://github.com/drprojects/DeepViewAgg\n\n**Temporally Efficient Vision Transformer for Video Instance Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.08412\n- 代码/Code: https://github.com/hustvl/TeViT\n\n**Augmentation Invariance and Adaptive Sampling in Semantic Segmentation of Agricultural Aerial Images**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.07969\n- 代码/Code: None\n\n**Dual-Domain Image Synthesis using Segmentation-Guided GAN**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.09015\n- 代码/Code: https://github.com/denabazazian/Dual-Domain-Synthesis\n\n**Proposal-free Lidar Panoptic Segmentation with Pillar-level Affinity**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.08744\n- 代码/Code: None\n\n**Unsupervised Domain Adaptation for Cardiac Segmentation: Towards Structure Mutual Information Maximization**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.09334\n- 代码/Code: https://github.com/LOUEY233/Toward-Mutual-Information\n\n**Dynamic Prototype Convolution Network for Few-Shot Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.10638\n- 代码/Code: None\n\n**Interactive Segmentation and Visualization for Tiny Objects in Multi-megapixel Images**\n\n- 论文/Paper:
http://arxiv.org/pdf/2204.10356\n- 代码/Code: https://github.com/cy-xu/cosmic-conn\n\n**Multi-Head Distillation for Continual Unsupervised Domain Adaptation in Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.11667\n- 代码/Code: None\n\n**Unsupervised Hierarchical Semantic Segmentation with Multiview Cosegmentation and Clustering Transformers**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.11432\n- 代码/Code: https://github.com/twke18/HSG\n\n**Transfer Learning from Synthetic In-vitro Soybean Pods Dataset for In-situ Segmentation of On-branch Soybean Pod**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.10902\n- 代码/Code: None\n\n**DArch: Dental Arch Prior-assisted 3D Tooth Instance Segmentation**\n\n- 论文/Paper: http://arxiv.org/abs/2204.11911\n- 代码/Code: None\n\n**Self-Supervised Learning of Object Parts for Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.13101\n- 代码/Code: None\n\n**MM-TTA: Multi-Modal Test-Time Adaptation for 3D Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.12667\n- 代码/Code: None\n\n**Cross-view Transformers for real-time Map-view Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.02833\n- 代码/Code: https://github.com/bradyz/cross_view_transformers\n\n[返回目录/back](#Contents)\n\n\n\n<a name=\"WSSS\"></a>\n\n## 弱监督语义分割/Weakly Supervised Semantic Segmentation\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"MedicalImageSegmentation\"></a>\n\n# 医学图像分割/Medical Image Segmentation\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"VideoObjectSegmentation\"></a>\n\n# 视频目标分割/Video Object Segmentation\n\n**Language as Queries for Referring Video Object Segmentation**\n\n- 论文/paper：https://arxiv.org/abs/2201.00487\n\n- 代码/code：https://github.com/wjn922/ReferFormer\n\n[返回目录/back](#Contents)\n\n<a name=\"InteractiveVideoObjectSegmentation\"></a>\n\n# 交互式视频目标分割/Interactive Video Object Segmentation\n\n**MSTR: Multi-Scale Transformer for End-to-End Human-Object Interaction Detection**\n\n- 论文/Paper: 
http://arxiv.org/abs/2203.14709\n- 代码/Code: None\n\n**OakInk: A Large-scale Knowledge Repository for Understanding Hand-Object Interaction**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15709\n- 代码/Code: None\n\n**What to look at and where: Semantic and Spatial Refined Transformer for detecting human-object interactions**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00746\n\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"VisualTransformer\"></a>\n\n# Visual Transformer\n\n**Affine Medical Image Registration with Coarse-to-Fine Vision Transformer**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15216\n- 代码/Code: https://github.com/cwmok/C2FViT\n\n**Automated Progressive Learning for Efficient Training of Vision Transformers**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14509\n- 代码/Code: None\n\n**Attribute Surrogates Learning and Spectral Tokens Pooling in Transformers for Few-shot Learning**\n\n- 论文/Paper：https://arxiv.org/abs/2203.09064\n- 代码/Code：https://github.com/StomachCold/HCTransformers\n\n**Cascade Transformers for End-to-End Person Search**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.09642\n- 代码/Code: https://github.com/kitware/coat\n\n**EDTER: Edge Detection with Transformer**\n\n- 论文/Paper：https://arxiv.org/abs/2203.08566\n- 代码/Code：\n\n**Few-Shot Object Detection with Fully Cross-Transformer**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15021\n- 代码/Code: None\n\n**Global Tracking Transformers**\n\n- 论文/Paper: http://arxiv.org/abs/2203.13250\n- 代码/Code: None\n\n**GradViT: Gradient Inversion of Vision Transformers**\n\n- 论文/Paper: http://arxiv.org/abs/2203.11894\n- 代码/Code: https://gradvit.github.io/\n\n**Hyperbolic Vision Transformers: Combining Improvements in Metric Learning**\n\n- 论文/Paper: http://arxiv.org/abs/2203.10833\n- 代码/Code: None\n\n**Meta-attention for ViT-backed Continual Learning**\n\n- 论文/Paper: http://arxiv.org/abs/2203.11684\n- 代码/Code: None\n\n**MHFormer: Multi-Hypothesis Transformer for 3D Human Pose Estimation**\n\n- 论文/Paper: 
https://arxiv.org/pdf/2111.12707.pdf\n\n- 代码/Code: https://github.com/Vegetebird/MHFormer\n\n**Self-Supervised Transformers for Unsupervised Object Discovery using Normalized Cut**\n\n- 论文/Paper：https://arxiv.org/abs/2202.11539 | [主页](https://www.m-psi.fr/Papers/TokenCut2022/)\n- 代码/Code：https://github.com/YangtaoWANG95/TokenCut\n\n**Training-free Transformer Architecture Search**\n\n- 论文/Paper: http://arxiv.org/abs/2203.12217\n- 代码/Code: None\n\n**Towards Practical Certifiable Patch Defense with Vision Transformer**\n\n- 论文/Paper：https://arxiv.org/abs/2203.08519\n\n- 代码/Code：\n\n**Towards Robust Vision Transformer**\n\n- 论文/Paper: https://arxiv.org/abs/2105.07926\n- 代码/Code: https://github.com/vtddggg/Robust-Vision-Transformer\n\n**Collaborative Transformers for Grounded Situation Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16518\n- 代码/Code: https://github.com/jhcho99/coformer\n\n**TubeDETR: Spatio-Temporal Video Grounding with Transformers**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16434\n- 代码/Code: https://github.com/antoyang/TubeDETR\n\n**InstaFormer: Instance-Aware Image-to-Image Translation with Transformer**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16248\n- 代码/Code: None\n\n**Spatial-Temporal Parallel Transformer for Arm-Hand Dynamic Estimation**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16202\n- 代码/Code: None\n\n**Omni-DETR: Omni-Supervised Object Detection with Transformers**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16089\n- 代码/Code: None\n\n**TransEditor: Transformer-Based Dual-Space GAN for Highly Controllable Facial Editing**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.17266\n- 代码/Code: https://github.com/BillyXYB/TransEditor\n\n**VL-InterpreT: An Interactive Visualization Tool for Interpreting Vision-Language Transformers**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.17247\n- 代码/Code: None\n\n**CRAFT: Cross-Attentional Flow Transformer for Robust Optical Flow**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16896\n- 代码/Code: None\n\n**Deformable Video 
Transformer**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16795\n- 代码/Code: None\n\n**ReSTR: Convolution-free Referring Image Segmentation Using Transformers**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16768\n- 代码/Code: None\n\n**TransRAC: Encoding Multi-scale Temporal Correlation with Transformers for Repetitive Action Counting**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.01018\n\n- 代码/Code: https://github.com/sviprepetitioncounting/transrac\n\n**Consistency driven Sequential Transformers Attention Model for Partially Observable Scenes**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00656\n- 代码/Code: None\n\n**Multi-View Transformer for 3D Visual Grounding**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02174\n- 代码/Code: None\n\n**Dual-AI: Dual-path Action Interaction Learning for Group Activity Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02148\n- 代码/Code: None\n\n**Detector-Free Weakly Supervised Group Activity Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02139\n- 代码/Code: None\n\n**Text Spotting Transformers**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.01918\n- 代码/Code: None\n\n**PSTR: End-to-End One-Step Person Search With Transformers**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.03340\n- 代码/Code: https://github.com/jialecao001/pstr\n\n**Consistency Learning via Decoding Path Augmentation for Transformers in Human Object Interaction Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.04836\n- 代码/Code: https://github.com/mlvlab/CPChoi\n\n**Multimodal Transformer for Nursing Activity Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.04564\n- 代码/Code: https://github.com/Momilijaz96/MMT_for_NCRC\n\n**Learning Trajectory-Aware Transformer for Video Super-Resolution**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.04216\n- 代码/Code: https://github.com/researchmm/TTVSR\n\n**TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.05525\n- 代码/Code: https://github.com/hustvl/TopFormer\n\n**Are Multimodal
Transformers Robust to Missing Modality?**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.05454\n- 代码/Code: None\n\n**MiniViT: Compressing Vision Transformers with Weight Multiplexing**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.07154\n- 代码/Code: https://github.com/microsoft/cream\n\n**ViTOL: Vision Transformer for Weakly Supervised Object Localization**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.06772\n- 代码/Code: https://github.com/Saurav-31/ViTOL\n\n**Temporally Efficient Vision Transformer for Video Instance Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.08412\n- 代码/Code: https://github.com/hustvl/TeViT\n\n**Safe Self-Refinement for Transformer-based Domain Adaptation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.07683\n- 代码/Code: None\n\n**Multi-Frame Self-Supervised Depth with Transformers**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.07616\n- 代码/Code: None\n\n**Self-Calibrated Efficient Transformer for Lightweight Super-Resolution**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.08913\n- 代码/Code: https://github.com/AlexZou14/SCET\n\n**Multimodal Token Fusion for Vision Transformers**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.08721\n- 代码/Code: None\n\n**Not All Tokens Are Equal: Human-centric Visual Analysis via Token Clustering Transformer (Oral)**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.08680\n- 代码/Code: https://github.com/zengwang430521/TCFormer\n\n**NFormer: Robust Person Re-identification with Neighbor Transformer**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.09331\n- 代码/Code: https://github.com/haochenheheda/NFormer\n\n**Human-Object Interaction Detection via Disentangled Transformer**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.09290\n- 代码/Code: None\n\n**A New Dataset and Transformer for Stereoscopic Video Super-Resolution**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.10039\n- 代码/Code: https://github.com/H-deep/Trans-SVSR/\n\n**Transformer Decoders with MultiModal Regularization for Cross-Modal Food Retrieval**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.09730\n- 代码/Code:
https://github.com/mshukor/TFood\n\n**Unsupervised Hierarchical Semantic Segmentation with Multiview Cosegmentation and Clustering Transformers**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.11432\n- 代码/Code: https://github.com/twke18/HSG\n\n**VISTA: Vision Transformer enhanced by U-Net and Image Colorfulness Frame Filtration for Automatic Retail Checkout**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.11024\n- 代码/Code: None\n\n**DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.12997\n- 代码/Code: None\n\n**UTC: A Unified Transformer with Inter-Task Contrastive Learning for Visual Dialog**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.00423\n- 代码/Code: None\n\n**TransRank: Self-supervised Video Representation Learning via Ranking-based Transformation Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.02028\n- 代码/Code: https://github.com/kennymckormick/TransRank\n\n**Cross-view Transformers for real-time Map-view Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.02833\n- 代码/Code: https://github.com/bradyz/cross_view_transformers\n\n[返回目录/back](#Contents)\n\n<a name=\"DepthEstimation\"></a>\n\n## 深度估计/Depth Estimation\n\n**OACC-Net: Occlusion-Aware Cost Constructor for Light Field Depth Estimation**\n\n- 论文/Paper: https://arxiv.org/pdf/2203.01576.pdf\n- 代码/Code: https://github.com/YingqianWang/OACC-Net\n\n**P3Depth: Monocular Depth Estimation with a Piecewise Planarity Prior**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02091\n- 代码/Code: None\n\n**HiMODE: A Hybrid Monocular Omnidirectional Depth Estimation Model**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.05007\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"FaceRecognition\"></a>\n\n# 人脸识别/Face Recognition\n\n**Adaface: Quality Adaptive Margin for Face Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00964\n\n- 代码/Code: https://github.com/mk-minchul/adaface\n\n**WebFace260M: A Benchmark for Million-Scale Deep Face Recognition**\n\n- 
论文/Paper: http://arxiv.org/pdf/2204.10149\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"FaceDetection\"></a>\n\n# 人脸检测/Face Detection\n\n**Privacy-preserving Online AutoML for Domain-Specific Face Detection**\n\n- 论文/Paper：https://arxiv.org/abs/2203.08399\n- 代码/Code：None\n\n**Robust Neonatal Face Detection in Real-world Clinical Settings**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00655\n\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"FaceAnti-Spoofing\"></a>\n\n# 人脸活体检测/Face Anti-Spoofing\n\n**Domain Generalization via Shuffled Style Assembly for Face Anti-Spoofing**\n\n- 论文/Paper: https://arxiv.org/abs/2203.05340\n- 代码/Code: \n\n**PatchNet: A Simple Face Anti-Spoofing Framework via Fine-Grained Patch Recognition**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14325\n- 代码/Code: None\n\n**Self-supervised Learning of Adversarial Example: Towards Good Generalizations for Deepfake Detection**\n\n- 论文/Paper: http://arxiv.org/abs/2203.12208\n- 代码/Code: https://github.com/liangchen527/sladd\n\n[返回目录/back](#Contents)\n\n<a name=\"AgeEstimation\"></a>\n\n# 人脸年龄估计/Age Estimation\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"FacialExpressionRecognition\"></a>\n\n# 人脸表情识别/Facial Expression Recognition\n\n**MDAN: Multi-level Dependent Attention Network for Visual Emotion Analysis**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.13443\n- 代码/Code: None\n\n**Towards Semi-Supervised Deep Facial Expression Recognition with An Adaptive Confidence Margin**\n\n- 论文/Paper: http://arxiv.org/abs/2203.12341\n- 代码/Code: https://github.com/hangyu94/ada-cm\n\n[返回目录/back](#Contents)\n\n<a name=\"FacialAttributeRecognition\"></a>\n\n# 人脸属性识别/Facial Attribute Recognition\n\n**Fair Contrastive Learning for Facial Attribute Classification**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16209\n- 代码/Code: https://github.com/sungho-coolg/fscl\n\n<a name=\"FacialEditing\"></a>\n\n## 人脸编辑/Facial Editing\n\n**TransEditor: Transformer-Based Dual-Space GAN for Highly Controllable Facial Editing**\n\n- 
论文/Paper: http://arxiv.org/pdf/2203.17266\n- 代码/Code: https://github.com/BillyXYB/TransEditor\n\n**Face Relighting with Geometrically Consistent Shadows**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16681\n- 代码/Code: None\n\n**Escaping Data Scarcity for High-Resolution Heterogeneous Face Hallucination**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16669\n- 代码/Code: None\n\n**EMOCA: Emotion Driven Monocular Face Capture and Animation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.11312\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"FaceSwap\"></a>\n\n## 换脸/Face Swap\n\n**High-resolution Face Swapping via Latent Semantics Disentanglement**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15958\n- 代码/Code: None\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"HumanPoseEstimation\"></a>\n\n# 人体姿态估计/Human Pose Estimation\n\n**Capturing Humans in Motion: Temporal-Attentive 3D Human Pose and Shape Estimation from Monocular Video**\n\n- 论文/Paper：https://arxiv.org/abs/2203.08534\n- 代码/Code：https://mps-net.github.io/MPS-Net/\n\n**DiffPoseNet: Direct Differentiable Camera Pose Estimation**\n\n- 论文/Paper: http://arxiv.org/abs/2203.11174\n- 代码/Code: None\n\n**EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation**\n\n- 论文/Paper: http://arxiv.org/abs/2203.13254\n- 代码/Code: None\n\n**GPV-Pose: Category-level Object Pose Estimation via Geometry-guided Point-wise Voting**\n\n- 论文/Paper：https://arxiv.org/abs/2203.07918\n- 代码/Code：\n\n**MixSTE: Seq2seq Mixed Spatio-Temporal Encoder for 3D Human Pose Estimation in Video**\n\n- 论文/Paper：https://arxiv.org/abs/2203.00859\n- 代码/Code：https://github.com/JinluZhang1126/MixSTE\n\n**MHFormer: Multi-Hypothesis Transformer for 3D Human Pose Estimation**\n\n- 论文/Paper: https://arxiv.org/pdf/2111.12707.pdf\n- 代码/Code: https://github.com/Vegetebird/MHFormer\n\n**OSOP: A Multi-Stage One Shot Object Pose Estimation Framework**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15533\n- 代码/Code: None\n\n**Temporal Feature 
Alignment and Mutual Information Maximization for Video-Based Human Pose Estimation**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15227\n- 代码/Code: None\n\n**PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15625\n- 代码/Code: None\n\n**Ray3D: ray-based 3D human pose estimation for monocular absolute 3D localization**\n\n- 论文/Paper: http://arxiv.org/abs/2203.11471\n- 代码/Code: https://github.com/YxZhxn/Ray3D\n\n**Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15293\n- 代码/Code: None\n\n**Templates for 3D Object Pose Estimation Revisited: Generalization to New Objects and Robustness to Occlusions**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.17234\n- 代码/Code: None\n\n**Focal Length and Object Pose Estimation via Render and Compare**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.05145\n- 代码/Code: http://github.com/ponimatkin/focalpose\n\n**DGECN: A Depth-Guided Edge Convolutional Network for End-to-End 6D Pose Estimation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.09983\n- 代码/Code: None\n\n**Coupled Iterative Refinement for 6D Multi-Object Pose Estimation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.12516\n- 代码/Code: https://github.com/princeton-vl/Coupled-Iterative-Refinement\n\n[返回目录/back](#Contents)\n\n<a name=\"6DPoseEstimation\"></a>\n\n# 6D位姿估计 /6D Pose Estimation\n\n**FS6D: Few-Shot 6D Pose Estimation of Novel Objects**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14628\n- 代码/Code: None\n\n**Uni6D: A Unified CNN Framework without Projection Breakdown for 6D Pose Estimation**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14531\n- 代码/Code: None\n\n**ZebraPose: Coarse to Fine Surface Encoding for 6DoF Object Pose Estimation**\n\n- 论文/Paper：https://arxiv.org/abs/2203.09418\n- 代码/Code：\n\n**RNNPose: Recurrent 6-DoF Object Pose Refinement with Robust Correspondence Field Estimation and Pose Optimization**\n\n- 论文/Paper:
http://arxiv.org/abs/2203.12870\n- 代码/Code: None\n\n**ES6D: A Computation Efficient and Symmetry-Aware 6D Pose Regression Framework**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.01080\n\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"HandPoseEstimation\"></a>\n\n## 手势姿态估计（重建）/Hand Pose Estimation (Hand Mesh Recovery)\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"VideoActionDetection\"></a>\n\n## 视频动作检测/Video Action Detection\n\n**DirecFormer: A Directed Attention in Transformer Approach to Robust Action Recognition**\n\n- 论文/Paper: http://arxiv.org/abs/2203.10233\n- 代码/Code: None\n\n**End-to-End Semi-Supervised Learning for Video Action Detection**\n\n- 论文/Paper：https://arxiv.org/abs/2203.04251\n- 代码/Code：\n\n**How Do You Do It? Fine-Grained Action Understanding with Pseudo-Adverbs**\n\n- 论文/Paper: http://arxiv.org/abs/2203.12344\n- 代码/Code: https://github.com/hazeld/pseudoadverbs\n\n**Look for the Change: Learning Object States and State-Modifying Actions from Untrimmed Web Videos**\n\n- 论文/Paper: http://arxiv.org/abs/2203.11637\n- 代码/Code: https://github.com/soCzech/LookForTheChange\n\n**RCL: Recurrent Continuous Localization for Temporal Action Detection**\n\n- 论文/Paper：https://arxiv.org/abs/2203.07112\n- 代码/Code：\n\n**SPAct: Self-supervised Privacy Preservation for Action Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15205\n- 代码/Code: None\n\n**An Empirical Study of End-to-End Temporal Action Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02932\n- 代码/Code: https://github.com/xlliu7/E2E-TAD\n\n**SOS!
Self-supervised Learning Over Sets Of Handled Objects In Egocentric Action Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.04796\n- 代码/Code: None\n\n**Video Action Detection: Analysing Limitations and Challenges**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.07892\n- 代码/Code: None\n\n**Hybrid Relation Guided Set Matching for Few-shot Action Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.13423\n- 代码/Code: None\n\n**Cross-modal Representation Learning for Zero-shot Action Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.01657\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"SignLanguageTranslation\"></a>\n\n## 手语翻译/Sign Language Translation\n\n**A Simple Multi-Modality Transfer Learning Baseline for Sign Language Translation**\n\n- 论文/Paper：https://arxiv.org/abs/2203.04287\n- 代码/Code：\n\n[返回目录/back](#Contents)\n\n<a name=\"3D人体重建\"></a>\n\n## 3D人体重建/Person Reconstruction\n\n**ImFace: A Nonlinear 3D Morphable Face Model with Implicit Neural Representations**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14510\n- 代码/Code: None\n\n**AutoSDF: Shape Priors for 3D Completion, Reconstruction and Generation**\n\n- 论文/Paper：https://arxiv.org/abs/2203.09516\n- 代码/Code：https://yccyenchicheng.github.io/AutoSDF/\n\n**Learning Motion-Dependent Appearance for High-Fidelity Rendering of Dynamic Humans from a Single Camera**\n\n- 论文/Paper: http://arxiv.org/abs/2203.12780\n- 代码/Code: None\n\n**MHFormer: Multi-Hypothesis Transformer for 3D Human Pose Estimation**\n\n- 论文/Paper: https://arxiv.org/pdf/2111.12707.pdf\n\n- 代码/Code: https://github.com/Vegetebird/MHFormer\n\n**OcclusionFusion: Occlusion-aware Motion Estimation for Real-time Dynamic 3D Reconstruction**\n\n- 论文/Paper：https://arxiv.org/abs/2203.07977\n- 代码/Code：https://wenbin-lin.github.io/OcclusionFusion\n\n**Structured Local Radiance Fields for Human Avatar Modeling**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14478\n- 代码/Code: None\n\n**JIFF: Jointly-aligned Implicit Face Function for High Quality Single 
View Clothed Human Reconstruction**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.10549\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"PersonRe-identification\"></a>\n\n## 行人重识别/Person Re-identification\n\n**Camera-Conditioned Stable Feature Generation for Isolated Camera Supervised Person Re-IDentification**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15210\n- 代码/Code: None\n\n**Part-based Pseudo Label Refinement for Unsupervised Person Re-identification**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14675\n- 代码/Code: None\n\n**Cloning Outfits from Real-World Images to 3D Characters for Generalizable Person Re-Identification**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02611\n- 代码/Code: https://github.com/Yanan-Wang-cs/ClonedPerson\n\n**Implicit Sample Extension for Unsupervised Person Re-Identification**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.06892\n- 代码/Code: https://github.com/PaddlePaddle/PaddleClas\n\n**Clothes-Changing Person Re-identification with RGB Modality Only**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.06890\n- 代码/Code: https://github.com/guxinqian/Simple-CCReID\n\n**NFormer: Robust Person Re-identification with Neighbor Transformer**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.09331\n- 代码/Code: https://github.com/haochenheheda/NFormer\n\n[返回目录/back](#Contents)\n\n<a name=\"PersonSearch\"></a>\n\n# 行人搜索/Person Search\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"CrowdCounting\"></a>\n\n## 人群计数 / Crowd Counting\n\n**Cross-View Cross-Scene Multi-View Crowd Counting**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.01551\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"GAN\"></a>\n\n## GAN\n\n**A Style-aware Discriminator for Controllable Image Translation**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15375\n- 代码/Code: None\n\n**Attribute Group Editing for Reliable Few-shot Image Generation**\n\n- 论文/Paper：https://arxiv.org/abs/2203.08422\n- 代码/Code：\n\n**Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic Memory**\n\n- 论文/Paper:
http://arxiv.org/abs/2203.13055\n- 代码/Code: None\n\n**Compound Domain Generalization via Meta-Knowledge Encoding**\n\n- 论文/Paper: http://arxiv.org/abs/2203.13006\n- 代码/Code: None\n\n**Diverse Plausible 360-Degree Image Outpainting for Efficient 3DCG Background Creation**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14668\n- 代码/Code: None\n\n**Exact Feature Distribution Matching for Arbitrary Style Transfer and Domain Generalization**\n\n- 论文/Paper：https://arxiv.org/abs/2203.07740\n- 代码/Code：https://github.com/YBZh/EFDM\n\n**FlexIT: Towards Flexible Semantic Image Translation** \n\n- 论文/paper：https://arxiv.org/abs/2203.04705 \n- 代码/code：\n\n**GCFSR: a Generative and Controllable Face Super Resolution Method Without Facial and GAN Priors**\n\n- 论文/Paper：https://arxiv.org/abs/2203.07319\n\n- 代码/Code：\n\n**GAN-Supervised Dense Visual Alignment** \n\n- 论文/paper：https://arxiv.org/abs/2112.05143\n- 代码/code：https://github.com/wpeebles/gangealing\n\n**GIRAFFE HD: A High-Resolution 3D-aware Generative Model**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14954\n- 代码/Code: None\n\n**HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing**\n\n- 论文/paper：https://arxiv.org/abs/2111.15666 | [主页](https://yuval-alaluf.github.io/hyperstyle/)\n- 代码/code：https://github.com/yuval-alaluf/hyperstyle\n\n**Look Outside the Room: Synthesizing A Consistent Long-Term 3D Scene Video from A Single Image**\n\n- 论文/Paper：https://arxiv.org/abs/2203.09457\n- 代码/Code：https://xrenaa.github.io/look-outside-room/\n\n**Modulated Contrast for Versatile Image Synthesis**\n\n- 论文/Paper：https://arxiv.org/abs/2203.09333\n- 代码/Code：https://github.com/fnzhan/MoNCE\n\n**Maximum Spatial Perturbation Consistency for Unpaired Image-to-Image Translation**\n\n- 论文/Paper: http://arxiv.org/abs/2203.12707\n- 代码/Code: None\n\n**Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer**\n\n- 论文/Paper: http://arxiv.org/abs/2203.13248\n- 代码/Code: None\n\n**QS-Attn: Query-Selected Attention for 
Contrastive Learning in I2I Translation**\n\n- 论文/Paper：https://arxiv.org/abs/2203.08483\n- 代码/Code：\n\n**RGB-Depth Fusion GAN for Indoor Depth Completion**\n\n- 论文/Paper: http://arxiv.org/abs/2203.10856\n- 代码/Code: None\n\n**Stacked Hybrid-Attention and Group Collaborative Learning for Unbiased Scene Graph Generation**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.09811\n- 代码/Code: https://github.com/dongxingning/sha-gcl-for-sgg\n\n**Style Transformer for Image Inversion and Editing**\n\n- 论文/Paper：https://arxiv.org/abs/2203.07932\n- 代码/Code：\n\n**Unsupervised Domain Adaptation for Nighttime Aerial Tracking**\n\n- 论文/Paper: http://arxiv.org/abs/2203.10541\n- 代码/Code: None\n\n**Wavelet Knowledge Distillation: Towards Efficient Image-to-Image Translation**\n\n- 论文/Paper：https://arxiv.org/abs/2203.06321\n- 代码/Code：\n\n**Industrial Style Transfer with Large-scale Geometric Warping and Content Preservation**\n\n- 论文/Paper: http://arxiv.org/abs/2203.12835\n- 代码/Code: None\n\n**TransEditor: Transformer-Based Dual-Space GAN for Highly Controllable Facial Editing**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.17266\n- 代码/Code: https://github.com/BillyXYB/TransEditor\n\n**Marginal Contrastive Correspondence for Guided Image Generation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00442\n- 代码/Code: None\n\n**Style-Based Global Appearance Flow for Virtual Try-On**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.01046\n\n- 代码/Code: https://github.com/senhe/flow-style-vton\n\n**Arbitrary-Scale Image Synthesis**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02273\n- 代码/Code: https://github.com/vglsd/ScaleParty\n\n**Unsupervised Image-to-Image Translation with Generative Prior**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.03641\n- 代码/Code: https://github.com/williamyang1991/gp-unit\n\n**Commonality in Natural Images
Rescues GANs: Pretraining GANs with Generic and Privacy-free Synthetic Data**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.04950\n- 代码/Code: None\n\n**medXGAN: Visual Explanations for Medical Classifiers through a Generative Latent Space**\n\n- 论文/Paper: http://arxiv.org/abs/2204.05376\n- 代码/Code: https://github.com/avdravid/medXGAN_explanations\n\n**Multi-View Consistent Generative Adversarial Networks for 3D-aware Image Synthesis**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.06307\n- 代码/Code: None\n\n**Dual-Domain Image Synthesis using Segmentation-Guided GAN**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.09015\n- 代码/Code: https://github.com/denabazazian/Dual-Domain-Synthesis\n\n**ClothFormer: Taming Video Virtual Try-on in All Module**\n\n- 论文/Paper: http://arxiv.org/abs/2204.12151\n- 代码/Code: None\n\n**OSSGAN: Open-Set Semi-Supervised Image Generation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.14249\n- 代码/Code: https://github.com/raven38/ossgan\n\n**Fix the Noise: Disentangling Source Feature for Transfer Learning of StyleGAN**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.14079\n- 代码/Code: None\n\n**GenDR: A Generalized Differentiable Renderer**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.13845\n- 代码/Code: https://github.com/Felix-Petersen/gendr\n\n**HL-Net: Heterophily Learning Network for Scene Graph Generation**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.01316\n- 代码/Code: https://github.com/siml3/HL-Net\n\n**RU-Net: Regularized Unrolling Network for Scene Graph Generation**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.01297\n- 代码/Code: https://github.com/siml3/RU-Net\n\n**Comparison of CoModGANs, LaMa and GLIDE for Art Inpainting- Completing M.C Escher's Print Gallery**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.01741\n- 代码/Code: None\n\n**Generate and Edit Your Own Character in a Canonical View**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.02974\n- 代码/Code: None\n\n**Scene Graph Expansion for Semantics-Guided Image Outpainting**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.02958\n- 
代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"CPM\"></a>\n\n## 彩妆迁移 / Color-Pattern Makeup Transfer\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"FontGeneration\"></a>\n\n## 字体生成 / Font Generation\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"OCR\"></a>\n\n## OCR\n\n**Fourier Document Restoration for Robust Document Dewarping and Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.09910\n- 代码/Code: None\n\n**SimAN: Exploring Self-Supervised Representation Learning of Scene Text via Similarity-Aware Normalization**\n\n- 论文/Paper: http://arxiv.org/abs/2203.10492\n- 代码/Code: None\n\n### **文字图像处理（超分辨率增强、文字分割、文档版面分析）**\n\n**A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution**\n\n- 论文/Paper：https://arxiv.org/abs/2203.09388\n- 代码/Code：https://github.com/mjq11302010044/TATT\n\n### 场景文本检测、识别/Scene Text Detection/Recognition\n\n**Kernel Proposal Network for Arbitrary Shape Text Detection**\n\n- 论文/Paper：https://arxiv.org/abs/2203.06410\n- 代码/Code：https://github.com/GXYM/KPN\n\n**SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition**\n\n- 论文/Paper: http://arxiv.org/abs/2203.10209\n- 代码/Code: None\n\n**Towards End-to-End Unified Scene Text Detection and Layout Analysis**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15143\n- 代码/Code: None\n\n**Pushing the Performance Limit of Scene Text Recognizer without Human Annotation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.07714\n- 代码/Code: None\n\n**Vision-Language Pre-Training for Boosting Scene Text Detectors**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.13867\n- 代码/Code: None\n\n### **端到端文字识别**\n\n**Open-set Text Recognition via Character-Context Decoupling**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.05535\n- 代码/Code: None\n\n### **手写文字分析与识别**\n\n\n\n### **其它（文档图像预训练模型，Text VQA、数据集，Retrieval , 应用）**\n\n\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"Retrieval\"></a>\n\n## 图像、视频检索 / Image Retrieval/Video retrieval\n\n**Correlation Verification for Image 
Retrieval**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.01458\n\n- 代码/Code: https://github.com/sungonce/cvnet\n\n**Sketching without Worrying: Noise-Tolerant Sketch-Based Image Retrieval**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14817\n- 代码/Code: None\n\n**Beyond Cross-view Image Retrieval: Highly Accurate Vehicle Localization Using Satellite Image**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.04752\n- 代码/Code: None\n\n**Probabilistic Compositional Embeddings for Multimodal Image Retrieval**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.05845\n- 代码/Code: https://github.com/andreineculai/MPC\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageAnimation\"></a>\n\n## Image Animation\n\n**Thin-Plate Spline Motion Model for Image Animation**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14367\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageMatting\"></a>\n\n## 抠图/Image Matting/Video Matting\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"SuperResolution\"></a>\n\n# 超分辨率/Super Resolution\n\n**Details or Artifacts: A Locally Discriminative Learning Approach to Realistic Image Super-Resolution**\n\n- 论文/Paper：https://arxiv.org/abs/2203.09195\n- 代码/Code：https://github.com/csjliang/LDL\n\n**Learning Graph Regularisation for Guided Super-Resolution**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14297\n- 代码/Code: None\n\n**Reflash Dropout in Image Super-Resolution**\n\n- 论文/Paper：https://arxiv.org/pdf/2112.12089.pdf\n- 代码/Code：https://github.com/Xiangtaokong/Reflash-Dropout-in-Image-Super-Resolution\n\n**Look Back and Forth: Video Super-Resolution with Explicit Temporal Difference Modeling**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.07114\n- 代码/Code: None\n\n**Fast and Memory-Efficient Network Towards Efficient Image Super-Resolution**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.08397\n- 代码/Code: https://github.com/NJU-Jet/FMEN\n\n**Self-Calibrated Efficient Transformer for Lightweight Super-Resolution**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.08913\n- 代码/Code: 
https://github.com/AlexZou14/SCET\n\n**Edge-enhanced Feature Distillation Network for Efficient Super-Resolution**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.08759\n- 代码/Code: https://github.com/icandle/EFDN\n\n**A New Dataset and Transformer for Stereoscopic Video Super-Resolution**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.10039\n- 代码/Code: https://github.com/H-deep/Trans-SVSR/\n\n**FS-NCSR: Increasing Diversity of the Super-Resolution Space via Frequency Separation and Noise-Conditioned Normalizing Flow**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.09679\n- 代码/Code: None\n\n**IMDeception: Grouped Information Distilling Super-Resolution Network**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.11463\n- 代码/Code: None\n\n**Self-Supervised Super-Resolution for Multi-Exposure Push-Frame Satellites**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.02031\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageRestoration\"></a>\n\n# 图像复原/Image Restoration\n\n**Exploring and Evaluating Image Restoration Potential in Dynamic Scenes**\n\n- 论文/Paper: http://arxiv.org/abs/2203.11754\n- 代码/Code: None\n\n**Interacting Attention Graph for Single Image Two-Hand Reconstruction**\n\n- 论文/Paper：https://arxiv.org/abs/2203.09364\n- 代码/Code：https://github.com/Dw1010/IntagHand\n\n**Deep Generalized Unfolding Networks for Image Restoration**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.13348\n- 代码/Code: https://github.com/MC-E/Deep-Generalized-Unfolding-Networks-for-Image-Restoration\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageInpainting\"></a>\n\n# 图像补全/Image Inpainting\n\n**Bridging Global Context Interactions for High-Fidelity Image Completion**\n\n- 论文/Paper：https://arxiv.org/abs/2104.00845\n- 代码/Code：https://github.com/lyndonzheng/TFill\n\n**MAT: Mask-Aware Transformer for Large Hole Image Inpainting**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15270\n- 代码/Code: None\n\n**MISF: Multi-level Interactive Siamese Filtering for High-Fidelity Image Inpainting**\n\n- 
论文/Paper：https://arxiv.org/abs/2203.06304\n- 代码/Code：https://github.com/tsingqguo/misf\n\n**Towards An End-to-End Framework for Flow-Guided Video Inpainting**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02663\n- 代码/Code: https://github.com/MCG-NKU/E2FGVI\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageDenoising\"></a>\n\n## 图像去噪/Image Denoising\n\n**AP-BSN: Self-Supervised Denoising for Real-World Images via Asymmetric PD and Blind-Spot Network**\n\n- 论文/Paper: http://arxiv.org/abs/2203.11799\n- 代码/Code: None\n\n**Blind2Unblind: Self-Supervised Image Denoising with Visible Blind Spots**\n\n- 论文/Paper：https://arxiv.org/abs/2203.06967\n- 代码/Code：https://github.com/demonsjin/Blind2Unblind\n\n**CVF-SID: Cyclic multi-Variate Function for Self-Supervised Image Denoising by Disentangling Noise from Image**\n\n- 论文/Paper: http://arxiv.org/abs/2203.13009\n- 代码/Code: None\n\n**Learning to Deblur using Light Field Generated and Real Defocus Images**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00367\n- 代码/Code: https://github.com/lingyanruan/DRBNet\n\n**Dancing under the stars: video denoising in starlight**\n\n- 论文/Paper: http://arxiv.org/abs/2204.04210\n- 代码/Code: None\n\n**Multiple Degradation and Reconstruction Network for Single Image Denoising via Knowledge Distillation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.13873\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageEditing\"></a>\n\n# 图像编辑/Image Editing\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"Imagestitching\"></a>\n\n# 图像拼接/Image stitching\n\n**Deep Rectangling for Image Stitching: A Learning Baseline**\n\n- 论文/Paper：https://arxiv.org/abs/2203.03831\n- 代码/Code：https://github.com/nie-lang/DeepRectangling\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageMatching\"></a>\n\n# 图像匹配/Image Matching\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageBlending\"></a>\n\n# 图像融合/Image Blending\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageDehazing\"></a>\n\n## 图像去雾/Image Dehazing\n\n\n\n[返回目录/back](#Contents)\n\n<a 
name=\"ImageCompression\"></a>\n\n## 图像压缩/Image Compression\n\n**ELIC: Efficient Learned Image Compression with Unevenly Grouped Space-Channel Contextual Adaptive Coding**\n\n- 论文/Paper: http://arxiv.org/abs/2203.10886\n- 代码/Code: None\n\n**Unified Multivariate Gaussian Mixture for Efficient Neural Image Compression**\n\n- 论文/Paper: http://arxiv.org/abs/2203.10897\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"ReflectionRemoval\"></a>\n\n## 反光去除/Reflection Removal\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"LaneDetection\"></a>\n\n## 车道线检测/Lane Detection\n\n**CLRNet: Cross Layer Refinement Network for Lane Detection**\n\n- 论文/Paper: http://arxiv.org/abs/2203.10350\n- 代码/Code: None\n\n**Eigenlanes: Data-Driven Lane Descriptors for Structurally Diverse Lanes**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15302\n- 代码/Code: None\n\n**Rethinking Efficient Lane Detection via Curve Modeling**\n\n- 论文/Paper：https://arxiv.org/abs/2203.02431\n- 代码/Code：https://github.com/voldemortX/pytorch-auto-drive\n\n**Towards Driving-Oriented Metric for Lane Detection Models**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16851\n- 代码/Code: None\n\n**A Keypoint-based Global Association Network for Lane Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.07335\n- 代码/Code: https://github.com/Wolfwjs/GANet\n\n**ONCE-3DLanes: Building Monocular 3D Lane Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.00301\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"AutonomousDriving\"></a>\n\n## 自动驾驶 / Autonomous Driving\n\n**Learning from All Vehicles**\n\n- 论文/Paper: http://arxiv.org/abs/2203.11934\n- 代码/Code: https://github.com/dotchen/LAV\n\n[返回目录/back](#Contents)\n\n<a name=\"FluidReconstruction\"></a>\n\n## 流体重建/Fluid Reconstruction\n\n[返回目录/back](#Contents)\n\n<a name=\"SceneReconstruction\"></a>\n\n## 场景重建 / Scene Reconstruction\n\n**3D Shape Reconstruction from 2D Images with Disentangled Attribute Flow**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15190\n- 代码/Code: 
None\n\n**NeRFusion: Fusing Radiance Fields for Large-Scale Scene Reconstruction**\n\n- 论文/Paper: http://arxiv.org/abs/2203.11283\n- 代码/Code: None\n\n**PlaneMVS: 3D Plane Reconstruction from Multi-View Stereo**\n\n- 论文/Paper: http://arxiv.org/abs/2203.12082\n- 代码/Code: None\n\n**Neural 3D Scene Reconstruction with the Manhattan-world Assumption**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.02836\n- 代码/Code: https://github.com/zju3dv/manhattan_sdf\n\n[返回目录/back](#Contents)\n\n<a name=\"FrameInterpolation\"></a>\n\n## 视频插帧/Frame Interpolation\n\n**Long-term Video Frame Interpolation via Feature Propagation**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15427\n- 代码/Code: None\n\n**TimeReplayer: Unlocking the Potential of Event Cameras for Video Interpolation**\n\n- 论文/Paper: http://arxiv.org/abs/2203.13859\n- 代码/Code: None\n\n**Unifying Motion Deblurring and Frame Interpolation with Events**\n\n- 论文/Paper: http://arxiv.org/abs/2203.12178\n- 代码/Code: None\n\n**Time Lens++: Event-based Frame Interpolation with Parametric Non-linear Flow and Multi-scale Fusion**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.17191\n- 代码/Code: None\n\n**Many-to-many Splatting for Efficient Video Frame Interpolation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.03513\n- 代码/Code: https://github.com/feinanshan/m2m_vfi\n\n[返回目录/back](#Contents)\n\n<a name=\"VideoSuper-Resolution\"></a>\n\n## 视频超分 / Video Super-Resolution\n\n**Reference-based Video Super-Resolution Using Multi-Camera Video Triplets**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14537\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"3DPointCloud\"></a>\n\n## 3D点云/3D point cloud\n\n**ART-Point: Improving Rotation Robustness of Point Cloud Classifiers via Adversarial Rotation**\n\n- 论文/Paper：https://arxiv.org/abs/2203.03888\n- 代码/Code：\n\n**AziNorm: Exploiting the Radial Symmetry of Point Cloud for Azimuth-Normalized 3D Perception**\n\n- 论文/Paper: http://arxiv.org/abs/2203.13090\n- 代码/Code: None\n\n**Contrastive Boundary Learning for Point 
Cloud Segmentation**\n\n- 论文/Paper：https://arxiv.org/abs/2203.05272\n- 代码/Code：https://github.com/LiyaoTang/contrastBoundary\n\n**Equivariant Point Cloud Analysis via Learning Orientations for Message Passing**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14486\n- 代码/Code: None\n\n**IDEA-Net: Dynamic 3D Point Cloud Interpolation via Deep Embedding Alignment**\n\n- 论文/Paper: http://arxiv.org/abs/2203.11590\n- 代码/Code: None\n\n**Learning a Structured Latent Space for Unsupervised Point Cloud Completion**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15580\n- 代码/Code: None\n\n**Not All Points Are Equal: Learning Highly Efficient Point-based Detectors for 3D LiDAR Point Clouds**\n\n- 论文/Paper: http://arxiv.org/abs/2203.11139\n- 代码/Code: None\n\n**No Pain, Big Gain: Classify Dynamic Point Cloud Sequences with Static Models by Fitting Feature-level Space-time Surfaces**\n\n- 论文/Paper: http://arxiv.org/abs/2203.11113\n- 代码/Code: None\n\n**REGTR: End-to-end Point Cloud Correspondences with Transformers**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14517\n- 代码/Code: None\n\n**SC^2-PCR: A Second Order Spatial Compatibility for Efficient and Robust Point Cloud Registration**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14453\n- 代码/Code: None\n\n**Stratified Transformer for 3D Point Cloud Segmentation**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14508\n- 代码/Code: None\n\n**Shape-invariant 3D Adversarial Point Clouds**\n\n- 论文/Paper：https://arxiv.org/abs/2203.04041\n- 代码/Code：https://github.com/shikiw/SI-Adv\n\n**WarpingGAN: Warping Multiple Uniform Priors for Adversarial 3D Point Cloud Generation**\n\n- 论文/Paper: http://arxiv.org/abs/2203.12917\n- 代码/Code: None\n\n**Deformation and Correspondence Aware Unsupervised Synthetic-to-Real Scene Flow Estimation for Point Clouds**\n\n- 论文/Paper: 
http://arxiv.org/pdf/2203.16895\n- 代码/Code: None\n\n**Learning Local Displacements for Point Cloud Completion**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16600\n- 代码/Code: None\n\n**3DeformRS: Certifying Spatial Deformations on Point Clouds**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.05687\n- 代码/Code: None\n\n**Reconstructing Surfaces for Sparse Point Clouds with On-Surface Priors**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.10603\n- 代码/Code: https://github.com/mabaorui/onsurfaceprior\n\n**Surface Reconstruction from Point Clouds by Learning Predictive Context Priors**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.11015\n- 代码/Code: None\n\n**Density-preserving Deep Point Cloud Compression**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.12684\n- 代码/Code: None\n\n**Why Discard if You can Recycle?: A Recycling Max Pooling Module for 3D Point Cloud Analysis**\n\n- 论文/Paper: https://openaccess.thecvf.com/content/CVPR2022/papers/Chen_Why_Discard_if_You_Can_Recycle_A_Recycling_Max_Pooling_CVPR_2022_paper.pdf\n- 代码/Code: https://github.com/jiajingchen113322/Recycle_Maxpooling_Module\n\n**Boosting 3D Object Detection by Simulating Multimodality on Point Clouds**\n\n- 论文/Paper: https://arxiv.org/abs/2206.14971\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"Label-Noise\"></a>\n\n# 标签噪声 Label-Noise\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"AdversarialExamples\"></a>\n\n# 对抗样本 / Adversarial Examples\n\n**LAS-AT: Adversarial Training with Learnable Attack Strategy**\n\n- 论文/Paper：https://arxiv.org/pdf/2203.06616.pdf\n\n- 代码/Code：https://github.com/jiaxiaojunQAQ/LAS-AT\n\n [返回目录/back](#Contents)\n\n<a name=\"Other\"></a>\n\n## 其他/Other\n\n**DINE: Domain Adaptation from Single and Multiple Black-box Predictors**\n\n- 论文/Paper：https://arxiv.org/abs/2104.01539\n\n- 代码/Code：https://github.com/tim-learn/DINE\n\n**It's About Time: Analog clock Reading in the Wild**\n\n- 论文/Paper：https://arxiv.org/abs/2111.09162\n- 代码/Code：https://github.com/charigyang/itsabouttime\n\n**Neural Face 
Identification in a 2D Wireframe Projection of a Manifold Object**\n\n- 论文/Paper：https://arxiv.org/abs/2203.04229 | [主页](https://manycore-research.github.io/faceformer/)\n- 代码/Code：https://github.com/manycore-research/faceformer\n\n**Probabilistic Warp Consistency for Weakly-Supervised Semantic Correspondences**\n\n- 论文/Paper：https://arxiv.org/abs/2203.04279\n- 代码/Code：https://github.com/PruneTruong/DenseMatching\n\n**TeachAugment: Data Augmentation Optimization Using Teacher Knowledge**\n\n- 论文/Paper：https://arxiv.org/abs/2202.12513\n\n- 代码/Code：https://github.com/DensoITLab/TeachAugment\n\n**UKPGAN: Unsupervised KeyPoint GANeration**\n\n- 论文/Paper：\n\n- 代码/Code：https://github.com/qq456cvb/UKPGAN\n\n**DeltaCNN: End-to-End CNN Inference of Sparse Frame Differences in Videos**\n\n- 论文/Paper：https://arxiv.org/abs/2203.03996\n- 代码/Code：\n\n**Generative Cooperative Learning for Unsupervised Video Anomaly Detection**\n\n- 论文/Paper：https://arxiv.org/abs/2203.03962\n- 代码/Code：\n\n**Shadows can be Dangerous: Stealthy and Effective Physical-world Adversarial Attack by Natural Phenomenon**\n\n- 论文/Paper：https://arxiv.org/abs/2203.03818\n- 代码/Code：\n\n**Unknown-Aware Object Detection: Learning What You Don't Know from Videos in the Wild**\n\n- 论文/Paper：https://arxiv.org/abs/2203.03800\n- 代码/Code：https://github.com/deeplearning-wisc/stud\n\n**On Generalizing Beyond Domains in Cross-Domain Continual Learning**\n\n- 论文/Paper：https://arxiv.org/abs/2203.03970\n- 代码/Code：\n\n**Generating 3D Bio-Printable Patches Using Wound Segmentation and Reconstruction to Treat Diabetic Foot Ulcers**\n\n- 论文/Paper：https://arxiv.org/abs/2203.03814\n- 代码/Code：\n\n**What Matters For Meta-Learning Vision Regression Tasks?**\n\n- 论文/Paper：https://arxiv.org/abs/2203.04905\n- 代码/Code：\n\n**ChiTransformer:Towards Reliable Stereo from Cues**\n\n- 论文/Paper：https://arxiv.org/abs/2203.04554\n\n- 代码/Code：\n\n**Dynamic Dual-Output Diffusion Models**\n\n- 论文/Paper：https://arxiv.org/abs/2203.04304\n\n- 
代码/Code：\n\n**Spatial Commonsense Graph for Object Localisation in Partial Scenes**\n\n- 论文/Paper：https://arxiv.org/abs/2203.05380\n\n- 代码/Code：https://fgiuliari.github.io/projects/SpatialCommonsenseGraph/\n\n**Practical Evaluation of Adversarial Robustness via Adaptive Auto Attack**\n\n- 论文/Paper：https://arxiv.org/abs/2203.05154\n\n- 代码/Code：https://github.com/liuye6666/adaptive_auto_attack\n\n**Frequency-driven Imperceptible Adversarial Attack on Semantic Similarity**\n\n- 论文/Paper：https://arxiv.org/abs/2203.05151\n\n- 代码/Code： \n\n**REX: Reasoning-aware and Grounded Explanation**\n\n- 论文/Paper：https://arxiv.org/abs/2203.06107\n- 代码/Code：\n\n**FLAG: Flow-based 3D Avatar Generation from Sparse Observations**\n\n- 论文/Paper：https://arxiv.org/abs/2203.05789\n- 代码/Code：\n\n**Learning Distinctive Margin toward Active Domain Adaptation**\n\n- 论文/Paper：https://arxiv.org/abs/2203.05738\n- 代码/Code：https://github.com/TencentYoutuResearch/ActiveLearning-SDM\n\n**Active Learning by Feature Mixing**\n\n- 论文/Paper：https://arxiv.org/abs/2203.07034\n- 代码/Code：\n\n**UniVIP: A Unified Framework for Self-Supervised Visual Pre-training**\n\n- 论文/Paper：https://arxiv.org/abs/2203.06965\n- 代码/Code：\n\n**Forward Compatible Few-Shot Class-Incremental Learning**\n\n- 论文/Paper：https://arxiv.org/abs/2203.06953\n- 代码/Code：https://github.com/zhoudw-zdw/CVPR22-Fact\n\n**XYLayoutLM: Towards Layout-Aware Multimodal Networks For Visually-Rich Document Understanding**\n\n- 论文/Paper：https://arxiv.org/abs/2203.06947\n- 代码/Code：\n\n**Accelerating DETR Convergence via Semantic-Aligned Matching**\n\n- 论文/Paper：https://arxiv.org/abs/2203.06883\n- 代码/Code：https://github.com/ZhangGongjie/SAM-DETR\n\n**ADAS: A Direct Adaptation Strategy for Multi-Target Domain Adaptive Semantic Segmentation**\n\n- 论文/Paper：https://arxiv.org/abs/2203.06811\n- 代码/Code：\n\n**Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs**\n\n- 论文/Paper：https://arxiv.org/abs/2203.06717\n- 
代码/Code：https://github.com/megvii-research/RepLKNet\n\n**LAS-AT: Adversarial Training with Learnable Attack Strategy**\n\n- 论文/Paper：https://arxiv.org/abs/2203.06616\n- 代码/Code：https://github.com/jiaxiaojunQAQ/LAS-AT\n\n**Depth-Aware Generative Adversarial Network for Talking Head Video Generation**\n\n- 论文/Paper：https://arxiv.org/abs/2203.06605\n- 代码/Code：https://github.com/harlanhong/CVPR2022-DaGAN\n\n**AutoGPart: Intermediate Supervision Search for Generalizable 3D Part Segmentation**\n\n- 论文/Paper：https://arxiv.org/abs/2203.06558\n\n- 代码/Code：\n\n**Sparse Local Patch Transformer for Robust Face Alignment and Landmarks Inherent Relation Learning**\n\n- 论文/Paper：https://arxiv.org/abs/2203.06541\n- 代码/Code：https://github.com/Jiahao-UTS/SLPT-master\n\n**Implicit Feature Decoupling with Depthwise Quantization**\n\n- 论文/Paper：https://arxiv.org/abs/2203.08080\n- 代码/Code：\n\n**Interspace Pruning: Using Adaptive Filter Representations to Improve Training of Sparse CNNs**\n\n- 论文/Paper：https://arxiv.org/abs/2203.07808\n- 代码/Code：\n\n**Learning What Not to Segment: A New Perspective on Few-Shot Segmentation**\n\n- 论文/Paper：https://arxiv.org/abs/2203.07615\n- 代码/Code：https://github.com/chunbolang/BAM\n\n**Can Neural Nets Learn the Same Model Twice? 
Investigating Reproducibility and Double Descent from the Decision Boundary Perspective**\n\n- 论文/Paper：https://arxiv.org/abs/2203.08124\n- 代码/Code：https://github.com/somepago/dbViz\n\n**Scalable Penalized Regression for Noise Detection in Learning with Noisy Labels**\n\n- 论文/Paper：https://arxiv.org/abs/2203.07788\n- 代码/Code：https://github.com/Yikai-Wang/SPR-LNL\n\n**Deep vanishing point detection: Geometric priors make dataset variations vanish**\n\n- 论文/Paper：https://arxiv.org/abs/2203.08586\n- 代码/Code：https://github.com/yanconglin/VanishingPoint_HoughTransform_GaussianSphere\n\n**Non-isotropy Regularization for Proxy-based Deep Metric Learning**\n\n- 论文/Paper：https://arxiv.org/abs/2203.08563\n\n- 代码/Code：https://github.com/ExplainableML/NonIsotropicProxyDML\n\n**Integrating Language Guidance into Vision-based Deep Metric Learning**\n\n- 论文/Paper：https://arxiv.org/abs/2203.08543\n- 代码/Code：https://github.com/ExplainableML/LanguageGuidance_for_DML\n\n**Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding**\n\n- 论文/Paper：https://arxiv.org/abs/2203.08481\n- 代码/Code：https://github.com/LeapLabTHU/Pseudo-Q\n\n**The Devil Is in the Details: Window-based Attention for Image Compression**\n\n- 论文/Paper：https://arxiv.org/abs/2203.08450\n- 代码/Code：https://github.com/Googolxx/STF\n\n**Represent, Compare, and Learn: A Similarity-Aware Framework for Class-Agnostic Counting**\n\n- 论文/Paper：https://arxiv.org/abs/2203.08354\n- 代码/Code：https://github.com/flyinglynx/Bilinear-Matching-Network\n\n**Vox2Cortex: Fast Explicit Reconstruction of Cortical Surfaces from 3D MRI Scans with Geometric Deep Neural Networks**\n\n- 论文/Paper：https://arxiv.org/abs/2203.09446\n- 代码/Code：\n\n**Bi-directional Object-context Prioritization Learning for Saliency Ranking**\n\n- 论文/Paper：https://arxiv.org/abs/2203.09416\n- 代码/Code：https://github.com/GrassBro/OCOR\n\n**Object Localization under Single Coarse Point Supervision**\n\n- 论文/Paper：https://arxiv.org/abs/2203.09338\n- 
代码/Code：https://github.com/ucas-vg/PointTinyBenchmark/\n\n**Neural Compression-Based Feature Learning for Video Restoration**\n\n- 论文/Paper：https://arxiv.org/abs/2203.09208\n- 代码/Code：\n\n**MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering**\n\n- 论文/Paper：https://arxiv.org/abs/2203.09138\n- 代码/Code：https://github.com/AndersonStra/MuKEA\n\n**Improving the Transferability of Targeted Adversarial Examples through Object-Based Diverse Input**\n\n- 论文/Paper：https://arxiv.org/abs/2203.09123\n- 代码/Code：https://github.com/dreamflake/ODI\n\n**DATA: Domain-Aware and Task-Aware Pre-training**\n\n- 论文/Paper：https://arxiv.org/abs/2203.09041\n- 代码/Code：https://github.com/GAIA-vision/GAIA-ssl\n\n**Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning**\n\n- 论文/Paper：https://arxiv.org/abs/2203.09249\n\n- 代码/Code：\n\n**Global Convergence of MAML and Theory-Inspired Neural Architecture Search for Few-Shot Learning**\n\n- 论文/Paper：https://arxiv.org/abs/2203.09137\n- 代码/Code：https://github.com/YiteWang/MetaNTK-NAS\n\n**Learning Affordance Grounding from Exocentric Images**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.09905\n- 代码/Code: https://github.com/lhc1224/cross-view-affordance-grounding\n\n**DTA: Physical Camouflage Attacks using Differentiable Transformation Network**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.09831\n- 代码/Code: None\n\n**Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.09824\n- 代码/Code: None\n\n**Revisiting Domain Generalized Stereo Matching Networks from a Feature Consistency Perspective**\n\n- 论文/Paper: http://arxiv.org/abs/2203.10887\n- 代码/Code: None\n\n**ViM: Out-Of-Distribution with Virtual-logit Matching**\n\n- 论文/Paper: http://arxiv.org/abs/2203.10807\n- 代码/Code: None\n\n**Delving into the Estimation Shift of Batch Normalization in a Network**\n\n- 论文/Paper: http://arxiv.org/abs/2203.10778\n- 代码/Code: 
None\n\n**Depth Estimation by Combining Binocular Stereo and Monocular Structured-Light**\n\n- 论文/Paper: http://arxiv.org/abs/2203.10493\n- 代码/Code: None\n\n**TVConv: Efficient Translation Variant Convolution for Layout-aware Visual Processing**\n\n- 论文/Paper: http://arxiv.org/abs/2203.10489\n- 代码/Code: https://github.com/JierunChen/TVConv\n\n**Portrait Eyeglasses and Shadow Removal by Leveraging 3D Synthetic Data**\n\n- 论文/Paper: http://arxiv.org/abs/2203.10474\n- 代码/Code: None\n\n**Discovering Objects that Can Move**\n\n- 论文/Paper: http://arxiv.org/abs/2203.10159\n- 代码/Code: None\n\n**φ-SfT: Shape-from-Template with a Physics-Based Deformation Model**\n\n- 论文/Paper: http://arxiv.org/abs/2203.11938\n- 代码/Code: None\n\n**Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation**\n\n- 论文/Paper: http://arxiv.org/abs/2203.11483\n- 代码/Code: None\n\n**Mixed Differential Privacy in Computer Vision**\n\n- 论文/Paper: http://arxiv.org/abs/2203.11481\n- 代码/Code: None\n\n**Global Matching with Overlapping Attention for Optical Flow Estimation**\n\n- 论文/Paper: http://arxiv.org/abs/2203.11335\n- 代码/Code: None\n\n**DR.VIC: Decomposition and Reasoning for Video Individual Counting**\n\n- 论文/Paper: http://arxiv.org/abs/2203.12335\n- 代码/Code: https://github.com/taohan10200/drnet\n\n**DTFD-MIL: Double-Tier Feature Distillation Multiple Instance Learning for Histopathology Whole Slide Image Classification**\n\n- 论文/Paper: http://arxiv.org/abs/2203.12081\n- 代码/Code: https://github.com/hrzhang1123/DTFD-MIL\n\n**Moving Window Regression: A Novel Approach to Ordinal Regression**\n\n- 论文/Paper: http://arxiv.org/abs/2203.13122\n- 代码/Code: None\n\n**Egocentric Prediction of Action Target in 3D**\n\n- 论文/Paper: http://arxiv.org/abs/2203.13116\n- 代码/Code: None\n\n**Hierarchical Nearest Neighbor Graph Embedding for Efficient 
Dimensionality Reduction**\n\n- 论文/Paper: http://arxiv.org/abs/2203.12997\n- 代码/Code: None\n\n**Neural Reflectance for Shape Recovery with Shadow Handling**\n\n- 论文/Paper: http://arxiv.org/abs/2203.12909\n- 代码/Code: None\n\n**DyRep: Bootstrapping Training with Dynamic Re-parameterization**\n\n- 论文/Paper: http://arxiv.org/abs/2203.12868\n- 代码/Code: None\n\n**Multidimensional Belief Quantification for Label-Efficient Meta-Learning**\n\n- 论文/Paper: http://arxiv.org/abs/2203.12768\n- 代码/Code: None\n\n**Give Me Your Attention: Dot-Product Attention Considered Harmful for Adversarial Patch Robustness**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.13639\n- 代码/Code: None\n\n**Unsupervised Pre-training for Temporal Action Localization Tasks**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.13609\n- 代码/Code: None\n\n**Continual Test-Time Domain Adaptation**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.13591\n- 代码/Code: None\n\n**Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.13412\n- 代码/Code: None\n\n**NPBG++: Accelerating Neural Point-Based Graphics**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.13318\n- 代码/Code: None\n\n**Weakly-Supervised Online Action Segmentation in Multi-View Instructional Videos**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.13309\n- 代码/Code: None\n\n**Probing Representation Forgetting in Supervised and Unsupervised Continual Learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.13381\n- 代码/Code: None\n\n**Energy-based Latent Aligner for Incremental Learning**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14952\n- 代码/Code: None\n\n**Controllable Dynamic Multi-Task Architectures**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14949\n- 代码/Code: None\n\n**Attributable Visual Similarity Learning**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14932\n- 代码/Code: None\n\n**Learning Where to Learn in Cross-View Self-Supervised Learning**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14898\n- 代码/Code: 
None\n\n**Doodle It Yourself: Class Incremental Learning by Drawing a Few Sketches**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14843\n- 代码/Code: None\n\n**Partially Does It: Towards Scene-Level FG-SBIR with Partial Input**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14804\n- 代码/Code: None\n\n**Bi-level Doubly Variational Learning for Energy-based Latent Variable Models**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14702\n- 代码/Code: None\n\n**Sketch3T: Test-Time Training for Zero-Shot SBIR**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14691\n- 代码/Code: None\n\n**Brain-inspired Multilayer Perceptron with Spiking Neurons**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14679\n- 代码/Code: None\n\n**Catching Both Gray and Black Swans: Open-set Supervised Anomaly Detection**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14506\n- 代码/Code: None\n\n**NOC-REK: Novel Object Captioning with Retrieved Vocabulary from External Knowledge**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14499\n- 代码/Code: None\n\n**ARCS: Accurate Rotation and Correspondence Search**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14493\n- 代码/Code: None\n\n**iPLAN: Interactive and Procedural Layout Planning**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14412\n- 代码/Code: None\n\n**Locality-Aware Inter-and Intra-Video Reconstruction for Self-Supervised Correspondence Learning**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14333\n- 代码/Code: None\n\n**Local-Adaptive Face Recognition via Graph-based Meta-Clustering and Regularized Adaptation**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14327\n- 代码/Code: None\n\n**Unsupervised Vision-Language Parsing: Seamlessly Bridging Visual Scene Graphs with Language Structures via Dependency Relationships**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14260\n- 代码/Code: None\n\n**Knowledge Mining with Scene Text for Fine-Grained Recognition**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14215\n- 代码/Code: None\n\n**Long-Tailed Recognition via Weight Balancing**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14197\n- 代码/Code: 
None\n\n**HINT: Hierarchical Neuron Concept Explainer**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14196\n- 代码/Code: None\n\n**Bridge-Prompt: Towards Ordinal Action Understanding in Instructional Videos**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14104\n- 代码/Code: None\n\n**Learning to Answer Questions in Dynamic Audio-Visual Scenarios**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14072\n- 代码/Code: None\n\n**Neural MoCon: Neural Motion Control for Physically Plausible Human Motion Capture**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14065\n- 代码/Code: None\n\n**Visual Abductive Reasoning**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14040\n- 代码/Code: None\n\n**RSCFed: Random Sampling Consensus Federated Semi-supervised Learning**\n\n- 论文/Paper: http://arxiv.org/abs/2203.13993\n- 代码/Code: None\n\n**GEN-VLKT: Simplify Association and Enhance Interaction Understanding for HOI Detection**\n\n- 论文/Paper: http://arxiv.org/abs/2203.13954\n- 代码/Code: None\n\n**Sylph: A Hypernetwork Framework for Incremental Few-shot Object Detection**\n\n- 论文/Paper: http://arxiv.org/abs/2203.13903\n- 代码/Code: None\n\n**Causality Inspired Representation Learning for Domain Generalization**\n\n- 论文/Paper: http://arxiv.org/abs/2203.14237\n- 代码/Code: None\n\n**Transformer-empowered Multi-scale Contextual Matching and Aggregation for Multi-contrast MRI Super-resolution**\n\n- 论文/Paper: http://arxiv.org/abs/2203.13963\n- 代码/Code: None\n\n**CHEX: CHannel EXploration for CNN Model Compression**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15794\n- 代码/Code: None\n\n**FisherMatch: Semi-Supervised Rotation Regression via Entropy-based Filtering**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15765\n- 代码/Code: None\n\n**EnvEdit: Environment Editing for Vision-and-Language Navigation**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15685\n- 代码/Code: None\n\n**Exploring Frequency Adversarial Attacks for Face Forgery Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15674\n- 代码/Code: None\n\n**BARC: Learning to Regress 3D Dog 
Shape from Images by Exploiting Breed Information**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15536\n- 代码/Code: None\n\n**Learning Structured Gaussians to Approximate Deep Ensembles**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15485\n- 代码/Code: None\n\n**Quantifying Societal Bias Amplification in Image Captioning**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15395\n- 代码/Code: None\n\n**Alignment-Uniformity aware Representation Learning for Zero-shot Video Classification**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15381\n- 代码/Code: https://github.com/ShipuLoveMili/CVPR2022-AURL\n\n**Self-Supervised Image Representation Learning with Geometric Set Consistency**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15361\n- 代码/Code: None\n\n**Nested Collaborative Learning for Long-Tailed Visual Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15359\n- 代码/Code: None\n\n**Online Continual Learning on a Contaminated Data Stream with Blurry Task Boundaries**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15355\n- 代码/Code: None\n\n**CNN Filter DB: An Empirical Investigation of Trained Convolutional Filters**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15331\n- 代码/Code: None\n\n**Dressing in the Wild by Watching Dance Videos**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15320\n- 代码/Code: None\n\n**Eigencontours: Novel Contour Descriptors Based on Low-Rank Approximation**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15259\n- 代码/Code: None\n\n**Pop-Out Motion: 3D-Aware Image Deformation via Learning the Shape Laplacian**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15235\n- 代码/Code: None\n\n**Zero-Query Transfer Attacks on Context-Aware Object Detectors**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15230\n- 代码/Code: None\n\n**ASM-Loc: Action-aware Segment Modeling for Weakly-Supervised Temporal Action Localization**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15187\n- 代码/Code: None\n\n**Registering Explicit to Implicit: Towards High-Fidelity Garment mesh Reconstruction from Single Images**\n\n- 论文/Paper: 
http://arxiv.org/pdf/2203.15007\n- 代码/Code: None\n\n**Clean Implicit 3D Structure from Noisy 2D STEM Images**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15434\n- 代码/Code: None\n\n**Equivariance Allows Handling Multiple Nuisance Variables When Analyzing Pooled Neuroimaging Datasets**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15234\n- 代码/Code: None\n\n**Large-Scale Pre-training for Person Re-identification with Noisy Labels**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16533\n- 代码/Code: https://github.com/dengpanfu/luperson-nl\n\n**Understanding 3D Object Articulation in Internet Videos**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16531\n- 代码/Code: None\n\n**CaDeX: Learning Canonical Deformation Coordinate Space for Dynamic Surface Representation via Neural Homeomorphism**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16529\n- 代码/Code: None\n\n**Unseen Classes at a Later Time? No Problem**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16517\n- 代码/Code: https://github.com/sumitramalagi/unseen-classes-at-a-later-time\n\n**Fast Light-Weight Near-Field Photometric Stereo**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16515\n- 代码/Code: None\n\n**AdaMixer: A Fast-Converging Query-Based Object Detector**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16507\n- 代码/Code: https://github.com/mcg-nju/adamixer\n\n**Fast, Accurate and Memory-Efficient Partial Permutation Synchronization**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16505\n- 代码/Code: None\n\n**Balanced MSE for Imbalanced Visual Regression**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16427\n- 代码/Code: None\n\n**Multi-Robot Active Mapping via Neural Bipartite Graph Matching**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16319\n- 代码/Code: None\n\n**Image-to-Lidar Self-Supervised Distillation for Autonomous Driving Data**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16258\n- 代码/Code: https://github.com/valeoai/slidr\n\n**FLOAT: Factorized Learning of Object Attributes for Improved Multi-object Multi-part Scene Parsing**\n\n- 论文/Paper: 
http://arxiv.org/pdf/2203.16168\n- 代码/Code: None\n\n**STRPM: A Spatiotemporal Residual Predictive Model for High-Resolution Video Prediction**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16084\n- 代码/Code: None\n\n**Learning Program Representations for Food Images and Cooking Recipes**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16071\n- 代码/Code: None\n\n**AxIoU: An Axiomatically Justified Measure for Video Moment Retrieval**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16062\n- 代码/Code: None\n\n**Progressively Generating Better Initial Guesses Towards Next Stages for High-Quality Human Motion Prediction**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16051\n- 代码/Code: None\n\n**Iterative Deep Homography Estimation**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15982\n- 代码/Code: https://github.com/imdumpl78/ihn\n\n**PSMNet: Position-aware Stereo Merging Network for Room Layout Estimation**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15965\n- 代码/Code: None\n\n**Disentangled3D: Learning a 3D Generative Model with Disentangled Geometry and Appearance from Monocular Images**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15926\n- 代码/Code: None\n\n**Learning to Detect Mobile Objects from LiDAR Scans Without Labels**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15882\n- 代码/Code: https://github.com/yurongyou/modest\n\n**Proactive Image Manipulation Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15880\n- 代码/Code: https://github.com/vishal3477/proactive_imd\n\n**NICGSlowDown: Evaluating the Efficiency Robustness of Neural Image Caption Generation Models**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.15859\n- 代码/Code: https://github.com/seekingdream/nicgslowdown\n\n**Practical Learned Lossless JPEG Recompression with Multi-Level Cross-Channel Entropy Model in the DCT Domain**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16357\n- 代码/Code: None\n\n**Bringing Old Films Back to Life**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.17276\n- 代码/Code: 
https://github.com/raywzy/Bringing-Old-Films-Back-to-Life\n\n**Generating High Fidelity Data from Low-density Regions using Diffusion Models**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.17260\n- 代码/Code: None\n\n**Continuous Scene Representations for Embodied AI**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.17251\n- 代码/Code: None\n\n**SimVQA: Exploring Simulated Environments for Visual Question Answering**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.17219\n- 代码/Code: None\n\n**Leverage Your Local and Global Representations: A New Self-Supervised Learning Strategy**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.17205\n- 代码/Code: None\n\n**AEGNN: Asynchronous Event-based Graph Neural Networks**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.17149\n- 代码/Code: None\n\n**It's All In the Teacher: Zero-Shot Quantization Brought Closer to the Teacher**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.17008\n- 代码/Code: None\n\n**Towards Robust Rain Removal Against Adversarial Attacks: A Comprehensive Benchmark Analysis and Beyond**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16931\n- 代码/Code: None\n\n**End-to-End Trajectory Distribution Prediction Based on Occupancy Grid Maps**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16910\n- 代码/Code: None\n\n**Reflection and Rotation Symmetry Detection via Equivariant Learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16787\n- 代码/Code: None\n\n**Stochastic Backpropagation: A Memory Efficient Strategy for Training Video Models**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16755\n- 代码/Code: None\n\n**Personalized Image Aesthetics Assessment with Rich Attributes**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16754\n- 代码/Code: None\n\n**Constrained Few-shot Class-incremental Learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16588\n- 代码/Code: None\n\n**Counterfactual Cycle-Consistent Learning for Instruction Following and Generation in Vision-Language Navigation**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16586\n- 代码/Code: None\n\n**Exploiting Explainable Metrics for Augmented 
SGD**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16723\n- 代码/Code: None\n\n**Task Adaptive Parameter Sharing for Multi-Task Learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2203.16708\n- 代码/Code: None\n\n**D-Grasp: Physically Plausible Dynamic Grasp Synthesis for Hand-Object Interactions**\n\n- 论文/Paper: http://arxiv.org/pdf/2112.03028\n- 代码/Code: None\n\n**On the Importance of Asymmetry for Siamese Representation Learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00613\n- 代码/Code: https://github.com/facebookresearch/asym-siam\n\n**DIP: Deep Inverse Patchmatch for High-Resolution Optical Flow**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00330\n- 代码/Code: https://github.com/zihuazheng/dip\n\n**Unimodal-Concentrated Loss: Fully Adaptive Label Distribution Learning for Ordinal Regression**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00309\n- 代码/Code: None\n\n**Perception Prioritized Training of Diffusion Models**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00227\n- 代码/Code: https://github.com/jychoi118/p2-weighting\n\n**Bridging the Gap between Classification and Localization for Weakly Supervised Object Localization**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00220\n- 代码/Code: None\n\n**GraftNet: Towards Domain Generalized Stereo Matching with a Broad-Spectrum and Task-Oriented Feature**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00179\n- 代码/Code: https://github.com/spadeliu/graft-psmnet\n\n**LASER: LAtent SpacE Rendering for 2D Visual Localization**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00157\n- 代码/Code: None\n\n**TransGeo: Transformer Is All You Need for Cross-view Image Geo-localization**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00097\n- 代码/Code: https://github.com/jeff-zilence/transgeo2022\n\n**Investigating Top-$k$ White-Box and Transferable Black-box Attack**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00089\n- 代码/Code: None\n\n**Efficient Maximal Coding Rate Reduction by Variational Forms**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00077\n- 代码/Code: None 
\n\n**Joint Hand Motion and Interaction Hotspots Prediction from Egocentric Videos**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.01696\n- 代码/Code: None\n\n**LISA: Learning Implicit Shape and Appearance of Hands**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.01695\n- 代码/Code: None\n\n**Exemplar-based Pattern Synthesis with Implicit Periodic Field Network**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.01671\n- 代码/Code: None\n\n**Degradation-agnostic Correspondence from Resolution-asymmetric Stereo**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.01429\n- 代码/Code: None\n\n**RayMVSNet: Learning Ray-based 1D Implicit Fields for Accurate Multi-View Stereo**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.01320\n- 代码/Code: None\n\n**Exploiting Temporal Relations on Radar Perception for Autonomous Driving**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.01184\n- 代码/Code: None\n\n**BNV-Fusion: Dense 3D Reconstruction using Bi-level Neural Volume Fusion**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.01139\n- 代码/Code: None\n\n**Neural Global Shutter: Learn to Restore Video from a Rolling Shutter Camera with Global Reset Feature**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00974\n- 代码/Code: https://github.com/lightchaserx/neural-global-shutter\n\n**DST: Dynamic Substitute Training for Data-free Black-box Attack**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00972\n- 代码/Code: None\n\n**Progressive Minimal Path Method with Embedded CNN**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00944\n- 代码/Code: None\n\n**Online Convolutional Re-parameterization**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00826\n- 代码/Code: None\n\n**SIMBAR: Single Image-Based Scene Relighting For Effective Data Augmentation For Automated Driving Vision Tasks**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.00644\n- 代码/Code: None\n\n**Rethinking Visual Geo-localization for Large-Scale Applications**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02287\n- 代码/Code: None\n\n**IRON: Inverse Rendering by Optimizing Neural SDFs and Materials from 
Photometric Images**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02232\n- 代码/Code: None\n\n**SNUG: Self-Supervised Neural Dynamic Garments**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02219\n- 代码/Code: None\n\n**Leveraging Equivariant Features for Absolute Pose Regression**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02163\n- 代码/Code: None\n\n**MonoTrack: Shuttle trajectory reconstruction from monocular badminton video**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.01899\n- 代码/Code: None\n\n**Revisiting Near/Remote Sensing with Geospatial Attention**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.01807\n- 代码/Code: None\n\n**Temporal Alignment Networks for Long-term Video**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02968\n- 代码/Code: None\n\n**\"The Pedestrian next to the Lamppost\" Adaptive Object Graphs for Better Instantaneous Mapping**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02944\n- 代码/Code: None\n\n**Masking Adversarial Damage: Finding Adversarial Saliency for Robust and Sparse Network**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02738\n- 代码/Code: None\n\n**Aesthetic Text Logo Synthesis via Content-aware Layout Inferring**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02701\n- 代码/Code: https://github.com/yizhiwang96/TextLogoLayout\n\n**Learning to Anticipate Future with Dynamic Context Removal**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02587\n- 代码/Code: https://github.com/AllenXuuu/DCR\n\n**SqueezeNeRF: Further factorized FastNeRF for memory-efficient inference**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02585\n- 代码/Code: None\n\n**Gait Recognition in the Wild with Dense 3D Representations and A Benchmark**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02569\n- 代码/Code: None\n\n**MixFormer: Mixing Features across Windows and Dimensions**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02557\n- 代码/Code: https://github.com/PaddlePaddle/PaddleClas\n\n**RODD: A Self-Supervised Approach for Robust Out-of-Distribution Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02553\n- 代码/Code: 
None\n\n**Adversarial Robustness through the Lens of Convolutional Filters**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02481\n- 代码/Code: https://github.com/paulgavrikov/cvpr22w_RobustnessThroughTheLens\n\n**Learning Optimal K-space Acquisition and Reconstruction using Physics-Informed Neural Networks**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02480\n- 代码/Code: None\n\n**Total Variation Optimization Layers for Computer Vision**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.03643\n- 代码/Code: https://github.com/raymondyeh07/tv_layers_for_cv\n\n**Pre-train, Self-train, Distill: A simple recipe for Supersizing 3D Reconstruction**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.03642\n- 代码/Code: None\n\n**Class-Incremental Learning with Strong Pre-trained Models**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.03634\n- 代码/Code: None\n\n**AutoRF: Learning 3D Object Radiance Fields from Single View Observations**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.03593\n- 代码/Code: None\n\n**Deep Visual Geo-localization Benchmark**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.03444\n- 代码/Code: None\n\n**Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.03162\n- 代码/Code: None\n\n**UIGR: Unified Interactive Garment Retrieval**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.03111\n- 代码/Code: https://github.com/brandonhanx/compfashion\n\n**AUV-Net: Learning Aligned UV Maps for Texture Transfer and Synthesis**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.03105\n- 代码/Code: None\n\n**Hierarchical Self-supervised Representation Learning for Movie Understanding**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.03101\n- 代码/Code: None\n\n**Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.03017\n- 代码/Code: None\n\n**Multi-Scale Memory-Based Video Deblurring**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.02977\n- 代码/Code: 
https://github.com/jibo27/memdeblur\n\n**Gravitationally Lensed Black Hole Emission Tomography**\n\n- 论文/Paper: http://arxiv.org/abs/2204.03715\n- 代码/Code: None\n\n**General Incremental Learning with Domain-aware Categorical Representations**\n\n- 论文/Paper: http://arxiv.org/abs/2204.04078\n- 代码/Code: None\n\n**Identifying Ambiguous Similarity Conditions via Semantic Matching**\n\n- 论文/Paper: http://arxiv.org/abs/2204.04053\n- 代码/Code: None\n\n**Does Robustness on ImageNet Transfer to Downstream Tasks?**\n\n- 论文/Paper: http://arxiv.org/abs/2204.03934\n- 代码/Code: None\n\n**Deep Hyperspectral-Depth Reconstruction Using Single Color-Dot Projection**\n\n- 论文/Paper: http://arxiv.org/abs/2204.03929\n- 代码/Code: None\n\n**CD$^2$-pFed: Cyclic Distillation-guided Channel Decoupling for Model Personalization in Federated Learning**\n\n- 论文/Paper: http://arxiv.org/abs/2204.03880\n- 代码/Code: None\n\n**Reusing the Task-specific Classifier as a Discriminator: Discriminator-free Adversarial Domain Adaptation**\n\n- 论文/Paper: http://arxiv.org/abs/2204.03838\n- 代码/Code: https://github.com/xiaoachen98/DALN\n\n**TorMentor: Deterministic dynamic-path, data augmentations with fractals**\n\n- 论文/Paper: http://arxiv.org/abs/2204.03776\n- 代码/Code: None\n\n**TemporalUV: Capturing Loose Clothing with Temporally Coherent UV Coordinates**\n\n- 论文/Paper: http://arxiv.org/abs/2204.03671\n- 代码/Code: None\n\n**Single-Photon Structured Light**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.05300\n- 代码/Code: None\n\n**Pyramid Grafting Network for One-Stage High Resolution Saliency Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.05041\n- 代码/Code: None\n\n**Structure-Aware Motion Transfer with Deformable Anchor Model**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.05018\n- 代码/Code: None\n\n**Reasoning with Multi-Structure Commonsense Knowledge in Visual Dialog**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.04680\n- 代码/Code: None\n\n**NAN: Noise-Aware NeRFs for Burst-Denoising**\n\n- 论文/Paper: 
http://arxiv.org/pdf/2204.04668\n- 代码/Code: None\n\n**Learning Pixel-Level Distinctions for Video Highlight Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.04615\n- 代码/Code: None\n\n**Explaining Deep Convolutional Neural Networks via Latent Visual-Semantic Filter Attention**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.04601\n- 代码/Code: None\n\n**DeepLIIF: An Online Platform for Quantification of Clinical Pathology Slides**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.04494\n- 代码/Code: None\n\n**ManiTrans: Entity-Level Text-Guided Image Manipulation via Token-wise Semantic Alignment and Generation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.04428\n- 代码/Code: None\n\n**FedCorr: Multi-Stage Federated Learning for Label Noise Correction**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.04677\n- 代码/Code: https://github.com/Xu-Jingyi/FedCorr\n\n**Adaptive Differential Filters for Fast and Communication-Efficient Federated Learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.04424\n- 代码/Code: None\n\n**The Two Dimensions of Worst-case Training and the Integrated Effect for Out-of-domain Generalization**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.04384\n- 代码/Code: None\n\n**Continual Predictive Learning from Videos**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.05624\n- 代码/Code: https://github.com/jc043/CPL\n\n**Few-shot Learning with Noisy Labels**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.05494\n- 代码/Code: None\n\n**Out-Of-Distribution Detection In Unsupervised Continual Learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.05462\n- 代码/Code: None\n\n**Generalizing Adversarial Explanations with Grad-CAM**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.05427\n- 代码/Code: None\n\n**Recognition of Freely Selected Keypoints on Human Limbs**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.06326\n- 代码/Code: None\n\n**3D-SPS: Single-Stage 3D Visual Grounding via Referred Point Progressive Selection**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.06272\n- 代码/Code: None\n\n**Defensive Patches for Robust Recognition in 
the Physical World**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.06213\n- 代码/Code: https://github.com/nlsde-safety-team/DefensivePatch\n\n**COAP: Compositional Articulated Occupancy of People**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.06184\n- 代码/Code: None\n\n**What's in your hands? 3D Reconstruction of Generic Objects in Hands**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.07153\n- 代码/Code: None\n\n**GIFS: Neural Implicit Function for General Shape Representation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.07126\n- 代码/Code: None\n\n**The multi-modal universe of fast-fashion: the Visuelle 2.0 benchmark**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.06972\n- 代码/Code: None\n\n**Semi-Supervised Training to Improve Player and Ball Detection in Soccer**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.06859\n- 代码/Code: https://github.com/rvandeghen/SST\n\n**Pyramidal Attention for Saliency Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.06788\n- 代码/Code: https://github.com/tanveer-hussain/EfficientSOD2\n\n**OccAM's Laser: Occlusion-based Attribution Maps for 3D Object Detectors on LiDAR Data**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.06577\n- 代码/Code: https://github.com/dschinagl/occam\n\n**Patch-wise Contrastive Style Learning for Instagram Filter Removal**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.07486\n- 代码/Code: None\n\n**Guiding Attention using Partial-Order Relationships for Image Captioning**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.07476\n- 代码/Code: None\n\n**MetaSets: Meta-Learning on Point Sets for Generalizable Representations**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.07311\n- 代码/Code: None\n\n**Pushing the Limits of Simple Pipelines for Few-Shot Learning: External Data and Fine-Tuning Make a Difference**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.07305\n- 代码/Code: None\n\n**Imposing Consistency for Optical Flow Estimation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.07262\n- 代码/Code: None\n\n**Measuring Compositional Consistency for Video Question Answering**\n\n- 论文/Paper: 
http://arxiv.org/pdf/2204.07190\n- 代码/Code: None\n\n**Deep Equilibrium Optical Flow Estimation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.08442\n- 代码/Code: None\n\n**Unsupervised domain adaptation and super resolution on drone images for autonomous dry herbage biomass estimation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.08271\n- 代码/Code: None\n\n**OMG: Observe Multiple Granularities for Natural Language-Based Vehicle Retrieval**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.08209\n- 代码/Code: https://github.com/dyhBUPT/OMG\n\n**Towards a Deeper Understanding of Skeleton-based Gait Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.07855\n- 代码/Code: None\n\n**Interactiveness Field in Human-Object Interactions**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.07718\n- 代码/Code: https://github.com/Foruck/Interactiveness-Field\n\n**It is Okay to Not Be Okay: Overcoming Emotional Bias in Affective Image Captioning by Contrastive Data Collection**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.07660\n- 代码/Code: None\n\n**Deep Unlearning via Randomized Conditionally Independent Hessians**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.07655\n- 代码/Code: https://github.com/vsingh-group/LCODEC-deep-unlearning\n\n**Learning to Imagine: Diversify Memory for Incremental Learning using Unlabeled Data**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.08932\n- 代码/Code: https://github.com/TOM-tym/Learn-to-Imagine\n\n**An Efficient Domain-Incremental Learning Approach to Drive in All Weather Conditions**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.08817\n- 代码/Code: None\n\n**Incorporating Semi-Supervised and Positive-Unlabeled Learning for Boosting Full Reference Image Quality Assessment**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.08763\n- 代码/Code: https://github.com/happycaoyue/JSPL\n\n**Self-Supervised Equivariant Learning for Oriented Keypoint Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.08613\n- 代码/Code: None\n\n**GazeOnce: Real-Time Multi-Person Gaze Estimation**\n\n- 论文/Paper: 
http://arxiv.org/pdf/2204.09480\n- 代码/Code: None\n\n**Epistemic Uncertainty-Weighted Loss for Visual Bias Mitigation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.09389\n- 代码/Code: None\n\n**Reinforced Structured State-Evolution for Vision-Language Navigation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.09280\n- 代码/Code: None\n\n**SpiderNet: Hybrid Differentiable-Evolutionary Architecture Search via Train-Free Metrics**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.09320\n- 代码/Code: None\n\n**A Deeper Look into Aleatoric and Epistemic Uncertainty Disentanglement**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.09308\n- 代码/Code: None\n\n**Does Interference Exist When Training a Once-For-All Network?**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.09210\n- 代码/Code: https://github.com/Jordan-HS/RSS-Interference-CVPRW2022\n\n**Importance is in your attention: agent importance prediction for autonomous driving**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.09121\n- 代码/Code: None\n\n**SelfD: Self-Learning Large-Scale Driving Policies From the Web**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.10320\n- 代码/Code: None\n\n**SmartPortraits: Depth Powered Handheld Smartphone Dataset of Human Portraits for State Estimation, Reconstruction and Synthesis**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.10211\n- 代码/Code: None\n\n**A case for using rotation invariant features in state of the art feature matchers**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.10144\n- 代码/Code: None\n\n**Toward Fast, Flexible, and Robust Low-Light Image Enhancement**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.10137\n- 代码/Code: https://github.com/vis-opt-group/SCI\n\n**OSSO: Obtaining Skeletal Shape from Outside**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.10129\n- 代码/Code: None\n\n**Is Neuron Coverage Needed to Make Person Detection More Robust?**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.10027\n- 代码/Code: None\n\n**Progressive Training of A Two-Stage Framework for Video Restoration**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.09924\n- 代码/Code: 
None\n\n**CNLL: A Semi-supervised Approach For Continual Noisy Label Learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.09881\n- 代码/Code: None\n\n**Persistent-Transient Duality in Human Behavior Modeling**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.09875\n- 代码/Code: None\n\n**Self-Supervised Learning to Guide Scientifically Relevant Categorization of Martian Terrain Images**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.09854\n- 代码/Code: https://github.com/TejasPanambur/mastcam\n\n**Exposure Correction Model to Enhance Image Quality**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.10648\n- 代码/Code: https://github.com/yamand16/exposurecorrection\n\n**Spacing Loss for Discovering Novel Categories**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.10595\n- 代码/Code: https://github.com/josephkj/awesome-novel-class-discovery\n\n**DiRA: Discriminative, Restorative, and Adversarial Learning for Self-supervised Medical Image Analysis**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.10437\n- 代码/Code: https://github.com/jlianglab/dira\n\n**The 6th AI City Challenge**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.10380\n- 代码/Code: None\n\n**Contrastive Test-Time Adaptation**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.10377\n- 代码/Code: None\n\n**Proto2Proto: Can you recognize the car, the way I do?**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.11830\n- 代码/Code: None\n\n**Multi-Layer Modeling of Dense Vegetation from Aerial LiDAR Scans**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.11620\n- 代码/Code: https://github.com/ekalinicheva/multi_layer_vegetation\n\n**Surpassing the Human Accuracy: Detecting Gallbladder Cancer from USG Images with Curriculum Learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.11433\n- 代码/Code: None\n\n**Can domain adaptation make object recognition work for everyone?**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.11122\n- 代码/Code: None\n\n**Investigating Neural Architectures by Synthetic Dataset Design**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.11045\n- 代码/Code: None\n\n**Revealing Occlusions with 
4D Neural Fields**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.10916\n- 代码/Code: None\n\n**Identity Preserving Loss for Learned Image Compression**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.10869\n- 代码/Code: None\n\n**Towards Data-Free Model Stealing in a Hard Label Setting**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.11022\n- 代码/Code: None\n\n**Context-Aware Sequence Alignment using 4D Skeletal Augmentation**\n\n- 论文/Paper: http://arxiv.org/abs/2204.12223\n- 代码/Code: None\n\n**Few-Shot Head Swapping in the Wild**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.13100\n- 代码/Code: None\n\n**Attention Consistency on Visual Corruptions for Single-Source Domain Generalization**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.13091\n- 代码/Code: None\n\n**Collaborative Learning for Hand and Object Reconstruction with Attention-guided Graph Convolution**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.13062\n- 代码/Code: None\n\n**A Scalable Combinatorial Solver for Elastic Geometrically Consistent 3D Shape Matching**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.12805\n- 代码/Code: http://github.com/paul0noah/sm-comb\n\n**Leveraging Unlabeled Data for Sketch-based Understanding**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.12522\n- 代码/Code: None\n\n**Towards assessing agricultural land suitability with causal machine learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.12956\n- 代码/Code: None\n\n**Conformer and Blind Noisy Students for Improved Image Quality Assessment**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.12819\n- 代码/Code: None\n\n**NeurMiPs: Neural Mixture of Planar Experts for View Synthesis**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.13696\n- 代码/Code: None\n\n**Learning from Pixel-Level Noisy Label: A New Perspective for Light Field Saliency Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.13456\n- 代码/Code: https://github.com/OLobbCode/NoiseLF\n\n**A Challenging Benchmark of Anime Style Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.14034\n- 代码/Code: 
https://github.com/nkjcqvcpi/asr\n\n**AdaInt: Learning Adaptive Intervals for 3D Lookup Tables on Real-time Image Enhancement**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.13983\n- 代码/Code: https://github.com/ImCharlesY/AdaInt\n\n**SCS-Co: Self-Consistent Style Contrastive Learning for Image Harmonization**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.13962\n- 代码/Code: https://github.com/ychang686/scs-co\n\n**Learning Adaptive Warping for Real-World Rolling Shutter Correction**\n\n- 论文/Paper: http://arxiv.org/pdf/2204.13886\n- 代码/Code: https://github.com/ljzycmd/bsrsc\n\n**Stability-driven Contact Reconstruction From Monocular Color Images**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.00848\n- 代码/Code: None\n\n**GPUNet: Searching the Deployable Convolution Neural Networks for GPUs**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.00841\n- 代码/Code: None\n\n**MUTR3D: A Multi-camera Tracking Framework via 3D-to-2D Queries**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.00613\n- 代码/Code: https://github.com/a1600012888/MUTR3D\n\n**LayoutBERT: Masked Language Layout Model for Object Insertion**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.00347\n- 代码/Code: None\n\n**Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.00272\n- 代码/Code: https://github.com/yangli18/vltvg\n\n**Look Closer to Supervise Better: One-Shot Font Generation via Component-Based Discriminator**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.00146\n- 代码/Code: None\n\n**Dual Cross-Attention Learning for Fine-Grained Visual Categorization and Object Re-Identification**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.02151\n- 代码/Code: None\n\n**Self-Taught Metric Learning without Labels**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.01903\n- 代码/Code: None\n\n**Fixing Malfunctional Objects With Learned Physical Simulation and Functional Prediction**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.02834\n- 代码/Code: None\n\n**Holistic Approach to Measure Sample-level 
Adversarial Vulnerability and its Utility in Building Trustworthy Systems**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.02604\n- 代码/Code: None\n\n**P3IV: Probabilistic Procedure Planning from Instructional Videos with Weak Supervision**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.02300\n- 代码/Code: None\n\n**Prompt Distribution Learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2205.03340\n- 代码/Code: None\n\n [返回目录/back](#Contents)\n\n\n\n\n\n\n\n\n\n参考：\n\n[如何评价 CVPR2022 的论文接收结果？](https://www.zhihu.com/question/519162597/)\n\n"
  },
  {
    "path": "CVPR2023.md",
    "content": "# CVPR2023-Papers-with-Code-Demo\n\n\n\n :star_and_crescent:**CVPR2022论文下载：添加微信: nvshenj125, 备注 CVPR 2022 即可获取全部论文pdf**\n\n :star_and_crescent:**福利 注册即可领取 200 块计算资源 : https://www.bkunyun.com/wap/console?source=aistudy**\n [使用说明](https://mp.weixin.qq.com/s?__biz=MzU4NTY4Mzg1Mw==&amp;mid=2247521550&amp;idx=1&amp;sn=db4c7f609bd61ae7734b9e012a763f98&amp;chksm=fd8413eccaf39afa686f69f2df2463f4a6a8233ba3b3edf698513bbee556c9f6c21e835b8eb8&token=705359263&lang=zh_CN#rd)\n\n\n欢迎关注公众号：AI算法与图像处理\n\n:star2: [CVPR 2023](https://cvpr2023.thecvf.com/) 持续更新最新论文/paper和相应的开源代码/code！\n\n\n\nB站demo：https://space.bilibili.com/288489574\n\n> :hand: ​注：欢迎各位大佬提交issue，分享CVPR 2022论文/paper和开源项目！共同完善这个项目\n>\n> 往年顶会论文汇总：\n>\n> [CVPR2021](https://github.com/DWCTOD/CVPR2023-Papers-with-Code-Demo/blob/main/CVPR2021.md)\n>\n> [CVPR2022](https://github.com/DWCTOD/CVPR2023-Papers-with-Code-Demo/blob/main/CVPR2022.md)\n>\n> [ICCV2021](https://github.com/DWCTOD/ICCV2021-Papers-with-Code-Demo)\n>\n> [ECCV2022](https://github.com/DWCTOD/ECCV2022-Papers-with-Code-Demo)\n\n### **:fireworks: 欢迎进群** | Welcome\n\nCVPR 2023 论文/paper交流群已成立！已经收录的同学，可以添加微信：**nvshenj125**，请备注：**CVPR+姓名+学校/公司名称**！一定要根据格式申请，可以拉你进群。\n\n<a name=\"Contents\"></a>\n\n\n\n### :hammer: **目录 |Table of Contents（点击直接跳转）**\n\n<details open>\n<summary> 目录（右侧点击可折叠）</summary>\n\n- [Backbone](#Backbone)\n- [数据集/Dataset](#Dataset)\n- [Diffusion Model](#DiffusionModel)\n- [NAS](#NAS)\n- [NeRF](#NeRF)\n- [Knowledge Distillation](#KnowledgeDistillation)\n- [多模态 / Multimodal ](#Multimodal )\n- [对比学习/Contrastive Learning](#ContrastiveLearning)\n- [图神经网络 / Graph Neural Networks](#GNN)\n- [胶囊网络 / Capsule Network](#CapsuleNetwork)\n- [图像分类 / Image Classification](#ImageClassification)\n- [目标检测/Object Detection](#ObjectDetection)\n- [目标跟踪/Object Tracking](#ObjectTracking)\n- [轨迹预测/Trajectory Prediction](#TrajectoryPrediction)\n- [语义分割/Segmentation](#Segmentation)\n- [弱监督语义分割/Weakly Supervised Semantic Segmentation](#WSSS)\n- 
[医学图像分割](#MedicalImageSegmentation)\n- [视频目标分割/Video Object Segmentation](#VideoObjectSegmentation)\n- [交互式视频目标分割/Interactive Video Object Segmentation](#InteractiveVideoObjectSegmentation)\n- [Visual Transformer](#VisualTransformer)\n- [深度估计/Depth Estimation](#DepthEstimation)\n- [人脸识别/Face Recognition](#FaceRecognition)\n- [人脸检测/Face Detection](#FaceDetection)\n- [人脸活体检测/Face Anti-Spoofing](#FaceAnti-Spoofing)\n- [人脸年龄估计/Age Estimation](#AgeEstimation)\n- [人脸表情识别/Facial Expression Recognition](#FacialExpressionRecognition)\n- [人脸属性识别/Facial Attribute Recognition](#FacialAttributeRecognition)\n- [人脸编辑/Facial Editing](#FacialEditing)\n- [人脸重建/Face Reconstruction](#FaceReconstruction)\n- [Talking Face](#TalkingFace)\n- [换脸/Face Swap](#FaceSwap)\n- [人体姿态估计/Human Pose Estimation](#HumanPoseEstimation)\n- [6D位姿估计 /6D Pose Estimation](#6DPoseEstimation)\n- [手势姿态估计（重建）/Hand Pose Estimation( Hand Mesh Recovery)](#HandPoseEstimation)\n- [视频动作检测/Video Action Detection](#VideoActionDetection)\n- [手语翻译/Sign Language Translation](#SignLanguageTranslation)\n- [3D人体重建](#3D人体重建)\n- [行人重识别/Person Re-identification](#PersonRe-identification)\n- [行人搜索/Person Search](#PersonSearch)\n- [人群计数 / Crowd Counting](#CrowdCounting)\n- [GAN](#GAN)\n- [彩妆迁移 / Color-Pattern Makeup Transfer](#CPM)\n- [字体生成 / Font Generation](#FontGeneration)\n- [场景文本检测、识别/Scene Text Detection/Recognition](#OCR)\n- [图像、视频检索 / Image Retrieval/Video retrieval](#Retrieval)\n- [Image Animation](#ImageAnimation)\n- [抠图/Image Matting](#ImageMatting)\n- [超分辨率/Super Resolution](#SuperResolution)\n- [图像复原/Image Restoration](#ImageRestoration)\n- [图像补全/Image Inpainting](#ImageInpainting)\n- [图像去噪/Image Denoising](#ImageDenoising)\n- [图像编辑/Image Editing](#ImageEditing)\n- [图像拼接/Image stitching](#Imagestitching)\n- [图像匹配/Image Matching](#ImageMatching)\n- [图像融合/Image Blending](#ImageBlending)\n- [图像去雾/Image Dehazing](#ImageDehazing)\n- [图像压缩/Image Compression](#ImageCompression)\n- [反光去除/Reflection 
Removal](#ReflectionRemoval)\n- [车道线检测/Lane Detection](#LaneDetection)\n- [自动驾驶 / Autonomous Driving](#AutonomousDriving)\n- [流体重建/Fluid Reconstruction](#FluidReconstruction)\n- [场景重建 / Scene Reconstruction](#SceneReconstruction)\n- [3D Reconstruction](#3DReconstruction)\n- [视频插帧/Frame Interpolation](#FrameInterpolation)\n- [视频超分 / Video Super-Resolution](#VideoSuper-Resolution)\n- [3D点云/3D point cloud](#3DPointCloud)\n- [标签噪声 / Label-Noise](#Label-Noise)\n- [对抗样本/Adversarial Examples](#AdversarialExamples)\n- [Anomaly Detection](#AnomalyDetection)\n- [其他/Other](#Other)\n\n\n</details>\n\n<a name=\"Backbone\"></a>\n\n## Backbone\n\n**Run, Don't Walk: Chasing Higher FLOPS for Faster Neural Networks**\n\n- 论文/Paper:https://arxiv.org/abs/2303.03667\n- 代码/Code:https://github.com/JierunChen/FasterNet\n\n[返回目录/back](#Contents)\n\n<a name=\"Dataset\"></a> \n\n## 数据集/Dataset\n\n**Spring: A High-Resolution High-Detail Dataset and Benchmark for Scene Flow, Optical Flow and Stereo**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.01943\n- 代码/Code: None\n\n**Human-Art: A Versatile Human-Centric Dataset Bridging Natural and Artificial Scenes**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.02760\n- 代码/Code: None\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"DiffusionModel\"></a> \n\n# Diffusion Model\n\n**Unifying Layout Generation with a Decoupled Diffusion Model**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.05049\n- 代码/Code: None\n\n**DR2: Diffusion-based Robust Degradation Remover for Blind Face Restoration**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.06885\n- 代码/Code: None\n\n**LayoutDM: Discrete Diffusion Model for Controllable Layout Generation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08137\n- 代码/Code: https://github.com/CyberAgentAILab/layout-dm\n\n**Controllable Mesh Generation Through Sparse Latent Point Diffusion Models**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.07938\n- 代码/Code: None\n\n**Decomposed Diffusion Models for High-Quality Video Generation**\n\n- 论文/Paper: 
http://arxiv.org/pdf/2303.08320\n- 代码/Code: None\n\n**Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.09119\n- 代码/Code: https://github.com/advocate99/diffgesture\n\n**Leapfrog Diffusion Model for Stochastic Trajectory Prediction**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.10895\n- 代码/Code: https://github.com/mediabrain-sjtu/led\n\n**Conditional Image-to-Video Generation with Latent Flow Diffusion Models**\n\n- 论文/Paper: https://arxiv.org/pdf/2303.13744.pdf\n- 代码/Code: https://github.com/nihaomiao/CVPR23_LFDM\n\n[返回目录/back](#Contents)\n\n<a name=\"NAS\"></a> \n\n## NAS\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"NeRF\"></a> \n\n# NeRF\n\n**Nerflets: Local Radiance Fields for Efficient Structure-Aware 3D Scene Representation from 2D Supervision**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.03361\n- 代码/Code: None\n\n**NeRFLiX: High-Quality Neural View Synthesis by Learning a Degradation-Driven Inter-viewpoint MiXer**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.06919\n- 代码/Code: None\n\n**PartNeRF: Generating Part-Aware Editable 3D Shapes without 3D Supervision**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.09554\n- 代码/Code: None\n\n**StyleRF: Zero-shot 3D Style Transfer of Neural Radiance Fields**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.10598\n- 代码/Code: None\n\n**SINE: Semantic-driven Image-based NeRF Editing with Prior-guided Editing Field**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13277\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"KnowledgeDistillation\"></a> \n\n## Knowledge Distillation\n\n**Generic-to-Specific Distillation of Masked Autoencoders**\n\n- 论文/Paper: https://arxiv.org/abs/2302.14771\n- 代码/Code: https://github.com/pengzhiliang/G2SD\n\n**X$^3$KD: Knowledge Distillation Across Modalities, Tasks and Stages for Multi-Camera 3D Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.02203\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"Multimodal\"></a> \n\n## 多模态 / 
Multimodal\n\n**PolyFormer: Referring Image Segmentation as Sequential Polygon Generation**\n\n- 论文/Paper: https://arxiv.org/abs/2302.14771\n- 代码/Code: None\n\n**Multimodal Industrial Anomaly Detection via Hybrid Fusion**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.00601\n- 代码/Code: https://github.com/nomewang/m3dm\n\n**Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.00462\n- 代码/Code: https://github.com/toytiny/cmflow\n\n**AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context Processing for Representation Learning of Giga-pixel Images**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.00865\n- 代码/Code: None\n\n**Multimodal Prompting with Missing Modalities for Visual Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.03369\n- 代码/Code: https://github.com/yilunlee/missing_aware_prompts\n\n**FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.02483\n- 代码/Code: None\n\n**Virtual Sparse Convolution for Multimodal 3D Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.02314\n- 代码/Code: https://github.com/hailanyi/virconv\n\n**LoGoNet: Towards Accurate 3D Object Detection with Local-to-Global Cross-Modal Fusion**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.03595\n- 代码/Code: https://github.com/sankin97/LoGoNet\n\n**Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.05952\n- 代码/Code: None\n\n**Align and Attend: Multimodal Summarization with Dual Contrastive Losses**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.07284\n- 代码/Code: None\n\n**Multimodal Feature Extraction and Fusion for Emotional Reaction Intensity Estimation and Expression Classification in Videos with Transformers**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.09164\n- 代码/Code: None\n\n**Self-Supervised Learning for Multimodal Non-Rigid 3D Shape Matching**\n\n- 论文/Paper: 
http://arxiv.org/pdf/2303.10971\n- 代码/Code: https://github.com/dongliangcao/Self-Supervised-Multimodal-Shape-Matching\n\n**Cross-Modal Implicit Relation Reasoning and Aligning for Text-to-Image Person Retrieval**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12501\n\n[返回目录/back](#Contents)\n\n<a name=\"ContrastiveLearning\"></a> \n\n## Contrastive Learning\n\n**Twin Contrastive Learning with Noisy Labels**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.06930\n- 代码/Code: https://github.com/Hzzone/TCL\n\n**TranSG: Transformer-Based Skeleton Graph Prototype Contrastive Learning with Structure-Trajectory Prompted Reconstruction for Person Re-Identification**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.06819\n- 代码/Code: https://github.com/kali-hac/transg\n\n**MobileVOS: Real-Time Video Object Segmentation Contrastive Learning meets Knowledge Distillation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.07815\n- 代码/Code: None\n\n**Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11302\n- 代码/Code: https://github.com/weixuansun/FNAC-AVL\n\n**Actionlet-Dependent Contrastive Learning for Unsupervised Skeleton-Based Action Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.10904\n- 代码/Code: None\n\n**Dynamic Graph Enhanced Contrastive Learning for Chest X-ray Report Generation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.10323\n- 代码/Code: https://github.com/mlii0117/dcl\n\n**CiCo: Domain-Aware Sign Language Retrieval via Cross-Lingual Contrastive Learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12793\n- 代码/Code: https://github.com/FangyunWei/SLRT\n\n**MaskCon: Masked Contrastive Learning for Coarse-Labelled Dataset**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12756\n- 代码/Code: https://github.com/MrChenFeng/MaskCon_CVPR2023\n\n[返回目录/back](#Contents)\n\n<a name=\"CapsuleNetwork\"></a> \n\n# 胶囊网络 / Capsule Network\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageClassification\"></a> \n\n# 图像分类 / Image 
Classification\n\n**Fine-Grained Classification with Noisy Labels**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.02404\n- 代码/Code: None\n\n**Task-specific Fine-tuning via Variational Information Bottleneck for Weakly-supervised Pathology Whole Slide Image Classification**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08446\n- 代码/Code: https://github.com/invoker-LL/WSI-finetuning\n\n**Boosting Verified Training for Robust Image Classifications via Abstraction**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11552\n- 代码/Code: https://github.com/zhangzhaodi233/abscert\n\n**Curvature-Balanced Feature Manifold Learning for Long-Tailed Classification**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12307\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"ObjectDetection\"></a> \n\n## 目标检测/Object Detection\n\n**Towards Domain Generalization for Multi-view 3D Object Detection in Bird-Eye-View**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.01686\n- 代码/Code: None\n\n**Virtual Sparse Convolution for Multimodal 3D Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.02314\n- 代码/Code: https://github.com/hailanyi/virconv\n\n**LoGoNet: Towards Accurate 3D Object Detection with Local-to-Global Cross-Modal Fusion**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.03595\n- 代码/Code: https://github.com/sankin97/LoGoNet\n\n**NIFF: Alleviating Forgetting in Generalized Few-Shot Object Detection via Neural Instance Feature Forging**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.04958\n- 代码/Code: None\n\n**Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.05892\n- 代码/Code: https://github.com/LutingWang/OADP.\n\n**Bi3D: Bi-domain Active Learning for Cross-domain 3D Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.05886\n- 代码/Code: https://github.com/PJLabADG/3DTrans\n\n**Uni3D: A Unified Baseline for Multi-dataset 3D Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.06880\n- 代码/Code: https://github.com/PJLab-ADG/3DTrans\n\n**Lite 
DETR: An Interleaved Multi-Scale Encoder for Efficient DETR**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.07335\n- 代码/Code: https://github.com/IDEA-Research/Lite-DETR\n\n**PiMAE: Point Cloud and Image Interactive Masked Autoencoders for 3D Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08129\n- 代码/Code: https://github.com/blvlab/pimae\n\n**Weakly Supervised Monocular 3D Object Detection using Multi-View Projection and Direction Consistency**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08686\n- 代码/Code: https://github.com/weakmono3d/weakmono3d\n\n**Active Teacher for Semi-Supervised Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08348\n- 代码/Code: https://github.com/hunterj-lin/activeteacher\n\n**MSF: Motion-guided Sequential Fusion for Efficient 3D Object Detection from Point Cloud Sequences**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08316\n- 代码/Code: https://github.com/skyhehe123/MSF\n\n**MixTeacher: Mining Promising Labels with Mixed Scale Teacher for Semi-Supervised Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.09061\n- 代码/Code: https://github.com/lliuz/MixTeacher\n\n**DiGeo: Discriminative Geometry-Aware Learning for Generalized Few-Shot Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.09674\n- 代码/Code: https://github.com/Phoenix-V/DiGeo\n\n**VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11301\n- 代码/Code: https://github.com/dvlab-research/VoxelNeXt\n\n**Benchmarking Robustness of 3D Object Detection to Common Corruptions in Autonomous Driving**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11040\n- 代码/Code: https://github.com/kkkcx/3D_Corruptions_AD\n\n**CAPE: Camera View Position Embedding for Multi-View 3D Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.10209\n- 代码/Code: https://github.com/PaddlePaddle/Paddle3D\n\n**STDLens: Model Hijacking-resilient Federated Learning for Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11511\n- 
代码/Code: https://github.com/git-disl/stdlens\n\n**MonoATT: Online Monocular 3D Object Detection with Adaptive Token Transformer**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13018\n- 代码/Code: None\n\n**Dense Distinct Query for End-to-End Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12776\n- 代码/Code: https://github.com/jshilong/ddq\n\n**OcTr: Octree-based Transformer for 3D Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12621\n- 代码/Code: None\n\n**Consistent-Teacher: Towards Reducing Inconsistent Pseudo-targets in Semi-supervised Object Detection**\n\n- 论文/Paper: https://arxiv.org/abs/2209.01589\n- 代码/Code: https://github.com/Adamdad/ConsistentTeacher\n\n[返回目录/back](#Contents)\n\n\n\n<a name=\"ObjectTracking\"></a> \n\n# 目标跟踪/Object Tracking\n\n**Referring Multi-Object Tracking**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.03366\n- 代码/Code: https://github.com/wudongming97/rmot\n\n**Unsupervised Contour Tracking of Live Cells by Mechanical and Cycle Consistency Losses**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08364\n- 代码/Code: https://github.com/junbongjang/contour-tracking\n\n**VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11301\n- 代码/Code: https://github.com/dvlab-research/VoxelNeXt\n\n**Visual Prompt Multi-Modal Tracking**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.10826\n- 代码/Code: https://github.com/jiawen-zhu/ViPT\n\n**MotionTrack: Learning Robust Short-term and Long-term Motions for Multi-Object Tracking**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.10404\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n# 3D Object Tracking\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"TrajectoryPrediction\"></a> \n\n## 轨迹预测/Trajectory Prediction\n\n**IPCC-TP: Utilizing Incremental Pearson Correlation Coefficient for Joint Multi-Agent Trajectory Prediction**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.00575\n- 代码/Code: None\n\n**Trajectory-Aware Body Interaction Transformer for Multi-Person Pose 
Forecasting**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.05095\n- 代码/Code: None\n\n**Leapfrog Diffusion Model for Stochastic Trajectory Prediction**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.10895\n- 代码/Code: https://github.com/mediabrain-sjtu/led\n\n[返回目录/back](#Contents)\n\n<a name=\"Segmentation\"></a> \n\n## 语义分割/Segmentation\n\n**Interactive Segmentation as Gaussian Process Classification**\n\n- 论文/Paper: http://arxiv.org/pdf/2302.14578\n- 代码/Code: None\n\n**Foundation Model Drives Weakly Incremental Learning for Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2302.14250\n- 代码/Code: None\n\n**PolyFormer: Referring Image Segmentation as Sequential Polygon Generation**\n\n- 论文/Paper: https://arxiv.org/abs/2302.14771\n- 代码/Code: None\n\n**ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.00246\n- 代码/Code: None\n\n**Self-Supervised Image-to-Point Distillation via Semantically Tolerant Contrastive Loss**\n\n- 论文/Paper: https://arxiv.org/abs/2301.05709\n- 代码/Code: None\n\n**Delivering Arbitrary-Modal Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.01480\n- 代码/Code: None\n\n**Conflict-Based Cross-View Consistency for Semi-Supervised Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.01276\n- 代码/Code: https://github.com/xiaoyao3302/CCVC\n\n**Token Contrast for Weakly-Supervised Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.01267\n- 代码/Code: https://github.com/rulixiang/toco \n\n**Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.04803\n- 代码/Code: None\n\n**MP-Former: Mask-Piloted Transformer for Image Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.07336\n- 代码/Code: https://github.com/IDEA-Research/MP-Former\n\n**Efficient Semantic Segmentation by Altering Resolutions for Compressed Videos**\n\n- 论文/Paper: 
http://arxiv.org/pdf/2303.07224\n- 代码/Code: https://github.com/thu-lyj-lab/ar-seg\n\n**InstMove: Instance Motion for Object-centric Video Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08132\n- 代码/Code: https://github.com/wjf5203/vnext\n\n**DynaMask: Dynamic Mask Selection for Instance Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.07868\n- 代码/Code: https://github.com/lslrh/dynamask\n\n**MobileVOS: Real-Time Video Object Segmentation Contrastive Learning meets Knowledge Distillation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.07815\n- 代码/Code: None\n\n**MSeg3D: Multi-modal 3D Semantic Segmentation for Autonomous Driving**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08600\n- 代码/Code: https://github.com/jialeli1/lidarseg3d\n\n**FastInst: A Simple Query-Based Model for Real-Time Instance Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08594\n- 代码/Code: https://github.com/junjiehe96/FastInst\n\n**SIM: Semantic-aware Instance Mask Generation for Box-Supervised Instance Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08578\n- 代码/Code: https://github.com/lslrh/sim\n\n**Unified Mask Embedding and Correspondence Learning for Self-Supervised Video Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.10100\n- 代码/Code: https://github.com/0liliulei/Mask-VOS\n\n**Generative Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11316\n- 代码/Code: https://github.com/fudan-zvg/gss\n\n**Reliability in Semantic Segmentation: Are We on the Right Track?**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11298\n- 代码/Code: https://github.com/naver/relis\n\n**Less is More: Reducing Task and Model Complexity for 3D Point Cloud Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11203\n- 代码/Code: https://github.com/l1997i/lim3d\n\n**Explicit Visual Prompting for Low-Level Structure Segmentations**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.10883\n- 代码/Code: https://github.com/nifangbaage/explict-visual-prompt\n\n**Two-shot Video Object 
Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12078\n- 代码/Code: https://github.com/yk-pku/Two-shot-Video-Object-Segmentation\n\n**Focused and Collaborative Feedback Integration for Interactive Image Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11880\n- 代码/Code: https://github.com/veizgyauzgyauz/fcfi\n\n**Orthogonal Annotation Benefits Barely-supervised Medical Image Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13090\n- 代码/Code: https://github.com/hengcai-nju/desco\n\n[返回目录/back](#Contents)\n\n\n\n<a name=\"WSSS\"></a>\n\n## 弱监督语义分割/Weakly Supervised Semantic Segmentation\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"MedicalImageSegmentation\"></a>\n\n# 医学图像分割/Medical Image Segmentation\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"VideoObjectSegmentation\"></a>\n\n# 视频目标分割/Video Object Segmentation\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"InteractiveVideoObjectSegmentation\"></a>\n\n# 交互式视频目标分割/Interactive Video Object Segmentation\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"VisualTransformer\"></a>\n\n# Visual Transformer\n\n**Mask3D: Pre-training 2D Vision Transformers by Learning Masked 3D Priors**\n\n- 论文/Paper: http://arxiv.org/pdf/2302.14746\n- 代码/Code: None\n\n**ProxyFormer: Proxy Alignment Assisted Point Cloud Completion with Missing Part Sensitive Transformer**\n\n- 论文/Paper: http://arxiv.org/pdf/2302.14435\n- 代码/Code: https://github.com/I2-Multimedia-Lab/ProxyFormer\n\n**Visual Atoms: Pre-training Vision Transformers with Sinusoidal Waves**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.01112\n- 代码/Code: None\n\n**MP-Former: Mask-Piloted Transformer for Image Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.07336\n- 代码/Code: https://github.com/IDEA-Research/MP-Former\n\n**TranSG: Transformer-Based Skeleton Graph Prototype Contrastive Learning with Structure-Trajectory Prompted Reconstruction for Person Re-Identification**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.06819\n- 代码/Code: 
https://github.com/kali-hac/transg\n\n**BiFormer: Vision Transformer with Bi-Level Routing Attention**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08810\n- 代码/Code: https://github.com/rayleizhu/biformer\n\n**Making Vision Transformers Efficient from A Token Sparsification View**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08685\n- 代码/Code: None\n\n**Rotation-Invariant Transformer for Point Cloud Matching**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08231\n- 代码/Code: None\n\n**Graph Transformer GANs for Graph-Constrained House Generation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08225\n- 代码/Code: None\n\n**PSVT: End-to-End Multi-person 3D Pose and Shape Estimation with Progressive Video Transformers**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.09187\n- 代码/Code: None\n\n**Multimodal Feature Extraction and Fusion for Emotional Reaction Intensity Estimation and Expression Classification in Videos with Transformers**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.09164\n- 代码/Code: None\n\n**Dual-path Adaptation from Image to Video Transformers**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.09857\n- 代码/Code: https://github.com/park-jungin/DualPath\n\n**Patch-Mix Transformer for Unsupervised Domain Adaptation: A Game Perspective**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13434\n- 代码/Code: None\n\n**POTTER: Pooling Attention Transformer for Efficient Human Mesh Recovery**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13357\n- 代码/Code: None\n\n**MonoATT: Online Monocular 3D Object Detection with Adaptive Token Transformer**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13018\n- 代码/Code: None\n\n**MELTR: Meta Loss Transformer for Learning to Fine-tune Video Foundation Models**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13009\n- 代码/Code: https://github.com/mlvlab/MELTR\n\n**Spherical Transformer for LiDAR-based 3D Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12766\n- 代码/Code: https://github.com/dvlab-research/sphereformer\n\n**OcTr: Octree-based Transformer for 3D Object Detection**\n\n- 
论文/Paper: http://arxiv.org/pdf/2303.12621\n- 代码/Code: None\n\n**Text with Knowledge Graph Augmented Transformer for Video Captioning**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12423\n- 代码/Code: None\n\n**MAGVLT: Masked Generative Vision-and-Language Transformer**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12208\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"DepthEstimation\"></a>\n\n## 深度估计/Depth Estimation\n\n**Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation**\n\n- 论文/Paper: https://arxiv.org/abs/2211.13202\n- 代码/Code:https://github.com/noahzn/Lite-Mono\n\n**Fully Self-Supervised Depth Estimation from Defocus Clue**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.10752\n- 代码/Code: https://github.com/ehzoahis/dered\n\n[返回目录/back](#Contents)\n\n<a name=\"Retrieval\"></a>\n\n# 图像、视频检索 / Image Retrieval/Video retrieval\n\n**Data-Free Sketch-Based Image Retrieval**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.07775\n- 代码/Code: https://github.com/abhrac/data-free-sbir\n\n**CLIP for All Things Zero-Shot Sketch-Based Image Retrieval, Fine-Grained or Not**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13440\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"SuperResolution\"></a>\n\n## 超分辨率/Super Resolution\n\n**OPE-SR: Orthogonal Position Encoding for Designing a Parameter-free Upsampling Module in Arbitrary-scale Image Super-Resolution**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.01091\n- 代码/Code: None\n\n**Super-Resolution Neural Operator**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.02584\n- 代码/Code: https://github.com/2y7c3/super-resolution-neural-operator\n\n**Local Implicit Normalizing Flow for Arbitrary-Scale Image Super-Resolution**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.05156\n- 代码/Code: None\n\n**Towards High-Quality and Efficient Video Super-Resolution via Spatial-Temporal Data Overfitting**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08331\n- 代码/Code: 
https://github.com/coulsonlee/STDO-CVPR2023.git\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageDenoising\"></a>\n\n## 图像去噪/Image Denoising - 1 篇 \n\n**Masked Image Training for Generalizable Deep Image Denoising**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13132\n- 代码/Code: https://github.com/haoyuc/maskeddenoising\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageEditing\"></a>\n\n# 图像编辑/Image Editing\n\n**CoralStyleCLIP: Co-optimized Region and Layer Selection for Image Editing**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.05031\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageCompression\"></a>\n\n# 图像压缩/Image Compression\n\n**Context-Based Trit-Plane Coding for Progressive Image Compression**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.05715\n- 代码/Code: https://github.com/seungminjeon-github/ctc\n\n[返回目录/back](#Contents)\n\n<a name=\"FaceRecognition\"></a>\n\n# 人脸识别/Face Recognition\n\n**Attribute-preserving Face Dataset Anonymization via Latent Code Optimization**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11296\n- 代码/Code: https://github.com/chi0tzp/falco\n\n**Graphics Capsule: Learning Hierarchical 3D Face Representations from 2D Images**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.10896\n- 代码/Code: None\n\n**Sibling-Attack: Rethinking Transferable Adversarial Attacks against Face Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12512\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"FaceDetection\"></a>\n\n# 人脸检测/Face Detection\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"FaceAnti-Spoofing\"></a>\n\n# 人脸活体检测/Face Anti-Spoofing\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"FaceReconstruction\"></a>\n\n## 人脸重建/Face Reconstruction\n\n**DR2: Diffusion-based Robust Degradation Remover for Blind Face Restoration**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.06885\n- 代码/Code: None\n\n**A Hierarchical Representation Network for Accurate and Detailed Face Reconstruction from In-The-Wild Images**\n\n- 论文/Paper: https://arxiv.org/abs/2302.14434\n- 代码/Code: 
https://younglbw.github.io/HRN-homepage/\n\n[返回目录/back](#Contents)\n\n<a name=\"VideoActionDetection\"></a>\n\n# 视频动作检测/Video Action Detection\n\n**TriDet: Temporal Action Detection with Relative Boundary Modeling**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.07347\n- 代码/Code: https://github.com/sssste/tridet\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"SignLanguageTranslation\"></a>\n\n# 手语翻译/Sign Language Translation\n\n**Continuous Sign Language Recognition with Correlation Network**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.03202\n- 代码/Code: None\n\n**Natural Language-Assisted Sign Language Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12080\n- 代码/Code: https://github.com/FangyunWei/SLRT\n\n[返回目录/back](#Contents)\n\n<a name=\"PersonRe-identification\"></a>\n\n# 行人重识别/Person Re-identification\n\n**TranSG: Transformer-Based Skeleton Graph Prototype Contrastive Learning with Structure-Trajectory Prompted Reconstruction for Person Re-Identification**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.06819\n- 代码/Code: https://github.com/kali-hac/transg\n\n**Diverse Embedding Expansion Network and Low-Light Cross-Modality Benchmark for Visible-Infrared Person Re-identification**\n\n- 论文/Paper: https://arxiv.org/abs/2303.14481\n- 代码/Code: https://github.com/ZYK100/LLCM\n\n[返回目录/back](#Contents)\n\n<a name=\"TalkingFace\"></a>\n\n## Talking Face\n\n**SadTalker： Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation**\n\n- 论文/Paper: https://arxiv.org/abs/2211.12194\n- 代码/Code: https://github.com/Winfredy/SadTalker\n\n[返回目录/back](#Contents)\n\n<a name=\"HumanPoseEstimation\"></a>\n\n# 人体姿态估计/Human Pose Estimation\n\n**PoseExaminer: Automated Testing of Out-of-Distribution Robustness in Human Pose and Shape Estimation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.07337\n- 代码/Code: https://github.com/qihao067/poseexaminer\n\n**Mutual Information-Based Temporal Difference Learning for Human Pose Estimation in Video**\n\n- 论文/Paper: 
http://arxiv.org/pdf/2303.08475\n- 代码/Code: None\n\n**SLOPER4D: A Scene-Aware Dataset for Global 4D Human Pose Estimation in Urban Environments**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.09095\n- 代码/Code: None\n\n**Self-Correctable and Adaptable Inference for Generalizable Human Pose Estimation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11180\n- 代码/Code: None\n\n**3D Human Mesh Estimation from Virtual Markers**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11726\n- 代码/Code: https://github.com/ShirleyMaxx/VirtualMarker\n\n**Rigidity-Aware Detection for 6D Object Pose Estimation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12396\n- 代码/Code: None\n\n**Object Pose Estimation with Statistical Guarantees: Conformal Keypoint Detection and Geometric Uncertainty Propagation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12246\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"GAN\"></a>\n\n# GAN\n\n**Improving GAN Training via Feature Space Shrinkage**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.01559\n- 代码/Code: https://github.com/WentianZhang-ML/AdaptiveMix\n\n**Scaling up GANs for Text-to-Image Synthesis**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.05511\n- 代码/Code: None\n\n**Graph Transformer GANs for Graph-Constrained House Generation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08225\n- 代码/Code: None\n\n**Cross-GAN Auditing: Unsupervised Identification of Attribute Level Similarities and Differences between Pretrained Generative Models**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.10774\n- 代码/Code: https://github.com/mattolson93/cross_gan_auditing\n\n[返回目录/back](#Contents)\n\n<a name=\"AgeEstimation\"></a>\n\n# 人脸年龄估计/Age Estimation\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"FacialExpressionRecognition\"></a>\n\n# 人脸表情识别/Facial Expression Recognition\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"HandPoseEstimation\"></a>\n\n## 手势姿态估计（重建）/Hand Pose Estimation (Hand Mesh Recovery)\n\n**Im2Hands: Learning Attentive Implicit Representation of Interacting Two-Hand 
Shapes**\n\n- 论文/Paper: http://arxiv.org/pdf/2302.14348\n- 代码/Code: https://github.com/jyunlee/Im2Hands\n\n**ACR: Attention Collaboration-based Regressor for Arbitrary Two-Hand Reconstruction**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.05938\n- 代码/Code: https://github.com/zhengdiyu/arbitrary-hands-3d-reconstruction\n\n[返回目录/back](#Contents)\n\n<a name=\"3DReconstruction\"></a>\n\n## 3D Reconstruction\n\n**Unsupervised 3D Shape Reconstruction by Part Retrieval and Assembly**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.01999\n- 代码/Code: None\n\n**MobileBrick: Building LEGO for 3D Reconstruction on Mobile Devices**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.01932\n- 代码/Code: None\n\n**HairStep: Transfer Synthetic to Real Using Strand and Depth Maps for Single-View 3D Hair Modeling**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.02700\n- 代码/Code: None\n\n**NeuDA: Neural Deformable Anchor for High-Fidelity Implicit Surface Reconstruction**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.02375\n- 代码/Code: None\n\n**Structural Multiplane Image: Bridging Neural View Synthesis and 3D Reconstruction**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.05937\n- 代码/Code: None\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"FrameInterpolation\"></a>\n\n## 视频插帧/Frame Interpolation\n\n**Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.00440\n- 代码/Code: https://github.com/MCG-NJU/EMA-VFI\n\n[返回目录/back](#Contents)\n\n<a name=\"3DPointCloud\"></a>\n\n## 3D点云/3D point cloud\n\n**ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.00246\n- 代码/Code: None\n\n**Self-Supervised Image-to-Point Distillation via Semantically Tolerant Contrastive Loss**\n\n- 论文/Paper: https://arxiv.org/abs/2301.05709\n- 代码/Code: None\n\n**Neural Intrinsic Embedding for Non-rigid Point Cloud Matching**\n\n- 论文/Paper: 
http://arxiv.org/pdf/2303.01038\n- 代码/Code: None\n\n**ACL-SPC: Adaptive Closed-Loop system for Self-Supervised Point Cloud Completion**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.01979\n- 代码/Code: https://github.com/Sangminhong/ACL-SPC_PyTorch\n\n**PointCert: Point Cloud Classification with Deterministic Certified Robustness Guarantees**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.01959\n- 代码/Code: None\n\n**SCPNet: Semantic Scene Completion on Point Cloud**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.06884\n- 代码/Code: None\n\n**Parameter is Not All You Need: Starting from Non-Parametric Networks for 3D Point Cloud Analysis**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08134\n- 代码/Code: https://github.com/zrrskywalker/point-nn\n\n**PiMAE: Point Cloud and Image Interactive Masked Autoencoders for 3D Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08129\n- 代码/Code: https://github.com/blvlab/pimae\n\n**Frequency-Modulated Point Cloud Rendering with Easy Editing**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.07596\n- 代码/Code: None\n\n**MSF: Motion-guided Sequential Fusion for Efficient 3D Object Detection from Point Cloud Sequences**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08316\n- 代码/Code: https://github.com/skyhehe123/MSF\n\n**Rotation-Invariant Transformer for Point Cloud Matching**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08231\n- 代码/Code: None\n\n**Deep Graph-based Spatial Consistency for Robust Non-rigid Point Cloud Registration**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.09950\n- 代码/Code: https://github.com/qinzheng93/GraphSCNet\n\n**Less is More: Reducing Task and Model Complexity for 3D Point Cloud Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11203\n- 代码/Code: https://github.com/l1997i/lim3d\n\n**Novel Class Discovery for 3D Point Cloud Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11610\n- 代码/Code: https://github.com/luigiriz/nops\n\n**Unsupervised Deep Probabilistic Approach for Partial Point Cloud 
Registration**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13290\n- 代码/Code: https://github.com/gfmei/udpreg\n\n[返回目录/back](#Contents)\n\n<a name=\"AnomalyDetection\"></a>\n\n# Anomaly Detection\n\n**Diversity-Measurable Anomaly Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.05047\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"Other\"></a>\n\n## 其他/Other\n\n**PA&DA: Jointly Sampling PAth and DAta for Consistent NAS**\n\n- 论文/Paper: http://arxiv.org/pdf/2302.14772\n- 代码/Code: https://github.com/ShunLu91/PA-DA\n\n**Generic-to-Specific Distillation of Masked Autoencoders**\n\n- 论文/Paper: http://arxiv.org/pdf/2302.14771\n- 代码/Code: https://github.com/pengzhiliang/G2SD\n\n**Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger**\n\n- 论文/Paper: http://arxiv.org/pdf/2302.14677\n- 代码/Code: None\n\n**Turning a CLIP Model into a Scene Text Detector**\n\n- 论文/Paper: http://arxiv.org/pdf/2302.14338\n- 代码/Code: None\n\n**Adversarial Attack with Raindrops**\n\n- 论文/Paper: http://arxiv.org/pdf/2302.14267\n- 代码/Code: None\n\n**Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning**\n\n- 论文/Paper: http://arxiv.org/pdf/2302.14115\n- 代码/Code: None\n\n**DART: Diversify-Aggregate-Repeat Training Improves Generalization of Neural Networks**\n\n- 论文/Paper: http://arxiv.org/pdf/2302.14685\n- 代码/Code: None\n\n**Neural Video Compression with Diverse Contexts**\n\n- 论文/Paper: http://arxiv.org/pdf/2302.14402\n- 代码/Code: https://github.com/microsoft/DCVC\n\n**Learning to Retain while Acquiring: Combating Distribution-Shift in Adversarial Data-Free Knowledge Distillation**\n\n- 论文/Paper: http://arxiv.org/pdf/2302.14290\n- 代码/Code: None\n\n**Efficient and Explicit Modelling of Image Hierarchies for Image Restoration**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.00748\n- 代码/Code: https://github.com/ofsoundof/grl-image-restoration\n\n**Quality-aware Pre-trained Models for Blind Image Quality Assessment**\n\n- 论文/Paper: 
http://arxiv.org/pdf/2303.00521\n- 代码/Code: None\n\n**Renderable Neural Radiance Map for Visual Navigation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.00304\n- 代码/Code: None\n\n**Single Image Backdoor Inversion via Robust Smoothed Classifiers**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.00215\n- 代码/Code: https://github.com/locuslab/smoothinv\n\n**Towards Generalisable Video Moment Retrieval: Visual-Dynamic Injection to Image-Text Pre-Training**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.00040\n- 代码/Code: None\n\n**Zero-Shot Text-to-Parameter Translation for Game Character Auto-Creation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.01311\n- 代码/Code: None\n\n**MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.01239\n- 代码/Code: https://github.com/jingjing12110/MixPHM\n\n**Disentangling Orthogonal Planes for Indoor Panoramic Room Layout Estimation with Cross-Scale Distortion Awareness**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.00971\n- 代码/Code: https://github.com/zhijieshen-bjtu/dopnet\n\n**Neuro-Modulated Hebbian Learning for Fully Test-Time Adaptation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.00914\n- 代码/Code: None\n\n**Towards Trustable Skin Cancer Diagnosis via Rewriting Model's Decision**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.00885\n- 代码/Code: None\n\n**Geometric Visual Similarity Learning in 3D Medical Image Self-supervised Pre-training**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.00874\n- 代码/Code: https://github.com/yutinghe-list/gvsl\n\n**Demystifying Causal Features on Adversarial Examples and Causal Inoculation for Robust Network by Adversarial Instrumental Variable Regression**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.01052\n- 代码/Code: None\n\n**UniDexGrasp: Universal Robotic Dexterous Grasping via Learning Diverse Proposal Generation and Goal-Conditioned Policy**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.00938\n- 代码/Code: None\n\n**Prompt, Generate, then Cache: Cascade of 
Foundation Models makes Strong Few-shot Learners**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.02151\n- 代码/Code: https://github.com/ZrrSkywalker/CaFo\n\n**Zero-shot Object Counting**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.02001\n- 代码/Code: https://github.com/cvlab-stonybrook/zero-shot-counting\n\n**EcoTTA: Memory-Efficient Continual Test-time Adaptation via Self-distilled Regularization**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.01904\n- 代码/Code: None\n\n**Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.01903\n- 代码/Code: https://github.com/MILVLG/prophet\n\n**Intrinsic Physical Concepts Discovery with Object-Centric Predictive Models**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.01869\n- 代码/Code: None\n\n**Visual Exemplar Driven Task-Prompting for Unified Perception in Autonomous Driving**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.01788\n- 代码/Code: None\n\n**Diverse 3D Hand Gesture Prediction from Body Dynamics by Bilateral Hand Disentanglement**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.01765\n- 代码/Code: None\n\n**Learning Common Rationale to Improve Self-Supervised Representation for Fine-Grained Visual Recognition Problems**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.01669\n- 代码/Code: None\n\n**Hierarchical discriminative learning improves visual representations of biomedical microscopy**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.01605\n- 代码/Code: None\n\n**A Meta-Learning Approach to Predicting Performance and Data Requirements**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.01598\n- 代码/Code: None\n\n**DejaVu: Conditional Regenerative Learning to Enhance Dense Prediction**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.01573\n- 代码/Code: None\n\n**Detecting Human-Object Contact in Images**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.03373\n- 代码/Code: None\n\n**MACARONS: Mapping And Coverage Anticipation with RGB Online Self-Supervision**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.03315\n- 
代码/Code: None\n\n**Masked Images Are Counterfactual Samples for Robust Fine-tuning**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.03052\n- 代码/Code: None\n\n**UniHCP: A Unified Model for Human-Centric Perceptions**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.02936\n- 代码/Code: None\n\n**PyramidFlow: High-Resolution Defect Contrastive Localization using Pyramid Normalizing Flow**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.02595\n- 代码/Code: None\n\n**CapDet: Unifying Dense Captioning and Open-World Detection Pretraining**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.02489\n- 代码/Code: None\n\n**DistilPose: Tokenized Pose Regression with Heatmap Distillation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.02455\n- 代码/Code: None\n\n**DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural Network**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.02165\n- 代码/Code: https://github.com/alibaba/lightweight-neural-architecture-search\n\n**Gradient Norm Aware Minimization Seeks First-Order Flatness and Improves Generalization**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.03108\n- 代码/Code: None\n\n**Meta-Explore: Exploratory Hierarchical Vision-and-Language Navigation Using Scene Object Spectrum Grounding**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.04077\n- 代码/Code: None\n\n**Guiding Pseudo-labels with Uncertainty Estimation for Test-Time Adaptation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.03770\n- 代码/Code: None\n\n**Learning Discriminative Representations for Skeleton Based Action Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.03729\n- 代码/Code: None\n\n**MOSO: Decomposing MOtion, Scene and Object for Video Prediction**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.03684\n- 代码/Code: None\n\n**RM-Depth: Unsupervised Learning of Recurrent Monocular Depth in Dynamic Scenes**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.04456\n- 代码/Code: https://github.com/twhui/rm-depth\n\n**A Light Weight Model for Active Speaker Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.04439\n- 代码/Code: 
https://github.com/junhua-liao/light-asd\n\n**Where We Are and What We're Looking At: Query Based Worldwide Image Geo-localization Using Hierarchies and Scenes**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.04249\n- 代码/Code: None\n\n**CUDA: Convolution-based Unlearnable Datasets**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.04278\n- 代码/Code: None\n\n**Masked Image Modeling with Local Multi-Scale Reconstruction**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.05251\n- 代码/Code: None\n\n**Revisiting Rotation Averaging: Uncertainties and Robust Losses**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.05195\n- 代码/Code: https://github.com/zhangganlin/globalsfmpy\n\n**Text-Visual Prompting for Efficient 2D Temporal Video Grounding**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.04995\n- 代码/Code: None\n\n**MVImgNet: A Large-scale Dataset of Multi-view Images**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.06042\n- 代码/Code: None\n\n**Neuron Structure Modeling for Generalizable Remote Physiological Measurement**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.05955\n- 代码/Code: https://github.com/lupaopao/nest\n\n**3D Cinemagraphy from a Single Image**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.05724\n- 代码/Code: None\n\n**HumanBench: Towards General Human-centric Perception with Projector Assisted Pretraining**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.05675\n- 代码/Code: https://github.com/OpenGVLab/HumanBench\n\n**TrojDiff: Trojan Attacks on Diffusion Models with Diverse Targets**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.05762\n- 代码/Code: https://github.com/chenweixin107/trojdiff\n\n**Modality-Agnostic Debiasing for Single Domain Generalization**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.07123\n- 代码/Code: None\n\n**Upcycling Models under Domain and Category Shift**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.07110\n- 代码/Code: https://github.com/ispc-lab/glc\n\n**Prototype-based Embedding Network for Scene Graph Generation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.07096\n- 代码/Code: None\n\n**MSINet: Twins Contrastive 
Search of Multi-Scale Interaction for Object ReID**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.07065\n- 代码/Code: https://github.com/vimar-gu/MSINet\n\n**Improving Table Structure Recognition with Visual-Alignment Sequential Coordinate Modeling**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.06949\n- 代码/Code: None\n\n**Progressive Open Space Expansion for Open-Set Model Attribution**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.06877\n- 代码/Code: https://github.com/tianyunyoung/pose\n\n**Interventional Bag Multi-Instance Learning On Whole-Slide Pathological Images**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.06873\n- 代码/Code: https://github.com/HHHedo/IBMIL\n\n**Three Guidelines You Should Know for Universally Slimmable Self-Supervised Learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.06870\n- 代码/Code: https://github.com/megvii-research/US3L-CVPR2023\n\n**Adaptive Data-Free Quantization**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.06869\n- 代码/Code: https://github.com/hfutqian/adadfq\n\n**Learning Distortion Invariant Representation for Image Restoration from A Causality Perspective**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.06859\n- 代码/Code: https://github.com/lixinustc/casual-ir-dil\n\n**Dynamic Neural Network for Multi-Task Learning Searching across Diverse Network Topologies**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.06856\n- 代码/Code: None\n\n**Universal Instance Perception as Object Discovery and Retrieval**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.06674\n- 代码/Code: https://github.com/MasterBin-IIAU/UNINEXT\n\n**Iterative Geometry Encoding Volume for Stereo Matching**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.06615\n- 代码/Code: https://github.com/gangweix/igev\n\n**Regularized Vector Quantization for Tokenized Image Synthesis**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.06424\n- 代码/Code: None\n\n**Semi-supervised Hand Appearance Recovery via Structure Disentanglement and Dual Adversarial Discrimination**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.06380\n- 代码/Code: 
None\n\n**CASP-Net: Rethinking Video Saliency Prediction from an Audio-Visual Consistency Perceptual Perspective**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.06357\n- 代码/Code: None\n\n**DeltaEdit: Exploring Text-free Training for Text-Driven Image Manipulation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.06285\n- 代码/Code: https://github.com/yueming6568/deltaedit\n\n**Diversity-Aware Meta Visual Prompting**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08138\n- 代码/Code: https://github.com/shikiw/dam-vp\n\n**Blind Video Deflickering by Neural Filtering with a Flawed Atlas**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08120\n- 代码/Code: https://github.com/chenyanglei/all-in-one-deflicker\n\n**Non-Contrastive Unsupervised Learning of Physiological Signals from Video**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.07944\n- 代码/Code: None\n\n**DAA: A Delta Age AdaIN operation for age estimation via binary code transformer**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.07929\n- 代码/Code: None\n\n**You Can Ground Earlier than See: An Effective and Efficient Pipeline for Temporal Sentence Grounding in Compressed Videos**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.07863\n- 代码/Code: None\n\n**NEF: Neural Edge Fields for 3D Parametric Curve Reconstruction from Multi-view Images**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.07653\n- 代码/Code: None\n\n**I$^2$-SDF: Intrinsic Indoor Scene Reconstruction and Editing via Raytracing in Neural SDFs**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.07634\n- 代码/Code: None\n\n**V2V4Real: A Real-world Large-scale Dataset for Vehicle-to-Vehicle Cooperative Perception**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.07601\n- 代码/Code: https://github.com/ucla-mobility/V2V4Real\n\n**Bi-directional Distribution Alignment for Transductive Zero-Shot Learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08698\n- 代码/Code: https://github.com/Zhicaiwww/Bi-VAEGAN\n\n**Skinned Motion Retargeting with Residual Perception of Motion Semantics & Geometry**\n\n- 论文/Paper: 
http://arxiv.org/pdf/2303.08658\n- 代码/Code: https://github.com/Kebii/R2ET\n\n**Lana: A Language-Capable Navigator for Instruction Following and Generation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08409\n- 代码/Code: https://github.com/wxh1996/lana-vln\n\n**Rethinking Optical Flow from Geometric Matching Consistent Perspective**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08384\n- 代码/Code: https://github.com/dqiaole/matchflow\n\n**Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.08536\n- 代码/Code: https://github.com/joannahong/av-relscore\n\n**Hubs and Hyperspheres: Reducing Hubness and Improving Transductive Few-shot Learning with Hyperspherical Embeddings**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.09352\n- 代码/Code: https://github.com/uitml/nohub\n\n**A New Benchmark: On the Utility of Synthetic Data with Blender for Bare Supervised Learning and Downstream Domain Adaptation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.09165\n- 代码/Code: https://github.com/huitangtang/on_the_utility_of_synthetic_data\n\n**Achieving a Better Stability-Plasticity Trade-off via Auxiliary Networks in Continual Learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.09483\n- 代码/Code: https://github.com/kim-sanghwan/ancl\n\n**TBP-Former: Learning Temporal Bird's-Eye-View Pyramid for Joint Perception and Prediction in Vision-Centric Autonomous Driving**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.09998\n- 代码/Code: None\n\n**Adversarial Counterfactual Visual Explanations**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.09962\n- 代码/Code: None\n\n**A Dynamic Multi-Scale Voxel Flow Network for Video Prediction**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.09875\n- 代码/Code: None\n\n**TeSLA: Test-Time Self-Learning With Automatic Adversarial Augmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.09870\n- 代码/Code: None\n\n**Video Dehazing via a Multi-Range Temporal Alignment Network with Physical Prior**\n\n- 
论文/Paper: http://arxiv.org/pdf/2303.09757\n- 代码/Code: https://github.com/jiaqixuac/MAP-Net\n\n**LOCATE: Localize and Transfer Object Parts for Weakly Supervised Affordance Grounding**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.09665\n- 代码/Code: None\n\n**On the Effects of Self-supervision and Contrastive Alignment in Deep Multi-view Clustering**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.09877\n- 代码/Code: https://github.com/DanielTrosten/DeepMVC\n\n**3D Concept Learning and Reasoning from Multi-View Images**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11327\n- 代码/Code: None\n\n**Picture that Sketch: Photorealistic Image Generation from Abstract Sketches**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11162\n- 代码/Code: None\n\n**Coreset Sampling from Open-Set for Fine-Grained Self-Supervised Learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11101\n- 代码/Code: None\n\n**Boosting Semi-Supervised Learning by Exploiting All Unlabeled Data**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11066\n- 代码/Code: https://github.com/megvii-research/FullMatch\n\n**Feature Alignment and Uniformity for Test Time Adaptation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.10902\n- 代码/Code: None\n\n**EqMotion: Equivariant Multi-agent Motion Prediction with Invariant Interaction Reasoning**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.10876\n- 代码/Code: https://github.com/mediabrain-sjtu/eqmotion\n\n**Trainable Projected Gradient Method for Robust Fine-tuning**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.10720\n- 代码/Code: https://github.com/PotatoTian/TPGM\n\n**Partial Network Cloning**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.10597\n- 代码/Code: https://github.com/jngwenye/pncloning\n\n**Divide and Conquer: Answering Questions with Object Factorization and Compositional Reasoning**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.10482\n- 代码/Code: https://github.com/szzexpoi/poem\n\n**Uncertainty-Aware Optimal Transport for Semantically Coherent Out-of-Distribution Detection**\n\n- 论文/Paper: 
http://arxiv.org/pdf/2303.10449\n- 代码/Code: https://github.com/lufan31/et-ood\n\n**DeAR: Debiasing Vision-Language Models with Additive Residuals**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.10431\n- 代码/Code: None\n\n**3DQD: Generalized Deep 3D Shape Prior via Part-Discretized Diffusion Process**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.10406\n- 代码/Code: https://github.com/colorful-liyu/3dqd\n\n**Sharpness-Aware Gradient Matching for Domain Generalization**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.10353\n- 代码/Code: https://github.com/wang-pengfei/sagm\n\n**Extracting Class Activation Maps from Non-Discriminative Features as well**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.10334\n- 代码/Code: https://github.com/zhaozhengchen/lpcam\n\n**Make Landscape Flatter in Differentially Private Federated Learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11242\n- 代码/Code: None\n\n**Computationally Budgeted Continual Learning: What Does Matter?**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11165\n- 代码/Code: https://github.com/drimpossible/BudgetCL\n\n**TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11135\n- 代码/Code: https://github.com/ziquanliu/cvpr2023-twins\n\n**Efficient Map Sparsification Based on 2D and 3D Discretized Grids**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.10882\n- 代码/Code: https://github.com/fishmarch/SLAM_Map_Compression\n\n**ProphNet: Efficient Agent-Centric Motion Forecasting with Anchor-Informed Proposals**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12071\n- 代码/Code: None\n\n**Joint Visual Grounding and Tracking with Natural Language Specification**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12027\n- 代码/Code: https://github.com/lizhou-cs/JointNLT\n\n**Automatic evaluation of herding behavior in towed fishing gear using end-to-end training of CNN and attention-based networks**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12016\n- 代码/Code: None\n\n**Learning A Sparse 
Transformer Network for Effective Image Deraining**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11950\n- 代码/Code: https://github.com/cschenxiang/drsformer\n\n**Context De-confounded Emotion Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11921\n- 代码/Code: None\n\n**Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11906\n- 代码/Code: None\n\n**The Treasure Beneath Multiple Annotations: An Uncertainty-aware Edge Detector**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11828\n- 代码/Code: https://github.com/zhoucx117/uaed\n\n**Propagate And Calibrate: Real-time Passive Non-line-of-sight Tracking**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11791\n- 代码/Code: None\n\n**Detecting Everything in the Open World: Towards Universal Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11749\n- 代码/Code: None\n\n**Data-efficient Large Scale Place Recognition with Graded Similarity Supervision**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11739\n- 代码/Code: https://github.com/marialeyvallina/generalized_contrastive_loss\n\n**Abstract Visual Reasoning: An Algebraic Approach for Solving Raven's Progressive Matrices**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11730\n- 代码/Code: https://github.com/xu-jingyi/algebraicmr\n\n**Learning a 3D Morphable Face Reflectance Model from Low-cost Data**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11686\n- 代码/Code: https://github.com/yxuhan/reflectancemm\n\n**Full or Weak annotations? 
An adaptive strategy for budget-constrained annotation campaigns**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11678\n- 代码/Code: None\n\n**ALOFT: A Lightweight MLP-like Architecture with Dynamic Low-frequency Transform for Domain Generalization**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11674\n- 代码/Code: https://github.com/lingeringlight/aloft\n\n**Visibility Constrained Wide-band Illumination Spectrum Design for Seeing-in-the-Dark**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11642\n- 代码/Code: https://github.com/MyNiuuu/VCSD\n\n**Human Pose as Compositional Tokens**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11638\n- 代码/Code: https://github.com/Gengzigang/PCT\n\n**Equiangular Basis Vectors**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11637\n- 代码/Code: https://github.com/njust-vipgroup/equiangular-basis-vectors\n\n**HRDFuse: Monocular 360°Depth Estimation by Collaboratively Learning Holistic-with-Regional Depth Distributions**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11616\n- 代码/Code: None\n\n**Boundary Unlearning**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11570\n- 代码/Code: None\n\n**One-to-Few Label Assignment for End-to-End Dense Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11567\n- 代码/Code: https://github.com/strongwolf/o2f\n\n**Fix the Noise: Disentangling Source Feature for Controllable Domain Translation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11545\n- 代码/Code: https://github.com/LeeDongYeun/FixNoise\n\n**PRISE: Demystifying Deep Lucas-Kanade with Strongly Star-Convex Constraints for Multimodel Image Alignment**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11526\n- 代码/Code: https://github.com/Zhang-VISLab\n\n**Sketch2Saliency: Learning to Detect Salient Objects from Human Drawings**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11502\n- 代码/Code: None\n\n**Polynomial Implicit Neural Representations For Large Diverse Datasets**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.11424\n- 代码/Code: https://github.com/rajhans0/poly_inr\n\n**Persistent Nature: A 
Generative Model of Unbounded 3D Worlds**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13515\n- 代码/Code: None\n\n**MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13510\n- 代码/Code: https://github.com/smartbot-pjlab/mv-jar\n\n**NS3D: Neuro-Symbolic Grounding of 3D Objects and Relations**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13483\n- 代码/Code: None\n\n**Egocentric Audio-Visual Object Localization**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13471\n- 代码/Code: https://github.com/wikichao/ego-av-loc\n\n**Improving Generalization with Domain Convex Game**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13297\n- 代码/Code: None\n\n**Visual-Language Prompt Tuning with Knowledge-guided Context Optimization**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13283\n- 代码/Code: https://github.com/htyao89/kgcoop\n\n**TAPS3D: Text-Guided 3D Textured Shape Generation from Pseudo Supervision**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13273\n- 代码/Code: https://github.com/plusmultiply/taps3d\n\n**A Bag-of-Prototypes Representation for Dataset-Level Applications**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13251\n- 代码/Code: None\n\n**CrOC: Cross-View Online Clustering for Dense Visual Representation Learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13245\n- 代码/Code: https://github.com/stegmuel/croc\n\n**Transforming Radiance Field with Lipschitz Network for Photorealistic 3D Scene Stylization**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13232\n- 代码/Code: None\n\n**Exploring Structured Semantic Prior for Multi Label Recognition with Incomplete Labels**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13223\n- 代码/Code: https://github.com/jameslahm/SCPNet\n\n**Marching-Primitives: Shape Abstraction from Signed Distance Function**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13190\n- 代码/Code: https://github.com/ChirikjianLab/Marching-Primitives.git\n\n**CP$^3$: Channel Pruning Plug-in for Point-based Networks**\n\n- 论文/Paper: 
http://arxiv.org/pdf/2303.13097\n- 代码/Code: None\n\n**Box-Level Active Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13089\n- 代码/Code: https://github.com/lyumengyao/blad\n\n**Robust Generalization against Photon-Limited Corruptions via Worst-Case Sharpness Minimization**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13087\n- 代码/Code: https://github.com/zhuohuangai/sharpdro\n\n**CORA: Adapting CLIP for Open-Vocabulary Detection with Region Prompting and Anchor Pre-Matching**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13076\n- 代码/Code: https://github.com/tgxs002/cora\n\n**PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360$^{\\circ}$**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13071\n- 代码/Code: None\n\n**Human Guided Ground-truth Generation for Realistic Image Super-resolution**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13069\n- 代码/Code: https://github.com/chrisdud0257/hggt\n\n**SIEDOB: Semantic Image Editing by Disentangling Object and Background**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13062\n- 代码/Code: https://github.com/wuyangluo/siedob\n\n**Hierarchical Semantic Contrast for Scene-aware Video Anomaly Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13051\n- 代码/Code: None\n\n**Top-Down Visual Attention from Analysis by Synthesis**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13043\n- 代码/Code: None\n\n**Semantic Ray: Learning a Generalizable Semantic Field with Cross-Reprojection Attention**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13014\n- 代码/Code: None\n\n**Backdoor Defense via Adaptively Splitting Poisoned Dataset**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12993\n- 代码/Code: https://github.com/kuofenggao/asd\n\n**LightPainter: Interactive Portrait Relighting with Freehand Scribble**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12950\n- 代码/Code: None\n\n**Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12930\n- 代码/Code: None\n\n**Don't FREAK Out: A Frequency-Inspired 
Approach to Detecting Backdoor Poisoned Samples in DNNs**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13211\n- 代码/Code: None\n\n**Learning a Practical SDR-to-HDRTV Up-conversion using New Dataset and Degradation Models**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.13031\n- 代码/Code: https://github.com/andreguo/hdrtvdm\n\n**Open Set Action Recognition via Multi-Label Evidential Learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12698\n- 代码/Code: None\n\n**Dense Network Expansion for Class Incremental Learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12696\n- 代码/Code: None\n\n**VecFontSDF: Learning to Reconstruct and Synthesize High-quality Vector Fonts via Signed Distance Functions**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12675\n- 代码/Code: None\n\n**Correlational Image Modeling for Self-Supervised Visual Pre-Training**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12670\n- 代码/Code: None\n\n**An Extended Study of Human-like Behavior under Adversarial Training**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12669\n- 代码/Code: None\n\n\n**RaBit: Parametric Modeling of 3D Biped Cartoon Characters with a Topological-consistent Dataset**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12564\n- 代码/Code: None\n\n**Is BERT Blind? 
Exploring the Effect of Vision-and-Language Pretraining on Visual Language Understanding**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12513\n- 代码/Code: None\n\n\n\n\n**BiCro: Noisy Correspondence Rectification for Multi-modality Data via Bi-directional Cross-modal Similarity Consistency**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12419\n- 代码/Code: https://github.com/xu5zhao/bicro\n\n**Balanced Spherical Grid for Egocentric View Synthesis**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12408\n- 代码/Code: https://github.com/changwoonchoi/EgoNeRF\n\n\n**Weakly Supervised Video Representation Learning with Unaligned Text for Sequential Videos**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12370\n- 代码/Code: https://github.com/svip-lab/weaksvr\n\n**Re-thinking Federated Active Learning based on Inter-class Diversity**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12317\n- 代码/Code: https://github.com/raymin0223/logo\n\n\n\n\n**Learning a Depth Covariance Function**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12157\n- 代码/Code: None\n\n**Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12112\n- 代码/Code: https://github.com/aimagelab/pacscore\n\n**Music-Driven Group Choreography**\n\n- 论文/Paper: http://arxiv.org/pdf/2303.12337\n- 代码/Code: None\n\n**Beyond Appearance: a Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks**\n\n- 论文/Paper: https://arxiv.org/abs/2303.17602\n- 代码/Code: https://github.com/tinyvision/SOLIDER\n\n**DexArt: Benchmarking Generalizable Dexterous Manipulation with Articulated Objects**\n\n- 论文/Paper: https://www.chenbao.tech/dexart/\n- 代码/Code: https://github.com/Kami-code/dexart-release\n\n[返回目录/back](#Contents)"
  },
  {
    "path": "LICENSE",
    "content": "                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. Definitions.\n\n      \"License\" shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      \"Licensor\" shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      \"Legal Entity\" shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      \"control\" means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      \"You\" (or \"Your\") shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      \"Source\" form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      \"Object\" form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      \"Work\" shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      \"Derivative Works\" shall mean any work, whether in Source or Object\n      
form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      \"Contribution\" shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, \"submitted\"\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as \"Not a Contribution.\"\n\n      \"Contributor\" shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a \"NOTICE\" text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an \"AS IS\" BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets \"[]\"\n      replaced with your own identifying information. 
(Don't include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same \"printed page\" as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright [yyyy] [name of copyright owner]\n\n   Licensed under the Apache License, Version 2.0 (the \"License\");\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an \"AS IS\" BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n"
  },
  {
    "path": "README.md",
    "content": "# CVPR2024-Papers-with-Code-Demo\n\n :star_and_crescent:**添加微信: nvshenj125, 备注方向，进交流学习群**\n\n\n欢迎关注公众号：AI算法与图像处理\n\n:star2: [CVPR 2024](https://cvpr.thecvf.com/Conferences/2024) 持续更新最新论文/paper和相应的开源代码/code！\n\n\n\nB站demo：https://space.bilibili.com/288489574\n\n> :hand: ​注：欢迎各位大佬提交issue，分享CVPR 2022论文/paper和开源项目！共同完善这个项目\n>\n> 往年顶会论文汇总：\n>\n> [CVPR2021](https://github.com/DWCTOD/CVPR2023-Papers-with-Code-Demo/blob/main/CVPR2021.md)\n>\n> [CVPR2022](https://github.com/DWCTOD/CVPR2023-Papers-with-Code-Demo/blob/main/CVPR2022.md)\n>\n> [CVPR2023](https://github.com/DWCTOD/CVPR2024-Papers-with-Code-Demo/blob/main/CVPR2023.md)\n>\n> [ICCV2021](https://github.com/DWCTOD/ICCV2021-Papers-with-Code-Demo)\n>\n> [ECCV2022](https://github.com/DWCTOD/ECCV2022-Papers-with-Code-Demo)\n\n### **:fireworks: 欢迎进群** | Welcome\n\nCVPR 2024 论文/paper交流群已成立！已经收录的同学，可以添加微信：**nvshenj125**，请备注：**CVPR+姓名+学校/公司名称**！一定要根据格式申请，可以拉你进群。\n\n<a name=\"Contents\"></a>\n\n\n\n### :hammer: **目录 |Table of Contents（点击直接跳转）**\n\n<details open>\n<summary> 目录（右侧点击可折叠）</summary>\n\n- [Backbone](#Backbone)\n- [数据集/Dataset](#Dataset)\n- [Diffusion Model](#DiffusionModel)\n- [Text-to-Image](#T2I)\n- [NAS](#NAS)\n- [NeRF](#NeRF)\n- [Knowledge Distillation](#KnowledgeDistillation)\n- [多模态 / Multimodal ](#Multimodal)\n- [对比学习/Contrastive Learning](#ContrastiveLearning)\n- [图神经网络 / Graph Neural Networks](#GNN)\n- [胶囊网络 / Capsule Network](#CapsuleNetwork)\n- [图像分类 / Image Classification](#ImageClassification)\n- [目标检测/Object Detection](#ObjectDetection)\n- [目标跟踪/Object Tracking](#ObjectTracking)\n- [轨迹预测/Trajectory Prediction](#TrajectoryPrediction)\n- [语义分割/Segmentation](#Segmentation)\n- [弱监督语义分割/Weakly Supervised Semantic Segmentation](#WSSS)\n- [医学图像分割](#MedicalImageSegmentation)\n- [视频目标分割/Video Object Segmentation](#VideoObjectSegmentation)\n- [交互式视频目标分割/Interactive Video Object Segmentation](#InteractiveVideoObjectSegmentation)\n- [Visual Transformer](#VisualTransformer)\n- [深度估计/Depth 
Estimation](#DepthEstimation)\n- [人脸识别/Face Recognition](#FaceRecognition)\n- [人脸检测/Face Detection](#FaceDetection)\n- [人脸活体检测/Face Anti-Spoofing](#FaceAnti-Spoofing)\n- [人脸年龄估计/Age Estimation](#AgeEstimation)\n- [人脸表情识别/Facial Expression Recognition](#FacialExpressionRecognition)\n- [人脸属性识别/Facial Attribute Recognition](#FacialAttributeRecognition)\n- [人脸编辑/Facial Editing](#FacialEditing)\n- [人脸重建/Face Reconstruction](#FaceReconstruction)\n- [Talking Face](#TalkingFace)\n- [换脸/Face Swap](#FaceSwap)\n- [姿态估计/Pose Estimation](#HumanPoseEstimation)\n- [手势姿态估计（重建）/Hand Pose Estimation( Hand Mesh Recovery)](#HandPoseEstimation)\n- [视频动作检测/Video Action Detection](#VideoActionDetection)\n- [手语翻译/Sign Language Translation](#SignLanguageTranslation)\n- [3D人体重建](#3D人体重建)\n- [行人重识别/Person Re-identification](#PersonRe-identification)\n- [行人搜索/Person Search](#PersonSearch)\n- [人群计数 / Crowd Counting](#CrowdCounting)\n- [GAN](#GAN)\n- [彩妆迁移 / Color-Pattern Makeup Transfer](#CPM)\n- [字体生成 / Font Generation](#FontGeneration)\n- [场景文本检测、识别/Scene Text Detection/Recognition](#OCR)\n- [图像、视频检索 / Image Retrieval/Video retrieval](#Retrieval)\n- [Image Animation](#ImageAnimation)\n- [抠图/Image Matting](#ImageMatting)\n- [超分辨率/Super Resolution](#SuperResolution)\n- [图像复原/Image Restoration](#ImageRestoration)\n- [图像补全/Image Inpainting](#ImageInpainting)\n- [图像去噪/Image Denoising](#ImageDenoising)\n- [图像编辑/Image Editing](#ImageEditing)\n- [图像拼接/Image stitching](#Imagestitching)\n- [图像匹配/Image Matching](#ImageMatching)\n- [图像融合/Image Blending](#ImageBlending)\n- [图像去雾/Image Dehazing](#ImageDehazing)\n- [图像去模糊/Image Deblur](#ImageDeblur)\n- [图像压缩/Image Compression](#ImageCompression)\n- [反光去除/Reflection Removal](#ReflectionRemoval)\n- [车道线检测/Lane Detection](#LaneDetection)\n- [自动驾驶 / Autonomous Driving](#AutonomousDriving)\n- [流体重建/Fluid Reconstruction](#FluidReconstruction)\n- [场景重建 / Scene Reconstruction](#SceneReconstruction)\n- [3D Reconstruction](#3DReconstruction)\n- [视频插帧/Frame 
Interpolation](#FrameInterpolation)\n- [视频超分 / Video Super-Resolution](#VideoSuper-Resolution)\n- [3D点云/3D point cloud](#3DPointCloud)\n- [标签噪声 / Label-Noise](#Label-Noise)\n- [对抗样本/Adversarial Examples](#AdversarialExamples)\n- [Anomaly Detection](#AnomalyDetection)\n- [其他/Other](#Other)\n\n\n</details>\n\n<a name=\"Backbone\"></a>\n\n## Backbone\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"Dataset\"></a> \n\n## 数据集/Dataset\n\n**HoloVIC: Large-scale Dataset and Benchmark for Multi-Sensor Holographic Intersection and Vehicle-Infrastructure Cooperative**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.02640\n- 代码/Code: None\n\n**Traffic Scene Parsing through the TSP6K Dataset**\n\n- 论文/Paper: https://arxiv.org/pdf/2303.02835.pdf\n- 代码/Code: https://github.com/PengtaoJiang/TSP6K\n\n[返回目录/back](#Contents)\n\n<a name=\"DiffusionModel\"></a> \n\n# Diffusion Model\n\n**Balancing Act: Distribution-Guided Debiasing in Diffusion Models**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.18206\n- 代码/Code: None\n\n**DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.19481\n- 代码/Code: https://github.com/mit-han-lab/distrifuser\n\n**DiffAssemble: A Unified Graph-Diffusion Model for 2D and 3D Reassembly**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.19302\n- 代码/Code: https://github.com/iit-pavis/diffassemble\n\n**Diff-Plugin: Revitalizing Details for Diffusion-based Low-level Tasks**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.00644\n- 代码/Code: None\n\n**Few-shot Learner Parameterization by Diffusion Time-steps**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.02649\n- 代码/Code: https://github.com/yue-zhongqi/tif\n\n**MedM2G: Unifying Medical Multi-Modal Generation via Cross-Guided Diffusion with Visual Invariant**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.04290\n- 代码/Code: None\n\n**DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations**\n\n- 论文/Paper: https://arxiv.org/abs/2403.06951\n- 代码/Code: 
https://github.com/Tianhao-Qi/DEADiff_code\n\n**Face2Diffusion for Fast and Editable Face Personalization**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.05094\n- 代码/Code: https://github.com/mapooon/Face2Diffusion\n\n**MACE: Mass Concept Erasure in Diffusion Models**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.06135\n- 代码/Code: https://github.com/Shilin-LU/MACE\n\n**It's All About Your Sketch: Democratising Sketch Control in Diffusion Models**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.07234\n- 代码/Code: https://github.com/subhadeepkoley/demosketch2rgb\n\n**SemCity: Semantic Scene Generation with Triplane Diffusion**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.07773\n- 代码/Code: https://github.com/zoomin-lee/semcity\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"T2I\"></a> \n\n## Text-to-Image\n\n**RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.00483\n- 代码/Code: None\n\n**NoiseCollage: A Layout-Aware Text-to-Image Diffusion Model Based on Noise Cropping and Merging**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.03485\n- 代码/Code: https://github.com/univ-esuty/noisecollage\n\n**Discriminative Probing and Tuning for Text-to-Image Generation**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.04321\n- 代码/Code: None\n\n**Towards Effective Usage of Human-Centric Priors in Diffusion Models for Text-based Human Image Generation**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.05239\n- 代码/Code: None\n\n**Text2QR: Harmonizing Aesthetic Customization and Scanning Robustness for Text-Guided QR Code Generation**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.06452\n- 代码/Code: https://github.com/mulns/Text2QR\n\n**Text-to-Image Diffusion Models are Great Sketch-Photo Matchmakers**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.07214\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a 
name=\"NAS\"></a> \n\n## NAS\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"NeRF\"></a> \n\n# NeRF\n\n**GSNeRF: Generalizable Semantic Neural Radiance Fields with Enhanced 3D Scene Understanding**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.03608\n- 代码/Code: None\n\n**DNGaussian: Optimizing Sparse-View 3D Gaussian Radiance Fields with Global-Local Depth Normalization**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.06912\n- 代码/Code: https://github.com/fictionarry/dngaussian\n\n**S-DyRF: Reference-Based Stylized Radiance Fields for Dynamic Scenes**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.06205\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"KnowledgeDistillation\"></a> \n\n## Knowledge Distillation\n\n**PromptKD: Unsupervised Prompt Distillation for Vision-Language Models**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.02781\n- 代码/Code: https://github.com/zhengli97/PromptKD\n\n**Logit Standardization in Knowledge Distillation**\n\n- 论文/Paper: https://arxiv.org/abs/2403.01427\n- 代码/Code: https://github.com/sunshangquan/logit-standardization-KD\n\n**RadarDistill: Boosting Radar-based Object Detection Performance via Knowledge Distillation from LiDAR Features**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.05061\n- 代码/Code: None\n\n**$V_kD:$ Improving Knowledge Distillation using Orthogonal Projections**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.06213\n- 代码/Code: https://github.com/roymiles/vkd\n\n[返回目录/back](#Contents)\n\n<a name=\"Multimodal\"></a> \n\n## 多模态 / Multimodal\n\n**MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception**\n\n- 论文/Paper: https://arxiv.org/abs/2312.07472\n- 代码/Code: https://github.com/IranQin/MP5\n- 主页/Website：https://iranqin.github.io/MP5.github.io/\n\n**Polos: Multimodal Metric Learning from Human Feedback for Image Captioning**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.18091\n- 代码/Code: None\n\n**MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer**\n\n- 论文/Paper: 
http://arxiv.org/pdf/2403.02991\n- 代码/Code: None\n\n**Learning to Rematch Mismatched Pairs for Robust Cross-Modal Retrieval**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.05105\n- 代码/Code: https://github.com/hhc1997/L2RM\n\n**MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.07839\n- 代码/Code: None\n\n**Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Matching Framework**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.07636\n- 代码/Code: https://github.com/hieuphan33/mavl\n\n**Calibrating Multi-modal Representations: A Pursuit of Group Robustness without Annotations**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.07241\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"ContrastiveLearning\"></a> \n\n## Contrastive Learning\n\n**Style Blind Domain Generalized Semantic Segmentation via Covariance Alignment and Semantic Consistence Contrastive Learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.06122\n- 代码/Code: https://github.com/root0yang/blindnet\n\n[返回目录/back](#Contents)\n\n<a name=\"CapsuleNetwork\"></a> \n\n# 胶囊网络 / Capsule Network\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageClassification\"></a> \n\n# 图像分类 / Image Classification\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"ObjectDetection\"></a> \n\n## 目标检测/Object Detection\n\n**UniMODE: Unified Monocular 3D Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.18573\n- 代码/Code: None\n\n**CN-RMA: Combined Network with Ray Marching Aggregation for 3D Indoors Object Detection from Multi-view Images**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.04198\n- 代码/Code: https://github.com/SerCharles/CN-RMA\n\n**Memory-based Adapters for Online 3D Scene Perception**\n\n- 论文/Paper: https://arxiv.org/abs/2403.06974\n- 代码/Code: https://github.com/xuxw98/Online3D\n\n**Salience DETR: Enhancing Detection Transformer with Hierarchical Salience Filtering Refinement**\n\n- 论文/Paper: 
https://arxiv.org/abs/2403.16131\n- 代码/Code: https://github.com/xiuqhou/Salience-DETR\n\n**Enhancing 3D Object Detection with 2D Detection-Guided Query Anchors**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.06093\n- 代码/Code: https://github.com/nullmax-vision/QAF2D\n\n**SAFDNet: A Simple and Effective Network for Fully Sparse 3D Object Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.05817\n- 代码/Code: https://github.com/zhanggang001/hednet\n\n[返回目录/back](#Contents)\n\n\n\n<a name=\"ObjectTracking\"></a> \n\n# 目标跟踪/Object Tracking\n\n**DeconfuseTrack: Dealing with Confusion for Multi-Object Tracking**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.02767\n- 代码/Code: None\n\n**Delving into the Trajectory Long-tail Distribution for Muti-object Tracking**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.04700\n- 代码/Code: https://github.com/chen-si-jia/Trajectory-Long-tail-Distribution-for-MOT\n\n[返回目录/back](#Contents)\n\n# 3D Object Tracking\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"TrajectoryPrediction\"></a> \n\n## 轨迹预测/Trajectory Prediction\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"Segmentation\"></a> \n\n## 语义分割/Segmentation\n\n**PEM: Prototype-based Efficient MaskFormer for Image Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.19422\n- 代码/Code: https://github.com/niccolocavagnero/pem\n\n**Towards the Uncharted: Density-Descending Feature Perturbation for Semi-supervised Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.06462\n- 代码/Code: https://github.com/Gavinwxy/DDFP\n\n**Text-Guided Variational Image Generation for Industrial Anomaly Detection and Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.06247\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"WSSS\"></a>\n\n## 弱监督语义分割/Weakly Supervised Semantic Segmentation\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"MedicalImageSegmentation\"></a>\n\n# 医学图像/Medical Image\n\n**Modality-Agnostic Structural Image Representation Learning for Deformable Multi-Modality Medical Image Registration**\n\n- 
论文/Paper: http://arxiv.org/pdf/2402.18933\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"VideoObjectSegmentation\"></a>\n\n# 视频目标分割/Video Object Segmentation\n\n**Depth-aware Test-Time Training for Zero-shot Video Object Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.04258\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"InteractiveVideoObjectSegmentation\"></a>\n\n# 交互式视频目标分割/Interactive Video Object Segmentation\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"VisualTransformer\"></a>\n\n# Visual Transformer\n\n**Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.05419\n- 代码/Code: https://github.com/techmn/satmae_pp\n\n[返回目录/back](#Contents)\n\n<a name=\"DepthEstimation\"></a>\n\n## 深度估计/Depth Estimation\n\n**Adaptive Fusion of Single-View and Multi-View Depth for Autonomous Driving**\n\n- 论文/Paper: https://arxiv.org/pdf/2403.07535.pdf\n- 代码/Code: https://github.com/Junda24/AFNet\n\n[返回目录/back](#Contents)\n\n<a name=\"Retrieval\"></a>\n\n# 图像、视频检索 / Image Retrieval/Video retrieval\n\n**Dual Pose-invariant Embeddings: Learning Category and Object-specific Discriminative Representations for Recognition and Retrieval**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.00272\n- 代码/Code: None\n\n**Learning to Rematch Mismatched Pairs for Robust Cross-Modal Retrieval**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.05105\n- 代码/Code: https://github.com/hhc1997/L2RM\n\n**How to Handle Sketch-Abstraction in Sketch-Based Image Retrieval?**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.07203\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"SuperResolution\"></a>\n\n## 超分辨率/Super Resolution\n\n**SeD: Semantic-Aware Discriminator for Image Super-Resolution**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.19387\n- 代码/Code: None\n\n**Training Generative Image Super-Resolution Models by Wavelet-Domain Losses Enables Better Control of Artifacts**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.19215\n- 代码/Code: 
https://github.com/mandalinadagi/wgsr\n\n**CAMixerSR: Only Details Need More \"Attention\"**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.19289\n- 代码/Code: https://github.com/icandle/camixersr\n\n**Low-Res Leads the Way: Improving Generalization for Super-Resolution by Self-Supervised Learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.02601\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageRestoration\"></a>\n\n## 图像复原/Image Restoration\n\n**Boosting Image Restoration via Priors from Pre-trained Models**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.06793\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageDenoising\"></a>\n\n## 图像去噪/Image Denoising\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageEditing\"></a>\n\n# 图像编辑/Image Editing\n\n**Doubly Abductive Counterfactual Inference for Text-based Image Editing**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.02981\n- 代码/Code: https://github.com/xuesong39/DAC\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageCompression\"></a>\n\n# 图像压缩/Image Compression\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"ImageDeblur\"></a>\n\n## 图像去模糊/Image Deblur\n\n**A Unified Framework for Microscopy Defocus Deblur with Multi-Pyramid Transformer and Contrastive Learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.02611\n- 代码/Code: https://github.com/PieceZhang/MPT-CataBlur\n\n[返回目录/back](#Contents)\n\n<a name=\"AutonomousDriving\"></a>\n\n## 自动驾驶 / Autonomous Driving\n\n**Abductive Ego-View Accident Video Understanding for Safe Driving Perception**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.00436\n- 代码/Code: None\n\n**Adaptive Fusion of Single-View and Multi-View Depth for Autonomous Driving**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.07535\n- 代码/Code: https://github.com/Junda24/AFNet/\n\n[返回目录/back](#Contents)\n\n<a name=\"FaceRecognition\"></a>\n\n# 人脸识别/Face Recognition\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"FaceDetection\"></a>\n\n# 人脸检测/Face Detection\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"FaceAnti-Spoofing\"></a>\n\n# 
人脸活体检测/Face Anti-Spoofing\n\n**Suppress and Rebalance: Towards Generalized Multi-Modal Face Anti-Spoofing**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.19298\n- 代码/Code: https://github.com/omggggg/mmdg\n\n[返回目录/back](#Contents)\n\n<a name=\"FaceReconstruction\"></a>\n\n## 人脸重建/Face Reconstruction\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"VideoActionDetection\"></a>\n\n# 视频动作检测/Video Action Detection\n\n\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"SignLanguageTranslation\"></a>\n\n# 手语翻译/Sign Language Translation\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"PersonRe-identification\"></a>\n\n# 行人重识别/Person Re-identification\n\n\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"TalkingFace\"></a>\n\n# Talking Face\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"HumanPoseEstimation\"></a>\n\n# 姿态估计/Pose Estimation\n\n**FAR: Flexible, Accurate and Robust 6DoF Relative Camera Pose Estimation**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.03221\n- 代码/Code: None\n\n**Single-to-Dual-View Adaptation for Egocentric 3D Hand Pose Estimation**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.04381\n- 代码/Code: https://github.com/MickeyLLG/S2DHand\n\n**Hourglass Tokenizer for Efficient Transformer-Based 3D Human Pose Estimation**\n\n- 论文/Paper: https://arxiv.org/pdf/2311.12028.pdf\n- 代码/Code: https://github.com/NationalGAILab/HoT\n\n[返回目录/back](#Contents)\n\n\n\n<a name=\"GAN\"></a>\n\n# GAN\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"AgeEstimation\"></a>\n\n# 人脸年龄估计/Age Estimation\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"FacialExpressionRecognition\"></a>\n\n# 人脸表情识别/Facial Expression Recognition\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"HandPoseEstimation\"></a>\n\n## 手势姿态估计（重建）/Hand Pose Estimation( Hand Mesh Recovery)\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"3DReconstruction\"></a>\n\n## 3D Reconstruction\n\n**UFORecon: Generalizable Sparse-View Surface Reconstruction from Arbitrary and UnFavOrable Data Sets**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.05086\n- 代码/Code: 
https://github.com/Youngju-Na/UFORecon\n\n**DITTO: Dual and Integrated Latent Topologies for Implicit 3D Reconstruction**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.05005\n- 代码/Code: None\n\n**Memory-based Adapters for Online 3D Scene Perception**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.06974\n- 代码/Code: None\n\n**Bayesian Diffusion Models for 3D Shape Reconstruction**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.06973\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n<a name=\"FrameInterpolation\"></a>\n\n## 视频插帧/Frame Interpolation\n\n\n\n[返回目录/back](#Contents)\n\n<a name=\"3DPointCloud\"></a>\n\n## 3D点云/3D point cloud\n\n**Rethinking Few-shot 3D Point Cloud Semantic Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.00592\n- 代码/Code: https://github.com/ZhaochongAn/COSeg\n\n**Extend Your Own Correspondences: Unsupervised Distant Point Cloud Registration by Progressive Distance Extension**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.03532\n- 代码/Code: https://github.com/liuquan98/eyoc\n\n**Hide in Thicket: Generating Imperceptible and Rational Adversarial Perturbations on 3D Point Clouds**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.05247\n- 代码/Code: https://github.com/TRLou/HiT-ADV\n\n[返回目录/back](#Contents)\n\n<a name=\"AnomalyDetection\"></a>\n\n# Anomaly Detection\n\n**Toward Generalist Anomaly Detection via In-context Residual Learning with Few-shot Sample Prompts**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.06495\n- 代码/Code: https://github.com/mala-lab/inctrl\n\n**RealNet: A Feature Selection Network with Realistic Synthetic Anomaly for Anomaly Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.05897\n- 代码/Code: https://github.com/cnulab/realnet\n\n[返回目录/back](#Contents)\n\n<a name=\"Other\"></a>\n\n## 其他/Other\n\n**DisCo: Disentangled Control for Realistic Human Dance Generation**\n\n- 论文/Paper: https://arxiv.org/abs/2307.00040\n- 代码/Code: https://github.com/Wangt-CN/DisCo\n\n**Gradient Reweighting: Towards Imbalanced Class-Incremental Learning**\n\n- 论文/Paper: 
http://arxiv.org/pdf/2402.18528\n- 代码/Code: None\n\n**TAMM: TriAdapter Multi-Modal Learning for 3D Shape Understanding**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.18490\n- 代码/Code: None\n\n**Attention-Propagation Network for Egocentric Heatmap to 3D Pose Lifting**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.18330\n- 代码/Code: https://github.com/tho-kn/egotap\n\n**Attentive Illumination Decomposition Model for Multi-Illuminant White Balancing**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.18277\n- 代码/Code: None\n\n**Misalignment-Robust Frequency Distribution Loss for Image Transformation**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.18192\n- 代码/Code: https://github.com/eezkni/FDL\n\n**3DSFLabelling: Boosting 3D Scene Flow Estimation by Pseudo Auto-labelling**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.18146\n- 代码/Code: https://github.com/jiangchaokang/3dsflabelling\n\n**OccTransformer: Improving BEVFormer for 3D camera-only occupancy prediction**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.18140\n- 代码/Code: None\n\n**UniVS: Unified and Universal Video Segmentation with Prompts as Queries**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.18115\n- 代码/Code: https://github.com/minghanli/univs\n\n**Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.18078\n- 代码/Code: https://github.com/YanzuoLu/CFLD\n\n**Boosting Neural Representations for Videos with a Conditional Decoder**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.18152\n- 代码/Code: None\n\n**Classes Are Not Equal: An Empirical Study on Image Recognition Fairness**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.18133\n- 代码/Code: None\n\n**QN-Mixer: A Quasi-Newton MLP-Mixer Model for Sparse-View CT Reconstruction**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.17951\n- 代码/Code: None\n\n**Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.19479\n- 代码/Code: None\n\n**SeMoLi: What Moves Together Belongs Together**\n\n- 论文/Paper: 
http://arxiv.org/pdf/2402.19463\n- 代码/Code: None\n\n**Generalizable Whole Slide Image Classification with Fine-Grained Visual-Semantic Interaction**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.19326\n- 代码/Code: None\n\n**CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.19231\n- 代码/Code: https://github.com/lu-feng/cricavpr\n\n**MemoNav: Working Memory Model for Visual Navigation**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.19161\n- 代码/Code: None\n\n**VideoMAC: Video Masked Autoencoders Meet ConvNets**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.19082\n- 代码/Code: https://github.com/nust-machine-intelligence-laboratory/videomac\n\n**Theoretically Achieving Continuous Representation of Oriented Bounding Boxes**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.18975\n- 代码/Code: https://github.com/Jittor/JDet\n\n**OHTA: One-shot Hand Avatar via Data-driven Implicit Priors**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.18969\n- 代码/Code: None\n\n**WWW: A Unified Framework for Explaining What, Where and Why of Neural Networks by Interpretation of Neuron Concepts**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.18956\n- 代码/Code: None\n\n**Spectral Meets Spatial: Harmonising 3D Shape Matching and Interpolation**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.18920\n- 代码/Code: None\n\n**SwitchLight: Co-design of Physics-driven Architecture and Pre-training Framework for Human Portrait Relighting**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.18848\n- 代码/Code: None\n\n**ViewFusion: Towards Multi-View Consistency via Interpolated Denoising**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.18842\n- 代码/Code: None\n\n**OpticalDR: A Deep Optical Imaging Model for Privacy-Protective Depression Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.18786\n- 代码/Code: None\n\n**NARUTO: Neural Active Reconstruction from Uncertain Target Observations**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.18771\n- 代码/Code: None\n\n**Towards Generalizable 
Tumor Synthesis**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.19470\n- 代码/Code: None\n\n**Rethinking Multi-domain Generalization with A General Learning Objective**\n\n- 论文/Paper: http://arxiv.org/pdf/2402.18853\n- 代码/Code: None\n\n**Rethinking Inductive Biases for Surface Normal Estimation**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.00712\n- 代码/Code: https://github.com/baegwangbin/DSINE\n\n**SURE: SUrvey REcipes for building reliable and robust deep networks**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.00543\n- 代码/Code: https://github.com/YutingLi0606/SURE\n\n**Selective-Stereo: Adaptive Frequency Information Selection for Stereo Matching**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.00486\n- 代码/Code: https://github.com/Windsrain/Selective-Stereo\n\n**Deformable One-shot Face Stylization via DINO Semantic Guidance**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.00459\n- 代码/Code: https://github.com/zichongc/DoesFS\n\n**CustomListener: Text-guided Responsive Interaction for User-friendly Listening Head Generation**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.00274\n- 代码/Code: None\n\n**NRDF: Neural Riemannian Distance Fields for Learning Articulated Pose Priors**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.03122\n- 代码/Code: None\n\n**Why Not Use Your Textbook? 
Knowledge-Enhanced Procedure Planning of Instructional Videos**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.02782\n- 代码/Code: None\n\n**HUNTER: Unsupervised Human-centric 3D Detection via Transferring Knowledge from Synthetic Instances to Real Scenes**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.02769\n- 代码/Code: None\n\n**Learning Group Activity Features Through Person Attribute Prediction**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.02753\n- 代码/Code: https://github.com/chihina/GAFL-CVPR2024\n\n**Interactive Continual Learning: Fast and Slow Thinking**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.02628\n- 代码/Code: None\n\n**Hierarchical Diffusion Policy for Kinematics-Aware Multi-Task Robotic Manipulation**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.03890\n- 代码/Code: None\n\n**DART: Implicit Doppler Tomography for Radar Novel View Synthesis**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.03896\n- 代码/Code: None\n\n**MeaCap: Memory-Augmented Zero-shot Image Captioning**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.03715\n- 代码/Code: https://github.com/joeyz0z/MeaCap\n\n**HMD-Poser: On-Device Real-time Human Motion Tracking from Scalable Sparse Observations**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.03561\n- 
代码/Code: None\n\n**Continual Segmentation with Disentangled Objectness Learning and Class Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.03477\n- 代码/Code: https://github.com/jordangong/CoMasTRe\n\n**HDRFlow: Real-Time HDR Video Reconstruction with Large Motions**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.03447\n- 代码/Code: None\n\n**LEAD: Learning Decomposition for Source-free Universal Domain Adaptation**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.03421\n- 代码/Code: https://github.com/ispc-lab/lead\n\n**F$^3$Loc: Fusion and Filtering for Floorplan Localization**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.03370\n- 代码/Code: None\n\n**Enhancing Vision-Language Pre-training with Rich Supervisions**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.03346\n- 代码/Code: None\n\n**Efficient LoFTR: Semi-Dense Local Feature Matching with Sparse-Like Speed**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.04765\n- 代码/Code: None\n\n**Discriminative Sample-Guided and Parameter-Efficient Feature Space Adaptation for Cross-Domain Few-Shot Learning**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.04492\n- 代码/Code: https://github.com/rashindrie/dipa\n\n**Learning to Remove Wrinkled Transparent Film with Polarized Prior**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.04368\n- 代码/Code: https://github.com/jqtangust/filmremoval\n\n**LORS: Low-rank Residual Structure for Parameter-Efficient Network Stacking**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.04303\n- 代码/Code: None\n\n**Active Generalized Category Discovery**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.04272\n- 代码/Code: https://github.com/mashijie1028/activegcd\n\n**MAP: MAsk-Pruning for Source-Free Model Intellectual Property Protection**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.04149\n- 代码/Code: https://github.com/ispc-lab/map\n\n**A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.04245\n- 代码/Code: 
https://github.com/dalision/modalbiasavsr\n\n**Seamless Human Motion Composition with Blended Positional Encodings**\n\n- 论文/Paper: https://arxiv.org/abs/2402.15509\n- 代码/Code: https://github.com/BarqueroGerman/FlowMDM\n\n**DiffusionLight: Light Probes for Free by Painting a Chrome Ball**\n\n- 论文/Paper: https://arxiv.org/abs/2312.09168\n- 代码/Code: https://github.com/DiffusionLight/DiffusionLight\n\n**SplattingAvatar: Realistic Real-Time Human Avatars with Mesh-Embedded Gaussian Splatting**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.05087\n- 代码/Code: https://github.com/initialneil/SplattingAvatar\n\n**Split to Merge: Unifying Separated Modalities for Unsupervised Domain Adaptation**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.06946\n- 代码/Code: https://github.com/tl-uestc/unimos\n\n**Real-Time Simulated Avatar from Head-Mounted Sensors**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.06862\n- 代码/Code: None\n\n**DiaLoc: An Iterative Approach to Embodied Dialog Localization**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.06846\n- 代码/Code: None\n\n**FaceChain-SuDe: Building Derived Class to Inherit Category Attributes for One-shot Subject-Driven Generation**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.06775\n- 代码/Code: https://github.com/modelscope/facechain\n\n**EarthLoc: Astronaut Photography Localization by Indexing Earth from Space**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.06758\n- 代码/Code: https://github.com/gmberton/earthloc\n\n**CAM Back Again: Large Kernel CNNs from a Weakly Supervised Object Localization Perspective**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.06676\n- 代码/Code: https://github.com/snskysk/cam-back-again\n\n**Distributionally Generative Augmentation for Fair Facial Attribute Classification**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.06606\n- 代码/Code: https://github.com/heqianpei/diga\n\n**Exploiting Style Latent Flows for Generalizing Deepfake Video Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.06592\n- 代码/Code: None\n\n**MoST: Motion Style Transformer 
between Diverse Action Contents**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.06225\n- 代码/Code: https://github.com/Boeun-Kim/MoST\n\n**Coherent Temporal Synthesis for Incremental Action Segmentation**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.06102\n- 代码/Code: None\n\n**Is Vanilla MLP in Neural Radiance Field Enough for Few-shot View Synthesis?**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.06092\n- 代码/Code: None\n\n**LTGC: Long-tail Recognition via Leveraging LLMs-driven Generated Content**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.05854\n- 代码/Code: None\n\n**PeerAiD: Improving Adversarial Distillation from a Specialized Peer Tutor**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.06668\n- 代码/Code: None\n\n**SNIFFER: Multimodal Large Language Model for Explainable Out-of-Context Misinformation Detection**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.03170\n- 代码/Code: None\n\n**Multi-Task Dense Prediction via Mixture of Low-Rank Experts**\n\n- 论文/Paper: https://arxiv.org/abs/2403.17749\n- 代码/Code: https://github.com/YuqiYang213/MLoRE\n\n**Beyond Text: Frozen Large Language Models in Visual Signal Comprehension**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.07874\n- 代码/Code: https://github.com/zh460045050/v2l-tokenizer\n\n**Dynamic Graph Representation with Knowledge-aware Attention for Histopathology Whole Slide Image Analysis**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.07719\n- 代码/Code: https://github.com/wonderlandxd/wikg\n\n**Robust Synthetic-to-Real Transfer for Stereo Matching**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.07705\n- 代码/Code: https://github.com/jiaw-z/dkt-stereo\n\n**CuVLER: Enhanced Unsupervised Object Discoveries through Exhaustive Self-Supervised Transformers**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.07700\n- 代码/Code: https://github.com/shahaf-arica/cuvler\n\n**Masked AutoDecoder is Effective Multi-Task Vision Generalist**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.07692\n- 代码/Code: https://github.com/hanqiu-hq/mad\n\n**PeLK: Parameter-efficient Large Kernel ConvNets 
with Peripheral Convolution**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.07589\n- 代码/Code: None\n\n**Unleashing Network Potentials for Semantic Scene Completion**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.07560\n- 代码/Code: https://github.com/fereenwong/ammnet\n\n**Open-World Semantic Segmentation Including Class Similarity**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.07532\n- 代码/Code: https://github.com/PRBonn/ContMAV\n\n**ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.07392\n- 代码/Code: https://github.com/Traffic-X/ViT-CoMer\n\n**FSC: Few-point Shape Completion**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.07359\n- 代码/Code: None\n\n**Frequency Decoupling for Motion Magnification via Multi-Level Isomorphic Architecture**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.07347\n- 代码/Code: https://github.com/jiafei127/fd4mm\n\n**A Bayesian Approach to OOD Robustness in Image Classification**\n\n- 论文/Paper: http://arxiv.org/pdf/2403.07277\n- 代码/Code: None\n\n[返回目录/back](#Contents)\n\n"
  }
]